🔮 The complicated costs of genAI; Gemini & bias; good news for energy transition; AI patents, Nvidia, gambling ++ #462
An insider’s guide to AI and exponential technologies
Hi, I’m Azeem Azhar. I advise governments, firms, and investors on how to make sense of our exponential future. Every Sunday, I share my views on AI and other exponential technologies in this newsletter. I am also on LinkedIn, Threads, and Substack Notes.
Sunday chart: The complicated costs of generative AI
AI chip startup Groq (not to be confused with Musk’s Grok!) has put all its chips on the table. The team released a demo of Mixtral, an open-source LLM, running through their API and generating responses four times faster than other services at highly competitive rates. Unlike NVIDIA, which produces versatile GPUs suited to a wide range of tasks, Groq’s processors are tailored for specific, high-performance AI computations, potentially offering more specialised efficiency in these areas.
Groq addresses one problem that both startups and large enterprises face when they look to scale their AI products: the cost of running models. As one anonymous founder said, they only make money “if people don’t use the product.” I’m hearing that enterprises are facing sticker shock when they move from a proof-of-concept for a few users to widespread deployment across their orgs.
To add to the complications, we’re now in a phase where LLMs are one part of increasingly complex systems. Building valuable applications involves linking AI models with databases and other applications and even sampling multiple LLMs. These systems yield more robust results, but they also multiply costs and create a maintenance headache, because components must be constantly updated within a rapidly evolving ecosystem.
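To make the sticker shock concrete, here is a back-of-the-envelope sketch of how inference spend scales when a pilot becomes an org-wide rollout built on a multi-call system. Every figure in it (user counts, requests, tokens per call, price per million tokens) is a hypothetical assumption chosen for illustration, not data from Groq or any other provider, and the `Workload` and `monthly_cost` names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    users: int                        # people using the product
    requests_per_user: int            # requests per user per month
    llm_calls_per_request: int        # 1 for a single model; more for compound systems
    tokens_per_call: int              # prompt + completion tokens per LLM call
    price_per_million_tokens: float   # USD, hypothetical


def monthly_cost(w: Workload) -> float:
    """Monthly inference spend in USD for the given workload."""
    tokens = (w.users * w.requests_per_user
              * w.llm_calls_per_request * w.tokens_per_call)
    return tokens / 1_000_000 * w.price_per_million_tokens


# All figures below are illustrative assumptions, not measurements.
pilot = Workload(users=20, requests_per_user=200, llm_calls_per_request=1,
                 tokens_per_call=2_000, price_per_million_tokens=1.0)
rollout = Workload(users=20_000, requests_per_user=200, llm_calls_per_request=5,
                   tokens_per_call=2_000, price_per_million_tokens=1.0)

print(f"pilot:   ${monthly_cost(pilot):,.2f} per month")
print(f"rollout: ${monthly_cost(rollout):,.2f} per month")
print(f"ratio:   {monthly_cost(rollout) / monthly_cost(pilot):,.0f}x")
```

Under these assumptions the user base grows 1,000-fold and each request fans out into five model calls, so the bill grows 5,000-fold. The point is the multiplication, not the specific numbers: each extra component in the system is another factor in the product.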
For any firm, this volatility in innovation makes planning hard. Should you buy now? Or wait a bit longer, when prices come down? Should you build for scale straight out of the gate or risk unsustainable economics later down the line?
AI software startups need to figure out when and whether their unit economics can work, especially in an environment where capital is scarce. The FTC is watching Big Tech acquisitions like a hawk, stemming from concerns about their computing dominance. Consequently, many startups may find themselves without a viable exit strategy.
And that brings us back to Groq. They aren’t an AI software company, yet they face similar questions about their unit economics. Their eye-catching release is likely a fundraising tactic, and the broader economics of their hardware costs raise questions about long-term viability. SemiAnalysis breaks down their economics and questions the hardware’s ability to handle larger models and context windows:
The question that really matters though, is if low latency small model inference is a large enough market on its own, and if it is, is it worth having specialised infrastructure when flexible GPU infrastructure can get close to the same cost and be redeployed for throughput or large model applications fairly easily.
Groq’s gone all in, but unfortunately, their niche combined with high costs may prevent them from scaling effectively – a complicated reality for many AI startups.
Key reads
Gemini’s generalisation. Gemini sparked political rage this week by generating images of historical people and groups with unexpected ethnic diversity. The eyebrow-raising performance likely results from Google coaching the models to respond with diverse types of people, addressing the well-identified “CEO vs receptionist” problem of its search results. Google aimed to reduce bias in its models (which stems from biases within its internet training data), but the outcome compromised historical accuracy.
It highlights two fundamental issues. The first is that LLMs are hard to engineer and control. They are unreliable infrastructure, easier to deploy for narrower uses or more accepting users than across the breadth of the Internet, where every edge case becomes a PR calamity. For large platforms like Google, it also shows the limits of their internal testing, making the case for much more comprehensive, perhaps externally led, red-teaming processes.
The second is that, ultimately, technology infrastructure is political. As I said in my book:
The issue here is less whether the digital platforms are making the right calls, and more whether we want these questions to be handled by private companies in the first place.
Gemini is the most exaggerated example of an effort to deliver a broad-based view of a diverse world. After all, 7.5 billion people are not Americans, 6.5 billion are not Indian, 7.888 billion are not Maltese. The problem is Sisyphean, but Google was also cack-handed.
Our experience of the world is going to be filtered by AI systems. The people producing them need to act with a greater degree of transparency. Google’s search boss, Prabhakar Raghavan, offered an apology, but the company’s diagnosis is not yet sufficiently detailed.
My book argues that dominant firms in this position should disclose their processes and decision-making for wider scrutiny. I expect this mini-fracas to increase interest in localised AI models for particular countries.
Silicon reset? After their short sojourn in other states, startups are returning to San Francisco. The AI boom has triggered a revival in the Bay. The Wall Street Journal highlights the homeward journey of individual tech founders and companies, while The Economist emphasises the city’s powers of agglomeration - its universities, companies, startups, VCs and mythic allure to the technologically inclined. This power has revitalised the city, but its preexisting problems live on. The Economist points the finger at governance failure, which is undoubtedly an issue, but let’s not forget: it’s a two-way street. The tech community has played its part in shaping the city and added to its challenges (see EV#459).
Doubt(ing the exponential transition). The New Yorker released a piece on Vaclav Smil, an intellectual giant and one of the three researchers I consider a major influence on my work. I spoke with Smil on my podcast in 2019, when he gave one of his rare interviews. I’ve read perhaps ten of his books. The piece highlights his perspective on the Exponential Age, which is contrary to mine:
“We are told that rapid exponential growth, driven by digitisation and advances in AI, already prevails in such fields as solar cells, batteries, electric cars, and even urban farming.” Such growth, where it actually does exist, can’t continue permanently. Belief that it can, in his view, is consistent with “America’s Barnumian approach to science and innovation, where every dubious claim is treated as ‘transformative change’ and where every patently impossible promise” – nuclear fusion, high-temperature superconductivity, the colonisation of Mars – “is worshipped as another effusion of history’s most brilliant minds.”
Smil is simultaneously right and wrong. While not every ‘dubious claim’ leads to exponential change, some – particularly those within the oil industry he highlights in his works – certainly do. Exponential technologies have emerged and will continue to appear in our lifetimes. While their growth isn’t eternal, existing technologies like solar, batteries, and AI are still in the early stages of their S-curve, with years to run. New technologies will replace them. And they will have their own exponential phases.
Smil’s most interesting book - which I hope he will be able to write - will be written in 20 years when the energy systems in the bulk of the world’s economy will be non-fossil. His after-the-fact explanation will be the best around, even if his vision of the future is exponentially more sceptical than it needs to be.
Newsreel beta
This new section we’re testing contains important news items about AI and exponential technologies for the week.
OpenAI and Microsoft disrupted 5 state-affiliated attempts to exploit AI for malicious purposes.
Researchers show that ChatGPT can help predict future investments by interpreting managerial expectations.
Google released Gemma, a family of small open-source models.
Google reportedly made a $60 million deal with Reddit to use their data to train AI.
Nvidia’s revenue for the fourth quarter of its fiscal year 2024 (ended January 2024) was 265% higher than a year earlier.
BYD launches a 420km (260-mile) range EV for under $15,000.
CATL, the world’s leading battery storage maker, prepares to halve the cost per kWh of its lithium iron phosphate (LFP) cells by mid-2024.
Data
48% of news websites are blocking OpenAI’s crawler.
BNEF projects 574GW of new solar additions this year, a 29% increase from 2023.
2.34 billion metric tonnes of rare earth metals were discovered in Wyoming, dwarfing China’s 44 million tonne reserves.
Less than 8% of the world’s population lives in a full democracy, while almost 40% lives under authoritarian rule. Via EV member Claudia Chwalisz.
Four-fifths of the under-40s surveyed in 14 countries think social media has benefited their democracy.
Cultivated meat funding dropped 78% in 2023.
The latest from Exponential View
Will AI make us dumb? We’ve seen this question asked more and more recently, so we invited EV member Gianni Giacomelli to answer it. Gianni is one of the leading experts in collective intelligence and organisational design, with over 25 years of experience in senior leadership roles in innovation.
Short morsels to appear smart at a dinner party
America had a record year for gambling. A way to deal with uncertainty?
The eyeball test is an indicator of how empathetic a society is.
Basal cognition: thinking without brains.
AI helps detect 40% more polyps during colon examination.
The decimal point is 150 years older than previously thought.
End note
We’re testing a new section in the newsletter: newsreel. It highlights important events this week that didn’t make our primary analysis in Key Reads. We’ll run this test a little longer to see if it works.
Cheers,
A
What you’re up to — community updates
Tom Loosemore has published The Radical How, a paper making the case for governments adopting exponential-era ways of working.
Dominic Bristow has launched a targeted fundraising round at his company, Stylus, which offers AI-marking of paper-based assessments in schools.
Matt Webb writes about moving from technology inspired by Star Trek to Douglas Adams.
Fabian Westerheide has joined the AI fund as a founding partner.
Rudy de Waele’s RegenerateX is running a 12-week online program focusing on regenerative business innovation. Get a discount with the code EXPOVIEW24.
Share your updates with EV readers by telling us what you’re up to here.