🎄 The AI inference race; a theory of ageing; replacing transformers; Mastodon migration, animal memory, rsrsrs ++ #454
An insider's guide to AI and other exponential technologies
Hi all,
here with our last regular Sunday edition of the year as we slow down to rest & recharge. I want to wish you all a joyful holiday season — and a happy Christmas if you celebrate on the 25th!
P.S. If you are in a pinch and need a last-last-minute gift for a future-curious relative, may I suggest a year of Exponential View membership? 😉
Latest posts
If you’re not a subscriber, here’s what you missed recently:
Sunday chart: The AI (infe)race that will shape 2024
The AI industry is experiencing a pivotal shift as inference costs1 plummet. The trend is evident in companies’ competitive pricing: Mistral charges $0.65 per million input tokens and $1.96 per million output tokens, OpenAI offers $1.00 and $2.00 respectively, Fireworks.ai comes in at $1.60 per million output tokens, and Deepinfra charges as little as $0.27.
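To make those rates tangible, here is a quick back-of-the-envelope sketch in Python. It prices a single hypothetical request of 750 input and 250 output tokens against the two providers above that quote both rates; the request size is an illustrative assumption, not a benchmark.

```python
# Rough cost of one chat request at the per-million-token rates quoted above.
# The request size (750 input tokens, 250 output tokens) is an illustrative
# assumption; real workloads vary widely.

RATES_PER_MILLION = {      # provider: (input $, output $) per million tokens
    "Mistral": (0.65, 1.96),
    "OpenAI": (1.00, 2.00),
}

def request_cost(input_tokens: int, output_tokens: int, rates: tuple) -> float:
    """Dollar cost of one request at the given per-million-token rates."""
    in_rate, out_rate = rates
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

for provider, rates in RATES_PER_MILLION.items():
    cost = request_cost(750, 250, rates)
    print(f"{provider}: ${cost:.5f} per request (~{1 / cost:,.0f} requests per dollar)")
```

At these rates, a dollar buys on the order of a thousand such requests — which is why models of this calibre are now cheap enough to build real products on.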
In this week’s chart, we track these prices and ask whether it’s a “race to the bottom”. I wouldn’t call it that… The race is to cut prices, but the quality remains impressive. The market is now teeming with GPT-3.5-calibre models, both open-source and proprietary, and these are plenty good enough — especially when this cheap — to be built into useful applications.
This pricing trend points to the commoditisation of state-of-the-art LLMs and a vast market expansion. For LLM providers, these dynamics are a double-edged sword. On one hand, an expanded market means more opportunities for application builders to innovate and create increasingly sophisticated applications. This expands AI use amongst the general population and creates a loyal group of builders who will stick with one foundation model as a base. On the other hand, many providers face intense price competition, often operating at a loss to attract customers.
As the market continues to evolve, the strategic focus for companies will shift towards differentiation and leveraging technological advancements. The challenge ahead lies in maintaining profitability and unique value propositions in a sector where the only constant is rapid change and innovation. In addition, providers that can keep inference costs low thanks to their expertise in large-scale distributed systems could do well. One to watch is Vipul Ved Prakash (whom I have known for years) of TogetherAI, which offers decentralised cloud services.
Companies must navigate this dynamic landscape carefully, balancing cost leadership with strategic investments in technology. What emerges from this careful dance could shape the 2024 economic landscape.
See also: ByteDance secretly used OpenAI’s API to build their own LLM.
Key reads
The information theory of ageing. One of the most prominent geroscientists, David Sinclair, published a new theory of ageing this week, one he has been working on for many years. The paper is publicly available here. The Information Theory of Ageing (ITOA) proposes that the loss of epigenetic information — the information that regulates the genome — plays a critical role in ageing. David explained the ITOA in our 2020 conversation through the metaphor of a concert played by an ageing pianist:
It’s beautiful when we’re young. Not a missed note. What I’m proposing is that during ageing, the pianist loses their ability to play. Perhaps she loses her eyesight and can’t read the notes perfectly. So by the time you’re my age — 50 years-old — it’s not sounding so good. There are some notes skipping even pages. And by the time you’re 80, it’s probably a pretty bad-sounding concert and you want to walk out of the hall. If I’m right about this, what’s interesting is that you can’t replace the piano, that’s inbuilt, that’s our genome. But you can either put spectacles or glasses on the pianist so she can read the notes. Or you can hopefully bring in a whole new pianist that can play the notes perfectly again.
Disruptions in the epigenome, or ‘epigenomic noise’, are caused by environmental factors and cellular damage. If ITOA holds, we may be able to reverse some of the loss of information, reset the epigenome, and cure age-related diseases altogether. I hope policymakers are keeping an eye on this!
Big AI year for big tech. After a slow year in sales, Apple will be looking to reinvigorate consumer demand in 2024. We can expect Apple to launch LLM-based products on their phones. The company’s AI researchers published a paper on arXiv last week proposing a method to run large language models on devices with limited memory. As for Meta, according to its CTO, Andrew Bosworth, the focus will be on the intersection of AI and the metaverse — mixed reality, wearable AI and embodied AI, aiming to build “models that experience the world the way humans do”.
The winning architecture. 2023 was the year of the transformer, the ultimate attention-based architecture. The AI community is increasingly looking at simpler, more efficient non-attention-based mechanisms. One contender is the family of state-space LLMs, like Mamba and StripedHyena. This post is a good primer.
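For the technically curious, here is a heavily simplified sketch of the idea behind state-space layers. It is not Mamba itself (which adds input-dependent, selective parameters and a hardware-aware scan); it just shows the core recurrence: a fixed-size hidden state updated token by token, so per-token compute and memory do not grow with context length the way attention's cache does. All shapes and values below are illustrative.

```python
import numpy as np

# Toy linear state-space recurrence: h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t.
# Real state-space LLMs are far more sophisticated, but the key property is
# visible here: the state h has a fixed size, so per-token cost stays constant
# as the sequence grows.

rng = np.random.default_rng(0)
d_model, d_state, seq_len = 8, 16, 32                 # illustrative sizes

A = rng.normal(scale=0.1, size=(d_state, d_state))    # state transition
B = rng.normal(scale=0.1, size=(d_state, d_model))    # input projection
C = rng.normal(scale=0.1, size=(d_model, d_state))    # output projection

x = rng.normal(size=(seq_len, d_model))               # a toy input sequence
h = np.zeros(d_state)                                 # fixed-size recurrent state

outputs = []
for t in range(seq_len):
    h = A @ h + B @ x[t]                              # fold the new token into the state
    outputs.append(C @ h)                             # read out a token representation

y = np.stack(outputs)                                 # (seq_len, d_model)
print(y.shape, "— state stays at", h.shape)
```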
Market data
Major consumer LLM services receive about 2bn visits a month.
52% of Germany’s electricity will have come from renewables by the end of 2023.
More than half of Americans who found a partner in 2023 met them online.
Global debt levels are now at 238% of global GDP.
In 2023, $44bn was invested by VCs in climate tech, $20.7bn in genAI and $15.5bn in electric mobility.
Short morsels to appear smart at dinner parties
🦠 Scientists use AI to discover new compounds effective against antibiotic-resistant infections in mice. h/t EV member Fred Casella
🍃 Ireland is considering giving nature constitutional rights, inspired by human rights. h/t EV member
🎳 The migration of Twitter users to Mastodon is an opportunity to study collective behaviours. This paper in Nature looks at social influence and network characteristics that affect migration.
💭 Chimpanzees and bonobos can recognise friends they haven’t seen for decades — this is the longest-lasting social memory documented in non-human species.
🤣 How different languages express laughter online… haha jaja www ktk rsrsrs awdjyt
🎅🏽 End note
As a testament to how fast things have changed this year, which has felt like a decade, I vaguely thought about sharing an exponentially-themed Christmas carol co-authored by me and my robot assistant, ChanteurGPT. But things have moved so quickly that it would be horribly January 2023 for me to share it with you. (Although if I get 50 comments requesting it, I’ll consider it.2)
And that, in a sense, is the nature of the time-space compression we are all feeling.
The holiday period - and Christmas and the New Year is the biggest holiday period in the UK, where I live - gives us a chance to move at a more human pace. If the machines want to race, they can race. I’ll be elevating myself, at least as far as the sofa, with mince pies and family games.
So if you have time off over the coming days, please enjoy them far from the madding buzz of notifications and ArXiv pre-prints. If you don’t, of course, please enjoy them too.
I’ll return next week to wish you all a Happy New Year with my 2024 outlook.
Merry Christmas,
A
Share your projects and updates with EV readers by telling us what you’re up to here.
Inference is the act of getting an answer or response from an AI system. The cost of inference is a key factor in the costs of using an AI tool.
Honestly, it isn’t worth it. ChatGPT remains a poor lyricist.
Merry Christmas everyone!