🔮 Code interpreter; decoding GPT-4; EV surge in China; $1600 meetings, brain health & observant llamas++ #EV431
Your insider guide to AI and exponential technologies
Hi, I’m Azeem Azhar. This week I took a couple of days off to think about wheat cultivation and the relationship between energy and society.
Every Sunday, I share my view on developments that I think you should know about in this newsletter.
Sunday chart: Interpret carefully
Just coming back from holiday,
was feeling a bit lazy with the Sunday chart, so he got Code Interpreter to do it for him. He set it the task of comparing the MMLU Benchmark, a common LLM yardstick, with ChatbotArena, a crowd-sourced platform where randomly paired chatbots battle each other.
In under 3 minutes, I loaded up a CSV file and let GPT-4 visualise and fish out the prime nuggets from the data. It managed to glean the core insights: that there was a positive link (r² = 0.854) between the benchmarks, and that private models tend to outdo open-source ones on both. As someone who doesn’t code every day, I estimate that without Code Interpreter this would’ve taken me around 30 minutes.
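For readers who want to check this kind of output by hand, here is a minimal sketch of the analysis described above: load a CSV, compute the squared Pearson correlation between two benchmark columns, and compare group averages. The column names and every number in the sample data are placeholders invented for illustration. They are not the real MMLU or ChatbotArena scores, and this is not necessarily the code that Code Interpreter generated.

```python
import csv
import io
from math import sqrt

# PLACEHOLDER data for illustration only: not real benchmark scores.
# Column names (model, license, mmlu, arena_elo) are assumptions.
SAMPLE_CSV = """model,license,mmlu,arena_elo
model_a,proprietary,80.0,1200
model_b,proprietary,70.0,1150
model_c,open,60.0,1050
model_d,open,50.0,1000
"""


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_by_license(rows, column):
    """Average of `column` for each value of the `license` field."""
    groups = {}
    for row in rows:
        groups.setdefault(row["license"], []).append(float(row[column]))
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
mmlu = [float(row["mmlu"]) for row in rows]
elo = [float(row["arena_elo"]) for row in rows]

r = pearson_r(mmlu, elo)
r_squared = r ** 2
mmlu_means = mean_by_license(rows, "mmlu")

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print("Mean MMLU by license:", mmlu_means)
```

Re-running a calculation like this on the real CSV is one cheap way to validate what the tool reports before sharing it.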
It is a groundbreaking tool. Back in February, I delved into how ChatGPT empowers users with a tool that offers deep dissection of particular problem sets, all for a modest monthly fee. Back then, it could only deal with text inputs. Now, with the birth of the Code Interpreter, it can deal with data. Its speed and accessibility are truly impressive, and that gives it revolutionary potential. See our quarterly overview of AI from earlier this week where we shared a paper looking into the long-term impact of AI on data science.
But like many revolutions, this one can miss its target. Code Interpreter turns every person into a data analyst without training them as a data analyst. Data can be scrutinised through multiple lenses, some more accurate than others. Unless the people using it have a thorough understanding of analytical methods, Code Interpreter could inadvertently contribute to the propagation of misinformation. Yes, it’s a powerful tool, but proceed with caution – and validate your outputs!
Key reads
Decoding GPT-4. Two weeks ago, we discussed rumours that GPT-4 uses a mixture-of-experts architecture. The murmurs are now getting louder. SemiAnalysis has released a technical deep-dive into GPT-4 based on insider knowledge, highlighting that OpenAI solved the problem of scalability while keeping costs low by dividing the work amongst 16 smaller expert models.
Some standout figures:
They estimate that GPT-4’s parameter count is ~1.8 trillion versus the ~175 billion of GPT-3,
Training compute for GPT-4 was 2.15e25 FLOPs, approximately double the computing power needed to emulate the human brain (at the metabolome [1] level),
Estimated training cost of $63 million on NVIDIA A100 GPUs [2]. If this were done on the new H100 GPUs, the cost would drop by roughly two-thirds.
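To put those figures side by side, the back-of-the-envelope arithmetic can be sketched as follows. Every input is an estimate quoted above, and "roughly two-thirds" is treated as exactly 2/3 purely for illustration, so the results are rough orders of magnitude rather than precise values.

```python
# All inputs are the estimates quoted in the text, not measured values.
GPT4_PARAMS = 1.8e12          # ~1.8 trillion parameters
GPT3_PARAMS = 175e9           # ~175 billion parameters
A100_COST_USD = 63e6          # estimated cost of the final training run on A100s
H100_SAVING_FRACTION = 2 / 3  # assumption: "roughly two-thirds" taken as exactly 2/3

# How many times larger GPT-4 is than GPT-3 by parameter count.
param_ratio = GPT4_PARAMS / GPT3_PARAMS

# Implied cost of the same training run on H100s.
h100_cost_usd = A100_COST_USD * (1 - H100_SAVING_FRACTION)

print(f"GPT-4 has about {param_ratio:.1f}x the parameters of GPT-3")
print(f"Implied H100 training cost: about ${h100_cost_usd / 1e6:.0f} million")
```

In other words, roughly ten times the parameters of GPT-3, and a final training run that would cost in the region of $21 million on the newer hardware.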
The key takeaway is that GPT-4’s engineering is impressive but replicable; that is likely why OpenAI kept the architecture behind closed doors. Other companies will soon make models that match or surpass the capabilities of GPT-4, especially with new hardware (such as NVIDIA H100) making model training cheaper.
See also:
Excellent piece by Mel Mitchell on the need to measure AI performance, arguing for robust, transparent methodologies that assess AI’s understanding and its ability to generalize from varied inputs.
Anthropic continues to improve large-context window performance with the release of Claude 2. With the capability to handle 100k tokens, it is truly a best friend for TLDRers.
LLMs as cognitive synergists (agents that use multiple personalities) outperform approaches that use a single persona or a fixed number of personas. Via EV member Gianni Giacomelli.
Rethinking transmission. Fossilised energy grids have long been viewed as one of the biggest obstacles to expanding access to renewable energy. In EV#417, we wrote:
There’s a strong need to update these legacy systems, make grids as flexible as is needed for renewables, and reverse conservative regulation. This needs to happen fast: in the US, the capacity of generation and storage projects (a total of $4.3 trillion of projects) waiting to be connected to the grid far exceeds the capacity of all existing power plants!
Casey Handmer makes the case that the falling costs of batteries will mean that future grids will depend far less on increasing transmission capacity. In contrast to batteries, whose costs keep falling, transmission lines get more expensive with length. This could herald a future of much more localised energy systems.
Skills for the near future. George Monbiot identifies shortcomings of today’s education as limiting students’ readiness for an uncertain future. One, it is too rigid. Two, we don’t teach students about complex systems. Three, it doesn’t equip students with meta-skills and metacognition, the ability to think about thinking.
In Monbiot’s words:
[M]any students will complete their education without ever being taught the principles of complex systems. Yet everything of importance to us is a complex system. [...] Schoolchildren should be taught to understand how thinking works, from neuroscience to cultural conditioning; how to observe and interrogate their thought processes; and how and why they might become vulnerable to disinformation and exploitation. Self-awareness could turn out to be the most important topic of all.
See also: first-hand encounters with the difficulty of algorithmic decision-making in healthcare. EV member, doctor, healthtech investor and entrepreneur Vishal Gulati says: “Focusing too much on AI Hollywood tropes is diverting our attention away from how AI is harming patients today.”
Market data
$1600: the cost of a typical 30-minute internal meeting at Shopify.
OECD countries have seen a 3.6% annual decline in real wages in Q1 of 2023.
I hope you’re not travelling via Gatwick this summer holiday. 54% of its flights in June were disrupted, the worst rate of any airport in Europe.
In the first six months of 2023, purchases of electric and plug-in hybrid passenger vehicles surged by 44% in China, equating to over 3.5 million units sold. See also: the IEA’s first-ever Critical Minerals Review emphasises the demand from the growing EV sector, which is why automakers are increasingly getting involved in the critical minerals value chain.
In 2022, for every dollar lent to coal projects, $14 of other planned coal lending was cancelled.
Threads has a terrible 16% seven-day retention. (Twitter’s is 35%, Instagram’s is 60%.)
Short morsels to appear smart at dinner parties
🤳🌾 WhatsApp voice notes are helping transform how Senegalese farmers collaborate and share knowledge.
🧠 Use it or lose it. Brains get smaller without regular and frequent social contact with others.
🏗️ Using AI could help lower the enormous energy and pollution footprint of the construction industry. And… AI is automating astrologers.
🌱 Age reversal — making cells younger — is possible not just genetically, but also through specific chemical means.
🦙 Llamas perform better at a task after observing humans do it.
End note
Our new referral programme has kicked off nicely! It is super simple. Just share the newsletter using the button below. You can check how well you are doing on our Leaderboard.
Hope you enjoy!
A
📢 Over 260,000 global leaders rely on our weekly newsletter to navigate the transition to a new era driven by exponential technologies. We provide expert analysis and insider insights to help leaders think about the future with clarity.
If you want to showcase your company or brand in front of this influential audience, reach out 👉 connect with us
Thanks to EV member Etienne Pollard, who shared the counter-argument to the morsel we ran last weekend about the British taking the iron mass production process from Jamaica. You can read it here: Age of Invention: Cort Case.
What you’re up to — updates from EV members:
Paola Bonomo has joined the Scientific Board at the International Foundation Big Data and Artificial Intelligence for Human Development, with the goal of steering AI towards serving the good of humanity by building links between research, businesses and society as a whole.
Emmet King and J12 Ventures announced the launch of Nexus, the catalyst program to build Europe’s next category-leading data and AI companies.
[1] Definition: The metabolome is the global collection of all low molecular weight metabolites that are produced by cells during metabolism, and provides a direct functional readout of cellular activity and physiological status.
[2] Estimate of the final training run cost only, ignoring experimentation, failed training runs and other costs.