We keep hearing that large language models have emergent properties. What does that mean? And how does that relate to a process of acceleration?
Complex systems are commonplace in nature. Perhaps it is better to say that nature is a complex system. Quotidian examples include flocks of birds and economies, both of which comprise numerous interacting components.
These systems exhibit behaviours at a larger scale that you couldn't predict from the behaviours of the individual entities or units within them. This is emergence, a fascinating characteristic of complex systems, and we encounter it in familiar forms: traffic jams, stock market behaviour, and internet memes.
Reductive approaches, which try to explain the whole by analysing its parts in isolation, often struggle with this kind of complexity. As the internet population grew, new behaviours emerged that we didn't see when fewer people were online: the GameStop "Gamestonk" frenzy and similar viral stock rallies were not obvious outcomes of connecting the first Interface Message Processors in 1969.
Large language models (LLMs), and neural networks more broadly, are complex systems: they consist of smaller units such as artificial neurons, subsystems, and local clusters that interact by passing activations along weighted connections, with those weights shaped by their interactions with training tokens.
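To make that interaction concrete, here is a minimal sketch in Python. The layer sizes and random weights are illustrative assumptions, not taken from any real model; the point is only that units influence one another by passing activations along weighted connections, and it is those weights that training on tokens gradually adjusts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen purely for illustration.
n_inputs, n_hidden, n_outputs = 8, 4, 2

W1 = rng.normal(size=(n_hidden, n_inputs))   # connection strengths, layer 1
W2 = rng.normal(size=(n_outputs, n_hidden))  # connection strengths, layer 2

def forward(x):
    """One layer's activations become the next layer's input, scaled by weights."""
    h = np.tanh(W1 @ x)   # hidden units: weighted sum of inputs, squashed
    return W2 @ h         # output units: weighted sum of hidden activations

x = rng.normal(size=n_inputs)   # stand-in for an embedded training token
print(forward(x))
```

Nothing here is emergent, of course; it is only when billions of such units and weights interact that the system-level behaviours appear.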
Emergent phenomena in LLMs include transfer learning, creative text generation, conversational skills, and abstract reasoning, which developers didn’t explicitly program. These behaviours arise through the complex interactions of different components within these massive machines1. And we don’t know why they arise. As this recent paper puts it: “It is still mysterious why emergent abilities occur in LLMs.”
Predicting emergent phenomena is quite challenging. Many complex systems like economies can be studied using complexity science methods such as agent-based modelling or simulation. LLMs, it seems, can only be simulated by building the system itself. Yes, it’s rather like Jorge Luis Borges’s short story On Exactitude in Science, where cartographers create a map that is as large and detailed as the land they are mapping.
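For contrast, here is what agent-based modelling can look like for a simpler complex system: a toy traffic model of my own, offered only as an illustration (the road length, car density, and movement rule are all assumptions, not anything from the essay). Each car follows one local rule, yet jams emerge at the level of the whole road.

```python
import random

ROAD_LENGTH = 60
DENSITY = 0.5   # assumed fraction of cells occupied by cars
STEPS = 20

random.seed(1)
road = [1 if random.random() < DENSITY else 0 for _ in range(ROAD_LENGTH)]

def step(road):
    """Each car advances one cell if and only if the cell ahead is empty."""
    new_road = [0] * len(road)
    for i, cell in enumerate(road):
        ahead = (i + 1) % len(road)
        if cell == 1:
            if road[ahead] == 0:
                new_road[ahead] = 1   # move forward
            else:
                new_road[i] = 1       # blocked: stay put, and a jam forms
    return new_road

for _ in range(STEPS):
    print("".join("#" if c else "." for c in road))
    road = step(road)
```

The contrast is the point: for a toy road we can write down the agents' rules in a dozen lines and watch the emergent jams; for an LLM the "rules" are billions of learned weights, so the only faithful simulation is the model itself.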
So if we can’t easily simulate the system without building it,2 what tools can we use? We can study the training data and observe the model’s behaviour, asking whether we can predict, or at least describe, what the system does. But accurately predicting emergent phenomena that arise across the whole system remains elusive.
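One modest version of that behavioural toolkit is probing: feed the system test prompts and record what comes back, building a map of what it can and cannot do. A minimal sketch follows; everything in it, including the toy_model stand-in and the probe questions, is hypothetical rather than any real model's API.

```python
def toy_model(prompt: str) -> str:
    """Placeholder for a real LLM call; answers a couple of fixed probes."""
    canned = {"2 + 2 =": "4", "The capital of France is": "Paris"}
    return canned.get(prompt, "I don't know")

def probe(model, test_cases):
    """Score the model on (prompt, expected) pairs: a crude behavioural map."""
    results = {}
    for prompt, expected in test_cases:
        answer = model(prompt)
        results[prompt] = (answer, answer.strip() == expected)
    return results

tests = [("2 + 2 =", "4"),
         ("The capital of France is", "Paris"),
         ("Translate 'chat' to English:", "cat")]

for prompt, (answer, ok) in probe(toy_model, tests).items():
    print(f"{prompt!r} -> {answer!r} ({'pass' if ok else 'fail'})")
```

Maps like this describe what the system does; they still don't explain why a capability appears at one scale and not another, which is exactly the elusiveness above.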