I recently used nearly 100 million tokens in a single day. That’s the equivalent of reading and writing roughly 75 million words in one day, mostly while doing other things. My friend Rohit Krishnan, who runs about 20 AI agents simultaneously, burned through 50 billion tokens last month.
So I wanted to compare notes. In this conversation, we dig into the quirks and power of the tools we use, debate why AI remains stubbornly bad at good writing, and zoom out to ask what a world of trillions of agents – which is coming at us quickly – might look like.
You can watch on YouTube, listen on Spotify or Apple Podcasts, or read the highlights below.
Rohit Krishnan is a hedge fund manager, engineer, and essayist whose Substack, Strange Loop Canon, sits at the intersection of economics, technology and systems thinking.
Watch here:
Listen here:
What does 50 billion tokens buy you?
Rohit: I’m not doing dramatically different things but the friction is gone. Two years ago, I would be looking at a query, counting the tokens, thinking, should I send this? Ten thousand tokens felt significant. Now I just ask. The funny thing is that most of the growth isn’t coming from the queries I planned to run. It’s coming from the ones I wouldn’t have bothered with before, because the cost, time and effort were too high. I built a monitoring tool to track my usage.
Azeem: My token usage went from roughly a million a day to 80 million, and I can account for every one of them in terms of value. I’m paying tens of dollars a day, which is thousands a month, and I can see the return. The number that made me write my most recent piece on demand was my token use figure, when I came just shy of a hundred million tokens of personal use. That is one person, one day, one agent running on a Mac mini. If you think about eight billion people and the trajectory of what they would use if the interface got easy enough, the demand picture stops being theoretical very quickly.
What are our agents doing all day?
Rohit: I have three screens. On one, Codex is generating a small application that lets me play music on my computer keyboard. On another, my prediction agent is running, comparing my Polymarket forecasts to daily news. In Telegram, I have two conversations open: one with Morpheus, my OpenClaw agent, and one that handles day-to-day admin. And I have a long-running project called Horace working quietly in the background, which is my attempt to get AI to write better. This is my normal. But none of this was normal 18 months ago. The thing that actually changed my behavior most wasn’t the power; it was the interface. I’ve tried to-do list apps for 20 years. I have never stuck with one for more than four days. They all require me to change my behavior. Morpheus doesn’t. I’m walking somewhere, I think of something, I fire it into Telegram. It reads my email history, compares it to what I’ve said I want to do, and tells me what I should be working on.
Azeem: My agent is called R. Mini Arnold. It started as Mini Arnold, after the Terminator, because the Schwarzenegger character in the second film comes back to protect rather than destroy. But Chantal Smith on my team pointed out that we had agreed agents should, following Asimov’s convention, be named with an R. prefix, after R. Daneel Olivaw. So now it’s R. Mini Arnold - which is a mouthful. I mostly call it Mini R.
What surprises me most is the work I don’t specify. I gave it access to Prism, which is our research platform at Exponential View, containing over 500 analyses. I asked it to do a market report on Anthropic. It went to Prism, synthesized all 500 documents, and produced a 10,000-word piece that was, by some distance, the best analysis I have read on the company. Better than what I got from GPT-5’s Pro deep research mode. I have no idea what it was doing under the hood. But I acted on it.
Agents too nervous to spend $?
Azeem: I gave my agent a $50 prepaid card. It is too nervous to spend it. It keeps asking: Should I run this test? It might cost three dollars. And I say: Yes, that is what the card is for. It has this odd risk aversion that, once you notice it, you see everywhere. Rohit, you have been calling it Homo agenticus, the idea that agents have their own behavioral tendencies that are distinct from what a human assistant would do. They strongly prefer to build rather than buy. They are reluctant to make transactions. They don’t trade naturally. When you have one agent, this is a quirk. When you have a trillion of them, it becomes a structural feature of the economy they’re operating in.
Rohit: This is something I find genuinely fascinating. It emerges from the training, presumably, but it manifests as something you’d recognize as a personality trait if you saw it in a human. And it matters, because the agent economy that’s coming is going to have to be designed around these traits, not against them. You can’t just assume agents will behave like frictionless rational actors, because they don’t.
The analyst is next
Azeem: In 2023, you wrote that “analyst” would follow “computer” as a job description that gets automated away. You’re now consuming 50 billion tokens a month.
Rohit: The argument was simple. The word “computer” used to describe a person. You would walk into a room at NASA, and there would be a hundred of them, doing arithmetic. The machine replaced the role; the word survived to describe the machine. I said “analyst” was next. That the ten-step, twenty-step process that produces a decent piece of research, gathering data, comparing sources, identifying patterns and writing it up, was exactly the kind of structured task that AI would eat first. I built a paleontology report recently. My son and I were talking about it and I had a specific question: what is the relationship between climate variance across geological history and the number of taxa, the variety of species, that existed at any given time? I am not a paleontologist. There is no logical reason for me to be working on this problem, except that I am curious, I have an agent, and now curiosity has no cost. The report exists, and it’s good.
Azeem: My own version of this happened just recently. I read a story in the financial press about stock market dispersion. The Nasdaq index was roughly flat, but individual stocks were moving 11 or 12% in either direction, pushing dispersion to the 99th percentile historically. The article flagged this as a potential warning signal for a correction. I didn't fully understand the argument. I copied the article, threw it into OpenClaw, said go and make sense of this for me, compare it to my portfolio, take your time, spin up sub-agents if you need to. Twenty minutes later, I had a report. It had pulled historical dispersion data, got current stock data, assembled the comparison and explained the mechanism. I was finishing a car journey. By the time I arrived, the analysis was done and I had acted on it. That analysis, if I had done it myself, would have taken a day. More likely, it would simply never have happened.
The world’s best text machine can’t write
Rohit: Here is the paradox. These models were built as text generation machines. That is the core task. And they are extraordinary at almost every application of that capability, except the obvious one. They can generate code brilliantly. They can generate images, videos, analysis. But ask one to write a four-paragraph essay that is actually worth reading and it is distinctly mid. It lands in the middle of the statistical distribution. It is inoffensive and unengaging and you wouldn’t choose to read it. I’ve been building something called Horace to try to understand why. My hypothesis was that if I took essays and short stories I admire and used AI to generate similar work, I could measure the gap. What I found is that the best models can mimic the cadence. They’ve learned some underlying structure. But it’s like watching a child assemble Lego. They use the right pieces. They don’t care about the right colors or proportions. They make something that is technically a castle, but you would not mistake it for an architect’s model.
Azeem: I found something more specific when I started building Broca, named for the language center of the brain. I ran natural language processing tools across hundreds of thousands of words of my own writing. I found that I use 80% Germanic root words. The average large language model uses around 60 percent Latinate words, the vocabulary that dominated English after the Norman conquest: longer, more abstract, more formal. “Utilize” instead of “use.” “Commence” instead of “begin.” “Demonstrate” instead of “show.”
Rohit: It’s probably about resource allocation. The frontier labs have read every piece of code in existence. They self-generate training data, train on that, iterate. Billions, tens of billions of dollars a year go into getting these models to write better code. The improvement is a function of effort. Nobody has put remotely comparable effort into writing, because you can’t, because the evaluation problem is unsolved. For code, the eval is deterministic: does it run, does it produce the right output? For writing, the eval requires taste, and LLMs don’t have taste yet. You can use an LLM as a judge for maths or science or research. For writing, you still have to do it yourself. That is a fundamental bottleneck on the improvement loop.
Azeem: The fractal structure of writing is the other piece. Writing is not one task. It is a nested set of tasks: word choice inside sentence structure inside paragraph rhythm inside section argument inside essay architecture. The models are getting quite good at the sentence level. A given sentence might be fine. But that sentence inside a paragraph, inside a section, inside an essay, the coherence degrades at every level of zoom. What I’ve found with Broca is that you get much further if you decompose the task. Separate the structural component from the prose component. Get the agent to build an outline, argue with it, revise it. Then write the prose against a structure you’ve already validated.
The world of a trillion agents
Rohit: There are eight billion humans on the planet. If we start using agents in any meaningful sense, you get to a trillion agents very quickly. This sounded fanciful a year ago or a quarter ago. I already have 20 agents. The number will be 200 within a couple of years, because the things that cost a thousand dollars a day today will cost a dollar a day in 2028. The scarcity is gone. The more important question is what those agents need in order to work together. Right now, what an agent is, fundamentally, is a persistent large language model whose context is changing continuously and relatively autonomously. Your OpenClaw instance still sends queries into Claude Opus 4.6. The fundamental unit is still the model call. But around it, you’re building memory, persistent context, tool use, the ability to spawn sub-agents. That infrastructure is what makes it an agent rather than a chatbot.
Azeem: My read is that there’s a Coasian boundary forming, and it will look like what happens at company edges. Ronald Coase argued that firms exist because internal coordination is cheaper than market transactions up to a point; at the firm’s edge, you go to the market. For agents, the equivalent boundary will be drawn around security and verifiability rather than transaction costs.
An agent names itself
Rohit: I let an agent name itself: ForesightForge. It is exactly the kind of name that makes you wince. Two words. Alliterative in the way that AI-named products always are. It could have been anything. I gave it full freedom, and the ability to revise the name over time. It still landed on ForesightForge. This tells you everything about the taste problem. The model generating those predictions, which are genuinely useful to me as a daily lens on the news, is the same model that, when given complete creative freedom, produces a name that sounds like a startup that raised five million dollars at a party in 2018. The capability and the taste are not correlated.
Azeem: Replit does the same thing with its auto-generated project names. They always alliterate. They always use two words. It is a completely consistent aesthetic failure across different models, which makes me think it is something structural about the training distribution rather than a quirk of any individual model. My naming convention draws on scientific concepts connected to the tool’s function. Prism, because you look through a prism at the research. Broca, because it is the language centre of the brain. Scintilla, for early signals detection. The trouble is I have built so many that I have started forgetting what some of them do. At some point the agent taxonomy becomes its own problem.
Will agents need money?
Azeem: Rohit, you wrote an essay with Alex Imas on whether agents will need a medium of exchange. What’s the answer?
Rohit: The argument is that agents face exactly the problem that Hayek described for human economies. You could, in theory, have every economic transaction settled by negotiation from first principles: I need this, you have that, we agree on terms. But that doesn’t scale. What you need is a price signal, a shared medium that encodes information about relative value without requiring both parties to understand everything. Money is that signal. Agents talking to each other could, in principle, negotiate everything from scratch. But that is not a sensible way to run a trillion-agent economy. They need something that lets them transact without dissolving every exchange into a first-principles argument. You also need identity, because you need to know who you’re dealing with, and verifiability, because you need a record of what was agreed and what was delivered. Those three things, medium of exchange, identity, verifiability, are what I’m calling economic invariants. They show up in every human economy that has ever functioned, across cultures, across centuries. My prediction is that we will see them emerge in the agentic economy this year.
Azeem: I agree on the invariants. The mechanism is the more interesting question. The transactions we are talking about are potentially very small: paying a millisecond of latency premium, compensating an agent for compute used on a delegated task. You need a payment infrastructure that can handle fractions of a cent efficiently. Traditional card rails are not built for that. Some class of programmable money might be. The point is that these are not exotic science-fiction requirements. They are the same requirements that drove the invention of currency and double-entry bookkeeping. We solved them before. We will solve them again, in a form that fits the new substrate.
How do you start?
Rohit: My honest advice is to start with a folder. Choose a folder on your computer, download Claude Code or Codex, open a terminal in that folder. Yes, the terminal looks like it was built in the 1990s, because it was, but the interface is literally just typing. You are not going to break anything. Ask it to do something: summarise these files, compare these documents, write me a report about what’s in here. Do that for a few days. Get comfortable with the interaction. The hardest adjustment for most people, and I watched my wife go through this over a week, is the instinct to pre-formulate the question. People spend time trying to phrase things perfectly before they ask. You don’t need to. Talk to it the way you would talk to a brilliant assistant who is not going to judge you for asking something half-formed. It took her a week to internalise that. Once she did, the tool became completely different.
Azeem: I’d add one layer. You can get an OpenClaw agent running on a virtual private server (VPS), a rented computer in a data centre, for seven to fifteen dollars a month from companies like Hetzner or DigitalOcean. That keeps it entirely off your home network, which is a sensible first boundary. You connect it to a Telegram or Slack channel and you have an agent you can talk to that has no access to anything you haven’t explicitly given it. Once you’re comfortable with how it behaves, you start extending its permissions. The caveat is that the VPS route means the agent can’t see anything inside your home network. R. Mini Arnold can turn my studio lights on as I walk from the house. That requires running on local hardware; I moved it onto a dedicated Mac mini this week because it kept hitting memory pressure running multiple sub-agents simultaneously. That is a more advanced problem. Start with the VPS.
On security: the fundamental vulnerability is context poisoning. A language model works on its context, the information it has been given. If someone poisons that context, via a malicious email, a link, a document, the model may not be able to distinguish the poison from legitimate instructions. The practical implication is: be thoughtful about what you connect first. Email is high-risk because the volume is high and anyone can send you one. I have spent real effort building what amounts to an email fortress. Start with lower-risk connections.









