The Claude 4 mess-up of arithmetic subtraction is a bit of a non-issue, isn't it? Obviously, if Claude had instead been asked "write a Python script to work out 9.11 - 9.9 and execute it", it would have had no trouble. It is presumably not hard for Claude to have a script running in the background with some logic along the lines of: "does this look like a maths question? If so, write a bit of code, execute it and slot in the answer." A few more tokens to estimate the likelihood that a problem is a maths problem, but not many. Hardly a step towards full-blown reasoning.
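To make that concrete, here is a minimal sketch of the sort of background check the comment imagines. The regex heuristic and function name are purely illustrative, not anything Claude actually runs:

```python
# Illustrative only: detect an arithmetic-looking question, compute it exactly,
# and slot the result in, rather than letting the model "reason" it out in tokens.
import re
from decimal import Decimal

def maybe_answer_arithmetic(question: str):
    # Crude heuristic: a bare "a - b" expression suggests a maths question.
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*", question)
    if match is None:
        return None  # not obviously arithmetic; let the model answer normally
    a, b = (Decimal(g) for g in match.groups())
    return a - b  # Decimal avoids binary-float surprises: 9.11 - 9.9 == 0.21 exactly

print(maybe_answer_arithmetic("9.11 - 9.9"))  # -> 0.21
```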
I enjoy this, but it would be great if you could sustain ongoing research on a few topics, e.g. a follow-up on the Niall Ferguson discussion: "now what?"
The bird's-eye view is interesting, but it would be even more valuable if we could get into the meat of a few urgent strategic topics, e.g. China, US automation, etc. Thanks.
Re "We may need to apply hard budgets to autonomous agents – just like the robot police in THX 1138, who abandon a chase once the cost crosses a set threshold. This kind of cost-governed autonomy where agents must justify, cap, or cancel actions based on compute limits, could become a defining constraint. The inference bottleneck won’t kill agentic AI, but it might force it to act with surgical precision."
Constraints are part of what makes evolution so creative. Introducing specific cost limits, consistently applied and known in advance, will lead to the evolution of more efficient and effective response strategies. In such circumstances it is likely that the robot police would have worked out a better response than simply stopping in their tracks.
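As a toy illustration of that point (nothing from the article, and the numbers are arbitrary), an agent that knows its budget in advance can degrade to a cheaper strategy as the cap approaches, rather than stopping dead:

```python
# Toy sketch of cost-governed autonomy: prefer the expensive strategy while the
# budget allows, fall back to a cheaper one near the cap, abandon only as a last resort.
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    cost_per_step: float      # relative inference cost of one step
    progress_per_step: float  # how much one step advances the goal

def pursue(goal: float, budget: float, primary: Strategy, fallback: Strategy) -> str:
    spent = progress = 0.0
    while progress < goal:
        remaining = budget - spent
        strategy = primary if remaining >= primary.cost_per_step else fallback
        if remaining < strategy.cost_per_step:
            return f"abandoned at progress {progress:.1f}, spent {spent:.1f}"
        spent += strategy.cost_per_step
        progress += strategy.progress_per_step
    return f"finished via '{strategy.name}', spent {spent:.1f} of {budget:.1f}"

print(pursue(goal=10.0, budget=4.8,
             primary=Strategy("full pursuit", cost_per_step=1.0, progress_per_step=2.0),
             fallback=Strategy("cheap tracking", cost_per_step=0.3, progress_per_step=1.0)))
# -> finished via 'cheap tracking', spent 4.6 of 4.8
```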
Here is a suggested experiment, on a much smaller scale: introduce different word limits to AI responses and observe the effects on the quality of responses to a specific prompt.
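A rough sketch of how that might be scripted with the OpenAI Python client; the model name, prompt, and word limits below are placeholders, and judging quality would still need a human (or a separate rater):

```python
# Sweep a prompt across several word limits and log the responses for later rating.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
prompt = "Explain why data centres consume so much water."
word_limits = [25, 50, 100, 200, None]  # None = no stated limit, as a control

for limit in word_limits:
    instruction = prompt if limit is None else f"{prompt} Answer in at most {limit} words."
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": instruction}],
    )
    answer = resp.choices[0].message.content
    # Record the limit, the actual length, and the text for side-by-side comparison.
    print(limit, len(answer.split()), answer, sep="\t")
```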
Agreed on AI and water: especially in vulnerable regions like Nevada. We need to get a clearer picture of AI’s full lifecycle impacts. That includes the communities near mines extracting critical minerals, as well as those living next to data centres and power plants.
The incentives to reduce inference-related energy demand are broadly aligned, as long as we’re not firing up fossil fuel plants to meet that demand. But we also don’t want AI hoovering up all the new renewable capacity. We need that to decarbonise the wider economy.
As with any tech, it comes down to choices. AI doesn't exist in a vacuum; it replaces or adds to existing activities. If it ends up driving more extraction and fossil fuel use than it prevents, and the climate benefits stay vague or hand-wavy, that's not a trade-off we should be making.
I think the mental model we use for AI agents is flawed. We picture a $2000 a month "knowledge worker" or talk about the cost of "always-on" agents, as if they were employees idly waiting for instructions.
Most engineers would recognise these metaphors as borrowed from HR and applied to a technical setting where they don’t quite fit. It’s an inefficient way to think about (and price) AI.
A more realistic picture: a centralised knowledge pool holding a project’s state, plus a swarm of specialised tasks that operate on that pool and sometimes orchestrate one another. Different models handle different tasks, much as in modern software stacks: high-reasoning models (e.g. o3, Gemini 2.5 Pro) coordinate faster workhorses (e.g. Claude Sonnet, GPT-4.1).
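A provider-agnostic sketch of that shape, with a shared knowledge pool and per-task model calls; `call_model` here is a stand-in for whichever API each model sits behind, and the model names are just the examples above:

```python
# Shared knowledge pool plus a planner/worker split; call_model is a hypothetical stub.
def call_model(model: str, prompt: str) -> str:
    # Placeholder: in practice this would hit the relevant provider's API.
    return f"[{model} response to: {prompt[:40]}...]"

knowledge_pool = {"project_state": "empty repo, spec in README", "artifacts": {}}

def run_project(goal: str) -> dict:
    # A high-reasoning model plans against the shared state...
    plan = call_model("o3", f"State: {knowledge_pool['project_state']}\n"
                            f"Goal: {goal}\nList the subtasks, one per line.")
    # ...and cheaper, faster workhorses execute each subtask, writing results back to the pool.
    for task in plan.splitlines():
        result = call_model("claude-sonnet",
                            f"State: {knowledge_pool['project_state']}\nTask: {task}")
        knowledge_pool["artifacts"][task] = result
        knowledge_pool["project_state"] += f"\nDone: {task}"
    return knowledge_pool

print(run_project("add a /health endpoint"))
```

Each task is billed only for the calls it actually makes, which is exactly the flexibility a monthly seat price would flatten.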
Pricing those agents monthly would kill the very flexibility their architecture affords.
The fact that we reach for that model at all shows we’re still viewing AI through outdated patterns.
I guess the seat-based model makes sense for some, especially big companies that want predictability. But it’s limiting.
I think the real upside will come from teams that let go of that model and embrace flexible, task-based orchestration.
I'd love to see a dedicated deep dive on AI vs climate change, looking at both sides of the coin.
You were talking about energy usage and AI, and whether it is worth it. I would be interested to see how this looks in terms of water usage.
Thank you for referencing the Niall Ferguson article, giving a wider view of what’s at stake.