I suspect that if Steve Jobs were alive he would have created a user experience, interfaced to an Apple LLM, that would blow our minds. He would have built a fully interactive experience with a next-generation Siri far ahead of what we have today (fully interactive speech, memory, personalization, etc...). I remember the magic of the first Mac and iPhone, and I suspect he would have given us iMind or something akin to it.
If you run a local model like Qwen 2.5 Coder non-stop, 24/7, on a top-of-the-range MacBook Pro (M5 Max, 128 GB), you'll get ~2 million output tokens a day.
Compare that to running the same model in the cloud at *today's* rates: it would take over a decade before your original hardware investment is repaid.
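The payback claim is easy to sanity-check with back-of-envelope arithmetic. All three numbers below are my own assumptions for illustration (throughput, hardware price, and cloud rate are not from the comment, which only states the ~2M tokens/day and "over a decade" figures):

```python
# Back-of-envelope check of the "over a decade to repay" claim.
# All three constants are assumptions, not measurements:
TOKENS_PER_SEC = 23          # assumed sustained local output rate on an M5 Max
HARDWARE_COST = 5000.0       # assumed price of the MacBook Pro, USD
CLOUD_PRICE_PER_M = 0.30     # assumed cloud rate, USD per million output tokens

tokens_per_day = TOKENS_PER_SEC * 86_400                     # ~2M tokens/day
cloud_value_per_day = tokens_per_day / 1e6 * CLOUD_PRICE_PER_M
payback_years = HARDWARE_COST / cloud_value_per_day / 365

print(f"{tokens_per_day / 1e6:.1f}M tokens/day, "
      f"payback in ~{payback_years:.0f} years")
```

At these assumed numbers the local machine generates about $0.60 of cloud-equivalent tokens per day, which is why the payback period stretches well past a decade; a cheaper machine or pricier cloud rate shortens it proportionally.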
There are a lot of assumptions being smuggled in here – I'd love to see them properly explored.
Your Mac is doing two things. First, running the OpenClaw harness and any additional tools you build around it — the machine operates essentially 24 hours a day, handling crons and scheduled tasks, and it's also where you run Claude Code and GPT Codex, which soaks up a significant amount of activity. I do try to max out my Claude Code and GPT budgets most days, though I don't always succeed.
Second, it's handling all the outbound calls to whatever external LLM you're using. As a raw LLM processor right now, local hardware will struggle to do anything genuinely useful — but as we know, open-source models are improving rapidly. There's already a whole class of tasks you could handle with one of the distilled models, if not today then within three or four months. That will steadily move the frontier between what you run locally and what you push out externally.
As an example: we don't currently run any models locally in anger — not even the heartbeat models. We do run some agentic simulations locally when a task might run to 400–500 million tokens, using much cheaper models like Hunter Alpha or Qwen.
The upshot is that with all these workstreams running simultaneously, you want a reasonably capable local machine. An M1 Mac mini with 16 GB of RAM would regularly max out and become very slow under this kind of load. That hasn't happened often with our Mini (Arnold's Mac).
Thanks for replying. So, three themes:
1. Mac minis are great home servers – that's a "CPU", price, and form-factor story
2. OSS models (particularly the Claude distillations) are becoming actually useful
3. Mac consumer hardware is surprisingly good at inference
And those themes are possibly converging, because if you need a beefy Mac to run OpenClaw, you also get hardware capable of running OSS models?
It seems that a lot hangs on what models you can get running on consumer hardware; right now only very small OSS models run on even the top-end MacBooks (irrespective of how good OSS models get). I see a few folks doing things like quantizing and routing to different layers of OSS MoE models, so that's probably a space to pay attention to.
The main reasons I can think of to run something locally are speed, privacy, or cost. Cloud is always going to win on cost (spare local compute is marginal), and there's a long way to go to compete on speed (local inference is currently a couple of orders of magnitude behind), so that just leaves privacy – and I don't think many people actually care about that.
Densing Law: capability density in smaller/open-source models doubles every 3.3–3.5 months (R² = 0.93). (https://www.nature.com/articles/s42256-025-01137-0)
So on that basis, you could get an Opus 4-quality model running on a 24 GB MacBook by the middle of next year – maybe. Obviously this improvement will slow down at some point.
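The extrapolation above can be sketched as a doubling calculation. The current memory footprint for an Opus-4-quality model is my assumption (the comment doesn't give one); only the doubling period comes from the cited paper:

```python
import math

# Extrapolating the ~3.3-3.5 month capability-density doubling ("Densing Law").
CURRENT_SIZE_GB = 400   # assumed memory an Opus-4-quality model needs today
TARGET_SIZE_GB = 24     # the MacBook RAM budget from the comment above
DOUBLING_MONTHS = 3.4   # midpoint of the 3.3-3.5 month range

# Each doubling halves the memory needed for the same capability.
doublings = math.log2(CURRENT_SIZE_GB / TARGET_SIZE_GB)
months_out = doublings * DOUBLING_MONTHS

print(f"{doublings:.1f} doublings -> ~{months_out:.0f} months out")
```

Under those assumptions the answer lands around 14 months, i.e. roughly the middle of next year – which is also why the conclusion is so sensitive to the assumed starting footprint and to the doubling rate holding.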
But I'd expect our Mac Studio (only 96 GB of RAM) to be able to run a Qwen 3.5 270B-quality model by July. I agree it feels quite far out, but I also think we're getting to the point that, for many tasks we do daily, we might not need much better than what we currently have.
Thanks for the share!
Re Qwen 3.5, I have good news for you: today, someone already got it running on a MacBook at 10 tokens/sec.
https://x.com/jtdavies/status/2034805696240218555
I've been running a headless Mac Mini for my AI agent for about a month now. The hardware story changes completely when you stop thinking of it as a computer and start thinking of it as infrastructure.
The transition was rougher than expected, though. Screen capture and UI automation break immediately without a virtual display. Tailscale for remote access, passwordless sudo, sleep disabled – once those are in place it runs like a server, but with native macOS apps.
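For reference, the server-ification steps mentioned above look roughly like this. This is a sketch, not the commenter's actual setup – verify each command on your own machine, and note that the `agent` username is a placeholder:

```shell
# Disable sleep so the Mini stays up as a server.
# pmset flags: sleep/displaysleep in minutes (0 = never), disablesleep hard-blocks it.
sudo pmset -a sleep 0 displaysleep 0 disablesleep 1

# Tailscale for remote access (Homebrew package name assumed; the GUI app
# also works and manages the daemon for you).
brew install tailscale
sudo tailscale up

# Passwordless sudo for the agent's account (replace `agent` with your user).
echo 'agent ALL=(ALL) NOPASSWD: ALL' | sudo tee /etc/sudoers.d/agent
sudo chmod 0440 /etc/sudoers.d/agent
```

The virtual-display piece for screen capture still needs a separate tool (a dummy HDMI dongle or a software virtual display); there's no single built-in command for it.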
The cost argument is real too. A VPS at equivalent specs runs $150+/month, so the Mac Mini pays for itself in under a year. What I didn't expect: the M4 handles 7B-parameter models locally without breaking a sweat, running inference, cron jobs, and browser automation simultaneously.
Mac Mini running OpenClaw is one thing. Another is the MacBook Neo. Really good battery life, perfectly capable for everyday office work and even things like 4K video editing. And the price.
Considering how enthusiastically Microsoft has been making Windows worse for months and years, I wonder if we're going to see a huge shift away from Windows PCs.
The Mac is now affordable, Valve continues to do a fantastic job of eating away at Windows for gaming, and Linux in general is becoming better and more user-friendly.
Is there any reason at all to use Windows anymore, unless there's an absolutely necessary legacy piece of software that doesn't work on any other platform?
I don't know. Apple has a good share of corporate laptops (at least in the creative industry) – laptops that may well be replaced by Mac minis, just as workers are replaced by AI agents (a simplification). Still, I don't see how this would be different from cannibalization; I wouldn't call an AI strategy based on hardware sales very smart. Not to mention smartphones: there is essentially no AI inside iOS. Google is far ahead on this.
So betting that the AI-transition value could be captured through a Mac mini boom, while iPhones (and their accessories), MacBooks, and probably also Apple Watches and iPads lose market share due to the lack of GenAI OS integration – well, I don't see that as a smart move.
Google does have many advantages. But it isn't the iOS ecosystem, for those who care about that.
It is far from proven that Google's future hardware will be desirable or usable and that Apple's won't be. If I had to bet on one of them producing hardware we cared about, it would be Apple.
The stampede thesis will be led by Apple in a starfish hybrid model.