Well, I Can Hammer the CPUs & the GPUs & the NPUs of an M4Max MacStudio After All!: Chart of the Day

A world full of dark compute as the journey from ENIAC to Nvidia built a world of idle transistors; or, your MacStudio vs. the planet’s datacenters: a thought experiment in silicon and power…

Well, I Can Hammer the CPUs and the GPUs and the NPUs of an M4Max MacStudio After All!! AI—excuse me, GPT LLM MAMLM—workloads will do it, in this case OCRing an entire book in French and translating it into English:

It took five hours. The reliability and fidelity of the translation are, so far, unknown. I will report back.


How much computation can an M4Max MacStudio with 36GB of memory do, according to some reasonable metric that combines integer, floating point, parallel, and signal processing workload performance?

As best as I can tell, think of the beast as capable of delivering 100,000,000,000,000 8-bit equivalent-computation operations per second: one hundred trillion, 100 TOPS. <https://flopper.io/gpu/apple-m4-max> <https://en.wikipedia.org/wiki/Apple_M4>.

In 1946, there were a handful of ENIACs: not 100,000,000,000,000 but rather 5000 OPS: 0.00000000005, 5 x 10^(-11), as much computation potential as exists burbling away under the table in the corner of the dining room, and that is not counting the watch and the iPhone currently charging, the laptop on my lap that I am typing this on, or the iPad in the corner.

  • 1966: Guess at 5 x 10^8 OPS (compared to my living-dining room’s 10^14 OPS)? By 1966 Seymour Cray’s CDC 6600 was on the scene, with performance quoted at “up to three megaFLOPS” from its 400,000 transistors <https://en.wikipedia.org/wiki/CDC_6600>. We had then a few dozen CDC 6600-equivalent machines, plus a few hundred lesser mainframes.

  • 1986: Guess at 10^13 OPS available? 1986 sees Hilbert-López’s estimate <https://www.martinhilbert.net/worldinfocapacity-html>. That is 1/10 as much as I have around me.

  • 2006 sees Hilbert-López say: 10^5 TOPS, the world then having 1000 times as much computational capability as I have.

  • And for 2026? Today? Maybe 10^10 TOPS? Maybe I have 1 x 10^(-8), one hundred-millionth, of the world’s computational capacity in my living-dining room?
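The ratios in the timeline above can be checked with a few lines of Python. This is only a sketch of the arithmetic: every number is one of the order-of-magnitude guesses from the text, the names `WORLD_OPS` and `world_vs_me` are made up for illustration, and it assumes (as the text does) that OPS, FLOPS, and TOPS figures can be put on one rough scale.

```python
# Order-of-magnitude guesses from the text, in operations per second (OPS).
# Assumption: 1 TOPS = 1e12 OPS, and all figures share one rough scale.
TOPS = 1e12
MY_MACHINE_OPS = 100 * TOPS  # the M4 Max Mac Studio, about 100 TOPS

WORLD_OPS = {
    1946: 5e3,           # a handful of ENIACs
    1966: 5e8,           # CDC 6600 era guess
    1986: 1e13,          # Hilbert-López era guess
    2006: 1e5 * TOPS,    # Hilbert-López: 10^5 TOPS
    2026: 1e10 * TOPS,   # today's guess: 10^10 TOPS
}

def world_vs_me(year):
    """World capacity in `year`, as a multiple of one Mac Studio."""
    return WORLD_OPS[year] / MY_MACHINE_OPS

for year in sorted(WORLD_OPS):
    print(f"{year}: world = {world_vs_me(year):.0e} x my machine")
```

The inverse of the 2026 multiple is the one hundred-millionth share quoted in the last bullet.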

That would seem low. The average one of the roughly 10^10 humans has 1/100 of the computing power that I have? Yet I think that is about right, because the overwhelming chunk of computational capability is not in individual people’s houses. My “share” consists not just of what is in my house, but of a pro-rata tranche of all the machines on the Berkeley campus and in the cloud that I interact with and that watch me, plus those machines that I do not interact with but that are dedicated to tasks aimed ultimately at selling things to or communicating with me. If my personal machines are a tip-of-the-iceberg tenth, then I have 1000 times the average computational capability as my “share”, which feels like a not unreasonable guess.

So 10^(-8) of the world’s computational capability, 100 of the world’s 10^10 current TOPS, feels like not a bad anchor for thinking about this.
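As a rough check on the per-person reasoning, here is a minimal sketch. The inputs are the guesses from the text (10^10 TOPS, roughly 10^10 humans, 100 TOPS at home, personal machines as a tenth of my full "share"); the variable names are hypothetical.

```python
# All inputs are the text's guesses; variable names are illustrative only.
WORLD_TOPS = 1e10         # guessed world capacity today
POPULATION = 1e10         # "roughly 10^10 humans"
MY_TOPS = 100             # what sits in my living-dining room
PERSONAL_FRACTION = 0.10  # personal machines as a "tip-of-the-iceberg tenth"

average_tops = WORLD_TOPS / POPULATION        # capacity per human
home_fraction = MY_TOPS / WORLD_TOPS          # my machines vs. the world
home_vs_average = MY_TOPS / average_tops      # my machines vs. the average
full_share_tops = MY_TOPS / PERSONAL_FRACTION # home plus campus and cloud tranche
share_vs_average = full_share_tops / average_tops

print(f"average per human: {average_tops:g} TOPS")
print(f"home machines: {home_fraction:g} of world, {home_vs_average:g}x average")
print(f"full share: {full_share_tops:g} TOPS, {share_vs_average:g}x average")
```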


Note that the twenty years after 2006 have seen a true phase change. Three things happened:

  1. Smartphones, PCs, and IoT turned not quite everyone into a computer owner.

  2. Hyperscale data centers and AI accelerators turned a few firms into de facto stewards of the world’s serious compute.

  3. Most recently, “AI”.

With respect to (3), Epoch AI right now is trying to do for AI chips what Hilbert did for all computing. They guess that “total available computing capacity from AI chips across all major designers has grown by approximately 3.3× per year since 2022,” with NVIDIA at over 60 percent of total. <https://epoch.ai/data-insights/ai-chip-production>. The Chip Letter has an alternative rough count of (a) the TOP500 supercomputer list’s aggregate Rmax (about 5.2 exaFLOPS FP64 in mid‑2023), (b) public cloud CPU and GPU deployments, and (c) Nvidia H100 shipments in 2023, for a public‑cloud‑plus‑supercomputer total on the order of 25–30 exaFLOPS (FP64) in 2023 <https://thechipletter.substack.com/p/deep-thought-deep-learning-and-the>. A sound‑bite for now would be: roughly half of the world’s serious data center compute is now under the control of a couple of dozen “hyperscale” operators, and that share is rising.
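To get a feel for what "3.3× per year" means, here is a small compounding sketch. It assumes the growth rate is constant, which is my simplification and not something Epoch AI claims; the function names are made up for illustration.

```python
import math

GROWTH_PER_YEAR = 3.3  # Epoch AI's approximate annual growth factor

def growth_multiple(factor, years):
    """Total growth after `years` at a constant annual `factor` (an assumption)."""
    return factor ** years

def doubling_time_years(factor):
    """Years needed to double at a constant annual growth `factor`."""
    return math.log(2) / math.log(factor)

for years in (1, 2, 3):
    print(f"after {years} year(s): {growth_multiple(GROWTH_PER_YEAR, years):.1f}x")

months = 12 * doubling_time_years(GROWTH_PER_YEAR)
print(f"doubling time: about {months:.0f} months")
```

At that pace, AI-chip capacity would double roughly every seven months.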

These three things mean, of course, that issues of utilization and controllability come to the fore. Outside of the datacenters, compute is:

  • battery‑constrained,

  • idle or running trivial workloads most of the time, or

  • behind security walls that make it unusable for planetary‑scale distributed computing except in very special cases (Folding@home, crypto mining, etc.).

“Most of the world’s compute” now sits in people’s pockets and on their desks. But high-performance computing, intensively used cloud, and on-premises enterprise compute with solid workflows amount to (perhaps?) 4 x 10^8 of that 10^10 total TOPS. And that 4 x 10^8 is probably doing 80% of the real work at an average moment in time.

So the phase change from computational capability being truly scarce, hence on average intensively used, to it being much more abundant than we can think of ways to use the overwhelming bulk of it means that the real question I want to ask is a different one. Not “how much and what share of the world’s computational capability can I, personally, deploy here in my house?” Rather “how much as a share of the work that computers are doing all over the world could I do, personally, in my house if I could fill the pipeline and hammer my CPUs, GPUs, and NPUs 24/7?” So I should be comparing my 100 TOPS to the 5 x 10^8 TOPS of computational capability (4 x 10^8 divided by 80%) that, I guess, are in intensive use at any moment. That gives me:

2 x 10^(-7)

0.00002%

Two hundred-thousandths of one percent.

Or: 50,000 people like me, if we could truly get our acts together and fill our personal-machine pipelines, could do 1% of the work done by computers worldwide.

That is, truly, not too shabby.

Alternatively, turn that around: If we could fill the global distributed computational-capability pipeline (which we cannot), we could now be doing 20 times as much in the way of useful computation as we are. The clouds and hyperscalers are lit up and burning power. Your phone and mine are overwhelmingly sitting in our pockets, idle, waiting to be poked. As are all of my personal machines, which are, if I add in those of my immediate family, not 100 TOPS but rather 250 TOPS.
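The utilization arithmetic in this section is easy to get wrong by a few orders of magnitude, so here is a minimal sketch that recomputes it. All inputs are the guesses from the text; the names `share_of_work` and `machines_for_share` are hypothetical helpers, not anything from a library.

```python
# All figures are the text's order-of-magnitude guesses, in TOPS.
WORLD_TOPS = 1e10          # guessed total world capacity
CORE_TOPS = 4e8            # HPC, busy cloud, serious on-premises enterprise
CORE_SHARE_OF_WORK = 0.80  # guessed fraction of real work done by that core
MY_TOPS = 100              # one M4 Max Mac Studio
FAMILY_TOPS = 250          # my machines plus immediate family's

# Capacity in intensive use at any moment: 4e8 / 0.8 = 5e8 TOPS.
intensive_tops = CORE_TOPS / CORE_SHARE_OF_WORK

def share_of_work(tops):
    """Fraction of the world's intensively used compute that `tops` equals."""
    return tops / intensive_tops

def machines_for_share(fraction, tops_each=MY_TOPS):
    """How many fully loaded machines would match `fraction` of world work."""
    return fraction * intensive_tops / tops_each

# If every transistor were kept busy, how much more work could be done?
headroom = WORLD_TOPS / intensive_tops

print(f"my share of work: {share_of_work(MY_TOPS):.1e}")
print(f"as a percentage: {100 * share_of_work(MY_TOPS):.5f}%")
print(f"machines for 1% of world work: {machines_for_share(0.01):,.0f}")
print(f"headroom if all compute were busy: {headroom:.0f}x")
print(f"family share of work: {share_of_work(FAMILY_TOPS):.1e}")
```

Changing `CORE_SHARE_OF_WORK` or `CORE_TOPS` shows how sensitive the conclusions are to those two guesses.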

I am not sure what the implications of all this are. Below is a speculative and probably wrong take, one that I would be embarrassed to have distributed too widely (yet):
