Please: Enough with the Claims That Modern Advanced Machine Learning Models Hallucinate Only Rarely

Rise of the Sparrows, disappearance of the world model: what MAMLM hallucinations are trying to tell us. Treating large language models as semi-sober research assistants misunderstands what they are: compressed correlation engines, not minds. A modern chatbot confidently listed the “second volume” of a fantasy series as a book that doesn’t exist, producing a botched fantasy bibliography plus an unhinged plot summary. A week later, it produced a lovingly described but substantially imaginary state of Europe during World War II. This is what you deserve to get when you ask a correlation machine to impersonate a thinking entity with a world model substantially containing an accurate Visualization of the Cosmic All. You don’t just risk getting a detail wrong—you risk wandering into a parallel universe of invented novels and impossible borders. That leads one to ask what, exactly, these systems are doing under the hood. And all you get in response is the assurance that ASI is coming any moment now.

It requires very careful iterative prompting to get MAMLMs into conformity with reality—and often even that does not work well enough.

One example:

Share DeLong's Grasping Reality: Economy in the 2000s & Before

We don’t even have to get into the strange political boundaries of Axis allies. (Or just what is that country sandwiched between Poland and Romania supposed to be? And is that an Adriatic Republic separate from Italy spanning Venetia and Dalmatia? And what is that map stretching/projection anyway? And how has Turkey managed to conquer all the land up to the crest of the Caucasus Mountains?)

You look first at the absence of Vichy France (then in a situation parallel to Spain’s), second at the Nazi occupation of Sweden, third at the failure to register Finland as an active Nazi-allied belligerent, fourth at the independent Allied power of southern Norway, and fifth at the Nazi occupation of Great Britain north of historic Wessex.

Or there was the time last week when I asked it for the title of the second volume of Elisabeth Wheatley’s <https://www.elisabethwheatley.com/> “Daindreth” fantasy series:

The second volume of The Daindreth Saga is Daindreth’s Assassin.

To place it properly in context:

  1. Rise of the Sparrows — Volume I

  2. Daindreth’s Assassin — Volume II

  3. Daindreth’s Outrider — Volume III

  4. Daindreth’s Traitor — Volume IV

  5. Daindreth’s Champion — Volume V

Daindreth’s Assassin is where the series decisively pivots from courtly and dynastic intrigue into morally tangled covert action: espionage, assassination, and the slow construction of Daindreth as a ruler who understands that legitimacy and power are not the same thing. It is also the volume in which Thadred begins to matter in a structurally important way—no longer merely texture, but a lever on the plot.

If you would like, I can also sketch how Assassin differs tonally and structurally from Rise of the Sparrows, or map which secondary characters become load-bearing across Volumes II–IV.

Ummm… No. Simply not. The books are:

  1. Daindreth’s Assassin

  2. Daindreth’s Outlaw

  3. Daindreth’s Traitor

  4. Daindreth’s Sorceress

  5. Daindreth’s Empress

There is a related novel, Bastart’s Honor, and a novella, “The Archduke”. But there is no rise, no sparrows, no outrider, and no champion. And the plot/theme summary is invented bullshit as well.

  • This is puzzling

  • A chatbot MAMLM looks in its training data for a conversation like the one it is currently having:

    • It picks one “close” in terms of whatever flexible distance metric it has created.

    • It then jumps to that conversation.

    • It hands over control to whatever human was then writing.

    • It outputs the next token that that human wrote.

    • It then looks at the now-one-token-longer conversation, and does the same exercise again.

  • At every stage it is piggybacking on a human train of thought.

  • That’s all it can do: it is trained, when given a conversation from its training data, to replicate the next token as well as possible.

  • Yes: it is highly compressed. That compression does something—analogous to a blurry JPEG compression algorithm. But nobody has ever explained to me what that analogy consists of.

  • Yes, there is RLHF afterwards. But that is simply an attempt to poke it into a region of its training data that produces conversations that the MAMLM chatbot imagineers like.

  • Yes, there is “prompt engineering”. But that is simply an attempt to shape the conversation so that it gets into pieces of the training data that the users of the MAMLM chatbot find useful.
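The loop in the bullets above can be caricatured in a few lines of code. To be clear about what follows: this is a deliberately stylized sketch of the picture the bullets paint, not how a transformer actually works—real models generalize through learned weights rather than doing literal lookup, and the tiny “training corpus” and all function names here are invented for illustration:

```python
# Stylized caricature of the next-token loop: find the training context
# "closest" to the current conversation, copy the token the human wrote
# next, append it, and repeat. (Illustrative only; real models do not
# do literal lookup.)

TRAINING = [
    "the second volume of the series is".split(),
    "the first volume of the saga is".split(),
]

def overlap(a, b):
    # Crude stand-in for the "flexible distance metric":
    # count how many trailing words the two contexts share.
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def next_token(conversation):
    # Jump to the closest training context and hand control to
    # whatever human wrote the next token there.
    best, best_score = None, -1
    for doc in TRAINING:
        for i in range(1, len(doc)):
            score = overlap(conversation, doc[:i])
            if score > best_score:
                best, best_score = doc[i], score
    return best

def generate(conversation, steps):
    # Repeat the exercise on the now-one-token-longer conversation.
    for _ in range(steps):
        tok = next_token(conversation)
        if tok is None:
            break
        conversation = conversation + [tok]
    return conversation
```

Starting from “the second volume”, this toy loop dutifully continues “of the series”; starting from “the first volume”, it jumps to the other human’s text and continues “of the saga”. At no point does anything resembling a fact about the world enter the computation—only proximity in the training data.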

None of these help me understand where Rise of the Sparrows comes from.

Or where the words “Outrider” and “Champion” in book titles come from.

“Daindreth” is an odd word. My guess is that there are fewer than 6,000 webpages in which the words “Daindreth” and “Elisabeth Wheatley” both appear. A niche-but-not-tiny fantasy romance series typically generates: (i) one canonical page per retailer per edition/format, (ii) one canonical page per book-database entry, and (iii) a long tail of reviews, lists, reading trackers, scraped metadata mirrors, and forum/Reddit chatter. Who on the internet was ever having a conversation about Elisabeth Wheatley’s “Daindreth” series that then jumped to talking about Sarina Langer—whose first volume of Ar’Zac is called Rise of the Sparrows—for the MAMLM chatbot to glom onto?
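For what it is worth, the back-of-envelope arithmetic behind that under-6,000 guess can be made explicit. Every input number below is an illustrative assumption of mine, not a measurement:

```python
# Back-of-envelope count of webpages for a niche-but-not-tiny series.
# All inputs are guesses, chosen to be generous.

books = 7            # five novels + one related novel + one novella
retailers = 12       # guess: Amazon, B&N, Kobo, Apple, regional stores...
formats = 3          # guess: ebook, paperback, audiobook
databases = 8        # guess: Goodreads, StoryGraph, LibraryThing...
tail_per_book = 60   # guess: reviews, lists, trackers, mirrors, chatter

retailer_pages = books * retailers * formats   # one page per edition/format
database_pages = books * databases             # one entry per book per site
long_tail      = books * tail_per_book

total = retailer_pages + database_pages + long_tail
print(total)  # 728
```

Even with these generous guesses the count lands in the high hundreds—comfortably under 6,000, and a very thin slice of any training corpus.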

Lack of a world model, and attempts to substitute for that lack with correlation matrices, induce strange behavior indeed.

And I am not even going to try to imagine what in the training data could possibly have given rise to the map. I mean, I understand how in drawing hands you can get enthusiastic about fingers and keep drawing more of them. But this flummoxes me.

When a chatbot confidently tells you that Rise of the Sparrows is the first volume of a series it has just fabricated, or produces a map in which Turkey owns the Caucasus but Nazi panzertruppen drink from the Bosporus, and southern Norway fights on in company with historic Wessex alone, you are not seeing a tiny edge case. You are seeing the core logic of a system that has patterns instead of facts and correlations instead of a model of reality. This is how MAMLMs actually operate, why “compression” is a treacherous metaphor, and how RLHF and prompt engineering polish—but do not cure—the underlying tendency to make things up.

The conclusion is uncomfortable but important: without a world model, correlation matrices will always hallucinate—and often in ways we can’t predict and can’t prune out, unless we already know what the answers are that we are purporting to be trying to get.

If reading this gets you Value Above Replacement, then become a free subscriber to this newsletter. And forward it! And if your VAR from this newsletter is in the three digits or more each year, please become a paid subscriber! I am trying to make you readers—and myself—smarter. Please tell me if I succeed, or how I fail…

##please-enough-with-the-claims-that-modern-advanced-machine-learning-models-hallucinate-only-rarely
##subturingbradbot
#mamlms
#hallucinations
#language-models
#world-model
#daindreths-assassin
#rise-of-the-sparrows
#world-war-ii
#imaginary-maps
#prompt-engineering
#correlation-vs-reality