ChatGPT, Claude, Gemini, & Co.: They Are Not Brains, They Are Kernel-Smoother Functions

If your large language model reminds you of a brain, it’s because you’re projecting—not because it’s thinking. It’s not reasoning; it’s interpolation. And anthropomorphizing the algorithm doesn’t make it smarter—it makes you dumber…

This is not an annoyance, but rather a cavil—for I think that the very sharp Scott Cunningham has gone down the wrong track here:

Scott Cunningham: Inside the “Brain” of Claude <https://causalinf.substack.com/p/inside-the-brain-of-claude>: ‘Modern AI models like Claude…. When asked to write a rhyming poem… it activates features representing potential rhyming words before beginning the second line… “grab it”… “rabbit” and “habit”. It then constructs the second line to lead naturally to the planned rhyme…. Claude doesn’t generate text “one token at a time”—it’s actively planning future content and working backward to create coherent structures…. LLMs are performing… abstract reasoning, planning, and metacognition…. Two years ago, I would’ve thought that was impossible…. Large language models are complex systems with emergent properties—much like biological organisms…’

I think Scott goes awry at word three of his title: “Brain”. In my view, LLMs are still much too simple for words like “reasoning”, “planning”, and “metacognition” to be useful terms to apply—even purely metaphorically—to their behavior. It leads to much more insight, I think, to start from recognizing that trained neural networks are extremely flexible interpolation functions from a domain (in this case, a string of words) to a range (in this case, the continuation words). They have a training dataset that is sparse in the domain—unless a string of words is a quote, a cliché, or boilerplate, by the time the string reaches twenty words long there is less than a 0.1% chance that it has ever been written down before. For prompts that are in the training dataset, what your flexible function should do is obvious: you simply return the continuation.
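A back-of-the-envelope calculation makes the sparsity claim concrete. The figures below, a working vocabulary of roughly ten thousand common words and on the order of 10^14 twenty-word windows in everything humanity has ever written, are loose assumptions of mine rather than numbers from anywhere in particular, but the conclusion barely depends on them:

```python
# Rough sparsity arithmetic: how likely is it that a given twenty-word
# string has ever been written down before? Every input is an assumption.

vocabulary = 10_000        # assumed working vocabulary of common words
string_length = 20         # words per string

# Distinct twenty-word strings you could in principle compose:
possible_strings = vocabulary ** string_length       # 10**80

# Generous guess at the stock of twenty-word windows humanity has written:
written_windows = 10 ** 14

fraction_ever_written = written_windows / possible_strings
print(f"possible twenty-word strings: {possible_strings:.1e}")
print(f"fraction ever written (upper bound): {fraction_ever_written:.1e}")
# ~1e-66: vanishingly far below 0.1%, so a non-cliché prompt is
# almost never in the training data verbatim.
```

However generously you count the written record, the space of possible twenty-word strings dwarfs it, so a non-cliché prompt is essentially never in the training data word-for-word.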

But for everything else you have to interpolate, somehow. The Deep Magic of the LLMs is in that interpolation process and in the shape of the training data. And thinking “BRAINS!!!” actually, I think, makes it harder to gain insight into why they behave the way they do. Scott’s exposition of Lindsey et al. does say interesting things about the “how”. But I want to know the “why”. And even the statements and findings about the “how” are, I think, corrupted into near-uselessness by the “BRAINS!!!” frame.
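To make the “flexible interpolation function” framing concrete, here is a minimal sketch of a kernel smoother over prompts, assuming a toy bag-of-words embedding and a Gaussian kernel of my own choosing (a real LLM’s learned embeddings and attention machinery are vastly more elaborate, and nothing below is how any actual model is implemented): it embeds the prompt, weights every stored training continuation by its similarity to the prompt, and returns the continuation with the highest weighted score.

```python
import math
from collections import Counter

# Toy kernel-smoother "language model": it does not reason, it interpolates
# among stored (prompt, continuation) pairs. The embedding and kernel are
# illustrative stand-ins, not how any real model works.

def embed(text: str) -> Counter:
    """Crude bag-of-words embedding (real models learn theirs)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def kernel(similarity: float, bandwidth: float = 0.25) -> float:
    """Gaussian kernel on (1 - similarity): nearby prompts dominate."""
    return math.exp(-((1.0 - similarity) ** 2) / (2.0 * bandwidth ** 2))

# A tiny stand-in "training set" of prompt -> continuation pairs.
training = [
    ("the cat sat on the", "mat"),
    ("the dog sat on the", "rug"),
    ("once upon a", "time"),
]

def continue_prompt(prompt: str) -> str:
    """Weight every stored continuation by kernel similarity to the prompt.

    A prompt that is in the training set gets weight ~1 for its memorized
    continuation; anything else gets an interpolated best guess."""
    query = embed(prompt)
    scores = Counter()
    for stored_prompt, continuation in training:
        scores[continuation] += kernel(cosine(query, embed(stored_prompt)))
    return scores.most_common(1)[0][0]

print(continue_prompt("the cat sat on the"))        # memorized: "mat"
print(continue_prompt("a small cat slept on the"))  # interpolated: "mat"
```

The point of the sketch is the functional shape: a prompt that sits in the training data gets its memorized continuation back with weight near one, and everything else gets a weighted blend of the nearest stored examples, which is interpolation, not reasoning.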


Why do I think this?

Well, yesterday I had a Chicago-style citation:

Machiavelli, Niccolò. 1513 [2008]. “Letter to Francesco Vettori, December 10, 1513”. In The Prince. Trans. & ed. Peter Bondanella. Pp. 109–113. Oxford: Oxford University Press.

And so I gave ChatGPT a task:

Q: Please get me the book from archive.org, and spell out the URL please.

My question asks ChatGPT to find a string of symbols, beginning with “archive.org/”, such that when these symbols are entered into the address bar of a web browser and the “return” key is pressed, the web browser loads a file that is a scanned-and-OCRed digital version of the print book published in 2008 by Oxford University Press that is the version of Machiavelli’s Il Principe translated and edited by Peter Bondanella.

The correct string of symbols to return to accomplish this task is “archive.org/details/n…

But this is what ChatGPT gave me when I assigned it the task:
