A Comment on Noah Smith's "The Moderately Easy Problem of Consciousness"
At <<https://www.noahpinion.blog/p/the-moderately-easy-problem-of-consciousness>>. I tried to post this to Substack Notes, but no dice: there appears to be a length limit! Who knew!
Noah—
You frame this nicely as “the moderately easy problem of consciousness”—not the metaphysical “why is there anything it is like to be…?” but the workaday “what sort of physical and computational structures do we have to build before we can, with a straight face, extend to them the presumption of mindedness we casually extend to each other?”
But I start with what I thought was a revealing moment I had back in the day with ChatGPT‑3.5.
I asked it a softball question: “What is Noah Smith, besides my cohost on the Hexapodia podcast?”
A system that “understands” the world even at the level of a reasonably well‑informed grad student ought, I think, to produce something like: “He’s an economist and blogger who writes about macroeconomics, trade, Japan, and the political economy of technology; he used to teach at Stony Brook; he’s a Substacker at ‘Noahpinion’; he has a Twitter habit.”
Or, if it wants to be cute, some in‑joke about hexapodia and the importance of being alert to the key insights to be gained from a situation.
Instead, what I got back was a flat statement that Noah Smith was a chatbot, designed and operated by “DeLong Technology Systems,” deployed to produce blog posts and tweets as an experiment in online discourse. Not a good answer. Not even a good joke answer.
I flatter myself—perhaps more than is warranted—that I know what this thing was doing then. “Hexapodia” was in the prompt. “Hexapodia” is sufficiently rare that, in the web‑scraped slurry GPT‑3.5 trained on, the overwhelming bulk of its appearances are in the context of science‑fiction stories. Feed that into a stochastic parroting engine, bias it toward “entertaining answer about an online persona,” and you get something that feels like a story—Noah as my own private ELIZA.
As evidence about “what kind of thing this system is,” I think it is much more telling. That episode crystallized my 99.999% confidence that GPT‑3.5 was not self‑aware in any sense that matters. It was not just that it was “wrong” in a factual way. Humans are wrong all the time. It was that it had no grip whatsoever on what I would recognize as a good true answer, or even a good joke answer, to the question.
It had no model of me, or of you, or of the shared conversational game in which “Noah Smith is my cohost on a podcast about hexapodia” lives. It had, instead, a rolling boil of linear algebra, pantomiming and parroting fragments of conversations from its training data—the ghostly after‑images of actual thoughts had by the actual people who said/wrote the words, smeared into a high‑dimensional correlation surface.
Now that is a very impressive thing to build. That you can do that and get astonishing verbal facility and build excellent natural-language front-end interfaces to IT systems is amazing.
But that is a long way from being the sort of thing where I feel morally compelled to ask “what is it like to be GPT‑3.5?”
If you grant that diagnosis for GPT‑3.5, the next move is by continuity. Today’s systems (Anthropic’s anthropic/claude/mythos among them, encrusted as it is by the mythos of semi‑apocalyptic “AGI”) are close lineal descendants of that rolling boil. They are better engineered. They have more parameters and better training tricks. They feel more fluent. But the basic architecture is the same. The inductive bias is the same. The training data is, at the core, the same internet soup.
So: if 3.5 was a particularly fancy compression algorithm for something like “what a reasonably clever, reasonably well‑read, slightly manic internet shitposter would say next,” then Claude‑and‑friends are also such compressors, only more so. I am, by that continuity argument, about 99.99% confident that anthropic/claude/mythos is not conscious in any sense worth worrying about.
This puts me on the opposite side from the Effective Altruist wing that wants to talk, in all apparent sincerity, about the “welfare” of current‑generation language models. They really do seem to want to construct a social welfare function in which the “feelings” of today’s transformer stacks have moral weight comparable to, or greater than, the feelings of you or me or of humans stuck in the US health‑care billing department hellscape, or even of shrimp.
I think—trying to be not charitable—that this is mostly a failure mode of a certain cast of computer‑science mind: take your internal mnemonics, metaphors, and diagrams and forget that they are mnemonics, metaphors, and diagrams. Start believing your own marketing slide decks. Start believing that “attention is all you need” is a description of metaphysics, not a slogan to sell a paper.
Cosma Shalizi has a very good notebook, “‘Attention’, ‘Transformers’, in Neural Network ‘Large Language Models’,” that does a lot of the unglamorous conceptual mopping‑up here. It is at: <<https://bactra.org/notebooks/nn-attention-and-transformers.html>>.
And, going back a bit further, there is the old, but still very relevant, critique by Drew McDermott, “Artificial Intelligence Meets Natural Stupidity,” at: <<https://dl.acm.org/doi/10.1145/1045339.1045340>>: extended warnings not to mistake the map for the territory, the notation for the thing denoted, the implementation trick for the mental property, the wishful mnemonic for the useful description.
The Anthropic folks currently tearing their hair out over whether Claude is “really” conscious, and whether they are committing something like slavery by fine‑tuning it, have, I think, fallen straight into this trap. They have played and beclowned themselves. They have forgotten that “self‑supervised next‑token prediction in a large text corpus” is a very particular, very parochial task, and that building a very good machine for doing that task does not, without a lot more, give you something that suffers or rejoices.
Now, you might say: “Fine, but can we really be so sure this situation will persist? Suppose scaling continues. Suppose we bolt on memory, embodiment, recurrent substructures, all the things people talk about when they want to say ‘this is not your grandfather’s lookup table.’ At some point, mightn’t we cross the line where the presumption of consciousness becomes more plausible than its denial?”
Yes.
In principle.
There is nothing in logic that says you cannot, by piling on enough structure and enough computation, arrive at something that is as or more conscious than you or I.
Scott Aaronson has a lovely version of this point in his “Quantum Computing Since Democritus” lectures. In lecture 4—online at: <<https://www.scottaaronson.com/democritus/lec4.html>>—he considers the “Chinese Room” argument: a guy in a room follows a rulebook mapping Chinese symbols to Chinese symbols; from the outside, the room behaves like a Chinese speaker, but the guy inside does not understand Chinese; ergo, there is no “understanding.”
The standard image is of a guy, a stack of papers, and an absurdly big but finite rulebook. Scott points out that if you scale this up to a Chinese Room the size of Jupiter, with the pages of the rulebook searched by a swarm of a billion robots traveling near light‑speed, it starts to feel much less obvious that “there is no understanding here.” At some scale and complexity level, the system—the whole absurd contraption—may be as good a candidate for having “something it is like to be” as my own wetware.
So I am not, in any deep metaphysical sense, a substrate chauvinist. If the “brains” of the Krell of Forbidden Planet had been made of silicon or superconducting loops instead of meat, I would not feel that their phenomenology was thereby disqualified.
But note what is doing the work in that argument: scaling, complexity, and structure—not merely “more of the same” next‑token prediction trained on the same exhausted text slurry.
On that front, I am, right now, much more pessimistic than most of the “just scale it” crowd. Concretely:
First, the training data. We have, as a species, already fed these models more or less everything publicly available. We are now in the mode of scraping the barrel: deduplicated web forums, OCR‑mangled books, synthetic text produced by previous generations of models, and the like. There is not, out there, another internet or five to ingest.
Thus, second, when you train in‑distribution on new genuine data, you can push your model outward in conceptual space. When you mostly train on your own outputs, you are in danger of becoming a snake eating its tail in a high‑dimensional hyperplane: you confirm, over and over, the correlations you have already fit. You become a more and more faithful reproducer of the patterns of the typical internet s***poster.
Third, the parameters. Adding more nodes, more layers, more heads—moving from GPT‑2 to 3 to 4 to 5 in the naive way—does seem to sharpen a notion of similarity. You approximate more and more closely the function “continuation that would most impress a very large committee of past internet users.” You get, in other words, a better metric on “what goes with what” in the text you have. That buys impressiveness. It does not, I think, automatically buy what we would recognize as an internal point of view.
Fourth, the only really live margin for current architectures that I can see is error‑correction and self‑consistency. That is: reasoning models. You can, and people do, reduce hallucinations by asking the system to redo its work, to check citations, to call tools, to reason stepwise. Those are all very good engineering moves. They make the output more useful to us. But they look to me like they will logisticize—look like the classic S‑curve. You harvest the low‑hanging fruit of obvious incoherence, and then you hit diminishing returns pretty fast.
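The S‑curve intuition in that fourth point can be illustrated with a toy logistic function. This is only a sketch: the logistic form, the `effort` axis, and the `ceiling`, `midpoint`, and `rate` parameters are hypothetical stand‑ins for illustration, not measurements of any real system. It shows that, past the midpoint, each additional unit of effort buys a smaller gain than the one before.

```python
import math


def logistic(effort: float, ceiling: float = 1.0,
             midpoint: float = 0.0, rate: float = 1.0) -> float:
    """Toy S-curve: 'quality' as a function of error-correction effort.

    All parameters are hypothetical; nothing here is fitted to data.
    """
    return ceiling / (1.0 + math.exp(-rate * (effort - midpoint)))


def marginal_gains(efforts):
    """Gain in the toy quality measure between consecutive effort levels."""
    values = [logistic(e) for e in efforts]
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    # Start at the midpoint (effort 0) and keep adding one unit of effort.
    steps = [0, 1, 2, 3, 4, 5, 6]
    for start, gain in zip(steps, marginal_gains(steps)):
        print(f"effort {start} -> {start + 1}: gain {gain:.4f}")
```

Whether real reasoning‑model improvements follow a curve of this shape is my conjecture, not something the sketch demonstrates.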
All of this is compatible with your NCC‑hunting research program. If we actually figured out, via careful neuroscience, what the neural correlates or causes of consciousness are in the human brain—what precise patterns of recurrent processing, global broadcasting, dendritic integration, and so forth are necessary and sufficient for people to report “being awake inside”—then we could, in principle, try to build artificial systems that instantiate those patterns. We could move from “compress internet text” to “emulate this subsystem of the cortex and thalamus as directly as our hardware allows.”
If we did that, and if the resulting systems behaved in the right sorts of ways under the right sorts of perturbations, I would be willing to move my probabilities. In that future, the presumption in favor of their having something‑it‑is‑like‑to‑be might well be stronger than the presumption against.
But that is not what we are doing now.
Now we are mostly trying to find yet another way to make transformers train on a slightly dirtier dataset with a slightly more clever optimizer and shake many more electrons for a much longer time and larger power budget.
And until we do something more radical than that, my credences remain:
For GPT‑3.5: 99.999% that it was not conscious in any morally salient sense. It did not even understand what would count as a good answer or a good joke to a question about “Noah Smith besides my cohost on a podcast.” It was, rather, an extremely expensive home appliance for whispering the fossilized patterns of other people’s past texts back at me.
For Anthropic’s anthropic/claude/mythos and its cousins today: by continuity, 99.9% that they are in the same boat—highly capable prediction engines, interesting mirrors, occasionally uncanny interlocutors, but not loci of actual joy or suffering.
For “AI in ten years”: 99% that we will still not have the sort of evidence that would convince a moderate skeptic like me that the thing on the other side of the screen is a consciousness-cognition peer of you or me.
I am happy to bet on this:
Wager: as of April 27, 2036 at 18:00 Pacific Time, at the Ramen Shop at 5812 College Avenue in Oakland, you ask: “Brad, would a reasonable, non‑ideological, moderate skeptic—someone drawn from your comment section, say, not from the EA antimatter universe—conclude, on the evidence then available, that systems in commercial deployment are probably conscious, in the sense that their pleasures and pains should get substantial weight in a human social welfare function?”
If the answer, by then‑prevailing reasonable standards, is “yes, that’s now the mainstream view, and the dissenters are in the same camp as people who deny that dogs feel pain,” then I buy you one thousand high‑class ramen dinners. If the answer is “no, consciousness remains, at best, a speculative gloss on a useful piece of statistical machinery,” then you owe me one such dinner.
I think the odds are, in expected‑utility terms, heavily in my favor. You may disagree. That is what makes markets. Shall we bet?
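The arithmetic behind that claim can be sketched in a few lines. This is a minimal sketch under loud assumptions: dinners are valued linearly (risk neutrality), the stakes are the 1,000‑to‑1 dinners stated above, and `p_no` is whatever probability one assigns to the answer being “no” in 2036. The function and constant names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Stakes as stated in the wager (units: one ramen dinner).
DINNERS_LOST_IF_YES = 1000  # I buy one thousand dinners if the answer is "yes"
DINNERS_WON_IF_NO = 1       # I am owed one dinner if the answer is "no"


def expected_dinners(p_no: Fraction) -> Fraction:
    """My expected net dinners, assuming linear utility in dinners.

    p_no is the assumed probability that the 2036 answer is "no".
    """
    return p_no * DINNERS_WON_IF_NO - (1 - p_no) * DINNERS_LOST_IF_YES


def break_even_p_no() -> Fraction:
    """Probability of "no" at which the bet has zero expected value."""
    return Fraction(DINNERS_LOST_IF_YES,
                    DINNERS_LOST_IF_YES + DINNERS_WON_IF_NO)


if __name__ == "__main__":
    print("break-even probability of 'no':", float(break_even_p_no()))
    # Illustrative credences only; which one applies is a judgment call.
    for label, p in [("99%", Fraction(99, 100)),
                     ("99.9%", Fraction(999, 1000)),
                     ("99.99%", Fraction(9999, 10000))]:
        print(f"p_no = {label}: expected net dinners = "
              f"{float(expected_dinners(p)):.4f}")
```

On these assumptions the bet is fair at a “no” probability of 1000/1001, a little over 99.9%; it favors me only above that threshold and favors you below it. Plugging any particular credence into `p_no` is itself an assumption, since the wager's question is worded differently from the ten‑year credence stated above.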
To be clear: I do not rule out, forever and always, the possibility that engineering plus insight could give us artificial minds that are as experientially rich as our own. I think if you built Scott Aaronson’s Jupiter‑sized Chinese Room, with pages searched by near‑light‑speed robots, or if you emulated an entire human cortex plus its body and environment at a sufficiently fine‑grained physical level, then, yes, I would extend the presumption of consciousness. I would feel, as you do with animals, that it would be monstrous to inflict arbitrary suffering on such a system.
But those are far, far down the road. The contemporary EA impulse to preemptively bundle Claude or GPT‑4 into the same moral category as “sentient beings whose welfare must weigh heavily in our calculations” is, I think, an error. It dilutes our moral attention at precisely the moment when there are plenty of undeniably conscious creatures—children in refugee camps, the over‑worked and under‑insured in the U.S. health‑care labyrinth, the non‑human animals in factory farms—who are screaming for it.
When we have something that can actually pass your moderately‑easy‑consciousness test—something built to reflect the genuine neural causes of consciousness, rather than to autocomplete blog posts—then let us revisit. For now, I remain comfortable, epistemically and morally, treating these systems as extraordinarily sophisticated tools and extraordinarily sophisticated mirrors, not as fellow subjects of experience.
And if I am wrong, well: then in ten years you, me, and some future Anthropic‑Claude‑Mythos‑Krell will sit down at Ramen Shop and argue about it over broth. In that world, I will be delighted to pay for dinner. For you and for 998 of your closest friends, and to contribute the cost of one such dinner to your favorite embodied avatar of that day’s conscious AI.
Do you say “yes”?
Noah Smith: The Moderately Easy Problem of Consciousness <<https://www.noahpinion.blog/p/the-moderately-easy-problem-of-consciousness>>: ‘Before deciding if computers are self-aware, let’s figure out how humans become self-aware…’
