Mondays with the Machine: The Tongue & the Token: Language as Interface in Our Current Age of AI
Natural-language interfaces as quite possibly the most important thing about the current wave of MAMLMs. Our frontier MAMLMs are tools that “understand” our goals without understanding our minds, or indeed having minds themselves. LLMs aren’t smart, but they are useful—and that’s enough. What are the implications of conversing with machines that simulate and mimic conversation so well that natural language becomes our information-technology interface of choice? And if these tools don’t really think and converse, what are we really doing when we talk to them?…
The paradox and promise of conversational AI.
Begin with the simple, yet deceptively deep, observation: most people are not good at computer science. They should not have to be. For most of history, we have interfaced with our tools—shovels, typewriters, assembly lines, and spreadsheets—not through formal logic or recursive abstraction, but through intuitive interaction, gesture, routine, and one other thing. What is that one other thing? It is the most important and the most fundamental thing: language.
Thus it matters a lot that what the current generation of "AI"—specifically, the GPT LLMs (General-Purpose Transformer Large-Language Models) that make up so much of MAMLMs (Modern Advanced Machine-Learning Models)—promises is the incorporation of natural language into the center of our human-machine interface.
That is, I think, a very big deal.
First, however, a parenthesis: This is not a big deal because "AI" as we now or will soon know it is smart. It is not. It is, I think, still far on the John Searle side of the "Chinese Room" divide and not on the Scott Aaronson side. What do I mean by this? Here is how Scott puts it:
In the last 60 years, have there been any new insights about the Turing Test itself? In my opinion, not many. There has, on the other hand, been a famous "attempted" insight, which is called Searle's Chinese Room.... Let's say you don't speak Chinese. You sit in a room, and someone passes you paper slips through a hole in the wall with questions written in Chinese, and you're able to answer the questions (again in Chinese) just by consulting a rule book. In this case, you might be carrying out an intelligent Chinese conversation, yet by assumption, you don't understand a word of Chinese!...
I don't know whether the conclusion of the Chinese Room argument is true or false. I don't know what conditions are necessary or sufficient for a physical system to "understand" Chinese—and neither, I think, does Searle, or anyone else. But... the Chinese Room argument... gets... mileage from... choice of imagery... sidestep[ping] the entire issue of computational complexity.... We're invited to imagine someone pushing around slips of paper with zero understanding or insight....
But how many slips of paper are we talking about? How big would the rule book have to be, and how quickly would you have to consult it, to carry out an intelligent Chinese conversation in anything resembling real time? If each page of the rule book corresponded to one neuron of a native speaker's brain, then probably we'd be talking about a "rule book" at least the size of the Earth, its pages searchable by a swarm of robots traveling at close to the speed of light. When you put it that way, maybe it's not so hard to imagine that this enormous Chinese-speaking entity that we've brought into being might have something we'd be prepared to call understanding or insight...
You start by having training data composed of word-sequences followed by their next-word continuations. You use this to construct a function from the set of word-sequences to next-word continuations via sophisticated interpolation, since your training dataset is sparse in the space of word-sequences that is the domain of your function. But the only intelligence is in the humans who wrote the things that are the training data—the word-sequences and then the next-words. This is only autocomplete-on-steroids. And then... somehow... as the system becomes more complicated it is no longer a guy in a Chinese Room but rather an AI-entity the size of the Earth serviced by a large swarm of lightspeed robots, and it understands, is intelligent, thinks.
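To make the "autocomplete-on-steroids" point concrete, here is a minimal toy sketch in Python: a lookup-table next-word predictor with crude back-off, standing in for the "sophisticated interpolation" that a real transformer does with billions of parameters. The function names and the toy corpus are mine, for illustration; nothing here is how a frontier model is actually built.

```python
# Toy next-word predictor: trained on word-sequences and their
# continuations, it can only recombine what humans already wrote.
from collections import Counter, defaultdict

def train(corpus: list[str], n: int = 3) -> dict:
    """Count which word follows each (n-1)-word context."""
    model = defaultdict(Counter)
    for text in corpus:
        words = text.split()
        for i in range(len(words) - n + 1):
            context, nxt = tuple(words[i:i + n - 1]), words[i + n - 1]
            model[context][nxt] += 1
    return model

def predict(model: dict, prompt: str, n: int = 3) -> str | None:
    """Return the most likely continuation, backing off to shorter
    contexts when the full context never appeared in training: a
    crude stand-in for the 'sophisticated interpolation' above."""
    words = prompt.split()
    for k in range(n - 1, 0, -1):  # try the longest context first
        context = tuple(words[-k:])
        candidates = Counter()
        for ctx, nexts in model.items():
            if ctx[-k:] == context:
                candidates.update(nexts)
        if candidates:
            return candidates.most_common(1)[0][0]
    return None

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = train(corpus)
print(predict(model, "the cat sat on the"))  # -> "mat" (tie broken by insertion order)
```

Everything this toy can ever say is a recombination of its training data. The transformer's interpolation is incomparably more sophisticated, but the question of where the intelligence resides carries over unchanged.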
But this was just a parenthesis. It is just a marker that perhaps in a generation—but not in the next decade—I will shift my position on whether the MAMLMs of that future day are in fact “AI”.
Resuming my main argument:
MAMLMs now and for the next decade are simply software algorithms that manipulate tokens in ways that mimic and simulate understanding without possessing it. But that, in itself, is the wrong standard. My desk lamp does not understand light; it produces it. And if my AI assistant can “understand” me—i.e., parse my input, interpret my goals, and produce reasonably accurate, useful, and legible outputs—it does not matter that it lacks intentionality or phenomenology. It matters that it works. The question is not whether it is conscious, but whether it is useful.
We live, after all, in a world that has become too complex for our monkey minds. Our systems of law, finance, logistics, and science have outpaced not merely our capacity to remember them, but our ability to interpret them without digital assistance. We are, already, augmented by algorithmic prosthetics—we just do not call them that. We call them “search engines”, “recommendation systems”, or “autocomplete”. But these systems are not conversational. They do not invite dialectic. They do not allow us to explore the space of our ignorance.
Natural-language AI is different. It enables a kind of Sokratic interaction, one where we externalize still more parts of our cognition in dialogue. It is not thinking, but it is an excellent mirror of thought.
That is why I launched—probably foolishly—my SubTuringBradBot project:
<https://chatgpt.com/g/g-6813df8e5a408191b26db4f9c441f149-subturingbradbot>
The idea was simple: could I offload part of my own cognitive work onto a properly tuned LLM, such that it could serve, if not as a coauthor, then at least as an intelligent research assistant? Could it tutor students in economic history better than my harried TAs? Could it compose reading questions, or help me triage my overflowing inbox of academic queries? In other words: could it serve as an interface not to machines, but to myself—letting students interface with a cut-rate, sub-Turing version of me far more readily than my divided attention would ever let them grab time with the real me, and letting me interface with other selves of mine that have much less cognitive depth, but are still good enough to sometimes produce things that make me go thumbs-up?
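The mechanism is nothing more exotic than a persona prompt wrapped around a frontier model. Here is a minimal sketch of the same idea via the OpenAI Python client; the persona text and model name are illustrative assumptions of mine, and the actual SubTuringBradBot is a custom GPT configured inside ChatGPT, not this code:

```python
# Hypothetical minimal persona-tuned assistant, sketched with the
# OpenAI Python client. Persona wording and model choice are
# illustrative assumptions, not the real bot's configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PERSONA = (
    "You are SubTuringBradBot, a sub-Turing simulacrum of an economic "
    "historian. Tutor students in economic history patiently. If you "
    "are not confident of a fact, say so rather than guessing."
)

def ask(question: str) -> str:
    """One conversational turn with the persona-tuned assistant."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name
        messages=[
            {"role": "system", "content": PERSONA},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask("Why did the Industrial Revolution happen in Britain first?"))
```

The entire "tuning" here is the system message; everything else is the stock model. Which is exactly why the result can sound like me without being accurate like me.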
The results, so far, have been mixed.
The LLM is glib, articulate, and oddly plausible.
But it is not accurate.
It sounds like a version of me—if I had only half read the book I was assigned, and was very eager to impress the professor. Still, there is potential here. If the “stochastic parrot” is a good enough mimic, then perhaps, with sufficient tuning and context, it can simulate not my thought, but the shadow of it. And that may be enough, in many cases, for pedagogy, for exploration, for decision support.
But let us pull back and ask the bigger question: What does it mean to introduce natural language AI into the human symbolic ecosystem? What are we doing when we speak to machines, and they answer back?
One answer is that we are democratizing access to complexity. When a factory worker in Shenzhen or a nurse in Cleveland can ask a question about taxes, trade, or thermodynamics—and get an answer that is linguistically legible, contextually relevant, and socially calibrated—we are leveling the epistemic field. The gatekeeping of expertise is weakened. The affordances of understanding are expanded. This is not a substitute for schooling, but it is an accelerant for learning.
Another answer is that we are building new forms of social infrastructure. Consider the analogy to the anthropological gift economy—something I have meditated on at length. In traditional societies, information is often exchanged not in market transactions but in ritualized, reciprocal, socially embedded forms. So too in a world of conversational AI: my input is not merely data but a gift of intent, and the machine’s response is a return offering—a constructed answer shaped by millions of prior interactions. What results is not just a tool, but a kind of proto-agent, situated within a social space. It is a step, perhaps, toward the “mirror society” that science fiction has long foretold: an always-on interlocutor, one that reflects and refracts our own queries into structured outputs.
Of course, there are dangers. There is the peril of false confidence, of persuasive nonsense. There is the risk of dependence—of letting the machines not merely finish our sentences, but begin our thoughts. And there is the political economy problem: who builds these systems? Who owns them? Who governs their affordances, their training data, their biases, their blind spots? These are not technical questions; or rather, they are not computer-science technics but humanity-as-an-anthology-intelligence technics. They are institutional, regulatory, and ideological.
But what is the potential? What I see, in the best cases, is a kind of cognitive co-evolution as natural-language AI becomes more capable. Just as the printing press allowed the Renaissance mind to scale, and just as the spreadsheet allowed the accountant to manipulate ten thousand rows of capital flows without tears, so too might AI allow the student, the teacher, the policymaker, and the citizen to interface with knowledge more fluidly, more dialogically, more humanely.
Of course, the printing press also brought two centuries of genocidal religious war to Europe.
It is a tool. A tool to be used for good and ill. Tools, when properly used, have always been levers by which we pry open the stuck doors of understanding. In the end, I do not expect these systems to replace us. I expect them to interface with us. And in that interface, something like progress might emerge.
I am, as ever, merely guessing.