What We Talk About When We Talk About AI (Part one)
A Normal Person’s Explainer on What Generative AI is and Does
Part 1 – In the Beginning was the Chatbot
“Are you comfortably seated? Yes, well, let’s begin.” *Clears throat theatrically*
“Our experience, in natural theology, can never furnish a true and demonstrated science, because, like the discipline of practical reason, it can not take account of problematic principles. I assert that, so far as regards pure logic, the transcendental unity of apperception is what first gives rise to the never-ending regress in the series of empirical conditions. In this case it remains a mystery why the employment of the architectonic of human reason is just as necessary as the intelligible objects in space and time, as is proven in the ontological manuals. By means of analysis, it must not be supposed that the transcendental unity of apperception stands in need of our sense perceptions. Metaphysics, for example, occupies part of the sphere of the transcendental aesthetic concerning the existence of the phenomena in general…”
It was 1995, and several of us who worked in my community college’s Macintosh lab were hunting around the net for weird software to try out, back when weird software felt fun, not dangerous. Someone found a program on the nascent web that would almost instantly generate pages of thick and unlovely prose that wasn’t actually Kant, but looked like it. It was, to our definitionally untrained eyes, nearly indistinguishable from the Immanuel Kant used to torture undergraduates.
We’d found the Kant Generator Pro, a program from a somewhat legendary 90s programmer known for building programming tools. And being cheeky. It was great. (recent remake here) We read Faux Kant to each other for a while, breaking down in giggles while trying to get our mouths around Kant’s daunting vocabulary. The Kant Generator Pro was cheeky, but it was also doing something technically interesting.
The generator was based on a Markov chain: a mathematical way of picking some next thing, in this case, a word. The generator chose each next word with a random walk through Kant’s vocabulary. But to produce coherent text rather than a jumble of random Kant words, that walk had to be weighted: unrandomized just enough to form human-readable Kantian sentences.
A text generator finds those weights using whatever text you tell the computer to train itself on. This one looked at Kant’s writing and built an index of how often words and symbols appeared together. Introducing this “unfairness” into the random word picking gives some words a higher chance of coming next, based on the word that came before. For instance, there is a high likelihood of starting a sentence with “The,” or “I,” or “Metaphysics,” rather than “Wizard” or “Oz.” Hence, in the Kant Generator Pro “The” could likely be followed by “categorical,” and when it is, the next word will almost certainly be “imperative,” since Kant went on about that so damn much.
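If you want to see the shape of that trick, here is a minimal sketch, in Python, of a bigram Markov generator. It is not the Kant Generator Pro’s actual code, and the tiny stand-in corpus and function names are invented for illustration, but the idea is the same: count which words follow which, then take a weighted random walk.

import random
from collections import defaultdict

def build_bigram_counts(text):
    # Count how often each word follows each other word in the training text.
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

def generate(counts, start_word, length=20):
    # Walk forward at random, weighting each candidate by how often it followed the last word.
    word = start_word
    output = [word]
    for _ in range(length):
        followers = counts.get(word)
        if not followers:
            break
        word = random.choices(list(followers), weights=list(followers.values()))[0]
        output.append(word)
    return " ".join(output)

# Toy usage; the real generator trained on Kant's actual writing, not one sentence.
counts = build_bigram_counts("the categorical imperative is the demand of the categorical order of pure reason")
print(generate(counts, "the"))

Train it on enough Kant and the walks start to sound like the paragraph at the top of this post.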
The Kant Generator Pro was a simple ancestor of ChatGPT, like the small and fuzzy ancestors of humans that spent so much time hiding from dinosaurs. All it knew, for whatever the value of “knowing” is in a case like this, was the words that occurred in the works of Kant.
Systems like ChatGPT, Microsoft Copilot, and even the upstart DeepSeek use all the information they can find on the net to relate not just one word to the next, like the Kant Generator Pro did. They look back many words, tracking how likely words are to appear together over the span of full sentences. Sometimes a large language model takes a chunk as is, and appears to “memorize” text and feed it back to you, like a plagiarizing high schooler.
But it’s not clear when regurgitating a text verbatim is a machine copying and pasting, versus recording a statistical map of that given text and just running away with the math. It’s still copying, but not copying in a normal human way. Given the odds, it’s closer to winning a few rounds of Bingo in a row.
These chatbots index and preserve the statistical relationships words and phrases have to each other in any given language. They start by ingesting all the digital material their creators can find for them: words, and their relationships. This is the training people talk about, and it’s a massive amount of data. Not good or bad data, not meaningful or meaningless, just everything, everywhere people have built sentences and left them where bots could find them. This is how cheeky Reddit users joking that you could keep toppings on pizza by using glue ended up becoming a chatbot suggestion.
Because people kept talking about using glue on pizza, especially after the story of that hilarious AI mistake broke, AI kept suggesting it. Not because it thought it was a good idea (AI doesn’t think in a way familiar to people), but because the words kept occurring together where the training part of the AI could see them. The AI isn’t right here, we all know that, but it’s also not wrong. Because the task of the AI isn’t to make pizza, the task is to find a next likely word. And then the next, and the next after that.
Despite no real knowing or memorizing happening, this vast preponderance of data lets these large language models usually predict what is likely to come next in any given sentence or conversation with a user. This is based on the prompt a user gives it, and how the user continues to interact with it. The AI looks back on the millions of linguistic things it has seen and built statistical models for. It is generally very good at picking a likely next word. Chatbots even come to feel like a human talking most of the time, because they trained on humans talking to each other.
So, a modern chatbot, in contrast to the Kant Generator Pro, has most of the published conversations in modern history to look back on to pick a good next word. I put a leash on the… blimp? Highly unlikely, the weighting will be very low. Veranda? Still statistically unlikely, though perhaps higher. British politician? Probably higher than you’d want to think, but still low. Table? That could be quite likely. But how about dog? That’s probably the most common word. Without a mention of blimps or parliamentarians or tables in the recent text, the statistics of all the words it knows mean the chatbot will probably go with dog. A chatbot doesn’t know what a dog is, but it will “know” dog is associated with leash. How associated depends on the words that have come before the words “dog” or “leash.”
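To make that concrete, here is a toy version of the leash example in Python. Every number below is invented for illustration; a real model looks at far more context and works over tokens rather than whole words, but the ranking step looks roughly like this.

# Imaginary counts of what followed "...put a leash on the" in made-up training data.
follower_counts = {
    "dog": 9500,
    "table": 300,
    "politician": 40,
    "veranda": 5,
    "blimp": 1,
}

total = sum(follower_counts.values())
for word, count in sorted(follower_counts.items(), key=lambda kv: -kv[1]):
    print(f"{word:>10}: {count / total:.4f}")  # the chance of being picked next

Mention a blimp a sentence earlier and those numbers shift, which is all “context” really means here.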
It’s very expensive and difficult to build this statistical map, but not very hard to run it once you have built it. This is why chatbots seem so quick and smart, despite at their cores being neither. Not that they are slow and dumb — they are doing something wholly different than I am when I write this, or you as you read it.
Ultimately, we must remember that chatbots are next-word-predictors based on a great deal of statistics and vector math. Image generators use a different architecture, but still not a more human one. The text prompt part is still an AI chatbot, but one that replies with an image.
AI isn’t really a new thing in our lives. Text suggestions on our phones exist somewhere between the Kant Generator Pro and ChatGPT, and customize themselves to our particular habits over time. Your suggestions can even become a kind of statistical fingerprint for your writing, given enough time writing on a phone or any other next-word predictor.
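As a hypothetical sketch of that fingerprint, imagine the phone keeping nothing but personal bigram counts. Real keyboards are more sophisticated than this, and the class below is made up for illustration, but the personalization works along these lines.

from collections import defaultdict

class PhoneSuggester:
    # A toy stand-in for a phone keyboard: per-user bigram counts, nothing more.
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, sentence):
        # Update the personal statistics every time the user types something.
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            self.counts[current_word][next_word] += 1

    def suggest(self, word, n=3):
        followers = self.counts.get(word.lower(), {})
        return sorted(followers, key=followers.get, reverse=True)[:n]

# The more you type, the more the suggestions mirror your habits.
me = PhoneSuggester()
me.observe("I put the leash on the dog")
me.observe("I put the kettle on")
print(me.suggest("the"))  # e.g. ['leash', 'dog', 'kettle']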
We make a couple bad mistakes when we interact with these giant piles of vector math and statistics, running on servers all over the world. The first is assuming that they think like us, when they have no human-like thought, no internal world, just mapping between words and/or pixels.
The other is assuming that because they put out such human-like output, we must be like them. But we are not. We are terribly far from understanding our own minds completely. But we do know enough to know biological minds are shimmering and busy things, faster and more robust than anything technologists have ever yet built. Still, it is tempting, especially for technologists, to have some affinity for this thing that seems so close to, but not exactly, us. It feels like it’s our first time getting to talk to an alien, without realizing it’s more like talking to a database.
Humans are different. Despite some borrowing of nomenclature from biology, neural nets used in training AI have no human-style neurons. The difference shows. We learn to talk and read and write with a minuscule dataset, and that process involves mimicry, emotion, cognition, and love. It might also have statistical weighting, but if it does, we’ve never really found that mechanism in our minds or brains. It seems unlikely that it would be there in a similar form, since these AIs have to use so much information and processing power to do what a college freshman can with a bit of motivation. Motivation is our problem, but it’s never a problem for AIs. They just go until their instructions reach an end point, and then they cease. AIs are unliving at the start, unliving in the process, and unliving at the end.
We are different. So different we can’t help tripping ourselves up when we look at AI, and accidentally see ourselves, because we want to see ourselves. Because we are full of emotions and curiosity about the universe and wanting to understand our place in it. AI does not want.
It executes commands, and exits.
Thank you. It’s terrifying to me. When challenged to differentiate a few paragraphs written by AI from those written by a human, it was not difficult. The human I follow about AI is Yuval Noah Harari, a truly interesting human with a fine brain and deep and original thought.
[FYI – the system tagged your previous 2 attempts as spam because you embedded a YouTube link which didn’t point to content. It was missing the specific video identifier after watch-slash in the URL. /~Rayne]
Well written summary, thank you. Our human intelligence is deeply entwined with (at least) two key characteristics: we are embodied creatures, and we retain a model of the world in our minds to which we compare our experiences and expectations. These are essential aspects of real intelligence, particularly the second. We can imagine that which has never been. Gen AI can only reflect statistical relationships that already exist. It works well as a conventional wisdom engine, but one must fact-check it.
It’s mansplaining-as-a-service.
That is a super funny way of describing a chatbot and I’m stealing it :D
If you don’t follow Gary Marcus on AI you may be missing the full story. On Substack and elsewhere.
He’s been calling out the nonsense, promoted by Altman and others, that AGI will arise merely by scaling up the training runs to use more data and track more of the relationships between words (actually between word-like chunks of text called “tokens”, but thinking of them as words is a good place to start).
That’s not going to happen with this level of technology, and OpenAI has just admitted it:
https://open.substack.com/pub/garymarcus/p/breaking-openais-efforts-at-pure
Oh I’m not close to getting into AGI yet. I need a good run up to that intellectual dumpster fire.
Gary Marcus is both passionate and well-informed, well worth following.
Decades ago, at MIT, in a linguistics course, the class was required to work through the mathematical proof that no finite set of inputs (lots of text) could be fed into a statistical engine and determine the rules of grammar therefrom. It had a profound effect on the young me, as it was clear then that no amount of training data would ever be enough to create a model of human intelligence.
The AI field suffers from a shortage of practitioners who have studied human intelligence.
I think what they’d tell you is that they’re using a different approach. I would characterize it as a kind of “if you throw a near infinite amount of spaghetti at the wall, at some point the Flying Spaghetti Monster is bound to emerge”
As will hundreds of trillions of dollars also emerge.
Thanks for the article; that was the clearest explanation I’ve read of it. I’m looking forward to reading the rest. I’m interested in how you explain vector math.
I used to enjoy playing with online bullshit generators esp. corporate jargon. And I’ve always enjoyed laughing at people who fall for bullshit (e.g., https://en.wikipedia.org/wiki/Sokal_affair back in 1996). Less funny and more scary is Jimmy Kimmel’s Lie Witness News (random people falling for blatantly fake news).
And then of course there’s Trump who is a human bullshit generator.
Yeah there’s a whole lot more to be said about digital/news literacies. We’re not in a great place with this.
Audrey Watters has had her finger on the pulse of the tech industry’s lust for K-12 education for a long time, publishing “Teaching Machines” in 2021. Her blog is titled “Second Breakfast” and she regularly dismantles the notion that AI can teach anybody anything.
A sample:
“…to make it form human-readable Kantian sentences.” The term “human-readable” is certainly arguable, both in the Kant Generator Pro context and (perhaps I’m confessing) Kant himself.
My MFA was in fiction writing. When I taught this to college and grad students, dialogue (spoken-word exchanges between characters) was always a dominant workshop topic. My own focus tended to be on getting my students to drill down on everything we say to each other *without* words: all the gestures, expressions, micro-expressions, and gaps that convey our true intended meanings more often than not.
My shorthand for this was “nonverbal dialogue.” Until AI can grasp and communicate the defensively folded arms, the abashed downward glance, or the unnerving flat gaze of the lying sociopath, it will remain an inadequate mimic.
Thank you, Quinn Norton, for such a provocative post!
Excellent post, thank you.
My work involves, to a large extent, reading things (or listening to them); thinking about what I’ve read or heard; and then writing about what I’ve thought. One learns by studying what others have done in similar circumstances, doing, getting reviewed, re-doing, etc., and gradually improving. To some extent, that’s basic (non-generative) AI.
We are now learning about (and will soon be using) generative AI. A few key thoughts they recently urged on us:
• AI won’t replace us. But those of us who know AI will eventually replace those of us who don’t.
• All generative AI results need to be checked thoroughly by someone experienced at the topic.
• The paid version ($20/month) of ChatGPT is vastly superior to the free version, which is out of date and which should not be used for anything serious.
I’m still at square one in trying to figure out how I will use it in my work. But I’ve decided I need to learn.
AI will not be of real service to humanity until it can nudge us towards world peace.
For now it is a simple-minded treadmill of existing information that goes nowhere new or unimagined.
Here’s an exercise one can do: try to predict the next word somebody is going to say. You can do this with people speaking on the screen.
Prediction is a key element of the associative aspect of cognition and this dovetails with background processing that in turn suggests we may be unaware of making predictions in this background.
Thanks for this. Just last week my introduction to digital humanities class was going over Gen AI, what it is and what is not. I think I will share this with them. Looking forward to Part 2.
I immediately thought back to the old differentiation between data and information where information is data given context. May be an oversimplification today, but kind of an important distinction back in my day.
The definition(s) of intelligence do not specifically favor any one of many forms of it. Certainly human intelligence, with respect to how our brain and human form function, is way different than this here artificial intelligence.
Intelligence as a term can be used as a noun, a verb, and an adjective.
Artificial statistical word adjacency would be a mouthful.
There are things called “AI” that can reasonably be described as having “wants”.
Those things, of course, are not LLMs; LLMs such as ChatGPT are simply text-prediction machines, not goal-driven, not goal-seeking, and not tethered to reality or anything concrete, just the text it predicts – which this post captures quite well, of course.
But in many cases, you could reasonably describe the AI in video games as having desires; it’s all numeric and easily quantified, but they do often have goals that they seek. They’re primitive and simplistic, but it’s still behaviors with a purpose, tethered to the ‘reality’ of the ‘world’ they ‘live’ in.
Thanks, Quinn. That was fun, timely, and useful. And as you said: “We are terribly far from understanding our own minds completely.”
That brings to mind when a neighbor gave IQ tests to my college roommates and me decades ago. I was able to remember lengthy lists of random numbers, backwards. So much so that she said I was off the chart. I think I grouped them in small clusters and thought of them like musical notes in a song. It wasn’t hard, but I was very focused. But now I can no longer do that. My parts are old and worn down. Maybe it’s the telomeres or something more complicated. That aspect of my brain did seem more like AI than what I describe below.
Ginevra mentioned the importance of nonverbal communication above. While she seems to have focused on sight and sound, I would also add taste, smell and touch. There is a brief experiment involving hands that illustrates succinctly to our brains that they are not as smart as they think they are. Alas, I only have words to tell you. But this focuses on the sensory input of touch. Here goes:
1. Find a willing person to participate.
2. Tell the person to extend arms outward in front of their body.
3. Next, tell them to cross their arms in an X.
4. While the arms are still in an X, tell them to clasp their hands together, intertwining their fingers.
5. Now, this is the hard part. With the fingers and hands clasped and the arms still crossed in an X, tell them to pull the hands under, toward their chest. If done correctly, this will force the elbows to automatically move outward. So, it looks like their arms and hands are in a twisted prayer position. They may have to readjust their thumbs slightly to be more comfortable.
6. Okay. Now, being very careful NOT to touch or even come too close to any fingers, point to one of their fingers and ask them to move it. If they struggle, tell them you will try another one. Then point to a thumb or another finger.
7. What usually happens, if done correctly, is that the person knows what you are saying and understands which finger or thumb they are supposed to move. But they struggle to accomplish the task. Their brain is either very slow to get the finger to move, or totally incapable of doing so. But if you touch the finger (or even come too close or touch a hair) the brain knows immediately what to do.
8. This shows how dependent we are on the sense of touch. Among other things, this relates to sports and to making music. So many things are reliant on other things that we don’t take into consideration or dismiss as not important.
Having seen the result already reported, I did a search — inbreeding problem of training AI on text generated by AI, one returned item: https://www.forbes.com/sites/bernardmarr/2024/03/28/generative-ai-and-the-risk-of-inbreeding/
Consider: train an AI agent entirely on tweets by Elon; and get a play on the old adage: Musk in, Musk out.
Trump could use his already issued executive orders to write the next ten or so.
AI remains a model subject to all of the limits of a model, principally the assumptions contained in it, the options it can consider, and its ability to detect its surroundings. All of these limits have non-zero effects on the outcome AI belches up. The more complex the system, the larger the model must be (eventually constrained by the memory and/or power available to run it) and that means it will be less nimble.
Even something as routine as an autocorrection shows the limits, and it can generate piffle that will survive review by the spellchecker program (which can itself be trained using the ‘ignore’ feature) so this hellbent rush to impose AI is going to cause serious problems.
Assumptions are rife in the medical world, for example, with the vast majority of studies being done on Caucasian males to the point where other groups had very surprising outcomes, including some consequential ones.
Detection systems have blind spots either by what they can see or what they can’t see. These are important. Many of the autonomous vehicle crashes have this as a factor, but a human can see many things beyond what a radar pointed in one direction can see and also anticipate trajectory.
The options are something that are built in by the designer and added to as the AI learns its environment. However, unlike a human, AI models consider only the options they are programmed to see. That’s why we see cars that sit in an intersection blocking fire trucks.
However, AI models are cheaper, I suppose, and don’t unionize or complain.