I always struggle a bit with I’m asked about the “hallucination problem” in LLMs. Because, in some sense, hallucination is all LLMs do. They are dream machines. [...] I know I’m being super pedantic but the LLM has no “hallucination problem”. Hallucination is not a bug, it is LLM’s greatest feature. The LLM Assistant has a hallucination problem, and we should fix it. (Andrej Karpathy)
AI Craftsmanship
Epistemic status: in some sense, I am just complaining, and making light of the extensive effort which goes into designing modern AI. I’m focusing on a sense that something is missing and could be better, which might incidentally come off as calling a broad category of people stupid. Sorry.
The video Badness 0 by Suckerpinch makes a comparison between the approach of Donald Knuth and a fictional villain he names “Lorem Epsom”. Knuth created the typesetting tool TeX, which (together with LaTeX, a macro package for TeX) has become a nearly ubiquitous tool for writing academic papers, especially difficult-to-typeset mathematical work. TeX, along with Knuth’s other work, focuses on identifying good abstractions for thinking about the problem, and delivering perfect solutions at that level of abstraction. In contrast, the Lorem Epsom approach focuses on looking good over being good, buzzwords over understanding, etc.
Suckerpinch understandably puts modern LLMs in the Lorem Epsom camp. For example, modern LLM-based editing tools (such as the Writefull tool integrated with the popular LaTeX editing environment Overleaf) fundamentally work by suggesting rephrasings that make your document more probable as opposed to more correct. (I have found Writefull’s suggestions to be almost universally unhelpful, giving me trivial rephrasings that are not particularly easy to read, and are often less correct.)
To illustrate the difference, Suckerpinch shows an example of text typeset via TeX vs text typeset by a more naive, greedy algorithm. I don’t know what the counterfactual history looks like, but it seems all-too-plausible that without Knuth, we would be living in a dystopian alt-history where automated typesetting would be pretty awful.
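To make the greedy-versus-careful contrast concrete, here is a minimal sketch in Python. It is not TeX's actual algorithm (TeX works with stretchable glue, penalties, and hyphenation); it only assumes monospaced text, single spaces between words, and a made-up cost of squared trailing whitespace on every line except the last. The function names are my own, chosen for illustration.

```python
def greedy_breaks(words, width):
    """First-fit: keep adding words to the current line until one does not fit."""
    lines, current = [], []
    for w in words:
        if current and len(" ".join(current + [w])) > width:
            lines.append(current)
            current = [w]
        else:
            current.append(w)
    if current:
        lines.append(current)
    return lines


def optimal_breaks(words, width):
    """Dynamic programming: choose all breaks together to minimise total cost.

    Assumed cost (illustrative only): squared trailing slack per line,
    with the last line free.
    """
    if any(len(w) > width for w in words):
        raise ValueError("every word must fit on a line by itself")
    n = len(words)
    best = [float("inf")] * n + [0.0]  # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)                # nxt[i]: start index of the following line
    for i in range(n - 1, -1, -1):
        length = -1                    # cancels the space counted for the first word
        for j in range(i, n):
            length += len(words[j]) + 1
            if length > width:
                break
            cost = 0 if j == n - 1 else (width - length) ** 2
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                nxt[i] = j + 1
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


def raggedness(lines, width):
    """Total squared trailing slack over all lines except the last."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines[:-1])
```

On the words "aaa bb cc ddddd" with a width of 6, the greedy version fills the first line completely and strands "cc" on a line of its own, while the dynamic-programming version accepts a slightly looser first line to get a more even paragraph overall. The greedy choice looks best locally; the other is better by the stated measure.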
Modern AI is based on the idea of generative pretraining (GPT)[1]. The basic idea was previously known as transfer learning: it’s the idea that you can train an AI on lots of data, perhaps even unrelated to the final task you want your AI to be good at. The AI learns a lot of patterns[2] (some might say, learns a lot about the world) which end up being useful later on when you train it on your final task. This is a great idea! Unfortunately, it is also easy to misuse.
Truth Machines
Modern chatbots such as ChatGPT take the probability distribution obtained through GPT and try to warp and wrangle it towards outputting true and useful information, through various post-GPT training methods (sometimes broadly called “fine-tuning”, although fine-tuning is also sometimes used in a way which contrasts with more sophisticated methods such as RLHF).
One way I sometimes talk about this: we’re fundamentally starting with a creativity[3] machine, which outputs random plausible continuations of text (or other data formats). The “creativity” of the machine is then subsequently treated as the enemy; it is maligned with the term “hallucination”[4], and subsequent training attempts to stomp it out while keeping the useful behavior. However, there is no fundamental way to eliminate hallucinations; in some sense, hallucinating is all the system does.
We’re trying to treat them as truth machines rather than dream machines.
One terrible consequence of this is the application of modern voice-to-text transcription technologies. OpenAI’s Whisper system recently made headlines when it came to light that it is already being widely deployed in medical institutions to transcribe interactions with patients, and sometimes makes horrible errors. These errors are high-risk, since they can end up in medical records and influence outcomes.
Surely there should be a better way?
Fundamentally, these systems take recorded audio, and then attempt to produce written text which accurately reflects the audio. One way to think about how hallucinations like this occur is that the learned model has some uncertainty about what an accurate transcription would be, and fills in this uncertainty with its pre-existing world knowledge (that is, its prior over text). At some level, this is necessary. Human transcribers also have some degree of uncertainty and create text by combining what they hear with their prior knowledge of what’s plausible.
However, human transcribers have a nuanced picture of when these plausible inferences are acceptable. Humans can do things like use brackets to represent uncertainty, like writing [inaudible] to represent that something was said but they’re not sure how to transcribe it, or [fire?] to represent an uncertain guess.
The information for this sort of nuance is present in LLMs. In principle, we could do even better: voice transcriptions could represent confidence levels and rank the top completions by probability. In principle, we could even separate confidence that is coming from the audio (the word being transcribed is definitely “wait” based on the local sound-waves alone) from confidence that is coming from the prior over language (the word “wait” is expected with high certainty in this context, but the audio itself is more ambiguous). This would help flag cases where the system is guessing based on its prior.
The output of a speech-to-text system could be richly annotated with this sort of information, rather than just giving the text.
However, the technology isn’t designed for this sort of nuance in its present state.
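As a rough illustration of what such an annotated transcript might look like, here is a small Python sketch. Everything in it is hypothetical: the `WordGuess` record, the two separate scores, and the thresholds are assumptions made for the example, not the interface of Whisper or any other real system, and obtaining a clean split between acoustic evidence and language prior from a real model is exactly the hard part.

```python
from dataclasses import dataclass


@dataclass
class WordGuess:
    text: str
    acoustic: float  # hypothetical: probability of the word given the local audio alone
    prior: float     # hypothetical: probability of the word given the surrounding text alone


def annotate(guesses, sure=0.9, audible=0.5):
    """Render a transcript, bracketing words the audio alone does not support.

    The thresholds `sure` and `audible` are arbitrary illustrative values.
    """
    out = []
    for g in guesses:
        if g.acoustic >= sure:
            out.append(g.text)                         # the audio itself is clear
        elif g.acoustic >= audible:
            out.append(f"[{g.text}?]")                 # an uncertain guess from the audio
        elif g.prior >= sure:
            out.append(f"[{g.text}? (from context)]")  # the language prior is doing the work
        else:
            out.append("[inaudible]")                  # neither source supports a guess
    return " ".join(out)


guesses = [
    WordGuess("please", acoustic=0.97, prior=0.40),
    WordGuess("wait", acoustic=0.30, prior=0.95),
    WordGuess("here", acoustic=0.60, prior=0.70),
    WordGuess("mmph", acoustic=0.10, prior=0.05),
]
print(annotate(guesses))
# please [wait? (from context)] [here?] [inaudible]
```

The point of the design is that a reader of the transcript can see which words were heard and which were filled in, instead of receiving fluent text with the guesses silently blended in.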
Where are the Knuths?
So, where are the Knuths of the modern era? Why is modern AI dominated by the Lorem Epsoms of the world? Where is the craftsmanship? Why are our AI tools optimized for seeming good, rather than being good?
One hypothesis is that most of the careful thinkers read LessWrong and decided against contributing to AI progress, instead opting to work on AI safety or at least avoiding accelerating AI.
If that’s the case, I think it might be a mistake. Yes, we want stronger sorts of safety. However, I also think that there are types of modern AI which are qualitatively better and worse. It seems like the in-practice gulf between “AI safety people” and “AI engineering people” has created a bad situation where the sort of AI that is being developed at frontier labs lacks a Knuth-like virtue of craftsmanship.
I’m not sure what concrete actions in the world could drive us towards a better future at this point, but maybe safety-minded people (or more broadly, “careful thinkers”) should reconsider the strategy of withdrawing from mainstream AI development. Maybe the world would benefit from more AI craftsmanship.
I’ll close by mentioning a few projects I am excited about in this vein.
First is Yoshua Bengio’s current research project. This project aims to combine the successes of modern LLMs with careful thinking about safety, and careful thinking about how you should build an actual “truth machine” (he calls this combination a “careful AI scientist”).
Second is Conjecture’s Cognitive Emulation agenda.
Third is Sahil’s Live Theory agenda. I would describe a significant part of Sahil’s recent thinking as: let’s take the user interface design problem of AI seriously. It matters how we interact with these things. Sahil is running a hackathon about that soon, which you can apply for. Here is the poster, which I think is great:
[Image: hackathon poster]
[1] OpenAI has tried to take ownership of this perfectly good acronym and turn it into a meaningless brand-name. Fortunately, they seem to have lost this battle, and switched to the “o1” branding. Unfortunately, GPT still lost a lot of meaning, and is now commonly used as three letters you stick on the end of something to mean “chatbot” or something like that.
[2] Remember back in 2013 when the talk of the town was how vector representations of words learned by neural networks represent rich semantic information? So you could do cool things like take the [king] vector, subtract the [male] vector, add the [female] vector, and get out something close to the [queen] vector? That was cool! Where’s the stuff like that these days?
[3] Some people I know want to use the term “creativity” to point to something which LLMs lack. LLMs uncreatively interpolate between existing ideas they’ve seen, rather than inventing new things. This is fine. It’s not what I mean by “creativity” here. I mean the thing that even basic Markov models of text had in the 1990s: chaining together combinations of words that can sometimes surprise and delight humans due to their unexpectedness.
[4] The term “confabulation” would be much more apt, since confabulation (1) points to language, which is a better fit to LLMs, and (2) refers to nonfactual output, whereas “hallucination” connotes nonfactual sensory input.