So, where are the Knuths of the modern era? Why is modern AI dominated by the Lorem Epsoms of the world? Where is the craftsmanship? Why are our AI tools optimized for seeming good, rather than being good?
I’m a bit confused by your confusion, and by the fact that your post does not contain what seems to me like the most straightforward explanation of these phenomena: an explanation that I am almost fully certain you are aware of, and which seems to be almost universally agreed upon by those interested (at any level) in interpretability in ML.
Namely the fact that, starting in the 2010s, it happened to be the case (for a ton of historically contingent reasons) that top AI companies (at first, followed by other ML hubs and researchers) realized the bitter lesson is basically correct: attempts to hard-code human knowledge or intuition into frontier models ultimately always harm their performance in the long term compared to “literally just scale the model with more data and compute.” This led to a focus, among experts and top engineers, on figuring out scaling laws, on improving the quality and availability of data (perhaps through synthetic generation methods), on creating better end-user products through stuff like fine-tuning and RLHF, etc., instead of the older GOFAI-style work of trying to figure out at a deeper level what is going on inside the model.
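(To make “scaling laws” concrete, since the specific citation is mine and not something from the post: the canonical example is the Chinchilla-style fit of Hoffmann et al. 2022, which models loss as a function of nothing but parameter count N and training tokens D,

L(N, D) = E + A/N^α + B/D^β,

with the constants E, A, B, α, β estimated from empirical training sweeps. Nothing in that recipe requires understanding what the model has learned internally, which is the sense in which this paradigm let capability work proceed without interpretive insight.)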
Another way of saying this is that top researchers and companies ultimately stumbled on an AI paradigm which increased capabilities significantly more than had been achievable previously, but at the cost of strongly decoupling “capability improvements” and “interpretability improvements” as distinct things that researchers and engineers could focus on. It’s not that capability and interpretability were necessarily tightly correlated in the past; that is not the claim I am making. Rather, I am saying that in the pre-(transformer + RL) era, the way you generated improvements in your models/AI was by figuring out specific issues and analyzing them deeply to find out how to get around them, whereas now, a far simpler, easier, less insight-intensive approach became available: literally just scaling up the model with more data and compute.
So the basic point is that you no longer see all this cool research on the internal representations that models generate of high-dimensional data like word embeddings (such as the word2vec stuff you refer to in the second footnote) because you no longer need those insights nearly as much in order to improve the capabilities/performance of the AI tools currently in use. It’s fundamentally an issue of demand, not of supply. And the demand from the interpretability-focused AI alignment community is nowhere near large enough to make up for the supply lost when the “normie” capabilities research community shifted its paradigm and priorities.
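(For readers who don’t know the word2vec results being referenced, here is a minimal sketch of the classic “analogy arithmetic” finding; the library and pretrained-model name are my illustrative choices, not something from the post:

```python
# Classic word2vec "analogy arithmetic", reproduced with gensim's
# pretrained Google News vectors; model name and library are
# illustrative choices, not something from the original discussion.
import gensim.downloader as api

kv = api.load("word2vec-google-news-300")  # KeyedVectors; large download on first use

# king - man + woman ~= queen: vector offsets encode semantic relations.
print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# roughly [('queen', 0.71)]

# Nearest neighbors cluster by meaning, not surface form.
print(kv.most_similar("Paris", topn=3))
```

This is exactly the kind of internal-representation research that was once on the critical path to capability gains and no longer is.)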
Indeed, the notion that the reason you no longer have deep thinkers who try to figure out what is going on, or who are “motivated by reasons” in how they approach these issues, is that “careful thinkers read LessWrong and decided against contributing to AI progress” seems… rather ridiculous to me? I don’t enjoy responding to an important question with derision in lieu of a substantive answer, but… I mean, the literal authors of the word2vec paper you cited were AI (capabilities) researchers working at top companies, not AI alignment researchers! Sure, some people like Bengio and Hofstadter (the latter less relevant in practical terms), who are obviously not “LARP-ing impostors” in Wentworth’s terminology, have shifted from capabilities work to trying to raise public awareness of alignment/safety/control problems. But the vast majority (going by personal experience, general impressions, and the current state of the discourse on these topics) absolutely have not, and since they were the ones generating the clever insights back in the day, of course the overall supply of those insights has gone down.
I just really don’t see how it could be the case that “people refuse to generate these insights because they have been convinced by AI safety advocates that it would dangerously increase capabilities and shorten timelines” and “people no longer generate these insights as much because they are instead focusing on other tasks that improve model capabilities more rapidly and robustly, given the shifted paradigm” are two hypotheses that can be given similar probabilities in any reasonable person’s mind. The latter should be at least a few orders of magnitude more likely than the former, as I see it.
I think you’re mostly right about the world, but I’m going to continue to articulate disagreements based on my sense of dissatisfaction. You should probably mostly read me as speaking from a should-world rather than being very confused about the is-world.
The bitter lesson is insufficient to explain the lack of structure we’re seeing. I gave the example of Whisper. I haven’t actually used Whisper, so correct me if I’m wrong; maybe there is a way to get more nuanced probabilistic information out of it. But the bitter lesson is about methods of achieving capabilities, not about the capabilities themselves. Producing plain-text output, rather than richly annotated text that describes some of the uncertainty, is a design choice.
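(From a quick look at the open-source openai-whisper package, it does seem to surface some coarse confidence information, though nothing like a distribution over transcripts; a minimal sketch, assuming that package and an illustrative file name:

```python
# What uncertainty the open-source openai-whisper package exposes,
# assuming `pip install openai-whisper`; the file name is illustrative.
import whisper

model = whisper.load_model("base")
result = model.transcribe("lecture.mp3", word_timestamps=True)

for seg in result["segments"]:
    # avg_logprob: mean log-probability of the segment's tokens;
    # no_speech_prob: model's probability that the segment is non-speech.
    print(f"{seg['text']!r}  avg_logprob={seg['avg_logprob']:.2f}  "
          f"no_speech_prob={seg['no_speech_prob']:.2f}")
    for w in seg.get("words", []):
        # per-word confidence, but no alternative hypotheses / n-best list
        print(f"  {w['word']!r}: p={w['probability']:.2f}")
```

So per-segment and per-word confidences exist internally, but the headline output is still one flat string; the richer structure is discarded by design, which is exactly the capabilities-versus-methods point.)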
To give another example, LLMs could learn a conditional model that’s annotated with metadata like author, date and time, etc. Google’s LaMDA reportedly had something like author-vectors, so that different “characters” could easily be summoned. I would love to play with that, e.g., averaging the author-vectors of specific authors I’m interested in to see what it’s like to mix their voices. In principle, LLMs could also learn to predict the metadata. You could use partially-labeled-data approaches such as the EM algorithm to train LLMs to predict metadata labels while also training them to use those labels in text generation. This would give a rich set of useful capabilities, and it would be scientifically useful too: predictions of date, author, and other metadata would be interesting for historians.
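(To make the EM suggestion concrete, here is a toy version shrunk from an LLM down to a per-author unigram model; all of it is illustrative, and the point is the shape of the E and M steps, not the model class:

```python
# Toy EM for a partially-labeled corpus: each document has an author
# label or None. The "LLM" is shrunk to a per-author unigram model so
# the sketch stays self-contained; the data is invented for illustration.
import math
from collections import Counter

AUTHORS = ["austen", "dickens"]
corpus = [
    ("truth universally acknowledged single man fortune".split(), "austen"),
    ("best of times worst of times epoch of belief".split(), "dickens"),
    ("man of fortune in want of a wife".split(), None),
]

# Initialize per-author counts from the labeled documents only.
counts = {a: Counter() for a in AUTHORS}
for tokens, author in corpus:
    if author is not None:
        counts[author].update(tokens)

vocab = {w for tokens, _ in corpus for w in tokens}

def log_likelihood(tokens, author):
    # Unigram log-likelihood with add-one smoothing over the joint vocab.
    total = sum(counts[author].values()) + len(vocab)
    return sum(math.log((counts[author][w] + 1) / total) for w in tokens)

for _ in range(5):
    # E-step: posterior over each document's author; labeled docs stay fixed.
    posteriors = []
    for tokens, author in corpus:
        if author is None:
            logs = [log_likelihood(tokens, a) for a in AUTHORS]
            m = max(logs)
            weights = [math.exp(l - m) for l in logs]
            z = sum(weights)
            posteriors.append({a: w / z for a, w in zip(AUTHORS, weights)})
        else:
            posteriors.append({a: float(a == author) for a in AUTHORS})
    # M-step: refit the per-author models using soft (fractional) counts.
    counts = {a: Counter() for a in AUTHORS}
    for (tokens, _), post in zip(corpus, posteriors):
        for a in AUTHORS:
            for w in tokens:
                counts[a][w] += post[a]

print(posteriors[-1])  # inferred author distribution for the unlabeled doc
```

The E-step posterior over the missing author label is exactly the kind of “legit latent variable” I mean below; with an actual LLM, the unigram likelihoods would be replaced by the model’s conditional likelihoods of the text given the metadata.)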
In this way, we could actually get closer to a “truth machine”. There’s a world of difference between the kinds of inferences you can make from ChatGPT’s opinion about a text versus the inferences you could make by treating author, date, and the rest as legit latent variables and seeing what latent-variable-inference algorithms think.
To give a third example, there’s Drexler’s Quasilinguistic Neural Representations.
You say “it’s a matter of demand”. So why does it feel so much like big tech is eager to push the chatbot model on everyone? Everyone is scared to fall behind in the AI race, ever since ChatGPT made it feel plausible that AI would drive significant market share. But did big tech make the correct guess about what there was demand for? Or did everyone over-update on the specific form of ChatGPT, and now there’s just not that much else out there to reveal what the demand really is? Little sparkle-emoji buttons decorate everything; press to talk to the modern equivalent of Clippy.