As far as producing algorithms that are able to, once trained on a vast dataset of [A, B] samples, interpolate a valid completion B for an arbitrary prompt sampled from the distribution of A? Yes, for sure.
I’d say this still applies even to non-LLM architectures like RL, which is the important part, but Jacob Cannell and 1a3orn will have to clarify.
As far as producing something that can genuinely generalize off-distribution, strike way outside the boundaries of interpolation? Jury’s still out.
I agree, but with a caveat: I think we do have enough evidence to rule out extreme importance of algorithms, à la Eliezer, and compute is not negligible. Epoch estimates a 50/50 split between compute and algorithmic progress in importance. Algorithmic progress will likely matter, IMO, just not nearly as much as some LWers think it does.
Like, I think my update on all the LLM stuff is “boy, who knew interpolation can get you this far?”. The concept-space sure turned out to have a lot of intricate structure that could be exploited via pure brute force.
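The interpolation-vs-extrapolation distinction here can be made concrete with a toy sketch (illustrative only, not a claim about LLM internals): a model fit on samples from a narrow range matches the target function closely inside that range, then diverges wildly outside it.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training distribution": x drawn from a narrow range.
x_train = rng.uniform(-1, 1, 200)
y_train = np.sin(3 * x_train)

# Fit a degree-9 polynomial by least squares -- a compact generative
# model learned from the data, not a lookup table of the samples.
coeffs = np.polyfit(x_train, y_train, deg=9)
model = np.poly1d(coeffs)

# In-distribution: the learned model interpolates almost perfectly.
x_in = np.linspace(-1, 1, 100)
err_in = np.max(np.abs(model(x_in) - np.sin(3 * x_in)))

# Off-distribution: same target function, wider range -- the fit diverges.
x_out = np.linspace(-3, 3, 100)
err_out = np.max(np.abs(model(x_out) - np.sin(3 * x_out)))

print(f"max error in-distribution: {err_in:.4f}, off-distribution: {err_out:.1f}")
```

The model is genuinely compact and generalizes within the variance of the data it was shown, yet "strike way outside the boundaries of interpolation" is exactly where it fails.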
I definitely updated somewhat in this direction, which is important, but I now think the AI-optimist arguments are general enough not to rely on LLMs, and sometimes not even on a model of what future AI will look like, beyond the facts that capabilities will grow and that people expect to profit from it.
I’m just saying that if it did, and if the inner homunculus became smart enough, that’d cause all the deceptive-alignment/inner-misalignment/wrapper-mind issues.
Not automatically. There are potential paths to AGI, like Steven Byrnes’s path to brain-like AGI, that either outright avoid deceptive alignment altogether or make it far easier to solve. The short answer is that Steven Byrnes suspects there’s a simple generator of value, so simple that it’s dozens of lines long. If that’s the case, then the simplicity gap between the corrigibly aligned/value-learning agent and the deceptively aligned agent is zero, negative, or a very small positive number—so small that very little data is required to pick out the honest value learner over the deceptively aligned agent. And since we have a lot of data on human values, this is likely to be pretty easy.
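The simplicity-gap argument can be phrased quantitatively. Under a description-length prior, a hypothesis that is d bits more complex starts with prior odds penalized by a factor of 2^-d, so the amount of data needed to favor the simpler honest agent scales with the gap. A minimal sketch, with purely illustrative numbers (the function name and the bits-per-observation figure are assumptions, not anything from Byrnes):

```python
import math

def observations_needed(gap_bits: float, bits_per_obs: float) -> int:
    """Data required for the honest value-learner to overcome a prior
    simplicity gap of `gap_bits`, if each observation of human values
    supplies `bits_per_obs` bits of evidence in its favor."""
    return math.ceil(gap_bits / bits_per_obs)

# A near-zero gap (the "dozens of lines" scenario) needs almost no data;
# a large gap needs a lot.
print(observations_needed(2, 0.5))     # tiny gap
print(observations_needed(1000, 0.5))  # large gap
```

The crux is thus entirely in how large the gap actually is, which is what the "simple generator of value" conjecture bears on.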
And that if you’re not modeling the AI as being/having a homunculus, you’re not thinking about an AGI,
I think a crux is that AIs will basically always be much more white-box than any human mind, and I expect the “AI control research is easier” point to still mostly hold for a lot of future paradigms of AI, including the ones that scale to superintelligence—especially since I think AI control is fundamentally very profitable, and AIs have no legal rights or IRB boards to slow control research down.
I agree, but with a caveat, in that I think we do have enough evidence to rule out extreme importance on algorithms
Mm, I think the “algorithms vs. compute” distinction here doesn’t quite cleave reality at its joints. Much as I talked about interpolation before, it’s a pretty abstract kind of interpolation: LLMs don’t literally memorize the data points, their interpolation relies on compact generative algorithms they learn (but which, I argue, are basically still bounded by the variance in the data points they’ve been shown). The problem of machine learning, then, is in finding some architecture + training-loop setup that would, over the course of training, move the ML model towards implementing some high-performance cognitive algorithms.
It’s dramatically easier than hard-coding the algorithms by hand, yes, and the learning algorithms we do code are very simple. But you still need to figure out in which direction to “push” your model first. (I’m pretty sure that if you threw 2023 levels of compute at a Very Deep fully-connected NN, it wouldn’t match a modern LLM’s performance—it wouldn’t even come close.)
So algorithms do matter. It’s just that our way of picking the right algorithms consists of figuring out the right search procedure for them, then throwing as much compute as we can at it.
So that’s where, I would argue, the sharp left turn would lie. Not in-training, when a model’s loss suddenly drops as it “groks” general intelligence. (Although that too might happen.) It would happen when the distributed optimization process of ML researchers tinkering with training loops stumbles upon a training setup that actually pushes the ML model in the direction of the basin of general intelligence. And then that model, once scaled up enough, would suddenly generalize far off-distribution. (Indeed, that’s basically what happened in the human case: the distributed optimization process of evolution searched over training architectures, and eventually stumbled upon one that was able to bootstrap itself into taking off. The “main” sharp left turn happens during the architecture search, not during the training.)
And I’m reasonably sure we’re in an agency overhang, meaning that the newborn GI would pass human intelligence in an eye-blink. (And if it won’t, it’ll likely stall at incredibly unimpressive sub-human levels, so the ML researchers will keep tinkering with the training setups until finding one that does send it over the edge. And there’s no reason whatsoever to expect it to stall again at the human level, instead of way overshooting it.)
we have a lot of data on human values
Which human’s values? IMO, “the AI will fall into the basin of human values” is kind of a weird reassurance, given the sheer diversity of human values – diversity that very much includes xenophobia, genocide, and petty vengeance scaled up to geopolitical scales. And stuff like RLHF designed to fit the aesthetics of modern corporations doesn’t result in deeply thoughtful cosmopolitan philosophers – it results in sycophants concerned with PR as much as with human lives, and sometimes (presumably when not properly adapted to a new model’s scale) in high-strung yanderes.
Let’s grant the premise that the AGI’s values will be restricted to the human range (which I don’t really buy). If the quality of the sample within the human range that we pick is as good as what GPT-4/Sydney’s masks appeared to be? Yeah, I don’t expect humans to stick around for long after.
Indeed, that’s basically what happened in the human case: the distributed optimization process of evolution searched over training architectures, and eventually stumbled upon one that was able to bootstrap itself into taking off.
Actually, I think the evidence is fairly conclusive that the human brain is a standard primate brain with essentially just a few compute-scale dials turned up (the number of distinct gene changes is tiny—something like 12, from what I recall). There is really nothing special about the human brain other than 1.) its 3x-larger-than-expected size, and 2.) extended neoteny (a longer training cycle). Neuroscientists have looked extensively for other ‘secret sauce’ and we now have some confidence in a null result: no secret sauce, just much more training compute.
Yes, but: whales and elephants have brains several times the size of humans, and they’re yet to build an industrial civilization. I agree that hitting upon the right architecture isn’t sufficient, you also need to scale it up – but scale alone doesn’t suffice either. You need a combination of scale, and an architecture + training process that would actually transmute the greater scale into more powerful cognitive algorithms.
Evolution stumbled upon the human/primate template brain. One of the forks of that template somehow “took off” in the sense of starting to furiously select for larger brain size. Then, once a certain compute threshold was reached, it took a sharp left turn and started a civilization.
The ML-paradigm analogue would, likewise, involve researchers stumbling upon an architecture that works well at some small scales and has good returns on compute. They’ll then scale it up as far as it’d go, as they’re wont to. The result of that training run would spit out an AGI, not a mere bundle of sophisticated heuristics.
And we have no guarantees that the practical capabilities of that AGI would be human-level, as opposed to vastly superhuman.
(Or vastly subhuman. But if the maximum-scale training run produces a vastly subhuman AGI, the researchers would presumably go back to the drawing board, and tinker with the architectures until they selected for algorithms with better returns on intelligence per FLOPS. There’s likewise no guarantees that this higher-level selection process would somehow result in an AGI of around human level, rather than vastly overshooting it the first time they properly scale it up.)
Yes, but: whales and elephants have brains several times the size of humans, and they’re yet to build an industrial civilization.
Size/capacity isn’t everything, but in terms of the capacity that actually matters (synaptic count, and upper cortical neuron count), elephants are, from what I recall, at great-ape cortical capacity, not human capacity. A few specific species of whales may be at or above human cortical neuron capacity, but synaptic density was still somewhat unresolved last I looked.
Then, once a certain compute threshold was reached, it took a sharp left turn and started a civilization.
Human language/culture is more the cause of our brain expansion than just its consequence. The human brain is impressive because of its relative size and oversized cost to the human body. Elephants/whales are huge, and their brains are comparatively much smaller and cheaper. Our brains grew 3x too large/expensive because it was valuable to do so. Evolution didn’t suddenly discover some new brain architecture or trick (it already had that long ago). Instead there were a number of simultaneous whole-body co-adaptations required for larger brains and linguistic technoculture to take off: opposable thumbs, expressive vocal cords, externalized fermentation (the gut is as energetically expensive as brain tissue—something had to go), and yes, larger brains, etc.
Language enabled a metasystems transition similar to the origin of multicellular life. Tribes formed as new organisms by linking brains through language/culture. This is not entirely unprecedented—insects are also social organisms, of course, but their tiny brains aren’t large enough for interesting world models. The resulting new human social organisms had intergenerational memory that grew nearly unbounded with time, and creative search capacity that scaled with tribe size.
You can separate intelligence into world-model knowledge (crystallized intelligence) and search/planning/creativity (fluid intelligence). Humans are absolutely not special in our fluid intelligence—it is just what you’d expect from a large primate brain. Humans raised entirely without language are not especially more intelligent than animals. All of our intellectual superpowers are cultural. Just as each cell can store the DNA knowledge of the entire organism, each human mind ‘cell’ can store a compressed version of much of human knowledge and gains the benefits thereof.
The cultural metasystems transition, which is solely responsible for our intellectual capability, is a one-time qualitative shift that will never recur. AI will not undergo the same transition; that isn’t how this works. The main advantage of digital minds is just speed, and to a lesser extent, copying.
I’d say this still applies even to non-LLM architectures like RL, which is the important part, but Jacob Cannell and 1a3orn will have to clarify.
We’ve basically known how to create AGI for at least a decade. AIXI outlines the 3 main components: a predictive world model, a planning engine, and a critic. The brain also clearly has these 3 main components, even somewhat cleanly separated into modules—that’s been clear for a while.
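The three-component decomposition can be made concrete with a toy loop (a deliberately trivial sketch of the pattern, not AIXI proper—the environment, function names, and goal are all invented for illustration):

```python
# Toy decomposition: predictive world model + planning engine + critic.
# The "world" is a position on a number line; the goal is position 5.

def world_model(state: int, action: str) -> int:
    """Predictive world model: simulate the next state for an action."""
    return state + (1 if action == "right" else -1)

def critic(state: int) -> int:
    """Critic: score a state by proximity to the goal position 5."""
    return -abs(state - 5)

def plan(state: int, depth: int = 3, actions=("left", "right")):
    """Planning engine: depth-limited search over action sequences,
    using the world model to simulate and the critic to evaluate."""
    if depth == 0:
        return critic(state), None
    best_value, best_action = float("-inf"), None
    for a in actions:
        value, _ = plan(world_model(state, a), depth - 1)
        if value > best_value:
            best_value, best_action = value, a
    return best_value, best_action

value, action = plan(0)
print(action)  # first step of the best 3-step plan toward the goal
```

The point is only structural: everything an agent needs divides cleanly into simulate (model), search (planner), and evaluate (critic).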
Transformer LLMs are pretty much exactly the type of generic minimal ULM arch I was pointing at in that post (though obviously I couldn’t predict the name). On a compute-scaling basis, GPT-4’s training at 1e25 FLOPs uses perhaps a bit more than human brain training, and it’s clearly not quite AGI—but mainly because it’s mostly just a world model with a bit of critic: planning is still missing. But its capabilities are reasonably impressive given that the architecture is more constrained than a hypothetical, more directly brain-equivalent fast-weight RNN of similar size.
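The brain-training comparison rests on a Fermi estimate along these lines (every figure below is an order-of-magnitude assumption commonly used in such estimates, not a measurement):

```python
# Rough Fermi estimate: lifetime "training compute" of a human brain
# vs. the reported ~1e25 FLOPs estimate for GPT-4's training run.

synapses = 1e14          # assumed ~1e14-1e15 synapses in the human brain
ops_per_synapse_hz = 1.0 # assumed ~0.1-1 effective synaptic ops/sec
seconds_per_year = 3.15e7
training_years = 30      # childhood through early adulthood

brain_ops = synapses * ops_per_synapse_hz * seconds_per_year * training_years
gpt4_flops = 1e25

print(f"brain ~ {brain_ops:.1e} synaptic ops vs GPT-4 ~ {gpt4_flops:.0e} FLOPs")
```

Under these assumptions the brain comes out around 1e23 synaptic operations, which is how one arrives at "GPT-4 used perhaps a bit more compute than human brain training"—though the answer swings a couple of orders of magnitude with the assumed synapse count and firing rate.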
Anyway, I don’t quite agree with the characterization that these models are just “interpolating valid completions of any arbitrary prompt sampled from the distribution”. Human intelligence also varies widely on a spectrum, with tradeoffs between memorization and creativity. Current LLMs mostly aren’t as creative as the more creative humans and are more impressive in breadth of knowledge, but part of that could simply be that they currently completely lack the component essential for creativity. That they accomplish so much without planning/search is impressive.
the short answer is that Steven Byrnes suspects there’s a simple generator of value, so simple that it’s dozens of lines long and if that’s the case,
Interestingly, that is closer to my position; I had thought Byrnes considered the generator of value somewhat more complex, although our views are admittedly fairly similar in general.