I am writing a post on scenario-based planning, and thought it would be useful to get some example scenario symptoms on the topic of near-term AGI.
So the scenario-based planning thought exercise goes like this:
Imagine it is ten years from now and AGI has happened (for the thought experiment just imagine you are looking at history from the outside and are capable of writing this even if all humans were turned into paperclips or whatever) and you are writing a post on how in retrospect it was obvious that AGI was just around the corner.
What symptoms of this do you see now? Contingent on this scenario being true, what symptoms would you expect in the next 10 years?
Bonus points if you write it from the point of view of your fictional self 10 years from now who has experienced this scenario.
Note: everything stated about 2021 and earlier is actually the case in the real world; everything stated about the post-2021 world is what I’d expect to see contingent on this scenario being true, and something I would give decently high probabilities of in general. I believe there is a fairly high chance of AGI in the next 10 years.
12 July 2031, Retrospective from a post-AGI world:
By 2021, it was blatantly obvious that AGI was imminent. The elements of general intelligence were already known: access to information about the world, the process of predicting part of the data from the rest and then updating one’s model to bring it closer to the truth (note that this is precisely the scientific method, though the fact that it operates in AGI by human-illegible backpropagation rather than legible hypothesis generation and discarding seems to have obscured this fact from many researchers at the time), and the fact that predictive models can be converted into generative models by reversing them: running a prediction model forwards predicts levels of X in a given scenario, but running it backwards predicts which scenarios have a given level of X. A sufficiently powerful system with relevant data, updates that improve its prediction accuracy, and the ability to be reversed to optimize any parameter in the system is a system that can learn and operate strategically in any domain.
Data wasn’t exactly scarce in 2021. The internet was packed with it, most of it publicly available, and while the use of internal world-simulations to bootstrap an AI’s understanding of reality didn’t become common in would-be general programs until 2023, it was already used in more narrow neural nets like AlphaGo by 2016; certainly researchers at the time were already familiar with the concept.
Prediction improvement by backpropagation was also well known by this point, as was the fact that this is the backbone of human intelligence. While there was a brief time when it seemed like this might be fundamentally different than the operation of the brain (and thus less likely to scale to general intelligence) given that human neurons only feed forwards, it was already known by 2021 that the predictive processing algorithm used by the human neocortex is mathematically isomorphic to backpropagation, albeit implemented slightly differently due to the inability of neurons to feed backwards.
The interchangeability of prediction and optimization or generation was known as well. Indeed, it wasn’t too uncommon to use predictive neural nets to produce images (one not-uncommon joke application was using porn filters to produce the most pornographic images possible according to the net), and the rise of OpenAI’s complementary AIs DALL-E (image from text) and CLIP (text from image) showed the interchangeability in a striking way (though careful observers might note that CLIP wasn’t reversed DALL-E; the twin nets merely demonstrated that the calculation can go either way; the reversed porn filter was a more rigorous demonstration of optimization from prediction).
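The "run the predictor backwards" idea can be sketched as gradient ascent on the input of a fixed predictive model. This is only a toy illustration, not how any of the systems named above were built: `predict` is a hypothetical stand-in for a trained net with a known peak at `TARGET`, and the step size and iteration count are arbitrary assumptions.

```python
# Hypothetical stand-in for a trained predictive net. Its peak sits at TARGET
# so the result of the search is easy to check.
TARGET = [2.0, -1.0, 0.5]

def predict(x):
    """Forward direction: given a scenario x, predict the level of X."""
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))

def grad_wrt_input(x, eps=1e-5):
    """Central-difference gradient of the prediction with respect to the input."""
    grads = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grads.append((predict(hi) - predict(lo)) / (2 * eps))
    return grads

def invert(x0, lr=0.1, steps=200):
    """Backward direction: search for the scenario that maximizes the prediction."""
    x = list(x0)
    for _ in range(steps):
        g = grad_wrt_input(x)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
    return x

best = invert([0.0, 0.0, 0.0])
print(best, predict(best))
```

The model's weights never change here; only the input does. That is the sense in which the same predictor serves as an optimizer.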
Given that all the pieces for AGI thus existed in 2021, why didn’t more people realize what was coming? For that matter, given that all the pieces existed already, why did true AGI take until 2023, and AGI with a real impact on the world until 2025? The answer to the second question is scale. All animal brains operate on virtually identical principles (though there are architectural differences, e.g. striatum vs pallium), yet the difference between a human and a chimp, let alone a human and a mouse, is massive. Until the rise of neural nets, it was commonly assumed that AGI would be a matter primarily of more clever software, rather than simply scaling up relatively simple algorithms. The fact that greater performance is primarily the result of simple size, rather than brilliance on the part of the programmers even became known as the Bitter Lesson, as it wasn’t exactly easy on designers’ egos. With the background assumption of progress as a function of algorithms rather than scale, it was easy to miss that AlphaGo already had nearly everything a modern superintelligence needs; it was just small.
From 2018 through 2021, neural nets were built at drastically increasing scales. GPT (2018) had 117 million parameters, GPT-2 (2019) had 1.5 billion, GPT-3 (2020) had 175 billion, ZeRO-Infinity (2021) had 32 trillion. By comparison to animal brains (a neural net’s parameter is closely analogous to a brain’s synapse), that is similar to an ant (very wide error bars on this one; on the other comparisons I was able to find synapse numbers, but for an ant I could only find the number of neurons), bee, mouse and cat respectively. Extrapolating this trend, it should not have been hard to see human-scale nets coming (100 trillion parameters, reached by 2022), nor AIs orders of magnitude more powerful than this.
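The extrapolation above is simple enough to check by hand. The sketch below takes the parameter counts exactly as quoted in the text and computes the year-over-year growth in orders of magnitude (OOMs), plus the remaining gap to the 100 trillion figure. The variable names are illustrative, and whether raw parameter count is the right quantity to extrapolate is a separate question.

```python
import math

# Parameter counts as quoted in the text above.
params = {2018: 117e6, 2019: 1.5e9, 2020: 175e9, 2021: 32e12}

years = sorted(params)
# Growth over the previous year, in orders of magnitude (OOMs).
ooms_per_year = {
    later: math.log10(params[later] / params[earlier])
    for earlier, later in zip(years, years[1:])
}

HUMAN_SCALE = 100e12  # the human-scale figure used in the text
gap_ooms = math.log10(HUMAN_SCALE / params[2021])

for year, ooms in ooms_per_year.items():
    print(f"{year}: +{ooms:.2f} OOMs over the previous year")
print(f"Remaining gap to 100 trillion: {gap_ooms:.2f} OOMs")
```

On these numbers the jumps are roughly 1.1, 2.1 and 2.3 OOMs, and the remaining gap is about half an OOM.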
Moreover, neural nets are much more powerful in many ways than their biological counterparts. Part of this is speed (computers can operate around a million times faster than biological neurons), but a more counterintuitive part of this is encephalization. Specifically, the requirements of operating an animal’s body are sufficiently intense that available intelligence for other things scales not with brain size, but with the ratio of brain size to body size, called the encephalization quotient (this is why elephants are not smarter than humans, despite having substantially larger brains). An artificial neural net, of course, is not trying to control a body, and can use all of its power on the question at hand. This allowed even a relatively small net like GPT-3 to do college-level work in law and history by 2021 (subjects that require actual understanding remained out of reach of neural nets until 2022, though the 2021 net Hua Zhibing, based on the Chinese Wu Dao 2.0 system, came very close). Given that a mouse-sized neural net can compete with college students, it should have been clear that human-sized nets would possess most elements of general intelligence, and the nets that soon followed at ten and one hundred times the scale of the human brain would be capable of it.
Given the (admittedly retrospective) obviousness of all this, why wasn’t it widely recognized at the time? As previously stated, much of the lag in recognition was driven by the belief that progress would be driven more by advances in algorithms than by scale. Given this belief, AGI would appear extraordinarily difficult, as one would try to imagine algorithms capable of general intelligence at small scales (DeepMind’s AlphaOmega proved in 2030 that this is mathematically impossible; you can’t have true general intelligence at much below cat-scale, and it’s very difficult to have it below human-scale!) Even among those who understood the power of scaling, the fact that it’s almost impossible to have AI do anything in the real world beyond very narrow applications like self-driving cars without reaching the general intelligence threshold made it appear plausible that simply building larger GPT-style systems wouldn’t be enough without another breakthrough. However, in 2021 DeepMind published a landmark paper entitled “Reward is Enough”, recognizing that reward-based reinforcement learning was in fact capable of scaling to general intelligence. This paper was the closest thing humanity ever got to a fire alarm for general AI: a fairly rigorous warning that existing models could scale up without limit, and that AGI was now only a matter of time, rather than requiring any further real breakthroughs.
After that paper, 2022 brought human-scale neural nets (not quite fully generally intelligent, due to lacking human instincts and only being trained on internet data, which leaves some gaps that require substantially superhuman capacity to bridge through inference alone), and 2023 brought the first real AGI, with a quadrillion parameters, powerful enough to develop an accurate map of the world purely through a mix of internet data and internal modeling to bootstrap the quality of its predictions. After that, AI was considered to have stalled, as alignment concerns prohibited the use of such nets to optimize the real world, until 2025 when a program that trained agents on modeling each others’ full terminal values from limited amounts of data allowed the safe real-world deployment of large-scale neural nets. Mankind is eternally grateful to those who raised the alarm about the value alignment problem, without which DeepMind would not have conducted that crucial hiatus, and without which our entire light cone would now be paperclips (instead of just the Horsehead Nebula, which Elon Musk converted to paperclips as a joke).
Thanks! This is the sort of thing that we aimed for with Vignettes Workshop. The scenario you present here has things going way too quickly IMO; I’ll be very surprised if we get to human-scale neural nets by 2022, and quadrillion parameters by 2023. It takes years to scale up, as far as I can tell. Gotta build the supercomputers and write the parallelization code and convince the budget committee to fund things. If you have counterarguments to this take I’d be interested to hear them!
(Also I think that the “progress stalled because people didn’t deploy AI because of alignment concerns” is way too rosy-eyed a view of the situation, haha)
Thanks for the feedback! The timeframe is based on extrapolating neural net sizes since 2018; given that the past two years have each shown two-order-of-magnitude increases, in some ways it’s actually conservative. Of course, it appears we’re in a hardware overhang for neural nets, and once that overhang is exhausted, “gotta build the supercomputers” could slow things down massively. Do you have any data on how close we are to fully utilizing existing hardware? That could tell us a lot about how long we might expect current trends to continue. Another potential argument for slowdown is that there’s stated interest in the industry to build nets up to 100 trillion parameters, but I don’t know for certain how much interest there is in scaling beyond that (though if a 100 trillion parameter net performs well enough, that would likely generate interest by itself).
DeepMind strikes me as reasonably likely to take alignment concerns into account (whether or not they succeed in addressing them is another question); a much scarier scenario IMO would be for the first AGIs to be developed by governments. Convincing the US Congress or Chinese Communist Party to slow progress due to misalignment would be borderline impossible; keeping the best AI in the hands of people willing to align it may be as important as solving alignment in the first place. That could be a remarkably difficult task, as not only do governments have vast resources and strong incentives to pursue AI research, but trying to avoid an unfriendly AI due to congressional fiat would almost certainly touch on politics, with all the associated risks of mind-killing.
I wrote that scenario to be realistic, but “DARPA wanted to wait for alignment, Congress told them to press on, everyone died six months later” is also disturbingly plausible.
I suspect the weakest part of the scenario is the extrapolation from “predictive nets can generate scenarios that optimize a given parameter” to “such a net can be strategic”. While finding the input conditions required for the desired output is a large part of strategy, so too is intelligently determining where to look in the search space, how to chain actions together, how to operate in a world that responds to attempts to alter it in a way it doesn’t to simple predictions and so on. If something substantively similar to this scenario does not happen, most of my probability for why not concentrates here.
Human-scale neural nets would be 3 OOMs bigger than GPT-3; a quadrillion parameters would be 1 OOM bigger still. According to the scaling laws and empirical compute-optimal scaling trends, it seems that anyone training a net 3 OOMs bigger than GPT-3 would also train it for, like, 2 OOMs longer, for a total of +5 OOMs of compute. For a quadrillion-parameter model, we’re looking at +6 OOMs or so.
There’s just no way that’s possible by 2023. GPT-3 costs millions of dollars of compute to train, apparently. +6 OOMs would be trillions. Presumably algorithmic breakthroughs will lower the training cost a bit, and hardware improvements will lower the compute cost, but I highly doubt we’d get 3 OOMs of lower cost by 2023. So we’re looking at a 10-billion-dollar price tag, give or take. I highly doubt anyone will be spending that much in 2023, and even if someone did, I am skeptical that the computing infrastructure for such a thing will have been built in time. I don’t think there are compute clusters a thousand times bigger than the one GPT-3 was trained on (though I might be wrong) and even if there were, to achieve your prediction we’d need it to be tens or hundreds of thousands of times bigger.
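The arithmetic behind that estimate, as a rough sketch. The comment only says GPT-3 cost "millions" to train; the $10 million baseline below is an assumption picked because it reproduces the 10-billion-dollar figure. With a $1 million baseline every result would be ten times smaller. The function and parameter names are illustrative.

```python
# ASSUMPTION: GPT-3's training compute cost about $10 million
# (the comment above only says "millions of dollars").
GPT3_COST_USD = 10e6

def training_cost(extra_param_ooms, extra_duration_ooms, savings_ooms=0):
    """Cost in USD after scaling compute up and cost-per-compute down, in powers of ten."""
    compute_ooms = extra_param_ooms + extra_duration_ooms
    return GPT3_COST_USD * 10 ** (compute_ooms - savings_ooms)

human_scale = training_cost(3, 2)                          # +5 OOMs of compute
quadrillion = training_cost(4, 2)                          # +6 OOMs of compute
quadrillion_cheaper = training_cost(4, 2, savings_ooms=3)  # with 3 OOMs of savings

print(f"human-scale:            ${human_scale:,.0f}")
print(f"quadrillion:            ${quadrillion:,.0f}")
print(f"quadrillion, 3 OOMs off: ${quadrillion_cheaper:,.0f}")
```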
On alignment optimism: As I see it, three things need to happen for alignment to succeed.
1. A company that is sympathetic to alignment concerns has to have a significant lead-time over everyone else (before someone replicates or steals code etc.), so that they can do the necessary extra work and spend the extra time and money needed to implement an alignment solution.
2. A solution needs to be found that can be implemented in that amount of lead-time.
3. This solution needs to be actually chosen and implemented by the company, rather than some other, more appealing but incorrect solution chosen instead. (There will be dozens of self-proclaimed experts pitching dozens of proposed solutions to the problem, each of which will be incorrect by default. The correct one needs to actually rise to the top in the eyes of the company leaders, which is hard since the company leaders don’t know much alignment literature and may not be able to judge good from bad solutions.)
On 1: In my opinion there are only 3 major AI projects sympathetic to alignment concerns, and the pace of progress is such (and the state of security is such) that they’ll probably have less than six months of lead time.
On 2: In my opinion we are not at all close to finding a solution that works even in principle; finding one that works in six months is even harder.
On 3: In my opinion there is only 1 major AI project that has a good chance of distinguishing viable solutions from fake solutions, and actually implementing it rather than dragging feet or convincing themselves that the danger is still in the future and not now. (e.g. “takeoff is supposed to be slow, we haven’t seen any warning shots yet, this system can’t be that dangerous yet”)
Currently, I think the probability of all three things happening seems to be <1%. Happily there’s model uncertainty, unknown unknowns, etc. which is why I’m not quite that pessimistic. But still, it’s pretty scary.
Trillions of dollars for +6 OOMs is not something people are likely to be willing to spend by 2023. On the other hand, part of the reason that neural net sizes have consistently increased by one to two OOMs per year lately is due to advances in running them cheaply. Programs like Microsoft’s ZeRO system aim explicitly at creating nets on the hundred trillion-parameter scale at an acceptable price. Certainly there’s uncertainty around how well it will work, and whether it will be extended to a quadrillion parameters even if it does, but parts of the industry appear to believe it’s practical.
Yeah, I do remember NVIDIA claiming they could do 100T param models by 2023. Not a quadrillion though IIRC.
However, (a) this may be just classic overoptimistic bullshit marketing, and thus we should expect it to be off by a couple years, and (b) they may have been including Mixture of Expert models, in which case 100T parameters is much less of a big deal. To my knowledge a 100T parameter MoE model would be a lot cheaper (in terms of compute and thus money) to train than a 100T parameter dense model like GPT, but also the performance would be significantly worse. If I’m wrong about this I’d love to hear why!
Given the timing of Jensen’s remarks about expecting trillion+ models and the subsequent MoEs of Switch & Wudao (1.2t) and embedding-heavy models like DLRM (12t), with dense models still stuck at GPT-3 scale, I’m now sure that he was referring to MoEs/embeddings, so a 100t MoE/embedding is both plausible and also not terribly interesting. (I’m sure Facebook would love to scale up DLRM another 10x and have embeddings for every SKU and Internet user and URL and video and book and song in the world, that sort of thing, but it will mean relatively little for AI capabilities or risk.) After all, he never said they were dense models, and the source in question is marketing, which can be assumed to accentuate the positive.
More generally, it is well past time to drop discussion of parameters, and switch to compute-only as we can create models with more parameters than we can train (you can fit a 100t-param with ZeRO into your cluster? great! how you gonna train it? Just leave it running for the next decade or two?) and we have no shortage of Internet data either: compute, compute, compute! It’ll only get worse if some new architecture with fast weights comes into play, and we have to start counting runtime-generated parameters as ‘parameters’ too. (eg Schmidhuber back in like the ’00s showed off archs which used… Fourier transforms? to have thousands of weights generate hundreds of thousands of weights or something. Think stuff like hypernetworks. ‘Parameter’ will mean even less than it does now.)
Wouldn’t that imply that the trajectory of AI is heavily dependent on how long Moore’s Law lasts, and how well quantum computers do?
Is your model that the jump to GPT-3 scale consumed the hardware overhang, and that we cannot expect meaningful progress on the same time scale in the near future?
Moore: yes.
QC: AFAIK it’s irrelevant?
GPT-3: it used up a particular kind of overhang you might call the “small-scale industrial CS R&D budget hardware overhang”. (It would certainly be possible to make much greater than GPT-3-level progress, but you’d need vastly larger budgets: say, 10% of a failed erectile-dysfunction drug candidate, or 0.1% of the money it takes to run a failed European fusion reactor or particle collider.) So, I continue to stand by my scaling hypothesis essay’s paradigm that as expected, we saw some imitation and catchup, but no one created a model much bigger than GPT-3, never mind one that was >100x bigger the way GPT-3 was to GPT-2-1.5b, because no one at relevant corporations truly believes in scaling or wishes to commit the necessary resources, or feels that it’s near a crunchtime where there might be a rush to train a model at the edge of the possible, and OA itself has been resting on its laurels as it turns into a SaaS startup. (We’ll see what the Anthropic refugees choose to do with their $124m seed capital, but so far they appear to be making a relaxed start of it as well.)
The overhang GPT-3 used up should not be confused with other overhangs. There are many other hardware overhangs of interest: the hardware overhang of the experience curve where the cost halves every year or two; the hardware overhang of a distilled/compressed/sparsified model; the hardware overhang of the global compute infrastructure available to a rogue agent. The small-scale industrial R&D overhang is the relevant and binding one… for now. But the others become relevant later on, under different circumstances, and many of them keep getting bigger.
Why would QC be irrelevant? Quantum systems don’t perform well on all tasks, but they generally work well for parallel tasks, right? And neural nets are largely parallel. QC isn’t to the point of being able to help yet, but especially if conventional computing becomes a serious bottleneck, it might become important over the next decade.
I think that the only known quantum speedup for relatively generic tasks is from Grover’s algorithm, which only gives a quadratic speedup. That might be significant some day, or not, depending on the cost of quantum hardware. When it comes to superpolynomial speed-ups, it is very much an active field of study which tasks are relevant, and as far as we know it’s only some very specialized tasks like integer factoring. A bunch of people are trying to apply QC to ML but AFAIK it’s still anyone’s guess whether that will end up being significant.
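To put a number on "only quadratic": for unstructured search over N items with one marked item, a classical search needs about N/2 lookups on average, while Grover's algorithm needs roughly (π/4)·√N iterations. The sketch below compares just those query counts; it says nothing about the cost of the hardware. The function names are illustrative.

```python
import math

def classical_expected_queries(n):
    """Expected lookups to find one marked item among n by checking items in random order."""
    return (n + 1) / 2

def grover_iterations(n):
    """Approximate optimal number of Grover iterations for one marked item among n."""
    return math.floor(math.pi / 4 * math.sqrt(n))

for n in (10**6, 10**12):
    print(n, classical_expected_queries(n), grover_iterations(n))
```

For a million items that is about 500,000 classical lookups against 785 Grover iterations. Whether that gap is worth anything depends on how much each quantum operation costs relative to a classical one.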
And some of the past QC claims for ML have not panned out. Like, I think there was a Quantum Monte Carlo claimed to be potentially useful for ML which could be done on cheaper QC archs, but then it turned out to be doable classically...? In any case, I have been reading about QCs all my life, and they have yet to become relevant to anything I care about; and I assume Scott Aaronson will alert us should they suddenly become relevant to AI/ML/DL, so the rest of us should go about our lives until that day.
I would be surprised if this was true, because it would mean that the blind search process of evolution was able to create a close to maximally-efficient general intelligence.
That’s why I had it that general intelligence is possible at the cat level. That said, it doesn’t seem too implausible that there’s a general intelligence threshold around human-level intelligence (not brain size), which raises the possibility that achieving general intelligence becomes substantially easier with human-scale brains (which is why evolution achieved it with us, rather than sooner or later).
This scenario is based on the Bitter Lesson model, in which size is far more important than the algorithm once a certain degree of generality in the algorithm is attained. If that is true in general, while evolution would be unlikely to hit on a maximally efficient algorithm, it might get within an order of magnitude of it.
https://davidrozado.substack.com/p/what-is-the-iq-of-chatgpt
I would like to leave this here as evidence that the model stated above is not merely right on track, but arguably too conservative. I was expecting this level of performance in mid 2023, not to see it in January with a system from last year!
I know it’s necessary for the scenario to be as quick as you wrote it, but making things all be the easiest way comes off much less believably than if there were still actual challenges involved. It’s hard to come up with fake breakthroughs to real problems, but it could really help the verisimilitude if done plausibly.
Aside from that it seems pretty well written. It is much too assertive to be accurate today [and there are parts I expect to be very different than reality], but it fits with how history is often explained.
Thanks. The assertiveness was deliberate; I wanted to take the perspective of someone in a post-AGI world saying, “Of course it worked out this way!” In our time, we can’t be as certain; the narrator is suffering from a degree of hindsight bias.
There were a couple of fake breakthroughs in there (though maybe I glossed over them more than I ought?). Specifically the bootstrapping from a given model to a more accurate one by looking for implications and checking for alternatives (this actually is very close to the self-play that helped build AlphaGo as stated, but making it work with a full model of the real world would require substantial further work), and the solution of AI alignment via machine learning with multiple agents seeking to more accurately model each other’s values (which I suspect might actually work, but which is purely speculative).
I can’t say I remember noticing either one of them being listed; perhaps they were glossed over in my remembering things as going the easy way?
I do think that learning to be more accurate through checking implications and checking alternatives is absolutely necessary for high level general intelligence unless you want to include brute force checking the entire possible state of the universe as intelligent. Bootstrapping seems very necessary for getting from where we are now.
Honestly, if it isn’t self-reflective, I view it as an ordinary algorithm.
Enough is only enough. I can masturbate all day, but that doesn’t mean I will have the necessary social skills to pass on my genes.
How is that a response?
This is what I would expect an AGI takeoff to look like if we are in fact in a “hardware overshoot”. I actually think a hardware-bound “slow takeoff” is more likely, but I’d put a scenario like this at >5%.
I should have known that AGI was near the moment that BetaStar was released. Unlike AlphaStar, which was trained using more compute than any previous algorithm and still achieved sub-human-expert performance, BetaStar was trained by a researcher on a single TPU in under a month and could beat the world’s best player even when limited to 1⁄2 of the actions-per-minute of human players. Unlike AlphaStar, which used a swarm of Reinforcement Learners to learn a strategy, BetaStar used a much more elegant algorithm that could be said to be a combination of Transformers (of GPT-3 fame) and good-old-fashioned alpha-beta pruning (the same algorithm used by Deep Blue 30 years ago).
The trick was finding a way to combine these that didn’t result in a combinatorial explosion. Not only did the trick work, but because transformers were known to work on a wide variety of domains (text, images, audio, video, gestures,...), it was immediately obvious how to apply the BetaStar algorithm to literally every domain. Motion-planning for robots, resume writing, beating the stock market.
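For readers unfamiliar with the pruning half of that combination, here is a minimal sketch of alpha-beta pruning (the AB-pruning mentioned above) on a toy game tree. BetaStar is fictional and nothing here reflects how it would work. The nested-list tree format and the `visited` bookkeeping argument are assumptions made for illustration.

```python
import math

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a game tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record which leaves were actually evaluated
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # remaining siblings cannot change the result
    return best

tree = [[3, 5], [6, 9], [1, 2]]  # root maximizes, second level minimizes
leaves_seen = []
value = alphabeta(tree, visited=leaves_seen)
print(value, leaves_seen)
```

On this tree the search never evaluates the last leaf: once the third branch is known to be worth at most 1, it cannot beat the 6 already found. The combinatorial explosion in the story is what happens when that kind of pruning is not aggressive enough for the size of the tree.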
Even if I didn’t see it coming, the experts at Google, OpenAI, and all of the world’s major governments did. Immediately a world-wide arms race was launched to see who could scale BetaStar up as fast as possible. First place meant ruling the world. Second place meant the barest chance at survival. Third place meant extinction.
OpenAI was the first to announce that they had trained a version of BetaStar that appeared to have the intelligence of a 5-year-old child. A week later Google announced that their version of BetaStar was the equivalent of a PhD grad student. The NSA didn’t say how smart their version of BetaStar was. Rather, the president of the United States announced that every single supercomputer and nuclear weapon in China, Russia, Iran, North Korea and Syria had been destroyed.
A few weeks later, every single American received a check for $10,000 and a letter explaining that the checks would keep coming every month thereafter. A few riots broke out around the world in resistance to “American Imperialism”, but after checks started arriving in other countries, most people stopped complaining.
Nobody really knows what the AI is up to these days, but life on Earth is good so far and we try not to worry about it. Space, however, much to Elon Musk’s disappointment, belongs to the AI.
Signs we are in hardware overshoot:
A novel algorithm achieves state-of-the-art performance on a well-studied problem using 2-3 orders of magnitude less compute
It is apparent to experts how this algorithm generalizes to other real-world problems.
Major institutions undertake a “Manhattan Project” style arms-race to scale up a general-purpose algorithm.
Caveats
I gave this story a “happy ending”. Hardware overshoot (and other forms of fast AGI takeoff) is the most-dangerous version of AGI because it has the ability to quickly surpass all human beings. It’s easy to imagine a version of the story where the winner of the arms race is not benevolent, or where there is an alignment-failure and humans lose control of the AGI entirely.
I would frame it a bit differently: Currently, we haven’t solved the alignment problem, so in this scenario the AI would be unaligned and it would kill us all (or do something similarly bad) as soon as it suited it. We can imagine versions of this scenario where a ton of progress is made in solving the alignment problem, or we can imagine versions of this scenario where surprisingly it turns out “alignment by default” is true and there never was a problem to begin with. But both of these would be very unusual, and distinct, scenarios, requiring more text to be written.
It was a slippery slope with those neural networks. They were able to do more and more things previously thought impossible for them. How good they were at chess, 3600 or so Elo points, was a big surprise for everyone. Leela Chess Zero came up with some theoretical breakthroughs, which were soon exploited by more algorithmic, non-NN chess engines like Stockfish in their position evaluation functions. Even back then, I was baffled by people expecting that this progression would soon stop due to some unexpected effect, which never came. Not in chess, nor anywhere else.
It was indeed a matter of “when”, no longer of “maybe not”. Yes, those first mighty AIs were quite fake; they had no real clue. Except that this mattered less and less, and was less and less true, in an increasing number of fields.
It was only a matter of time before the first translators from the gibberish weight tables learned by NNs into exact algorithms emerged. What people had previously done by hand, stealing ideas from Leela and implementing them with rigor in the algorithmic schemes of Stockfish, AI learned to do as well. Only better, of course.
By then, the slope was very slippery indeed. I still can’t comprehend how this wasn’t clear to everyone, even back then, less than 10 years ago.
I never got around to the scenario-based planning post, but things sure have changed in 10 months!
I wrote about this from a retrospective perspective already. “If computer power is the only thing standing between us and the singularity then we will finally have enough computer power… a decade ago.” Humans have a slight advantage in compute architecture now, but I doubt that’s enough to overcome computers’ other advantages.
https://www.lesswrong.com/posts/m5rvZBKyMRtFo53wZ/hardware-is-already-ready-for-the-singularity-algorithm