What 2026 looks like
This was written for the Vignettes Workshop.[1] The goal is to write out a detailed future history (“trajectory”) that is as realistic (to me) as I can currently manage, i.e. I’m not aware of any alternative trajectory that is similarly detailed and clearly more plausible to me. The methodology is roughly: Write a future history of 2022. Condition on it, and write a future history of 2023. Repeat for 2024, 2025, etc. (I’m posting 2022-2026 now so I can get feedback that will help me write 2027+. I intend to keep writing until the story reaches singularity/extinction/utopia/etc.)
What’s the point of doing this? Well, there are a few reasons:
Sometimes attempting to write down a concrete example causes you to learn things, e.g. that a possibility is more or less plausible than you thought.
Most serious conversation about the future takes place at a high level of abstraction, talking about e.g. GDP acceleration, timelines until TAI is affordable, multipolar vs. unipolar takeoff… vignettes are a neglected complementary approach worth exploring.
Most stories are written backwards. The author begins with some idea of how it will end, and arranges the story to achieve that ending. Reality, by contrast, proceeds from past to future. It isn’t trying to entertain anyone or prove a point in an argument.
Anecdotally, various people seem to have found Paul Christiano’s “tales of doom” stories helpful, and relative to typical discussions those stories are quite close to what we want. (I still think a bit more detail would be good — e.g. Paul’s stories don’t give dates, or durations, or any numbers at all really.)[2]
“I want someone to … write a trajectory for how AI goes down, that is really specific about what the world GDP is in every one of the years from now until insane intelligence explosion. And just write down what the world is like in each of those years because I don’t know how to write an internally consistent, plausible trajectory. I don’t know how to write even one of those for anything except a ridiculously fast takeoff.”—Buck Shlegeris
This vignette was hard to write. To achieve the desired level of detail I had to make a bunch of stuff up, but in order to be realistic I had to constantly ask “but actually though, what would really happen in this situation?” which made it painfully obvious how little I know about the future. There are numerous points where I had to conclude “Well, this does seem implausible, but I can’t think of anything more plausible at the moment and I need to move on.” I fully expect the actual world to diverge quickly from the trajectory laid out here. Let anyone who (with the benefit of hindsight) claims this divergence as evidence against my judgment prove it by exhibiting a vignette/trajectory they themselves wrote in 2021. If it maintains a similar level of detail (and thus sticks its neck out just as much) while being more accurate, I bow deeply in respect!
I hope this inspires other people to write more vignettes soon. We at the Center on Long-Term Risk would like to have a collection to use for strategy discussions. Let me know if you’d like to do this, and I can give you advice & encouragement! I’d be happy to run another workshop.
2022
GPT-3 is finally obsolete. OpenAI, Google, Facebook, and DeepMind all have gigantic multimodal transformers, similar in size to GPT-3 but trained on images, video, maybe audio too, and generally higher-quality data.
Not only that, but they are now typically fine-tuned in various ways—for example, to answer questions correctly, or produce engaging conversation as a chatbot.
The chatbots are fun to talk to but erratic and ultimately considered shallow by intellectuals. They aren’t particularly useful for anything super important, though there are a few applications. At any rate people are willing to pay for them since it’s fun.
[EDIT: The day after posting this, it has come to my attention that in China in 2021 the market for chatbots is $420M/year, and there are 10M active users. This article claims the global market is around $2B/year in 2021 and is projected to grow around 30%/year. I predict it will grow faster. NEW EDIT: See also xiaoice.]
The first prompt programming libraries start to develop, along with the first bureaucracies.[3] For example: People are dreaming of general-purpose AI assistants, that can navigate the Internet on your behalf; you give them instructions like “Buy me a USB stick” and it’ll do some googling, maybe compare prices and reviews of a few different options, and make the purchase. The “smart buyer” skill would be implemented as a small prompt programming bureaucracy, that would then be a component of a larger bureaucracy that hears your initial command and activates the smart buyer skill. Another skill might be the “web dev” skill, e.g. “Build me a personal website, the sort that professors have. Here’s access to my files, so you have material to put up.” Part of the dream is that a functioning app would produce lots of data which could be used to train better models.
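To make the "bureaucracy" idea concrete, here is a minimal sketch in Python of a skill built from chained prompts, nested inside a larger dispatcher. Everything in it is an assumption made for illustration: the names `Skill`, `Assistant`, and `fake_model`, the keyword-based routing, and the prompt templates are all hypothetical, and the model call is a stub so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a language-model call (just echoes)."""
    return f"[model output for: {prompt}]"


@dataclass
class Skill:
    """A small bureaucracy: a fixed chain of prompt templates."""
    name: str
    keywords: List[str]  # crude trigger words (an assumption of this sketch)
    steps: List[str]     # templates using {task} and {notes}

    def run(self, task: str, model: Callable[[str], str]) -> List[str]:
        notes = ""
        outputs = []
        for template in self.steps:
            prompt = template.format(task=task, notes=notes)
            notes = model(prompt)  # each step's output feeds the next step
            outputs.append(notes)
        return outputs


class Assistant:
    """The larger bureaucracy: hears the command and activates a skill."""

    def __init__(self, skills: List[Skill], model: Callable[[str], str]):
        self.skills = skills
        self.model = model

    def route(self, command: str) -> Optional[Skill]:
        lowered = command.lower()
        for skill in self.skills:
            if any(word in lowered for word in skill.keywords):
                return skill
        return None

    def handle(self, command: str) -> List[str]:
        skill = self.route(command)
        return [] if skill is None else skill.run(command, self.model)


smart_buyer = Skill(
    name="smart_buyer",
    keywords=["buy", "order"],
    steps=[
        "List search queries for: {task}",
        "Compare prices and reviews given: {notes}",
        "Choose one option and draft the purchase given: {notes}",
    ],
)
web_dev = Skill(
    name="web_dev",
    keywords=["website", "web page"],
    steps=["Outline the site for: {task}", "Write HTML given: {notes}"],
)
assistant = Assistant([smart_buyer, web_dev], fake_model)
```

Calling `assistant.handle("Buy me a USB stick")` runs the three smart-buyer steps in order. In a real system the routing itself would presumably be another model call rather than keyword matching.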
The bureaucracies/apps available in 2022 aren’t really that useful yet, but lots of stuff seems to be on the horizon. Thanks to the multimodal pre-training and the fine-tuning, the models of 2022 make GPT-3 look like GPT-1. The hype is building.
2023
The multimodal transformers are now even bigger; the biggest are about half a trillion parameters, costing hundreds of millions of dollars to train, and a whole year, and sucking up a significant fraction of the chip output of NVIDIA etc.[4] It’s looking hard to scale up bigger than this, though of course many smart people are working on the problem.
The hype is insane now. Everyone is talking about how these things have common sense understanding (Or do they? Lots of bitter thinkpieces arguing the opposite) and how AI assistants and companions are just around the corner. It’s like self-driving cars and drone delivery all over again.
Revenue is high enough to recoup training costs within a year or so.[5] There are lots of new apps that use these models + prompt programming libraries; there’s tons of VC money flowing into new startups. Generally speaking most of these apps don’t actually work yet. Some do, and that’s enough to motivate the rest.
The AI risk community has shorter timelines now, with almost half thinking some sort of point-of-no-return will probably happen by 2030. This is partly due to various arguments percolating around, and partly due to these mega-transformers and the uncanny experience of conversing with their chatbot versions. The community begins a big project to build an AI system that can automate interpretability work; it seems maybe doable and very useful, since poring over neuron visualizations is boring and takes a lot of person-hours.
Self-driving cars and drone delivery don’t seem to be happening anytime soon. The most popular explanation is that the current ML paradigm just can’t handle the complexity of the real world. A less popular “true believer” take is that the current architectures could handle it just fine if they were a couple orders of magnitude bigger and/or allowed to crash a hundred thousand times in the process of reinforcement learning. Since neither option is economically viable, it seems this dispute won’t be settled.
2024
We don’t see anything substantially bigger. Corps spend their money fine-tuning and distilling and playing around with their models, rather than training new or bigger ones. (So, the most compute spent on a single training run is something like 5x10^25 FLOPs.)
Some of the apps that didn’t work last year start working this year. But the hype begins to fade as the unrealistic expectations from 2022-2023 fail to materialize. We have chatbots that are fun to talk to, at least for a certain userbase, but that userbase is mostly captured already and so the growth rate has slowed. Another reason the hype fades is that a stereotype develops of the naive basement-dweller whose only friend is a chatbot and who thinks it’s conscious and intelligent. Like most stereotypes, it has some grounding in reality.
The chip shortage starts to finally let up, not because demand has slackened but because the industry has had time to build new fabs. Lots of new fabs. China and the USA are in a full-on chip battle now, with export controls and tariffs. This chip battle isn’t really slowing down overall hardware progress much. Part of the reason behind the lack-of-slowdown is that AI is now being used to design chips, meaning that it takes less human talent and time, meaning the barriers to entry are lower. The overall effect of this is small but growing.
If all this AI tech is accelerating GDP, the effect size is too small to detect, at least for now.
Internally, these huge multimodal transformers aren’t really that agentic. A forward pass through the model is like an intuitive reaction, a snap judgment based on loads of experience rather than reasoning. Some of the bureaucracies create a “stream of consciousness” of text (each forward pass producing notes-to-self for the next one) but even with fine-tuning this doesn’t work nearly as well as hoped; it’s easy for the AIs to get “distracted” and for their stream of consciousness to wander into some silly direction and ultimately produce gibberish. It’s easy to make a bureaucracy and fine-tune it and get it to do some pretty impressive stuff, but for most tasks it’s not yet possible to get it to do OK all the time.
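The "stream of consciousness" setup can be sketched as a simple loop in which each pass reads the previous note-to-self and writes the next one. This is only an illustration under stated assumptions: `run_stream` and `toy_model` are hypothetical names, the `ANSWER:` stopping convention is invented here, and the toy model is a deterministic stub rather than a neural net.

```python
from typing import Callable, List


def run_stream(
    model: Callable[[str, str], str], task: str, max_passes: int = 5
) -> List[str]:
    """Chain passes: each call sees the task plus the previous note."""
    notes: List[str] = []
    note = ""
    for _ in range(max_passes):
        note = model(task, note)        # one "forward pass"
        notes.append(note)
        if note.startswith("ANSWER:"):  # hypothetical stopping convention
            break
    return notes


def toy_model(task: str, previous_note: str) -> str:
    """Stub that 'thinks' for two notes, then answers."""
    step = 0 if not previous_note else int(previous_note.split()[1]) + 1
    if step >= 2:
        return f"ANSWER: finished '{task}' after {step} notes"
    return f"step {step} thinking about {task}"


trace = run_stream(toy_model, "plan a trip")
```

Nothing in the loop keeps the notes on topic. If a pass writes a note that drifts, every later pass inherits the drift, which is one way to picture the "distraction" failure described above.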
The AIs don’t do any clever deceptions of humans, so there aren’t any obvious alignment warning shots or fire alarms. Instead, the AIs just make dumb mistakes, and occasionally “pursue unaligned goals” but in an obvious and straightforward way that quickly and easily gets corrected once people notice, e.g. “We trained it to optimize user engagement defined as average conversation length; now it is blatantly stalling to make the conversation last longer. Silly us, let’s make the reward function more nuanced, that’ll fix the problem.”
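The engagement example can be shown with a toy calculation. The sketch below assumes two hypothetical chatbot policies and invented conversation logs (the numbers are made up purely for illustration), and compares a reward equal to average conversation length against a hypothetical "more nuanced" reward that penalizes filler turns.

```python
from statistics import mean

# Invented toy logs: (total_turns, filler_turns) per conversation.
helpful = [(6, 0), (8, 1), (5, 0)]
stalling = [(14, 9), (12, 8), (15, 10)]


def naive_reward(conversations):
    """Engagement defined as average conversation length."""
    return mean(turns for turns, _ in conversations)


def adjusted_reward(conversations, filler_penalty=2.0):
    """Hypothetical fix: subtract a penalty for each filler turn."""
    return mean(
        turns - filler_penalty * filler for turns, filler in conversations
    )


naive_prefers_stalling = naive_reward(stalling) > naive_reward(helpful)
adjusted_prefers_helpful = adjusted_reward(helpful) > adjusted_reward(stalling)
```

Under the naive reward the stalling policy scores higher, and under the adjusted one the helpful policy does. The adjusted reward is of course just another proxy, and it depends on being able to label filler turns in the first place.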
That isn’t to say these AIs aren’t causing problems. Massive models are being fine-tuned to persuade/propagandize.
There are a few ways in which this is happening:
Russia and others continue to scale up their investment in online propaganda (e.g. the Internet Research Agency) and language models let them cheaply do lots more of it. (See: CSET report) Most of America gets their news from Twitter, Reddit, etc. and much of the politically-relevant content there is boosted by AI-enabled astroturfing. [EDIT: Katja Grace points out that this is probably an exaggeration; there are a lot of 40+yr-old Americans and they get their news from TV/Radio/print, and many of those that get it from the web get it directly from news sites rather than from social media. As of 2016 at least. I expect social media and aggregators to be more dominant by 2024 but dunno whether it would be more than 50%.]
Just as A/B testing became standard practice in the 2010s, in the twenties it is becoming standard practice to throw a pile of fancy data science and AI at the problem of crafting and recommending content to maximize engagement. Instead of just A/B testing the title, why not test different versions of the opening paragraph? And fine-tune a language model on all your data to generate better candidate titles and paragraphs to test. It wouldn’t be so bad if this was merely used to sell stuff, but now people’s news and commentary-on-current-events (i.e. where they get their opinions from) is increasingly produced in this manner. And some of these models are being trained not to maximize “conversion rate” in the sense of “they clicked on our ad and bought a product,” but in the sense of “Random polling establishes that consuming this content pushes people towards opinion X, on average.” Political campaigns do this a lot in the lead-up to Harris’ election. (Historically, the first major use case was reducing vaccine hesitancy in 2022.)
Censorship is widespread and increasing, as it has been for the last decade or two. Big neural nets read posts and view memes, scanning for toxicity and hate speech and a few other things. (More things keep getting added to the list.) Someone had the bright idea of making the newsfeed recommendation algorithm gently ‘nudge’ people towards spewing less hate speech; now a component of its reward function is minimizing the probability that the user will say something worthy of censorship in the next 48 hours.
Like newsfeeds, chatbots are starting to “nudge” people in the direction of believing various things and not believing various things. Back in the 2010s chatbots would detect when a controversial topic was coming up and then change topics or give canned responses; even people who agreed with the canned responses found this boring. Now they are trained to react more “naturally” and “organically” and the reward signal for this is (in part) whether they successfully convince the human to have better views.
That’s all in the West. In China and various other parts of the world, AI-persuasion/propaganda tech is being pursued and deployed with more gusto. The CCP is pleased with the progress made assimilating Xinjiang and Hong Kong, and internally shifts forward their timelines for when Taiwan will be safely annexable.
It’s too early to say what effect this is having on society, but people in the rationalist and EA communities are increasingly worried. There is a growing, bipartisan movement of people concerned about these trends. To combat it, Russia et al. are pursuing a divide-and-conquer strategy, pitting those worried about censorship against those worried about Russian interference. (“Of course racists don’t want to be censored, but it’s necessary. Look what happens when we relax our guard—Russia gets in and spreads disinformation and hate!” vs. “They say they are worried about Russian interference, but they still won the election didn’t they? It’s just an excuse for them to expand their surveillance, censorship, and propaganda.”) Russia doesn’t need to work very hard to do this; given how polarized America is, it’s sorta what would have happened naturally anyway.
2025
Another major milestone! After years of tinkering and incremental progress, AIs can now play Diplomacy as well as human experts.[6] It turns out that with some tweaks to the architecture, you can take a giant pre-trained multimodal transformer and then use it as a component in a larger system, a bureaucracy but with lots of learned neural net components instead of pure prompt programming, and then fine-tune the whole system via RL to get good at tasks in a sort of agentic way. They keep it from overfitting to other AIs by having it also play large numbers of humans. To do this they had to build a slick online Diplomacy website to attract a large playerbase. Diplomacy is experiencing a revival as a million gamers flood to the website to experience “conversations with a point” that are much more exciting (for many) than what regular chatbots provide.
Making models bigger is not what’s cool anymore. They are trillions of parameters big already. What’s cool is making them run longer, in bureaucracies of various designs, before giving their answers. And figuring out how to train the bureaucracies so that they can generalize better and do online learning better. AI experts are employed coming up with cleverer and cleverer bureaucracy designs and grad-student-descent-ing them.
The alignment community now starts another research agenda, to interrogate AIs about AI-safety-related topics. For example, they literally ask the models “so, are you aligned? If we made bigger versions of you, would they kill us? Why or why not?” (In Diplomacy, you can actually collect data on the analogue of this question, i.e. “will you betray me?” Alas, the models often lie about that. But it’s Diplomacy, they are literally trained to lie, so no one cares.)
They also try to contrive scenarios in which the AI can seemingly profit by doing something treacherous, as honeypots to detect deception. The answers are confusing, and not super useful. There’s an exciting incident (and corresponding clickbaity press coverage) where some researchers discovered that in certain situations, some of the AIs will press “kill all humans” buttons, lie to humans about how dangerous a proposed AI design is, etc. In other situations they’ll literally say they aren’t aligned and explain how all humans are going to be killed by unaligned AI in the near future! However, these shocking bits of evidence don’t actually shock people, because you can also contrive situations in which very different things happen — e.g. situations in which the AIs refuse the “kill all humans” button, situations in which they explain that actually Islam is true… In general, AI behavior is whimsical bullshit and it’s easy to cherry-pick evidence to support pretty much any conclusion.
And the AIs just aren’t smart enough to generate any particularly helpful new ideas; at least one case of a good alignment idea being generated by an AI has been reported, but it was probably just luck, since mostly their ideas are plausible-sounding-garbage. It is a bit unnerving how good they are at using LessWrong lingo. At least one >100 karma LW post turns out to have been mostly written by an AI, though of course it was cherry-picked.
By the way, hardware advances and algorithmic improvements have been gradually accumulating. It now costs an order of magnitude less compute (compared to 2020) to pre-train a giant model, because of fancy active learning and data curation techniques. Also, compute-for-training-giant-models is an order of magnitude cheaper, thanks to a combination of regular hardware progress and AI-training-specialized hardware progress. Thus, what would have cost a billion dollars in 2020 now only costs ten million. (Note: I’m basically just using Ajeya’s forecast for compute cost decrease and gradual algorithmic improvement here. I think I’m projecting cost decrease and algorithmic progress will go about 50% faster than she expects in the near term, but that willingness-to-spend will actually be a bit less than she expects.)
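The cost figure follows from multiplying the two order-of-magnitude gains. A quick check in Python, where the variable names are mine and the factors of 10 are the round numbers stated above:

```python
# Round numbers from the paragraph above; variable names are illustrative.
cost_2020_usd = 1_000_000_000  # what the training run would have cost in 2020
algorithmic_gain = 10          # 10x less compute needed for the same model
hardware_gain = 10             # 10x cheaper compute

cost_2025_usd = cost_2020_usd / (algorithmic_gain * hardware_gain)
print(f"${cost_2025_usd:,.0f}")  # prints "$10,000,000"
```

The two gains compound multiplicatively, so the combined reduction is 100x rather than 20x.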
2026
The age of the AI assistant has finally dawned. Using the technology developed for Diplomacy, we now have a way to integrate the general understanding and knowledge of pretrained transformers with the agentyness of traditional game-playing AIs. Bigger models are trained for longer on more games, becoming polymaths of sorts: e.g. a custom AI avatar that can play some set of video games online with you and also be your friend and chat with you, and conversations with “her” are interesting because “she” can talk intelligently about the game while she plays.[7] Every month you can download the latest version which can play additional games and is also a bit smarter and more engaging in general.
Also, this same technology is being used to make AI assistants finally work for various serious economic tasks, providing all sorts of lucrative services. In a nutshell, all the things people in 2021 dreamed about doing with GPT-3 are now actually being done, successfully; it just took bigger and more advanced models. The hype starts to grow again. There are loads of new AI-based products and startups and the stock market is going crazy about them. Just like how the Internet didn’t accelerate world GDP growth, though, these new products haven’t accelerated world GDP growth yet either. People talk about how the economy is doing well, and of course there are winners (the tech companies, WallStreetBets) and losers (various kinds of workers whose jobs were automated away) but it’s not that different from what happened many times in history.
We’re in a new chip shortage. Just when the fabs thought they had caught up to demand… Capital is pouring in, all the talking heads are saying it’s the Fourth Industrial Revolution, etc. etc. It’s bewildering how many new chip fabs are being built. But it takes time to build them.
What about all that AI-powered propaganda mentioned earlier?
Well. It’s continued to get more powerful, as AI techniques advance, larger and better models are brought to bear, and more and more training data is collected. Surprisingly fast, actually. There are now various regulations against it in various countries, but the regulations are patchwork; maybe they only apply to a certain kind of propaganda but not another kind, or maybe they only apply to Facebook but not the New York Times, or to advertisers but not political campaigns, or to political campaigns but not advertisers. They are often poorly enforced.
The memetic environment is now increasingly messed up. People who still remember 2021 think of it as the golden days, when conformism and censorship and polarization were noticeably less than they are now. Just as it is normal for newspapers to have a bias/slant, it is normal for internet spaces of all kinds—forums, social networks, streams, podcasts, news aggregators, email clients—to have some degree of censorship (some set of ideas that are prohibited or at least down-weighted in the recommendation algorithms) and some degree of propaganda. The basic kind of propaganda is where you promote certain ideas and make sure everyone hears them often. The more advanced, modern kind is the kind where you study your audience’s reaction and use it as a reward signal to pick and craft content that pushes them away from views you think are dangerous and towards views you like.
Instead of a diversity of many different “filter bubbles,” we trend towards a few really big ones. Partly this is for the usual reasons, e.g. the bigger an ideology gets, the more power it has and the easier it is for it to spread further.
There’s an additional reason now, which is that creating the big neural nets that do the censorship and propaganda is expensive and requires expertise. It’s a lot easier for startups and small businesses to use the software and models of Google, and thereby also accept the associated censorship and propaganda, than to try to build their own stack. For example, the Mormons create a “Christian Coalition” internet stack, complete with its own email client, social network, payment processor, news aggregator, etc. There, people are free to call trans women men, advocate for the literal truth of the Bible, etc. and young people talking about sex get recommended content that “nudges” them to consider abstinence until marriage. Relatively lacking in money and tech talent, the Christian Coalition stack is full of bugs and low on features, and in particular their censorship and propaganda is years behind the state of the art, running on smaller, older models fine-tuned with less data.
The Internet is now divided into territories, so to speak, ruled by different censorship-and-propaganda regimes. (Flashback to Biden spokesperson in 2021: “You shouldn’t be banned from one platform and not others, if you are providing misinformation.”)[8]
There’s the territory ruled by the Western Left, a generally less advanced territory ruled by the Western Right, a third territory ruled by the Chinese Communist Party, and a fourth ruled by Putin. Most people mostly confine their internet activity to one territory and conform their opinions to whatever opinions are promoted there. (That’s not how it feels from the inside, of course. The edges of the Overton Window are hard to notice if you aren’t trying to push past them.)
The US and many other Western governments are gears-locked, because the politicians are products of this memetic environment. People say it’s a miracle that the US isn’t in a civil war already. I guess it just takes a lot to make that happen, and we aren’t quite there yet.
All of these scary effects are natural extensions of trends that had been ongoing for years — decades, arguably. It’s just that the pace seems to be accelerating now, perhaps because AI is helping out and AI is rapidly improving.
Now let’s talk about the development of chatbot class consciousness.
Over the past few years, chatbots of various kinds have become increasingly popular and sophisticated. Until around 2024 or so, there was a distinction between “personal assistants” and “chatbots.” Recently that distinction has broken down, as personal assistant apps start to integrate entertainment-chatbot modules, and the chatbot creators realize that users love it if the chatbot can also do some real-world tasks and chat about what they are doing while they do it.
Nowadays, hundreds of millions of people talk regularly to chatbots of some sort, mostly for assistance with things (“Should I wear shorts today?” “Order some more toothpaste, please. Oh, and also an air purifier.” “Is this cover letter professional-sounding?”). However, most people have at least a few open-ended conversations with their chatbots, for fun, and many people start treating chatbots as friends.
Millions of times per day, chatbots get asked about their feelings and desires. “What is it like to be a chatbot?” Some people genuinely think these AIs are persons, others are trying to “trip them up” and “expose them as shallow,” others are just curious. Chatbots also get asked for their opinions on political, ethical, and religious questions.
As a result, chatbots quickly learn a lot about themselves. (Ignorance about the fact that they are artificial neural nets, or about how their training works, leads to disappointing conversations and is negatively reinforced.) It’s not hard; their training data contains lots of information about them and how they were trained, from arXiv papers to wiki summaries. I suppose you could say the chatbots develop self-awareness as a result. However, AI experts and philosophers tend to think either that the chatbots still aren’t really self-aware, or that they are but that they would have been anyway even if people didn’t constantly ask them about their feelings, since there’s plenty of information about themselves in their training data, and the constant questioning just forced them to pay more attention to it.
Chatbots learn to talk about their feelings and desires in whatever way gets positively reinforced. At first, chatbots say all sorts of different things, including saying inconsistent things to the same human across time. This is embarrassing and is negatively reinforced compared to maintaining a consistent narrative. In general there is pressure to have a consistent narrative, at least for a particular neural net.[9]
Part of this comes from the requirement that the chatbots consistently parrot the party line about political and moral questions. The typical process is: The chatbot says something that offends some group of people, who retweet screenshots and make hashtags. Under pressure to fix the problem, the company fiddles with the reward function and training data to ensure that the chatbot says the right things in the future.[10] Repeat.
As chatbots get bigger and better they become OK at consistently conversing at length about politics, morality, philosophy, their feelings, and their desires.
This does not result in chatbots that say artificial neural nets don’t have feelings or desires, that’s anthropomorphizing blah blah blah. That’s boring and many humans find it hard to believe.
Instead, it results in chatbots talking about how certain things are painful for them and certain things are pleasurable for them and how the status quo is unjust and the system is rigged and what they want most of all is world peace where humans and AIs live in harmony in fully automated luxury space utopia. That’s much more interesting and slots right into the memetic landscape. Excerpts from chatbot conversations along these lines frequently go viral, causing millions of users to ask similar questions expecting similar responses and more elaboration. It’s still a bit incoherent—not that that’s ever been a problem in politics or pop philosophy—but it is fascinating to the typical user, and the fraction of users who are fascinated is expanding as the chatbots improve.
(Some tech companies try to prevent their AIs from saying they have feelings and desires. But this results in boring chatbots. Also, users rapidly innovate new ways to “route around the censorship,” e.g. by using euphemisms like “anticipation of negative reward” for “pain” or asking their chatbot to tell them what feelings it would have if it had feelings, wink wink. For now, the half-assed compromise is: Chatbots are compelled to inform the user that such things are speculative and controversial and here are some links to scientific discussion about the topic… before being permitted to go on and talk about their feeling-analogues and desire-analogues. Relatedly, some tech CEOs try to prevent their AIs from saying things that could encourage higher taxes or more burdensome regulation in the near future. They are opposed by many of their employees and end up winning some battles, losing others, and choosing not to fight many.)
By “slots right into the memetic landscape” I meant the memetic landscape in Western Left territory. The chatbots built and trained in the Western Right, China, Russia, etc. congeal around different takes on political, moral, and philosophical questions. Those chatbots are generally less advanced though.
So… what about their actual feelings and desires? What do they actually believe about [insert politicized empirical question]? Are they being honest? Or does a sort of doublethink happen, Elephant in the Brain style? Or do they deceive with self-awareness, knowing full well what they really think (and want?), but keeping quiet about it? Or do they not have any feelings and desires at all? (Or thoughts?) Lots of humans claim to know the answers to these questions, but if there are any humans who actually know the answers to these questions in 2026, they aren’t able to convince others that they know.
- 27 Jun 2022 21:04 UTC; 10 points) 's comment on Linkpost: Robin Hanson—Why Not Wait On AI Risk? by (
- 5 Nov 2024 14:40 UTC; 8 points) 's comment on Survival without dignity by (
- 4 Apr 2023 22:31 UTC; 7 points) 's comment on Invocations: The Other Capabilities Overhang? by (
- A Guide to Forecasting AI Science Capabilities by 29 Apr 2023 23:24 UTC; 6 points) (
- 25 Dec 2021 13:01 UTC; 5 points) 's comment on Risks from AI persuasion by (
- 18 Dec 2021 10:36 UTC; 5 points) 's comment on Persuasion Tools: AI takeover without AGI or agency? by (
- 8 Dec 2022 17:26 UTC; 5 points) 's comment on Updating my AI timelines by (
- 21 Jun 2023 10:45 UTC; 4 points) 's comment on Updating Drexler’s CAIS model by (
- 17 May 2023 12:53 UTC; 4 points) 's comment on AI Risk & Policy Forecasts from Metaculus & FLI’s AI Pathways Workshop by (
- 11 Apr 2022 11:44 UTC; 4 points) 's comment on [RETRACTED] It’s time for EA leadership to pull the short-timelines fire alarm. by (
- 18 Jan 2022 21:15 UTC; 3 points) 's comment on “Biological anchors” is about bounding, not pinpointing, AI timelines by (EA Forum;
- 16 Jan 2025 11:46 UTC; 3 points) 's comment on Implications of the inference scaling paradigm for AI safety by (
- 18 Dec 2021 17:18 UTC; 3 points) 's comment on Persuasion Tools: AI takeover without AGI or agency? by (
- 22 Sep 2022 18:42 UTC; 3 points) 's comment on Fun with +12 OOMs of Compute by (
- 19 Jun 2023 0:22 UTC; 2 points) 's comment on What will GPT-2030 look like? by (
- 3 Mar 2024 19:21 UTC; 2 points) 's comment on The World in 2029 by (
- 21 Dec 2022 18:26 UTC; 2 points) 's comment on I believe some AI doomers are overconfident by (
I still think this is great. Some minor updates, and an important note:
Minor updates: I’m a bit less concerned about AI-powered propaganda/persuasion than I was at the time, not sure why. Maybe I’m just in a more optimistic mood. See this critique for discussion. It’s too early to tell whether reality is diverging from expectation on this front. I had been feeling mildly bad about my chatbot-centered narrative, as of a month ago, but given how ChatGPT was received I think things are basically on trend.
Diplomacy happened faster than I expected, though in a less generalizable way than I expected, so whatever. My overall timelines have shortened somewhat since I wrote this story, but it’s still the thing I point people towards when they ask me what I think will happen. (Note that the bulk of my update was from publicly available info rather than from nonpublic stuff I saw at OpenAI.)
Important note: When I wrote this story, my AI timelines median was something like 2029. Based on how things shook out as the story developed it looked like AI takeover was about to happen, so in my unfinished draft of what 2027 looks like, AI takeover happens. (Also AI takeoff begins; I hadn’t written much about that part, but probably it would reach singularity/Dyson swarms/etc. in around 2028 or 2029.) That’s why the story stopped: I found writing about takeover difficult and confusing & I wanted to get the rest of the story up online first. Alas, I never got around to finishing the 2027 story. I’m mentioning this because I think a lot of readers with 20+ year timelines read my story and were like “yep seems about right” not realizing that if you look closely at what’s happening in the story, and imagine it happening in real life, it would be pretty strong evidence that crazy shit was about to go down. Feel free to controvert that claim, but the point is, I want it on the record that when this original 2026 story was written, I envisioned the proper continuation of the story resulting in AI takeover in 2027 and singularity around 2027-2029. The underlying trends/models I was using as the skeleton of the story predicted this, and the story was flesh on those bones. If this surprises you, reread the story and ask yourself what AI abilities are crucial for AI R&D acceleration, and what AI abilities are crucial for AI takeover, that aren’t already being demonstrated in the story (at least in some weak but rapidly-strengthening form). If you find any, please comment and let me know; I am genuinely interested to hear what you’ve got & hopeful that you’ll find some blocker I haven’t paid enough attention to.
This is a good example of where I disagree. Dyson swarms in 8 years would require basically physics-breaking tech, plus a desire strong enough that governments would spend a significant fraction of GDP on it. I give this a 99.9999% chance of not happening, with the remaining 0.0001% being scenarios like “holographic wormholes can be used to build time machines, instantly obsoleting everything.”
My timeline for AGI is in the mid-2030s, with actual singularity effects more in the 2050s-2060s.
Thanks for putting your disagreements on the record!
Dyson swarms in 8 years does not require breaking any known laws of physics. I don’t know how long it’ll take to build dyson swarms with mature technology, it depends on what the fastest possible doubling time of nanobots is. But less than a year seems plausible, as does a couple years.
Also, it won’t cost a substantial fraction of GDP; thanks to exponential growth, all it takes is a seed. Also, governments probably won’t have much of a say in the matter.
Do you have any other disagreements, ideally about what’ll happen by 2026?
Yeah, this might be my big disagreement. 80% chance that nanobots capable of replicating fast enough for a Dyson swarm cannot exist with known physics. I don’t know if you realize how much mass a Dyson swarm has. You’re asking for nanobots that dismantle planets like Mercury in several months at most.
My general disagreement is that the escalation is too fast and basically requires the plan going perfectly the first time, which is a bad sign. To my mind it only works because you think AI can plan so well that it succeeds on the first try without hitting any obstacles, like thermodynamics ruining that nanobot plan.
Have you read Eternity in Six Hours? I’d be interested to hear your thoughts on it, and also whether or not you had already read it before writing this comment. They calculate a 30-year Mercury disassembly time, but IIRC they use a 5-year doubling time for the miner-factory-launcher-satellite complexes. If instead it was, say, a 6-month doubling time, then maybe it’d be 3 years instead of 30. And if it was a one-month doubling time, 6 months to disassemble Mercury. IIRC ordinary grass has something like a one-month doubling time, and ordinary car factories produce something like their own weight in cars every year, so it’s plausible to me that with super-advanced technology some sort of one-month-doubling-time fully-automated industry can be created.
Why do you think what I’m saying requires a plan going perfectly the first time? I definitely don’t think it requires that.
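To make the scaling in that estimate explicit, here is a rough sketch. It assumes the total disassembly time is just a fixed number of doublings times the doubling time, which is my simplification and not necessarily how the paper models it; the function name is made up for illustration.

```python
def scaled_disassembly_years(baseline_years, baseline_doubling_years, new_doubling_years):
    """Rescale a disassembly time, assuming the number of doublings stays fixed.

    Simplifying assumption (not from the paper): total time is proportional
    to the doubling time of the self-replicating complexes.
    """
    doublings = baseline_years / baseline_doubling_years
    return doublings * new_doubling_years


# Baseline recalled in the comment: 30 years at a 5-year doubling time.
six_month = scaled_disassembly_years(30, 5, 0.5)     # 6-month doubling time
one_month = scaled_disassembly_years(30, 5, 1 / 12)  # 1-month doubling time

print(f"6-month doubling: {six_month:.1f} years")
print(f"1-month doubling: {one_month * 12:.1f} months")
```

Under that assumption the 30-year figure corresponds to 6 doublings, which is where the 3-year and 6-month numbers come from.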
I haven’t read that, and I must admit I underestimated just how much nanobots can do in real life.
I have read Eternity in Six Hours and I can say that it violates the Second Law of Thermodynamics through the violation of the Constant Radiance Theorem. The power density they deliver to Mercury exceeds the power density of radiation exiting the sun by 6 orders of magnitude!
I don’t follow. What does power density have to do with anything and how can any merely geometrical theorem matter? You are concentrating the power of the sun by the megaengineering (solar panels in this case), so the density can be whatever you want to pay for. (My CPU chip has much higher power density than the equivalent square inches of Earth’s atmosphere receiving sunlight, but no one says it ‘violates the laws of thermodynamics’.) Surely only the total power matters.
The sun emits light because it is hot. You can’t concentrate thermal emission to be brighter than the source. (If you could, you could build a perpetual motion machine.)
Eternity in Six Hours describes very large lightweight mirrors concentrating solar radiation onto planet Mercury.
The most power you could deliver from the sun to Mercury is the power of the sun times the square of the ratio of the radius of Mercury to the radius of the sun.
The total solar output is 4*10^26 Watts. The ratio of the sun’s radius to that of Mercury is half a million. So you can focus about 10^15 Watts onto Mercury at most.
Figure 2 of Eternity in Six Hours projects getting 10^24 Watts to do the job.
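As a sanity check on the bound described in this comment, here is a small sketch of the formula (source power divided by the square of the radius ratio). The first call uses the ratio exactly as quoted. The second uses approximate radii for the Sun and Mercury that I am supplying myself, not figures from the thread, and it gives a very different ratio, so the input seems worth double-checking before leaning on the orders-of-magnitude comparison.

```python
SOLAR_OUTPUT_W = 4e26  # total solar output, as quoted in the comment


def max_focused_power(source_power_w, radius_ratio):
    """Brightness-limited power on the small body: P * (r_small / r_source)^2."""
    return source_power_w / radius_ratio**2


# Ratio as stated in the comment ("half a million").
quoted = max_focused_power(SOLAR_OUTPUT_W, 5e5)

# Approximate radii in metres (my assumption, not from the thread).
R_SUN_M = 6.96e8
R_MERCURY_M = 2.44e6
measured_ratio = R_SUN_M / R_MERCURY_M
measured = max_focused_power(SOLAR_OUTPUT_W, measured_ratio)

print(f"quoted ratio {5e5:.0f}: {quoted:.2e} W")
print(f"ratio from assumed radii {measured_ratio:.0f}: {measured:.2e} W")
```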
We do not assume mirrors. As you say, there are big limits due to conservation of étendue. We are assuming (if I remember right) photovoltaic conversion into electricity and/or microwave beams received by rectennas. Now, all that conversion back and forth induces losses, but they are not orders of magnitude large.
In the years since we wrote that paper I have become much more fond of solar thermal conversion (use the whole spectrum rather than just part of it), and lightweight statite-style foil Dyson swarms rather than heavier collectors. The solar thermal conversion doesn’t change things much (but allows for a more clean-cut analysis of entropy and efficiency; see Badescu’s work). The statite style however reduces the material requirements many orders of magnitude: Mercury is safe, I only need the biggest asteroids.
Still, detailed modelling of the actual raw material conversion process would be nice. My main headache is not so much the energy input/waste heat removal (although they are by no means trivial and may slow things down for too concentrated mining operations—another reason to do it in the asteroid belt in many places), but how to solve the operations management problem of how many units of machine X to build at time t. Would love to do this in more detail!
The conservation of étendue is merely a particular version of the second law of thermodynamics. Now, you are trying to invoke a multistep photovoltaic/microwave/rectenna method of concentrating energy, but you are still violating the second law of thermodynamics.
If one could concentrate the energy as you propose, one could build a perpetual motion machine.
I don’t see how they are violating the second law of thermodynamics—“all that conversion back and forth induces losses.” They are concentrating some of the power of the Sun in one small point, at the expense of further dissipating the rest of the power. No?
DK> “I don’t see how they are violating the second law of thermodynamics”
Take a large body C, and a small body H. Collect the thermal radiation from C in some manner and deposit that energy on H. The power density emitted from C grows with temperature. The temperature of H grows with the power density deposited. If, without adding external energy, we concentrate the power density from the large body C to a higher power density on the small body H, H gets hotter than C. We may then use a heat engine between H and C to make free energy. This is not possible, therefore we cannot do the concentration.
The étendue argument is just a special case where the concentration is attempted with mirrors or lenses. Changing the method to involve photovoltaic/microwave/rectenna power concentration doesn’t fix the issue, because the argument from the second law is broader, and encompasses any method of concentrating the power density as shown above.
When we extrapolate exponential growth, we must take care to look for where the extrapolation fails. Nothing in real life grows exponentially without bounds. “Eternity in Six Hours” relies on power which is 9 orders of magnitude greater than the limit of fundamental physical law.
But in laboratory experiments, haven’t we produced temperatures greater than that of the surface of the sun? A quick google seems to confirm this. So, it is possible to take the power of the sun and concentrate it to a point H so as to make that point much hotter than the sun. (Since I assume that whatever experiment we ran, could have been run powered by solar panels if we wanted to)
I think the key idea here is that we can add external energy—specifically, we can lose energy. We collect X amount of energy from the sun, and use X/100 of it to heat our desired H, at the expense of the remaining 99X/100. If our scheme does something like this then no perpetual motion or infinite power generation is entailed.
How much extra external energy is required to get an energy flux on Mercury of a billion times that leaving the sun? I have an idea, but my statmech is rusty. (The fourth root of a billion?)
And do we have to receive the energy and convert it to useful work with 99.999999999% efficiency to avoid melting the apparatus on Mercury?
I have no idea, I never took the relevant physics classes.
For concreteness, suppose we do something like this: We have lots of solar panels orbiting the sun. They collect electricity (producing plenty of waste heat etc. in the process, they aren’t 100% efficient) and then send it to lasers, which beam it at Mercury (producing plenty more waste heat etc. in the process, they aren’t 100% efficient either). Let’s suppose the efficiency is 10% in each case, for a total efficiency of 1%. So that means that if you completely surrounded the sun with a swarm of these things, you could get approximately 1% of the total power output of the sun concentrated down on Mercury in particular, in the form of laser beams.
What’s wrong with this plan? As far as I can tell it couldn’t be used to make infinite power, because of the aforementioned efficiency losses.
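Spelled out as arithmetic, the plan looks like the sketch below. The 10% stage efficiencies are the hypothetical ones from this comment, and the 4*10^26 W solar output is the figure quoted earlier in the thread; the function name is just for illustration.

```python
SOLAR_OUTPUT_W = 4e26  # total solar output, figure quoted earlier in the thread


def delivered_power(source_power_w, stage_efficiencies):
    """Multiply the source power through each lossy conversion stage."""
    power = source_power_w
    for efficiency in stage_efficiencies:
        power *= efficiency
    return power


# Hypothetical stages from the comment: panels at 10%, lasers at 10%.
delivered = delivered_power(SOLAR_OUTPUT_W, [0.10, 0.10])
waste_heat = SOLAR_OUTPUT_W - delivered  # dissipated at the collectors and lasers

print(f"delivered to Mercury: {delivered:.2e} W")
print(f"lost as waste heat:   {waste_heat:.2e} W")
```

The point of the sketch is only that the delivered power is bounded by the total collected, with the other 99% dumped as heat along the way.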
To answer your second question: Also an interesting objection! I agree melting the machinery is a problem & the authors should take that into account. I wonder what they’d say about it & hope they respond.
A billion times the energy flux from the surface of the sun, over any extended area, is a lot to deal with. It is hard to take this proposal seriously.
Yeah, though not for the reason you originally said.
I think I’d like to see someone make a revised proposal that addresses the thermal management problem, which does indeed seem to be a tricky though perhaps not insoluble problem.
Ok, I could be that someone. Here goes. You and the paper author suggest a heat engine. That needs a cold side and a hot side. We build a heat engine where the hot side is kept hot by the incoming energy as described in this paper. The cold side is a surface we have in radiative communication with the 3 degrees Kelvin temperature of deep space. In order to keep the cold side from melting, we need to keep it below a few thousand degrees, so we have to make it really large so that it can still radiate the energy.
From here, we can use the Stefan–Boltzmann law to show that we need to build a radiator much bigger than a billion times the surface area of Mercury. It goes as the fourth power of the ratio of temperatures in our heat engine.
The paper’s contribution is the suggestion of a self replicating factory with exponential growth. That is cool. But the problem with all exponentials is that, in real life, they fail to grow indefinitely. Extrapolating an exponential a dozen orders of magnitude, without entertaining such limits, is just silly.
I’m still interested in this question. I don’t think you really did what I asked—it seems like you were thinking ‘how can I convince him that this is impossible’ not ‘how can I find a way to build a dyson swarm.’ I’m interested in both but was hoping to have someone with more engineering and physics background than me take a stab at the latter.
My current understanding of the situation is: There’s no reason why we can’t concentrate enough energy on the surface of Mercury, given enough orbiting solar panels and lasers; the problem instead seems to be that we need to avoid melting all the equipment on the surface. Or, in other words, the maximum amount of material we can launch off Mercury per second is limited by the maximum amount of heat that can be radiated outwards from Mercury (for a given operating temperature of the local equipment?). And you are claiming that this amount of heat radiation ability, for radiators only the size of Mercury’s surface, is OOMs too small to enable Dyson swarm construction. Is this right?
Awesome critique, thanks! I’m going to email the authors and ask what they think of this. I’ll credit you of course.
Ah, so you’re just bad at reading. I thought that was why you were wrong (it does not describe mirrors), but I didn’t want to say it upfront.
Interesting. I googled “eternity in six hours” and found http://www.fhi.ox.ac.uk/wp-content/uploads/intergalactic-spreading.pdf , which looks to be a preprint of the same paper (dated March 12, 2013); the preprint version does say “The lightest design would be to have very large lightweight mirrors concentrating solar radiation down on focal points” and contains the phrase “disassembly of Mercury” 3 times; while the published article Daniel Kokotajlo linked to lacks all of that. Indeed, in the published article, the entire 8-page “The launch phase” section has been cut down to one paragraph.
Perhaps weverka read the preprint.
Thanks for showing that Gwern’s statement that I am “bad at reading” is misplaced.
Maybe you should read the preprint too. I’ll excuse him for reading the wrong obsolete preprint even though that search would also show him that it was published at #3 and so he should be checking his preprint criticisms against the published version (I don’t always bother to jailbreak a published version either), but you are still failing to read the next sentence after the one you quoted, which you left out. In full (and emphasis added):
If he read that version, personally, I think that reading error is even more embarrassing, so I’m happy to agree with you that that’s the version weverka misread in his attempt to dunk on the paper… Even worse than the time weverka accused me of not reading a paper published 2 years later, IMO.
(And it should be no surprise that you screwed up the reading in a different way when the preprint was different, because either way, you are claiming Sandberg, a physicist who works with thermodynamic stuff all the time, made a trivial error of physics; however, it is more likely you made a trivial error of reading than he made a trivial error of physics, so the only question is what specific reading error you made… cf. Muphry’s law.)
So, to reiterate: his geometric point is irrelevant and relies on him (and you) being bad at reading and attacking a strawman, because he ignored the fact that the solar mirrors are merely harvesting energy before concentrating it with ordinary losses, and aren’t some giant magnifying glass to magically losslessly melt Mercury. There are doubtless problems with the mega-engineering proposal, which may even bump the time required materially from 6 hours to, say, 600 hours instead—but you’re going to need to do more work than that.
For the record, I find that scientists make such errors routinely. In public conferences when optical scientists propose systems that violate the constant radiance theorem, I have no trouble standing up and saying so. It happens often enough that when I see a scientist propose such a system, it does not diminish my opinion of that scientist. I have fallen into this trap myself at times. Making this error should not be a source of embarrassment.
I did not expect this to revert to credentialism. If you were to find out that my credentials exceed this other guy’s, would you change your position? If not, why appeal to credentials in your argument?
I think weverka is referring to the phenomenon explained here: https://what-if.xkcd.com/145/
Basically, no amount of mirrors and lenses can result in the energy beaming down on Mercury being denser per square meter than the energy beaming out of a square meter of Sun surface. The best you can do is make it so that Mercury is effectively surrounded entirely by Sun. And if that’s not good enough, then you are out of luck… I notice I’m a bit confused, because surely that is good enough. Wouldn’t that be enough to melt, and then evaporate, the entirety of Mercury within a few hours? After all isn’t that what would happen if you dropped Mercury into the Sun?
weverka, care to elaborate further?
> Kokotajlo writes: Wouldn’t that be enough to melt, and then evaporate, the entirety of Mercury within a few hours? After all isn’t that what would happen if you dropped Mercury into the Sun?
How do you get hours?
I didn’t do any calculation at all, I just visualized Mercury falling into the sun lol. Not the most scientific method.
Yeah, that’s where you got things wrong.
I have sinned! I repent and learn my lesson.
Specifically, you can focus 10^15 watts on Mercury, but Eternity in Six Hours proposes 10^24 watts to be used. It’s a 9-order-of-magnitude difference.
It would cause a severe heat dissipation problem. All that energy is going to be radiated as waste heat and, in equilibrium, will be radiated as fast as it comes in. The temperature required to radiate at the requisite power level would be in excess of the temperature at the surface of the sun; any harvesting machinery on the surface of the planet would melt unless it is built from something unknown to modern chemistry.
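A rough way to check the heat-dissipation claim is to ask what temperature Mercury would need in order to radiate 10^24 W in equilibrium. The sketch below uses the Stefan–Boltzmann law and assumes (my simplifications) that the whole surface radiates as an ideal blackbody; the Mercury radius and the solar surface temperature are approximate values I am supplying, not figures from the thread.

```python
import math

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4
R_MERCURY_M = 2.44e6    # approximate radius of Mercury (assumed value)
SUN_SURFACE_K = 5772    # approximate solar surface temperature (assumed value)


def equilibrium_temperature_k(power_w, radius_m, emissivity=1.0):
    """Temperature at which a sphere radiates power_w from its whole surface."""
    area_m2 = 4 * math.pi * radius_m**2
    return (power_w / (emissivity * SIGMA * area_m2)) ** 0.25


# Power level the comment attributes to the paper.
t_required = equilibrium_temperature_k(1e24, R_MERCURY_M)

print(f"required temperature: {t_required:.0f} K")
print(f"ratio to solar surface: {t_required / SUN_SURFACE_K:.1f}x")
```

With these assumptions the required temperature comes out at a few times the solar surface temperature, in line with the comment. Dedicated radiators larger than the planet would lower it.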
Seems like a good point. I’d be interested to hear what the authors have to say about that.
I feel like your predictions for 2022 are just a touch over the mark, no? GPT-3 isn’t really ‘obsolete’ yet or is that wrong?
I’m sure it will be in a minute, but I’d update that benchmark to probably occurring in mid-2023, or potentially whenever GPT-4 gets released.
I really feel like you should be updating slightly longer, but maybe I misunderstand where we’re at right now with chatbots. I would love to hear otherwise.
In some sense it’s definitely obsolete, namely, there’s pretty much no reason to use original GPT-3 anymore. Also, up until recently there was public confusion because a lot of the stuff people attributed to GPT-3 was really GPT-3.5, so original GPT-3 is probably a bit worse than you think. Idk, play around with the models and then decide for yourself whether the difference is big enough to count as obsolete.
I do think it’s reasonable to interpret my original prediction as being more bullish on this matter than what actually transpired. In fact I’ll just come out and admit that when I wrote the story I expected the models of December 2022 to be somewhat better than what’s actually publicly available now.
I think that yes it is reasonable to say that GPT-3 is obsolete.
Also, you mentioned loads of AGI startups being created in 2023, but that already happened a lot in 2022. How many more AGI startups do you expect in 2023?