Predictions are hard, especially about the future. On this we can all agree.
Tyler Cowen offers a post worth reading in full in which he outlines his thinking about AI and what is likely to happen in the future. I see this as essentially the application of Stubborn Attachments and its radical agnosticism to the question of AI. I see the logic in applying this to short-term AI developments the same way I would apply it to almost all historic or current technological progress. But I would not apply it to AI that passes sufficient capabilities and intelligence thresholds, which I see as fundamentally different.
I also notice a kind of presumption that things in most scenarios will work out, and that doom depends on particular ‘distant possibilities’ that often have many logical dependencies or require a lot of things to individually go as predicted. I would say that those possibilities are not so distant or unlikely. More importantly, the result is robust: once the intelligence and optimization pressure that matters is no longer human, most of the outcomes are existentially bad by my values, and one can reject or ignore many or most of the detailed assumptions and still see this.
My approach is to respond in-line to Tyler’s post, then close with a conclusion section that summarizes the disagreements.
In several of my books and many of my talks, I take great care to spell out just how special recent times have been, for most Americans at least. For my entire life, and a bit more, there have been two essential features of the basic landscape:
1. American hegemony over much of the world, and relative physical safety for Americans.
2. An absence of truly radical technological change.
I notice I am still confused about ‘truly radical technological change’ when in my lifetime we went from rotary landline phones, no internet and almost no computers to a world in which most of what I and most people I know do all day involves their phones, internet and computers. How much of human history involves faster technological change than the last 50 years?
When I look at AI, however, I strongly agree that what we have experienced is not going to prepare us for what is coming, even in the most slow and incremental plausible futures that don’t involve any takeoffs or existential risks. AI will be a very different order of magnitude of speed, even if we otherwise stand still.
Unless you are very old, old enough to have taken in some of WWII, or were drafted into Korea or Vietnam, probably those features describe your entire life as well.
In other words, virtually all of us have been living in a bubble “outside of history.”
Now, circa 2023, at least one of those assumptions is going to unravel, namely #2. AI represents a truly major, transformational technological advance. Biomedicine might too, but for this post I’ll stick to the AI topic, as I wish to consider existential risk.
#1 might unravel soon as well, depending how Ukraine and Taiwan fare. It is fair to say we don’t know, nonetheless #1 also is under increasing strain.
The relative physical safety we enjoy, as I see it, mostly has nothing to do with American hegemony, and everything to do with other advances, and with the absurd trade-offs we have made in the name of physical safety, to the point of letting it ruin our ability to live life and our society’s ability to do things.
When there is an exception, as there recently was, we do not handle it well.
Have we already forgotten March of 2020? How many times in history has life undergone that rapid and huge a transformation? According to GPT-4, the answer is zero. It names the Black Death, the Industrial Revolution and World War II, while admitting they fall short. Yes, those had larger long-term impacts, by far (or so we think for now, I agree but note it is too soon to tell), yet they impacted things relatively slowly.
I for one would call my experience of Covid living in history, in Tyler’s sense. I would also note that almost all of the associated changes were negative. Life really did get much worse, and to this day remains much worse than the counterfactual, with major hits to our health, economy, currency, national debt, society, social links and institutional trust.
In several other ways, too, I have felt like I was living in history, even if I’ve known no war or danger of conquest. Cultural change in my lifetime and especially in the last 10 years has been extremely rapid on many fronts, far more than I would expect to find in most 10-year historical periods, whatever you think of the changes. My children’s lives are not so similar to my experience.
It’s not always about war. And if you’re asking ‘how many times did I game out, fully seriously, because I felt it was important to know the answer, a potential breakdown of social order or peace in the United States in the last 5 years, even if those scenarios have so far not come to pass?’ I will simply say the answer isn’t one or zero.
Hardly anyone you know, including yourself, is prepared to live in actual “moving” history. It will panic many of us, disorient the rest of us, and cause great upheavals in our fortunes, both good and bad. In my view the good will considerably outweigh the bad (at least from losing #2, not #1), but I do understand that the absolute quantity of the bad disruptions will be high.
Yes. I believe that a lot of us have already been severely panicked and disoriented. Often multiple times, not only by Covid. Historically people would likely have better rolled with such punches. That was not what I observed.
I am reminded of the advent of the printing press, after Gutenberg. Of course the press brought an immense amount of good, enabling the scientific and industrial revolutions, among many other benefits. But it also created writings by Lenin, Hitler, and Mao’s Red Book. It is a moot point whether you can “blame” those on the printing press, nonetheless the press brought (in combination with some other innovations) a remarkable amount of true, moving history. How about the Wars of Religion and the bloody 17th century to boot? Still, if you were redoing world history you would take the printing press in a heartbeat. Who needs poverty, squalor, and recurrences of Genghis Khan-like figures?
Yes. We can agree Printing Press Good, and almost every other technological invention of the past 10,000 years good. Fire, The Wheel, Agriculture, Iron Working, Writing, Gunpowder, Steam Engines, Industrialization, Automobiles, Airplanes, Computers, Phones, you name it, it’s all pretty great with notably rare exceptions and there was mostly no reasonable path to stopping those exceptions for long.
I’d go another step, and say that the list of suppressed technologies that we did stop or slow also contains mostly good things. Of course there are obvious exceptions, like gain of function research and bioweapons and nukes to make the rubble bounce, and also I can see the argument for things like television and social media and crack cocaine and American cheese (so it’s clear it’s not only lethal weapons here, don’t let this list distract you), but these are the exceptions that prove the rule.
The Printing Press was still quite the gradual transition. There is a reason, which I had confirmed by GPT-4, that you could play Europa Universalis, or live your life as most historical people did in Europe, from when the printing press comes online in 1440 until the Reformation comes knocking, and mostly not notice.
I’d also say that AI is fundamentally different from all prior inventions. This is an amazing tool, but it is not only a tool, it is the coming into existence of intelligence that exceeds our own in strength and speed, likely vastly so. This is not the same danger as Lenin or Hitler or Mao writing things or using tools.
But since we are not used to living in moving history, and indeed most of us are psychologically unable to truly imagine living in moving history, all these new AI developments pose a great conundrum. We don’t know how to respond psychologically, or for that matter substantively. And just about all of the responses I am seeing I interpret as “copes,” whether from the optimists, the pessimists, or the extreme pessimists (e.g., Eliezer). No matter how positive or negative the overall calculus of cost and benefit, AI is very likely to overturn most of our apple carts, most of all for the so-called chattering classes.
Yes. AI is very much going to overturn a lot of our apple carts. I continue to reserve judgment on ‘most’ depending on the scenario, especially if you don’t only consider the ‘chattering classes.’
The reality is that no one at the beginning of the printing press had any real idea of the changes it would bring. No one at the beginning of the fossil fuel era had much of an idea of the changes it would bring. No one is good at predicting the longer-term or even medium-term outcomes of these radical technological changes (we can do the short term, albeit imperfectly). No one. Not you, not Eliezer, not Sam Altman, and not your next door neighbor.
How well did people predict the final impacts of the printing press? How well did people predict the final impacts of fire? We even have an expression “playing with fire.” Yet it is, on net, a good thing we proceeded with the deployment of fire (“Fire? You can’t do that! Everything will burn! You can kill people with fire! All of them! What if someone yells “fire” in a crowded theater!?”).
How do we know this? What counts as ‘good at predicting’ such changes? I think in broad strokes you could make some very good predictions in 1440, and I am guessing that many did.
If you want to say fire caused ‘all of human history,’ then I mean, sure, the details of that were very hard to predict. Not fair.
Then again, if you’re giving fire that level of credit, what if your prediction was ‘humans will harness more of nature, will discover more things, will be fruitful and multiply and dominate the Earth and increasingly over time destroy that which existing other animals value?’
That doesn’t seem like a crazy prediction to make. It isn’t specific but it doesn’t have to be. Nor is it a distant possibility, or extremely unlikely because there are so many other possible outcomes.
The ancients agreed that was a good and fair prediction. Consider the Myth of Prometheus, very on point. In the version I was taught, the existing powers (The Greek Gods) forbade giving humans the ability to recursively self-improve their abilities (harness fire, this exact thing) because this would allow the humans to displace and disempower the Gods over time, the same way the Gods had displaced the Titans.
Which is indeed exactly what happened, in any important sense. Even if the Greek Gods really did exist the future would now belong to the humans or to their AIs. Did that result in something the Greek Gods would still value? In some ways yes, in some very important ways no. The reason the answer is partly yes is because the Greek Gods were, in many real senses, stories created by and therefore aligned with humans.
The ‘within-a-lifetime’ predictions for the outcomes from fire, of course, seem like they’d be pretty fair and plausibly accurate, for what that’s worth.
So when people predict a high degree of existential risk from AGI, I don’t actually think “arguing back” on their chosen terms is the correct response. Radical agnosticism is the correct response, where all specific scenarios are pretty unlikely. Nonetheless I am still for people doing constructive work on the problem of alignment, just as we do with all other technologies, to improve them. I have even funded some of this work through Emergent Ventures.
What counts as a ‘specific scenario’ here?
Consider the predictions made above, about what would happen after the invention of fire, or similar predictions made in the wake of other basic discoveries, or based on homo sapiens having reached the necessary thresholds of intelligence and perhaps cultural transmission.
If you had predicted, early in the Industrial Revolution, that industry and those who mastered it would rapidly dominate the globe and any who did not embrace it, and anything they did not value would mostly get destroyed?
What if I claim that if we encountered Robin Hanson’s grabby aliens tomorrow, as unlikely as that is statistically, and they didn’t care about us or go galaxy brained on acausal trade or something, we would all be super duper dead, and at a minimum we are not going to be getting much of that cosmic endowment?
I would claim that my core model of AI risk is largely, to me, on a similar level. Not ‘can’t travel faster than speed of light’ or ‘Memento Mori’ or ‘if you raise the price you lower the quantity’ but also not that different either?
If you create something with superior intelligence, that operates at faster speed, that can make copies of itself, what happens by default?
That new source of intelligence will rapidly gain control of the future. It is very, very difficult to prevent this from happening even under ideal circumstances.
Some version of that intelligence will be pointed by someone, at some point, towards achieving some goal. Even if you think it is possible to design powerful AIs that are not agents and are not used as agents, and even to use them to perform miracles or pivotal acts, have you been watching what humans are doing? We are already designing tools to explicitly turn GPT into an agent.
So it doesn’t matter. There will be an agent and there will be a goal. Whatever that goal is, again by default, the best way to maximally and most reliably achieve it will be to rapidly take control of the future, provided that is within the powers of the system. Which it will be. So the system will do that. (If you want to strengthen this requirement, it is easy to see that ambitious open-ended goals will be given to AIs pretty often anyway unless we prevent this.)
This may or may not require or involve recursive self-improvement. The system then, unless again something goes very, very right, wipes out all value in the universe.
Greatly more powerful things take over from much less powerful things. Things that are much more intelligent than us, and faster than us to boot, and that can be copied, and that can be pointed towards goals, will be greatly more powerful than us. None of this requires complex detailed prediction.
This is not like our past tools. Even in most scenarios where many impossible-seeming things go spectacularly well, things do not turn out so hot for us humans or our values.
There is no reason to think things should work out well for us. If we have true radical agnosticism, consider most possible arrangements of matter, or most low-entropy possible arrangements of matter, or most possible inhabitants of mind-space, or most possible advanced intelligences and their possible values, or anything like that.
If you think it would be fine if all the humans get wiped out and replaced by something even more bizarre and inexplicable and that mostly does not share our values, provided it is intelligent and complex, and don’t consider that doom, then that is a point of view. We are not in agreement. I would also warn that we should not presume that the resulting universe would actually be that likely to have that much intelligence or complexity when the music stops. Again, under radical agnosticism or otherwise, one should notice that most configurations of the universe seem pretty wasteful and devoid of value.
Do I and similar others get into tons of detail then, about how stopping this transfer of power from humans to AIs from happening, or preventing all the humans from dying in its wake, is super difficult? Oh yes, because there are so many non-obvious and complex reasons why this is hard, and why so many imagined alternative scenarios are actually Can’t Happens or damn near one.
I do think the detailed discussions are valuable, that they are vital context to modeling what might happen, and towards getting a reasonable distribution of possible outcomes. They’re still beyond scope here.
Key here is that in my model, the space of possible futures that involve the creation of transformational AI has quite a lot of no good very bad zero-value options in it, including but not limited to the default or baseline scenarios. Whereas the space of good outcomes is hard to locate, and requires specific things to go right in some unexpected way.
When we notice that Earth seems highly optimized for humans and human values, that is because it is being optimized by human intelligence, without an opposing intelligent force. If that cause changes, the results change.
I am a bit distressed each time I read an account of a person “arguing himself” or “arguing herself” into existential risk from AI being a major concern. No one can foresee those futures!
Once you keep up the arguing, you also are talking yourself into an illusion of predictability. Since it is easier to destroy than create, once you start considering the future in a tabula rasa way, the longer you talk about it, the more pessimistic you will become. It will be harder and harder to see how everything hangs together, whereas the argument that destruction is imminent is easy by comparison. The case for destruction is so much more readily articulable — “boom!”
I don’t think this would pass as a model of what most such people are thinking. It certainly does not pass mine.
I might say: The core reason we have so often seen the creation of things we value win over destruction is, once again, that most of the optimization pressure from strong intelligences was pointing in that direction, that it was coming from humans, and the tools weren’t applying intelligence or optimization pressure. That’s about to change.
I almost never hear arguments like the one quoted above made, at least not in any load-bearing way. I cannot recall anyone saying ‘easier to destroy than create,’ although that is certainly true as far as it goes. It is more like in the long run ‘it is easier for someone to create and do than to ensure no one creates and does’ and ‘once created this thing will be able to create or destroy at will and this will involve our destruction, whatever is created.’ We are very much not saying ‘things get complex and I can’t see a solution, boom is easy, so probably boom.’
One does not need ‘predictability’ to gain some insight into what things might plausibly happen versus which can’t happen, or which types of scenarios are relatively more likely, even under quite a lot of uncertainty. Waving one’s hand and saying ‘can’t predict things’ doesn’t get you out of this. Nor does saying ‘tools have always worked out fine in the past, we’re still here and things are good.’
Nor do we get to make the move ‘all things are possible, therefore doomed scenarios are distant and not likely and not worth worrying about.’ What makes you think that most future scenarios are not doomed, in terms of what you would value about the universe, however you think about that, and we need only worry about particular narrow specific dooms that don’t add up to much probability mass? What makes you think that us puny humans get to keep deciding what happens, or that what ends up happening will be something of which we approve?
I presume this is essentially, at core, Tyler’s Stubborn Attachments argument. That more economic growth and prosperity creates more value, even if it might not take the form you would like, and that in the long run nothing else matters?
I don’t entirely buy that argument even if transformational AI or AGI were not a practical physical possibility. I am sympathetic to that view in such worlds; I think it is true on the margin for many people and most policy choices. I have some faith in futures still fundamentally based on what humans want and decide, in good Hayekian style. I would worry about stubborn equilibria and path dependence, including their effects on longer-term growth, but I would worry far less about this than most others do.
Yet at some point your inner Hayekian (Popperian?) has to take over and pull you away from those concerns. (Especially when you hear a nine-part argument based upon eight new conceptual categories that were first discussed on LessWrong eleven years ago.) Existential risk from AI is indeed a distant possibility, just like every other future you might be trying to imagine. All the possibilities are distant, I cannot stress that enough. The mere fact that AGI risk can be put on a par with those other also distant possibilities simply should not impress you very much.
This seems like a strange reference class claim, one that grasps at associations and affect. One cannot say every future is equally distant, or apply that claim to arbitrary divisions of possible classes of futures; none of this is how probability works.
Given this radical uncertainty, you still might ask whether we should halt or slow down AI advances. “Would you step into a plane if you had radical uncertainty as to whether it could land safely?” I hear some of you saying.
I would put it this way. Our previous stasis, as represented by my #1 and #2, is going to end anyway. We are going to face that radical uncertainty anyway. And probably pretty soon. So there is no “ongoing stasis” option on the table.
I can say that if there were a plane where I had radical uncertainty, or even 90% confidence, in its ability to land safely, I would not get on that plane. If you said ‘but you will eventually get on a plane at some point’ I would say all right, let’s work on our air travel technology and build a different plane. If you told me ‘yes, we might not have to put everyone on Earth onto this radically uncertain plane now, but we definitely are going to do it with some plane eventually, might as well do it now,’ I’d probably get to work on airplane safety.
No, we cannot have ongoing stasis. The AI is very much out of the box and on its way, as I know full well, and advances will continue. I don’t have any hope of preventing GPT-5 and I don’t know anyone else who does either, whether or not it is a good idea.
I find this reframing helps me come to terms with current AI developments. The question is no longer “go ahead?” but rather “given that we are going ahead with something (if only chaos) and leaving the stasis anyway, do we at least get something for our trouble?” And believe me, if we do nothing yes we will re-enter living history and quite possibly get nothing in return for our trouble.
With AI, do we get positives? Absolutely, there can be immense benefits from making intelligence more freely available. It also can help us deal with other existential risks. Importantly, AI offers the potential promise of extending American hegemony just a bit more, a factor of critical importance, as Americans are right now the AI leaders. And should we wait, and get a “more Chinese” version of the alignment problem? I just don’t see the case for that, and no I really don’t think any international cooperation options are on the table. We can’t even resurrect WTO or make the UN work or stop the Ukraine war.
Too late, it’s happening, you can’t stop it and it’s good actually. I know that meme.
So essentially the argument here is that if we don’t build AI fast to beat China then the Chinese will build it first, and we cannot possibly make a deal here, so we had better build it first to maintain our hegemony, the important thing is which monkey gets the banana first?
That is exactly the nightmare scenario thinking we’ve been warning about for decades, shouting from the rooftops, who says the future is so hard to predict?
It also does not bear on the question of what one should expect, should one go down that road, even if true.
It is entirely possible (I do not endorse these numbers at all) that if we build AGI here in America quickly we die with 50% probability and if we let China build it we die with 75% probability instead, or perhaps we die with 50% probability either way but if China builds the AI and we live then we get a totalitarian future, or what not.
And perhaps there is in practice actually no way out of the dilemma. And maybe we should therefore with heavy hearts do the most dangerous thing that has ever been done. No missing moods. Do not pretend the 50% risk is undefined and therefore almost zero. Litany of Tarski, if it’s 5% or 10% or 50% or 90% I want to believe that.
Besides, what kind of civilization is it that turns away from the challenge of dealing with more…intelligence? That has not the self-confidence to confidently confront a big dose of more intelligence? Dare I wonder if such societies might not perish under their current watch, with or without AI? Do you really want to press the button, giving us that kind of American civilization?
The kind of civilization that wants to survive. That wants its people and their legacies to survive. Seriously.
I don’t want a society that has the self-confidence to commit suicide because it wouldn’t look confident to not do that. If we are who I want us to be? We will not go quietly into the night. We will not perish without a fight.
(Nor will we presume that we can successfully face down a technologically vastly superior alien invasion with bravery and a computer virus.)
If you tell me we can make all our people more intelligent? Or all our children more intelligent? Great, let’s totally do that.
If you instead propose creating powerful truly alien computer intelligences that we have no idea how to control, whose values we cannot predict, that will inevitably take control of our future and impose very alien values that I model as very likely not including keeping us around all that long, for reasons we’ve discussed a lot already? Let’s not do that.
Unless you are fine with that outcome. In which case we are not in agreement.
So we should take the plunge. If someone is obsessively arguing about the details of AI technology today, and the arguments on LessWrong from eleven years ago, they won’t see this. Don’t be suckered into taking their bait. The longer a historical perspective you take, the more obvious this point will be. We should take the plunge. We already have taken the plunge.
This is a call to not consider the object-level physical arguments about how AI is likely to work and what it is likely to do when it scales up over the medium-term. That does not seem like a good way to predict its likely consequences, at all. Or to ensure good outcomes, at all.
My Inner Tyler says that’s the point, you can’t predict such outcomes, Stubborn Attachments, economic growth, ship has sailed regardless, stop pretending you matter or you have any control over the future, there is no other way. I don’t agree.
We designed/tolerated our decentralized society so we could take the plunge.
See you all on the other side.
Yes, we designed and tolerated our society in order to be able to create lots of new tools, and do lots of new things. We have gotten out of the habit of doing that, instead preventing people from building houses and clean energy projects, trying new medicines and doing a wide variety of things without explicit permission. When we do get in the way, we are almost always making things much worse, creating stagnation and impoverishment. It’s terrible.
And yes, I absolutely want to be able to say that all of that applies to AI as well.
I even think it actually does apply to AIs like GPT-4. I expect great and positive things.
I still can’t help but notice that we are all on schedule to then die if we keep going. Not 99%+ definitely, but not ‘distant possibility’ or anything one can ignore. And that’s bad, you see, and worth doing quite a lot to prevent.
Conclusion
In most ways in most contexts, my model is remarkably close to Tyler’s, as I understand it. If we were having this argument about building a tool that wasn’t intelligent, I would almost always agree. We should go ahead and build it, far more than we actually do. We both want to see more focus on economic growth, less restraint and regulation, less worry about distributional impacts or shifts in what humans value over time, more confidence that life will get better as a result of improving physical conditions.
I would even apply that to current AI systems like GPT-4, even with plug-ins. I see the direct risks there as fully acceptable, except for the risk of what comes after.
That brings us to where we centrally disagree. When we cross the necessary thresholds and AI gets sufficiently powerful, I expect most outcomes to be existentially bad by my own values, in ways that are very difficult to avoid. I see this as robust, not based on a complex chain of logical reasoning.
I also strongly expect that the safety protocols that work now will suddenly stop working at exactly the worst possible time, and that this is simply a fact about the physical world. We’ll need solutions that at least might work, and we don’t have them. Assuming that things likely kind of turn out normal and fine on this level seems like exactly the type of thing Tyler is warning others not to think in so many other contexts.
I also put much higher probability and credence on particular scenarios of rapid and complete existential risk, especially those that involve some combination of self-improvement, power-seeking, instrumental convergence, sharp left turns, orthogonality of goals, and the AGI winning before we know there is a battle or that the AGI even exists in anything like its current form and capabilities. I do not consider this a ‘distant possibility’ at all. I don’t have it at 99% or anything, but I see this as the natural default outcome, and the details of how we get there as mostly not much altering the destination.
The thing is, I am not relying on that to explain why I am worried.
I see these as two distinct disagreements: how seriously we should take certain particular narrow scenarios in terms of probability, and whether we should consider the bulk of other potential outcomes doomed versus not doomed (and I do think that most of the not-doomed ones probably go very well).
That brings us to the third disagreement, which is the more universal question of whether one can make useful predictions about the future at all beyond the short term. Tyler, as I understand him, says no, hence Stubborn Attachments. I say yes, and that while finding good interventions to change outcomes over longer time horizons is difficult, it is not impossible, nor was it in the past.
We would have many disagreements about details of arguments, except that Tyler in his third disagreement is arguing that none of those details matter. I would say that they matter very much for the second argument, even if one rejects the first on the basis of the third.
The fourth disagreement is in Tyler’s assertion of a fait accompli. Even if we could slow things down or stop them here, he says, we can’t stop China or make a deal with them, so we need to go ahead anyway. Well, not with that attitude we can’t; that attitude is only going to make the race more intense, faster and less safe. I am not convinced China could make real progress in AI on its own rather than doing imitation. I am not convinced coordination and other interventions are hopeless, even if good ones are difficult to find. We don’t have a solution to this, even a partial one, but we also don’t have a solution to alignment.
I do see a lot of signs that the necessary concerns are gaining in traction and attention, and that those in AI labs take them increasingly seriously. That greatly increases our chances of success in various ways. Some dignity has been won versus the counterfactual, new lines of action are possible in the future. What we have so far is inadequate and will definitely fail, I don’t like where we are, it is still a start, every little bit helps.
There is also a fifth disagreement, where Tyler considers us to have not lived through history, that tech advances have been unusually slow and non-disruptive, that unless we build AI soon that suddenly we will once again live in ‘interesting times’ anyway, that are filled with danger and disruption in ways we will not like, sufficiently so that perhaps substantial risk of ruin is justified to prevent this.
I think there are important things being gestured at here, and that goes double if we ‘bake in’ existing AI technologies that we can’t hope to undo. A lot of things are going to change, our lives are going to be disrupted. I still think that in many ways my life has been pretty disrupted by technological change. It has been extremely physically safe, more so than I would even want, but I don’t expect that the end of hegemony would put me in any physical danger. It is not only in America that life is deeply physically safe.
At core: I think taking an attitude of fait accompli, of radical uncertainty and not attempting to predict the future or what might impact it, is not The Way, here or anywhere else. Nor should we despair that there is anything we can do to change our odds of success or sculpt longer term outcomes beyond juicing economic growth and technological advancement (although in almost every case we should totally be juicing real economic growth and technological advancement).
If you think we can’t slow things down, or that slowing things down would inevitably hand the race to China, I notice that we are already slowing things down in the name of safety concerns even if they are other safety concerns, that there is real and growing effort to worry about all sorts of risks, both in general and in the AI labs. We are not favorites, the game board is in a terrible state, the odds are against us and the situation is grim, but the game is going in many ways much better than I expected, or at least much better than I would have expected given the pace of capabilities progress.
My other approach, as always, continues to be that even if we cannot solve the problem directly, we can help people better understand the problem, help people better understand the world, improve our ability to reason and make good decisions generally, improve the world such that coordination and cooperation and optimism and personal sacrifices become more viable – almost entirely in ways that I would hope Tyler would agree with.
Response to Tyler Cowen’s Existential risk, AI, and the inevitable turn in human history
When I look at AI, however, I strongly agree that what we have experienced is not going to prepare us for what is coming, even in the most slow and incremental plausible futures that don’t involve any takeoffs or existential risks. AI will be a very different order of magnitude of speed, even if we otherwise stand still.
The relative physical safety we enjoy, as I see it, mostly has nothing to do with American hegemony, and everything to do with other advances, and with the absurd trade-offs we have made in the name of physical safety, to the point of letting it ruin our ability to live life and our society’s ability to do things.
When there is an exception, as there recently was, we do not handle it well.
Have we already forgotten March of 2020? How many times in history has life undergone that rapid and huge a transformation? According to GPT-4, the answer is zero. It names the Black Death, the Industrial Revolution and World War II, while admitting they fall short. Yes, those had larger long-term impacts, by far (or so we think for now, I agree but note it is too soon to tell), yet they impacted things relatively slowly.
I for one would call my experience of Covid living in history, in Tyler’s sense. I would also note that almost all of the associated changes were negative. Life really did get much worse, and to this day remains much worse than the counterfactual, with major hits to our health, economy, currency, national debt, society, social links and institutional trust.
In several other ways, too, I have felt like I was living in history, even if I’ve known no war or danger of conquest. Cultural change in my lifetime and especially in the last 10 years has been extremely rapid on many fronts, far more than I would expect to find in most 10-year historical periods, whatever you think of the changes. My children’s lives are not so similar to my experience.
It’s not always about war. And if you’re asking ‘how many times did I game out, fully seriously, because I felt it was important to know the answer, a potential breakdown of social order or peace in the United States in the last 5 years, even if those scenarios have so far not come to pass?’ I will simply say the answer isn’t one or zero.
Yes. I believe that a lot of us have already been severely panicked and disoriented. Often multiple times, not only by Covid. Historically people would likely have better rolled with such punches. That was not what I observed.
Yes. We can agree Printing Press Good, and almost every other technological invention of the past 10,000 years good. Fire, The Wheel, Agriculture, Iron Working, Writing, Gunpowder, Steam Engines, Industrialization, Automobiles, Airplanes, Computers, Phones, you name it, it’s all pretty great with notably rare exceptions and there was mostly no reasonable path to stopping those exceptions for long.
I’d go another step, and say that the list of suppressed technologies that we did stop or slow also contains mostly good things. Of course there are obvious exceptions, like gain of function research and bioweapons and nukes to make the rubble bounce, and also I can see the argument for things like television and social media and crack cocaine and American cheese (so it’s clear it’s not only lethal weapons here, don’t let this list distract you), but these are the exceptions that prove the rule.
The Printing Press was still quite the gradual transition. There is a reason, which I had confirmed by GPT-4, that you could play Europa Universalis, or live your life in Europe as most historical people did, after the printing press comes online in 1440 and until the Reformation comes knocking, and mostly not notice.
I’d also say that AI is fundamentally different from all prior inventions. This is an amazing tool, but it is not only a tool, it is the coming into existence of intelligence that exceeds our own in strength and speed, likely vastly so. This is not the same danger as Lenin or Hitler or Mao writing things or using tools.
Yes. AI is very much going to overturn a lot of our apple carts. I continue to reserve judgment on ‘most’ depending on the scenario, especially if you don’t only consider the ‘chattering classes.’
How do we know this? What counts as ‘good at predicting’ such changes? I think in broad strokes you could make some very good predictions in 1440, and I am guessing that many did.
If you want to say fire caused ‘all of human history,’ then, I mean, sure, the details were very hard to predict. Not fair.
Then again, if you’re giving fire that level of credit, what if your prediction was ‘humans will harness more of nature, will discover more things, will be fruitful and multiply and dominate the Earth and increasingly over time destroy that which existing other animals value?’
That doesn’t seem like a crazy prediction to make. It isn’t specific but it doesn’t have to be. Nor is it a distant possibility, or extremely unlikely because there are so many other possible outcomes.
The ancients agreed that was a good and fair prediction. Consider the Myth of Prometheus, very on point. In the version I was taught, the existing powers (The Greek Gods) forbade giving humans the ability to recursively self-improve their abilities (harness fire, this exact thing) because this would allow the humans to displace and disempower the Gods over time, the same way the Gods had displaced the Titans.
Which is indeed exactly what happened, in any important sense. Even if the Greek Gods really did exist the future would now belong to the humans or to their AIs. Did that result in something the Greek Gods would still value? In some ways yes, in some very important ways no. The reason the answer is partly yes is because the Greek Gods were, in many real senses, stories created by and therefore aligned with humans.
The ‘within-a-lifetime’ predictions for the outcomes from fire, of course, seem like they’d be pretty fair and plausibly accurate, for what that’s worth.
What counts as a ‘specific scenario’ here?
Consider the predictions made above, about what would happen after the invention of fire, or similar predictions made in the wake of other basic discoveries, or based on homo sapiens having reached the necessary thresholds of intelligence and perhaps cultural transmission.
If you had predicted, early in the Industrial Revolution, that industry and those who mastered it would rapidly dominate the globe and any who did not embrace it, and anything they did not value would mostly get destroyed?
What if I claim that if we encountered Robin Hanson’s grabby aliens tomorrow, as unlikely as that is statistically, and they didn’t care about us or go galaxy brained on acausal trade or something, we would all be super duper dead, and at a minimum we are not going to be getting much of that cosmic endowment?
I would claim that my core model of AI risk is largely, to me, on a similar level. Not ‘can’t travel faster than the speed of light’ or ‘Memento Mori’ or ‘if you raise the price you lower the quantity,’ but also not that different either?
If you create something with superior intelligence, that operates at faster speed, that can make copies of itself, what happens by default?
That new source of intelligence will rapidly gain control of the future. It is very, very difficult to prevent this from happening even under ideal circumstances.
Some version of that intelligence will be pointed by someone, at some point, towards achieving some goal. Even if you think it is possible to design powerful AIs that are not agents and not used as agents, and even to use them to perform miracles or pivotal acts, have you been watching what humans are doing? We are already designing tools to explicitly turn GPT into an agent.
So it doesn’t matter. There will be an agent and there will be a goal. Whatever that goal is (and if you want to strengthen this requirement, it is easy to see that ambitious open-ended goals will totally be given to AIs pretty often anyway unless we prevent this), again by default, the best and most reliable way to achieve that goal will be to rapidly take control of the future, provided that is within the powers of the system. Which it will be. So the system will do that.
This may or may not require or involve recursive self-improvement. The system then, unless again something goes very, very right, wipes out all value in the universe.
Greatly more powerful things take over from much less powerful things. Things that are much more intelligent than us, and faster than us to boot, and that can be copied, and that can be pointed towards goals, will be greatly more powerful than us. None of this requires complex detailed prediction.
This is not like our past tools. Even in most scenarios where many impossible-seeming things go spectacularly well things do not turn out so hot for us humans or our values.
There is no reason to think things should work out well for us. If we have true radical agnosticism, consider most possible arrangements of matter, or most low-entropy possible arrangements of matter, or most possible inhabitants of mind-space, or most possible advanced intelligences and their possible values, or anything like that.
If you think it would be fine if all the humans get wiped out and replaced by something even more bizarre and inexplicable and that mostly does not share our values, provided it is intelligent and complex, and don’t consider that doom, then that is a point of view. We are not in agreement. I would also warn that we should not presume that the resulting universe would actually be that likely to have that much intelligence or complexity when the music stops. Again, under radical agnosticism or otherwise, one should notice that most configurations of the universe seem pretty wasteful and devoid of value.
Do I and similar others get into tons of detail then, about how stopping this transfer of power from humans to AIs from happening, or preventing all the humans from dying in its wake, is super difficult? Oh yes, because there are so many non-obvious and complex reasons why this is hard, and why so many imagined alternative scenarios are actually Can’t Happens or damn near one.
I do think the detailed discussions are valuable, that they are vital context to modeling what might happen, and towards getting a reasonable distribution of possible outcomes. They’re still beyond scope here.
Key here is that in my model, the space of possible futures that involve the creation of transformational AI has quite a lot of no good very bad zero-value options in it, including but not limited to the default or baseline scenarios. Whereas the space of good outcomes is hard to locate, and requires specific things to go right in some unexpected way.
When we notice Earth seems highly optimized for humans and human values, that is because it is being optimized by human intelligence, without an opposing intelligent force. If that underlying cause changes, the results change.
I don’t think this would pass as a model of what most such people are thinking. It certainly does not pass mine.
I might say: The core reason we so often have seen creation of things we value win over destruction is, once again, that most of the optimization pressure by strong intelligences was pointing in that direction, that it was coming from humans, and the tools weren’t applying intelligence or optimization pressure. That’s about to change.
I almost never hear arguments like the one quoted above made, at least not in any load-bearing way. I cannot recall anyone saying ‘easier to create than destroy,’ although that is certainly true as far as it goes. It is more like in the long run ‘it is easier for someone to create and do than to ensure no one creates and does’ and ‘once created this thing will be able to create or destroy at will and this will involve our destruction, whatever is created.’ We are very much not saying ‘things get complex and I can’t see a solution, boom is easy, so probably boom.’
One does not need ‘predictability’ to gain some insight into what things might plausibly happen versus which can’t happen, or which types of scenarios are relatively more likely, even under quite a lot of uncertainty. Waving one’s hand and saying ‘can’t predict things’ doesn’t get you out of this. Nor does saying ‘tools have always worked out fine in the past, we’re still here and things are good.’
Nor do we get to make the move ‘all things are possible, therefore doomed scenarios are distant and not likely and not worth worrying about.’ What makes you think that most future scenarios are not doomed, in terms of what you would value about the universe, however you think about that, and we need only worry about particular narrow specific dooms that don’t add up to much probability mass? What makes you think that us puny humans get to keep deciding what happens, or that what ends up happening will be something of which we approve?
I presume this is essentially, at core, Tyler’s Stubborn Attachments argument. That more economic growth and prosperity creates more value, even if it might not take the form you would like, and that in the long run nothing else matters?
I don’t entirely buy that argument even if transformational AI or AGI were not a practical physical possibility. I am sympathetic to that view in such worlds, and I think the view is true on the margin for many people and most policy choices. I have some faith that things would work out if the future were still fundamentally based on what humans wanted and decided, in good Hayekian style. I’d worry about stubborn equilibria and path dependence, including in terms of their ability to guide longer term growth, but I would worry far less about this than most others.
This seems like a strange reference class claim, and seems like it grasps at associations and affectations. One cannot say every future is equally distant, or use that with arbitrary divisions of possible classes of futures, none of this is probability.
I can say that if there were a plane where I had radical uncertainty, or a mere 90% confidence, in its ability to land safely, I would not get on that plane. If you said ‘but you will eventually get on a plane at some point’ I would say all right, let’s work on our air travel technology and build a different plane. If you told me ‘yes, we might not have to put everyone on Earth onto this radically uncertain plane now, but we definitely are going to do it with some plane eventually, might as well do it now,’ I’d probably get to work on airplane safety.
No, we cannot have ongoing stasis. The AI is very much out of the box and on its way, as I know full well, and advances will continue. I don’t have any hope of preventing GPT-5 and I don’t know anyone else who does either, whether or not it is a good idea.
Too late, it’s happening, you can’t stop it and it’s good actually. I know that meme.
So essentially the argument here is that if we don’t build AI fast to beat China then the Chinese will build it first, and we cannot possibly make a deal here, so we had better build it first to maintain our hegemony, the important thing is which monkey gets the banana first?
That is exactly the nightmare scenario thinking we’ve been warning about for decades, shouting from the rooftops, who says the future is so hard to predict?
It also does not bear on the question of what one should expect, should one go down that road, even if true.
It is entirely possible (I do not endorse these numbers at all) that if we build AGI here in America quickly we die with 50% probability and if we let China build it we die with 75% probability instead, or perhaps we die with 50% probability either way but if China builds the AI and we live then we get a totalitarian future, or what not.
And perhaps there is in practice actually no way out of the dilemma. And maybe we should therefore with heavy hearts do the most dangerous thing that has ever been done. No missing moods. Do not pretend the 50% risk is undefined and therefore almost zero. Litany of Tarski, if it’s 5% or 10% or 50% or 90% I want to believe that.
The kind of civilization that wants to survive. That wants its people and their legacies to survive. Seriously.
I don’t want a society that has the self-confidence to commit suicide because it wouldn’t look confident to not do that. If we are who I want us to be? We will not go quietly into the night. We will not perish without a fight.
(Nor will we presume that we can successfully face down a technologically vastly superior alien invasion with bravery and a computer virus.)
If you tell me we can make all our people more intelligent? Or all our children more intelligent? Great, let’s totally do that.
If you instead propose creating powerful truly alien computer intelligences that we have no idea how to control, whose values we cannot predict, that will inevitably take control of our future and impose very alien values that I model as very likely not including keeping us around all that long, for reasons we’ve discussed a lot already? Let’s not do that.
Unless you are fine with that outcome. In which case we are not in agreement.
This is a call to not consider the object-level physical arguments about how AI is likely to work and what it is likely to do when it scales up over the medium-term. That does not seem like a good way to predict its likely consequences, at all. Or to ensure good outcomes, at all.
My Inner Tyler says that’s the point, you can’t predict such outcomes, Stubborn Attachments, economic growth, ship has sailed regardless, stop pretending you matter or you have any control over the future, there is no other way. I don’t agree.
Yes, we designed and tolerated our society in order to be able to create lots of new tools, and do lots of new things. Which we have gotten out of the habit of doing, instead preventing us from building houses and clean energy projects and trying new medicines and doing a wide variety of things without explicit permission. When we do get in the way, it’s almost always making things much worse, creating stagnation and impoverishment. It’s terrible.
And yes, I absolutely want to be able to say that all of that applies to AI as well.
I even think it actually does apply to AIs like GPT-4. I expect great and positive things.
I still can’t help but notice that we are all on schedule to then die if we keep going. Not 99%+ definitely, but not ‘distant possibility’ or anything one can ignore. And that’s bad, you see, and worth doing quite a lot to prevent.
Conclusion
In most ways in most contexts, my model is remarkably close to Tyler’s, as I understand it. If we were having this argument about building a tool that wasn’t intelligent, I would almost always agree. We should go ahead and build it, far more than we actually do. We both want to see more focus on economic growth, less restraint and regulation, less worry about distributional impacts or shifts in what humans value over time, more confidence that life will get better as a result of improving physical conditions.
I would even apply that to current AI systems like GPT-4, even with plug-ins. I see the direct risks there as fully acceptable, except for the risk of what comes after.
That brings us to where we centrally disagree. When we cross the necessary thresholds and AI gets sufficiently powerful, I expect most outcomes to be existentially bad by my own values, in ways that are very difficult to avoid. I see this as robust, not based on a complex chain of logical reasoning.
I also strongly expect that the safety protocols that work now will suddenly stop working at exactly the worst possible time, and that this is simply a fact about the physical world. We’ll need solutions that at least might work, and we don’t have them. Assuming that things likely kind of turn out normal and fine on this level seems like exactly the type of thing Tyler is warning others not to think in so many other contexts.
I also put much higher probability and credence to particular scenarios of rapid and complete existential risk, especially those that involve some combination of self-improvement, power-seeking, instrumental convergence, sharp left turns, orthogonality of goals and the AGI winning before we know there is a battle or that the AGI even exists in anything like its current form and capabilities. I do not consider this a ‘distant possibility’ at all. I don’t have it at 99% or anything, but I see this as the natural default outcome and the details of how we get there as mostly not much altering the destination.
The thing is, I am not relying on that to explain why I am worried.
I see these as two distinct disagreements, both about how seriously we should take certain particular more narrow scenarios in terms of probability, and the question of whether the bulk of other potential outcomes we should consider doomed versus not doomed (and I do think that most of the not doomed ones probably go very well).
That brings us to the third disagreement, which is a more universal question of whether one can make useful predictions about the future at all beyond the short-term – Tyler, as I understand him, says no, hence Stubborn Attachments. I say yes, and that while finding good interventions to change outcomes over longer time horizons is difficult, it is not impossible, nor was it in the past.
We would have many disagreements about details of arguments, except that Tyler in his third disagreement is arguing that none of those details matter. I would say that they matter very much for the second argument, even if one rejects the first on the basis of the third.
The fourth disagreement is in Tyler’s assertion of a fait accompli. Even if we could slow things down or stop them here, he says, we can’t stop China or make a deal with them, so we need to go ahead anyway. Well, not with that attitude we can’t, that’s only going to make the race more intense, faster and less safe. I am not convinced China could make real progress in AI on its own rather than doing imitation. I am not convinced coordination and other interventions are hopeless, even if good ones are difficult to find – we don’t have a solution to this, even a partial one, but we also don’t have a solution to alignment.
I do see a lot of signs that the necessary concerns are gaining in traction and attention, and that those in AI labs take them increasingly seriously. That greatly increases our chances of success in various ways. Some dignity has been won versus the counterfactual, and new lines of action are possible in the future. What we have so far is inadequate and will definitely fail, and I don’t like where we are, but it is still a start, and every little bit helps.
There is also a fifth disagreement, where Tyler considers us to have not lived through history, that tech advances have been unusually slow and non-disruptive, and that unless we build AI soon we will suddenly once again live in ‘interesting times’ anyway, times filled with danger and disruption in ways we will not like, sufficiently so that perhaps substantial risk of ruin is justified to prevent this.
I think there are important things being gestured at here, and that goes double if we ‘bake in’ existing AI technologies that we can’t hope to undo. A lot of things are going to change, our lives are going to be disrupted. I still think that in many ways my life has been pretty disrupted by technological change. It has been extremely physically safe, more so than I would even want, but I don’t expect that the end of hegemony would put me in any physical danger. It is not only in America that life is deeply physically safe.
At core: I think taking an attitude of fait accompli, of radical uncertainty and not attempting to predict the future or what might impact it, is not The Way, here or anywhere else. Nor should we despair that there is anything we can do to change our odds of success or sculpt longer term outcomes beyond juicing economic growth and technological advancement (although in almost every case we should totally be juicing real economic growth and technological advancement).
If you think we can’t slow things down, or that slowing things down would inevitably hand the race to China, I notice that we are already slowing things down in the name of safety concerns even if they are other safety concerns, that there is real and growing effort to worry about all sorts of risks, both in general and in the AI labs. We are not favorites, the game board is in a terrible state, the odds are against us and the situation is grim, but the game is going in many ways much better than I expected, or at least much better than I would have expected given the pace of capabilities progress.
My other approach, as always, continues to be that even if we cannot solve the problem directly, we can help people better understand the problem, help people better understand the world, improve our ability to reason and make good decisions generally, improve the world such that coordination and cooperation and optimism and personal sacrifices become more viable – almost entirely in ways that I would hope Tyler would agree with.