How humanity would respond to a slow takeoff, with takeaways from the entire COVID-19 pandemic
While there have been some posts on how a slow takeoff would play out given humanity’s response to COVID, they were written during the crisis rather than at its end, so they could not give a full account of what would likely happen.
I will be taking mostly an outside view, using “how humanity usually responds to potential crises or x-risks” as the reference class. I will largely ignore inside-view concerns, so specific details will be abstracted away.
I am also ignoring hard takeoff, hard singularity, or Foom scenarios, where AI comes to control everything within hours, days, or weeks.
So without further ado, here is how humanity will respond by default, and how well that response is likely to go.
The median outcome is usually what will happen.
This means that people should mostly ignore the best and worst outcomes if they want to predict how a crisis or x-risk will go, which has important implications for how badly things are likely to go in a slow takeoff.
Basically, it means we should downweight scenarios that imply a lot of civilizational competence, like a purely positive singularity or a ban on AGI. In other words, AGI policy is useless for the attempt to align AGI unless governments get far more competent than they are today, given how complicated AGI is.
This also means that, bluntly, the EA/Rationalist communities are almost certainly over-worried about AGI, especially people with suffering-focused ethics, and the same applies to extinction risks and existential risks. One of the biggest problems in AI alignment, and in AI fields more generally, is selection: the people who end up in AI alignment tend to buy into the extreme worst-case scenarios, while the average AI capabilities person is far too optimistic about human-level AI outcomes.
AGI, if it’s built, will not go away even after millions are dead.
One of the most important lessons COVID-19 forced on us was that, after a short time, it could not be eradicated. And while AI has a different benefit and threat profile, the differences strengthen this takeaway: COVID was essentially pure threat and still could not be eliminated, whereas AGI also offers benefits that will make people even less willing to give it up. This means it is very likely that we will have to live with AGI/ASI once we build it for real.
Partial solutions to problems, and hacky attempts to align AGI, are necessary.
The field of AI alignment is operating under a Nirvana Fallacy: looking for perfect solutions to problems when solutions that are probably good enough would suffice. One example is MIRI’s early attempt at Coherent Extrapolated Volition, which was supposed to align AI with all of humanity and deliver a utopian future at arbitrarily high power levels and over arbitrarily long timescales. Thankfully that has been abandoned, but a Nirvana Fallacy still operates in AI alignment, which tries to align AI at almost arbitrary power levels and over arbitrarily long timescales. Essentially, the perfect solution is the enemy of the good, and that is part of why AI alignment is so hard.
And our final takeaway is that the public will be subject to huge amounts of misinformation and disinformation about AI as soon as it actually starts impacting the real world of politics.
There will not necessarily be any consensus on AI even if systems do become misaligned, and that will be a problem for us, because once a field gets politicized it is basically impossible not to be mind-killed by it. This means we will need to be careful about any consensus that forms and, where it is politicized, be willing not to follow the politicized parts of it.
What if the median outcome and the worst outcome are equivalent, in that humanity dies in both?
Shouldn’t we, at the very least, be aiming for a median outcome in which humanity survives, even if that is only a small part of the total space of outcomes?
Perhaps the world really is such that we have a 1% chance that we fully solve alignment and live in a glorious transhuman utopia, a 48% chance that we end up in a good world with partly aligned AGI, a 1% chance that we coordinate on never developing AGI, a 49% chance that we end up in a bad world with partly unaligned AGI, and a 1% chance that we all die screaming from completely unaligned AGI.
But maybe the world really is such that there is a 1% chance that we fully solve alignment and live in a transhuman utopia, a 1% chance that we coordinate on never developing AGI, a 1% chance that we partly solve alignment and end up in a permanent dystopia that is nevertheless better than extinction, and a 97% chance that UFAI kills us all or worse.
We don’t know which way the world really is, and focusing on “the median scenario” won’t help at all with that. My estimate is that the real world is much closer to the second picture than the first, and in that case I see no point in focusing on the “median scenario.”
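To make the disagreement concrete, here is a minimal sketch in Python using the illustrative percentages from the two scenarios above (the outcome labels are paraphrased and the numbers are the hypothetical ones from the text, not forecasts). It simply checks whether extinction-level outcomes carry at least half the probability mass, i.e. whether the median outcome is itself an extinction-level outcome.

```python
# Illustrative outcome distributions from the two hypothetical worlds above.
# Probabilities are in percentage points and are purely illustrative.

world_one = {
    "full alignment, transhuman utopia": 1,
    "good world with partly aligned AGI": 48,
    "coordination on never developing AGI": 1,
    "bad world with partly unaligned AGI": 49,
    "everyone dies from unaligned AGI": 1,
}

world_two = {
    "full alignment, transhuman utopia": 1,
    "coordination on never developing AGI": 1,
    "permanent dystopia, better than extinction": 1,
    "UFAI kills us all or worse": 97,
}

def median_is_extinction(world, extinction_outcomes):
    """Return True if extinction-level outcomes carry at least half the
    probability mass, i.e. the median outcome is itself extinction-level."""
    doom_mass = sum(p for outcome, p in world.items() if outcome in extinction_outcomes)
    return doom_mass >= 50

print(median_is_extinction(world_one, {"everyone dies from unaligned AGI"}))  # False: 1% of mass
print(median_is_extinction(world_two, {"UFAI kills us all or worse"}))        # True: 97% of mass
```

In the first world the median outcome is some partly aligned future, good or bad; in the second, the median outcome and the worst outcome coincide, which is exactly the situation where “expect the median” stops being reassuring.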