AI-Caused Extinction Ingredients
Below is what I see as required for AI-caused extinction to happen in the next few decades (roughly 2024-2050). In parentheses is my very approximate probability estimate as of 2024-07-25, conditional on all previous steps having happened.
1. AI technologies continue to develop at approximately current speeds or faster (80%)
2. AI reaches a level where it can cause human extinction (90%)
3. AI that can cause extinction does not have enough alignment mechanisms in place (90%)
4. AI executes an unaligned scenario (low, maybe less than 10%)
5. Other AIs and humans aren't able to notice and stop the unaligned scenario in time (50-50ish)
6. Once the scenario is executed, humanity is never able to roll it back (50-50ish)
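Since each estimate is conditional on all the previous steps, the headline number is just the product of the six factors. A minimal sketch, pinning the hedged values to point estimates ("less than 10%" as 0.10, "50-50ish" as 0.50); the step labels are paraphrases of the list above, not the author's exact wording:

```python
# Multiply the six conditional step estimates through.
# Hedged values are pinned to point estimates: "less than 10%" -> 0.10,
# "50-50ish" -> 0.50. Labels are paraphrases, not the author's wording.
steps = [
    ("development continues at current speed or faster", 0.80),
    ("AI reaches an extinction-capable level",           0.90),
    ("insufficient alignment mechanisms in place",       0.90),
    ("AI executes an unaligned scenario",                0.10),
    ("not noticed and stopped in time",                  0.50),
    ("never rolled back",                                0.50),
]

p = 1.0
for label, step_p in steps:
    p *= step_p
    print(f"{label}: {step_p:.2f} -> cumulative {p:.4f}")

print(f"Overall P(AI-caused extinction by ~2050) \u2248 {p:.1%}")  # ~1.6%
```

On these point estimates the chain multiplies out to roughly 1.6%, with step #4 doing most of the work, which is exactly where the reply below pushes back.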
I think #1 implies #2 pretty strongly, but OK; I was mostly with you until #4. Why is it that low? I think #3 implies #4 with high probability. Why don't you?
#5 and #6 don't seem like strong objections. Multiple scenarios could each play out multiple times in the interval we are talking about. Only one has to deal the final blow for it to be final, and even when a blow doesn't kill us, we can't necessarily recover from it, or recover quickly. The weaker civilization gets, the less likely it is to survive the next blow.
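The compounding here is easy to make concrete. A minimal sketch, assuming a 5% per-blow fatality probability and, for the "weaker civilization" effect, a 1.3x escalation of that probability after each survived blow; both numbers are illustrative assumptions, not figures from the estimates above:

```python
# P(at least one fatal blow) over n unaligned scenarios.
# The 5% per-blow risk and 1.3x escalation factor are illustrative only.

def p_ruin(p_first: float, escalation: float, n_blows: int) -> float:
    """Probability that at least one of n blows is fatal, where each
    survived blow multiplies the next blow's fatality odds by `escalation`
    (escalation=1.0 recovers independent, identically risky blows)."""
    p_survive_all = 1.0
    p = p_first
    for _ in range(n_blows):
        p_survive_all *= 1.0 - p
        p = min(1.0, p * escalation)  # a weaker civilization fares worse next time
    return 1.0 - p_survive_all

for n in (1, 5, 10, 25):
    flat = p_ruin(0.05, 1.0, n)
    worsening = p_ruin(0.05, 1.3, n)
    print(f"n={n:>2}: flat risk {flat:6.1%}, escalating risk {worsening:6.1%}")
```

Even at a flat 5%, ten blows give roughly a 40% chance of ruin; with the escalation factor it climbs above 90%.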
We can hope that warning shots wake the world up enough to make further blows less likely, but consider that the opposite may be true. Damage leads to desperation, which leads to war, which leads to arms races, which leads to cutting corners on safety, which leads to the next blow. Or AI-enabled manipulation and deception of humans leads to widespread mistrust, which prevents us from coordinating on our collective problems in time. Or AI success leads to dependence, which leads to reluctance to change course, which makes recovery harder. Or repeated survival leads to complacency until we boil the frog to death. Or some combination of these, or similar cascading failures. It depends on the nature of the scenario. There are lots of ways things can go wrong, many roads to ruin: disaster is disjunctive.
Would warnings even work? Those in the know are sounding the alarm already. Are we taking them seriously enough? If not, why do you expect this to change?