It’s easy to imagine a version of the story where the winner of the arms race is not benevolent, or where there is an alignment failure and humans lose control of the AGI entirely.
I would frame it a bit differently: Currently, we haven’t solved the alignment problem, so in this scenario the AI would be unaligned and it would kill us all (or do something similarly bad) as soon as it suited it. We can imagine versions of this scenario where a ton of progress is made on solving the alignment problem, or versions where it surprisingly turns out that “alignment by default” is true and there never was a problem to begin with. But both of these would be very unusual, and distinct, scenarios, each requiring more text to be written.