A possible AI inoculation from an early “robot uprising”

I have talked about this before in the comments, but given Eliezer's near-certain doomsday predictions, it might be worth posting separately: there is a chance of a scenario that sharply changes the AI development trajectory before AI becomes capable of wiping out everyone.

The scenario: a non-superintelligent AI, general or nearly so, is given or accidentally gains access to a lot of computational power, and potentially to some real-world resources, as part of, say, “gain of function” or some other research. A concerted effort by a swarm of near-human-level agents acting together (or, equivalently, by one near-human-level agent with tons of computing power: GPUs, TPUs, etc.) can do a lot of damage, but is unlikely to wipe out all human life. If the event is devastating enough, the attitude toward laissez-faire AI research may change early rather than late.

On general principles, I would expect there to be a phase transition of sorts:

  • A not-very-smart agent with access to a lot of compute is likely to trip over itself and over human safeguards, doing little to no damage.

  • Once the agent is smart enough to avoid that, and to evade existing safeguards (which are designed to contain non-adversarial side effects, not deliberate evasion), the damage it does can become discontinuously large.

  • Faced with an event like that, humanity, despite its inept and foolish handling of the pandemic, may enact AI safety measures it would not enact otherwise, giving us a precious second chance at doing it right.

Another bit of good news is that this scenario can actually be modeled and analyzed, since it does not require guessing how something both alien and smarter than a human would reason.
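As a crude sketch of what such modeling could look like, here is a toy Python simulation. Everything in it is an illustrative assumption rather than a claim about real systems: the safeguard-bypass probability, the number of independent safeguards, and the abstract compute units are all made up, and the threshold structure is just one simple way to produce the phase transition described above.

```python
import random

# Toy model (all parameters are illustrative assumptions, not empirical):
# an agent of capability c (0..1) faces k independent safeguards, each of
# which it bypasses with probability min(1, c / C_SAFEGUARD). Damage scales
# with compute only if *all* safeguards are bypassed, which is what produces
# the sharp phase transition in expected damage.

C_SAFEGUARD = 0.7    # capability at which one safeguard is reliably bypassed (assumed)
NUM_SAFEGUARDS = 20  # number of independent safeguards (assumed)
COMPUTE = 1000.0     # abstract units of compute the agent controls (assumed)

def expected_damage(c: float, trials: int = 10_000) -> float:
    """Monte Carlo estimate of expected damage for an agent of capability c."""
    p_bypass = min(1.0, c / C_SAFEGUARD)
    total = 0.0
    for _ in range(trials):
        if all(random.random() < p_bypass for _ in range(NUM_SAFEGUARDS)):
            # All safeguards evaded: damage scales with capability * compute,
            # but is still capped well short of "wipes out everyone".
            total += c * COMPUTE
    return total / trials

if __name__ == "__main__":
    for step in range(1, 11):
        c = 0.1 * step
        print(f"capability {c:.1f}: expected damage {expected_damage(c):8.1f}")
```

Under these assumptions, expected damage stays near zero while capability is below the safeguard threshold, then rises sharply once the agent reliably bypasses all the safeguards at once, reproducing the discontinuity described in the bullet points.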