A possible AI inoculation from an early “robot uprising”

I have talked about this before in the comments, but given Eliezer's near-certain doomsday predictions, it might be worth posting separately: there is a chance of a scenario that sharply changes the AI development trajectory before AI becomes capable of wiping out everyone.

The scenario: a non-superintelligent AI, general or nearly so, is given or accidentally gains access to a lot of computational power, and potentially to some real-world resources, as part of, say, “gain of function” or some other research. A concerted effort by a swarm of near-human-level agents acting together (or, equivalently, by one near-human-level agent with tons of computing power: GPUs, TPUs, etc.) can do a lot of damage, but is unlikely to wipe out all human life. If the event is devastating enough, the attitude toward laissez-faire AI research may change early rather than late.

On general principles, I would expect there to be a phase transition of sorts:

  • A not-very-smart agent with access to a lot of compute is likely to trip over itself and over human safeguards, doing little to no damage.

  • Once the agent is smart enough to avoid that, and to evade existing safeguards (which are designed to contain non-adversarial side effects, not deliberate evasion), the damage it does can become discontinuously large.

  • Faced with an event like that, humanity, despite its inept and foolish handling of the pandemic, may enact AI safety measures it would not enact otherwise, giving us a precious second chance at doing it right.

Another bit of good news is that this scenario can actually be modeled and analyzed, since it does not require guessing how something both alien and smarter than a human would reason.
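As a crude sketch of what such modeling could look like, here is a toy Python simulation. Everything in it is an illustrative assumption rather than a claim about real systems: the safeguard-bypass probability, the number of independent safeguards, and the abstract compute units are all made up, and the threshold structure is just one simple way to produce the phase transition described above.

```python
import random

# Toy model (all parameters are illustrative assumptions, not empirical):
# an agent of capability c (0..1) faces k independent safeguards, each of
# which it bypasses with probability min(1, c / C_SAFEGUARD). Damage scales
# with compute only if *all* safeguards are bypassed, which is what produces
# the sharp phase transition in expected damage.

C_SAFEGUARD = 0.7    # capability at which one safeguard is reliably bypassed (assumed)
NUM_SAFEGUARDS = 20  # number of independent safeguards (assumed)
COMPUTE = 1000.0     # abstract units of compute the agent controls (assumed)

def expected_damage(c: float, trials: int = 10_000) -> float:
    """Monte Carlo estimate of expected damage for an agent of capability c."""
    p_bypass = min(1.0, c / C_SAFEGUARD)
    total = 0.0
    for _ in range(trials):
        if all(random.random() < p_bypass for _ in range(NUM_SAFEGUARDS)):
            # All safeguards evaded: damage scales with capability * compute,
            # but is still capped well short of "wipes out everyone".
            total += c * COMPUTE
    return total / trials

if __name__ == "__main__":
    for step in range(1, 11):
        c = 0.1 * step
        print(f"capability {c:.1f}: expected damage {expected_damage(c):8.1f}")
```

Under these assumptions, expected damage stays near zero while capability is below the safeguard threshold, then rises sharply once the agent reliably bypasses all the safeguards at once, reproducing the discontinuity described in the bullet points.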