p(UFAI) > p(Imminent, undetected catastrophe that only a FAI can stop)
Given that UFAI results in "human extinction", and my CEV assigns effectively infinite DISutility to that outcome, the AI would have to FIRST provide sufficient evidence for me to update toward the catastrophe being more likely than UFAI.
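Put as a rough expected-utility sketch (my own notation, not something the above commits to): write $p_U = p(\mathrm{UFAI})$, $p_C = p(\text{catastrophe})$, and $U_{\mathrm{ext}}$ for the effectively infinite disutility of extinction. Ignoring every other term,
\[
EU(\text{release}) \approx -\,p_U\,\lvert U_{\mathrm{ext}}\rvert,
\qquad
EU(\text{keep boxed}) \approx -\,p_C\,\lvert U_{\mathrm{ext}}\rvert,
\]
so releasing only comes out ahead when $p_C > p_U$, and that is exactly the update the AI has to produce first.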
I’ve already demonstrated that an AI which can do exactly that will get more leniency from me :)