If you answered “No” to both of those questions, then congratulations! The safety threshold for our AI is at least 1 bit!
If an expert reads the bit, acts on it, and then immediately incinerates the box, this works.
Suppose, though, that one of those answers goes on to influence all sorts of parts of the world via chaotic dynamics, leading (in one of the two possible worlds) to a break-in by burglars who let the AI out.
Basically, the chance of the AI being let out is nearly doubled in the worst case. If this chance was already non-negligible (say, a powerful faction planned to seize the AI and do something stupid with it), this could be bad.
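To make the "nearly doubled" claim concrete (a back-of-the-envelope sketch; the notation $p_0$, $p_1$ is mine, not from the original argument): let $p_b$ be the probability that the AI is eventually released in the future that follows it outputting bit $b \in \{0, 1\}$. A non-adversarial bit is effectively a coin flip between the two futures, so the baseline release probability is $p = (p_0 + p_1)/2$. An adversarial AI instead outputs $\arg\max_b p_b$, achieving

$$\max(p_0, p_1) \le p_0 + p_1 = 2p,$$

so a single adversarially chosen bit raises the release probability by at most a factor of two over the baseline, with near-equality when one future carries essentially all of the risk.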
Or we could let it out deliberately, on the grounds that an AI that is >50% likely to be friendly is better than whatever else we might build.