I agree that this part of the article can be confusing. Let me put it this way:
P(World A | AGI) = 0.999
P(World B | AGI) = 0.001
So I think the evidence already makes me strongly favor world A over world B. What I don't know is which observations would make me alter that posterior distribution. That's what I meant.
I notice I’m still confused.
Whatever observations caused you to initially shift towards A, wouldn’t the opposite observation make you shift towards B? For instance, one observation that caused you to shift towards A is “I can’t think of any actionable plans an AGI could use to easily destroy humanity without a fight”. Thus, wouldn’t an observation of “I can now think of a plan”, or “I have fixed an issue in a previous plan I rejected” or “Someone else thought of a plan that meets my criteria” be sufficient to update you towards B?
Yes, hearing those plans would probably make me change my mind. But they would need to be bulletproof plans; otherwise they would fall into the category of "probably not doable in practice / too risky / too slow". Thank you for engaging constructively anyway.
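The update being debated above can be sketched with Bayes' rule in odds form. The 999:1 prior odds come from the thread; the 10,000:1 likelihood ratio for observing a "bulletproof plan" is a hypothetical number chosen for illustration, not something either commenter stated:

```python
# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
# Shows how strong an observation must be to overturn a 0.999 / 0.001 prior.

def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Update odds in favor of World A given evidence with the stated LR."""
    return prior_odds * likelihood_ratio

prior = 0.999 / 0.001  # 999:1 odds in favor of World A (from the comment)

# Hypothetical: a "bulletproof plan" is 10,000x more likely to be observed
# if World B is true than if World A is true, i.e. LR = 1/10,000 for A.
post = posterior_odds(prior, 1 / 10_000)

prob_A = post / (1 + post)  # convert odds back to a probability
print(round(prob_A, 3))  # ≈ 0.091: one strong observation flips the posterior
```

The point of the sketch is just that a 0.999 prior is not immovable: a single observation whose likelihood ratio exceeds the prior odds (here, anything stronger than about 1,000:1 toward B) is enough to push the posterior below 0.5.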