I notice a funny pattern in these discussions: people who argue against doom scenarios often imply, in their hopeful scenarios, that everyone believes in the doom scenario. For example: "people will see that the model behaves weirdly and shut it down." But you only shut down a model that behaves weirdly (without being explicitly harmful) if you put non-negligible probability on doom scenarios.
Consider different degrees of belief instead. Assigning low credence to a doom scenario because of the conditional belief that evidence of danger would be properly observed is not inconsistent at all. The doom scenario requires BOTH that the danger arises AND that it is ignored while happening (or happens too fast to stop).
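To make the conjunction concrete, here is a toy calculation with entirely made-up numbers (both probabilities are illustrative assumptions, not estimates from the discussion):

```python
# Toy illustration: low overall credence in doom is consistent with
# believing warning signs would usually be noticed and acted on.
p_danger = 0.2   # P(a model develops dangerous behavior) -- made-up number
p_ignored = 0.1  # P(warnings ignored or too fast to stop | danger) -- made-up number

# Doom requires BOTH conjuncts to hold.
p_doom = p_danger * p_ignored

print(f"P(doom) = {p_doom:.2f}")
```

So someone can put 20% on "a model goes bad" and still arrive at 2% on doom, simply by putting high credence on the conditional claim that the bad behavior would be caught in time.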