So you’re saying that (for example) there could be a very large universe that is running simulations of both possible worlds and impossible worlds, and therefore even if we go extinct in all possible worlds, versions of us that live in the impossible worlds could escape into the base universe, so the effect of a logical risk would be similar to that of a physical risk of equal magnitude (if we get most of our utility from controlling/influencing such base universes). Am I understanding you correctly?
If so, I have two objections to this. 1) Some impossible worlds seem impossible to simulate. For example, suppose that in the actual world AI safety requires solving metaphilosophy. How would you simulate an impossible world in which AI safety doesn’t require solving metaphilosophy? 2) Even for the impossible worlds that perhaps can be simulated (e.g., where the trillionth digit of pi is different from what it actually is), it seems that only a subset of the reasons for running simulations of possible worlds would apply to impossible worlds, so I’m a lot less sure that “logical doors” exist than I am that “quantum doors” exist.
It seems to me that AI will need to think about impossible worlds anyway—for counterfactuals, logical uncertainty, and logical updatelessness/trade. That includes worlds that are hard to simulate, e.g. “what if I try researching theory X and it turns out to be useless for goal Y?” So “logical doors” aren’t that unlikely.