This is because, if the AI were hypothetically presented with situation Y, it would predict that the same situation might be used to test it.
If we use the modify-utility version of the probability change, the AI cares only about those (very, very, very rare) universes in which X is what it wants and its output is not read.
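To spell out what I mean (a rough sketch in my own notation, not necessarily the exact construction you have in mind): let $E$ be the rare event that the output is not read. The modification replaces the AI’s utility $U$ with

$$
U'(w) \;=\; \begin{cases} U(w) & \text{if } E \text{ holds in world } w,\\ c & \text{otherwise,} \end{cases}
$$

for some constant $c$, so that $\mathbb{E}[U'] = \Pr(E)\,\mathbb{E}[U \mid E] + (1-\Pr(E))\,c$. Only the first term depends on the AI’s choices, which is why it ends up caring exclusively about those rare not-read universes.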
When you say “we can check what would happen if the AI was given the ability to launch the world’s nuclear arsenals”, do you mean that the question you’re really asking is “would it be good (from the AI’s perspective) if the nuclear missiles were launched as a result of some fluke”? Because if you’re literally asking “would the AI launch nuclear missiles”, then you run into the Newcomb obstacle I described, since the AI that might launch nuclear missiles is the “wild-type” AI without special control handicaps.
Also, there are questions for which you don’t want to know the detailed answer. For example, “the text of a speech by an important world leader” that the AI would create is something you don’t want to expose your own brain to.