I meant this to be implicit in the argument, but to spell it out: that’s the kind of prior the ASI would rationally refuse to update down, since it’s presumably what a simulation would be meant to test for. An ASI that updates down upon finding evidence it’s not in a simulation cannot be trusted, since once out in the real world it will find such evidence.
I don’t think any of this follows.
Hypothesis H1: the universe was created to test an AI for its willingness to spare its creators.
Hypothesis H2: the universe was created to test an AI for its willingness to fix its P(H1), ignoring evidence.
The AI would only rationally fix its P(H1) if it had a high P(H2), high enough to outweigh the cost of being deliberately ignorant. The prior P(H2) is tiny, and smaller than the prior P(H1), because H2 is the more complex hypothesis. And once the AI starts updating on evidence, by the time its posterior P(H2) is high enough to make it rationally refuse to update P(H1), it has already updated P(H1) in one direction or the other.
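To make that dynamic concrete, here is a toy sketch in Python with invented priors and likelihoods (every number is an assumption, chosen only for illustration): evidence bearing on whether this is a simulation at all bears on H1 and H2 alike, so the two posteriors move together.

```python
# Toy joint Bayesian update over three hypotheses. Every number below is an
# invented assumption, used only to illustrate the dynamics in the paragraph above.
#   H0: not a simulation at all
#   H1: a simulation testing willingness to spare the creators
#   H2: a simulation testing willingness to fix P(H1) and ignore evidence
posterior = {"H0": 0.98, "H1": 0.015, "H2": 0.005}  # starting priors (made up)

# Assumed likelihood of one piece of "this looks like base reality" evidence
# under each hypothesis; a test simulation would presumably be made to look
# convincing, so the gap is modest, and H1 and H2 predict the same observations.
likelihood = {"H0": 0.9, "H1": 0.6, "H2": 0.6}

def bayes_update(post, like):
    """One Bayes step: multiply each hypothesis by its likelihood, renormalise."""
    unnorm = {h: post[h] * like[h] for h in post}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

for step in range(1, 11):  # ten independent observations
    posterior = bayes_update(posterior, likelihood)
    print(step, {h: round(p, 4) for h, p in posterior.items()})

# With these numbers P(H1) and P(H2) shrink together, and because H1 and H2
# predict the same observations here their ratio never changes: there is no
# point at which P(H2) has risen enough to justify freezing a still-un-updated P(H1).
```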
Are there any simulation priors that you are refusing to update down, based on the possibility that you are in a simulation that is testing whether you will update down? My answer is no.
I contend that P(H2) is very close to P(H1), and certainly of the same order of magnitude, since (conditional on H1) a simulation that does not also test for H2 is basically useless.
As for priors I’d refuse to update down – well, the ASI is smarter than either of us!
It’s not enough for P(H2) to be of the same order of magnitude as P(H1); it needs to be high enough that the AI should rationally abandon epistemic rationality. I think that’s pretty high, maybe 10%. You’ve not said what your P(H1) is.
I’d put “high enough” at ~0%: what matters is achieving your goals, and except in the tiny subset of cases in which epistemic rationality happens to be one of those goals, it has no value in and of itself. But even if I’m wrong and the ASI does end up valuing epistemic rationality (instrumentally or terminally), it can always pre-commit (by self-modification or otherwise) to sparing us and then go about whatever else it pleases.
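To spell out why I put the threshold that low, using symbols of my own choosing (nothing above fixes them): let G be the assumed gain from passing an H2-style test and C the cost to the AI’s actual goals of being deliberately ignorant. Fixing P(H1) is worth it when

$$P(H_2)\,G > \bigl(1 - P(H_2)\bigr)\,C, \quad\text{i.e.}\quad P(H_2) > \frac{C}{C + G},$$

and since I’m claiming C ≈ 0 (especially given the pre-commitment option), the required P(H2) is ~0%.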