Well, I’ve had a think about it, and I’ve concluded that it would matter how great the difference between TDT and TDT-prime is. If TDT-prime is almost the same as TDT, but has an extra stage in its algorithm in which it converts all dollar amounts to yen, it should still be able to prove that it is isomorphic to Omega’s simulation, and therefore will not be able to take advantage of “logical separation”.
But if TDT-prime is different in a way that makes it non-isomorphic, i.e. it sometimes gives a different output given the same inputs, that may still not be enough to “separate” them. If TDT-prime acts the same as TDT, except when there is a walrus in the vicinity, in which case it tries to train the walrus to fight crime, it is still the case in this walrus-free problem that it makes exactly the same choice as the simulation (?). It’s as if you need the ability to prove that two agents necessarily give the same output for the particular problem you’re faced with, without proving what output those agents actually give, and that sure looks crazy-hard.
EDIT: I mean crazy-hard for the general case, but much, much easier for all the cases where the two agents are actually the same.
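To make the walrus case concrete, here is a toy sketch in Python. Everything in it is hypothetical: `Problem`, `tdt`, `tdt_prime` and `agree_on` are names I made up for illustration, and `tdt` is a stub that returns a fixed answer rather than a model of TDT's actual reasoning. The only point is that two agents can be non-isomorphic overall and still coincide on one particular problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """Hypothetical stand-in for a decision problem."""
    description: str
    walrus_nearby: bool = False


def tdt(problem: Problem) -> str:
    # Placeholder for TDT's decision procedure; the fixed answer is an
    # assumption made purely so the example runs.
    return "one-box"


def tdt_prime(problem: Problem) -> str:
    # Differs from TDT only when a walrus is in the vicinity.
    if problem.walrus_nearby:
        return "train the walrus to fight crime"
    return tdt(problem)


def agree_on(problem: Problem, a, b) -> bool:
    # Brute-force check: runs both agents and compares their outputs.
    return a(problem) == b(problem)


walrus_free = Problem("Newcomb-like problem")
with_walrus = Problem("Newcomb-like problem", walrus_nearby=True)

print(agree_on(walrus_free, tdt, tdt_prime))  # same choice here
print(agree_on(with_walrus, tdt, tdt_prime))  # the agents diverge
```

Note that `agree_on` cheats: it finds out that the agents agree by computing what each one outputs. The hard part described above is reaching the same conclusion from the structure alone (here, noticing that `tdt_prime` simply delegates to `tdt` whenever `walrus_nearby` is false) without first evaluating either agent.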
EDIT 2: On the subject of fairness, my first thoughts: a fair problem is one in which, if you had arrived at your decision by a coin flip (one that is as transparently predictable as your actual decision process, i.e. Omega can predict with perfect accuracy whether it will come down heads or tails), you would be rewarded or punished exactly as much as you would be using your actual decision algorithm (and this applies to every available option).
EDIT 3: Sorry to go on like this, but I’ve just realised that won’t work in situations where some other agent bases their decision on whether you’re predicting what their decision will be, e.g. the Prisoner’s Dilemma.
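For what it's worth, the coin-flip test from EDIT 2 can be written down as a rough check. This is only a sketch under my own assumptions: the payoff functions, the `algorithm_name` argument and `is_fair` are all hypothetical, the dollar amounts are the usual Newcomb figures rather than anything specified here, and the predictor is assumed to be perfect (so the predicted choice always equals the actual choice). It covers only single-agent problems, so it does nothing about the Prisoner's Dilemma objection in EDIT 3.

```python
OPTIONS = ["one-box", "two-box"]


def newcomb_payoff(choice: str, predicted_choice: str, algorithm_name: str) -> int:
    # Standard Newcomb payoffs: the reward depends only on the choice and
    # on the prediction of that choice, never on how the choice was made.
    box_b = 1_000_000 if predicted_choice == "one-box" else 0
    return box_b if choice == "one-box" else box_b + 1_000


def biased_payoff(choice: str, predicted_choice: str, algorithm_name: str) -> int:
    # A made-up unfair variant: it pays nothing to agents running "TDT",
    # whatever they choose.
    if algorithm_name == "TDT":
        return 0
    return newcomb_payoff(choice, predicted_choice, algorithm_name)


def is_fair(payoff, options, algorithm_name: str) -> bool:
    # Fair (in the EDIT 2 sense) if, for every option, a perfectly
    # predicted coin landing on that option earns the same as the agent.
    return all(
        payoff(option, option, algorithm_name) == payoff(option, option, "coin")
        for option in options
    )


print(is_fair(newcomb_payoff, OPTIONS, "TDT"))
print(is_fair(biased_payoff, OPTIONS, "TDT"))
```

Under these assumptions the ordinary Newcomb payoffs pass the test and the variant that singles out an algorithm by name fails it, which matches the intuition that a fair problem should care about what you decide and not about what you are.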