I am denying that superintelligences play this game in a way that looks like “Pick an ordinal to be your level of sophistication, and whoever picks the higher ordinal gets $9.” I expect sufficiently smart agents to play this game in a way that doesn’t incentivize attempts by the opponent to be more sophisticated than you, and that doesn’t leave you incentivized to try to exploit an opponent by being more sophisticated than them, provided that both parties have the minimum level of sophistication needed to be that smart.
If faced with an opponent stupid enough to play the ordinal game, of course, you just refuse all offers less than $9, and they find that there’s no ordinal level of sophistication they can pick which makes you behave otherwise. Sucks to be them!
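To make that concrete, here is a minimal sketch of the kind of payoff structure being gestured at: a toy discretization of a $10 split that is my own construction (the names `payoffs` and `THRESHOLD` are hypothetical, and the integer grid is an assumption). It shows that against a responder who refuses everything below $9, no choice available to the proposer extracts more than $1.

```python
# Toy model (my own discretization, not anything specified in the thread):
# a $10 split where the proposer keeps `demand` and offers the rest,
# and the responder accepts only if their share is at least `threshold`.

def payoffs(demand: int, threshold: int, pot: int = 10):
    """Return (proposer payoff, responder payoff) for one round."""
    offer = pot - demand
    return (demand, offer) if offer >= threshold else (0, 0)

# The responder's policy under discussion: refuse all offers below $9.
THRESHOLD = 9

# The proposer can pick as high an "ordinal of sophistication" as they
# like, but all that sophistication can do here is choose a demand,
# and no demand does better against this responder than conceding $9.
best = max(range(0, 11), key=lambda d: payoffs(d, THRESHOLD)[0])
print(best, payoffs(best, THRESHOLD))  # -> 1 (1, 9)
```

This obviously flattens “sophistication” down to a single demand parameter; it is only meant to illustrate the incentive claim, not the agents’ actual reasoning.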
I agree that most superintelligences won’t do something that is simply “play the ordinal game” (it was just an illustrative example), that a superintelligence can implement your proposal, and that it is conceivable most superintelligences implement something close enough to your proposal to reach Pareto-optimality. What I’m missing is why that is likely.
Indeed, the normative intuition you are expressing (that your policy shouldn’t, in any case, incentivize the opponent to be more sophisticated, etc.) is already a notion of fairness (although at the first meta-level rather than the object level). Why should we expect most superintelligences to share it, given its dependence on early beliefs and on other pro tanto normative intuitions (distinct from ex ante optimization)? Why should we expect it to be selected for, either inside a mind or by external survival mechanisms?

Compare, especially, to a nascent superintelligence that believes most others might be simulating it and best-responding (and thus wants to be stubborn). Why should we think this is unlikely?

If I became convinced that trapped priors are not a problem, I would probably put much more probability on superintelligences eventually coordinating.
Another way to put it is: “Sucks to be them!” Yes, sure, but it also sucks to be me, who lost the $1! And maybe it sucks to be me, who didn’t do something super hawkish that would have gotten a couple of other players to best-respond! While it is true that these normative intuitions pull on me less than the one you express, why should I expect that to be the case for most superintelligences?
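To put rough numbers on those pulls (same toy $10 split as above, purely illustrative and my own framing, not anything from the thread):

```python
# Same toy $10 split as above (illustrative numbers only).
def payoffs(demand, threshold, pot=10):
    offer = pot - demand
    return (demand, offer) if offer >= threshold else (0, 0)

# "Sucks to be me who lost the $1": against a proposer who hardheadedly
# demands $9, refusing everything below $9 nets me $0, where swallowing
# the unfair offer would have netted me $1.
print(payoffs(9, 9)[1], payoffs(9, 1)[1])  # -> 0 1

# "Sucks to be me who didn't go hawkish": if a couple of opponents do
# best-respond to a hardheaded $9 demand, the hawk keeps $9 each time.
print(payoffs(9, 1)[0])                    # -> 9
```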