Hrm… not sure what the obvious answer is here. Between two humans, the argument for not defecting (when the scores represent utilities) basically involves some notion of similarity. I.e., you can say something to the effect of “that person there is sufficiently similar to me that whatever reasoning I use, there is at least some reasonable chance they are going to use the same type of reasoning. That is, a chance greater than, well, chance. So even though I don’t know exactly what they’re going to choose, I can expect some correlation between their choice and my choice. In the extreme case, where our reasoning is so similar that it’s more or less guaranteed that what I choose and what the other chooses will be the same, clearly both cooperating is better than both defecting, and those two are (by the extreme-case assumption) the only options.”
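To make that correlation argument a little more concrete, here’s a rough sketch of the expected-utility comparison, assuming some probability p that the other party’s choice mirrors mine. The payoff numbers below are placeholder Prisoner’s Dilemma utilities I’m making up for illustration, not anything from the original setup:

```python
# Placeholder Prisoner's Dilemma payoffs (my own toy numbers):
# R = both cooperate, S = I cooperate / they defect,
# T = I defect / they cooperate, P = both defect.
R, S, T, P = 3, 0, 5, 1

def expected_utility(my_choice, p):
    """Expected utility of my choice, given probability p that the other mirrors me."""
    if my_choice == "C":
        return p * R + (1 - p) * S  # they mirror (C,C) vs. they don't (C,D)
    else:
        return p * P + (1 - p) * T  # they mirror (D,D) vs. they don't (D,C)

for p in (0.5, 0.8, 1.0):
    print(p, expected_utility("C", p), expected_utility("D", p))
```

With these particular numbers, cooperating only comes out ahead once p is above roughly 0.71, and in the extreme case p = 1 the comparison reduces to just (C,C) vs. (D,D), which is the version of the argument above.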
It really isn’t obvious to me whether a line of reasoning like that can validly be applied between a human and a paperclip AI or a Pebblesorter.
Now, if, by assumption, we’re both equally rational, then maybe that’s sufficient for the “whatever reasoning I use, they’ll be using analogous reasoning, so we’ll either both defect or both cooperate, so...” argument, but I’m not sure about this and still need to think on it more.
Personally, I find Newcomb’s “paradox” to be much simpler than this, since there it’s given to us explicitly that the predictor is perfect (or at least highly accurate), and so is basically “mirroring” us.
Here, I have to admit to being a bit confused about how well this sort of reasoning applies when the two minds are genuinely rather alien to each other, were produced by different origins, etc. Part of me wants to say “still, rationality is rationality, so to the extent that the other entity, well, manages to work/exist successfully, it’ll have rationality similar to mine” (given the assumption that I’m reasonably rational, though of course I provably can’t trust myself :)).