Ok, thanks, that makes more sense than anything I’d guessed.
There’s a difference between shortcutting a calculation and not accounting for something in the first place. In the debate between all the topics mentioned in the paper (e.g. SSI/SSA, split responsibility, precommitments and so on) not one method would give a different answer if that 0 was a 5, a 9, or a −100. It’s not because they’re shortcutting the maths, it’s because, as I said in my first comment, they assume that it’s effectively not possible for the two people to vote differently anyway. Which is fine in the abstract, even if it’s a little suspect in practice (since this, for once, is a quite realisable experiment).
I’ll rephrase my final line then: “If a method says to vote tails, and yet would give the same answer with the 0 changed to a 9, then it is clearly suspect”. Incidentally I don’t know of a method which says “vote tails” and would give a different answer if you changed the 0 to a 9 either.
I think the reason I didn’t get your comment originally is that the first thing I do with this problem is work with the differences, which in this case means subtracting everything from 10 and thinking in terms of money lost on bad votes, not absolute values. So I wouldn’t be multiplying by 0. It’s neither better nor worse; it just explains why I didn’t know what you meant.
Oh, okay. Looks like I didn’t really understand your point when I commented :)
Perhaps I still don’t—you say “no method gives a probability higher than 3⁄4 for the coin being tails,” but you’ve in fact been given information that should cause you to update that probability. It’s like someone had a bag with 10 balls in it. That person flipped a coin, and if the coin was heads the bag has 9 black balls and 1 white ball, but if the coin was tails the bag has 9 white balls and 1 black ball. They reach into the bag and hand you a ball at random, and it’s black—what’s the probability the coin was heads?
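For anyone who wants to check the bag example numerically, here is a minimal sketch of the update using Bayes' rule. The function name and argument names are made up for illustration; the numbers (a fair coin, 9 of 10 balls black under heads, 1 of 10 black under tails) come straight from the example above.

```python
from fractions import Fraction


def posterior_heads(prior_heads, p_black_given_heads, p_black_given_tails):
    """Return P(heads | a black ball was drawn) via Bayes' rule."""
    prior_tails = 1 - prior_heads
    joint_heads = prior_heads * p_black_given_heads  # P(heads and black)
    joint_tails = prior_tails * p_black_given_tails  # P(tails and black)
    return joint_heads / (joint_heads + joint_tails)


# Fair coin; heads -> 9 black of 10, tails -> 1 black of 10.
p = posterior_heads(Fraction(1, 2), Fraction(9, 10), Fraction(1, 10))
print(p)  # 9/10
```

So drawing a black ball moves the probability of heads from 1/2 to 9/10, which is the sense in which the evidence should change the answer.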
If you reward disagreement, then what you’re really rewarding in this case are mixed (probabilistic) actions. The reward only pays out if the coin landed tails, so that there’s someone else to disagree with. So people will give what seems to them to be the same honest answer when you change the result of disagreeing from 0 to 0+epsilon. But when the payoff from disagreeing passes the expected payoff of honesty, agents will pick mixed actions.
To be more precise: If we simplify a little and only let them choose 50⁄50 if they want to disagree, then we have that the expected utility of honesty is P(heads)*U(choice,heads) + P(tails)*U(choice,tails), while the expected utility of coin-flipping is pretty much P(heads)*U(average,heads) + P(tails)*U(disagree,tails). These will pass each other at different values of U(disagree,tails) depending on what you think P(heads) and P(tails) are, and also depending on which choice you think is best.
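A rough sketch of that comparison, taking the honest payoff in the tails case to be U(choice,tails) and using the same simplification (coin-flipping under tails is treated as always earning the disagreement payoff). All function names and every payoff number below are placeholders chosen for illustration, not values taken from the problem statement.

```python
from fractions import Fraction


def eu_honest(p_heads, u_choice_heads, u_choice_tails):
    """Expected utility of giving the honest answer."""
    return p_heads * u_choice_heads + (1 - p_heads) * u_choice_tails


def eu_coin_flip(p_heads, u_average_heads, u_disagree_tails):
    """Expected utility of a 50/50 mixed action, under the simplification above."""
    return p_heads * u_average_heads + (1 - p_heads) * u_disagree_tails


def break_even_disagree_payoff(p_heads, u_choice_heads, u_choice_tails,
                               u_average_heads):
    """U(disagree,tails) at which honesty and coin-flipping are equally good."""
    honest = eu_honest(p_heads, u_choice_heads, u_choice_tails)
    return (honest - p_heads * u_average_heads) / (1 - p_heads)


# Placeholder payoffs: honest choice earns 8 under heads and 10 under tails,
# and a coin flip earns 9 on average under heads.
print(break_even_disagree_payoff(Fraction(1, 3), 8, 10, 9))  # 19/2
print(break_even_disagree_payoff(Fraction(1, 2), 8, 10, 9))  # 9
```

With these made-up numbers the break-even point shifts from 19/2 to 9 when P(heads) goes from 1/3 to 1/2, which illustrates how the crossover depends on the probability you assign.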
I tried to cover what you’re talking about with my statement in brackets at the end of the first paragraph. Set the value for disagreeing too high and you’re rewarding it, in which case people start deliberately making randomised choices in order to disagree. Too low and they ought to be going out of their way to try and agree above all else—except there’s no way to do that in practice, and no way not to do it in the abstract analysis that assumes they think the same. A value of 9 though is actually in between these two cases—it’s exactly the average of the two agreement options, and it neither punishes nor rewards disagreement. It treats disagreement “fairly”, and in doing so entirely un-links the two agents. Which is exactly why I picked it, and why it simplifies the problem. Again I think I’m thinking of these values relatively while you’re thinking absolutely—a value of epsilon for disagreeing is not rewarding disagreeing slightly, it’s still punishing it severely relative to the other outcomes.
To me what it illustrates is that the linking between the two agents is something of an illusion in the first place. Punishing disagreement encourages the agents to collaborate on their vote, but the problem provides no explicit means for them to do so. Introducing an explicit means to co-operate, such as pre-commitment or having the agents run identical decision algorithms, would dissolve the problem into a clear solution (actually, explicitly identical algorithms make it a version of Newcomb’s Paradox, but that’s at least a well-studied problem). It’s the ambiguity of how to co-operate, combined with the strong motivation, lack of explicit means, and abundance of theoretical means to hand-wave agreement, that creates the paradox.
As for the stuff you say about the probability and the bucket of coloured balls, I get all that. The original probability of the coin flip was 1⁄2 each way. The evidence that you’ve been asked to vote makes the subjective likelihood of tails 2⁄3. Also somehow the number 3⁄4 appears in the SSA solution to the Sleeping Beauty problem (which to me seems just flat-out wrong, and enough for me to write off that method unless I see a very good defence of it), which made me worry that somewhere out there was a method which somehow comes up with 3⁄4. So I covered my bases by saying “no method gives probability higher than 3⁄4”, which was the minimum necessary requirement and what I figured was a fairly safe statement. The reality is that 2⁄3 is simply correct for the subjective probability of tails, for reasons like you say, and maybe I just confuse things by mucking about trying to cover all possible bad solutions. It is, I admit, a little confusing to talk about whether anything is “more than 3⁄4” when the only two values under serious consideration are the a-priori 1⁄2 and the subjective posterior 2⁄3.
Yeah, I didn’t know exactly what problem statement you were using (the most common formulation of the non-anthropic problem I know is this one), so I didn’t know “9” was particularly special.
Though the point at which I think randomization becomes better than honesty depends on my P(heads) and on what choice I think is honest, so which value of the randomization-reward is special is fuzzy.
I guess I’m not seeing any middle ground between “be honest,” and “pick randomization as an action,” even for naive CDT where “be honest” gets the problem wrong.
which made me worry that somewhere out there was a method which somehow comes up with 3⁄4.
Somewhere in Stuart Armstrong’s bestiary of non-probabilistic decision procedures you can get an effective 3⁄4 on the sleeping beauty problem, but I wouldn’t worry about it—that bestiary is silly anyhow :P