Good callout; that sentence is simplified. I think the conclusion is correct.
Epistemic status: personal rule of thumb, defensively oriented.
Example: Cursed Monty Hall. This is like Monty Hall except that we know that Monty doesn’t want us to win and is free to reveal whatever evidence he wants, at no cost to himself. Before Monty opens a door, we think that sticking with our choice has the same EV as switching to another door. After Monty opens a door, this should not change our decision. If updating on the evidence would cause us to make a better decision, Monty would not have given us the evidence.
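The Cursed Monty Hall claim can be checked with a small simulation (my sketch, not from the original comment; the adversarial Monty strategy is one illustrative choice). Here Monty reveals a goat only when the player's first pick is already the car, so the classic "switch when shown a goat" rule is exactly the response he wants to exploit:

```python
import random

def simulate(n_trials: int, policy: str, seed: int = 0) -> float:
    """Cursed Monty Hall: Monty is adversarial and may reveal a goat
    door (or stay silent) at no cost, aiming to make the player lose."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        car = rng.randrange(3)
        pick = 0  # player's initial choice, without loss of generality
        # Adversarial Monty: reveal a goat only when the player's pick
        # is already the car, tempting a naive updater into switching.
        reveal = next(d for d in (1, 2) if d != car) if pick == car else None
        if policy == "stick" or reveal is None:
            final = pick
        else:  # "switch on reveal": the best response to *benign* Monty
            final = next(d for d in (1, 2) if d != reveal)
        wins += (final == car)
    return wins / n_trials

print(simulate(100_000, "stick"))             # ≈ 1/3
print(simulate(100_000, "switch_on_reveal"))  # exactly 0 against this Monty
```

Against a benign Monty the same switching rule wins 2/3, which is the point of the cursed version: the value of the evidence depends on who is supplying it and why.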
It’s not quite that simple in other cases. In Cursed Monty Hall, we assume it costs Monty nothing to feed us false evidence. In poker, it costs money to make a bet. A player’s ability to feed false evidence to their opponent is limited by the cost of providing it. Another way of looking at this is that a poker bet is not purely hostile evidence; it is also an action in the game.
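The standard poker-theory indifference calculation makes the cost limit concrete (my sketch; the formula is textbook game theory, not from the comment). A caller who calls a bet `B` into a pot `P` needs to win `B/(P + 2B)` of the time to break even, so at equilibrium bluffs can make up at most that fraction of the bettor's range:

```python
def equilibrium_bluff_fraction(pot: float, bet: float) -> float:
    """Fraction of the betting range that can be bluffs at equilibrium.
    The caller pays `bet` into a final pot of `pot + 2*bet`, so they are
    indifferent when bluffs occur bet / (pot + 2*bet) of the time."""
    return bet / (pot + 2 * bet)

# A pot-sized bet supports one bluff for every two value bets:
print(equilibrium_bluff_fraction(pot=100, bet=100))  # ≈ 1/3
```

This is why a bet remains informative even from a hostile opponent: lying with chips has a price, and the price caps how often the lie can be told.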
Another example from poker is deliberate table talk aimed at deception. This is more purely hostile evidence: it costs no in-game resources. Updating correctly on table talk is therefore much harder than updating correctly on bets. Whether it takes a “superintelligence” to update “correctly” probably comes down to semantics.
In the LessWrong RedButton game, there is a cost to blowing up the home page when one is normally sleeping. We might update a little on the most likely sleeping habits of the attacker. But not too much! The value of the update must be less than the cost of misleading us, or else the attacker will pay the cost in order to mislead us. Whatever value we gain from updating positively about people who were asleep at 5:33:02 PM on 2022-09-26, it must be less than the cost to the attacker of staying up late or waking up early, one day a year.
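One way to see the "update value must be less than the cost of misleading" condition is as a screening constraint. Here is a toy model of that (the binary types, the likelihood ratio, and all numbers are illustrative assumptions, not estimates of anything):

```python
def posterior_awake(prior_awake: float, cost_to_shift: float,
                    value_of_misleading: float,
                    lr_if_honest: float = 4.0) -> float:
    """Toy screening model for the button-press timing update.
    If misleading the observer is worth more to the attacker than it
    costs, press times pool and we keep the prior; otherwise we apply
    an (assumed, made-up) honest likelihood ratio via Bayes' rule."""
    if value_of_misleading > cost_to_shift:
        return prior_awake  # attacker will pay the cost to mislead: no update
    odds = prior_awake / (1 - prior_awake) * lr_if_honest
    return odds / (1 + odds)

print(posterior_awake(0.5, cost_to_shift=1.0, value_of_misleading=2.0))  # 0.5
print(posterior_awake(0.5, cost_to_shift=2.0, value_of_misleading=1.0))  # 0.8
```

The interesting regime is the middle one the toy model skips: when cost and value are comparable, a mixed equilibrium allows a small but nonzero update, which matches the "update a little, but not too much" intuition above.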
Similarly, for a 200+ karma user there is no clear cost or benefit to the attacker in blowing up the home page at the beginning of their available window versus the end. So we should not update on the karma of the attacker beyond noting that it was 200+. I welcome attempts to calculate a better Bayesian update on the attacker’s karma than that, but I don’t see how it would work.
I’m not really sure you can treat the button-presser as hostile in the same sense that a poker opponent is hostile. Someone might, for example, just think it’s funny to take down the frontpage; that doesn’t mean they have an incentive to minimize the information we get out of it.