We can conclude that in gaining more karma than [300], one becomes the kind of person who doesn’t destroy the world symbolically or otherwise.
I imagine this is tongue in cheek, but we really can’t. You mentioned an important reason—someone with more karma could have waited to press the button. The first button press occurred 110 minutes after it could have been pressed. The second button press occurred at least 40 minutes after it could have been pressed, and perhaps 100, 160, 220, etc. In 2020 the button was pressed 187 minutes after it could have been pressed (by a 4000+ karma user).
You excluded known troublemakers from accessing the button, but you didn’t exclude unknown troublemakers, and lower karma is correlated with being unknown.
We are also dealing with a hostile intelligence who pressed the button (or caused it to be pressed). Someone with higher karma might deliberately wait to press the button to throw people off the scent, to encourage people to make a naive update about karma scores, or to shorten how long the home page stays down while still not leaving it up all day. The timing evidence is thus hostile evidence, and updating on it correctly requires superintelligence.
Put this together and I would not place any suspicion on the noble class of 200-299 karma users that I happened to enter on Petrov Day after net positive gains from complaining about the big red button.
I am willing to update that at least one person in the 200+ karma range pressed the button, and at least one person with zero karma pressed the button. This assumes there was not a third bug in play. This does not change my opinion of LessWrong users, but those who predicted that the home page would remain up could update.
I also imagine that it was tongue in cheek, but I also think the structure of the whole thing so heavily suggests this line of thinking that recognising it as wrong at a surface level doesn’t really dispel it.
The timing evidence is thus hostile evidence and updating on it correctly requires superintelligence.
What do you mean by this? It seems trivially false that updating on hostile evidence requires superintelligence; for example, poker players still use their opponent’s bets as evidence about their cards, even though those bets are frequently intended to mislead them.
The evidence being from someone who went against the collective desire does mean that confidently taking it at face value is incorrect, but not that we can’t update on it.
Good callout; that sentence is oversimplified, but I think the conclusion is correct.
Epistemic status: personal rule of thumb, defensively oriented.
Example: Cursed Monty Hall. This is like Monty Hall except that we know that Monty doesn’t want us to win and is free to reveal whatever evidence he wants, at no cost to himself. Before Monty opens a door, we think that sticking with our choice has the same EV as switching to another door. After Monty opens a door, this should not change our decision. If updating on the evidence would cause us to make a better decision, Monty would not have given us the evidence.
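Here is a minimal simulation of that claim (the door setup, the specific adversarial reveal policy, and the “naive” counter-policy are my own assumptions, since Cursed Monty Hall leaves Monty’s exact strategy open):

```python
import random

TRIALS = 100_000

def trial(policy):
    """One round of Cursed Monty Hall; the player always picks door 0."""
    prize = random.randrange(3)
    pick = 0
    # Assumed adversarial policy (one choice among many): Monty opens a
    # goat door only when the player's pick is the prize, hoping the
    # standard "always switch" lesson tempts them away from it.
    monty_opened = (pick == prize)
    return policy(pick, monty_opened, prize)

def naive_updater(pick, monty_opened, prize):
    """Applies the standard Monty Hall update: switch after a reveal."""
    if monty_opened:
        goat_doors = [d for d in range(3) if d != pick and d != prize]
        opened = random.choice(goat_doors)
        pick = next(d for d in range(3) if d not in (pick, opened))
    return pick == prize

def ignorer(pick, monty_opened, prize):
    """Ignores Monty entirely and sticks with the original pick."""
    return pick == prize

for name, policy in [("naive updater", naive_updater), ("ignorer", ignorer)]:
    wins = sum(trial(policy) for _ in range(TRIALS))
    print(f"{name}: win rate {wins / TRIALS:.3f}")
# naive updater: ~0.000 -- Monty only offers evidence when acting on it hurts
# ignorer:       ~0.333 -- refusing to update preserves the prior EV
```

Under this policy, Monty reveals a door only when the player has already won, so the player who dutifully “switches on a reveal” loses every single game, while the player who ignores Monty keeps the baseline 1/3.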
It’s not quite that simple in other cases. In Cursed Monty Hall, we assume it costs Monty nothing to feed us false evidence. In poker, it costs money to make a bet. A player’s willingness to feed false evidence to their opponent is limited by the cost of providing it. Another way of looking at this is that a poker bet is not purely hostile evidence; it is also an action in the game.
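A standard game-theory calculation makes this concrete (the pot and bet sizes below are my own illustrative numbers, not from this thread): at equilibrium a bettor bluffs just often enough to make calling break even, so the bet never degenerates into pure noise.

```python
def bet_posterior(pot: float, bet: float) -> float:
    """Posterior probability that a bet is a value bet rather than a
    bluff, assuming the bettor bluffs at the equilibrium rate.

    Facing a bet `b` into pot `p`, the caller risks b to win p + b, so
    the indifference-making bluff-to-value ratio is b : (p + b).
    """
    bluff_to_value = bet / (pot + bet)
    return 1 / (1 + bluff_to_value)

for bet in (0.5, 1.0, 2.0):  # bet sizes as fractions of a 1-unit pot
    print(f"bet {bet} x pot -> P(value hand | bet) = {bet_posterior(1.0, bet):.2f}")
# Larger bets cost more and so license proportionally more bluffs, yet
# the posterior stays above 1/2: the costly bet remains real evidence.
```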
Another example from poker is deliberate table talk aimed at deception. This is more purely hostile evidence: it costs no in-game resources. Updating correctly based on table talk is therefore much harder than updating correctly based on bets. Whether it requires a “superintelligence” to update “correctly” is probably down to semantics.
In the LessWrong RedButton game, there is a cost to blowing up the home page when one is normally sleeping. We might update a little on the most likely sleeping habits of the attacker. But not too much! The value of the update must be less than the cost of misleading us, or else the attacker will pay the cost in order to mislead us. Whatever value we gain from updating positively about people who were asleep at 5:33:02 PM on 2022-09-26, it must be less than the cost to the attacker of staying up late or waking up early, one day a year.
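That inequality can be written as a one-line screening rule (a toy formulation of the paragraph above; the variable names are mine, and treating the two quantities as comparable units is itself a simplifying assumption):

```python
def hostile_signal_is_informative(update_value: float, faking_cost: float) -> bool:
    """Toy screening rule for evidence from a hostile source: the signal
    is only worth updating on if faking it would cost the attacker more
    than the misleading update is worth to them. Units are deliberately
    vague; comparing them directly is a simplifying assumption."""
    return faking_cost > update_value

# Assumed toy numbers: one late night a year is cheap for the attacker,
# while sending investigators down the wrong path is valuable to them.
print(hostile_signal_is_informative(update_value=1.0, faking_cost=0.1))  # False
```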
Similarly, for a 200+ karma user there is no clear cost or benefit to the attacker in blowing up the home page at the beginning of their available window versus at the end. So we should not update on the karma of the attacker, beyond noting that it was 200+. I welcome attempts to calculate a better Bayesian update on the attacker’s karma than that, but I don’t see how it’s going to work.
I’m not really sure you can treat the button-presser as hostile in the same sense as someone you are playing poker against is hostile. Someone might for example just think it’s funny to take down the frontpage, it doesn’t mean they have an incentive to minimize the information we get out of it.
I second that we can’t really conclude that high-karma users aren’t the button-pressing types, for the reasons you reference.
I like this comment.