Very much this. Indeed, creating common knowledge that the people around me will be able to wield potentially destructive power without trying to leverage that power into taking resources away from me, or trying to force me to change my mind on something, is one of the things I want out of Petrov Day, since it’s definitely a thing I am worried about.
I understand and sympathize with the desire to know that people around you can hold that power without abusing it. I would also like to know that.
But it’s only ever a belief about the average behavior of the people in the community. It should update when new information becomes available. The button is a test of your belief. Each decision made by each person to press or not to press the button is information that should feed your model of how probable it is that people can hold the power without abusing it. A bunch of people staring at a red button with candles lit nearby can be an iterated Prisoner’s Dilemma depending on the participants’ utility functions.
If I decide not to Petrov-Ruin out of a desire to protect your belief that people can hold the power without abusing it, and I make that change because I care about you and your suffering as a fellow human and think your life will be much worse if my actions demolish that belief, then a successful Petrov Day is at risk of becoming another example of Goodhart’s Law.
I think, anyway. Sometimes my prose comes off as aggressive when I’m just trying to engage with thoughtful people. I swear, on SlateStarCodex’s review of Surfing Uncertainty, that I’m typing in good faith and could have my mind changed on these issues.
If I decide not to Petrov-Ruin out of a desire to protect your belief that people can hold the power without abusing it, and I make that change because I care about you and your suffering as a fellow human and think your life will be much worse if my actions demolish that belief, then a successful Petrov Day is at risk of becoming another example of Goodhart’s Law.
TBC, I think you’re supposed to not Petrov-ruin so as to not be destructive (and so as to not leverage your destructive power to modify habryka to be more like you’d like them to be). My interpretation of habryka is that it would be nice if (a) it were actually true that this community could wield destructive power without being destructive, etc., and (b) everybody knew that. The problem with wielding destructive power is that it makes (a) false, not just that it makes (b) false.