Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it)
When Stanislav Petrov’s missile alert system pinged, the world was not watching. Russia was not watching. Perhaps a number of superiors in the military were staying in the loop about Stanislav’s outpost, waiting for updates. It wasn’t theatre.
In contrast, LessWrong’s historical Petrov Day celebrations have been pretty flashy affairs. Great big red buttons, intimidating countdown timers, and all that. That’s probably not what the next “don’t destroy the world” moment will look like.
It’s also the case that some of the biggest moral dilemmas don’t come clearly labeled as such, and don’t have the options clearly marked as “cooperate” or “defect”. (I think in Petrov’s case, it was clear it was a big decision. It’s unclear to me how easy it was for him to make, and why.)
Matching the spirit of the above, this year’s LessWrong commemoration was a little more one-on-one. It started with a poll. In previous years, the LessWrong team has unilaterally decided the meaning of Petrov Day, often facing objections. So why not get a sense of what people actually think matters most?
We sent the following private message to anyone who’d been active on LessWrong in the previous 24 hours:
252 people responded to the survey at the time I started work on this post, and the results are pretty clear:
The Most Important Value of Petrov Day
Note: We did not actually spend much time thinking about the options in this poll, their framing, etc. Like under 10 minutes. Feel free to discuss in the comments.
Virtue | Num | Percent |
Avoiding actions that noticeably increase the chance that civilization is destroyed | 144 | 57% |
Accurately reporting your epistemic state | 27 | 11% |
Quickly orienting to novel situations | 25 | 10% |
Resisting social pressure | 56 | 22% |
Total | 252 | 100% |
Results are not significantly different for users with 1000+ karma:
Virtue | Num | Percent |
Avoiding actions that noticeably increase the chance that civilization is destroyed | 35 | 49% |
Accurately reporting your epistemic state | 8 | 11% |
Quickly orienting to novel situations | 8 | 11% |
Resisting social pressure | 20 | 28% |
Total | 71 | 100% |
Unilaterally pushing your own values over the collective?
I don’t know whether what really was going on was genuinely idealistic as opposed to symmetrical fighting over resources, but a lot of the US<>Russia conflict seemed to be about values and beliefs about what was right. Capitalism, communism, etc.
This raises some good questions. What are the legitimate ways to promote your own values over other people? This is where the follow-up poll question took us.
Users were divided on the most important virtue (we don’t know their opinions on the other virtues listed re Petrov Day), but it seemed reasonable that next year we’d go with the majority (or at least plurality) as a focus.
However, part of the Petrov Day experience (imo) is individuals being given options to unilaterally change how things go for everyone else. Such an option we did kindly provide.
After some discussion, the LessWrong team has decided to make the focus of next year’s Petrov Day be the virtue that is selected as most important by the most people...
If you click the below link and are the first to do so of any minority group, we will make your selected virtue be the focus of next year’s commemoration [instead].
The plain value choice, according to me, is whether, faced with a values difference (or belief difference?), to go along with the majority or to decide that you’ll unilaterally take the opportunity to promote what you think is correct.
I find myself thinking about Three Worlds Collide scenarios where you come across others with different values, and possibly there are power differentials. What do you do when confronted by baby eaters (or, say, people who prioritize communicating your epistemic status clearly above other things), and you have the power to defeat them?
It’s interesting to think about.
Unilateralists
Here’s what happened: 31 out of 181 users clicked the link to promote their own favored value over what the collective would have chosen.
Note that I sent out the second follow-up message in batches, and some people responded to the first message after the last batch, and they did not get this opportunity. 181 did. That’s 17% of people willing to unilaterally promote their value.
I’m not sure of people’s reasoning here. Oliver Habryka said he almost instinctively clicked the link because of how it was displayed. Many people click first, read later. Unfortunately, though I attempted to place the link in a spoiler block, that didn’t work.
Perhaps people reason that it’s inevitable that someone clicks the link, so might as well be them. (But be the algorithm you want to see in the world??)
One user, Max H, did explain his reasoning in response to a shortform question I asked:
I got that exact message, and did click the link, about 1h after the timestamp of the message in my inbox.
Reasoning:
The initial poll doesn’t actually mention that the results would be used to decide the topic of next year’s Petrov Day. I think all the virtues are important, but if you want to have a day specifically focusing on one, it might make more sense to have the day focused on the least voted virtue (or just not the most-voted one), since it is more likely to be neglected.
I predict there was no outright majority (just a plurality) in the original poll. So most likely, the only thing the first clicker is deciding is going with the will of something like a 20% minority group instead of a 30% minority group.
I predict that if you ran a ranked-choice poll that was explicitly on which virtue to make the next Petrov Day about, the plurality winner of the original poll would not win.
All of these reasons are independent of my actual initial choice, and seem like the kind of thing that an actual majority of the initial poll respondents might agree with me about. And it actually seems preferable (or at least not harmful) if one of the other minorities gets selected instead, i.e. my actual preference ordering for what next year’s Petrov Day should be about is (my own choice) > (one of the other two minority options) > (whatever the original plurality selection was).
If lots of other people have a similar preference ordering, then it’s better for most people if anyone clicks the link, and if you happen to be the first clicker, you get a bonus of having your own personal favorite choice selected.
(Another prediction, less confident than my first two: I was not the first clicker, but the first clicker was also someone who initially chose the “Avoiding actions...” virtue in the first poll.)
He thought about it! I won’t dive into discussing this here, but curious to hear from other link-receivers why they did or didn’t click.
Which virtue-promoter group is the most unilateralist-y?
Before you hover/click into the spoiler text, please take a moment to register your predictions about which Virtue group clicked the “unilaterally make your virtue the focus” button the most.
Virtue | Total in Group | Num Link Clickers | Percent of Group |
Avoiding actions that noticeably increase the chance that civilization is destroyed | 104 | 17 | 16% |
Accurately reporting your epistemic state | 19 | 4 | 21% |
Quickly orienting to novel situations | 17 | 2 | 12% |
Resisting social pressure | 40 | 8 | 20% |
Total | | 31 | |
Well, that really does make a lot of sense, frankly. Watch out for them nonconformist types, they don’t care what you want!
EDIT: Oops, no, data error. I mismatched the values and it is not the case that the “Resist social pressure” group “defects” at a higher rate.
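The per-group rates can be rechecked directly from the counts in the table above. This is a minimal sketch that uses only the numbers as they appear in that table; the variable and function names are illustrative and not taken from any LessWrong tooling.

```python
# Counts copied from the table above: (total in group, number of link clickers).
groups = {
    "Avoiding actions that noticeably increase the chance that civilization is destroyed": (104, 17),
    "Accurately reporting your epistemic state": (19, 4),
    "Quickly orienting to novel situations": (17, 2),
    "Resisting social pressure": (40, 8),
}


def click_rates(groups):
    """Return {virtue: clickers / group size} for each virtue group."""
    return {name: clicked / total for name, (total, clicked) in groups.items()}


rates = click_rates(groups)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")

# Sum of the clicker column, for comparison with the table's Total row.
total_clickers = sum(clicked for _, clicked in groups.values())
print("Total clickers:", total_clickers)
```

On the counts as listed, the highest rate belongs to “Accurately reporting your epistemic state” (4 of 19), narrowly ahead of “Resisting social pressure” (8 of 40), though the smaller groups are too small to read much into the differences.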
So which virtue actually wins?
The very first click was from someone in group A, after 1 minute and 31 seconds had elapsed since they received the message.
But I think it’s unfair to base this on literally the first click, since the message containing the link was sent out in a very staggered way, in a few rounds.
In the first round, Group A got the message several minutes earlier than B, then C, etc. Then there was a subsequent round. Group A is also the largest, so it might have had the first clicker on size alone.
We can look here and see, for each group, how long the people who clicked the unilateralist link took to do so after they received the message.
Dun dun dun...
And well, fittingly perhaps, the person who clicked first in absolute terms and the person who clicked first relative to receiving are the same person! And also that person is from the majority group Avoiding actions that noticeably increase the chance civilization is destroyed.
And since they are from the majority group and they clicked the link, they and their group are disqualified! Meaning we go with the second largest group, Resisting social pressure! Congrats, Group D! You’re the most defect-y, but you win.
No, I joke. There’s no disqualification of your entire group. If I have a say in next year’s LessWrong celebrations, I think we should honor our word[1] and go with the majority/person who clicked the link first, which is Avoiding actions that noticeably increase the chance civilization is destroyed!!!! Woooo. Good work.
That’s the most important message of Petrov Day. An absolute majority of 57% of respondents confirm it.
EDIT: Vanessa Kosoy points out in the comments that what I actually wrote was:
If you click the below link and are the first to do so of any minority group, we will make your selected virtue be the focus of next year’s commemoration.
And since many people clicked, the winner should be the first click from a minority group. This would be someone who selected Accurately reporting your epistemic state.
I find this reasoning compelling, but will allow myself to think/hear other arguments.
In all seriousness, I am pretty interested in the question of how to behave when you’re sharing resources with people with differing values, including resources that determine which values get promoted more.
Majority vote is a basic standard of fairness, but maybe if you can “cure” the babyeaters, you do so by force when given the chance. Or there are lots of other things that play into it, as in Max H’s reasoning above. I’m quite curious, please share your experience and thoughts from the other end.
[1] I twinge slightly that for the sake of the “game” we sent the majority group a message falsely saying they were in the minority. This was mostly because it was faster to not special-case it.
This was a really great Petrov day thing-to-do. I clicked on the link basically instinctively, after having a panic about being under time pressure, and about having my preferred outcome not taken, and therefore not being able to think so well about what my actions would actually do. Immediately after clicking, I felt very good, and like I had made the right decision, and then I felt a sinking sensation in my gut as I realized that if I had not taken any action, and nobody else had taken any action, then in a year I’d get to see what LessWrong as a collective decided, and I realized that actually I really wanted to see the result of that Petrov day, even if it meant Petrov day wouldn’t be about preventing the end of the world.
For the rest of that day, and into today, I could not be in the world where LessWrong both collectively voted on what they wanted, and collectively refused to overrule the majority. It felt like I had ruined Petrov day. Even if others had still clicked the link in my stead, I still didn’t want to have been among the people who ruined Petrov day, and it would feel even worse if I was the first, which I place nontrivial probability on (according to Firefox’s history, and my LessWrong message history, I received the message at 6:49 PM, and closed the tab the link opened at 6:51 PM, so it does indeed seem like I was the first one to click the link).
I’ve been feeling really terrible about this since doing it, imagining how sad people will feel when they see that someone has clicked this year’s instantiation of the button.
While feeling bad about it, I could also feel my mind trying to come up with excuses for why it isn’t so bad. The following are interesting rationalizations I’ve come up with:
Unilateralism to save the world is in the Petrov day spirit. Petrov was in fact a unilateralist, and he decided contrary to his legitimate government that he would overrule his country, and refuse to report incoming missiles.
The LessWrong team are the ones ultimately in charge of LessWrong. They set up a democratic process with a unilateralist element. Their word on their justifications should be taken as literal, if they believe allowing for the option to overrule the majority will better the kind of aggregation of will they’re aiming for to decide next year’s virtue, then I should use the special power they gave me in the way they described they wanted me to use it.
But both fell when I just imagined the world I created, and felt sad because I wouldn’t see the Petrov day decided by the communal process that I liked better than my own personal decisions.
My mind went through other processes, among them being my mind attempting to blame the LessWrong team for giving me this choice (to be clear: I do not blame them, and actually thank them), wanting to swear off my own ability to make unilateralist decisions on anything forever & always getting a second opinion on anything I could do that significantly affects anyone’s life (seems overkill, but lesson learned about the kind of situations my mind is bad at thinking under!), trying to blame things on the fact that this year’s button was not a button but a poorly labeled link (true buttons are not universally labeled as such, and I did have the hint that it was Petrov day, which future people won’t. Also the link wasn’t so poorly labeled), and blaming my own terrible reading comprehension (situationally induced panic that I should have seen ahead of time for some reason is a less self-serving label to put on the situation than generally being bad at reading comprehension in all situations).
Probably this was the most emotional experience I’ve felt around LessWrong’s Petrov day, and I learned a lot about a particular state my mind can get into in which I should not take actions, and about how the world may likely end. Perhaps in a similar circumstance one day, where three different people across the world have the option to click their own links and buttons, but not enough space or time to make their minds actually think about the actual worlds they’re making possible or impossible. Or maybe two of them can actually think, but the third has the kind of mind that panics in that particular way when under that particular pressure, who then clicks their button, and maybe lives long enough to regret their decision once they get to think about it.
Some of the experience does get taken away knowing my group was not in fact the minority, and there were no absolute plans to change next year’s Petrov day plans.
This all sounds bad, but I really did love the experience. And you don’t need to worry that you distracted me from work I could have otherwise done! I worked double-time in order to try to distract myself from the issue!
Thanks for sharing all of that in such detail, <3 You make me feel quite glad we did this celebration.
Would you like to know which number click yours was?
I do want to know
You were the first, as you guessed.
I picked “resisting social pressure” and then when I got the second message, I thought “Aha, I was asked if I value resisting social pressure, and now I’m offered the chance of applying social pressure to make things go my way, to see if I will defect against the very virtue I claimed to be in favor of! I’m guessing that there’s a different message tailored for each of the virtues, where everyone is offered some action that is actually the opposite of the virtue they claimed to endorse, to see how many people are consistent. Clever! Can’t wait to see what the opposite choice for the other virtues is.”
Now I’m slightly disappointed that this wasn’t the case.
I would be worried about the opposite action for “avoiding increasing the probability of doom”...
Similar for me. I was very suspicious at first that the first message was a Scam and if I clicked I would blow up the website or something tricksy. Then with the second message I thought it might be customized to test my chosen virtue, “resisting social pressure”, so I didn’t click it.
I don’t think it’s inherently contradictory to apply social pressure while thinking that resisting social pressure is a virtue.
One possible reason for thinking that resisting social pressure is a virtue is if you think social pressure is generically bad, in which case applying it does indeed seem hypocritical. But it seems like there could be other reasons for considering that virtuous.
Perhaps you think that resisting social pressure shows personal strength and conviction, and that applying social pressure helps people to train their strength and conviction.
Or perhaps you think the world is made of a mixture of naturally-good people who can resist social pressure and naturally-bad people who can’t, and given that the bad people exist it’s pragmatically important to ensure that net social pressure points in a prosocial direction.
(In case it changes how you interpret my reasoning: I chose not to vote in the virtue poll.)
I saw the poll, found it really yucky, and thus didn’t answer it. I like LW for mostly staying on low simulacrum levels, and this poll felt like anything but. In more detail:
So the poll begins with an explanation of Petrov Day, and then leads with that as the first answer. This felt like one of those television thingies where they aren’t allowed to just enter you into a giveaway, and instead make you answer a question like “The president of the US is X. Who is the president of the US?”, so that it’s considered a game of skill, rather than pure chance. This kind of yucky Dark Arts style of leading questions is common, but I expect much more from LW.
Another leading / biasing aspect: Virtue A is by far the longest answer, and on my screen is the only one that covers two lines rather than one.
Anyway, having decided that I don’t like the poll, there’s no option for me to pick. Since the poll is so transparently biased towards making people pick Virtue A, even if I would’ve answered that in other circumstances, that’s not an option now. And choosing Virtue D is not “resisting social pressure”, it’s conforming to the social pressure of answering a flawed poll. Where was my option for “I’m cranky and this poll is stupid”?
Or if not that, then where was the option for “I saw the poll but don’t want to answer it?”. Twitter polls often have that, so that people don’t just pick an option at random to see the results.
Due to the lack of such a “mu” option, conscientious objection to the event didn’t work well this time, in contrast to some previous Petrov Day celebrations, where “do nothing” was a legitimate option.
There’s something ironic about the poll rewarding answers of the form “avoiding actions”, while actually punishing the avoidance of action.
Anyway, overall, I really didn’t like this.
I’m sad you didn’t like it. It indeed was not a carefully planned rigorous survey of Petrov Day attitudes.
In my thinking, it was more the start to a ~game/exercise than trying to maximally model people’s attitudes. I wanted to assign people to “teams” (I’d considered random assignment), but this felt a little more meaningful, and there’s non-zero bits even in an imperfect survey.
There was no intention to be leading in the responses, nor to corral for any particular response.
I actually hoped that the slap-dash nature would make people suspicious (plus inadvertent bugs/typos) and get them more into Petrov Day mood. From other comments, it sounds like this did happen somewhat.
I think if a failure happened here, it’s that you and others saw the poll as primarily an attempt to accurately survey LessWrong members’ beliefs about Petrov (pretty reasonable belief), but for me it was the start to something else, and the goal wasn’t “rigorous survey”, for which a mu option would have made sense.
I’m uncertain how much we should ever be a little sneaky/misleading for the sake of games/experiments/etc. I’m pro a norm that on April Fool’s and Petrov Day and similar, people might hoodwink you a little. At least I might, I will say as a matter of metahonesty.
Sure, I get that. But the result also matters. I predict that, if you presented the parts I quoted from the survey message to a random sample of the university-educated population, and asked them whether they thought the poll was biased, >50% would say yes. Doubly so if you also showed them the lopsided survey results.
Anyway, if the first message had been presented as a pick-your-team exercise, I indeed would’ve been somewhat less frustrated with it. Though even then, I don’t think the team category “resist social pressure” or “be contrarian” ever works; it’s an inherent contradiction. And as such, and in keeping with the Petrov Day theme, I maintain that it’s important to offer a true “mu” option, or a “do not participate”.
And insofar as one is inevitably tempted to interpret the results, while you do mention “there’s non-zero bits even in an imperfect survey”, your post does some interpretation but imo without sufficiently acknowledging the imperfection, which I don’t think is particularly conducive to painting an epistemically accurate picture. For example, without the “mu” option, we don’t even know which fraction of LWers who saw that message actually picked an option!
That seems quite plausible to me. My response was that we weren’t trying especially hard to avoid bias because we weren’t trying to get a super clear result.
Can you elaborate on this?
Petrov Day has a variety of themes. A few comments here endorse the unilateralist interpretation, but insofar as another thematic connection is conscientious objection or resisting social pressure, the following argument applies:
You offered the option “resist social pressure”. As I argued, this option doesn’t really work, in that picking it doesn’t really mean what it sounds like. It’s like giving someone the options “think inside the box” or “think outside the box”. Then choosing the latter is not a sign of creativity, just like choosing the option “resist social pressure” is not sufficiently (honoring the virtue of) resisting social pressure, because by picking it you’re ultimately still conforming to the frame you’ve been given. I don’t know if this point resonates, but I feel pretty strongly about it.
(Relatedly, it’s in some sense harder to resist the social pressure of your own peer group than that of the broader society.)
Anyway, due to the above, if you want an option to actually symbolize <resist social pressure>, then options like “mu” or “do not participate” imo work much better than the explicit description “resist social pressure”, because these alternatives let people answer without needing to buy into the provided frame.
Sorry, I get the point that the option provided doesn’t let you mu/reject the frame. It’s not clear to me that a core framing of Petrov’s actions/virtue was conscientious objection or so on.
Beyond that, the survey wasn’t aiming to allow people to symbolically act out their responses or to reject the frame in an unambiguous way. Insisting that you get to register that you saw it but didn’t like it feels like insisting that you get to participate, but in your own way, rather than simply not engaging if you don’t like it. I also feel like if there was an option to conscientiously object and you took that, that’d still be within the frame I created for you to do so? But open to being corrected here.
Well, personally speaking, I’ve already spent more time on my comments here than were warranted by my initial annoyance. So if I consider just myself, then choosing not to vote, followed by registering that non-vote in a comment, was of course a sufficient (albeit time-consuming) way to object.
However, looking at this comment thread by Said shows that besides me, at least three other people (EDIT: here’s one more) had problems with the framing and hence didn’t vote. But because there was no “mu” option, we’ll never know whether the proportion of those people was 0.1%, 1%, 10%, 50%, 90%, or whatever. That I continue to find unfortunate, and a “mu” option would’ve remedied this problem.
If I were to do it again, I might include such an option, though I’m still not terribly sad I didn’t.
If we really wanted that info, I could sample and survey people who received the message and looked at it (we have data to know this), and ask them why they didn’t vote. My guess is that between 1% and 10% of people who didn’t vote because of the frame commented about it, so that’s 40 to 400.
372 people have responded by now out of 2500, so 15%. Let’s guess that 50% of people saw it by now, so ~1250 (though could get better data on it). A third responded if so, which seems great for a poll. Of the 800 who saw but didn’t click, I could see 100-400 doing so because the frame didn’t really seem right (lining up with the above estimate). Which seems fine. I bet if I’d spent 10x as long developing the poll, I wouldn’t get that number down much, and knowing it with more precision doesn’t really help. It’s LW, people are very picky and precise (a virtue... but also makes having some nice things hard).
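The back-of-envelope estimate above can be written out as a short calculation. This is only a sketch of the arithmetic: the 50% "saw it" share is the comment's own guess, and the count of four objecting commenters is an assumption back-derived from the "40 to 400" range, not a figure stated in the thread.

```python
# Figures quoted in the comment above.
messaged = 2500          # people who were sent the poll
responded = 372          # people who had responded so far
saw_fraction = 0.5       # the comment's own guess at the share who had seen it

saw = messaged * saw_fraction            # people assumed to have seen the poll
overall_rate = responded / messaged      # share of everyone messaged
seen_rate = responded / saw              # share of those who saw it
saw_no_vote = saw - responded            # saw the poll but did not vote

# ASSUMPTION: roughly four commenters objected to the framing. This count is
# inferred from the "40 to 400" range and is not stated in the comment.
commenters = 4
objectors_low = commenters * 100 / 10    # if 10% of objectors commented
objectors_high = commenters * 100 / 1    # if only 1% of objectors commented

print(f"overall response rate: {overall_rate:.0%}")
print(f"response rate among those who saw it: {seen_rate:.0%}")
print(f"saw but did not vote: {saw_no_vote:.0f}")
print(f"estimated objectors: {objectors_low:.0f} to {objectors_high:.0f}")
```

With these inputs the saw-but-did-not-vote figure lands nearer 880 than 800, which does not change the comment's conclusion; the estimate is far more sensitive to the guessed 50% and to the 1% to 10% commenting rate.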
I didn’t participate in the poll because it didn’t make much sense, the options didn’t seem to cover the possibilities or really contain what I’d pick if asked in free-response form, and the whole thing seemed rather slapdash. Clicking any of the options wouldn’t’ve reliably communicated anything about my beliefs.
(Also, there was a spelling error in the poll option URLs, which said “petroy” instead of “petrov”. This further undermined any confidence I may have had in the poll meaning anything.)
I’m curious to know what your free-form response would be.
Something like: “What does ‘would preserve that [virtue] over all others’ mean, exactly? What’s the scenario here, what’s forcing me to make this choice and what exactly do I think will be the consequences of making the selection? Who am I ‘reporting’ my epistemic state to? Why is there no option that talks about seeking the truth and understanding reality? Can I pick ‘the virtue of not taking silly philosophy questions at face value’?”
At the risk of butting in, I also didn’t participate because none of the options reflected my own views on the important virtue of Petrov Day, which is more like “Do not start a fucking nuclear war”. You can try to distill that down to nicely categorized principles and virtues, and some of those might well be good things on their own, but at this level of abstraction it doesn’t capture what’s special about Petrov Day to me.
Trying to file down the Petrov Day narrative into supporting some other hobbyhorse, even if it’s a wonderful hobbyhorse which I otherwise support like resisting social pressure, is a disservice to what Stanislav Petrov actually did. The world is richer and more complex than that.
I personally preferred the past Petrov Day events, with the button games and standoffs between different groups and all that. They didn’t perfectly reflect the exact dilemma Petrov faced, sure, but what could. They were messy and idiosyncratic and turned on weird specific details. That feels like a much closer reflection of what makes Petrov’s story compelling, to me. Maybe the later stages of this year’s event would’ve felt more like that, if I’d seen them at the time, but reading the description I suspect probably not.
I like that you guys are trying a bunch of different stuff and it’s fine if this one thing didn’t land for me.
That’s why rather than clicking on any of the actual options I edited the URL to submit for choice=E, but as per the follow-up message it seems to have defaulted to the “resisting social pressure” option. Which… I guess I was doing by trying to choose an option that wasn’t present.
Huh, I’m surprised that happened. I wouldn’t have thought you’d get a message given that.
I also incorrectly got the follow-up “resisting social pressure” option. (My original choice was A)
I think answering “how should you behave when you’re sharing resources with people with different values?” is one of the projects of contractarian ethics, which is why I’m a fan.
A known problem in contractarian ethics is how people with more altruistic preferences can get screwed over by egalitarian procedures that give everyone’s preferences equal weight (like simple majority votes). For example, imagine the options in the poll were “A: give one ice cream to everyone” and “B: give two ice creams, only to the people whose names begin with consonants”. If Selfish Sally is in the minority, she’ll probably defect because she wants ice cream. When Altruistic Ally is in the minority, she reasons that more total utility is created by option B—since consonant names are in the majority, and they get twice as much ice cream—so she won’t defect and she’ll miss out on ice cream. Maybe she’s even totally fine with this outcome, because she has tuistic preferences (she prefers other people to be happy, not as a way of negotiating with them, simply as an end-in-itself) satisfied by giving Sally ice cream. But maybe this implies that, iterated over many such games, nice altruistic kind people will systematically be given less ice cream than selfish mean people! That might not be a characteristic that we want our moral system to have; we might even want to reward people for being nice.
So we could tell Ally to disregard her tuistic preference (her preference for Sally to receive ice cream as an end-in-itself) and vote like a Homo economicus, since that’s what Sally will do and we want a fair outcome for Ally. But maybe then, iterated over many games, Ally won’t be happy with the actual outcomes involved—because we’re asking her to disregard genuine altruistic preferences that she actually has, and she might be unhappy if someone else gets screwed over by that.
In this game you have an additional layer of complexity, since some people might have made their initial vote by asking, “What value do I think has the most universal benefit for everyone?” and others might have made the vote by asking, “What’s my personal favourite value?”—Those people are then facing very different moral decisions when asked, “Do you want to force your value on everyone else?”
If people who made their initial decision by considering the best value for everyone are also less likely to choose to force their value on everyone, while people who made their initial decision selfishly are also more likely to choose to force it on others, then we’d have an interesting problem. Luckily it looks similar to this existing known problem; unluckily, I don’t think the contractarians have a great solution for us yet.
First, I clicked the link in the second poll[1]. My thought process looked as follows:
I quickly skimmed the content of the message
My split-second-judgement registered that there is a RACE
Moreover, the race is on very small time scales: every second of indecision might cost me victory!
Moreover, split-second-judgment estimates that winning the race is good-in-expectation (where “expectation” should be thought of as including the “logical uncertainty” resulting from having to rely on split-second-judgement).
Therefore, click NOW before it’s too late!
Worse, even after clicking and reading the text again, I misunderstood its content. Somehow, I thought that this year’s celebration would be determined by the plurality, whereas next year’s would be determined by the fastest minority. This system is strange, but is not obviously defect-y, i.e. not obviously inferior to e.g. using plurality twice in a row, from behind the veil-of-ignorance.
Only after reading the OP and starting to compose this comment in my mind did I understand the actual meaning of the text in the second poll: that only the next year’s celebration is decided upon, and only according to a minority (if anyone in a minority clicks). Now, this is more or less clearly defect-y and in hindsight I don’t endorse clicking it.
What is my take-away lesson? The process I used to make the decision seems correct to me: if you have to make a split-second decision, then you need to use your split-second judgement because there is nothing else to go by. There might be some case for a bias towards inaction, but it’s not an overwhelming case. Personally, I know that I’m usually too slow to respond in emergency scenarios, so I don’t want to train myself to prefer inaction.
The right way to optimize this is to train your split-second judgement to do well in the sort of situations in which split-second judgement is likely to be required. The sort of reasoning required of us here is not likely to be tied to a split-second decision anywhere outside of Petrov Day games[2], so I think my split-second judgement did as well as expected and there’s nothing to correct.
[EDIT: Actually, there is a correction to be made here, and it refers to my wrong reading of the message after clicking the link. The lesson is: if I make a split-second decision, I need to carefully reexamine it after the fact, in order to understand its true consequences, and beware of anchoring on my split-second reasoning: this anchoring is probably motivated by wanting to justify myself later.]
Second, I think that going with the majority in this case is not honoring your word. You explicitly said “the first to do so out of any minority group”. If you break your word and go with the majority, I won’t completely lose my trust in you: but that’s mostly because this is a game. In a situation with more serious stakes, I expect you to take the precise meaning of your promises way more seriously, and I would be extremely disappointed if you don’t.
Third, I think this was a cool way to celebrate Petrov’s Day (modulo the issue with breaking your word, which is really bad and must not be repeated). Kudos!
[1] My choice in the first poll was “accurately reporting your epistemic state”.
[2] The actual Petrov had more time to make his decision, and also if I got Petrov’s job I would train my fast-judgement on Petrov-like situations in advance.
You make a very good point! I think I should update here. I too have been acting in haste. While in past years we spent quite a significant number of person-days on Petrov Day, this year we’ve been focused elsewhere, so this post was quickly written too. Fortunately, it gets feedback. Thanks, and I’ll update the OP to at least say I’ll need to review the decision here.
How can you honour your word at all if the premise of the link was false for more than half the respondents? There is no action that is consistent with your words.
It seems quite easy to me. Imagine me stating “The sky is purple, if you come to the party I’ll introduce you to Alice.” If you come to the party then me performing the promised introduction honours a commitment I made, even though I also lied to you.
A closer analogy is “You are an interesting person, and I will introduce the first interesting person who comes to the party to Alice”. You come to the party, you’re told that you’re the first there, but you’re not introduced to Alice because you’re not an interesting person after all. Instead they introduce the first interesting person to Alice (who for some reason only has time to meet one person).
Ah never mind, I now see what you meant. Yes in general you can narrowly honour your commitment by carrying out the action, but I mean more by “honouring your word” than just that. As I see it, someone who deliberately lies has not honoured their word, regardless of any subsequent actions that they might perform.
They’ve made two statements, one vouching that something is true, and one vouching that something will be true. Ensuring that the latter will be true does nothing to restore their loss of honour from the deliberate falsity of the former. In this case they can’t even honour the latter part, since they made a mutually exclusive promise to two different people.
Seems to me that in this case, the two are connected. If I falsely believed my group was in the minority, I might refrain from clicking the button out of a sense of fairness or deference to the majority group.
Consequently, the lie not only influenced people who clicked the button, it perhaps also influenced people who did not. So due to the false premise on which the second survey was based, it should be disregarded altogether. To not disregard would be to have obtained by fraud or trickery a result that is disadvantageous to all the majority group members who chose not to click, falsely believing their view was a minority.
I think, morally speaking, avoiding disadvantaging participants through fraud is more important than honoring your word to their competitors.
The key difference between this and the example is that there’s a connection between the lie and the promise.
I really like this takeaway, and generally like the “rationality test self-assessment” process here.
I’m the one who picked the particular virtues we displayed, and it seemed good to explain a bit more about where I was coming from.
Eliezer’s original post on Petrov Day begins:
The theme of “Reflect on the fact that the world could have been destroyed, and be the sort of people who don’t destroy the world” was a central element of most Petrov Day celebrations for years going forward. In 2019, when the LW team first built The Button experiment for LessWrong (where some users could press a button that took down the LessWrong frontpage for a day), we leaned heavily into this framing.
But throughout the years, some people have objected to this being the primary (or at least only) framing of the event. We’ve had a lot of interesting discussion each year about the True Meaning of Petrov Day.
Some people argued that it’s actually kind of confused to celebrate Petrov for the sake of “not taking dangerous unilateral actions”, when in fact it’s quite important that Petrov did act unilaterally in some sense – he violated the rules he had agreed to in favor of his own judgment. He was non-unilateralist with respect to the World At Large, but quite unilateralist with respect to his role in the Soviet military.
Some people argued that one of the most important aspects of Petrov is that he resisted social pressure, and did what he thought was right. In this frame, there was something ironic about everyone dutifully not-clicking-a-button as part of a tradition, especially if they were doing so mostly/entirely because they expected to be socially shamed if they clicked it.
Some people argued about whether it made sense to think of the thing more as a sacred ritual or as a fun game. Some people countered with “Look man, it doesn’t matter whether it’s a ritual or a game. The important thing is to notice that your action is going to destroy a bunch of value for no reason, and decide not to do that, regardless of what social role you think you’re supposed to be playing.”
In the last couple of years, I’ve personally come to believe that the skill of orienting is quite important: noticing when you’re in a new situation (or a different situation than you thought you were in) and need to reprocess your background information in light of that, deciding what your actual goals are, and then figuring out what to do in light of those goals. To me, this feels like one of the important things that Petrov did.
Finally, just last week, Jimrandomh pointed out to me that one of the noteworthy things about Petrov is that, while he didn’t relay-the-information according to what the machine said and what the rules dictated… he also was accurately representing his epistemic state to his superiors. Petrov had helped build the missile-detection-system, and he had an idea of what its limitations were. The indicator was weird (I haven’t looked up the details right now, but my recollection is that it was showing only a small number of incoming missiles, which wasn’t what you’d actually expect if the US was attacking).
We don’t know what would have happened if it had looked to Petrov like a full scale attack was inbound. But in our current timeline, when he reported “false alarm”, it’s noteworthy that that was his actual epistemic state.
...
I’m sure there are many more frames to look at Petrov Day through. The Petrov Incident was not a movie, written with a theme in mind to communicate a particular message. It was a real thing, with lots of messy details. We can choose how to look at it and think about it, and take what lessons we want.
I was interested, this year, in leaning somewhat into the fact that Petrov actually had to decide what was right. I wanted LessWrong’s Petrov Day commemoration to present users with something more ambiguous that they had to reason about themselves.
(Also, separately, the Lightcone team was very busy last week and we scrambled to put Petrov Day together at the last minute. I’m fairly happy with what we came up with at the last minute but, like, don’t wanna oversell it as particularly well thought out.)
A couple people have said “man none of these virtues carve up virtue-space the way I care about.” I’ve heard “‘not taking actions that might destroy the world’ isn’t a virtue even though it’s the most important thing”, and “the most important virtue was ‘having agency in the face of systemic pressure to follow orders’”, which the person conceptualized as sort of a mix of the “Orient quickly” and “Resist Social Pressure” virtues. (Or rather, I said “well, I think I basically split that into these two other virtues, so, uh, you gotta pick one?”, which they did grudgingly.)
Also, after shipping the initial code and initial wave of notifications, it took us a while to realize I had misspelled “Petrov” as “Petroy”. We thought about trying to abort the notification volley, but decided “well, figuring out whether the system is malfunctioning is just a kinda reasonable part of the game.”
Reasoning through the consequences of removing all virtues but the chosen one, an inability to avoid actions that lead to a noticeable increase in the chance that civilization is destroyed tended to outweigh the others in expected negatives, so I picked that. It’s also traditional, and is the one that makes any sense for the day to be about, though that wasn’t what I was asked. I also thought through how the poll link might be a red button in some way but didn’t see it.
I was quite puzzled by the second message that my choice to maintain the traditional and expected value for the site was in the minority. That did not fit with my model of basically any subset of the LessWrong community. My past experience with Petrov Day celebrations is that rash action tends not to be the thing anyone wants you to do, so I read the message trying to figure out what rash action I was being poked to do, noted the ‘unilateralism’ parameter, wondered if I was in fact ‘supposed to’ take unilateral action to not destroy civilization, figured that if that was the case someone else would do it here but it probably meant it was bad, and left it alone. And at some point probably processed that there weren’t particular negative stakes to not choosing a value, and continued leaving it alone.
This one made sense to be a red button in disguise, partly because I had a strong probability the message was false, partly because it was strange, and partly because it’s a day I expect a red button. So I avoided it.
I voted for correctly reporting your epistemic state. I claim that this is the actual virtue Petrov displayed, and that his primary virtue being “don’t take actions which destroy the world” because he decided to buck the chain of command is a mistaken belief. From the Wikipedia article:
More specifically I claim two things:
Stanislav Petrov actually believed that it was a false alarm.
Had he not believed it was a false alarm, he would have reported an attack.
The poll did not ask what virtue Petrov most displayed. It asked what virtue you think is most important.
Yes, and the virtue that is most important is the one that allowed Petrov to not doom the world. By contrast, the two most popular choices were about refusing to doom the world, and resisting social pressure, neither of which were features of the event.
If there was a poll in connection to Arkhipov, my answer might change.
I think this is bad. I mean, it’s not that big a deal, but I generally speaking expect messages I receive from The LessWrong Team to not tell falsehoods.
My metahonesty is I might hoodwink you a little on April Fools, Petrov Day, and similar.
Indeed—but maybe I was overly paranoid and I should’ve just clicked the buttons on the day.
My current moral code says “it’s okay to do things like this, as part of games, psychology study-ish-things, jokes, etc”, if you tell people the truth shortly afterwards (which we did).
Separate from the moral issue, this is the kind of trick you can only pull once. I assume that almost everyone who received the “your selected response is currently in the minority” message believed it; that will not be the case next year.
Yup, seems correct.
Hmm.
I don’t think Avoiding actions that noticeably increase the chance civilization is destroyed is necessarily the most practically-relevant virtue, for most people, but it does seem to me like it’s the point of Petrov day in particular. If we’re recognizing Petrov as a person, I’d say that was Petrov’s key virtue.
Or maybe I’d say something like “not doing very harmful acts despite incentives to do so”—I think “resisting social pressure” isn’t quite on the mark, but I think it is important to Petrov day that there were strong incentives against what Petrov did.
I think other virtues are worth celebrating, but I think I’d want to recognize them on different holidays.
I clicked the link in the second email quite quickly. I assumed it was a game/joke, and wanted to see what would happen. If I’d actually thought I was overriding people’s preferences, I… probably would have still clicked because I don’t think I place enormous value on people’s preferences for holiday reasons, and I would have enjoyed being the person who determined it.
There are definitely many circumstances where I wouldn’t unilaterally override a majority. I should probably try to figure out what the principles behind those are.
Came here to say this—I also clicked the link because I wanted to see what would happen. I wouldn’t have done it if I hadn’t already assumed it was a social experiment.
In the first poll I voted “Accurately reporting your epistemic state” because I feel like it is systemically important and sort of a foundation on which other things like not destroying the world can be built (e.g. avoiding actions that noticeably may lead to the destruction of the world is more effective if you are better at sharing information relevant to which actions do or do not destroy the world). However if I had known that it was a poll about the next Petrov day celebration, I would have voted “Avoiding actions that noticeably increase the chance that civilization is destroyed” because I feel like that’s the point of Petrov day.
A majority member being the initial clicker also isn’t terribly surprising because a group being larger means one-or-more of any given sort of person—in this case, a quick-responder-type—is likelier to crop up among them.
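To make that concrete, here is a rough sketch, assuming (hypothetically) that each respondent independently has the same small chance p of being a quick-responder type. The group sizes are taken from the poll table in the post; the value of p and the function name are made up purely for illustration.

```python
def prob_at_least_one(group_size: int, p: float) -> float:
    """Chance that a group of independent people contains at least one quick responder.

    Assumes each person is a quick responder with the same probability p,
    independently of everyone else (a simplifying assumption).
    """
    return 1.0 - (1.0 - p) ** group_size


# Hypothetical per-person probability; not a figure from the post.
P_QUICK = 0.02

# Group sizes from the first poll table: 144 for the majority virtue,
# 27 for one of the minority virtues.
majority = prob_at_least_one(144, P_QUICK)
minority = prob_at_least_one(27, P_QUICK)

print(f"majority group (144 people): {majority:.2f}")
print(f"minority group (27 people):  {minority:.2f}")
```

Under these assumptions the larger group always has the higher chance of containing at least one such person, whatever value of p (strictly between 0 and 1) you plug in.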
I would be curious to see what the poll results for Question 1 look like, say, a week from now. I only saw the message in my inbox after Petrov day was over, and still responded.
Ah, for the purposes of responding to the PMs (and writing my explanatory comment) I accepted the truth of the setup mostly without question, but it makes more sense in hindsight that the “Avoiding actions...” virtue was actually a strong majority vote.
About what to do about next Petrov Day, I mostly agree with Vanessa’s reasoning, but I don’t think it matters too much either way and don’t really think it’s about “honoring your word”.
Once the truth of the setup is violated, it’s mostly just about whether you want to do the thing that predictably would have been the result had the setup been honest.
If you had built in a special case to not send a second message to the majority-choosers at all, the result would have been that “Accurately reporting your epistemic state.” would be the winner.
If you had instead built in a special case where the majority-choosers get a slightly-modified second message (“currently in the minority” → “currently in the majority”), and “any minority group” were edited to “any group” in everyone’s second message, the result predictably would have been that “Avoiding actions...” wins.
I don’t know which of these modifications, if any, you would have actually built if you had more time to think / implement, but I don’t see any reason why you should feel particularly bound to do something a year from now by either of these counterfactuals about a 1-day game.
I selected “Quickly orienting to novel situations” (QOTNoS) because it’s strictly superior to the alternatives:
If you have the QOTNoS virtue, you can deal with the novel situation of AI destroying civilization
The necessity of “accurately reporting your epistemic state” is a novel situation for most people. QOTNoS helps again.
“Resisting social pressure” is a common situation. But if the survival of civilization depends on it (as the poll implies), this is a novel situation. Thus, QOTNoS will help with it.
In essence, QOTNoS (as in being able to make the right decisions in novel situations) is a synonym for general intelligence, and thus is the strongest power.
Yeah I also personally picked ‘quickly orient’ for these reasons.
Okay, I’m one of the unilateralists, and I totally didn’t think of the poll as a “value” choice. I perceived the poll as if it were: “Which ability is the most important to prevent destruction of the world? 1. INT 2. WIS 3. CHA 4. LUK”, and the question “which virtue has the most impact on the chance of the world not being destroyed” is a factual question, not a value question. If I had thought about the poll more seriously, I would have considered “I’m in the minority on a factual question on a site with people smarter than me, maybe I’m wrong”, but I was pretty confident and mostly took it as a fun activity, not a Very Serious thing.
Also, it may be a matter of language differences: English is not my native language, so maybe we have different connotations around the word “virtue”.
To me, the initial poll options make no sense without each other. For example, “avoid danger” and “communicate beliefs” don’t make sense without each other [in context of society].
If people can’t communicate (report epistemic state), “avoid danger” may not help or be based on 100% biased opinions on what’s dangerous.
If some people solve Alignment, but don’t communicate, humanity may perish due to not building a safe AGI.
If nobody solves Alignment, but nobody communicates about Alignment, humanity may perish because careless actors build an unsafe AGI without even knowing they do something dangerous.
I like communication, so I chose the second option. Even though “communicating without avoiding danger” doesn’t make sense either.
Since the poll options didn’t make much sense to me, I didn’t see myself as “facing alien values” or “fighting off babyeaters”. I didn’t press the link, because I thought it may “blow up” the site (similar to the previous Petrov’s Day) + I wasn’t sure it’s OK to click, I didn’t think my unilateralism would be analogous to Petrov’s unilateralism (did Petrov cure anyone’s values, by the way?). I decided it’s more Petrov-like to not click.
But is AGI (or anything else) related to the lessons of Petrov’s Day? That’s another can of worms. I think we should update the lessons of the past to fit the future situations. I think it doesn’t make much sense to take away from Petrov’s Day only lessons about “how to deal with launching nukes”.
Another consideration: Petrov did accurately report his epistemic state. Or would have, if it were needed (if it were needed, he would lie to accurately report his epistemic state—“there are no launches”). Or “he accurately non-reported the non-presence of nuclear missiles”.
I never registered that I was being asked to make a decision on behalf of the whole site. I might have considered that interpretation of your second question for a fraction of a second, but then dismissed it as extremely unlikely to be your actual intent.
I spent probably less than 3 seconds on the second question (message). To get me to spend more time than that, your first 25 words or so would have needed to include some sign that something significant was at stake in my response. “I’m prepared to send you 5 dollars,” would have done it. So would signs that you sent the message only to me (or only to me and 4 other people) rather than to everyone on the site. (I was able to deduce the latter somehow.)
I didn’t mention this in the main post or say it until now because I fear that I’m saying it defensively in response to criticism, but to model the design for this year you need to know that we spent vastly less time on it, deciding to do something at the last minute. (We’d been very busy with a massive conference in the days before Petrov Day.)
At 11am (US West Coast time) we started thinking there was something we could maybe do, and at 12pm we got started. I felt we needed to rush if we were to include European folks at all, so really was looking for something we could get done quickly. As the post mentions, we didn’t spend much time on the poll options or trying to design it well, we just wanted something out so half the people wouldn’t be completely excluded.
The second message idea wasn’t even chosen until about half an hour before I sent it. We basically sent the first message and then worked on figuring out how to build on it, and then “next year’s will be decided based on this” was an 11th hour insight. It gives it some stakes without being overwhelming stakes.
Could we have done something better with more time and effort? For sure. But I think this was better than letting Petrov Day pass without any kind of commemoration.
Registering my predictions for which groups clicked the second link most:
Percentagewise, I don’t think Groups A and C clicked on it that much (though I’d be surprised if the number from either group were zero), since they picked a choice that indicates that they care about making high-quality decisions and cooperating with the rest of the world. A higher proportion of C probably clicked than A, since a person might decide it’s worth it even if they take their time to think it through (I’d disagree, but the commenter you quote fits into that category).
I’d then say the “accurately reporting your epistemic beliefs” group probably clicked on it the most because I don’t model ⌞the kind of person who’d say that is the important trait of Petrov day⌝ as being a particularly ethical person
This is not responding to the interesting part of the post, but I did not vote in the poll because I felt like virtue A was a mangled form of the thing I care about for Petrov Day, and non-voting was the closest I could come to fouling my ballot in protest.
To me Petrov Day is about having a button labeled “destroy world” and choosing not to press it. Virtue A as described in the poll is about having a button labeled “maybe destroy world, I dunno, are you feeling lucky?” and choosing not to press it. This is a different definition which seems to have been engineered so that a holiday about avoiding certain doom can be made compatible with avoiding speculative doom due to, for instance, AI.
I would prefer that Petrov Day gets to be about Petrov, and “please Sam Altman, don’t risk turning the world into paperclips” gets a different day if there is demand for such a thing.
It feels fairly important to me that in real life, Petrov had a “maybe destroy world, are you feeling lucky?” button. (It sounds like we disagree on this?)
Like, relaying information up the chain of command a) doesn’t automatically mean that they launch a full scale counterattack, b) that doesn’t mean the US automatically launches a full scale counterattack, c) my current belief is that full scale nuclear war probably cripples the northern hemisphere but doesn’t literally kill all humans (which is what I think most people mean by ‘destroy the world’)
Petrov was not the last link in the chain of launch authorization which means that his action wasn’t guaranteed to destroy the world since someone further down the chain might have cast the same veto he did. So technically yes, Petrov was pushing a button labeled “destroy the world if my superior also thinks these missiles are real, otherwise do nothing”. For this reason I think Vasily Arkhipov day would be better, but too late to change now.
But I think that if the missiles had been launched, that destroys the world (which I use as shorthand for destroying less than literally all humans, as in “The game Fallout is set in the year 2161 after the world was destroyed by nuclear war”), and there is a very important difference between Petrov evaluating the uncertainty of “this is the button designed to destroy the world, which technically might get vetoed by my boss” and e.g. a nuclear scientist who has model uncertainty about the physics of igniting the planet’s atmosphere (which yes, actual scientists ruled out years before the first test, but the hypothetical scientist works great for illustrative purposes). In Petrov’s case, nothing good can ever come of hitting the button except perhaps selfishly, in that he might avoid personal punishment for failing in his button-hitting duties.
(I edited in more reply you may want to respond to. I think the button wasn’t actually designed to “destroy world”, it was designed to launch a counterattack. Petrov did seem to think it would based on some other quotes of his, but, like, AFAICT he was wrong. I think this is also true for Arkhipov.)
Granting for the sake of argument that launching the missiles might not have triggered full-scale nuclear war, or that one might wish to define “destroy the world” in a way that is not met by most full-scale nuclear wars, I am still dissatisfied with virtue A because I think an important part of Petrov’s situation was that whatever you think the button did, it’s really hard to find an upside to pushing it, whereas virtue A has been broadened to cover situations that are merely net bad, but where one could imagine arguments for pushing the button. My initial post framing it in terms of certainty may have been poorly phrased.
There is an upside to being the kind of person who will press the button in retaliation. You hope never to, but the fact that you credibly would allows for MAD game theory to apply. (FDT, etc. etc.)
Was everyone supposed to receive the second message? I only got the poll and not the second message with the unilateralist link.
Ah I see:
I modestly propose that eating babies is more likely to have good outcomes, including with regard to the likelihood of apocalypse, compared to the literal stated goal of avoiding the apocalypse.
This seems like a fairly hot take on a throwaway tangent in the parent post, so I’m very confused why you posted it. My current top contender is that it was a joke I didn’t get, but I’m very low confidence in that.
The parent post amusingly equated “accurately communicating your epistemic status”, which is the value I selected in the poll, with eating babies. So I adopted that euphemism (dysphemism?) in my tongue-in-cheek response.
Also, this: https://en.wikipedia.org/wiki/A_Modest_Proposal
A Review of Petrov Day 2023, according to the four virtues. First a check on the Manifold predictions for the day:
Will LessWrong Petrov Day 2023 be about blowing up home page(s)? - resolved NO, was 70%
Will Less Wrong’s big red button be used on Petrov Day 2023? - resolved NO, was 50%.
Avoiding actions that noticeably increase the chance that civilization is destroyed
LessWrong avoided creating a big red button that represents destroying civilization. This is symbolic of Virtue A actions like “don’t create nuclear weapons” and “don’t create a superintelligence” and “don’t create the torment nexus”. Given that LessWrong has failed on this virtue in past Petrov Days, I am glad to see this. Manifold had a 70% conditional chance that if the button was created then it would be used.
Rating: 10⁄10.
Accurately reporting your epistemic state
The following sentences in the second poll appear to be false:
After some discussion, the LessWrong team has decided… (false, still not decided today)
Your selected response is currently in the minority (false for 58% of recipients)
If you click the below link and are the first to do so of any minority group, we will make your selected virtue be the focus of next year’s commemoration. (not known to be true at the time it was sent, still not decided)
This is symbolic of actions lacking Virtue B like data fraud, social engineering and lazy bullshit. I don’t think much of the excuses given.
Rating: 0⁄10
Other virtues
Quickly orienting to novel situations. This was not a novel situation, it happens every year on the same day. Not applicable.
Resisting social pressure. Judging from the comments, there was little social pressure to have a big red button. There was social pressure to do something and something was done. Overall, unclear, no rating.
Predictions
This was fun, thank you!
There must be a hiccup in the data because you show < 30 total group b pickers in the first chart, but say there are 40 group b pickers in the “unilaterally make your virtue the focus” chart.
Oh, good catch. I had the rows on the denominator sorted wrong so that table was 75% wrong. Fixed now...
Rule 1: Don’t destroy the world
Rule 0: Cultivate those virtues that will stand you in good stead when you are faced with an opportunity to not destroy the world.
Virtue A is the most important outcome, but B, C, and D are part of what it takes to achieve A.