(others have said part of what I wanted to say, but didn’t quite cover the thing I was worried about)
I see two potential objections:
how valuable is trust among LW users? (this is hard to quantify, but I think it is potentially quite high)
how persuasive should “it’s better than for someone to die”-type arguments be?
My immediate thoughts are mostly about the second argument.
I think it’s quite dangerous to leave oneself vulnerable to the second argument (for reasons Julia discusses on givinggladly.com in various posts). Yes, you can reflect upon whether every given cup of coffee is worth the dead-child-currency it took to buy it. But taken naively this is emotionally and cognitively exhausting. (It also pushes people towards a kind of frugality that isn’t actually that beneficial.) The strategy of “set aside a budget for charity, based on your values, and don’t feel pressure to give more after that” seems really important for living sanely while altruistic.
(I don’t have a robustly satisfying answer on how to deal with that exactly, but see this comment of mine for some more expanded thoughts of mine on this)
Now, it still seems fine to be willing to make additional counterfactual donations on the fly – I’ve derived fuzzy-pleasure-joy from donating based on weird schemes on the Dank EA Memes FB group. But I think it is quite dangerous to feel pressure to donate to weird Dank EA Meme schemes based on “a life is at stake.”
A life is always at stake. I don’t think most humans can or should live this way.
The strategy of “set aside a budget for charity, based on your values, and don’t feel pressure to give more after that” seems really important for living sanely while altruistic.
But this situation isn’t like that.
I agree you don’t want to always be vulnerable to the second argument, for the reasons you give. I don’t think the appropriate response is to be so hard-set in your ways that you can’t take advantage of new opportunities that arise. You can in fact compare whether or not a particular trade is worth it if the situation calls for it, and a one-time situation that has an upside of $1672 for ~no work seems like such a situation.
As a meta point directed more at the general conversation than this comment in particular, I would really like it if people stated monetary values at which they would think this was a good idea. At $10, I’m at “obviously not”, and at $1 million, I’m at “obviously yes”. I think the range of uncertainty is something like $500 - $20,000. Currently it feels like the building of trust is being treated as a sacred value; this seems bad.
My sense is that it’s very unlikely to be worth it at anything below $10k, and I might be a bit tempted at around $50k, though still quite hesitant. I agree that at $1M it’s very likely worth it.
Firm disagree. Second-order and third-order effects go limit->infinity here.
Also btw, I’m running a startup that’s now looking at — best case scenario — handling significant amounts of money over multiple years.
It makes me realize that “a lot of money” on the individual level is a terrible heuristic. Seriously, it’s hard to get one’s mind around it, but a million dollars is decidedly not a lot of money on the global scale.
For further elaboration, this is relevant and incredibly timely: https://slatestarcodex.com/2019/09/18/too-much-dark-money-in-almonds/
LW frontpage going down is also not particularly bad, so you don’t need much money to compensate for it.
If you wanted to convince me, you could make a case that destroying trust is really bad, and that in this particular case pressing the button would destroy a lot of trust, but that case hasn’t really been made.
LW frontpage going down is also not particularly bad [...] If you wanted to convince me, you could make a case that destroying trust is really bad
Umm, respectfully, I think this is extremely arrogant. Dangerously so.
Anyways, I’m being blunt here, but in a way I think is respectful and hopefully useful. Think about this. Reasoning follows —
The instructions if you got launch codes (also in the above post) were as such (emphasis added with underline) —
“Every Petrov Day, we practice not destroying the world. One particular way to do this is to practice the virtue of not taking unilateralist action.
It’s difficult to know who can be trusted, but today I have selected a group of LessWrong users who I think I can rely on in this way. You’ve all been given the opportunity to show yourselves capable and trustworthy.
[...]
This Petrov Day, between midnight and midnight PST, if you, {{username}}, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours.
I hope to see you on the other side of this, with our honor intact.”
So, to Ben Pace at least (the developer who put in a tremendous amount of hours and thought into putting this together), it represents...
* “practicing not destroying the world”
* “practicing the virtue of not taking unilateralist action”
* implications around his own uncertainty of who to trust
* de facto for Ben that he can’t rely on you personally, by his standards, if you do it
* showing yourself not “capable and trustworthy” by his standards
* having the total group’s “honor” “not be intact”, under Ben’s conception
And you want me to make a case for you on a single variable while ignoring the rather clear and straightforward written instructions for your own simple reductive understanding?
For Ben at least, the button thing was a symbolic exercise analogous to not nuking another country and he specifically asked you not to and said he’s trusting you.
So, no, I don’t want to “convince you” nor “make a case that destroying trust is really bad.” You’re literally stating you should set the burden of proof and others should “make a case.”
In an earlier comment you wrote,
You can in fact compare whether or not a particular trade is worth it if the situation calls for it, and a one-time situation that has an upside of $1672 for ~no work seems like such a situation.
“No work”? You mean aside from the work that Ben and the team did (a lot) and demonstrating to the world at large that the rationality community can’t press a “don’t destroy our own website” button to celebrate a Soviet soldier who chose restraint?
I mean, I don’t even want to put numbers on it, but if we gotta go to “least common denominator”, then $1672 is less than a week’s salary of the median developer in San Francisco. You’d be doing a hell of a lot more damage than that to morale and goodwill, I reckon, among the dev team here.
To be frank, I think the second-order and third-order effects of this project going well on Ben Pace alone are worth more than $1672 in “generative goodness” or whatever, and the potential disappointment and loss of faith in people he “thinks but is uncertain he can rely upon and trust” is… I mean, you know that one highly motivated person leading a community can make an immense difference, right?
Just so you can get $1672 for charity (“upside”) with “~no work”?
And that’s just productivity, ignoring any potential negative affect or psychological distress, and being forced to reevaluate who he can trust. I mean, to pick a more taboo example, how many really nasty personal insults would you shout at a random software developer for $1672 to charity? That’s almost “no work” — it’s just you shouting some words, and whatever trivial psychological distress they feel; and I wager the distress of getting random insults from a stranger is much lower than that of having people you “are relying on and trusting” press a “don’t nuke the world simulator” button.
Like, if you just read what Ben wrote, you’d realize that risking destroying goodwill and faith in a single motivated innovative person alone should be priced well over $20k. I wouldn’t have done it for $100M going to charity. Seriously.
If you think that’s insane, stop and think about why our numbers are four orders of magnitude apart — our priors must be very different. And based on the comments, I’m taking into account more things than you, so you might be missing something really important.
(I could go on forever about this, but here’s one more: what’s the difference in your expected number of people discovering and getting into basic rationality, cognitive biases, and statistics between “failed at the ‘not destroying the world day’ commemoration” and not? Mine: high. What’s the value of more people thinking and acting rationally? Mine: high. So multiply the delta by the value. That’s just one more thing. There’s a lot you’re missing. I don’t mean this disrespectfully, but maybe think more instead of “doing you” on a quick timetable?)
(Here’s another one you didn’t think about: we’re celebrating a Soviet engineer. Run this headline in a Russian newspaper: “Americans try to celebrate Stanislav Petrov by not pressing ‘nuke their own website’ button, arrogant American pushes button because money isn’t donated to charity.”)
(Here’s another one you didn’t think about: I’ll give anyone 10:1 odds this is cited in a mainstream political science journal within 15 years, which are read by people who both set and advise on policy, and that “group of mostly American and European rationalists couldn’t not nuke their own site” absolutely is the type of thing to shape policy discussions ever-so-slightly.)
(Here’s another one you didn’t think about: some fraction of the people here are active-duty or reserve military in various countries. How does this going one way or another shape their kill/no-kill decisions in ambiguous warzones? Have you ever read any military memoirs about people who had to make those calls quickly, e.g. overwatch snipers in Mogadishu? No?)
(Not meant to be snarky — Please think more and trust your own intuition less.)
Thanks for writing this up. It’s pretty clear to me that you aren’t modeling me particularly well, and that it would take a very long time to resolve this, which I’m not particularly willing to do right now.
I’ll give anyone 10:1 odds this is cited in a mainstream political science journal within 15 years, which are read by people who both set and advise on policy
I’ll take that bet. Here’s a proposal: I send you $100 today, and in 15 years if you can’t show me an article in a reputable mainstream political science journal that mentions this event, then you send me an inflation-adjusted $1000. This is conditional on finding an arbiter I trust (perhaps Ben) who will:
1. Adjudicate whether it is an “article in a reputable mainstream political science journal that mentions this event”
2. Compute the inflation-adjusted amount, should that be necessary
3. Vouch that you are trustworthy and will in fact pay in 15 years if I win the bet.
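For concreteness, the “inflation-adjusted $1000” term is simple arithmetic: scale the nominal amount by the change in a price index between the two dates. A minimal sketch, with hypothetical CPI values (the arbiter would substitute the real figures):

```python
def inflation_adjusted(amount: float, cpi_at_bet: float, cpi_at_settlement: float) -> float:
    """Scale a nominal dollar amount by the change in a price index (e.g. CPI)."""
    return amount * (cpi_at_settlement / cpi_at_bet)

# Hypothetical index values: 256.0 at bet time, 332.8 fifteen years later
# (i.e. 30% cumulative inflation), so $1000 nominal becomes $1300.
payout = inflation_adjusted(1000, 256.0, 332.8)
print(round(payout, 2))  # 1300.0
```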
If you wanted to convince me, you could make a case that destroying trust is really bad, and that in this particular case pressing the button would destroy a lot of trust, but that case hasn’t really been made.
This basically seems right to me.
Which part of the two statements? That destroying trust is really bad, or that the case hasn’t been made?
That this particular case would destroy a lot of trust.
This seemed to me like a fun game with stakes of social disapproval on one side, and basically no stakes on the other. This doesn’t seem like it has much bearing on the trustworthiness of members of the rationality community in situations with real stakes, where there is a stronger temptation to defect, or it would have more of a cost on the community.
I guess implicit to what I’m saying is that the front page being down for 24 hours doesn’t seem that bad to me. I don’t come to Less Wrong most days anyway.
But this is not a one-time situation. If you’re a professional musician, would you agree to mess up at every dress rehearsal, because it isn’t the real show?
More indirectly… the whole point of “celebrating and practicing our ability to not push buttons” is that we need to be able to not push buttons, even when it seems like a good idea (or necessary, or urgent that we defect while we can still salvage the perceived situation). The vast majority of people aren’t tempted by pushing a button when pushing it seems like an obviously bad idea. I think we need to take trust building seriously, and practice the art of actually cooperating. Real life doesn’t grade you on how well you understand TDT considerations and how many blog posts you’ve read on it, it grades you on whether you actually can make the cooperation equilibrium happen.
Rohin argues elsewhere for taking a vote (at least in principle). If 50% vote in favor, then he has successfully avoided “falling into the unilateralist’s curse” and has gotten $1.6k for AMF. He even gets some bonus for solving the unilateralist’s curse in a way that’s not just “sit on his hands”. Now, it’s probably worth subtracting points for “the LW team asked them not to blow up the site and the community decided to anyway.” But I’d consider it fair play.
If you’re a professional musician, would you agree to mess up at every dress rehearsal, because it isn’t the real show?
Depends on the upside.
I think we need to take trust building seriously, and practice the art of actually cooperating.
This comment of mine was meant to address the claim “people shouldn’t be too easily persuaded by arguments about people dying” (the second claim in Raemon’s comment above). I agree that intuitions like this should push up the size of the donation you require.
More indirectly… the whole point of “celebrating and practicing our ability to not push buttons” is that we need to be able to not push buttons, even when it seems like a good idea (or necessary, or urgent that we defect while we can still salvage the perceived situation). The vast majority of people aren’t tempted by pushing a button when pushing it seems like an obviously bad idea.
As jp mentioned, I think the ideal thing to do is: first, each person figures out whether they personally think the plan is positive / negative, and then go with the majority opinion. I’m talking about the first step here. The second step is the part where you deal with the unilateralist curse.
Real life doesn’t grade you on how well you understand TDT considerations and how many blog posts you’ve read on it, it grades you on whether you actually can make the cooperation equilibrium happen.
It seems to me like the algorithm people are following is: if an action would be unilateralist, and there could be disagreement about its benefit, don’t take the action. This will systematically bias the group towards inaction. While this is fine for low-stakes situations, in higher-stakes situations where the group can invest effort, you should actually figure out whether it is good to take the action (via the two-step method above). We need to be able to take irreversible actions; the skill we should be practicing is not “don’t take unilateralist actions”, it’s “take unilateralist actions only if they have an expected positive effect after taking the unilateralist curse into account”.
We never have certainty, not for anything in this world. We must act anyway, and deciding not to act is also a choice. (Source)
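The two-step procedure described above (first, each person independently judges whether the action is net-positive; second, go with the majority rather than letting any single enthusiast act) can be sketched as a toy calculation. The names and numbers here are invented purely for illustration:

```python
def should_act(individual_estimates):
    """Step 1: each person independently estimates the action's net value.
    Step 2: act only if a majority of those independent judgments are positive,
    rather than letting any single positive judgment trigger the action."""
    votes_in_favor = sum(1 for estimate in individual_estimates if estimate > 0)
    return votes_in_favor > len(individual_estimates) / 2

# Nine people estimate the action's net value (arbitrary utility units).
# One enthusiastic outlier is not enough: a unilateralist would act, the group does not.
estimates = [-2, -1, -1, -1, 0, -1, -2, -1, 5]
print(should_act(estimates))  # False
```

The point of the aggregation step is exactly the unilateralist’s-curse correction: the most optimistic individual estimate is the one most likely to be an overestimate, so the group decision should not be driven by it.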
It seems to me like the algorithm people are following is: if an action would be unilateralist, and there could be disagreement about its benefit, don’t take the action. This will systematically bias the group towards inaction. While this is fine for low-stakes situations, in higher-stakes situations where the group can invest effort, you should actually figure out whether it is good to take the action (via the two-step method above). We need to be able to take irreversible actions; the skill we should be practicing is not “don’t take unilateralist actions”, it’s “take unilateralist actions only if they have an expected positive effect after taking the unilateralist curse into account”.
I don’t disagree with this, and am glad to see reminders to actually evaluate different courses of action besides the one expected of us. My comment was more debating your own valuation as being too low, it not being a one-off event once you consider scenarios either logically or causally downstream of this one, and just a general sense that you view the consequences of this event as quite isolated.
my comment was more debating your own valuation as being too low, it not being a one-off event once you consider scenarios either logically or causally downstream of this one
That makes sense. I don’t think I’m treating it as a one-off event; it’s more that it doesn’t really seem like there’s much damage to the norm. If a majority of people thought it was better to take the counterfactual donation, it seems like the lesson is “wow, we in fact can coordinate to make good decisions”, as opposed to “whoops, it turns out rationalists can’t even coordinate on not nuking their own site”.
jkaufman’s initial offer was unclear. I read it (incorrectly) as “I will push the button (/release the codes) unless someone gives AMF $1672 counterfactually”, not as “if someone is willing to pay me $1672, I will give them the codes”. Read in the first way, Raemon’s concerns about “pressure” as opposed to additional donations made on the fly may be clearer; it’s not about jkaufman’s opportunity to get $1672 in donations for no work, it’s about everyone else being extorted for an extra $1672 to preserve their values.
Perhaps a nitpick, but I feel like the building of trust is being treated less as a sacred value, and more as a quantity of unknown magnitude, with some probability that that magnitude could be really high (at least >$1672, possibly orders of magnitude higher). Doing a Fermi is a trivial inconvenience that I for one cannot handle right now; since it is a weekday, maybe others feel much the same.
I agree that your comment takes this (very reasonable) perspective. It didn’t seem to me like any other comment was taking this perspective, but perhaps that was their underlying model.
Why doesn’t the United States threaten to nuke everyone if they don’t give a very reasonable 20% of their GDP per year to fund X-Risk — or whatever your favorite worthwhile projects are?
Screw it, why don’t we set the bar at 1%?
Imagine you’re advising the U.S. President (it’s Donald Trump right now, incidentally). Who should President Trump threaten with nuking if they don’t pay up to fund X-Risk? How much?
Now, let’s say 193 countries do it, and $X trillion is coming in and doing massive good.
Only Switzerland and North Korea defect. What do you do? Or rather, what do you advise Donald Trump to do?
Note to self: Does lighthearted dark humor highlighting risk increase or decrease chances of bad things happening?
Initial speculation: it might have an inverted response curve. One or two people making the joke might increase gravity, everyone joking about it might change norms and salience.
I noticed after playing a bunch of games of a mafia-type game with some rationalists that when people made edgy jokes about being in the mob or whatever, they were more likely to end up actually being in the mob.
(others have said part of what I wanted to say, but didn’t quite cover the thing I was worried about)
I see two potential objections:
how valuable is trust among LW users? (this is hard to quantify, but I think it is potentially quite high)
how persuasive should “it’s better than for someone to die” type arguments.
My immediate thoughts are mostly about the second argument.
I think it’s quite dangerous to leave oneself vulnerable to the second argument (for reasons Julia discusses on givinggladly.com in various posts). Yes, you can reflect upon whether every given cup of coffee is worth the dead-child-currency it took to buy it. But taken naively this is emotionally cognitively exhausting. (It also pushes people towards a kind of frugality that isn’t actually that beneficial). The strategy of “set aside a budget for charity, based on your values, and don’t feel pressure to give more after that” seems really important for living sanely while altruistic.
(I don’t have a robustly satisfying answer on how to deal with that exactly, but see this comment of mine for some more expanded thoughts of mine on this)
Now, additional counterfactual donations still seem fine to be willing to make on the fly – I’ve derived fuzzy-pleasure-joy from donating based on weird schemes on the Dank EA Memes FB group. But I think it is quite dangerous to feel pressure to donate to weird Dank EA Meme schemes based on “a life is at stake.”
A life is always at stake. I don’t think most humans can or should live this way.
But this situation isn’t like that.
I agree you don’t want to always be vulnerable to the second argument, for the reasons you give. I don’t think the appropriate response is to be so hard-set in your ways that you can’t take advantage of new opportunities that arise. You can in fact compare whether or not a particular trade is worth it if the situation calls for it, and a one-time situation that has an upside of $1672 for ~no work seems like such a situation.
As a meta point directed more at the general conversation than this comment in particular, I would really like it if people stated monetary values at which they would think this was a good idea. At $10, I’m at “obviously not”, and at $1 million, I’m at “obviously yes”. I think the range of uncertainty is something like $500 - $20,000. Currently it feels like the building of trust is being treated as a sacred value; this seems bad.
My sense is that it’s very unlikely to be worth it at anything below $10k, and I might be a bit tempted at around $50k, though still quite hesitant. I agree that at $1M it’s very likely worth it.
Firm disagree. Second-order and third-order effects go limit->infinity here.
Also btw, I’m running a startup that’s now looking at — best case scenario — handling significant amounts of money over multiple years.
It makes me realize that “a lot of money” on the individual level is a terrible heuristic. Seriously, it’s hard to get one’s mind around it, but a million dollars is decidedly not a lot of money on the global scale.
For further elaboration, this is relevant and incredibly timely:
https://slatestarcodex.com/2019/09/18/too-much-dark-money-in-almonds/
LW frontpage going down is also not particularly bad, so you don’t need much money to compensate for it.
If you wanted to convince me, you could make a case that destroying trust is really bad, and that in this particular case pressing the button would destroy a lot of trust, but that case hasn’t really been made.
Umm, respectfully, I think this is extremely arrogant. Dangerously so.
Anyways, I’m being blunt here, but I think respectful and hopefully useful. Think about this. Reasoning follows —
The instructions if you got launch codes (also in the above post) were as such (emphasis added with underline) —
“Every Petrov Day, we practice not destroying the world. One particular way to do this is to practice the virtue of not taking unilateralist action.
It’s difficult to know who can be trusted, but today I have selected a group of LessWrong users who I think I can rely on in this way. You’ve all been given the opportunity to show yourselves capable and trustworthy.
[...]
This Petrov Day, between midnight and midnight PST, if you, {{username}}, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours.
I hope to see you on the other side of this, with our honor intact.”
So, to Ben Pace at least (the developer who put in a tremendous amount of hours and thought into putting this together), it represents...
*”practicing not destroying the world”
*”practicing the virtue of not taking unilateralist action”
*implications around his own uncertainty of who to trust
*de facto for Ben that he can’t rely on you personally, by his standards, if you do it
*showing yourself not “capable and trustworthy” by his standards
*having the total group’s “honor” “not be intact”, under Ben’s conception
And you want me to make a case for you on a single variable while ignoring the rather clear and straightforward written instructions for your own simple reductive understanding?
For Ben at least, the button thing was a symbolic exercise analogous to not nuking another country and he specifically asked you not to and said he’s trusting you.
So, no, I don’t want to “convince you” nor “make a case that destroying trust is really bad.” You’re literally stating you should set the burden of proof and others should “make a case.”
In an earlier comment you wrote,
“No work”? You mean aside from the work that Ben and the team did (a lot) and demonstrating to the world at large that the rationality community can’t press a “don’t destroy our own website” button to celebrate a Soviet soldier who chose restraint?
I mean, I don’t even want to put numbers on it, but if we gotta go to “least common denominator”, then $1672 is less than a week’s salary of the median developer in San Francisco. You’d be doing a hell of a lot more damage than that to morale and goodwill, I reckon, among the dev team here.
To be frank, I think the second-order and third-order effects of this project going well on Ben Pace alone is worth more than $1672 in “generative goodness” or whatever, and the potential disappointment and loss of faith in people he “thinks but is uncertain he can rely upon and trust” is… I mean, you know that one highly motivated person leading a community can make an immense difference right?
Just so you can get $1672 for charity (“upside”) with “~no work”?
And that’s just productivity, ignoring any potential negative affect or psychological distress, and being forced to reevaluate who he can trust. I mean, to pick a more taboo example, how many really nasty personal insults would you shout at a random software developer for $1672 to charity? That’s almost “no work” — it’s just you shouting some words, and whatever trivial psychological distress they feel, and I wager getting random insults from a stranger is much lower than having people you “are relying on and trusting” press a “don’t nuke the world simulator button.”
Like, if you just read what Ben wrote, you’d realize that risking destroying goodwill and faith in a single motivated innovative person alone should be priced well over $20k. I wouldn’t have done it for $100M going to charity. Seriously.
If you think that’s insane, stop and think why our numbers are four orders of magnitude apart — our priors must be obviously very different. And based on the comments, I’m taking into account more things than you, so you might be missing something really important.
(I could go on forever about this, but here’s one more: what’s the difference in your expected number of people discovering and getting into basic rationality, cognitive biases, and statistics with pressing the “failed at ‘not destroying the world day’ commemoration” vs not? Mine: high. What’s the value of more people thinking and acting rationally? Mine: high. So multiply the delta by the value. That’s just one more thing. There’s a lot you’re missing. I don’t mean this disrespectfully, but maybe think more instead of “doing you” on a quick timetable?)
(Here’s another one you didn’t think about: we’re celebrating a Soviet engineer. Run this headline in a Russian newspaper: “Americans try to celebrate Stanislav Petrov by not pressing ‘nuke their own website’ button, arrogant American pushes button because money isn’t donated to charity.”)
(Here’s another one you didn’t think about: I’ll give anyone 10:1 odds this is cited in a mainstream political science journal within 15 years, which are read by people who both set and advise on policy, and that “group of mostly American and European rationalists couldn’t not nuke their own site” absolutely is the type of thing to shape policy discussions ever-so-slightly.)
(Here’s another one you didn’t think about: some fraction of the people here are active-duty or reserve military in various countries. How does this going one way or another shape their kill/no-kill decisions in ambiguous warzones? Have you ever read any military memoirs about people who made to make those calls quickly, EX overwatch snipers in Mogadishu? No?)
(Not meant to be snarky — Please think more and trust your own intuition less.)
Thanks for writing this up. It’s pretty clear to me that you aren’t modeling me particularly well, and that it would take a very long time to resolve this, which I’m not particularly willing to do right now.
I’ll take that bet. Here’s a proposal: I send you $100 today, and in 15 years if you can’t show me an article in a reputable mainstream political science journal that mentions this event, then you send me an inflation-adjusted $1000. This is conditional on finding an arbiter I trust (perhaps Ben) who will:
Adjudicate whether it is an “article in a reputable mainstream political science journal that mentions this event”
Compute the inflation-adjusted amount, should that be necessary
Vouch that you are trustworthy and will in fact pay in 15 years if I win the bet.
This basically seems right to me.
Which part of the two statements? That destroying trust is really bad, or that the case hasn’t been made?
That this particular case would destroy a lot of trust.
This seemed to me like a fun game with stakes of social disapproval on one side, and basically no stakes on the other. This doesn’t seem like it has much bearing on the trustworthiness of members of the rationality community in situations with real stakes, where there is a stronger temptation to defect, or it would have more of a cost on the community.
I guess implicit to what I’m saying is that the front page being down for 24 hours doesn’t seem that bad to me. I don’t come to Less Wrong most days anyway.
But this is not a one-time situation. If you’re a professional musician, would you agree to mess up at every dress rehearsal, because it isn’t the real show?
More indirectly… the whole point of “celebrating and practicing our ability to not push buttons” is that we need to be able to not push buttons, even when it seems like a good idea (or necessary, or urgent that we defect while we can still salvage the the percieved situation). The vast majority of people aren’t tempted by pushing a button when pushing it seems like an obviously bad idea. I think we need to take trust building seriously, and practice the art of actually cooperating. Real life doesn’t grade you on how well you understand TDT considerations and how many blog posts you’ve read on it, it grades you on whether you actually can make the cooperation equilibrium happen.
Rohin argues elsewhere for taking a vote (at least in principal). If 50% vote in favor, then he has successfully avoided “falling into the unilateralist’s curse” and has gotten $1.6k for AMF. He even has some bonus for “solved the unilateralist’s curse in a way that’s not just “sit on his hands”. Now, it’s probably worth subtracting points for “the LW team asked them not to blow up the site and the community decided to anyway.” But I’d consider it fair play.
Depends on the upside.
This comment of mine was meant to address the claim “people shouldn’t be too easily persuaded by arguments about people dying” (the second claim in Raemon’s comment above). I agree that intuitions like this should push up the size of the donation you require.
As jp mentioned, I think the ideal thing to do is: first, each person figures out whether they personally think the plan is positive / negative, and then go with the majority opinion. I’m talking about the first step here. The second step is the part where you deal with the unilateralist curse.
It seems to me like the algorithm people are following is: if an action would be unilateralist, and there could be disagreement about its benefit, don’t take the action. This will systematically bias the group towards inaction. While this is fine for low-stakes situations, in higher-stakes situations where the group can invest effort, you should actually figure out whether it is good to take the action (via the two-step method above). We need to be able to take irreversible actions; the skill we should be practicing is not “don’t take unilateralist actions”, it’s “take unilateralist actions only if they have an expected positive effect after taking the unilateralist curse into account”.
We never have certainty, not for anything in this world. We must act anyway, and deciding not to act is also a choice. (Source)
I don’t disagree with this, and am glad to see reminders to actually evaluate different courses of action besides the one expected of us. My comment was more about debating your own valuation as too low, about this not being a one-off event once you consider scenarios logically or causally downstream of it, and about a general sense that you view the consequences of this event as quite isolated.
That makes sense. I don’t think I’m treating it as a one-off event; it’s more that it doesn’t really seem like there’s much damage to the norm. If a majority of people thought it was better to take the counterfactual donation, it seems like the lesson is “wow, we in fact can coordinate to make good decisions”, as opposed to “whoops, it turns out rationalists can’t even coordinate on not nuking their own site”.
jkaufman’s initial offer was unclear. I read it (incorrectly) as “I will push the button (/release the codes) unless someone gives AMF $1672 counterfactually”, not as “if someone is willing to pay me $1672, I will give them the codes”. Read in the first way, Raemon’s concerns about “pressure” as opposed to additional donations made on the fly may be clearer; it’s not about jkaufman’s opportunity to get $1672 in donations for no work, it’s about everyone else being extorted for an extra $1672 to preserve their values.
Perhaps a nitpick, but I feel like the building of trust is being treated less as a sacred value, and more as a quantity of unknown magnitude, with some probability that that magnitude could be really high (at least >$1672, possibly orders of magnitude higher). Doing a Fermi is a trivial inconvenience that I for one cannot handle right now; since it is a weekday, maybe others feel much the same.
I agree that your comment takes this (very reasonable) perspective. It didn’t seem to me like any other comment was taking this perspective, but perhaps that was their underlying model.
I wouldn’t do it for $100M.
Seriously.
Because it increases the marginal chance that humanity goes extinct ever-so-slightly.
If you have launch codes, wait until tomorrow to read the last part eh? —
(V zrna, hayrff lbh guvax gur rkcrevzrag snvyvat frpergyl cebzbgrf pnhgvba naq qrfgeblf bcgvzvfz, juvpu zvtug or gehr.)
Why couldn’t you use the $100M to fund x-risk prevention efforts?
Well, why stop there?
World GDP is $80.6 trillion.
Why doesn’t the United States threaten to nuke everyone if they don’t give a very reasonable 20% of their GDP per year to fund X-Risk — or whatever your favorite worthwhile projects are?
Screw it, why don’t we set the bar at 1%?
Imagine you’re advising the U.S. President (it’s Donald Trump right now, incidentally). Who should President Trump threaten with nuking if they don’t pay up to fund X-Risk? How much?
Now, let’s say 193 countries do it, and $X trillion is coming in and doing massive good.
Only Switzerland and North Korea defect. What do you do? Or rather, what do you advise Donald Trump to do?
I never suggested threats, and in fact I don’t think you should threaten to press the button unless someone makes a counterfactual donation of $1,672.
Jeff’s original comment was also not supposed to be a threat, though it was ambiguous. All of my comments are talking about the non-threat version.
Dank EA Memes? What? Really? How do I get in on this?
(Serious.)
(I shouldn’t joke “I have launch codes” — that’s grossly irresponsible for a cheap laugh — but umm, I just meta made the joke.)
Note to self: Does lighthearted dark humor highlighting risk increase or decrease chances of bad things happening?
Initial speculation: it might have an inverted response curve. One or two people making the joke might increase gravity, everyone joking about it might change norms and salience.
I noticed after playing a bunch of games of a mafia-type game with some rationalists that when people made edgy jokes about being in the mob or whatever, they were more likely to end up actually being in the mob.
There’s rationalists who are in the mafia?
Whoa.
No insightful comment, just, like — this Petrov thread is the gift that keeps on giving.
Can’t tell if joking, but they probably mean that they were “actually in the mafia” in the game, so not in the real-world mafia.
Yes, lol :)
Dank EA Memes is a Facebook group. It’s pretty good.