As is always the case, this person changed their mind because they were made to feel valued. The community treated what they’d done with respect (even though, fundamentally, they were unsuccessful and the actual release of the model would have had no impact on the world), and as a result they capitulated.
While I agree that this is an important factor when modelling people’s decision-making, I think there is some straightforward evidence that this was not the primary factor here.
Firstly, after the person spent an hour talking to friendly and helpful people from the high-status company, they did not change their decision, which is evidence against the most parsimonious status-based motives. (Relatedly, the author did not promise to read feedback from a small set of people, but from literally 100% of respondents, which is over and above what would be useful for getting the attention of key people.)
And secondly, which is more persuasive to me though harder to communicate: I read the extensive reasons they gave for their decisions, and those reasons seemed clear and well-reasoned, while the reasons against were important factors that are genuinely nuanced and hard to notice. It seemed to me more a situation where someone actually improves their understanding of the world than one in which they were waiting for certain high-status-to-them people to give them attention. My sense is that writing which explains someone’s decisions while being wholly motivated by status makes less sense than these two posts did.
You might still be right and I might have missed something, or just not have a cynical enough prior. But I do believe people sometimes change their actions due to good reasoning about the world and not solely due to immediate status considerations, and I feel very skeptical of any lens on the world that can’t (“As is always the case”) register a positive result on the question “Did this person make their decision by updating their world model, rather than by short-sighted status-grabbing?”
I’m interested to hear your further thoughts on the broader topic of modelling people’s decision-making as primarily status-based, if you have more to add to the discussion.
The phenomenon I was pointing out wasn’t exactly that the person’s decision was made because of status. It was that a prerequisite for them changing their mind was that they were taken seriously and engaged with respectfully. That said, I do think that it’s interesting to understand the way status plays into these events.
First, they started the essay with a personality-focused explanation:
To explain how this all happened, and what we can learn from it, I think it’s important to learn a little bit more about my personality and with what kind of attitude and world model I came into this situation.
and
I have a depressive/paranoid streak, and tend to assume the worst until proven otherwise. At the time I made my first twitter post, it seemed completely plausible in my mind that no one, OpenAI or otherwise, would care or even notice me. Or, even worse, that they would antagonize me.
The narrative the author themselves sets up is that they had irrational or emotional reasons for behaving the way they did, then considered longer and changed their mind. They also specifically call out their own perceived lack of status as an influencing factor.
If someone has an irrational, status-focused explanation for their own initial reasoning, and then we see high-status people providing them extensive validation, it doesn’t mean that they changed their mind because of the high-status people, but it’s suggestive. My real model is that they took those ideas extra seriously because the people were nice and high status.
Imagine a counterfactual world where they posted their model, and all of the responses they received made the same logical argument, but were instead made on 4chan and started with “hey fuckhead, what are you trying to do, destroy the world?” My priors suggest that this person would have, out of spite, gone ahead and released the model.
The gesture they are making here, not releasing the model, IS purely symbolic. We know the model is not as good as mini-GPT2. Nonetheless, it may be useful to real hackers, people who aren’t supported by large corporate interests, either for learning or just for understanding ML better. Since releasing the model is not a bona fide risk, part of not releasing it is so they can feel like they are part of history. Note the end, where they talk about the precedent they are setting by not releasing it.
I think the fact that the model doesn’t actually work is an important aspect of this. Many hackers would have done it as a cool project and released it without pomp, but this person put together a long essay, explicitly touting the importance of what they’d done and the impact it would have on history. Then, it turned out the model did not work, which must have been very embarrassing. It is fairly reasonable to suggest that the person then took the action that made them feel the best about their legacy and status: writing an essay about why they were not releasing the model for good, rationalist-approved reasons. It is not even necessarily the case that the person is aware that this is influencing the decision; this is fully an Elephant in the Brain situation.
When I read that essay, at least half of it is heavily laden with status concerns and psychological motivations. But, to reiterate: though pro-social community norms left this person open to having their mind changed by argument, probably the arguments still had to be made.
How you feel about this should probably turn on questions like “Who has the status in this community to have their arguments taken seriously? Do I agree with them?” and “Is it good for only well-funded entities to have access to current state-of-the-art ML models?”
I agree with a lot of the claims in your comment, and I think it’s valuable to think through how status plays a role in many situations, including this one.
There is an approach in your comments toward explaining someone’s behaviour that I disagree with, though it may just be a question of emphasis. A few examples:
My real model is that they took those ideas extra seriously because the people were nice and high status.
...a prerequisite for them changing their mind was that they were taken seriously and engaged with respectfully
These seem to me definitely true and simultaneously not that important*.
When I read that essay, at least half of it is heavily laden with status concerns and psychological motivations. But, to reiterate: though pro-social community norms left this person open to having their mind changed by argument, probably the arguments still had to be made. (emphasis added)
The word ‘probably’ in that sentence feels false to me. It feels somewhat analogous to hearing someone argue that a successful tech startup is hundreds of people working together in a company, and that running a tech startup is basically about status and incentives, though “probably code still had to be written” to make it successful. They’re both necessary.
More generally, there are two types of games going on. One we’re allowed to talk about, and one we’re not, or at least not very directly. And we have to coordinate on both levels to succeed. This generally warps how our words relate to reality, because we’re also using those words to do things we’re pretending to ourselves we’re not doing, to let everyone express their preferences and coordinate in the silent games. These silent games have real and crucial implications for how well we can coordinate and where resources must be spent. But once you realise the silent games are being played, it isn’t the right move to say that the silent games are the only games, or always the primary games.
I think the fact that the model doesn’t actually work is an important aspect of this. Many hackers would have done it as a cool project and released it without pomp, but this person put together a long essay, explicitly touting the importance of what they’d done and the impact it would have on history. Then, it turned out the model did not work, which must have been very embarrassing. It is fairly reasonable to suggest that the person then took the action that made them feel the best about their legacy and status: writing an essay about why they were not releasing the model for good, rationalist-approved reasons. It is not even necessarily the case that the person is aware that this is influencing the decision; this is fully an Elephant in the Brain situation.
Again, I agree that something in this reference class is likely happening. But, for example, the long essay was not only about increasing the perceived importance of the action. It was also a strongly pro-social and cooperative move toward the broader AI community, allowing counterarguments to be presented, which is what successfully happened. There are multiple motives here, and (I think) the motive you point to was not the main one, even while it is a silent motive folks systematically avoid discussing.
--
*Actually I think that Connor in particular would’ve engaged with arguments even if they’d not been delivered respectfully, given that he responded substantively to many comments on Twitter/HackerNews/Medium, some of which were predominantly snark.
When Robin Hanson is interviewed about The Elephant in the Brain, he is often asked “Are you saying that status accounts for all of our behaviour?” His reply is that he and Kevin Simler aren’t arguing that the hidden motives are the only motives, but that they’re a far more common motive than we give credit for in our normal discourse. Here’s an example of him saying this kind of thing on the 80k podcast:
As we just said the example that, in education, your motive isn’t to learn the material, or when you go to the doctor, your motive isn’t to get well primarily, and the hidden motives are the actual motive. Now, how could I know what the hidden motives are, you might ask? The plan here, that’s where the book is … In each area, we identify the usual story, then we collect a set of puzzles that don’t make sense from the point of view of the usual story, strange empirical patterns, and then we offer an alternative motive that makes a lot more sense of those empirical patterns, and then we suggest that that is a stronger motive than the one we usually say.
Now, just to be clear, almost every area of human life is complicated, and there’s a lot of people with a lot of different details and so, of course, almost every possible motive shows up in almost every area of human life, so we can’t be talking about the only motive, and so the usual motive does actually apply sometimes. Actually, you could think of the analogy to the excuse that the dog ate my homework. It only works because sometimes dogs eat homework. We don’t say the dragon ate my homework. That wouldn’t fly, so the usual story is part of the story. It’s just a smaller part than we like to admit, and what we’re going to call the hidden motive, the real motive is a bigger part of the story, but it’s still not the only part.
it turned out the model did not work… It is fairly reasonable to suggest that the person then took the action that made them feel the best about their legacy and status
Reading this, I realise I developed most of my attitudes toward the topic when I believed that the copy was full-strength; only in writing the post did I find out that it wasn’t. In fact, it seems it was weaker than the initial 117M version OpenAI released. You’re right that this makes the ‘release’ option less exciting from the perspective of one’s personal status, in which case the status lens would predict taking whichever different action gives more personal status, and this is arguably one of those actions.
Just now I found this comment in the Medium comment section, where Connor agrees with you about it being symbolic, and mentions how this affected his thinking.
...I did admit failure as I linked to said failure in the very first paragraph, and I have no intentions of hiding that. In fact, after learning of my failure I was convinced I might as well release, since most safety issues were no longer a threat anyways (though there remains the possibility it could be used as a “warm start” to train a better model). So if anything, my failure encouraged me to dump it, apologize and let history take its course.
My decision not to release is mostly symbolic. I’m doing it to signal good faith cooperation. Even if I failed today, some day someone will succeed, and we should have a default of cooperation before that.
(Meta: Wow, Medium requires you to click twice to go down one step in a comment thread! Turns out there are like 20 comments on the OP.)
Yeah, this is quite important: the attempted copy was weaker than the nerfed model OpenAI initially released. Thanks for emphasising this, 9eB1; I’ve updated my post in a few places accordingly.
The phenomenon I was pointing out wasn’t exactly that the person’s decision was made because of status. It was that a prerequisite for them changing their mind was that they were taken seriously and engaged with respectfully.
Yeah, respectful and serious engagement with people’s ideas, even when you’re on opposite sides of policy/norm disputes, is very important.