This post confuses me.
Am I correct that the implication here is that assurances from a non-rationalist are essentially worthless?
I also think it is wrong to imply that Anthropic have violated their commitment simply because they didn’t rationally think through its implications when they made it.
I think you can understand Anthropic’s actions as purely rational, just not very ethical.
They made an unenforceable commitment to not push capabilities when it directly benefited them. Now that it is more beneficial to drop the facade, they are doing so.
I think “don’t trust assurances from non-rationalists” is not a good takeaway. Rather, it should be “don’t trust unenforceable assurances from people who stand to benefit greatly from violating your trust at a later date”.
The intended implication is something like “rationalists have a bias towards treating statements as much firmer commitments than intended, then getting very upset when they are violated”.
For example, unless I’m missing something, the “we do not wish to advance the rate of AI capabilities” claim is just one offhand line in a blog post. It’s not a firm commitment, it’s not even a claim about what their intentions are. As stated, it’s just one consideration that informs their actions—and in fact the “wish” terminology is often specifically not a claim about intended actions (e.g. “I wish I didn’t have to do X”).
Yet rationalists are hammering them on this one sentence—literally making songs about it, tweeting it to criticize Anthropic, etc. It seems like there is a serious lack of metacognition about where a non-adversarial communication breakdown could have occurred, or what the charitable interpretations of this are.
(I am open to people considering those interpretations and then dismissing them, but I’m not even seeing that. Like, if people were saying “I understand the difference between Anthropic actually making an organizational commitment and just offhand mentioning a fact about their motivations, but here’s why I’m disappointed anyway”, that seems reasonable. But a lot of people seem to be treating it as a Very Serious Promise being broken.)
That makes sense.
I guess the followup question is “how were Anthropic able to cultivate the impression that they were safety-focused if they had only made an extremely loose offhand commitment?”
Certainly the impression I had from how integrated they are in the EA community was that they had made a more serious commitment.
Everyone is afraid of the AI race, and hopes that one of the labs will actually end up doing what they think is the most responsible thing to do. Hope and fear is one hell of a drug cocktail, makes you jump to the conclusions you want based on the flimsiest evidence. But the hangover is a bastard.
Imo I don’t know that we have evidence that Anthropic deliberately cultivated, or significantly benefited from, the appearance of a commitment. However, if an investor or employee felt they had made substantial commitments based on this impression and later felt betrayed, that would be more serious. (The story here is, I think, importantly different from other cases where there were substantial benefits from the appearance of a commitment followed by its violation.)
The intended implication is something like “rationalists have a bias towards treating statements as much firmer commitments than intended, then getting very upset when they are violated”.
That sounds suspiciously similar to “autists have a bias towards interpreting statements literally”.
I mean, yes, they’re closely related.