TsviBT comments on Zach Stein-Perlman’s Shortform

TsviBT 24 Aug 2024 17:40 UTC
11 points
19
But that’s not a plan to ensure their uranium pile goes well.
- TsviBT 24 Aug 2024 19:16 UTC
  5 points
  2
  @Zach Stein-Perlman, you’re missing the point. They don’t have a plan. Here’s the thread (paraphrased in my words):
  
  Zach: [asks, for Anthropic]
  Zac: … I do talk about Anthropic’s safety plan and orientation, but it’s hard because of confidentiality and because many responses here are hostile. …
  Adam: Actually I think it’s hard because Anthropic doesn’t have a real plan.
  Joseph: That’s a straw-man. [implying they do have a real plan?]
  Tsvi: No it’s not a straw-man, they don’t have a real plan.
  Zach: Something must be done. Anthropic’s plan is something.
  Tsvi: They don’t have a real plan.
  - Joseph Miller 25 Aug 2024 22:38 UTC
    3 points
    2
    Joseph: That’s a straw-man. [implying they do have a real plan?]
    I explicitly said “However I think the point is basically correct” in the next sentence.
  - Zach Stein-Perlman 24 Aug 2024 19:20 UTC
    2 points
    0
    Sorry, reacts are ambiguous.
    I agree Anthropic doesn’t have a “real plan” in your sense, and narrow disagreement with Zac on that is fine.
    I just think that’s not a big deal and is missing some broader point (maybe that’s a motte and Anthropic is doing something bad—vibes from Adam’s comment—is a bailey).
    [Edit: “Something must be done. Anthropic’s plan is something.” is a very bad summary of my position. My position is more like various facts about Anthropic mean that them-making-powerful-AI is likely better than the counterfactual, and evaluating a lab in a vacuum or disregarding inaction risk is a mistake.]
    [Edit: replies to this shortform tend to make me sad and distracted—this is my fault, nobody is doing something wrong—so I wish I could disable replies and I will probably stop replying and would prefer that others stop commenting. Tsvi, I’m ok with one more reply to this.]
    - TsviBT 24 Aug 2024 20:37 UTC
      33 points
      39
      (I won’t reply more, by default.)
      
      various facts about Anthropic mean that them-making-powerful-AI is likely better than the counterfactual, and evaluating a lab in a vacuum or disregarding inaction risk is a mistake
      
      Look, if Anthropic was honestly and publicly saying
      
      We do not have a credible plan for how to make AGI, and we have no credible reason to think we can come up with a plan later. Neither does anyone else. But, on the off chance there’s something that could be done with a nascent AGI that makes a nonomnicide outcome marginally more likely, if the nascent AGI is created and observed by people who are at least thinking about the problem, on that off chance, we’re going to keep up with the other leading labs. But again, given that no one has a credible plan or a credible credible-plan plan, better would be if everyone including us stopped. Please stop this industry.
      
      If they were saying and doing that, then I would still raise my eyebrows a lot and wouldn’t really trust it. But at least it would be plausibly consistent with doing good.
      
      But that doesn’t sound like either what they’re saying or doing. IIUC they lobbied to remove protection for AI capabilities whistleblowers from SB 1047! That happened! Wow! And it seems like Zac feels he has to pretend to have a credible credible-plan plan.
    - TsviBT 24 Aug 2024 19:33 UTC
      4 points
      5
      Hm. I imagine you don’t want to drill down on this, but just to state for the record, this exchange seems like something weird is happening in the discourse. Like, people are having different senses of “the point” and “the vibe” and such, and so the discourse has already broken down. (Not that this is some big revelation.) Like, there’s the Great Stonewall of the AGI makers. And then Zac is crossing through the gates of the Great Stonewall to come and talk to the AGI please-don’t-makers. But then Zac is like (putting words in his mouth) “there’s no Great Stonewall, or like, it’s not there in order to stonewall you in order to pretend that we have a safe AGI plan or to muddy the waters about whether or not we should have one, it’s there because something something trade secrets and exfohazards, and actually you’re making it difficult to talk by making me work harder to pretend that we have a safe AGI plan or intentions that should promissorily satisfy the need for one”.
- mesaoptimizer 24 Aug 2024 18:35 UTC
  4 points
  0
  Seems like most people believe (implicitly or explicitly) that empirical research is the only feasible path forward to building a somewhat aligned generally intelligent AI scientist. This is an underspecified claim, and given certain fully-specified instances of it, I’d agree.
  
  But this belief leads to the following reasoning: (1) if we don’t eat all this free energy in the form of researchers+compute+funding, someone else will; (2) other people are clearly less trustworthy compared to us (Anthropic, in this hypothetical); (3) let’s do whatever it takes to maintain our lead and prevent other labs from gaining power, while using whatever resources we have to also do alignment research, preferably in ways that also help us maintain or strengthen our lead in this race.
  - TsviBT 24 Aug 2024 18:57 UTC
    16 points
    14
    
    most people believe (implicitly or explicitly) that empirical research is the only feasible path forward to building a somewhat aligned generally intelligent AI scientist.
    
    I don’t credit that they believe that. And, I don’t credit that you believe that they believe that. What did they do, to truly test their belief—such that it could have been changed? For most of them the answer is “basically nothing”. Such a “belief” is not a belief (though it may be an investment, if that’s what you mean). What did you do to truly test that they truly tested their belief? If nothing, then yours isn’t a belief either (though it may be an investment). If yours is an investment in a behavioral stance, that investment may or may not be advisable, but it would DEFINITELY be inadvisable to pretend to yourself that yours is a belief.