Grants to Redwood Research, SERI MATS, NYU alignment group under Sam Bowman for scalable supervision, Palisade research, and many dozens more, most of which seem net positive wrt TAI risk.
Many MATS scholars go to Anthropic (source: I work there).
Redwood I’m really not sure about, but that could be right.
Sam now works at Anthropic.
Palisade: I’ve done some work for them, and I love them, but I don’t know that their projects so far inhibit Anthropic. BadLlama, which I’m decently confident was part of the cause for funding them, was pretty squarely targeted at Meta, and is their most impactful work to date by several orders of magnitude. In fact, the softer versions of Palisade’s proposal (highlighting misuse risk, their core mission) likely empower Anthropic, as seemingly the most transparent lab regarding misuse risks.
I take the thrust of your comment to be “OP funds safety, do your research”. I work in safety; I know they fund safety.
I also know most safety projects differentially benefit Anthropic (this fact is independent of whether you think differentially benefiting Anthropic is good or bad).
If you can make a stronger case for any of the other dozens of orgs on your list than the one that exists for the few above, I’d love to hear it. I’ve thought about most of them and don’t see it, which is why I asked the question.
Further: the goalpost is not ‘net positive with respect to TAI x-risk.’ It is ‘not plausibly a component of a meta-strategy targeting the development of TAI at Anthropic before other labs.’
Edit: use of the soldier mindset flag above is pretty uncharitable here; I am asking for counter-examples to a hypothesis I’m entertaining. This is the actual opposite of soldier mindset.
Apologies for the soldier mindset react, I pattern-matched to some more hostile comment. Communication is hard.
Makes sense. Pretty sure you can remove it (and would appreciate that).