It seems quite possible that current AI orgs are going to develop superintelligence, with or without the participation of AI safety people. It seems to me much better if AI safety people participate in that process.
What’s the marginal impact? If you don’t take that job, someone less qualified will. They’ll be on average both less skilled and less motivated toward safety. The safety-washing will still be done, just marginally less effectively.
The other marginal change is that those orgs are now made up of people with less concern for safety.
Organizations (and any group of humans) have a sort of composite psychology. They have shared beliefs that change and tend to converge over time. If some of the org is persuasively saying “this will kill us all if we’re not really careful”, the end result is an org that believes that, much more than in the alternative where no one involved is making that argument persuasively.
Therefore I think it’s highly net-positive to work at an AGI org (though probably better yet to get funding for safety research elsewhere).
I’d go further, and say it’s probably net-positive to take capabilities jobs at major orgs. Again, you’re doing work that someone else would do just about as well, but without your beliefs on safety. You are filling, as someone with safety-minded beliefs, a slot that would otherwise be taken by someone without them.
There are two reasons that working in capabilities might be more important than working in the safety department of that same org. One is that safety people may not be privy to all of the capabilities development that org is doing. Capabilities people may have more opportunities to call out risks, both internally and externally (whistleblowing). Second, your opinions may be taken quite differently if you’re concerned with safety but not in the safety department, whose whole job and mindset is safety. It’s easy to dismiss an “AI safety person” talking about AI x-risks, and less easy to dismiss an AI engineer who’s quite worried about safety.
As an aside, I think it’s really important to distinguish “AI safety” from AGI x-risk. They overlap, but the AGI x-risk is the thing I think we should all be more worried about. Working on ways to make a deep network AI less likely to be racist is marginally helpful for x-risk, but not the same thing. So working on that sort of AI safety is already less impactful than working directly on AGI alignment, in my view.
I would like to see capabilities people calling out risks externally. I am not yet aware of a researcher deciding to whistleblow on the AGI lab they work at.
If you are considering it, please meet with an attorney in person first, and preferably get advice from an experienced whistleblower to discuss preserving anonymity – I can put you in touch: remmelt.ellen[a|}protonmail{d07]com
There’s so much that could be disclosed that would help bring about injunctions against AGI labs.
Even knowing what copyrighted data is in the datasets would be a boon for lawsuits.
No one has done any whistleblowing yet because we are not in danger yet. Current gen networks simply are not existentially risky. When someone is risking the future of the entire human race, we’ll see whistleblowers give up their jobs and risk their freedom and fortune to take action.
I’m not saying that’s enough, but it’s better than an org where people are carefully self-selected to not give a shit about safety.
There are already AGI lab leaders that are risking the future of the entire human race.
Plenty of consensus to be found on that.
So why no whistleblowing?
There’s nothing to blow the whistle on. Everyone knows that those labs are pursuing AGI.
We are not in direct danger yet, in all likelihood. I have short timelines, but there’s almost no chance that any current work is at risk of growing smart enough to disempower humans. There’s a difference between hitting the accelerator in the direction of a cliff, and holding it down as it gets close. Developing AGI internally is when we’ll need and hopefully get whistleblowers.
Are you thinking of blowing the whistle on something in between work on AGI and getting close to actually achieving it?
Good question.
Yes, this is how I am thinking about it.
I don’t want to wait until competing AI corporations become really good at automating work in profitable ways, not least because by then their market and political power would be entrenched. I want society to be well aware, well before then, that the AI corporations are acting recklessly and should be restricted.
We need a bigger safety margin. Waiting until corporate machinery is able to operate autonomously would leave us almost no remaining safety margin.
There are already increasing harms, and a whistleblower can bring those harms to the surface. That in turn supports civil lawsuits, criminal investigations, and/or regulatory action.
These harms fall roughly into the following categories, from most directly traceable to least directly traceable:
Data laundering (what personal, copyrighted and illegal data is being copied and collected en masse without our consent).
Worker dehumanisation (the algorithmic exploitation of gig workers; the shoddy automation of people’s jobs; the criminal conduct of lab CEOs).
Unsafe uses (everything from untested uses in hospitals and schools, to mass disinformation and deepfakes, to hackability and covered-up adversarial attacks, to automating crime and the kill cloud, to knowingly building dangerous designs).
Environmental pollution (research investigations of data centers, fab labs, and so on).
For example:
If an engineer revealed authors’ works in the datasets of ChatGPT, Claude, Gemini or Llama, that would give publishers and creative guilds the evidence they need to ramp up lawsuits against the respective corporations (to the tens or hundreds).
Or if it turned out that the companies collected known child sexual abuse materials (as OpenAI probably did, and a collaborator of mine revealed for StabilityAI and MidJourney).
If the criminal conduct of the CEO of an AI corporation were revealed:
E.g. if it turned out that there is a string of sexual predation/assault in leadership circles of OpenAI/CodePilot/Microsoft.
Or if it turned out that Satya Nadella managed a refund scam company in his spare time.
If managers were aware of the misuses of their technology, e.g. in healthcare, at schools, or in warfare, but chose to keep quiet about it.
Revealing illegal data laundering is actually the most direct, and would cause immediate uproar.
The rest is harder and more context-dependent. I don’t think we’re at the stage where environmental pollution is that notable (vs. the fossil fuel industry at large), and investigating it across AI hardware operation and production chains would take a lot of diligent research by an inside staff member.
Note:
Even if you are focussed on long-term risks, you can still whistleblow on egregious harms caused by these AI labs right now. Providing this evidence enables legal efforts to restrict these labs.
Whistleblowing is not going to solve the entire societal governance problem, but it will enable others to act on the information you provided.
It is much better than following along until we reach the edge of the cliff.