While I share your sentiment, I expect that the problem is far more complex than we think. Sure, corporations are made of people, and people believe (explicitly or implicitly) that their actions are not going to lead to the end of humanity. The next question, then, is: why do they believe this is the case? There are various attempts to answer this question, and different people take different approaches to reducing x-risk depending on their answer; see how MIRI's and Conjecture's approaches differ, for example.
This is, in my opinion, a viable line of attack, and far more productive than pure truth-seeking comms (which is what I believe MIRI is trying) or an aggressive narrative-shifting and policy-influencing strategy (which is what I believe Conjecture is trying).
I appreciate this comment.
Be careful, though, that we're not just dealing with a group of people here.
We're dealing with artificial structures (i.e. corporations) that take in and fire human workers as they compete for profit, with the most power-hungry workers tending to find their way to the top of those hierarchical structures.
Yes, I am proposing a form of systemic analysis in which one is willing to look at multiple levels of the stack of abstractions that make up the world-ending machine. This can involve aggressive reductionism, to the point of modeling sub-systems and their motivations within individuals (either archetypal or specific ones), and it can involve game-theoretic and coordination-focused models of the teams that make up individual frontier labs: their incentives, their resource requirements, et cetera.
Most people focus on the latter, far fewer focus on the former, and I don't think anyone is even trying to do a full-stack analysis of what is going on.
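To make the game-theoretic angle concrete, here is a minimal, purely illustrative sketch (the two labs, their choices, and the payoff numbers are all hypothetical) of the kind of coordination failure such a model might capture: each lab's best response is to race regardless of what the other does, even though both would prefer the outcome where both pause.

```python
# A toy, hypothetical payoff model of two frontier labs choosing to
# "pause" (hold back capabilities work) or "race" (push ahead).
# The payoff numbers are made up for illustration only; the point is the
# structure: if racing dominates pausing for each lab individually, both
# end up racing even though mutual pausing leaves both better off.

from itertools import product

# payoffs[(a_choice, b_choice)] = (payoff to Lab A, payoff to Lab B)
# Illustrative ordering: unilateral racing beats pausing, but mutual racing
# is worse for both than mutual pausing (a prisoner's-dilemma shape).
payoffs = {
    ("pause", "pause"): (3, 3),
    ("pause", "race"):  (0, 4),
    ("race",  "pause"): (4, 0),
    ("race",  "race"):  (1, 1),
}

def best_response(player: int, other_choice: str) -> str:
    """Return the choice maximizing this player's payoff,
    holding the other player's choice fixed."""
    def payoff(choice: str) -> int:
        pair = (choice, other_choice) if player == 0 else (other_choice, choice)
        return payoffs[pair][player]
    return max(["pause", "race"], key=payoff)

# Pure-strategy Nash equilibria: profiles where neither lab wants to deviate.
equilibria = [
    (a, b)
    for a, b in product(["pause", "race"], repeat=2)
    if best_response(0, b) == a and best_response(1, a) == b
]

print("Pure-strategy equilibria:", equilibria)                        # [('race', 'race')]
print("Payoffs at equilibrium:", [payoffs[e] for e in equilibria])    # [(1, 1)]
print("Payoffs if both pause:", payoffs[("pause", "pause")])          # (3, 3)
```

The specific numbers don't matter; what matters is that this kind of structure lets you ask which interventions would change the equilibrium, rather than arguing about any one actor's intentions.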
Good to hear!
You can literally have a bunch of engineers and researchers who believe that their company is contributing to AI extinction risk, yet still go with the flow.
They might even think they're improving things at the margin. Or they have doubts, but all their colleagues seem to be carrying on as usual.
In this sense, we're dealing with the problem of having a corporate command structure in place that takes in the loyal and persuades them to do useful work (useful in the eyes of power-and-social-recognition-obsessed leadership).
Someone shared the joke: “Remember the Milgram experiment, where they found out that everybody but us would press the button?”
My response: Right! Expect AGI lab employees to follow instructions, because of…
deference to authority
incremental worsening (boiling frog problem)
peer proof (“everyone else is doing it”)
escalation of commitment