What do you mean by “misalignment”? In a regime with autonomous AI agents, I usually understand “misalignment” to mean “has different values from some other agent”. In this frame, you can be misaligned with some people but not others. If an AI is aligned with North Korea, then it’s not really “misaligned” in the abstract—it’s just aligned with someone who we don’t want it to be aligned with. Likewise, if OpenAI develops AI that’s aligned with the United States, but unaligned with North Korea, this mostly just seems like the same problem but in reverse.
In general, conflicts don’t really seem well-described as issues of “misalignment”. Sure, in the absence of all misalignment, wars would probably not occur (though they may still happen due to misunderstandings and empirical disagreements). But for the most part, wars seem better described as arising from a breakdown of institutions that are normally tasked with keeping the peace. You can have a system of lawful yet mutually-misaligned agents who keep the peace, just as you can have an anarchic system with mutually-misaligned agents in a state of constant war. Misalignment just (mostly) doesn’t seem to be the thing causing the issue here.
You could also solve or mitigate the problem by resolving all human conflicts (so that the AI has no human group to ally with).
Note that I’m not saying:
AIs will aid in existing human conflicts, picking sides along the ordinary lines we see today.
I am saying:
AIs will likely have conflicts amongst themselves, just as humans have conflicts amongst themselves, and future conflicts (considering society as a whole) don’t seem particularly likely to be AI vs. human rather than AI vs. AI (with humans split between the two sides).
Yep, I was just referring to my example scenario and scenarios like it.
Like, the basic question is the extent to which human groups form a cartel/monopoly on human labor vs. ally with different AI groups. (And existing conflict between human groups makes a full cartel much less likely.)
Sorry, by “without misalignment” I mean “without misalignment-related technical problems”. As in, it’s trivial to avoid misalignment from the perspective of AI creators.
This doesn’t clear up the confusion for me. That mostly pushes my question to “what are misalignment related technical problems?” Is the problem of an AI escaping a server and aligning with North Korea a technical or a political problem? How could we tell? Is this still in the regime where we are using AIs as tools, or are you talking about a regime where AIs are autonomous agents?
I mean, it could be resolved in principle by technical means and might be resolvable by political means as well. I’m assuming the AI creator didn’t want the AI to escape to North Korea, and therefore some technical measure intended to prevent this failed.
I’m imagining very powerful AIs, e.g. AIs that can speed up R&D by large factors. These are probably running autonomously, but in a way which is de jure controlled by the AI lab.