I generally have an unfavorable view of multi-agent approaches to safety, especially those that seek to achieve safety via creating multiple agents (I’m more sympathetic to considerations of how to increase safety on the assumption that multiple agents are unavoidable). That being said, you might find these links interesting for some prior related discussion on this site:
https://www.lesswrong.com/posts/BXMCgpktdiawT3K5v/multi-agent-safety
https://www.lesswrong.com/posts/HekjhtWesBWTQW5eF/agis-as-populations
https://www.lesswrong.com/posts/sD6KuprcS3PFym2eM/three-kinds-of-competitiveness