Unfortunately this question didn’t get much engagement when it was originally posted, but I’ll put in a vote for highly federated systems along the axes of agency, cognitive processes, and thinking, especially architectures that maximize transparency and determinism. I think LM agents are just a first step into this area of safety. I write more about this here: https://www.lesswrong.com/posts/caeXurgTwKDpSG4Nh/safety-first-agents-architectures-are-a-promising-path-to
For specific proposals, I’d recommend Drexler’s work on federating agency (https://www.lesswrong.com/posts/5hApNw5f7uG8RXxGS/the-open-agency-model) and on federating cognitive processes, i.e. memory (https://www.lesswrong.com/posts/FKE6cAzQxEK4QH9fC/qnr-prospects-are-important-for-ai-alignment-research).
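To make the "federating agency" framing a bit more concrete, here's a minimal toy sketch in Python (my own illustration, not Drexler's proposal or anyone's actual system): agency is split across a proposer that only drafts plans, a separate reviewer that checks plans against an explicit whitelist, and a deterministic executor that leaves an auditable log. All names, the whitelist scheme, and the plan format are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    """A plan is just data: the proposer never acts directly."""
    description: str
    steps: tuple[str, ...]

def propose(task: str) -> Plan:
    # Stand-in for an LM "proposer" that only drafts plans (illustrative).
    return Plan(
        description=f"Plan for: {task}",
        steps=(f"gather information about {task}",
               f"draft a response to {task}"),
    )

def review(plan: Plan, allowed: set[str]) -> bool:
    # Independent check: every step must start with a whitelisted capability.
    return all(any(step.startswith(a) for a in allowed) for step in plan.steps)

def execute(plan: Plan, log: list[str]) -> None:
    # Deterministic executor: runs only approved steps and records each one,
    # so the action trace is transparent and auditable after the fact.
    for step in plan.steps:
        log.append(f"EXECUTED: {step}")

if __name__ == "__main__":
    audit_log: list[str] = []
    plan = propose("summarize a document")
    if review(plan, allowed={"gather", "draft"}):
        execute(plan, audit_log)
    print("\n".join(audit_log))
```

The point of the sketch is just the shape: no single component both decides and acts, and the part that touches the world is simple and deterministic enough to inspect.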