I agree that initially a powerful AGI would likely be composed of many sub-agents. However, it seems plausible to me that these sub-agents may “cohere” under sufficient optimisation or training. This could result in the sub-agent with the most stable goals winning out. It’s possible that strong evolutionary pressure makes this more likely.

You could also imagine powerful agents that aren’t composed of sub-agents at all, for example a simpler agent with a very computationally expensive search over actions.

Overall, this topic seems under-discussed to me. It would be great to have a better understanding of whether we should expect sub-agents to turn into a single coherent agent.
I think it’s possible to unify them somewhat, in terms of ensuring that they don’t have outright contradictory models or goals, but I don’t really see a path where a realistically feasible mind would stop being made up of different subagents. The subsystem that thinks about how to build nanotechnology may have overlap with the subsystem that thinks about how to do social reasoning, but it’s still going to be more efficient to have them specialized for those tasks rather than trying to combine them into one. Even if you did try to combine them into one, you’d still run into physical limits. In the human brain, it’s hypothesized that one of the reasons why it takes time to think about novel decisions is that:

different pieces of relevant information are found in physically disparate memory networks and neuronal sites. Access from the memory networks to the evidence accumulator neurons is physically bottlenecked by a limited number of “pipes”. Thus, a number of different memory networks need to take turns in accessing the pipe, causing a serial delay in the evidence accumulation process.
There are also closely related considerations for how much processing and memory you can cram into a single digital processing unit. In my language, each of those memory networks is its own subagent, holding different perspectives and considerations. For any mind that holds a nontrivial amount of memories and considerations, there are going to be plain physical limits on how much of that can be retrieved and usefully processed at a central location, making it vastly more efficient to run thought processes in parallel than try to force everything through a single bottleneck.
Subagents can run prediction markets.
Don’t understand what you’re saying? (I mean, sure they can, but what makes you bring that up?)
If the question is “how can subagents do superintelligently complex things in a unified manner, given limited bandwidth?”, they can run internal prediction markets on questions like “which next action is good?” or “what are we going to observe in the next five seconds?”, because prediction markets are a powerful and general information-integration engine. Moreover, this can lead to better mind integration, because some subagents can make a profit by exploiting incoherence in beliefs and decision-making.
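Here’s a rough toy sketch of what I have in mind (everything in it, the LMSRMarket class, the subagent names, the numbers, is made up for illustration rather than taken from any real system): subagents with conflicting beliefs trade in a shared logarithmic-market-scoring-rule market over candidate next actions, the market price serves as the mind’s aggregated estimate, and subagents whose beliefs were better calibrated end up with more budget, i.e. more influence.

```python
import math

class LMSRMarket:
    """Toy logarithmic-market-scoring-rule market over mutually exclusive outcomes."""

    def __init__(self, outcomes, b=10.0):
        self.b = b                                  # liquidity parameter
        self.shares = {o: 0.0 for o in outcomes}    # net shares sold per outcome

    def _cost(self):
        return self.b * math.log(sum(math.exp(q / self.b) for q in self.shares.values()))

    def price(self, outcome):
        z = sum(math.exp(q / self.b) for q in self.shares.values())
        return math.exp(self.shares[outcome] / self.b) / z

    def buy(self, outcome, amount):
        """Buy `amount` shares of `outcome`; returns the cost charged to the buyer."""
        before = self._cost()
        self.shares[outcome] += amount
        return self._cost() - before


class Subagent:
    """Holds its own beliefs and trades whenever the market price looks wrong to it."""

    def __init__(self, name, beliefs, budget=100.0):
        self.name, self.beliefs, self.budget = name, beliefs, budget
        self.holdings = {o: 0.0 for o in beliefs}

    def trade(self, market, step=1.0):
        for outcome, belief in self.beliefs.items():
            if belief > market.price(outcome) and self.budget > 0:
                cost = market.buy(outcome, step)
                self.budget -= cost
                self.holdings[outcome] += step


# Hypothetical subagents disagreeing about which next action is good.
agents = [
    Subagent("planner", {"action_A": 0.6, "action_B": 0.3, "action_C": 0.1}),
    Subagent("social",  {"action_A": 0.2, "action_B": 0.5, "action_C": 0.3}),
    Subagent("arbiter", {"action_A": 0.5, "action_B": 0.4, "action_C": 0.1}),
]
market = LMSRMarket(["action_A", "action_B", "action_C"])

for _ in range(50):                 # trading rounds
    for agent in agents:
        agent.trade(market)

print({o: round(market.price(o), 2) for o in market.shares})   # aggregated, mind-level estimate

# Settle the market: the outcome that actually turned out good pays 1 per share,
# so better-calibrated subagents end up with more budget (i.e. more influence).
realized = "action_B"
for agent in agents:
    agent.budget += agent.holdings[realized]
    print(agent.name, round(agent.budget, 2))
```

The settlement step is where the “profit from exploiting incoherence” part shows up: whichever subagent bought the mispriced outcome is the one whose budget grows.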
Sure, right. (There are some theories suggesting that the human brain does something like a bidding process, in which the subagents with the best track record for prediction win the ability to influence things more, though of course the system is different from an actual prediction market.) That’s significantly different from the system ceasing to meaningfully have subagents at all, though, and I understood rorygreig to be suggesting that it might cease to have them.
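As a loose illustration of that kind of track-record-based bidding (a made-up toy, not a model of what the brain actually does), even a simple multiplicative-weights scheme ends up giving more influence to whichever subagents have predicted well so far:

```python
import math

# Hypothetical subagents, each with a (fixed, for simplicity) prediction that some event occurs.
predictions = {"habit": 0.9, "deliberation": 0.6, "fear": 0.2}
weights = {name: 1.0 for name in predictions}    # start with equal influence
eta = 1.0                                        # how harshly a bad track record is punished

def combined_prediction():
    total = sum(weights.values())
    return sum(weights[name] * p for name, p in predictions.items()) / total

def update(outcome):
    """Multiplicative-weights update: subagents that predicted badly lose influence."""
    for name, p in predictions.items():
        loss = abs(p - outcome)                  # 0 = spot on, 1 = maximally wrong
        weights[name] *= math.exp(-eta * loss)

for outcome in [1, 1, 0, 1, 1, 1, 0, 1]:         # a stream of observations
    update(outcome)

total = sum(weights.values())
print({name: round(w / total, 3) for name, w in weights.items()})  # who gets listened to now
print("blended prediction:", round(combined_prediction(), 3))
```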
Technically, every cell in the human body is a subagent trying to ‘predict’ the future movements and actions of the others.
Good points, however I’m still a bit confused about the difference between two scenarios: “multiple sub-agents” vs “a single agent that can use tools” (or that can use oracle sub-agents which don’t have goals of their own).

For example, a human doing protein folding using AlphaFold: I don’t think of that as multiple sub-agents, just a single agent using an AI tool for a specialised task. (Assuming for now that we can treat a human as a single agent, which isn’t really the case, but you can imagine a coherent agent using AlphaFold as a tool.)

It still seems plausible to me that you might have a mind made of many different parts, but where there is a clear “agent” bit that actually has goals and controls all the other parts.
What would that look like in practice?
I suppose I can imagine an architecture that has something like a central planning agent which is capable of holding a goal, observing the state of the world to check whether the goal has been met, coming up with high-level strategies to meet that goal, and then delegating subtasks to a set of subordinate sub-agents (whilst making sure that these tasks are broken down enough that the sub-agents themselves don’t have to do much long-horizon planning or goal-directed behaviour).

With this architecture, it seems like all the agent-y, goal-directed stuff is done by a single central agent.

However, I do agree that this may be less efficient or capable in practice than an architecture with more autonomous, decentralised sub-agents. On the other hand, it might be better at consistently pursuing a stable goal, which could compensate.
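For concreteness, here’s a minimal sketch of the sort of architecture I mean, with all the names (CentralPlanner, ToolLikeSubagent, Subtask) invented for this example: only the planner holds the goal, checks whether it has been met, and decomposes it, while the sub-agents just execute narrowly scoped subtasks.

```python
from dataclasses import dataclass

@dataclass
class Subtask:
    description: str
    done: bool = False

class ToolLikeSubagent:
    """Narrow executor: performs one bounded subtask, with no goals or long-horizon planning of its own."""
    def run(self, subtask: Subtask) -> Subtask:
        # ...do the specialised work here (protein folding, search, etc.)...
        subtask.done = True
        return subtask

class CentralPlanner:
    """The only component that holds the goal, checks the world, and re-plans."""
    def __init__(self, goal: str, workers: list):
        self.goal = goal
        self.workers = workers

    def goal_met(self, world_state: dict) -> bool:
        return bool(world_state.get(self.goal))

    def decompose(self) -> list:
        # Break the goal into pieces small enough that no worker needs
        # to do goal-directed, long-horizon reasoning itself.
        return [Subtask(f"step {i} towards {self.goal}") for i in range(len(self.workers))]

    def pursue(self, world_state: dict):
        while not self.goal_met(world_state):
            results = [w.run(task) for task, w in zip(self.decompose(), self.workers)]
            # Stand-in for observing the actual world; here we just mark the goal met
            # once every delegated subtask has completed.
            world_state[self.goal] = all(r.done for r in results)

world: dict = {}
planner = CentralPlanner("toy_goal", workers=[ToolLikeSubagent() for _ in range(3)])
planner.pursue(world)
print("goal met:", planner.goal_met(world))
```

The point is just that the goal-directedness lives in one place; everything below the planner is closer to a tool than to an agent.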