My main objection to this idea is that it is a local solution and doesn’t have built-in mechanisms for becoming a global AI safety solution, that is, for preventing the creation of other AIs, which could be agential superintelligences. One can try to make “AI police” as a service, but it could be less effective than agential police.
Another objection is probably Gwern’s idea that any Tool AI “wants” to become an agential AI.
This idea also excludes the robotic direction in AI development, which will anyway produce agential AIs.
If by agent we mean “system that takes actions in the real world”, then services can be agents. As I understand it, Eric is only arguing against monolithic AGI agents that are optimizing a long-term utility function and that can learn/perform any task.
Current factory robots definitely look like a service, and even the soon-to-come robots-trained-with-deep-RL will be services. They execute particular learned behaviors.
If I remember correctly, Gwern’s argument is basically that Agent AI will outcompete Tool AI because Agent AI can optimize things that Tool AI cannot, such as its own cognition. In the CAIS world, there are separate services that improve cognition, and so the CAIS services do get the benefit of ever-improving cognition, without being classical AGI agents. But overall I agree with this point (and disagree with Eric) because I expect there to be lots of gains to be had by removing the boundaries between services, at least where possible.
This idea also excludes the robotic direction in AI development, which will anyway produce agential AIs.
Recursive self-improvement that quickly makes the intelligence “super” is what makes a misaligned utility function actually dangerous, as opposed to dangerous in the way that, say, a present-day automated assembly line is.
A robot that self-improves would need both the capacity to control its actuators and the capacity to self-improve. Since neither capability depends directly on the other, each time one of them improves, the improvement is much more likely to be demonstrated first on its own, independently of any improvement in the other.
Thus we’re likely to already have some experience with self-improving AI, or to have recursively improved AI to help us, by the time we have to deal with people wanting to build self-improving robots. Even with advanced AI on hand to help, we should maybe still start on that early, but it seems more important to get the not-necessarily (and probably not) robotic AI right.
I didn’t mean that the “robot will self-improve”, but that research in robotics will create AIs which are agential and adapted to act in the real world. Such AIs may start to self-improve later, and without a robotic body.
Monitoring surveillance in order to see if anyone is breaking rules seems to be quite a bounded task, and in fact is one that we are already in the process of automating (using our current AI systems, which are basically all bounded).
Of course, there are lots of other tasks that are not as clear. But to the extent that you believe the Factored Cognition hypothesis, you should believe that we can make bounded services that nevertheless do a very good job.
Monitoring surveillance in order to see if anyone is breaking rules seems to be quite a bounded task, and in fact is one that we are already in the process of automating (using our current AI systems, which are basically all bounded).
That seems true, but if this surveillance monitoring isn’t 100% effective, won’t you still need an agential police to deal with any threats that manage to evade the surveillance? Or do you buy Eric’s argument that we can use a period of “unopposed preparation” to make sure that the defense, even though it’s bounded, is still much more capable than any agential threat it might face?
Sorry, when I said “there are lots of other tasks that are not as clear”, I meant that there are a lot of other tasks relevant to policing and security that are not as clear, such as police to deal with threats that evade surveillance. I think the optimism here comes from our ability to decompose tasks, such that we can take a task that seems to require goal-directed agency (like “be the police”) and turn it into a bunch of subtasks that no longer look agential.
I agree that in the long term, agent AI could probably improve faster than CAIS, but I think CAIS could still be a solution.
Regardless of how it is aligned, aligned AI will tend to improve more slowly than unaligned AI, because it is trying to achieve a more complicated goal, human oversight takes time, etc. To prevent unaligned AI, aligned AI will need a head start, so that it can stop any unaligned AI while that AI is still much weaker. I don’t think CAIS is fundamentally different in that respect.
If the reasoning in the post that CAIS will develop before AGI holds up, then CAIS would actually have an advantage, because it would be easier to get a head start.
It also seems likely to me that “AI police” as a service would be less effective than agential police, especially since a “service” is by definition bounded and an agent is not.