We can also reframe AI agents as sophisticated feedback controllers.
A normal controller, like a thermostat, is set up to sense a variable (temperature) and take a control action in response (activate a boiler). If the boiler is broken, or the thermostat is inaccurate, the controller fails.
An AI agent is able to flexibly examine the upstream causes of its particular goal and exert control over those in adaptive fashion. It can figure out likely causes of failing to accurately regulate temperature, such as a broken boiler, and figure out how to monitor and control those as well.
I think one way we could try to define agency in a way that excludes calculators would be the ability to behave as if pursuing an arbitrary goal. Calculators are stuck with the goal you built them with, providing a correct answer to an input equation. AutoGPT lets you define its goal in natural language.
So if we put these together—a static piece of software that can monitor and manipulate the causes leading to some outcome in an open-ended way, and allows an input to flexibly specify what outcome ought to be controlled, I think we have a pretty intuitive definition of agency.
AutoGPT allows the user to specify truly arbitrary goals. AutoGPT doesn’t even need to be perfect at pursuing a destructive goal to be dangerous—it can also be dangerous by pursuing a good goal in a dangerous way. To me, the only thing making AutoGPT anything but a nightmare tool is that for now, it is ineffective at pursuing most goals. But I look at that thing operating and I don’t at all see anything intrinsically sphexish, nor an intelligence that is likely to naturally become more moral as it becomes more capable. I see an amoral agent that hasn’t been quite constructed to efficiently accomplish the aims it was given yet.
We can also reframe AI agents as sophisticated feedback controllers.
A normal controller, like a thermostat, is set up to sense a variable (temperature) and take a control action in response (activate a boiler). If the boiler is broken, or the thermostat is inaccurate, the controller fails.
An AI agent is able to flexibly examine the upstream causes of its particular goal and exert control over those in adaptive fashion. It can figure out likely causes of failing to accurately regulate temperature, such as a broken boiler, and figure out how to monitor and control those as well.
I think one way we could try to define agency in a way that excludes calculators would be the ability to behave as if pursuing an arbitrary goal. Calculators are stuck with the goal you built them with, providing a correct answer to an input equation. AutoGPT lets you define its goal in natural language.
So if we put these together—a static piece of software that can monitor and manipulate the causes leading to some outcome in an open-ended way, and allows an input to flexibly specify what outcome ought to be controlled, I think we have a pretty intuitive definition of agency.
AutoGPT allows the user to specify truly arbitrary goals. AutoGPT doesn’t even need to be perfect at pursuing a destructive goal to be dangerous—it can also be dangerous by pursuing a good goal in a dangerous way. To me, the only thing making AutoGPT anything but a nightmare tool is that for now, it is ineffective at pursuing most goals. But I look at that thing operating and I don’t at all see anything intrinsically sphexish, nor an intelligence that is likely to naturally become more moral as it becomes more capable. I see an amoral agent that hasn’t been quite constructed to efficiently accomplish the aims it was given yet.