I agree that intelligent agents have a tendency to seek power and that that is a large cause of what makes them dangerous. Agents could potentially cause catastrophes in other ways, but I’m not sure if any are realistic.
As an example, suppose an agent creates powerful self-replicating nanotechnology to make a pile of paperclips, the agent’s goal. However, because the agent didn’t want to spend the time engineering a way to stop the replication, the self-replicating nanobots eat the world.
Catastrophes like this would probably also be dealt with by AUP-preservation, though. At least, if you use the multi-equation impact measure. (If the impact equation only concerns the agent’s ability to achieve its own goal, maybe it would let the world be consumed after putting up a nanotech-proof barrier around all of its paperclip manufacturing resources. But again, I don’t know if that’s realistic.)
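To be concrete about the distinction I mean, here is a rough sketch of the penalty I have in mind, from memory (the notation is mine, and $U$, $Q_u$, and $\varnothing$ are my labels for the auxiliary utility set, its action-values, and a no-op action, not necessarily the exact formulation in the post):

```latex
% Multi-equation impact measure: penalize shifts in attainable
% utility across a whole set U of auxiliary utilities
\text{Penalty}(s, a) \;=\; \sum_{u \in U} \bigl| Q_u(s, a) - Q_u(s, \varnothing) \bigr|

% Single-equation variant: only the agent's own utility u^* matters,
% so side effects that leave Q_{u^*} intact go unpenalized
\text{Penalty}(s, a) \;=\; \bigl| Q_{u^*}(s, a) - Q_{u^*}(s, \varnothing) \bigr|
```

Under the first version, letting nanobots eat the world shifts the attainable utility of almost every $u \in U$, so it gets heavily penalized; under the second, a barrier protecting the agent’s own paperclip resources could keep $Q_{u^*}$ roughly unchanged.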
I’m also concerned agents would create large, catastrophic changes to the world in ways that don’t increase their power. For example, an agent that wants to make paperclips might try to create nanotech that assembles the entire world into paperclips. It’s not clear to me that this would increase the agent’s power much. The agent wouldn’t necessarily have any control over the bots, so it would be limited to using them for just its one utility function. And if the agent is intelligent enough to easily discover how to create such technology, actually creating the bots doesn’t sound like it would give it more power than it already had.
If the material for the bots is scarce, then making them prevents the AI from making other things, so they might provide a net decrease to the agent’s power. And once the world is paperclips, the agent would be limited to having just paperclips available, which could make it pretty weak.
I don’t know whether you would consider the described scenario power-seeking. At least, I don’t think it would register as an increase under the agent’s impact equation.