If AIs’ goals are aligned with humans, it will be in the interest of the vast majority of AIs to coordinate to avoid harms from excessive competition. Which they would likely be good at, given that they are posited to be superhuman and humans currently do a decent job of mitigating such harms(at least to the extent that we haven’t destroyed the world)
I suppose what I’m trying to point to is some form of the outer alignment problem. I think we may end up with AIs that are aligned with human organizations like corporations more than individual humans. The reason for this is that corporations or militaries which employ more ruthless AIs will, over time, accrue more power and resources. It’s not so much explicit (i.e. violent) competition, but rather the gradual tendency for systems which are power-seeking and resource-maximizing to end up with more power and resources over time. If we allow for the creation / fine tuning of many AI agents, and allow them to accrue resources and copy themselves, then natural selection will favor the more selfish ones which are least aligned with humanity at large. We already require pretty extensive regulation to make sure that corporations don’t incur significant negative externalities, and these are organizations that are run by and composed of humans. When those entities are no longer humans, I think the vast majority of power and resources will no longer be explicitly controlled by humans, and moreover will be controlled by AI which has values poorly aligned with the majority of humans. The AI’s goals will only be aligned with the short-term interests of the small number of humans that created them in the first place. Once the majority of people realize that this system is not acting in their long-term interests, there will be nothing they can do about it.
If AIs’ goals are aligned with humans, it will be in the interest of the vast majority of AIs to coordinate to avoid harms from excessive competition. Which they would likely be good at, given that they are posited to be superhuman and humans currently do a decent job of mitigating such harms(at least to the extent that we haven’t destroyed the world)
I suppose what I’m trying to point to is some form of the outer alignment problem. I think we may end up with AIs that are aligned with human organizations like corporations more than individual humans. The reason for this is that corporations or militaries which employ more ruthless AIs will, over time, accrue more power and resources. It’s not so much explicit (i.e. violent) competition, but rather the gradual tendency for systems which are power-seeking and resource-maximizing to end up with more power and resources over time. If we allow for the creation / fine tuning of many AI agents, and allow them to accrue resources and copy themselves, then natural selection will favor the more selfish ones which are least aligned with humanity at large. We already require pretty extensive regulation to make sure that corporations don’t incur significant negative externalities, and these are organizations that are run by and composed of humans. When those entities are no longer humans, I think the vast majority of power and resources will no longer be explicitly controlled by humans, and moreover will be controlled by AI which has values poorly aligned with the majority of humans. The AI’s goals will only be aligned with the short-term interests of the small number of humans that created them in the first place. Once the majority of people realize that this system is not acting in their long-term interests, there will be nothing they can do about it.