Re: Your points about alignment solving this.
I agree that if you define alignment as ‘get your AI system to act in the best interests of humans’, then the coordination problem becomes harder, and alignment is likely sufficient for problems 2 and 3. But I think it then bundles more problems together in a way that might be less conducive to solving them.
For loss of control, I was primarily thinking about making systems intent-aligned, by which I mean getting the AI system to try to do what its creators intend. I think this makes it easier to divide these challenges into subproblems (and it seems to be what many people are gunning for).
If you do define alignment as human-values alignment, I think “If you fail to implement a working alignment solution, you [the creating organization] die” doesn’t hold: I can imagine successfully aligning a system to ‘get your AI system to act in the best interests of its creators’ working fine for its creators but not being great for the world.
Ah, I see. You are absolutely right. I unintentionally used two different meanings of the word “alignment” in problems 1 and 3.
If we define alignment as intent alignment (from my comment on problem 1), then humans don’t necessarily lose control over the economy in The Economic Transition Problem. The group of people who win the AI race will basically control the entire economy by controlling the AI that controls the world (and is intent-aligned to them).
If we are lucky, they can create a democratic online council where each human gets a say in how the economy is run. The group would then tell the AI what to do based on how humanity voted.
Alternatively, with the help of their intent-aligned AI, the group can try to build a value-aligned AI. Once they are confident that this AI is indeed value-aligned, they can release it and let it be the steward of humanity.
In this scenario, The Economic Transition Problem just becomes The Power Distribution Problem of ensuring that whoever wins the AI race will act in humanity’s best interests (or close enough).