Robert Moses and AI Alignment

It’s useful to have some examples in mind of what it looks like when an intelligent agent isn’t aligned with the shared values of humanity. We have some extreme examples of this, like paperclip maximizers, and some less extreme (though still extreme in human terms) examples, like dictators such as Stalin, Mao, and Pol Pot, who killed millions in pursuit of their goals. But these feel like outliers: people can too easily argue that they’re extreme cases and that no “reasonable” system would have these problems.
Okay, so let’s think about how hard it is to just get “reasonable” people aligned, much less superintelligent AIs.
Consider Robert Moses, a man who achieved much at the expense of wider humanity. He worked within the system, gamed it, did useful things incidentally since they happened to bring him power or let him build a legacy, and then wielded that power in ways that harmed many while helping some. He was smart, generally caring, and largely aligned with what seemed to be good for America at the time, yet still managed to pursue courses of action that were not really aligned with humanity as a whole.
We have plenty of other examples, but I think most of them don’t put it quite into the kind of stark contrast Moses does. He’s a great example of the kind of failure mode you can expect from inadequate alignment mechanisms (though on a smaller scale): you get something that’s kinda like what you wanted, but also bad in ways you probably didn’t anticipate ahead of time.
He worked within the system, gamed it, did useful things incidentally since they happened to bring him power or let him build a legacy, and then wielded that power in ways that harmed many while helping some.
I don’t think Moses did useful things just because they brought him power. From reading Caro’s biography, it seems to me that, especially at the beginning, Moses had good intentions.
When it comes to parks, they also didn’t just help some people; they helped most people. When Moses caused a park to be built when the money would have been better spent on a new school, the issue isn’t that fewer people profited from the park than would have profited from the school.
I think a key problem with Moses is that as his power grew, his workload also grew. Instead of delegating some of his power to the people under him, he made decisions about projects he had little time to invest in.
If he had invested the time, he likely could have understood that mothers who want to take small children to the park have a problem when they use a stroller and the park entrance has stairs. Moses, however, cut himself off from being questioned, and as a result such issues didn’t get addressed when planning new parks.
Other problems came from things he did to keep up his power by making the system both opaque and corrupt.
While opacity might come with an AGI, I would be more surprised if issues arose because the AGI cut itself off from the flow of information, or because the AGI didn’t have enough time to manage its duties. The AGI can just spin up more instances.