I don’t believe alignment is possible. Humans are not aligned with other humans, and the only thing that prevents an immediate apocalypse is the lack of recursive self-improvement on short timescales. Certainly groups of humans happily destroy other groups of humans, and often destroy themselves in the process of maximizing something like the number of statues. The best we can hope for is that whatever takes over the planet after the meatbags are gone has some of the same goals that the more enlightened meatbags had, where “enlightened” is a very individual definition. Maybe it is a thriving and diverse Galactic civilization, maybe it is the word of God spread to the stars, maybe it is living quietly on this planet in harmony with nature. There is no single or even shared vision of the future that would be described as “aligned” by most humans.
To repost my comment from a couple of weeks back, which says roughly the same thing, though not as well:
Do you think there are changes to the current world that would be “aligned”, e.g. deleting COVID? If so, we could end up with a world that is better than our current one, even without needing all humans to agree on what’s best.
Another option: why not just do everything at once? Have some people living in a diverse Galactic civilization, other people spreading the word of God, and other people living in harmony with nature, with everyone contributing a little to everyone else’s goals. Yes, in principle people could have values such that this future sounds terrible to everyone, but in reality it seems more likely that most people would prefer it to our current world, even if they felt they were missing out relative to their own vision of perfection.
I also made a similar comment a few weeks ago. In fact, this point seems to me so trivial yet corrosive that I find it outright bizarre that it is not being tackled or taken seriously by the AI alignment community.