If you donate to AI alignment research, it doesn’t mean that you get to decide which values are loaded. Other people will decide that. You will then be forced to eat the end result, whatever it may look like. Your mistaken assumption is that there is such a thing as “human values” whose implementation would produce a world that is good for human beings in general. In reality, people have their own values, and those values include terms for “stopping other people from having what they want”, “making sure my enemies suffer”, “making people regret disagreeing with me”, and so on.
When people talk about “human values” in this context, I think they usually mean something like “goals that are Pareto optimal for the values of individual humans”, and the things you listed definitely aren’t that.
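To make that concrete, here is a minimal sketch of the Pareto condition I have in mind, under the simplifying assumption (not spelled out above) that each person i can be modeled with a utility function u_i over outcomes:

```latex
% Illustrative sketch only: assumes each human i has a utility
% function u_i over a set of outcomes O.
% An outcome o* is Pareto optimal iff no alternative o' makes
% someone strictly better off without making anyone worse off:
\[
  \neg \exists\, o' \in O :\;
    \bigl(\forall i,\; u_i(o') \ge u_i(o^{*})\bigr)
    \wedge
    \bigl(\exists j,\; u_j(o') > u_j(o^{*})\bigr)
\]
% A value like "making sure my enemies suffer" is satisfied only by
% lowering some u_j, so outcomes optimized for it can typically be
% Pareto-improved by dropping that term -- hence it fails this test.
```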
If we are talking about any sort of “optimality”, we can’t expect even individual humans to hold these “optimal” values, much less humanity en masse. And it is of course futile to dream that our deus ex machina will impose those fantastic values on the world when 99% of us de facto disagree with them.
I’m not sure that’s what they mean. Perhaps it would be better to actually spell out the specific values you want implemented. But then of course people will disagree, including the actual humans who are trying to build AGI.