(If you can’t see why a single modern society locking in their current values would be a tragedy of enormous proportions, imagine an ancient civilization such as the Romans locking in their specific morals 2000 years ago. Moral progress is real, and important.)
This really doesn’t prove anything. That judgment shouldn’t be made by our values, but by the values of the ancient Romans.
Sure, of course the morality of the past looks like it gets better and better: it’s taking a random walk that ends up closer and closer to our own morality. Now, moral progress might still be real.
The place to look is inside our own value functions. If, after 1000 years of careful philosophical debate, humanity decided it was a great idea to eat babies, would you say, “Well, if you have done all that thinking, clearly you are wiser than me”? Or would you say, “Argh, no. Clearly something has broken in your philosophical debate”? That is part of your own meta-value function; the external world can’t tell you what to think here (unless you have a meta-meta-value function, but then you have to choose that for yourself).
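To gesture at what I mean by a meta-value function, here is a toy sketch (entirely my own illustrative framing; the hard lines and the trust-the-process criterion are placeholders, not anything from the original post): it scores proposed changes to your values rather than outcomes, and some conclusions get vetoed no matter how careful the process that produced them looks.

```python
# Toy sketch of a "meta value function" (illustrative only; the hard lines
# and criteria below are made-up placeholders).

# Conclusions I reject regardless of how much debate produced them.
HARD_LINES = {"eating babies is good"}

def endorse_value_change(new_conclusion: str, produced_by_careful_debate: bool) -> bool:
    """Decide whether to adopt a proposed change to my values."""
    if new_conclusion in HARD_LINES:
        # "Argh, no. Clearly something has broken in your philosophical debate."
        return False
    # Otherwise, defer somewhat to processes I trust.
    return produced_by_careful_debate

# 1000 years of careful debate does not rescue a hard-line violation...
print(endorse_value_change("eating babies is good", produced_by_careful_debate=True))   # False
# ...but it does move me on questions I hold more loosely.
print(endorse_value_change("abortion is really bad", produced_by_careful_debate=True))  # True
```

The point of the sketch is just the structure: the veto and the deference both live inside your own head, and nothing external can hand you the meta-meta level.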
It doesn’t help that human values seem to be inarticulate, half-formed intuitions, and that the things we call our values are often instrumental goals.
If, had ASI not been created, humans would have gone extinct from bioweapons and pandas would have evolved intelligence, is the extinction of humans and the rise of panda-centric morality just part of moral progress?
If aliens arrive, and offer to share their best philosophy with us, is the alien influence part of moral progress, or an external fact to be removed?
If advertisers basically learn to brainwash people into buying more product, is that part of moral progress?
Suppose that, had you not made the AI, Joe Bloggs would have made an AI 10 years later. Joe Bloggs would actually have succeeded at alignment, and would have imposed his personal whims on all humanity forever. If you are trying not to unduly influence the future, do you make everyone beholden to the whims of Joe, as they would be without your influence?
My personal CEV cares about fairness, human potential, moral progress, and humanity’s ability to choose its own future, rather than having a future imposed on them by a dictator. I’d guess that the difference between “we run CEV on Nate personally” and “we run CEV on humanity writ large” is nothing (e.g., because Nate-CEV decides to run humanity’s CEV), and if it’s not nothing then it’s probably minor.
Wait. The whole point of CEV is to get the AI to extrapolate what you would want if you were smarter and more informed. That is, the delta from your existing goals to your CEV should be unknowable to you, because if you know your destination you are already there. This list sounds like your object-level values. And they sound good, as judged by your (and my) object-level values.
I mean, there is a sense in which I agree that locking in, say, your favourite political party, or a particular view on abortion, is stupid. Well, I am not sure locking in a particular view on abortion would actually be bad; it would probably have next to no effect in a society of posthuman digital minds. These are things that are fairly clearly instrumental. If I learned that, after careful philosophical consideration and analysis of lots of developmental neurology data, people had decided abortion was really bad, I would take that seriously. They have probably realized a moral truth I do not know.
I think I have a current idea of what is right, with uncertainty bars. When philosophers come to an unexpected conclusion, it is some evidence that the conclusion is right, and also some evidence that the philosophers have gone mad.
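To make that last point slightly more concrete, here is a toy Bayesian sketch (all numbers are invented purely for illustration, and the hypotheses are treated as roughly exclusive for simplicity): observing philosophers endorse a surprising conclusion raises both the probability that the conclusion is true and the probability that their process broke.

```python
# Toy Bayes update: philosophers loudly endorse a surprising conclusion C.
# All numbers are made up for illustration only.

prior_c_true = 0.01          # prior that the surprising conclusion is right
prior_process_broken = 0.05  # prior that the philosophical process went off the rails

# How likely is the endorsement under each (roughly exclusive) hypothesis?
p_endorse_if_true = 0.8      # sound reasoning usually finds true conclusions
p_endorse_if_broken = 0.3    # a broken process sometimes lands on C too
p_endorse_if_neither = 0.001 # sound reasoning rarely endorses a false C

# Unnormalised posterior weights, then normalise.
w_true = prior_c_true * p_endorse_if_true
w_broken = prior_process_broken * p_endorse_if_broken
w_neither = (1 - prior_c_true - prior_process_broken) * p_endorse_if_neither
total = w_true + w_broken + w_neither

print(f"P(conclusion true | endorsement) ~ {w_true / total:.2f}")    # up from 0.01
print(f"P(process broken | endorsement)  ~ {w_broken / total:.2f}")  # up from 0.05
```

With these particular numbers the endorsement pushes both probabilities up, which is the sense in which one observation can be evidence for the conclusion and evidence that the philosophers have gone mad at the same time.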