That’s what I’m talking about when I speak of human object-level behavior differing quite a lot in the past compared to the present, and about a mesa-objective-aligned AI still potentially messing everything up because it’s being driven by biases and broken heuristics.
If you put people from 500 years ago in charge of the galaxy, they’d have screwed it up according to my standards.
Even if they were given a billion subjective years to try to reason out their “true” robust values, and were warned that they currently might be biased and wrong in all sorts of ways? I dunno, it seems plausible to me that they’d still be able to converge towards something like this.
And of course, an AGI should be in a somewhat better position than this anyway, inasmuch as it’d be more likely to have a concrete mesa-objective.
My answer is no, not because finding the one true morality is difficult, but because there is no objective morality or objective values, and neither values nor morality can be derived from facts. Or, put differently: as computing power and technology go to infinity, morality is divergent, not convergent.