Yes, preserving the existence of multiple good options that humans can choose between using their normal reasoning process sounds great. Which is why an AI that learns human values should learn that humans want the universe to be arranged in such a way.
I’m concerned that you seem to be saying that problems of agency are totally different from learning human values, and have to be solved in isolation. The opposite is true—preferring agency is a paradigmatic human value, and solving problems of agency should only be a small part of a more general solution.
Thanks for the comment. I agree broadly, of course, but the paper makes more specific claims. For example, agency needs to be prioritized, probably taken outside of standard optimization; otherwise, decimating pressure is applied to other concepts, including truth and other "human values". The other part is an empirical one, also related to your concern: human values are quite flexible, and biology doesn't create hard bounds or limits on depletion. If you couple that with ML/AI technologies that can predict what we will do next, then approaches that depend on human intent and values (broadly) are not as safe anymore.