What worries me about this tack is that I'm sufficiently clever to realize that in conducting a vast and complex research program to empirically test humanity and determine a globally reflectively consistent utility function, I will be changing humanity's utility trade-offs in the process.
So I might as well conduct my mass studies in a way that ensures the outcome is both correct and easier for me to act on during the second, much longer (essentially infinite) phase of my functioning.
Said AI would thus determine, and then forever follow, exactly what it concludes humanity's hidden utility function to be. But there is no guarantee that this is a particularly friendly scenario.