I agree with your reasoning.
[Thus] a machine could develop concepts of “good”, “ethical”, and “utility-maximizing” that are just as robust as mine, if not more so.
A corollary would be that figuring out human values is not enough to make a machine safe, at least not if you look at the NN stack alone.