It’s very nice to see you on LW! I think both your essay and Eliezer’s comments are very on point.
There are non-obvious ways to define a utility function for an AI. For example, you could “pass the buck” by giving the AI a mathematical description of a human upload and telling it to maximize the utility function that the upload would define, given enough time and resources to think. That’s Paul Christiano’s indirect normativity proposal. I think it fails for subtle reasons, but there might be other ways of defining what humans want by looking at the computational content of human brains and extrapolating it somehow (as in CEV), while keeping a guarantee that the extrapolation will talk about whatever world we actually live in. Basically it’s a huge research problem.
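To make the structure concrete, here’s a minimal sketch of what the “pass the buck” definition could look like formally (the symbols below are my own illustration, not Christiano’s exact construction). Suppose $D$ is a program encoding the upload’s deliberation, which, given a resource bound $t$, outputs a utility function $U_t : W \to \mathbb{R}$ over possible worlds. Then the AI is pointed at

$$U(w) = \lim_{t \to \infty} U_t(w)$$

and told to maximize expected $U$. The subtle difficulties live in whether that limit is well-defined and whether $W$ actually refers to the world we live in rather than some abstract model of it.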