Right. Many seem to assume that there is a causal chain: good → human desires → human evaluations. They are hoping both that if we do well according to human evaluations, we will be satisfying human desires, and that if we satisfy human desires, we will create a good world. I think both of those assumptions are questionable.
I like the analogy in which we consider an alternative world where AI researchers assumed, for whatever parochial reason, that it was actually human dreams that should guide AI behavior. In this world, they ask humans to write down their dreams and try to devise AIs that would make the world more like those dreams. There are two assumptions here: (1) that making the world more like human dreams would be good, and (2) that humans can correctly report their dreams. In the case of dreams, both of these assumptions are suspect, right? But what exactly is the difference with human desires? Why do we assume that they are a guide to what is good, or that they can be reported accurately?