This is way, way off. CEV isn’t a magic tool that makes people have preferences that we consider ‘sane’.
FAWS didn’t say that CEV would filter out what-we-consider-to-be Hitler’s insanity. After all, we may be largely insane, too. I take FAWS to be suggesting that CEV would filter out Hitler’s actual insanity, possibly leaving something essentially the same as what CEV gets after it filters out my insanity.
People really do have drastically different preferences.
People express different preferences, but it is not obvious that their CEV-ified preferences would be so different. (I’m inclined to expect that they would be, but it’s not obvious.)
After all, we may be largely insane, too. I take FAWS to be suggesting that CEV would filter out Hitler’s actual insanity, possibly leaving something essentially the same as what CEV gets after it filters out my insanity.
Possibly. And possibly CEV<Mortimer Q. Snodgrass> is a universe tiled with stabbing victims! There seems to be some irresistible temptation to assume that extrapolating the volition of individuals will lead to convergence. This is a useful social stance to have, and it is a mostly harmless belief in practical terms for nearly everyone. Yet for anyone who is considering the actual outcomes of agents executing coherent extrapolated volitions, it is dangerous.
People express different preferences, but it is not obvious that their CEV-ified preferences would be so different.
We are considering individuals of entirely different upbringing and culture, from (quite possibly) a different genetic pool, with clearly different drives and desires, and who by their very selection have an entirely different instinctive relationship with power and control. Sure, there are going to be similarities; relative to mindspace in general, extrapolated humans will be comparatively similar. We can expect most models of such extrapolated humans to each have a node for sexiness, even if the details of that node vary rather significantly. Yet assuming similarities much beyond that requires altogether too much mind projection.
If CEV<me> and CEV<Hitler> end up the same, then the difference between me and Hitler (such as whether we should kill Jews) is not relevant to the CEV output, which makes me very worried about its content.