Devote the second half to discussing the question of FAI, with references to e.g. Joshua Greene’s thesis and other relevant sources for establishing this argument.
(Now that this is struck out it might not matter, but) I wonder if, in addition to possibly overrating Greene’s significance as an exponent of moral irrealism, we don’t overrate the significance of moral realism as an obstacle to understanding FAI (“shiny pitchforks”). I would expect the academic target audience of this paper, especially the more technical subset, to be metaethically confused but not moral realists. Much more tentatively, I suspect that for the technical audience, more important than resolving metaethical confusion would be communicating the complexity of values: getting across “no, you really don’t just value intelligence or knowledge or ‘complexity’, you really don’t want humans replaced by arbitrary greater intelligences (or possibly replaced by anything; values can be path-dependent), and, while that would be true even if most superintelligences had complex values and built complex worlds, there’s reason to think most would produce dull monocultures.” (“Which Consequentialism? Machine Ethics and Moral Divergence” is closely related.)
But maybe I overstate the difference between that and metaethical confusion; both fall under “reasons to think we get a good outcome by default”, both are supported by intuitions against “arbitrary” values, and they probably have other psychology in common.