Maybe we don’t need to preserve all of the incompressible idiosyncrasies in human morality. Considering that individuals in the post-Singularity world will have many orders of magnitude more power than they do today, what really matters is which values best scale with power. Anything that scales logarithmically, for example, will be lost in the noise compared to values that scale linearly. Even if we can’t understand all of human morality, maybe we will be able to understand the most important parts.
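To make the scaling point concrete, here is a minimal sketch, assuming a toy model (not from the original argument) in which total value is the sum of a linearly scaling component and a logarithmically scaling one; the log component’s share of the total shrinks toward zero as power grows by orders of magnitude.

```python
import math

# Hypothetical toy model: total value is a linear component plus a logarithmic one.
def linear_component(power, a=1.0):
    return a * power

def log_component(power, b=1.0):
    return b * math.log(power)

# The log component's share of total value vanishes as power grows.
for power in (1e1, 1e3, 1e6, 1e12):
    lin = linear_component(power)
    log = log_component(power)
    share = log / (lin + log)
    print(f"power={power:.0e}  linear={lin:.3g}  log={log:.3g}  log share={share:.2e}")
```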
Just throwing away parts of one’s utility function seems bad. That can’t be optimal, right? Well, as Peter de Blanc pointed out, it can be, if there is no feasible alternative that improves expected utility under the original function. We should be willing to lose our unimportant values to avoid, or even just reduce, the probability of losing the most important ones. With CEV, we’re supposed to implement it with the help of an AI that’s not already Friendly, and if we don’t get it exactly right on the first try, we can’t preserve even our most important values. Given that we don’t know how to safely get an unFriendly AI to do anything, much less something this complicated, the probability of failure seems quite large.
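A rough numerical illustration of de Blanc’s point, with all numbers assumed purely for the sake of example: if deliberately dropping the unimportant values buys a sufficiently higher chance of preserving the important ones, the trade can come out ahead even when evaluated by the original utility function.

```python
# Hypothetical numbers, chosen only for illustration. Both strategies are
# evaluated under the ORIGINAL utility function, in which the most important
# values are worth far more than the unimportant ones.
U_IMPORTANT = 100.0   # utility from preserving the most important values
U_UNIMPORTANT = 1.0   # additional utility from also preserving the unimportant ones

# Strategy A: try to preserve everything, with a lower probability of success.
p_all = 0.5
ev_all = p_all * (U_IMPORTANT + U_UNIMPORTANT)

# Strategy B: deliberately give up the unimportant values in exchange for a
# higher probability of preserving the important ones.
p_important = 0.9
ev_important_only = p_important * U_IMPORTANT

print(f"EV(preserve everything)     = {ev_all:.1f}")
print(f"EV(preserve important only) = {ev_important_only:.1f}")
# Under these assumed probabilities, sacrificing the unimportant values yields
# higher expected utility under the original function.
```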