CEV takes more of an economic perspective, where agent-extrapolations make deals with each other. The “good” agent-extrapolations might win out in the end (due to having a more-timeless discount rate, say), but there might be a lot of suffering along the way. CFAI, on the other hand, takes a less deal-centric perspective, where the AI is more directly supposed to reason everything through from first principles; that can keep predictably-stupid-in-retrospect agents from getting much of the future’s pie, so to speak. So I’m more afraid of CEV-like thinking than of CFAI-like thinking, even though both are scary, because I’m more afraid of humans being evil than of not getting what I want. This may or may not overlap at all with your concerns.
(The difference isn’t necessarily whether or not they converge on the same policy, it might also be how quickly they converge on that policy. CFAI seems like it’d converge on justifiedness more quickly, but maybe not.)