Consider the epistemic state of someone who knows that they have the attention of a vastly greater intelligence than themselves, but doesn’t know whether that intelligence is Friendly. An even-slightly-wrong CAI will modify your utility function, and there’s nothing you can do but watch it happen.
An even-slightly-wrong CAI won’t modify your utility function because she isn’t wrong in that way. An even-slightly-wrong CAI does do several other bad things, but that isn’t one of them.