If there’s convergence in goals, then we don’t have to worry about making an AI with the wrong goals. If there’s only convergence in behavior, then we do, because building an AI with the wrong goals will shift the convergent behavior in the wrong direction. So I think it makes sense for Stuart’s paper to ignore acausal trading and just talk about whether there is convergence in goals.
Not necessarily: it might destroy the Earth before its goals converge.