So I agree it would be an advance, but you could solve inner alignment in the sense of avoiding mesa-optimizers, yet fail to solve it in the senses of predictable generalization or stability of generalization across in-lifetime learning.