I’d encourage you to delve more into this paragraph, as I think this is the part of your article where it becomes the most hand-wavy:
“In order to “really solve” outer alignment, you want the AI-optimization process to care about the generalization properties of the created AI beyond the training data. In order to “really solve” inner alignment, the created AI shouldn’t just care about the raw outputs of the process that created it, it should care about the things communicated by the AI-optimization process in its real-world context.”
I agree; I’d like a bit more detail and perhaps an example here.