The argument that we can focus only on the training data assumes that the AI system is not going to generalize well outside of the training dataset.
I’m not intending to make this assumption. The claim is: parts of your model that exhibit intelligence need to do something on the training distribution, because “optimize to perform well on the training distribution” is the only mechanism that makes the model intelligent.
That makes sense, and on rereading the post the transparency section is clearer now, thanks! If I had to guess what gave me the wrong impression before, it would be this part:
its behavior can only be intelligent when it is exercised on the training distribution
I suspect when I read this, I thought it implied “when it is not on the training distribution, its behavior cannot be intelligent”.
I also had trouble understanding that sub-clause. Maybe we read it in our heads with the wrong emphasis:
Meaning: The agent gets inputs that are within the training distribution ↔ the agent behaves intelligently.
But I guess it’s supposed to be:
Meaning: A behaviour is intelligent ↔ the behaviour was exercised during training on the training distribution.
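To make the two parses explicit, here is one way they could be formalized in first-order notation (the predicate names InDist, BehavesIntelligently, Intelligent, and Exercised are mine, chosen just for illustration, not from the post):

Misreading, quantifying over inputs $x$:
$$\forall x:\ \mathrm{InDist}(x) \leftrightarrow \mathrm{BehavesIntelligently}(x)$$

Intended reading, quantifying over behaviours $b$:
$$\forall b:\ \mathrm{Intelligent}(b) \leftrightarrow \mathrm{Exercised}(b, \mathcal{D}_{\mathrm{train}})$$

On the first parse, intelligence is tied to whether the current input happens to be in-distribution; on the second, a behaviour counts as intelligent in virtue of having been exercised and optimized on the training distribution, whatever input it later receives.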