One obvious response is “but what about adversarial examples?” My position is that image datasets are not rich enough for ML to learn the human-desired concepts: the concepts it does learn are predictive, just not concepts about the things we care about.
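(For concreteness, the kind of failure I have in mind: a tiny, gradient-based pixel perturbation that looks unchanged to a human but flips the classifier's prediction. Below is a minimal FGSM-style sketch, assuming PyTorch/torchvision; the model choice and function name are just illustrative, not anything specific to this discussion.)

```python
# Minimal FGSM-style adversarial perturbation sketch (assumes PyTorch + torchvision;
# names here are illustrative placeholders).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_perturb(image, label, epsilon=0.01):
    """Return a perturbed copy of `image` (shape [1, 3, H, W], values in [0, 1])."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Nudge each pixel slightly in the direction that increases the loss.
    # The change is imperceptible to a human, yet often flips the predicted
    # class: the learned concept is predictive on the training distribution,
    # but it is not the concept a human uses.
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()
```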
To clarify, are you saying that if we had a rich enough dataset, the concepts they learn would be things we care about? If so, what is this based on, and how rich of a dataset do you think we would need? If not, can you explain more what you mean?
In the images case, I meant that if you had a richer dataset with more images in more conditions, accompanied by touch-based information and perhaps even audio, and the agent were allowed to interact with the world and see through these input mechanisms what the world did in response, then it would learn concepts that allow it to understand the world the way we do—it wouldn’t be fooled by occlusions, or by putting a picture of a baseball on top of an ocean picture, etc. (This also requires a sufficiently large dataset; I don’t know how large.)
I’m not saying that such a dataset would lead it to learn what we value. I don’t know what that dataset would look like, partly because it’s not clear to me what exactly we value.