I think the main reason it seems like there hasn’t been much work on the data needed for IRL is that IRL is so far from modeling humans well (i.e. modeling humans how they want to be modeled) that most intermediate work doesn’t look like working with data-gathering approaches, it looks like figuring out better ways of learning a model of humans, which is a balance between reasoning about data and architecture.
Like, suppose you want to learn a model of the world (an inference function to turn sense data and actions into latent variables and predictions about those latents) and locate humans within that model. This is hard, and it probably requires specialized training data collected as human feedback! But we don’t know what data exactly without better ideas about what, precisely, we want the AI to be doing.
I agree that focusing too much on gathering data now would be a mistake. I believe thinking about data for IRL now is mostly valuable to identify challenges which make IRL hard. Then we can try to develop algorithms that solve these challenges or find out IRL is not a tractable solution for alignment.
I think the main reason it seems like there hasn’t been much work on the data needed for IRL is that IRL is so far from modeling humans well (i.e. modeling humans how they want to be modeled) that most intermediate work doesn’t look like working with data-gathering approaches, it looks like figuring out better ways of learning a model of humans, which is a balance between reasoning about data and architecture.
Like, suppose you want to learn a model of the world (an inference function to turn sense data and actions into latent variables and predictions about those latents) and locate humans within that model. This is hard, and it probably requires specialized training data collected as human feedback! But we don’t know what data exactly without better ideas about what, precisely, we want the AI to be doing.
I agree that focusing too much on gathering data now would be a mistake. I believe thinking about data for IRL now is mostly valuable to identify challenges which make IRL hard. Then we can try to develop algorithms that solve these challenges or find out IRL is not a tractable solution for alignment.