“Bad reporter” = any reporter that gives unambiguously bad answers in some situations (in the ontology identification case, basically anything other than a direct translator)
“use knowledge of direct translation” = it may be hard to learn direct translation because you need a bunch of parameters to specify how to do it, but these “bad” reporters may also need the same bunch of parameters (because they do direct translation in some situations)
In the “upstream” counterexample, the bad reporter does direct translation under many circumstances but then sometimes uses a different heuristic that generates a bad answer. So the model needs all the same parameters used for direct translation, as mentioned in the last point. (I think your understanding of this was roughly right.)
More like: now we’ve learned a reporter which contains what we want and also some bad stuff, you could imagine doing something like imitative generalization (or e.g. a different regularization scheme that jointly learned multiple reporters) in order to get just what we wanted.
“Bad reporter” = any reporter that gives unambiguously bad answers in some situations (in the ontology identification case, basically anything other than a direct translator)
“use knowledge of direct translation” = it may be hard to learn direct translation because you need a bunch of parameters to specify how to do it, but these “bad” reporters may also need the same bunch of parameters (because they do direct translation in some situations)
In the “upstream” counterexample, the bad reporter does direct translation under many circumstances but then sometimes uses a different heuristic that generates a bad answer. So the model needs all the same parameters used for direct translation, as mentioned in the last point. (I think your understanding of this was roughly right.)
More like: now we’ve learned a reporter which contains what we want and also some bad stuff, you could imagine doing something like imitative generalization (or e.g. a different regularization scheme that jointly learned multiple reporters) in order to get just what we wanted.
Thanks!