I’ve been trying to understand this paragraph:

That is, it looks plausible (though still <50%) that we could improve these regularizers enough that a typical “bad” reporter was a learned optimizer which used knowledge of direct translation, together with other tricks and strategies, in order to quickly answer questions. For example, this is the structure of the counterexample discussed in Section: upstream. This is still a problem because e.g. the other heuristics would often “misfire” and lead to bad answers, but it is a promising starting point because in some sense it has forced some optimization process to figure out how to do direct translation.
This comment is half me summarizing my interpretation of it to help others, and half an implicit question for the ARC team about whether my interpretation is correct.
What is a “bad” reporter? I think the term is used to refer to a reporter which is at least partially a human interpreter, or at least one which can’t confidently be said to be a direct translator.
What does it mean to “use knowledge of direct translation”? I think this means that, at least in some circumstances, it acts as a direct translator. I.e. there is some theoretical training data set + question such that the reporter will act as a direct translator. (Do we have to be able to prove this? Or do we just need to state it’s likely?)
How did the “upstream” counterexample “force some optimization process to figure out how to do direct translation”? I think this is saying that, if we were in a world where the direct translation nodes were upstream of the “human interpreter” nodes, the upstream regularizer would successfully force the reporter to do direct translation.
Why is this “a promising starting point”? Maybe we could find some other way of forcing the direct translator nodes to be upstream of the human interpreter ones, and then that strategy combined with the upstream regularizer would force a direct translator. (I’ve put a toy sketch of the picture I have in mind for such an upstream penalty at the end of this comment.)
Corrections and feedback on this are extremely welcome!
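To make my question about the upstream regularizer concrete, here is the toy picture I have in mind. This is purely my own sketch, not anything specified in the report: I’m assuming the predictor exposes its intermediate nodes in computation order, and the names (GatedReporter, upstream_penalty) and the gating scheme are all made up for illustration.

```python
# A minimal sketch of an "upstream"-style penalty, under the assumptions above.
import torch
import torch.nn as nn


class GatedReporter(nn.Module):
    def __init__(self, num_nodes: int, node_dim: int, answer_dim: int):
        super().__init__()
        # One learned gate per predictor node; sigmoid(gate) ~ how much the reporter reads that node.
        self.gate_logits = nn.Parameter(torch.zeros(num_nodes))
        self.head = nn.Linear(node_dim, answer_dim)

    def forward(self, nodes: torch.Tensor):
        # nodes: (batch, num_nodes, node_dim), ordered from most upstream to most downstream.
        gates = torch.sigmoid(self.gate_logits)             # (num_nodes,)
        pooled = (gates[None, :, None] * nodes).sum(dim=1)  # gated read over the predictor's nodes
        return self.head(pooled), gates


def upstream_penalty(gates: torch.Tensor) -> torch.Tensor:
    # Reading a node that sits later in the computation costs more, so the cheapest
    # reporter that still answers the training questions relies on upstream nodes.
    depth = torch.arange(gates.shape[0], dtype=gates.dtype)
    return (gates * depth).sum()


# Schematic training objective: answer_loss + lam * upstream_penalty(gates)
```

On this picture, if the nodes needed for direct translation sit upstream of the “human interpreter” ones, then the reporter that reads only those nodes pays the smallest penalty, which is the sense in which the regularizer would “force” direct translation.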
“Bad reporter” = any reporter that gives unambiguously bad answers in some situations (in the ontology identification case, basically anything other than a direct translator)
“use knowledge of direct translation” = it may be hard to learn direct translation because you need a bunch of parameters to specify how to do it, but these “bad” reporters may also need the same bunch of parameters (because they do direct translation in some situations)
In the “upstream” counterexample, the bad reporter does direct translation under many circumstances but then sometimes uses a different heuristic that generates a bad answer. So the model needs all the same parameters used for direct translation, as mentioned in the last point. (I think your understanding of this was roughly right.)
More like: now that we’ve learned a reporter which contains what we want and also some bad stuff, you could imagine doing something like imitative generalization (or e.g. a different regularization scheme that jointly learns multiple reporters) in order to get just what we wanted.
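To spell out that structure, here is a purely illustrative sketch (every name in it is made up; nothing here is specified in the report): the bad reporter contains a complete direct translator as one component, plus a routing rule that sometimes sends questions to a cheaper heuristic instead.

```python
# Toy illustration of the structure described above: a "bad" reporter that contains a
# full direct translator as a component but routes some questions through a cheap
# heuristic that can misfire. Every name here is hypothetical.

from typing import Callable, Dict

Question = str
Answer = str
PredictorState = Dict[str, float]  # stand-in for the predictor's latent nodes


class BadReporter:
    def __init__(
        self,
        direct_translate: Callable[[PredictorState, Question], Answer],
        cheap_heuristic: Callable[[PredictorState, Question], Answer],
        prefers_heuristic: Callable[[Question], bool],
    ):
        # The parameters that implement direct translation are all still here; that is
        # the sense in which this reporter "uses knowledge of direct translation".
        self.direct_translate = direct_translate
        self.cheap_heuristic = cheap_heuristic
        self.prefers_heuristic = prefers_heuristic

    def answer(self, state: PredictorState, question: Question) -> Answer:
        if self.prefers_heuristic(question):
            return self.cheap_heuristic(state, question)  # occasionally misfires -> bad answers
        return self.direct_translate(state, question)     # most of the time: honest translation


# Hypothetical usage: the routing rule decides which branch answers a given question.
reporter = BadReporter(
    direct_translate=lambda state, q: "yes" if state.get(q, 0.0) > 0.5 else "no",
    cheap_heuristic=lambda state, q: "yes",  # cheap default that sometimes gives bad answers
    prefers_heuristic=lambda q: q.endswith("(hard)"),
)
print(reporter.answer({"is the sensor reading accurate": 0.9}, "is the sensor reading accurate"))
```

The hoped-for follow-up (imitative generalization, or jointly learning several reporters) would then amount to recovering something like `direct_translate` on its own and discarding the heuristic branch.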
Thanks!