Would you consider this a valid counter to the third strategy (have humans adopt the optimal Bayes net using imitative generalization), as an alternative to ontology mismatch?
Counter: In the worst case, imitative generalization / learning the human prior is not competitive. In particular, it might just be harder for a model to match the human inference $y \mid x, Z$ than to simply learn $y \mid x$. Here $Z$ is the set of instructions, as in learning the prior (I think in the context of ELK, $Z$ would be the proposed change to the human Bayes net?).
Here are a few more questions about the same strategy:
If I understand correctly, the IG strategy is to learn a joint model for observations and actions, $p_\theta(v, a; Z)$, where $v$, $a$, and $Z$ are the video, the actions, and the proposed change to the Bayes net, respectively. Then we do inference using $p_\theta(v, a; Z^*)$, where $Z^*$ is optimized for predictive usefulness. (I've tried to make this concrete in the sketch after these questions.)
This fails because there's no easy way to get $P(\text{diamond is in the vault})$ from $p_\theta$.
A simple way around this would be to learn $p_\theta(v, a, y; Z)$ instead, where $y = 1$ if the diamond is in the vault and $y = 0$ otherwise.
Is my understanding correct?
If so, I would guess that my simple workaround doesn't count as a strategy because we can only use it to predict whether the diamond is in the vault (or to answer some other set of questions that must be fixed at training time), as opposed to any question we want answered. Is this correct? Is there some other reason it wouldn't count, or does it in fact count?
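To make sure I'm asking about the right setup, here is a minimal sketch of what I have in mind by "learn $p_\theta(v, a, y; Z)$ and optimize $Z$". Everything here is my own illustration, not ARC's or the IG post's actual proposal: I collapse the joint model to $p(a \mid v; Z)\, p(y \mid v, a; Z)$, represent $Z$ as a learned vector rather than human-readable instructions or an edited Bayes net, and proxy "predictive usefulness" with training log-likelihood.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointModel(nn.Module):
    """Toy stand-in for p_theta(v, a, y; Z).

    v: video features (float, shape [B, v_dim])
    a: discrete actions (long, shape [B])
    y: 1.0 if the diamond is in the vault, else 0.0 (float, shape [B])
    Z: crudely represented as a single learned vector of shape [1, z_dim]
    """

    def __init__(self, v_dim: int, n_actions: int, z_dim: int):
        super().__init__()
        self.n_actions = n_actions
        # models p(a | v; Z)
        self.action_head = nn.Sequential(
            nn.Linear(v_dim + z_dim, 64), nn.ReLU(), nn.Linear(64, n_actions)
        )
        # models P(y = 1 | v, a; Z) -- the "fixed question" workaround
        self.y_head = nn.Sequential(
            nn.Linear(v_dim + n_actions + z_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def log_prob(self, v, a, y, z):
        z = z.expand(v.shape[0], -1)
        # log p(a | v; Z)
        action_logits = self.action_head(torch.cat([v, z], dim=-1))
        logp = -F.cross_entropy(action_logits, a, reduction="none")
        # log p(y | v, a; Z); only the single fixed question y is covered
        a_onehot = F.one_hot(a, self.n_actions).float()
        y_logit = self.y_head(torch.cat([v, a_onehot, z], dim=-1)).squeeze(-1)
        logp = logp - F.binary_cross_entropy_with_logits(y_logit, y, reduction="none")
        return logp


def fit(model: JointModel, batches, z_dim: int, steps: int = 100):
    """Fit theta and Z jointly by maximum likelihood on (v, a, y) batches.

    The learned z plays the role of Z*; "optimized for predictive
    usefulness" is proxied here by training log-likelihood.
    """
    z = nn.Parameter(torch.zeros(1, z_dim))
    opt = torch.optim.Adam(list(model.parameters()) + [z], lr=1e-3)
    for _ in range(steps):
        for v, a, y in batches:
            loss = -model.log_prob(v, a, y, z).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return z.detach()  # Z*


# Toy usage with random data, just to show the intended shapes.
if __name__ == "__main__":
    v = torch.randn(32, 8)
    a = torch.randint(0, 4, (32,))
    y = torch.rand(32).round()
    model = JointModel(v_dim=8, n_actions=4, z_dim=16)
    z_star = fit(model, [(v, a, y)], z_dim=16, steps=10)
```

The `y_head` in this sketch is exactly the workaround above, and it also makes the worry behind my last question visible: it can only ever answer the one question it was given labels for at training time.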