In all of the counterexamples, the reporter starts from the v1, actions, and v2 predicted by the predictor. In order to answer questions it needs to infer the latent variables in the human’s model.
Originally we described a counterexample where it copied the human inference process.
In the improved counterexample, the reporter instead uses lots of computation to do the best inference it can, rather than copying the human’s mediocre inference. To make the counterexample fully precise we’d need to specify an inference algorithm and other details.
We still can’t do perfect inference though—there are some inference problems that just aren’t computationally feasible.
(That means there’s hope for creating data where the new human simulator does badly because of inference mistakes. And maybe if you are careful it will also be the case that the direct translator does better, because it effectively reuses the inference work done in the predictor? To get a proposal along these lines we’d need to describe a way to produce data that involves arbitrarily hard inference problems.)
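A minimal toy sketch of what this looks like (every name, structure choice, and probability below is invented purely for illustration; the actual human model and inference algorithm are left unspecified): the reporter treats the predictor’s v1, actions, and v2 as observations and computes a posterior over the latent variable, answering questions from that posterior. In the toy version exact enumeration is trivial, which is exactly what stops being feasible at realistic scale.

```python
# Toy "human Bayes net": a single binary latent L (say, "the diamond really is
# in the vault") plus observed nodes v1 (first video), a (action), v2 (second
# video). All of this is made up for illustration only.
P_L = {True: 0.5, False: 0.5}

def p_v1(v1, L):
    # The first video usually reflects the latent state.
    return 0.9 if v1 == L else 0.1

def p_v2(v2, L, a):
    # The second video usually shows the diamond iff it was there and no
    # tampering action was taken.
    return 0.8 if v2 == (L and not a) else 0.2

def posterior_L(v1, a, v2):
    """"Best inference it can": exact posterior over L by enumeration.

    This is only trivial because the toy net has one binary latent; with a
    realistically sized human model this sum is intractable, which is why even
    the improved human simulator still makes inference mistakes."""
    scores = {L: P_L[L] * p_v1(v1, L) * p_v2(v2, L, a) for L in (True, False)}
    z = sum(scores.values())
    return {L: s / z for L, s in scores.items()}

# The reporter starts from the predictor's (v1, actions, v2) and answers
# "is the diamond there?" by reporting the inferred posterior.
print(posterior_L(v1=True, a=False, v2=True))
```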
Ah ok, thank you. Now I get it. I was confused by (i) “Imagine the reporter could do perfect inference” and (ii) “the reporter could simply do the best inference it can in the human Bayes net (given its predicted video)”.
(i) I read this as saying that the reporter alone can do it, when what is actually meant is that it can do it by making use of the predictor model.
(ii) Somehow I thought that “given its predicted video” was the important modification here, when in fact the only change is from the reporter being able to do perfect inference to it doing the best inference that it can.