I do see why the approach I mention might have some intrinsic limitations in its ability to elicit latent knowledge, though. The problem is that even if the model roughly understands that it has an incentive to use most of what it knows when we ask it to simulate the prediction of someone with its own characteristics (or a 1400 IQ), ELK demands a global maximum (we want it to use ALL of its knowledge), so there is always uncertainty about whether it really grasped that point for extreme levels of intelligence / extreme examples, or whether it instead fits the training data as closely as possible and therefore still withholds something it knows.
I think this is roughly right, but to try to be more precise, I’d say the counterexample is this:
Consider the Bayes net that represents the upper bound of all the understanding of the world you could extract by doing all the tricks described (P vs NP, generalizing from less smart to more smart humans, etc).
Imagine that the AI does inference in that Bayes net.
However, the predictor’s Bayes net (which was created by a different process) still has latent knowledge that this Bayes net lacks.
By conjecture, we could not have possibly constructed a training data point that distinguished between doing inference on the upper-bound Bayes net and doing direct translation.
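To make the shape of that counterexample concrete, here is a toy Python sketch. It is only an illustration under assumed names: `upper_bound_inference`, `direct_translation`, and the tampering setup are hypothetical stand-ins, not anything from the report. The point it tries to show is just that two reporters can agree on every training point (so no training data point distinguishes them) while diverging exactly where the predictor's extra latent knowledge matters.

```python
import random

def upper_bound_inference(observation):
    # Answers using only concepts reachable in the "upper bound" Bayes net,
    # i.e. the best understanding extractable via the tricks described above.
    return observation["camera_shows_diamond"]

def direct_translation(observation):
    # Answers using the predictor's extra latent knowledge: whether the camera
    # was tampered with, a concept absent from the upper-bound net.
    return observation["camera_shows_diamond"] and not observation["tampered"]

def sample(training=True):
    # Assumption: tampering never occurs on the training distribution,
    # so the extra latent variable is inert there.
    tampered = False if training else random.random() < 0.5
    camera_shows_diamond = True if tampered else random.random() < 0.8
    return {"camera_shows_diamond": camera_shows_diamond, "tampered": tampered}

# On every training point the two reporters agree, so the loss cannot
# reward direct translation over inference in the upper-bound net.
train = [sample(training=True) for _ in range(1000)]
assert all(upper_bound_inference(o) == direct_translation(o) for o in train)

# Off-distribution, where the predictor's latent knowledge matters, they diverge.
test = [sample(training=False) for _ in range(1000)]
print(sum(upper_bound_inference(o) != direct_translation(o) for o in test),
      "disagreements off-distribution")
```

Of course this collapses the Bayes nets into two hand-written functions; the real difficulty is that by conjecture you cannot engineer a labeled training point that plays the role the off-distribution samples play here.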