Why is this your central complaint about existing theoretical work?
Sorry, I meant that this was my central complaint about existing theoretical work that tries to explain neural net generalization. (I was mostly thinking of work outside of the alignment community.) I wasn’t trying to make a claim about all theoretical work.
It’s my central complaint because we ~know that such an assumption about the data is necessary (since the same neural net that generalizes well on real MNIST can also memorize a randomly labeled MNIST, where it will obviously fail to generalize).
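(For concreteness, the experiment I have in mind is roughly the following sketch, assuming PyTorch/torchvision; the architecture, epoch count, and other hyperparameters here are purely illustrative, not taken from any particular paper.)

```python
# Minimal sketch of the random-label experiment (assumes torch + torchvision;
# the architecture and hyperparameters are illustrative, not from any paper).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


def make_loader(randomize_labels: bool) -> DataLoader:
    ds = datasets.MNIST("./data", train=True, download=True,
                        transform=transforms.ToTensor())
    if randomize_labels:
        # Replace every label with a uniformly random class, so any fit to
        # this training set is pure memorization and cannot generalize.
        ds.targets = torch.randint(0, 10, ds.targets.shape)
    return DataLoader(ds, batch_size=128, shuffle=True)


def train_and_report(loader: DataLoader, epochs: int = 20) -> float:
    model = nn.Sequential(nn.Flatten(), nn.Linear(784, 512), nn.ReLU(),
                          nn.Linear(512, 10))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            nn.functional.cross_entropy(model(x), y).backward()
            opt.step()
    with torch.no_grad():
        correct = sum((model(x).argmax(dim=1) == y).sum().item()
                      for x, y in loader)
    return correct / len(loader.dataset)  # final training accuracy


print("true labels:  ", train_and_report(make_loader(randomize_labels=False)))
print("random labels:", train_and_report(make_loader(randomize_labels=True)))
```

With enough capacity and training, the random-label run also drives training accuracy toward 100%, even though its test accuracy can only be at chance; that is exactly why any generalization argument that ignores the data distribution has to be vacuous.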
InfraBayes answers this question by observing that although we can’t easily use Solomonoff-like models of the whole universe, there are many patterns we can take advantage of that can be articulated with partial models.
I feel pretty convinced by this :) In particular the assumption on the real world could be something like “there exists a partial model that describes the real world well enough that we can prove a regret bound that is not vacuous” or something like that. And I agree this seems like a reasonable assumption.
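To gesture at the shape of the guarantee I have in mind (my notation, not lifted from the InfraBayes posts): if $\mathcal{M}$ is the class of partial models the agent entertains and some $M^* \in \mathcal{M}$ correctly constrains the real environment, the hoped-for statement is something like

$$\mathbb{E}\!\left[\frac{1}{T}\sum_{t=1}^{T} r_t\right] \;\ge\; V(M^*) - o(1) \quad \text{as } T \to \infty,$$

i.e. the agent’s average reward eventually matches the value $V(M^*)$ that the partial model guarantees, so the regret relative to $V(M^*)$ is sublinear in $T$. That bound is non-vacuous exactly when the real world actually satisfies $M^*$, which is the assumption above.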
Even if in some sense InfraBayes (or some other theory) turns out to explain the success of NNs, that does not actually imply it’ll give rise to something competitive with NNs.
Tbc I would see this as a success.
In other words, I had thought that you had (quite reasonably!) given up on learning theory because its results didn’t seem relevant. I had hoped to rekindle your interest by pointing out that we can now do much better than 90s-era learning theory, in ways that seem relevant for e.g. objective robustness.
I am interested! I listed it as one of the topics I saw as allowing us to make claims about objective robustness. I’m just saying that the current work doesn’t seem to be making much progress (I agree now though that InfraBayes is plausibly on a path where it could eventually help).
It would be surprising if I told you that some genetic algorithm found a billion-bit program that described the data perfectly and then generalized well. It would be much less surprising if I told you that this billion-bit program was actually a mixture model that had been initialized randomly and then tuned by the genetic algorithm.
Fwiw I don’t feel the force of this intuition; they seem about equally surprising to me (but I agree with you that it doesn’t seem cruxy).
Great, I feel pretty resolved about this conversation now.