I’d love to see someone explain what the priors of neural networks are (I think it’s called “minimal circuit” or something like that) compared and contrasted with e.g. the Solomonoff prior, or the Levin prior. It would answer questions like these:
--What is the neural network prior? (My tentative, shitty answer: It’s the probability distribution over types of circuits that tells you how likely each type is to occur, were you to randomly sample parameter configurations.)
--How is it different from the Solomonoff prior? (My TSA: The Solomonoff prior is over types of programs rather than types of circuits. This isn’t by itself a big deal, because circuits are programs and programs can be approximated by circuits. More importantly, there are many circuits that get high NN prior but low Solomonoff prior, and vice versa. In particular, the Solomonoff prior doesn’t penalize programs for running for a very long time; it instead entirely emphasizes minimum description length.)
--Why does this matter? (My TSA: As a first approximation, we should think of SGD on big neural networks as selecting the highest-prior circuit that scores perfectly on the training data. It’s like making a posterior by conditionalizing the prior on some data. Mingard et al would argue that this is more than just an approximation, though lots of people disagree with them and think it’s more complicated than that. This has important implications for inner alignment issues, see e.g. Paul’s stuff.)
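To make the "randomly sample parameter configurations" and "conditionalize the prior on data" pictures concrete, here's a minimal toy sketch (my own illustrative setup, not anything from the literature): a tiny 3-input tanh MLP with Gaussian-sampled weights. Repeatedly sampling parameters gives an empirical "NN prior" over the 256 boolean functions on 3 bits, and keeping only the sampled functions that fit some training labels gives the conditioned posterior. The architecture, weight distribution, and training labels are all arbitrary choices for illustration.

```python
import itertools
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)

# All 8 inputs for 3 boolean variables, in lexicographic order.
X = np.array(list(itertools.product([0.0, 1.0], repeat=3)))

def random_net():
    """Sample one parameter configuration of a tiny 3-4-1 MLP."""
    W1 = rng.normal(size=(3, 4))
    b1 = rng.normal(size=4)
    W2 = rng.normal(size=(4, 1))
    b2 = rng.normal(size=1)
    return W1, b1, W2, b2

def boolean_function(params):
    """The boolean function (tuple of 8 output bits) this net computes."""
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    return tuple((out.ravel() > 0).astype(int))

# Empirical "NN prior": how often each function shows up when we
# randomly sample parameter configurations.
prior = Counter(boolean_function(random_net()) for _ in range(20000))

# For inspection: the distribution is far from uniform over the 256
# possible functions; a few simple functions soak up most of the mass.
top3 = prior.most_common(3)

# "Conditionalizing the prior on some data": keep only sampled functions
# consistent with labels on the first 4 inputs (rejection sampling).
train_idx = range(4)
target = (0, 0, 1, 1)  # arbitrary training labels for illustration
posterior = Counter(
    {f: n for f, n in prior.items()
     if tuple(f[i] for i in train_idx) == target}
)
```

Under this picture, the posterior's most common entry is the analogue of "the highest-prior circuit that scores perfectly on the training data," which is the role the TSA above assigns to SGD.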