Since neural networks are universal function approximators, it is indeed the case that some of them will implement specific search algorithms.
I don’t think this specific point is true. It seems to me that the difference between functions and algorithms is important here. You can also approximate any function with a sufficiently large look-up table, but simply using a look-up table to choose actions doesn’t involve search/planning.* In this regard, a feedforward neural network with frozen weights doesn’t seem importantly different from a look-up table to me.
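To make the function-vs-algorithm distinction concrete, here is a minimal sketch of my own (a toy gridworld, not anything from the post or the paper) in which two policies have identical input-output behavior, but only one of them runs a search procedure at decision time:

```python
# Toy illustration (my own example): two policies with the same input-output
# behavior, only one of which performs search when choosing an action.
from collections import deque

GOAL = (2, 2)
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def search_policy(state):
    """Breadth-first search over future states: an explicit search process."""
    frontier = deque([(state, None)])  # (state, first action taken from the root)
    visited = {state}
    while frontier:
        (x, y), first_action = frontier.popleft()
        if (x, y) == GOAL:
            return first_action
        for action, (dx, dy) in MOVES.items():
            nxt = (x + dx, y + dy)
            if nxt not in visited and 0 <= nxt[0] <= 2 and 0 <= nxt[1] <= 2:
                visited.add(nxt)
                frontier.append((nxt, first_action or action))
    return None

# A look-up table with the same input-output behavior, built ahead of time.
# At decision time it just reads off an answer; no search happens.
lookup_policy = {(x, y): search_policy((x, y)) for x in range(3) for y in range(3)}

state = (0, 0)
assert lookup_policy[state] == search_policy(state)  # same function...
# ...but only search_policy does anything like planning when it is called.
```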
One naive perspective: systems like AlphaGo and MuZero do search because they implement Monte Carlo tree search (MCTS), but if you removed their MCTS components, they simply wouldn’t do search. Search algorithms can be used to update the weights of neural networks, but the networks themselves don’t do search.
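To illustrate this naive perspective, here is a toy sketch under assumptions of my own (random weights and made-up dynamics, nothing like the actual AlphaGo/MuZero code): the search lives in an outer loop wrapped around the network, and removing that loop leaves a single fixed forward pass.

```python
# Toy sketch (my own assumptions, not AlphaGo/MuZero): the search is in the
# wrapper around the network, not in the network's forward pass.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))  # frozen weights

def frozen_net(state):
    """A frozen feedforward net: one fixed pass, no iteration over futures."""
    return np.tanh(state @ W1) @ W2  # scores for 3 candidate actions

def transition(state, action):
    """Made-up dynamics, used only so the search has something to expand."""
    shift = np.zeros(4)
    shift[action] = 1.0
    return 0.9 * state + 0.1 * shift

def act_without_search(state):
    # "Remove the MCTS component": just read off the net's output.
    return int(np.argmax(frozen_net(state)))

def act_with_search(state, depth=2):
    # A (very) small stand-in for MCTS: explicitly enumerate future states,
    # using the same frozen net only as a leaf evaluator.
    def value(s, d):
        if d == 0:
            return float(frozen_net(s).max())
        return max(value(transition(s, a), d - 1) for a in range(3))
    return max(range(3), key=lambda a: value(transition(state, a), depth - 1))

s0 = rng.normal(size=4)
print(act_without_search(s0), act_with_search(s0))  # the two may disagree
```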
I think this naive perspective may be wrong, because it’s possible that recurrence is sufficient for search/planning processes to emerge (e.g. see this paper). But then, if that’s true, I think that the power of recurrence is the important thing to emphasize, rather than the fact that neural networks are universal function approximators.
*I’m thinking of search algorithms as cognitive processes, rather than input-output behaviors (which could be produced by a wide range of possible algorithms). If you’re thinking of them as behaviors, then my point no longer holds. That said, I’ve interpreted the mesa-optimization paper (and most other discussions of mesa-optimization) as talking about cognitive processes.
(I somehow didn’t notice your comment until now.) I believe you are correct. The universal approximation theorem I’m familiar with also relies on brute force (i.e., arbitrarily large networks) in its proof, so it doesn’t seem like evidence for the existence of [weights that implement algorithms].
(And I am definitely not talking about algorithms in terms of input/output behavior.)
I’ve changed the paragraph to:
You might wonder what the space of all models looks like. The typical answer is that the possible models are sets of weights for a neural network. The problem exists insofar as some sets of weights implement specific search algorithms.
Anyone who knows of alternative evidence I can point to here is welcome to reply to this comment.