Ben Livengood answers Is “Recursive Self-Improvement” Relevant in the Deep Learning Paradigm?

Ben Livengood 6 Apr 2023 16:28 UTC
9 points
4
I think it’s premature to conclude that AGI progress will come from large pre-trained transformers indefinitely into the future. They are surprisingly(?) effective, but they are not as effective in the narrow domains where AlphaZero and AlphaStar use value and action networks (paired with Monte-Carlo tree search in AlphaZero’s case) with orders of magnitude fewer parameters. We don’t know what MCTS on arbitrary domains will look like with 2-4 OOM-larger networks, which are within reach now. We also haven’t formulated methods of self-play improvement for LLMs, and I think that is a potentially large overhang too.

There’s also a human limit to the types of RSI we can imagine, and once pre-trained transformers exceed human intelligence in the domain of machine learning, those limits won’t apply. I think there’s probably significant overhang in prompt engineering, especially when new capabilities emerge from scaling, that could be exploited by removing the serial bottleneck of humans trying out prompts by hand.

Finally, I don’t think GOFAI is dead; it’s still in its long winter, waiting to bloom when enough intelligence is put into it. We don’t know the intelligence/capability threshold necessary to make substantial progress there. Generally, the bottleneck has been identifying useful mappings from the real world to mathematics and algorithms. Humans are pretty good at that, but we stalled at formalizing effective general intelligence itself. Our abstraction/modeling abilities, working memory, and time are too limited, and we have no idea where those limits come from, whether LLMs are subject to the same or similar limits, or how the limits are reduced or removed with model scaling.
- DragonGod 6 Apr 2023 19:00 UTC
  4 points
  0
  1. MCTS seems difficult in “rich” environments (complex, high-dimensional, continuous, stochastic, with large state/action spaces), e.g. the real world?
  2. My conclusion was that AGI progress would be deep learning based into the indefinite future, not pretrained transformers specifically.
  - Ben Livengood 6 Apr 2023 22:56 UTC
    3 points
    0
    Naive MCTS in the real world does seem difficult to me, but action networks, for example, constrain the actual search significantly. Imagine a value network that is good at judging whether solutions work (perhaps by executing generated code and evaluating the output), with a plain old LLM plugged in as the action network; it could theoretically explore the large solution space better than beam search or argmax/temperature sampling[0].
    
    0: https://openreview.net/forum?id=Lr8cOOtYbfL is from February and I found it after writing this comment, figuring someone else probably had the same idea.
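
To make the "value network plus LLM as action network" idea a bit more concrete, here is a toy sketch. It is not MCTS and not anything from the linked paper: it uses plain best-first search, which is simpler, and both "networks" are hypothetical stubs. `propose` stands in for an LLM suggesting next steps, and `value` stands in for a learned evaluator by actually executing the candidate program and scoring its output. The op set, the target-number task, and all function names are assumptions made up for illustration.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

Program = Tuple[str, ...]

# Toy "language": each op transforms an integer. Purely illustrative.
OPS: Dict[str, Callable[[int], int]] = {
    "+1": lambda x: x + 1,
    "*2": lambda x: x * 2,
    "*3": lambda x: x * 3,
}


def propose(program: Program) -> List[str]:
    """Stub action network (assumption): a real system would sample
    candidate continuations from an LLM conditioned on `program`."""
    return list(OPS)


def execute(program: Program, start: int = 1) -> int:
    """Run the candidate program on the starting value."""
    x = start
    for op in program:
        x = OPS[op](x)
    return x


def value(program: Program, target: int) -> int:
    """Stub value network (assumption): execute the candidate and score
    the output. 0 means the output matches the target exactly."""
    return -abs(execute(program) - target)


def best_first_search(
    target: int, max_len: int = 8, max_expansions: int = 10_000
) -> Optional[Program]:
    """Expand the most promising partial program first, as ranked by `value`."""
    # Heap entries are (cost, tie_breaker, program); lower cost is better.
    frontier: List[Tuple[int, int, Program]] = [(-value((), target), 0, ())]
    tie_breaker = 1
    expansions = 0
    while frontier and expansions < max_expansions:
        cost, _, program = heapq.heappop(frontier)
        if cost == 0:
            return program  # output equals the target
        expansions += 1
        if len(program) >= max_len:
            continue  # depth limit reached, do not extend further
        for op in propose(program):
            child = program + (op,)
            heapq.heappush(frontier, (-value(child, target), tie_breaker, child))
            tie_breaker += 1
    return None  # budget exhausted without finding a solution


if __name__ == "__main__":
    solution = best_first_search(20)
    print(solution, execute(solution) if solution is not None else None)
```

The point of the sketch is only the division of labor: the proposer narrows which branches exist at all, and the executing evaluator decides which branch to expand next, so the search never has to enumerate the full space. Swapping the stubs for an LLM and a learned or test-based scorer, and the priority queue for real MCTS, is where all the hard parts live.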