Noosphere89 comments on My decomposition of the alignment problem

Noosphere89 9 Sep 2024 17:46 UTC
5 points
0
I'll note that General Purpose Search can almost be reduced to learning, in that most of what you want General Purpose Search to do can also be done by learning, though I do think General Purpose Search will at least serve as the foundation (the bootstrap) for learning:

https://x.com/nc_znc/status/1532040663302381568

https://x.com/andy_l_jones/status/1532048580747309056
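
To make the "bootstrapping" point concrete, here is a minimal sketch of search producing the training data that a learner then replays without searching. Everything in it is an assumption for illustration: the toy task (reach `TARGET` using `inc` and `dbl`), the names `plan` and `act_without_search`, and the use of a plain lookup table as a stand-in for a trained model (a table memorizes rather than generalizes, so this only shows the data flow, not real learning).

```python
from collections import deque
from typing import Dict, List

# Hypothetical toy task: reach TARGET from a positive integer using +1 or *2.
TARGET = 37
ACTIONS = {"inc": lambda n: n + 1, "dbl": lambda n: n * 2}


def plan(start: int, target: int) -> List[str]:
    """Breadth-first search: return a shortest action sequence to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == target:
            return actions
        for name, step in ACTIONS.items():
            nxt = step(state)
            # Overshooting can never be undone, so prune states above target.
            if nxt <= target and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [name]))
    raise ValueError("target unreachable")


# "Training": search labels every state with the first action of a shortest plan.
policy: Dict[int, str] = {s: plan(s, TARGET)[0] for s in range(1, TARGET)}


def act_without_search(start: int) -> List[str]:
    """Roll out the distilled policy: one table lookup per step, no search."""
    state, steps = start, []
    while state != TARGET:
        name = policy[state]
        state = ACTIONS[name](state)
        steps.append(name)
    return steps
```

On this task the rollout takes exactly as many steps as the search-based plan, but only because search was available to label every state beforehand.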
- Daniel C 10 Sep 2024 13:45 UTC
  5 points
  0
  Interesting! I have to read the papers in more depth but here are some of my initial reactions to that idea (let me know if it’s been addressed already):
  - AFAICT, using learning to replace General Purpose Search (GPS) requires either (1) training examples of good actions or (2) an environment like chess where we can get rapid feedback through simulation. When these assumptions break down, sampling from the environment becomes much more costly, and general purpose search can achieve lower sample complexity because it gets to use all the information in the world model
  - General purpose search requires certain properties of the world model that seem to be missing in current models. For instance, decomposing goals into subgoals is important for dealing with a high-dimensional action space, and that requires a high degree of modularity in the world model. Lazy world-modeling also seems important for planning in a world larger than yourself, but most of these properties aren’t present in the toy environments we use
  - Learning can be a component of general purpose search (e.g. as a general-purpose generator of heuristics): we can learn to reorder the search over actions so that more effective actions are tried first
  - I think using a fixed number of forward passes to approximate GPS will eventually hit limits in environments that are complex enough, because the space of programs that can dedicate potentially unlimited time to finding solutions is strictly more expressive than the space of programs that have a fixed inference time
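
  The last two points can be illustrated with a small sketch, under loud assumptions: the grid world, the function names (`search`, `grid_neighbors`, `manhattan_to`) and the `max_expansions` parameter are all hypothetical, the Manhattan-distance score is a hand-written stand-in for a learned heuristic, and the expansion budget is only a loose analogue of a fixed inference time. The same search loop runs with a constant score (which degenerates to breadth-first order) and with the heuristic score (which tries promising states first).

```python
import heapq
from typing import Callable, List, Optional, Tuple

State = Tuple[int, int]


def grid_neighbors(size: int) -> Callable[[State], List[State]]:
    """Return a 4-connected neighbor function for an open size x size grid."""
    def neighbors(s: State) -> List[State]:
        x, y = s
        cand = [(x + 1, y), (x, y + 1), (x - 1, y), (x, y - 1)]
        return [(a, b) for a, b in cand if 0 <= a < size and 0 <= b < size]
    return neighbors


def manhattan_to(goal: State) -> Callable[[State], int]:
    """Placeholder for a learned heuristic: lower score means 'try sooner'."""
    return lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1])


def search(start: State, goal: State,
           neighbors: Callable[[State], List[State]],
           score: Callable[[State], int],
           max_expansions: Optional[int] = None):
    """Best-first search ordered by score; returns (path or None, expansions)."""
    frontier = [(score(start), 0, start, [start])]
    seen = {start}
    pushed = 1  # insertion counter: breaks score ties in first-in-first-out order
    expansions = 0
    while frontier:
        if max_expansions is not None and expansions >= max_expansions:
            return None, expansions  # compute budget exhausted
        _, _, state, path = heapq.heappop(frontier)
        expansions += 1
        if state == goal:
            return path, expansions
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (score(nxt), pushed, nxt, path + [nxt]))
                pushed += 1
    return None, expansions


# Same problem, two orderings: constant score vs. heuristic score.
goal = (9, 9)
blind_path, blind_n = search((0, 0), goal, grid_neighbors(10), lambda s: 0)
guided_path, guided_n = search((0, 0), goal, grid_neighbors(10), manhattan_to(goal))

# A fixed budget that suffices on the small grid fails on a larger one.
big_goal = (39, 39)
capped_path, _ = search((0, 0), big_goal, grid_neighbors(40),
                        manhattan_to(big_goal), max_expansions=50)
uncapped_path, _ = search((0, 0), big_goal, grid_neighbors(40),
                          manhattan_to(big_goal))
```

  On the open 10x10 grid both orderings find a path of the same length, but the guided run expands far fewer states. On the 40x40 grid the capped run returns no path while the uncapped run succeeds, which is the toy version of the claim that any fixed budget is eventually exceeded as the environment grows.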
  - Noosphere89 10 Sep 2024 13:48 UTC
    5 points
    3
    Agreed: learning can’t entirely replace General Purpose Search, and something like General Purpose Search will in practice still be the backbone behind learning, for the reasons you give.
    
    That is, General Purpose Search will still be necessary for AIs, if only due to bootstrapping concerns, and I agree with your list of benefits of General Purpose Search.