One way in which “spending a whole lot of time working with a system / idea / domain, and getting to know it and understand it and manipulate it better and better over the course of time” could be solved automatically is just by having a truly huge context window. Example of an experiment: teach a particular branch of math to an LLM that has never seen that branch of math.
Maybe humans just have the equivalent of a sort of huge context window spanning selected stuff from their entire lifetimes, and so this kind of learning is possible for them.
I don’t think it is sensible to model humans as “just the equivalent of a sort of huge context window”, because this is not a particularly good computational model of how human learning and memory work. But I do think the technology behind the increasing context size of modern AIs contributes to them having a small but nonzero amount of the thing Steven is pointing at, due to the spontaneous emergence of in-context learning algorithms.
You also have a simple algorithmic problem. Humans learn by replacing bad policy with good: a baby replaces “policy that drops objects picked up” with “policy that usually results in object retention”.
This is because, at a mechanistic level, the baby tries many times to pick up and retain objects, and within a fixed amount of circuitry in its brain the connections that led to a drop are down-weighted while the ones that led to retention are reinforced.
This means that as the baby learns, the compute cost of motor manipulation stays constant over time. Technically O(1) in the number of past attempts, though that's a bit of a confusing way to express it.
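To make the constant-cost point concrete, here is a minimal toy sketch in Python. It is not a model of actual neural circuitry; the action names, success probabilities, and learning rate are all made up. The point is only that the history of attempts lives in a fixed set of weights, so each new attempt costs the same no matter how many came before.

```python
import random

# Toy "grasping policy": a fixed set of weights over two motor programs.
# Each attempt costs the same amount of compute regardless of how many
# attempts came before -- the history lives in the weights, not in a log.
weights = {"loose_grip": 0.0, "firm_grip": 0.0}
LEARNING_RATE = 0.1

def attempt_pickup():
    # Pick the currently highest-weighted motor program (ties broken randomly).
    action = max(weights, key=lambda a: (weights[a], random.random()))
    retained = random.random() < (0.2 if action == "loose_grip" else 0.8)
    # Reinforce on success, down-weight on a drop; O(1) work per attempt.
    weights[action] += LEARNING_RATE * (1.0 if retained else -1.0)
    return action, retained

for _ in range(1000):
    attempt_pickup()
print(weights)  # firm_grip ends up dominant; per-attempt cost never grew
```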
With in-context-window learning, you can instead imagine an LLM+robot recording:
Robotic token string: <string of robotic policy tokens 1> : outcome, drop
Robotic token string: <string of robotic policy tokens 2> : outcome, retain
Robotic token string: <string of robotic policy tokens 3> : outcome, drop
And so on, extending until it consumes the machine's entire context window, and every time the machine decides which tokens to emit next it needs O(n log n) compute to consider all n tokens in the window. (It used to be O(n²); this is a huge advance.)
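Here is a toy sketch of that failure mode, with made-up token strings and an abstract stand-in for attention cost; the point is only that the per-decision cost grows with the length of the transcript, whatever the exact exponent.

```python
# Toy sketch of the in-context version: every outcome is appended to a
# transcript, and each new decision has to attend over the whole thing.
transcript = []  # grows without bound; nothing is ever distilled into weights

def record_outcome(policy_tokens, outcome):
    transcript.append((policy_tokens, outcome))

def decide_next_action():
    # Stand-in for attention over the context window: cost grows with
    # len(transcript), whether that growth is O(n^2), O(n log n), or O(n).
    cost = len(transcript)
    successes = [p for p, o in transcript if o == "retain"]
    return (successes[-1] if successes else "random exploration"), cost

record_outcome("<policy tokens 1>", "drop")
record_outcome("<policy tokens 2>", "retain")
record_outcome("<policy tokens 3>", "drop")
_, cost = decide_next_action()
print(cost)  # per-decision cost keeps climbing as the transcript grows
```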
This does not scale. You will not get capable or dangerous AI this way. Obviously you need to compress that linear list of outcomes from different strategies into an update to the underlying network that generated them, so that it becomes more likely to output tokens that result in success.
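A toy sketch of that compression step, using a REINFORCE-style softmax update as an illustrative stand-in (the policy indices, reward values, and learning rate are invented, not anything from a real system): the batch of outcomes is folded into the weights and can then be discarded.

```python
import numpy as np

# Toy "compression" step: fold a batch of (policy, outcome) records into the
# weights of the generating network, then throw the records away.
rng = np.random.default_rng(0)
logits = rng.normal(size=3)          # scores for 3 candidate policies
LEARNING_RATE = 0.5

def update_from_outcomes(records):
    """REINFORCE-style sketch: raise the probability of policies that
    retained the object, lower it for policies that dropped it."""
    global logits
    for policy_idx, outcome in records:
        reward = 1.0 if outcome == "retain" else -1.0
        probs = np.exp(logits) / np.exp(logits).sum()
        grad = -probs
        grad[policy_idx] += 1.0          # d log p(policy) / d logits
        logits = logits + LEARNING_RATE * reward * grad

records = [(0, "drop"), (1, "retain"), (2, "drop")]
for _ in range(5):
    update_from_outcomes(records)
# After the update the transcript is no longer needed: acting is O(1) in the
# number of past attempts, because the lessons now live in `logits`.
print(np.argmax(logits))  # the retained policy (index 1) now scores highest
```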
Same for any other task you want the model to do. In-context learning scales poorly. This also makes it safe...