Adele Lopez comments on Views on when AGI comes and on strategy to reduce existential risk

Adele Lopez 10 Jul 2023 6:59 UTC
LW: 6 AF: 4
0
AF
Alright, to check if I understand, would these be the sorts of things that your model is surprised by?
1. An LLM solves a mathematical problem by introducing a novel definition which humans can interpret as a compelling and useful concept.
2. An LLM which can be introduced to a wide variety of new concepts not in its training data, and after a few examples and/or clarifying questions is able to correctly use the concept to reason about something.
3. An image diffusion model which is shown to have a detailed understanding of anatomy and 3D space, such that you can use it to transform a photo of a person into an image of the same person in a novel pose (not in its training data) and angle with correct proportions and realistic joint angles for the person in the input photo.
- TsviBT 10 Jul 2023 7:21 UTC
  LW: 7 AF: 4
  0
  AF Parent
  Unfortunately, more context is needed.
  
  An LLM solves a mathematical problem by introducing a novel definition which humans can interpret as a compelling and useful concept.
  
  I mean, I could just write a python script that prints out a big list of definitions of the form
  
  “A topological space where every subset with property P also has property Q”
  
  and having P and Q be anything from a big list of properties of subsets of topological spaces. I’d guess some of these will be novel and useful. I’d guess LLMs + some scripting could already take advantage of some of this. I wouldn’t be very impressed by that (though I think I would be pretty impressed by the LLM being able to actually tell the difference between valid proofs in reasonable generality). There are some versions of this I’d be impressed by, though. Like if an LLM had been the first to come up with one of the standard notions of curvature, or something, that would be pretty crazy.
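As a rough illustration of the kind of script described above, here is a minimal Python sketch. The property list and the names `SUBSET_PROPERTIES` and `candidate_definitions` are made up for this example, and nothing here checks whether a generated definition is novel or useful.

```python
from itertools import permutations

# Hypothetical, deliberately tiny list of subset properties.
# A real attempt would draw on a much longer list.
SUBSET_PROPERTIES = ["open", "closed", "compact", "connected", "dense", "countable"]


def candidate_definitions(properties):
    """Return one template definition per ordered pair (P, Q) with P != Q."""
    template = "A topological space where every subset that is {p} is also {q}"
    return [template.format(p=p, q=q) for p, q in permutations(properties, 2)]


if __name__ == "__main__":
    for definition in candidate_definitions(SUBSET_PROPERTIES):
        print(definition)
```

With six properties this prints 30 candidate definitions. Sorting the interesting ones from the rest is the part the script does not do.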
  
  An LLM which can be introduced to a wide variety of new concepts not in its training data, and after a few examples and/or clarifying questions is able to correctly use the concept to reason about something.
  
  I haven’t tried this, but I’d guess if you give an LLM two lists of things where list 1 is [things that are smaller than a microwave and also red] and list 2 is [things that are either bigger than a microwave, or not red], or something like that, it would (maybe with some prompt engineering to get it to reason things out?) pick up that “concept” and then use it, e.g. sorting a new item, or deducing from “X is in list 1” to “X is red”. That’s impressive (assuming it’s true), but not that impressive.
  
  On the other hand, if it hasn’t been trained on a bunch of statements about angular momentum, and then it can—given some examples and time to think—correctly answer questions about angular momentum, that would be surprising and impressive. Maybe this could be experimentally tested, though I guess at great cost, by training an LLM on a dataset that’s been scrubbed of all mention of stuff related to angular momentum (disallowing math about angular momentum, but allowing math and discussion about momentum and about rotation), and then trying to prompt it so that it can correctly answer questions about angular momentum. Like, the point here is that angular momentum is a “new thing under the sun” in a way that “red and smaller than microwave” is not a new thing under the sun.
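To make the proposed scrubbing step concrete, here is a minimal Python sketch of a keyword filter over a corpus. Everything in it is an assumption for illustration: the blocklist in `BLOCKED_PATTERNS`, the function name `scrub`, and the choice to drop whole documents. A keyword filter like this would also miss math that uses the quantity without naming it, which is part of why the experiment would be costly to do properly.

```python
import re

# Hypothetical blocklist. Deciding what counts as "related to angular
# momentum" is the hard, underspecified part of the experiment.
BLOCKED_PATTERNS = [
    r"angular\s+momentum",
    r"moment\s+of\s+momentum",  # older name for the same quantity
    r"\btorque\b",              # assumption: treated as too closely related
]
BLOCKED = re.compile("|".join(BLOCKED_PATTERNS), re.IGNORECASE)


def scrub(documents):
    """Keep only documents that match none of the blocked patterns."""
    return [doc for doc in documents if not BLOCKED.search(doc)]


if __name__ == "__main__":
    corpus = [
        "Linear momentum is mass times velocity.",
        "Angular momentum is conserved in a closed system.",
        "The wheel's rotation slows down over time.",
    ]
    for doc in scrub(corpus):
        print(doc)
```

Text about momentum and about rotation passes through unchanged, matching the carve-out in the comment.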
  - Roman Leventov 10 Jul 2023 8:50 UTC
    5 points
    2
    Parent
    On the other hand, if it hasn’t been trained on a bunch of statements about angular momentum, and then it can—given some examples and time to think—correctly answer questions about angular momentum, that would be surprising and impressive. Maybe this could be experimentally tested, though I guess at great cost, by training an LLM on a dataset that’s been scrubbed of all mention of stuff related to angular momentum (disallowing math about angular momentum, but allowing math and discussion about momentum and about rotation), and then trying to prompt it so that it can correctly answer questions about angular momentum. Like, the point here is that angular momentum is a “new thing under the sun” in a way that “red and smaller than microwave” is not a new thing under the sun.
    Until recently, I thought that the fact that LLMs are not strong and efficient online (or quasi-online, i.e., need few examples) conceptual learners is a “big obstacle” for AGI or ASI. I no longer think so. Yes, humans evidently still have an edge in this, that is, humans can somehow relatively quickly and efficiently “surgeon” their world models to accommodate new concepts and use them efficiently in a far-ranging way. (Even though I suspect that we over-glorify this ability in humans and it more realistically takes weeks or even months, rather than hours, for humans to fully integrate new conceptual frameworks into their thinking, still, they should be able to do so without many external examples, which will be lacking if the concept is actually very new.)
    I no longer think this handicaps LLMs much. New powerful concepts that permeate practical and strategic reasoning in the real world are rarely invented and spread through society slowly. Just being a skillful user of existing concepts that are amply described in books and otherwise in the training corpus of LLMs should be enough for gaining capacity for recursive self-improvement, and quite far superhuman intelligence/strategy/agency more generally.
    Then, imagine that superhuman LLM-based agents “won” and killed all humans. Even if they themselves don’t (or couldn’t!) invent ML paradigms for efficient online concept learning, they could still sort of hack through it, through experimenting with new concepts, trying to run a lot of simulations with them, checking these simulations against reality (filtering out incoherence/bad concepts), and then re-training themselves on the results of these simulations, and then giving text labels to the features found in their own DNNs to mark the corresponding concept.
    - TsviBT 10 Jul 2023 22:54 UTC
      4 points
      0
      Parent
      
      Just being a skillful user of existing concepts
      
      I don’t think they’re skilled users of existing concepts. I’m not saying it’s an “obstacle”, I’m saying that this behavior pattern would be a significant indicator to me that the system has properties that make it scary.
  - Nick_Tarleton 2 Feb 2025 23:25 UTC
    LW: 3 AF: 2
    0
    AF Parent
    I don’t really have an empirical basis for this, but: If you trained something otherwise comparable to, if not current, then near-future reasoning models without any mention of angular momentum, and gave it a context with several different problems to which angular momentum was applicable, I’d be surprised if it couldn’t notice that $\vec{r} \times \vec{p}$ was a common interesting quantity, and then, in an extension of that context, correctly answer questions about it. If you gave it successive problem sets where the sum of that quantity was applicable, the integral, maybe other things, I’d be surprised if a (maybe more powerful) reasoning model couldn’t build something worth calling the ability to correctly answer questions about angular momentum. Do you expect otherwise, and/or is this not what you had in mind?
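For reference, a standard textbook derivation (not part of the original comment) shows why the cross product of position and momentum stands out across such problems. Assume a single particle of constant mass $m$ with position $\vec{r}$, velocity $\vec{v}$, momentum $\vec{p} = m\vec{v}$, and net force $\vec{F}$:

```latex
\begin{align}
\frac{d}{dt}\left(\vec{r} \times \vec{p}\right)
  &= \dot{\vec{r}} \times \vec{p} + \vec{r} \times \dot{\vec{p}} \\
  &= \vec{v} \times m\vec{v} + \vec{r} \times \vec{F} \\
  &= \vec{r} \times \vec{F}
\end{align}
```

The first term vanishes because a vector crossed with a multiple of itself is zero. When the force is central (parallel to $\vec{r}$), the remaining term vanishes too, so the quantity is conserved.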
    - TsviBT 3 Feb 2025 1:53 UTC
      LW: 3 AF: 2
      0
      AF Parent
      It’s a good question. Looking back at my example, now I’m just like “this is a very underspecified/confused example”. This deserves a better discussion, but IDK if I want to do that right now. In short, the answer to your question is:
      
      1. I at least would not be very surprised if gippity-seek-o5-noAngular could do what I think you’re describing.
      2. That’s not really what I had in mind, but I had in mind something less clear than I thought. The spirit is about “can the AI come up with novel concepts”, but the issue here is that “novel concepts” are big things, and their material and functioning and history are big and smeared out.
      
      I started writing out a bunch of thoughts, but they felt quite inadequate because I knew nothing about the history of the concept of angular momentum; so I googled around a tiny little bit. The situation seems quite awkward for the angular momentum lesion experiment. What did I “mean to mean” by “scrubbed all mention of stuff related to angular momentum”—presumably this would have to include deleting all subsequent ideas that use angular momentum in their definitions, but e.g. did I also mean to delete the notion of cross product?
      
      It seems like angular momentum was worked on in great detail well before the cross product was developed at all explicitly. See https://arxiv.org/pdf/1511.07748 and https://en.wikipedia.org/wiki/Cross_product#History. Should I still expect gippity-seek-o5-noAngular to notice the idea if it doesn’t have the cross product available? Even if not, what does and doesn’t this imply about this decade’s AI’s ability to come up with novel concepts?
      
      (I’m going to mull on why I would have even said my previous comment above, given that on reflection I believe that “most” concepts are big and multifarious and smeared out in intellectual history. For some more examples of smearedness, see the subsection here: https://tsvibt.blogspot.com/2023/03/explicitness.html#the-axiom-of-choice)
      - Raemon 3 Feb 2025 7:49 UTC
        LW: 2 AF: 1
        0
        AF Parent
        That’s not really what I had in mind, but I had in mind something less clear than I thought. The spirit is about “can the AI come up with novel concepts”,
        I think one reason I think the current paradigm is “general enough, in principle”, is that I don’t think “novel concepts” is really The Thing. I think creativity / intelligence is mostly about combining concepts; it’s just that really smart people
        a) are faster in raw horsepower and can handle more complexity at a time
        b) have a better set of building blocks to combine or apply to make new concepts (which includes building blocks for building better building blocks)
        c) have a more efficient search for useful/relevant building blocks (both metacognitive and object-level).
        Maybe you believe this, and think that “well yeah, it’s the efficient search that’s the important part, which we still don’t actually have a real working version of?”?
        It seems like the current models have basically all the tools a moderately smart human has, with regards to generating novel ideas, and the thing that they’re missing is something like “having a good metacognitive loop such that they notice when they’re doing a fake/dumb version of things, and course-correct” and “persistently pursue plans over long time horizons.” And they don’t seem to have zero of either of those, just not enough to get over some hump.
        I don’t see what’s missing that a ton of training on a ton of diverse, multimodal tasks + scaffolding + data flywheel isn’t going to figure out.
        - TsviBT 3 Feb 2025 8:00 UTC
          LW: 9 AF: 4
          0
          AF Parent
          
          really smart people
          
          Differences between people are less directly revelative of what’s important in human intelligence. My guess is that all or very nearly all human children have all or nearly all the intelligence juice. We just, like, don’t appreciate how much a child is doing in constructing zer world.
          
          the current models have basically all the tools a moderately smart human has, with regards to generating novel ideas
          
          Why on Earth do you think this? (I feel like I’m in an Asch Conformity test, but with really really high production value. Like, after the experiment, they don’t tell you what the test was about. They let you take the card home. On the walk home you ask people on the street, and they all say the short line is long. When you get home, you ask your housemates, and they all agree, the short line is long.)
          
          I don’t see what’s missing that a ton of training on a ton of diverse, multimodal tasks + scaffolding + data flywheel isn’t going to figure out.
          
          My response is in the post.