Daniel Kokotajlo comments on AMA: Paul Christiano, alignment researcher

Daniel Kokotajlo 29 Apr 2021 8:35 UTC
LW: 15 AF: 9
1. What credence would you assign to “+12 OOMs of compute would be enough for us to achieve AGI / TAI / AI-induced Point of No Return within five years or so”? (This is basically the same as, though not identical to, this poll question.)
2. Can you say a bit about where your number comes from? E.g. maybe 25% chance of scaling laws not continuing such that OmegaStar, Amp(GPT-7), etc. don’t work, 25% chance that they happen but don’t count as AGI / TAI / AI-PONR, for a total of about 60%? The more you say the better, this is my biggest crux! Thanks!
- paulfchristiano 1 May 2021 1:24 UTC
  LW: 14 AF: 9
  I’d say 70% for TAI in 5 years if you gave +12 OOM.
  I think the single biggest uncertainty is about whether we will be able to adapt sufficiently quickly to the new larger compute budgets (i.e. how much do we need to change algorithms to scale reasonably? It’s a very unusual situation and it’s hard to scale up fast and depends on exactly how far that goes). Maybe I think that there’s a 90% chance that TAI is in some sense possible (maybe: if you’d gotten to that much compute while remaining as well-adapted as we are now to our current levels of compute) and conditioned on that an 80% chance that we’ll actually do it vs running into problems?
  (Didn’t think about it too much, don’t hold me to it too much. Also I’m not exactly sure what your counterfactual is and didn’t read the original post in detail, I was just assuming that all existing and future hardware got 12 OOM faster. If I gave numbers somewhere else that imply much less than that probability with +12 OOM, then you should be skeptical of both.)
  - Daniel Kokotajlo 5 May 2021 10:43 UTC
    LW: 4 AF: 3
    My counterfactual attempts to get at the question “Holding ideas constant, how much would we need to increase compute until we’d have enough to build TAI/AGI/etc. in a few years?” This is (I think) what Ajeya is talking about with her timelines framework. Her median is +12 OOMs. I think +12 OOMs is much more than 50% likely to be enough; I think it’s more like 80% and that’s after having talked to a bunch of skeptics, attempted to account for unknown unknowns, etc. She mentioned to me that 80% seems plausible to her too but that she’s trying to adjust downwards to account for biases, unknown unknowns, etc.
    Given that, am I right in thinking that your answer is really close to 90%, since failure-to-achieve-TAI/AGI/etc-due-to-being-unable-to-adapt-quickly-to-magically-increased-compute “shouldn’t count” for purposes of this thought experiment?
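
The estimates in this sub-thread are chains of conditional probabilities, and the arithmetic can be checked with a minimal sketch. The helper `joint_probability` and the variable names are hypothetical, introduced only for illustration. Reading the 25% / 25% example in the original question as two multiplied steps is an assumption about how those numbers were meant to combine.

```python
def joint_probability(*conditional_steps: float) -> float:
    """Multiply a chain of probabilities, each conditioned on the steps before it."""
    result = 1.0
    for p in conditional_steps:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
        result *= p
    return result


# Numbers quoted in the thread.
p_possible = 0.90              # TAI is "in some sense possible" with +12 OOMs
p_adapt_given_possible = 0.80  # we actually adapt quickly enough, given that

# Headline estimate: both steps have to go well.
with_adaptation = joint_probability(p_possible, p_adapt_given_possible)

# If failure to adapt "shouldn't count", only the first step remains.
without_adaptation = joint_probability(p_possible)

# Assumed reading of the question's example: a 25% chance of failure at each
# of two steps, so a 75% chance of passing each one.
question_example = joint_probability(1 - 0.25, 1 - 0.25)

print(f"with adaptation step:    {with_adaptation:.2f}")
print(f"without adaptation step: {without_adaptation:.2f}")
print(f"question's example:      {question_example:.2f}")
```

Under these assumptions the two-step product is about 0.72, in line with the stated 70%, and dropping the adaptation step leaves the 90% that the follow-up question asks about.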
- paulfchristiano 1 May 2021 1:32 UTC
  LW: 12 AF: 8
  (I don’t think Amp(GPT-7) will work though.)
  - Daniel Kokotajlo 1 May 2021 7:50 UTC
    LW: 2 AF: 2
    I’m very glad to hear that! Can you say more about why?
    - paulfchristiano 1 May 2021 16:14 UTC
      LW: 14 AF: 9
      Natural language has both noise (that you can never model) and signal (that you could model if you were just smart enough). GPT-3 is in the regime where it’s mostly signal (as evidenced by the fact that the loss keeps going down smoothly rather than approaching an asymptote). But it will soon get to the regime where there is a lot of noise, and by the time the model is 9 OOMs bigger I would guess (based on theory) that it will be overwhelmingly noise and training will be very expensive.
      So it may or may not work in the sense of meeting some absolute performance threshold, but it will certainly be a very bad way to get there and we’ll do something smarter instead.
      - Daniel Kokotajlo 5 May 2021 10:36 UTC
        LW: 5 AF: 3
        Hmm, I don’t count “It may work but we’ll do something smarter instead” as “it won’t work” for my purposes.
        I totally agree that noise will start to dominate eventually… but the thing I’m especially interested in with Amp(GPT-7) is not the “7” part but the “Amp” part. Using prompt programming, fine-tuning on its own library, fine-tuning with RL, making Chinese-room bureaucracies, training/evolving those bureaucracies… what do you think about that? Naively the scaling laws would predict that we’d need far less long-horizon data to train them, since they have far fewer parameters, right? Moreover IMO evolved-Chinese-room-bureaucracy is a pretty good model for how humans work, and in particular for how humans are able to generalize super well and make long-term plans etc. without many lifetimes of long-horizon training.
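
The signal-versus-noise point made earlier in this thread can be illustrated with a toy model. The sketch below assumes that loss splits into an irreducible floor (the noise) plus a reducible term that falls as a power law in parameter count (the signal). That functional form and every constant in it (`L_IRREDUCIBLE`, `N_C`, `ALPHA`) are hypothetical placeholders chosen for illustration, not fitted values and not numbers from the thread. Only the 175B parameter count for GPT-3 and the "9 OOMs bigger" endpoint come from outside the toy model.

```python
def loss(n_params: float, l_irreducible: float, n_c: float, alpha: float) -> float:
    """Toy loss curve: irreducible floor plus a power-law reducible term."""
    return l_irreducible + (n_c / n_params) ** alpha


def noise_share(n_params: float, l_irreducible: float, n_c: float, alpha: float) -> float:
    """Fraction of the total loss that no model could ever remove."""
    return l_irreducible / loss(n_params, l_irreducible, n_c, alpha)


# Hypothetical placeholder constants, for illustration only.
L_IRREDUCIBLE = 1.0
N_C = 1e14
ALPHA = 0.08

GPT3_PARAMS = 1.75e11  # 175B parameters

# Noise share at GPT-3 scale and at 3, 6, and 9 orders of magnitude larger.
ooms_steps = (0, 3, 6, 9)
shares = [
    noise_share(GPT3_PARAMS * 10 ** ooms, L_IRREDUCIBLE, N_C, ALPHA)
    for ooms in ooms_steps
]

for ooms, share in zip(ooms_steps, shares):
    print(f"+{ooms} OOMs: noise is {share:.0%} of total loss")
```

In any model of this shape the noise share climbs toward 1 as parameters grow, which is the qualitative claim. How fast it climbs depends entirely on the placeholder constants.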