I agree we seem to have some kind of deeper disagreement here.
I think stack more layers + known training strategies (nothing clever) + simple strategies for using test-time compute (nothing clever, nothing that doesn’t use the ML as a black box) can get continuous improvements in tasks like reasoning (e.g. theorem-proving), meta-learning (e.g. learning to learn new motor skills), automating R&D (including automating executing ML experiments, or proposing new ML experiments), or basically whatever.
I think these won’t get to human level in the next 5 years. We’ll have crappy versions of all of them. So it seems like we basically have to get quantitative. If you want to talk about something we aren’t currently measuring, then that probably takes effort, and so it would probably be good if you picked some capability where you won’t just say “the Future is hard to predict.” (Though separately I expect to make somewhat better predictions than you in most of these domains.)
A plausible example is that I think it’s pretty likely that in 5 years, with mere stack more layers + known techniques (nothing clever), you can have a system which is clearly (by your+my judgment) “on track” to improve itself and eventually foom, e.g. that can propose and evaluate improvements to itself, whose ability to evaluate proposals is good enough that it will actually move in the right direction and eventually get better at the process, etc., but that it will just take a long time for it to make progress. I’d guess that it looks a lot like a dumb kid in terms of the kind of stuff it proposes and its bad judgment (but radically more focused on the task and conscientious and wise than any kid would be). Maybe I think that’s 10% unconditionally, but much higher given a serious effort. My impression is that you think this is unlikely without adding in some missing secret sauce to GPT, and that my picture is generally quite different from your criticality-flavored model of takeoff.
How much time do you see between “1 AI clearly on track to Foom” and “First AI to actually Foom”?
My weak guess is that Eliezer would say “Probably quite little time”, but your model of the world requires the GWP to double over a 4-year period, and I’m guessing that period probably starts later than 2026.
I would be surprised if, by 2027, I could point to an AI that had been on track to Foom for a full year without Foom happening.
I think “on track to foom” is a very long way before “actually fooms.”