Robin on AI timelines just seems particularly crazy. We can’t yet settle the ems vs de novo AI bet, but I think the writing is on the wall, and his forecasting methodology for the 300-year timeline seems so crazy: ask people in a bunch of fields “how far have you come toward human level, and is it speeding up?” and then lean entirely on that (I think many of the short-term predictions are basically falsified now, in that if you ask people the same question they will give much higher percentages, and many of the tasks are solved).
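For concreteness, here is my rough reconstruction of that extrapolation (a minimal sketch of how I understand the aggregation; Robin’s actual method may differ): take the reported fraction of the distance to human level covered over the past 20 years, assume the rate stays constant, and read off how long the remainder takes.

```python
# Rough reconstruction of the linear extrapolation behind the multi-century timelines.
# This is my guess at the aggregation, not Robin's actual method.

def years_remaining(fraction_done: float, years_elapsed: float = 20.0) -> float:
    """Assume progress continues at the same average rate as over the elapsed period."""
    rate = fraction_done / years_elapsed   # fraction of the gap closed per year
    return (1.0 - fraction_done) / rate    # years needed to close the rest

# Typical survey answers of a few percent per 20 years give multi-century timelines:
for frac in (0.01, 0.05, 0.10, 0.20):
    print(f"{frac:.0%} done in 20 years -> ~{years_remaining(frac):.0f} more years")
# e.g. 5% done in 20 years -> ~380 more years, the ballpark of the 300-year figure.
```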
ETA: Going through the oldest examples from Robin’s survey to see how the methodology fares:
Melanie Mitchell gives 5% progress in 20 years towards human-level analogical reasoning. But the kinds of string manipulation used in Mitchell’s copycat problems seem to be ~totally solved by the current version of the OpenAI API. (I tried 10 random questions from this list, and the only one it got wrong was “a → ab, z → ?”, where it said “z → z b” instead of what I presume was the intended “z → z y”.) And in general it seems like we’ve come quite a long way.
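For reference, a spot-check of this kind can be reproduced with something like the sketch below (assuming the openai Python SDK and an API key in the environment; the model name and prompt format are illustrative, not the exact setup used above):

```python
# Minimal sketch: feed copycat-style letter-string analogies to a language model
# and eyeball the answers. Assumes the openai Python SDK (v1+) with OPENAI_API_KEY set;
# the model name and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

PROBLEMS = [
    "abc -> abd, ijk -> ?",
    "abc -> abd, xyz -> ?",
    "a -> ab, z -> ?",
]

for problem in PROBLEMS:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Complete the letter-string analogy. Answer with the transformed string only."},
            {"role": "user", "content": problem},
        ],
    )
    print(problem, "->", response.choices[0].message.content.strip())
```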
Murray Shanahan gives 10% progress on “knowledge representation” in 20 years, but I don’t know what this means so I’ll skip over it.
Wendy Hall gives 1% on “computer-assisted training” in 20 years. I don’t know how to measure progress in this area, but I suspect any reasonable measure for the last 10 years will be >> 5%.
Claire Cardie and Peter Norvig give 20% progress on NLP in 20 years. I think that 2013-2023 has seen much more than another 20% progress, in that it’s now becoming difficult to write down any task in NLP for which models have subhuman performance (and instead we just use language to express increasingly-difficult non-language tasks).
Aaron Dollar gives <1% on robotic grasping in 20 years. Hard to evaluate quantitatively, but it seems very hard to argue we’ve come <10% more of the way in the last 10 years, and in simulation I think we may just be roughly at human level.
Timothy Meese gives 5% progress on early human vision processing in 20 years, but I think it now seems like we are quite close to (or even past) human level at this task. Though maybe he’s talking about how much we understand early human vision processing (in which case it’s not clear what it’s doing in this list).
At any rate, the methodology looks to me like it’s making terrible predictions all over the place. I think I’m on the record objecting that these estimates seem totally unreasonable, though the only comment I can find by me on it is from 5 years ago here, where I say I don’t think it’s informative and give >20% probability that existing ML in particular will scale to human-level AI.
Regarding the weird mediocrity of modern AI, isn’t part of this that GPT-3-style language models are trained to imitate human text, and so are almost aiming for mediocrity?
Would a hypothetical “AlphaZero of code,” which built its own abstractions from the ground up and presumably would not reinvent Python (AlphaCode is cool and all, but it does strike me as a little absurd to see an AI write Python), have this property?
Game-playing AI is also mediocre, as are models fine-tuned to write good code. 100B parameter models trained from scratch to write code (rather than to imitate human coders) would be much better but would take quite a lot longer to train, and I don’t see any evidence that they would spend less time in the mediocre subhuman regime (though I do agree that they would more easily go well past human level).
Also this.