Okay, my intuitions for AI timelines just shifted to put considerably more probability on the short end.
Same. Specifically, I went from predicting 50% chance of human-level AGI within 40 years to 50% chance within 10 years.
Andrew Mayne was also given access to the GPT-3 API. You can read his impressions here: https://andrewmayneblog.wordpress.com/
I found his results very impressive as well. For example, he’s able to prompt GPT-3 to summarize a Wikipedia article on quantum computing at either a second-grade or an eighth-grade level, depending on the prompt.
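For anyone curious what that kind of prompt looks like in practice, here’s a rough sketch using the Python openai client as it existed during the API beta; the engine name, sampling parameters, and prompt wording are my own illustrative guesses, not Mayne’s actual prompts:

```python
# Rough sketch of grade-level-targeted summarization with the GPT-3 API.
# The engine name, parameters, and prompt framing are illustrative guesses,
# not Andrew Mayne's actual prompts.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

ARTICLE = "..."  # e.g. the opening paragraphs of Wikipedia's "Quantum computing"

def summarize(article, grade):
    prompt = (
        f"Summarize the following article for a {grade} student, "
        f"using only words a {grade} student would know.\n\n"
        f"{article}\n\nSummary:"
    )
    response = openai.Completion.create(
        engine="davinci",   # base GPT-3 engine
        prompt=prompt,
        max_tokens=150,
        temperature=0.7,
    )
    return response.choices[0].text.strip()

print(summarize(ARTICLE, "second grade"))
print(summarize(ARTICLE, "eighth grade"))
```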
I actually put together a presentation on GPT-like architectures and their uses for my advisor: https://docs.google.com/presentation/d/1kCJ2PJ_3UteHBX5TWZyrF5ontEdNx_B4vi6KTmQmPNo/edit?usp=sharing
It’s not really meant to be a stand-alone explanation, but it does list some of GPT-2/3’s more impressive abilities. After compiling the presentation, I’ve come to think we’ll look back on GPT-3 as the “Wright brothers” moment for AGI.
Consider that this post estimates GPT-3 cost ~$4.6 million to train: https://lambdalabs.com/blog/demystifying-gpt-3. It would be well within the budget of Google/Microsoft/Amazon/DoD/etc. to increase model size by another 2 (possibly 3) orders of magnitude. Based on the jump in GPT-3’s performance going from 13B parameters to 175B parameters, such a “GPT-4” would be absolutely stunning.
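To make the budget point concrete, here’s the naive arithmetic, assuming training cost scales roughly linearly with parameter count (a big simplification that ignores data scaling, hardware improvements, and engineering overhead):

```python
# Back-of-envelope cost scaling, assuming cost grows linearly with parameter
# count. Very rough: ignores data scaling, hardware gains, and engineering.
gpt3_params = 175e9   # 175B parameters
gpt3_cost = 4.6e6     # ~$4.6M training cost estimate from the Lambda Labs post

for factor in (100, 1000):   # 2 and 3 orders of magnitude
    params = gpt3_params * factor
    cost = gpt3_cost * factor
    print(f"{params / 1e12:.1f}T params -> roughly ${cost / 1e9:.1f}B to train")
# 17.5T params  -> roughly $0.5B to train
# 175.0T params -> roughly $4.6B to train
```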
On the bright side, according to OpenAI’s scaling laws paper, GPT-3 is about the size at which scaling was predicted to start breaking down. So maybe GPT-4 won’t actually be better than GPT-3. I’m not counting on it, though.
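For reference, the law in question is (roughly) a power law in non-embedding parameter count, L(N) ≈ (N_c/N)^α_N. The constants below are approximate values I recall from Kaplan et al. (2020), so treat the outputs as illustrative rather than as a forecast:

```python
# Illustrative parameter-count scaling law, L(N) = (N_c / N)**alpha_N, in the
# form reported by Kaplan et al. (2020). Constants are approximate values from
# memory; treat the numbers as a sketch, not a prediction.
N_C = 8.8e13       # fitted constant, in non-embedding parameters (approximate)
ALPHA_N = 0.076    # fitted exponent (approximate)

def predicted_loss(n_params):
    """Predicted test loss (nats/token) for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

for n in (13e9, 175e9, 17.5e12):   # GPT-3 13B, GPT-3 175B, a hypothetical 100x model
    print(f"{n:.3g} params -> predicted loss ~{predicted_loss(n):.2f}")
```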
It’s possible that GPT-3 is roughly where the maximally naive simple text LM begins to hit the constant wall, but I don’t regard this as important; as I emphasize at every turn, there are many distinct ways to improve it greatly using purely known methods, never mind future research approaches. The question is not whether there is any way GPT-4 might fail, but whether there is any way in which it might succeed.
There’s a typo in your Andrew Mayne link, but thanks for linking it—that’s wild!
https://andrewmayneblog.wordpress.com/
Thanks, fixed.
My own take (not meant to be strong evidence of anything, mostly just kinda documenting my internal updating experience):
I had already updated towards fairly short timelines (like, maybe 20% chance of AGI in 20 years?). I initially had a surge of “AAAAUGH, maybe the end times are right around the corner” with GPT-3, but I think that was mostly unwarranted. (At least, GPT-3 didn’t seem like new information; it was roughly what I’d have expected GPT-3 to be like, and insofar as I’m updating shorter, it seems like that means I just made a mistake last year when first evaluating GPT-2.)
I’m also interested in more of Kaj’s thoughts.
My largest update came from the bit where it figured out that it was expected to produce Harry Potter parodies in different styles. Previously GPT had felt cool, but basically like a very advanced version of a Markov chain. But the HP thing felt like it would have required some kind of reasoning.
I’m not sure how exactly reasoning should be defined and whether that part really requires reasoning or not. But if it’s just very advanced and incredible recognition and mimicry abilities, it still shifts my impression of what can be achieved using just advanced and incredible recognition and mimicry abilities. I would previously have assumed that you need something like reasoning for it, but if you don’t, then maybe the capacity for reasoning is slightly less important than I had thought.
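For contrast with the Markov chain comparison above, here’s what a literal word-level Markov chain text generator looks like: it can only re-emit continuations it has literally seen, with no task inference or style transfer. A toy illustration only, not a claim about how GPT works internally.

```python
# Toy word-level Markov chain generator, for contrast with GPT-style models.
# It can only continue with words it has literally seen follow the current
# n-gram in its training text -- no style transfer, no task inference.
import random
from collections import defaultdict

def train(text, order=2):
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, order=2, length=30, seed=0):
    rng = random.Random(seed)
    out = list(rng.choice(list(model.keys())))
    for _ in range(length):
        followers = model.get(tuple(out[-order:]))
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = ("Harry Potter raised his wand and looked at the castle. "
          "Harry Potter raised his broom and looked at the pitch. ") * 10
print(generate(train(corpus)))
```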