Some quick thoughts/comments:
--It can predict random internet text better than the best humans
I wouldn’t use this metric. I don’t see how to map between it and anything we care about. If it’s defined in terms of accuracy when predicting the next word, I wouldn’t be surprised if existing language models already outperform humans.
Also, I find the term “human-level AGI” confusing. Does it exclude systems that are super-human on some dimensions? If so, it seems too narrow to be useful. For the purpose of this post, I propose the following definition: a system that is able to generate text in a way that allows it to automatically perform any task that humans can perform by writing text.
Nevertheless, it works. That’s how self-supervised training/pretraining works.
They don’t. GPT-3 is still, as far as I can tell, about twice as bad in an absolute sense as humans in text prediction: https://www.gwern.net/Scaling-hypothesis#fn18
Right, I’m just saying that I don’t see how to map that metric to things we care about in the context of AI safety. If a language model outperforms humans at predicting the next word, maybe that’s just because it’s sufficiently better at modeling low-level stuff (e.g. GPT-3 may be better than me at predicting whether you’ll write “That’s” rather than “That is”).
(As an aside, in the linked footnote I couldn’t easily spot any paper that actually evaluated humans on predicting the next word.)
Third paragraph: https://www.gwern.net/docs/ai/2017-shen.pdf
The LAMBADA dataset was also constructed using humans to predict the missing words, but GPT-3 falls far short of perfection there, so while I can’t numerically answer it (unless you trust OA’s reasoning there), it is still very clear that GPT-3 does not match or surpass humans at text prediction.
--GPT-2 was benchmarked at 43 perplexity on the 1 Billion Word (1BW) benchmark vs a (highly extrapolated) human perplexity of 12
I wouldn’t say that that paper shows a (highly extrapolated) human perplexity of 12. It compares human-written sentences to language-model-generated sentences on the degree to which they seem “clearly human” vs “clearly unhuman”, as judged by humans. Amusingly, for every 8 human-written sentences that were judged “clearly human”, one human-written sentence was judged “clearly unhuman”. And that 8:1 ratio is the thing from which the human perplexity is derived. This doesn’t make sense to me.
If the human annotators in this paper had never annotated human-written sentences as “clearly unhuman”, this extrapolation would have shown human perplexity of 1! (As if humans can magically predict an entire page of text sampled from the internet.)
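To spell out the arithmetic behind that last point, here is a toy sketch of what a given perplexity value means in terms of per-word probabilities (this is my own illustration of the metric, not the paper’s extrapolation procedure): perplexity is the exponentiated average negative log-probability per word, so a perplexity of 1 is only reachable by assigning probability 1 to every single next word.

```python
import math

def perplexity(probs):
    """Perplexity = exp of the average negative log-probability per word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

print(perplexity([1.0] * 100))     # 1.0  -- every word predicted with certainty
print(perplexity([1 / 12] * 100))  # 12.0 -- the extrapolated human figure, if each word got probability 1/12
print(perplexity([1 / 43] * 100))  # 43.0 -- GPT-2's reported 1BW figure, if each word got probability 1/43
```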
--The LAMBADA dataset was also constructed using humans to predict the missing words, but GPT-3 falls far short of perfection there
If the comparison here is on the final LAMBADA dataset, after examples were filtered out based on disagreement between humans (as you mentioned in the newsletter), then it’s an unfair comparison. The examples are selected for being easy for humans.
BTW, I think the comparison to humans on the LAMBADA dataset is indeed interesting in the context of AI safety (more so than “predict the next word in random internet text”), because I don’t expect the perplexity/accuracy there to depend much on the ability to model very low-level stuff (e.g. “that’s” vs “that is”).
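To make concrete what that comparison measures, here is a rough sketch of final-word prediction with GPT-2 via Hugging Face transformers (my own illustration: the passage is made up rather than taken from LAMBADA, and a real evaluation would also handle final words that span multiple tokens):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Made-up passage in the LAMBADA style: guessing the final word is meant to
# require the broader context, not just the last few words.
context = ("George had not touched the violin since his grandfather died. "
           "Tonight, for the first time in years, he lifted it to his chin and began to")
target = "play"  # the held-out final word

inputs = tokenizer(context, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]  # distribution over the next token

predicted = tokenizer.decode([int(torch.argmax(next_token_logits))]).strip()
print(f"model predicts {predicted!r}, target is {target!r}, match: {predicted == target}")
```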
OK, fair enough.
Yeah, human-level is supposed to mean not strongly superhuman at anything important, while also not being strongly subhuman at anything important.
I think that’s roughly the concept Nick Bostrom used in Superintelligence when discussing takeoff dynamics. (The usage of that concept is my only major disagreement with that book.) IMO it would be very surprising if the first ML system that is not strongly subhuman at anything important turned out not to be strongly superhuman at anything important (assuming this property is not optimized for).
Yeah, I think I agree with that. Nice.
The most capable humans are often much more capable than the average human, and yet still not superhuman. I remember the example of a hacker who gave a talk at the CCC about how he was on vacation in Taiwan and hacked their electronic payment system on the side. If you could scale him up 10,000 or 100,000 times, the kind of cyberwar you could wage would be enormous.