I think the effect would have been very similar if it had been trained via supervised learning on good dialogs
I don’t currently think this is the case, and it seems like the likely crux. In general, RLHF seems substantially more flexible in what kind of target task it lets you train for, which is the whole reason you are working on it. And at least on my model, generating good training data for supervised learning here would have been a much greater pain, and would have been much harder to control in various fine-grained ways (including preventing the AI from saying controversial things), which had been the biggest problem with previous chatbot attempts.
For ChatGPT in particular, I think it was built by John Schulman’s team
I find a comparison with John Schulman here unimpressive if you want to argue that progress on this was overdetermined, given John’s safety motivation. My best guess is that if you had argued forcefully that RLHF was pushing on commercialization bottlenecks, John would indeed not have worked on it.
Seeing RLHF teams in other organizations not directly downstream of your organizational involvement, or not quite directly entangled with your opinion, would make a bigger difference here.
I feel like the implicit model of the world you are using here is going to have effect sizes adding up to much more than the actual variance at stake
I don’t think so, and have been trying to be quite careful about this. Chat-GPT is just by far the most successful AI product to date, with by far the biggest global impact on AI investment and the most hype. I think $10B being downstream of that isn’t that crazy. The product has a user base not that different from other $10B products, and a growth rate to put basically all of them to shame, so I don’t think a $10B effect from Chat-GPT seems that unreasonable. There is only so much variance to go around, but Chat-GPT is absolutely massive in its impact.
I don’t currently think this is the case, and it seems like the likely crux. In general, RLHF seems substantially more flexible in what kind of target task it lets you train for, which is the whole reason you are working on it. And at least on my model, generating good training data for supervised learning here would have been a much greater pain, and would have been much harder to control in various fine-grained ways (including preventing the AI from saying controversial things), which had been the biggest problem with previous chatbot attempts.
I bet they did generate supervised data (certainly they do for InstructGPT), and supervised data seems way more fine-grained in what you are getting the AI to do. It’s just that supervised fine-tuning is worse.
I think the biggest problem with previous chat-bot attempts is that the underlying models are way way weaker than GPT-3.5.
I don’t think so, and have been trying to be quite careful about this. Chat-GPT is just by far the most successful AI product to date, with by far the biggest global impact on AI investment and the most hype. I think $10B being downstream of that isn’t that crazy. The product has a user base not that different from other $10B products, and a growth rate to put basically all of them to shame, so I don’t think a $10B effect from Chat-GPT seems that unreasonable. There is only so much variance to go around, but Chat-GPT is absolutely massive in its impact.
This still seems totally unreasonable to me:
How much total investment do you think there is in AI in 2023?
How much variance do you think there is in the level of 2023 investment in AI? (Or maybe whatever other change you think is equivalent.)
How much influence are you giving to GPT-3, GPT-3.5, GPT-4? How much to the existence of OpenAI? How much to the existence of Google? How much to Jasper? How much to good GPUs?
I think it’s unlikely that the reception of ChatGPT increased OpenAI’s valuation by $10B, much less investment in OpenAI, even before thinking about replaceability. I think that Codex, GPT-4, DALL-E, etc. are all very major parts of the valuation.
I also think replaceability is a huge correction term here. I think it would be more reasonable to talk about moving how many dollars of investment how far forward in time.
I find a comparison with John Schulman here unimpressive if you want to argue that progress on this was overdetermined, given John’s safety motivation. My best guess is that if you had argued forcefully that RLHF was pushing on commercialization bottlenecks, John would indeed not have worked on it.
I think John wants to make useful stuff, so I doubt this.
How much total investment do you think there is in AI in 2023?
My guess is total investment was around the $200B - $500B range, with about $100B of that into new startups and organizations, and around $100-$400B of that in organizations like Google and Microsoft outside of acquisitions. I have pretty high uncertainty on the upper end here, since I don’t know what fraction of Google’s revenue gets reinvested into AI, how much Tesla is investing in AI, how much various governments are investing, etc.
How much variance do you think there is in the level of 2023 investment in AI? (Or maybe whatever other change you think is equivalent.)
Variance between different years, depending on market conditions and how much products take off, seems like it’s on the order of 50% to me. Like, different years have pretty hugely differing levels of investment.
My guess is about 50% of that variance is dependent on different products taking off, how much traction AI is getting in various places, and things like Chat-GPT existing vs. not existing.
So this gives around $50B - $125B of variance to be explained by product-adjacent things like Chat-GPT.
How much influence are you giving to GPT-3, GPT-3.5, GPT-4? How much to the existence of OpenAI? How much to the existence of Google? How much to Jasper? How much to good GPUs?
Existence of OpenAI is hard to disentangle from the rest. I would currently guess that in terms of total investment, GPT-2 → GPT-3 made a bigger difference than GPT-3.5 → Chat-GPT, but both made a much larger difference than GPT-3 → GPT-3.5.
I don’t think Jasper made a huge difference, since its userbase is much smaller than Chat-GPT’s, and evidently the hype from it has also been much lower.
Good GPUs feel kind of orthogonal. We can look at each product that makes up my 50% of the variance to be explained and see how useful/necessary good GPUs were for its development, and my sense is that for Chat-GPT at least the effect of good GPUs was relatively minor, since I don’t think the training to move from GPT-3.5 to Chat-GPT was very compute intensive.
I would feel fine saying expected improvements in GPUs are responsible for 25% of the 50% variance (i.e. 12.5%) if you chase things back all the way, though that again feels like it isn’t trying to add up to 100% with the impact from “Chat-GPT”. I do think it’s trying to add up to 100% with the impact from “RLHF’s effect on Chat-GPT”, which I claimed was at least 50% of the impact of Chat-GPT in particular.
In any case, in order to make my case for $10B using these numbers I would have to argue that between 8% and 20% of the product-dependent variance in annual investment into AI is downstream of Chat-GPT, and indeed that still seems approximately right to me after crunching the numbers. It’s by far the biggest AI product of the last few years, it is directly credited with sparking an arms race between Google and Microsoft, and even something as large as 40% wouldn’t seem totally crazy to me, since these kinds of things tend to be heavy-tailed, so if you select on the single biggest thing, there is a decent chance you underestimate its effect.
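For ease of checking, here is a minimal sketch of the back-of-envelope arithmetic above (it uses only the ranges already stated; the variable names are mine):

```python
# Back-of-envelope check of the figures discussed above.
total_investment = (200e9, 500e9)    # rough range for total 2023 AI investment ($)
between_year_variation = 0.5         # ~50% of the level varies between years
product_dependent_share = 0.5        # ~50% of that variation is product-dependent

product_slice = tuple(t * between_year_variation * product_dependent_share
                      for t in total_investment)
print([f"${x/1e9:.0f}B" for x in product_slice])      # ['$50B', '$125B']

chatgpt_effect = 10e9                # claimed Chat-GPT effect on investment
shares = tuple(chatgpt_effect / x for x in product_slice)
print([f"{s:.0%}" for s in shares])                   # ['20%', '8%']
```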
I didn’t realize how broadly you were defining AI investment. If you want to say that e.g. ChatGPT increased investment by $10B out of $200-500B, so like +2-5%, I’m probably happy to agree (and I also think it had other accelerating effects beyond that).
I would guess that a 2-5% increase in total investment could speed up AGI timelines 1-2 weeks depending on details of the dynamics, like how fast investment was growing, how much growth is exogenous vs endogenous, diminishing returns curves, importance of human capital, etc. If you mean +2-5% investment in a single year then I would guess the impact is < 1 week.
I haven’t thought about it much, but my all things considered estimate for the expected timelines slowdown if you just hadn’t done the ChatGPT release is probably between 1-4 weeks.
Is that the kind of effect size you are imagining here? I guess the more important dynamic is probably more people entering the space rather than timelines per se?
One thing worth pointing out in defense of your original estimate is that variance should add up to 100%, not effect sizes, so e.g. if the standard deviation is $100B then you could have 100 things each explaining ($10B)^2 of variance (and hence each responsible for +-$10B effect sizes after the fact).
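A quick numerical illustration of this point (a toy example of my own, not a claim about the actual investment numbers): if 100 independent factors each move annual investment by about ±$10B, the total standard deviation is $100B rather than $1,000B, because variances add while effect sizes do not.

```python
import numpy as np

n_factors, sd_each = 100, 10e9       # 100 independent factors, each with a ~$10B effect size

# Analytic: variances add, so total SD = sqrt(n) * $10B = $100B.
print(np.sqrt(n_factors) * sd_each)  # 1e11, i.e. $100B

# Monte Carlo check: sum 100 independent shocks with $10B standard deviation each.
rng = np.random.default_rng(0)
totals = rng.normal(0.0, sd_each, size=(100_000, n_factors)).sum(axis=1)
print(totals.std())                  # ~1e11 as well
```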
I didn’t realize how broadly you were defining AI investment. If you want to say that e.g. ChatGPT increased investment by $10B out of $200-500B, so like +2-5%, I’m probably happy to agree (and I also think it had other accelerating effects beyond that).
Makes sense, sorry for the miscommunication. I really didn’t feel like I was making a particularly controversial claim with the $10B, so was confused why it seemed so unreasonable to you.
I do think those $10B are going to be substantially more harmful for timelines than other money in AI, because I do think a good chunk of that money will much more directly aim at AGI than most other investment. I don’t know what my multiplier here for effect should be, but my guess is something around 3-5x in expectation (I’ve historically randomly guessed that AI applications are 10x less timelines-accelerating per dollar than full-throated AGI research, but I sure have huge uncertainty about that number).
That, plus me thinking there is a long tail with lower probability where Chat-GPT made a huge difference in race dynamics, and thinking that this marginal increase in investment does probably translate into increases in total investment, made me think this was going to shorten timelines in expectation by something closer to 8-16 weeks, which isn’t enormously far away from yours, though still a good bit higher.
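One rough way to relate these numbers, under the simplifying assumption (mine, not necessarily either commenter’s full model) that the 3-5x multiplier acts directly on the 1-4 week estimate above:

```python
# Hypothetical reconciliation of the two timeline estimates discussed above.
paul_slowdown_weeks = (1, 4)          # estimated slowdown from skipping the ChatGPT release
agi_directedness_multiplier = (3, 5)  # guessed extra acceleration per dollar of this investment

low = paul_slowdown_weeks[0] * agi_directedness_multiplier[0]
high = paul_slowdown_weeks[1] * agi_directedness_multiplier[1]
print(low, high)                      # 3 20 -> a 3-20 week range, which brackets the 8-16 weeks above
```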
And yeah, I do think the thing I am most worried about with Chat-GPT, in addition to just shortening timelines, is increasing the number of actors in the space, which also has indirect effects on timelines. A world where both Microsoft and Google are doubling down on AI is probably also a world where AI regulation has a much harder time taking off. Microsoft and Google at large also strike me as much less careful actors than the existing leaders of AGI labs, which have so far had a lot of independence (which, to be clear, is less of an endorsement of current AGI labs, and more of a statement about very large moral-maze-like institutions with tons of momentum). In general, the dynamics of Google and Microsoft racing towards AGI sure are among my least favorite takeoff dynamics in terms of being able to somehow navigate things cautiously.
One thing worth pointing out in defense of your original estimate is that variance should add up to 100%, not effect sizes, so e.g. if the standard deviation is $100B then you could have 100 things each explaining ($10B)^2 of variance (and hence each responsible for +-$10B effect sizes after the fact).
Oh, yeah, good point. I was indeed thinking of the math a bit wrong here. I will think a bit about how this adjusts my estimates, though I think I was intuitively taking this into account.
And yeah, I do think the thing I am most worried about with Chat-GPT, in addition to just shortening timelines, is increasing the number of actors in the space, which also has indirect effects on timelines. A world where both Microsoft and Google are doubling down on AI is probably also a world where AI regulation has a much harder time taking off.
Maybe—but Microsoft and Google are huge organizations, and huge organizations have an incentive to push for regulation that imposes costs that they can pay while disproportionately hampering smaller competitors. It seems plausible to me that both M & G might prefer a regulatory scheme that overall slows down progress while cementing their dominance, since that would be a pretty standard regulatory-capture-driven-by-the-dominant-actors-in-the-field kind of scenario.
A sudden wave of destabilizing AI breakthroughs—with DALL-E/Midjourney/Stable Diffusion suddenly disrupting art and Chat-GPT who-knows-how-many-things—can also make people on the street concerned and both more supportive of AI regulation in general, as well as more inclined to take AGI scenarios seriously in particular. I recently saw a blog post from someone speculating that this might cause a wide variety of actors—M & G included—with a desire to slow down AI progress to join forces to push for widespread regulation.
It seems plausible to me that both M & G might prefer a regulatory scheme that overall slows down progress while cementing their dominance, since that would be a pretty standard regulatory-capture-driven-by-the-dominant-actors-in-the-field kind of scenario.
Interesting. Where did something like this happen?
I asked Chat-GPT and one of the clearest examples it came up with is patent trolling by large pharmaceutical companies. Their lobbying tends to be far more focused on securing monopoly rights to their products for as long as possible than anything related to innovation.
Other examples:
Automakers lobbying for restrictive standards for potential market disruptors like electric or self-driving vehicles
Telecoms lobbying against Net Neutrality
Taxi companies lobbying against ridesharing startups
Tech companies lobbying for intellectual property and data privacy regulations that they have better legal/compliance resources to handle
IMO it’s much easier to support high investment numbers in “AI” if you consider lots of semiconductor / AI hardware startup stuff as “AI investments”. My suspicion is that while GPUs were primarily a crypto thing for the last few years, the main growth outlook driving more investment is them being an AI thing.
I’d be interested to know how you estimate the numbers here, they seem quite inflated to me.
If 4 big tech companies were to invest $50B each in 2023, then assuming an average salary of $300k and a 2:1 ratio of capital to salary, that investment would amount to hiring about $50B / $900k ≈ 55,000 people each to work on this stuff. For reference, the total headcount at these orgs is roughly 100-200K.
$50B/yr is also around 25-50% of total income for these companies, and greater than profits for most, which again seems high.
Perhaps my capital ratio is way too low, but I would find it hard to believe that these companies can meaningfully put that level of capital into action so quickly. I would guess more on the order of $50B total across the major companies in 2023.
Agree with Paul’s comment above that timeline shifts are the most important variable.
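For reference, the arithmetic behind the headcount estimate above, using the stated assumptions:

```python
# Implied hiring if one big tech company invested $50B in a year on AI.
investment_per_company = 50e9     # hypothetical $50B per company in 2023
average_salary = 300e3            # assumed average salary
capital_to_salary_ratio = 2       # assumed 2:1 capital-to-salary spending

cost_per_person = average_salary * (1 + capital_to_salary_ratio)  # $900k per person per year
implied_headcount = investment_per_company / cost_per_person
print(round(implied_headcount))   # ~55,556 people per company, vs. total headcounts of 100-200K
```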
I think the qualitative difference between the supervised tuning done in text-davinci-002 and the RLHF in text-davinci-003 is modest (e.g. I’ve seen head-to-head comparisons suggesting real but modest effects on similar tasks).
Ok, I think we might now have some additional data on this debate. It does indeed look to me like Sydney was trained with the next-best available technology after RLHF, for a few months, at least based on Gwern’s guesses here: https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=AAC8jKeDp6xqsZK2K
As far as I can tell this resulted in a system with much worse economic viability than Chat-GPT. I would overall describe Sydney as “economically unviable”, such that if Gwern’s story here is correct, the difference between using straightforward supervised training on chat transcripts and OpenAI’s RLHF pipeline is indeed the difference between an economically viable and unviable product.
There is a chance that Microsoft fixes this with more supervised training, but my current prediction is that they will have to fix this with RLHF, because the other technological alternatives are indeed not adequate substitutes from an economic-viability perspective, which suggests that the development of RLHF really did matter a lot for this.
Benchmarking on static datasets on ordinary tasks (typically not even adversarially collected in the first place) may not be a good way to extrapolate to differences in level of abuse for PR-sensitive actors like megacorps, especially for abusers that are attacking the retrieval functionality (as Sydney users explicitly were trying to populate Bing hits to steer Sydney), a functionality not involved in said benchmarking at all. Or to put it another way, the fact that text-davinci-003 does only a little better than text-davinci-002 in terms of accuracy % may tell you little about how profitable in $ each will be once 4chan & the coomers get their hands on it… It is not news to anyone here that average-case performance on proxy metrics on some tame canned datasets may be unrelated to out-of-distribution robustness on worst-case adversary-induced decision-relevant losses, in much the same way that model perplexity tells us little about what a model is useful for or how vulnerable it is.
Yeah, this is basically my point. Not sure whether you are agreeing or disagreeing. I was specifically quoting Paul’s comment saying “I’ve seen only modest qualitative differences” in order to disagree and say “I think we’ve now seen substantial qualitative differences”.
We have had 4chan play around with Chat-GPT for a while, with much less disastrous results than what happened when they got access to Sydney.
It is not news to anyone here that average-case performance on proxy metrics on some tame canned datasets may be unrelated to out-of-distribution robustness on worst-case adversary-induced decision-relevant losses, in much the same way that model perplexity tells us little about what a model is useful for or how vulnerable it is.
I wish it were true that this is not news to anyone here, but it does not currently seem true to me. But doesn’t seem worth going into.
I was elaborating in more ML-y jargon, and also highlighting that there are a lot of wildcards omitted from Paul’s comparison: retrieval especially was an interesting dynamic.
For what it’s worth, I buy the claim from Gwern that Microsoft trained Sydney pretty poorly, much worse than is achievable with SFT on highly rated data. For example, Sydney shows significant repetition, which you don’t see even on text-davinci-002 or (early 2022) LaMDA, both trained without RLHF.
Yep, I think it’s pretty plausible this is just a data-quality issue, though I find myself somewhat skeptical of this. Maybe worth a bet?
I would be happy to bet that conditional on them trying to solve this with more supervised training and no RLHF, we are going to see error modes substantially more catastrophic than current Chat-GPT.
Supervised data seems way more fine-grained in what you are getting the AI to do. It’s just that supervised fine-tuning is worse.
My (pretty uninformed) guess here is that supervised fine-tuning vs RLHF has relatively modest differences in terms of producing good responses, but bigger differences in terms of avoiding bad responses. And it seems reasonable to model decisions about product deployments as being driven in large part by how well you can get AI not to do what you don’t want it to do.
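To make the distinction being debated here concrete, below is a minimal toy sketch (entirely my own construction, not OpenAI’s actual pipeline, and with made-up rewards and response names): supervised fine-tuning only pushes probability toward the demonstrated response, while an RLHF-style objective with a reward model and a KL penalty can also explicitly push probability away from responses the reward model penalizes, which is one intuition for why the difference might show up more in avoiding bad outputs than in producing good ones.

```python
import numpy as np

# Toy single-prompt "policy": a softmax over three candidate responses.
responses = ["helpful", "mediocre", "harmful"]

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Supervised fine-tuning step: gradient ascent on log-prob of the demonstrated response.
def sft_step(logits, demo_idx=0, lr=0.5):
    grad = -softmax(logits)
    grad[demo_idx] += 1.0                    # d/dlogits of log p[demo_idx]
    return logits + lr * grad

# RLHF-style step: gradient ascent on expected reward minus a KL penalty to a reference policy.
reward = np.array([1.0, 0.0, -3.0])          # reward model strongly penalizes the harmful response

def rlhf_step(logits, ref_logits, beta=0.1, lr=0.5):
    p, p_ref = softmax(logits), softmax(ref_logits)
    advantage = reward - beta * (np.log(p) - np.log(p_ref))
    advantage -= (p * advantage).sum()       # subtract the policy's average reward (baseline)
    return logits + lr * p * advantage       # exact gradient of the objective w.r.t. logits

logits_sft = np.zeros(3)
logits_rlhf = np.zeros(3)
ref_logits = np.zeros(3)
for _ in range(50):
    logits_sft = sft_step(logits_sft)
    logits_rlhf = rlhf_step(logits_rlhf, ref_logits)

print(dict(zip(responses, softmax(logits_sft).round(3))))   # mass pulled toward "helpful" only
print(dict(zip(responses, softmax(logits_rlhf).round(3))))  # "harmful" pushed well below "mediocre"
```

In this toy setup both methods end up favoring the good response; the difference is that only the reward-based update drives the penalized response’s probability below the merely mediocre one.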
And it seems reasonable to model decisions about product deployments as being driven in large part by how well you can get AI not to do what you don’t want it to do.
It depends a lot on the use case.
When it comes to what I’m doing with ChatGPT, I care more about the quality of the best answer when I generate five answers to a prompt than I care about the quality of the worst answer. I can choose the best answer myself and ignore the others.
Many use cases have ways to filter for valuable results either automatically or by letting a human filter.
I think it’s unlikely that the reception of ChatGPT increased OpenAI’s valuation by $10B, much less investment in OpenAI, even before thinking about replaceability.
Note that I never said this, so I am not sure what you are responding to. I said Chat-GPT increased investment in AI by $10B, not that it increased investment into specifically OpenAI. Companies generally don’t have perfect moats. Most of that increase in investment is probably in internal Google allocation and in increased investment into the overall AI industry.
Relevant piece of data: https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/?fbclid=IwAR3KTBnxC_y7n0TkrCdcd63oBuwnu6wyXcDtb2lijk3G-p9wdgD9el8KzQ4
Feb 1 (Reuters) - ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after launch, making it the fastest-growing consumer application in history, according to a UBS study on Wednesday.
The report, citing data from analytics firm Similarweb, said an average of about 13 million unique visitors had used ChatGPT per day in January, more than double the levels of December.
“In 20 years following the internet space, we cannot recall a faster ramp in a consumer internet app,” UBS analysts wrote in the note.
I had some decent probability on this outcome but I have increased my previous estimate of the impact of Chat-GPT by 50%, since I didn’t expect something this radical (“the single fastest growing consumer product in history”).