kuira comments on How to talk about reasons why AGI might not be near?

kuira 17 Sep 2023 10:21 UTC
24 points
15
Hasn’t the alignment community historically done a lot to fuel capabilities?
For example, here’s an excerpt from a post I read recently
My guess is RLHF research has been pushing on a commercialization bottleneck and had a pretty large counterfactual effect on AI investment, causing a huge uptick in investment into AI and potentially an arms race between Microsoft and Google towards AGI: https://www.lesswrong.com/posts/vwu4kegAEZTBtpT6p/thoughts-on-the-impact-of-rlhf-research?commentId=HHBFYow2gCB3qjk2i
- leogao 18 Sep 2023 19:37 UTC
  14 points
  1
  Parent
  I don’t think RLHF in particular had a very large counterfactual impact on commercialization or the arms race. The idea of non-RL instruction tuning for taking base models and making them more useful is very obvious for commercialization (there are multiple concurrent works to InstructGPT). PPO is better than just SFT or simpler approaches on top of SFT, but not groundbreakingly more so. You can compare text-davinci-002 (FeedME) and text-davinci-003 (PPO) to see.
  The arms race was directly caused by ChatGPT, which took off quite unexpectedly not because of model quality due to RLHF, but because the UI was much more intuitive to users than the Playground (instruction following GPT3.5 was already in the API and didn’t take off in the same way). The tech tree from having a powerful base model to having a chatbot is not constrained on RLHF existing at all, either.
  To be clear, I happen to also not be very optimistic about the alignment relevance of RLHF work beyond the first few papers—certainly if someone were to publish a paper today making RLHF twice as data efficient or whatever I would consider this basically just a capabilities paper.