Anyone want to predict when we’ll reach the same level of translation and other language capability as GPT-3 via iterated amplification or another “aligned” approach? (How far behind is alignment work compared to capability work?)
I think GPT-3 should be viewed as roughly as aligned as IDA would be if we pursued it using our current understanding. GPT-3 is trained via self-supervised learning (which is, on the face of it, myopic), so the only obvious x-safety concern is something like mesa-optimization.
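To illustrate what I mean by the training signal being myopic, here is a minimal sketch of next-token-prediction training. All the specifics (the tiny LSTM stand-in for a transformer, the dimensions, the random data) are placeholders, not GPT-3's setup; the point is just that the loss at each position depends only on predicting the immediately following token, with no term that rewards influencing anything further ahead.

```python
import torch
import torch.nn as nn

# Toy next-token prediction setup (hypothetical sizes, not GPT-3's).
vocab_size, d_model, seq_len, batch = 100, 32, 16, 4

embed = nn.Embedding(vocab_size, d_model)
encoder = nn.LSTM(d_model, d_model, batch_first=True)  # stand-in for a transformer
head = nn.Linear(d_model, vocab_size)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (batch, seq_len))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

hidden, _ = encoder(embed(inputs))
logits = head(hidden)  # (batch, seq_len - 1, vocab_size)

# The objective is purely per-token: each position is scored only on how
# well it predicts the very next token. Nothing in the loss rewards
# affecting later predictions or the world -- this is the sense in which
# the training signal is "myopic".
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
```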
In my mind, the main argument for IDA being safe is still myopia.
I think GPT-3 seems safer than (recursive) reward modelling, CIRL, or any other alignment proposals based on deliberately building agent-y AI systems.
--------------------
In the above, I’m ignoring the ways in which any of these systems increase x-risk via their (e.g. destabilizing) social impact and/or contribution towards accelerating timelines.