Steven Byrnes comments on My AGI Threat Model: Misaligned Model-Based RL Agent

Steven Byrnes 12 May 2021 20:12 UTC
LW: 4 AF: 3
Re online learning: I acknowledge that a lot of reasonable people agree with you on that, and it’s hard to know for sure. But I argued my position in my post “Against evolution as an analogy for how humans will build AGI”.
That post also has a comment thread about why I’m skeptical that GPT-N would be capable of doing the things we want AGI to do, unless we fine-tune the weights on the fly, in a manner reminiscent of online learning (or amplification).
- abramdemski 13 May 2021 15:12 UTC
  LW: 2 AF: 2
  I have not properly read all of that yet, but my very quick take is that your argument for a need for online learning strikes me as similar to your argument against the classic inner alignment problem applying to the architectures you are interested in. You find what I call mesa-learning implausible for the same reasons you find mesa-optimization implausible.
  Personally, I’ve come around to the position (seemingly held pretty strongly by other folks, e.g. Rohin) that mesa-learning is practically inevitable for most tasks.