I hadn’t yet got around to reading the CAST series: now I have to! :-)
Some of the authors of the Pretraining Language Models with Human Preferences paper now work at Anthropic. I would also love for Anthropic to hire me to work on this stuff!
In some sense, the human input and oversight in AI-assisted alignment is the same thing as corrigibility.