as a weak alignment technique we might use to bootstrap strong alignment.
Yes, it also reminded me of Christiano's approach of amplification and distillation.
Thanks both! I definitely had the idea that Paul had mentioned something similar somewhere but hadn't made it a top-level concept. I think there are similar echoes in how Eliezer talked about seed AI in the early Friendly AI work.