To be clear, I am definitely not arguing for a pure mechanism-design approach to all of AI alignment. The argument in the OP is relevant to inner optimizers because we can’t just directly choose which goals to program into them. We can directly choose which goals to program into an outer optimizer, and I definitely think that’s the right way to go.