evhub comments on Risks from Learned Optimization: Conclusion and Related Work

evhub 24 Jul 2019 21:34 UTC
LW: 5 AF: 3
I think it’s still an open question how much avoiding mesa-optimization entirely would hurt capabilities, but my sense is that mesa-optimization is likely inevitable if you want to build safe AGI that is competitive with a baseline unaligned approach. Thus, I tend to think that the right strategy is to accept that you’re definitely going to produce a mesa-optimizer, and to have a really strong story for why it will be aligned.