This post is on a very important topic: how could we scale ideas about value extrapolation or avoiding goal misgeneralisation… all the way up to superintelligence? As such, its ideas are well worth exploring and getting to grips with.
However, the post itself is not brilliantly written; it is more an "idea of a potential approach" than a well-crafted theory post. I hope to revisit it at some point soon, but haven't yet found or made the time.