Having just read this sequence (or for some of the posts in it, reread them), I endorse it too: it’s excellent.
It covers a lot of the same ground as my post Requirements for a STEM-capable AGI Value Learner (my Case for Less Doom) in a more leisurely and discursive way, and I think it ends up in about the same place: Value Learning isn't about locating the One True Unified Utility Function that is the True Name of happiness and can thus be safely strongly optimized. It's about treating the study of human values like any other STEM-like soft science field: doing the same sorts of cautious, Bayesian, experimental things that we do in any scientific/technical/engineering effort, and avoiding Goodharting by being careful enough not to trust models (of human values, or anything else) outside their experimentally-supported range of validity, like any sensible STEM practitioner. So use all of STEM; don't only think like a mathematician.