Is this in fact a part of the AI alignment problem, and if so is anyone trying to solve this facet of the problem and where might I go to read more about that?
Yes, it’s part of some approaches to the AI alignment problem. It used to be considered more central to alignment, until people started to think it might be too hard and shifted toward approaches that perhaps don’t require “finding an effective way to tell an AI what wellbeing is”. See AI Safety “Success Stories”, where “Sovereign Singleton” requires solving this and the other success stories don’t (at least not right away). See also Friendly AI and Coherent Extrapolated Volition.