My concern in this area is that we currently don’t have a single specification for such a utility function.
I would claim that both Value Learning and Requirements for a Basin of Attraction to Alignment are outlines for how to create such a utility function. But I agree we don’t have a detailed specification yet.