I’m in a weird situation here: I’m not entirely sure whether the community considers the Learning Theory Agenda to be the same alignment plan as The Plan (which is arguably not a plan at all, but its author sure thinks about value learning!), and whether I can count things like the class of scalable oversight plans that take as read that “human values” are a specific natural object. Would you at least agree that those first two (or one???) rely on that assumption?