If it helps, I have a discussion of Concept Extrapolation in the context of aligning a real-deal agent-y AGI in §14.4 here.
So far I can’t quite get the whole story to hang together, as you’ll see from that link. But I definitely see it as a “shot on goal”. (Well, at least, I think the broader project / framework is a “shot on goal”. I don’t find the image classification project to be directly addressing any of my most burning questions.)