Yeah, this seems right. I didn’t include them because it’s a lot more fuzzy and intuition-y than everything else that I’ve written. (This wasn’t an explicit, conscious choice; more like when I generated the list of things I wanted to write about, this wasn’t on it because it was insufficiently crystallized.) I agree that it really should be in the sequence somewhere, I’ll probably add it to the post on narrow value learning some time after the sequence is done.
AI safety without goal-directed behavior very vaguely gestures in the right direction, but there’s no reasonable way for a reader to figure out my hopes for narrow value learning from that post alone.