Ok, I think I mostly understand now, but it seems like I had to do a lot of guessing and asking questions to figure out what your hopes are for the future of narrow value learning and how you see it potentially fitting into the big picture for long-term AI safety, which are important motivations for this part of the sequence. Did you write about them somewhere that I missed, or were you planning to write about them later? If later, I think it would have been better to write about them at the same time that you introduced narrow value learning, so readers have some idea of why they should pay attention to it. (This is mostly feedback for future reference, but I guess you could also add to previous posts for the benefit of future readers.)
Yeah, this seems right. I didn’t include them because they’re a lot more fuzzy and intuition-y than everything else that I’ve written. (This wasn’t an explicit, conscious choice; more like when I generated the list of things I wanted to write about, this wasn’t on it because it was insufficiently crystallized.) I agree that it really should be in the sequence somewhere; I’ll probably add it to the post on narrow value learning some time after the sequence is done.
AI safety without goal-directed behavior very vaguely gestures in the right direction, but there’s no reasonable way for a reader to figure out my hopes for narrow value learning from that post alone.