Thanks. I guess I’d just prefer it if more people were saying, “Hey, even though it seems difficult, we need to go hard after conscience guard rails (or ‘value alignment’) for AI now and not wait until we have AIs that could help us figure this out. Otherwise, some of us might not make it until we have AIs that could help us figure this out.” But I also realize that I’m just generally much more optimistic about the tractability of this problem than most people appear to be, although Shane Legg seemed to say it wasn’t “too hard,” haha.[1]
Legg was talking about something different than I am, though—he was talking about “fairly normal” human values and ethics, or what most people value, while I’m basically talking about what most people would value if they were wiser.
Oh hey—I just stumbled back on this comment and realized: it’s the primary reason I wrote
Intent alignment as a stepping-stone to value alignment
On not giving up on value alignment, while acknowledging that instruction-following is a much safer first alignment target.