You had me at the title. :) I feel like I’ve been waiting ten years for someone to make this point. (Ten years ago is roughly when I first heard about and started getting into game and decision theory.) I’m not sure I like the taxonomy of alignment approaches you propose, but I still think it’s valuable to highlight the ‘magic parts’ that are so often glossed over and treated as unimportant.
I spent way too many years metaphorically glancing around the room, certain that I must be missing something that was obvious to everyone else. I wish somebody had told me that I wasn’t missing anything, and that these conceptual blank spots are very real and very important.
As for the latter point: I am not really an Alignment Guy, and the taxonomy I offer is very incomplete. I do think the idea of framing the Alignment landscape in terms of “how does it help build a good decision tree, and what part of that process does it address or solve?” has some potential.
Huh, this lines up with my view from 11 years ago, though apparently I didn’t state it as crisply as I remember stating it. (Maybe it was in a comment I couldn’t find?) Like, the math is trivial; the difficulties are all in problem formulation and model elicitation (of both preferences and dynamics).
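To make that concrete, here’s a toy sketch (the action names and numbers are invented purely for illustration): once the tree exists, with outcomes, probabilities, and utilities all written down, picking the best action is a one-liner. Everything hard happened before this point.

```python
# Toy decision tree: actions and numbers are made up for illustration.
# Each action maps to (probability, utility) outcome pairs -- i.e., the
# problem formulation and model elicitation are already done.
tree = {
    "deploy":    [(0.9, 10.0), (0.1, -100.0)],
    "test_more": [(1.0, 2.0)],
}

def expected_utility(outcomes):
    # The "trivial math": a probability-weighted sum over outcomes.
    return sum(p * u for p, u in outcomes)

# Choosing the optimal action is one line once the tree is specified.
best = max(tree, key=lambda action: expected_utility(tree[action]))
print(best)  # -> "test_more" (EU 2.0 beats deploy's EU -1.0)
```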