My issue with a textbook comes more from the lack of consensus. Like, the fundamentals (what you would put in the first few chapters) for embedded agency are different from those for preference learning, different from those for inner alignment, different from those for agent incentives (to name only a handful of research directions). IMO, a textbook would either overlook big chunks of the field or look more like an enumeration of approaches than a unified resource.
Textbooks that cover a number of different approaches without taking a position on which one is the best are pretty much the standard in many fields. (I recall struggling with it in some undergraduate psychology courses, as previous schooling didn’t prepare me for a textbook that would cover three mutually exclusive theories and present compelling evidence in favor of each. Before moving on and presenting three mutually exclusive theories about some other phenomenon on the very next page.)
Fair enough. I think my real issue with an AI Alignment textbook is that for me a textbook presents relatively foundational and well-established ideas and theories (maybe multiple ones), whereas I feel that AI Alignment is basically only state-of-the-art exploration, and that we have very few things that should actually be put into a textbook right now.
But I could change my mind if you have an example of what should be included in such an AI Alignment textbook.
That doesn’t seem like a big problem to me. Just make a different textbook for each major approach, or a single textbook that talks about each of them in turn. I would love such a book, and would happily recommend it to people looking to learn more about the field.
Or, just go ahead and overlook big chunks of the field. As long as you are clear that this is what you are doing, the textbook will still be useful for those interested in the chunk it covers.
As I said in my answer to Kaj, the real problem I see is that I don’t think we have the necessary perspective to write a useful textbook. Textbooks basically never touch research from the last ten years, unless that research is really easy to interpret and present, which is not the case here.
I’m open to being proven wrong, though.
I think we do. I also think attempting to write a textbook would speed up the process of acquiring more perspective. Our goals, motivations, and constraints are very different from the goals and motivations of most textbook-writers, I think, so I don’t feel much pressure to defer to the collective judgment of other textbook-writers.