Thanks for elaborating. I don’t think I have the necessary familiarity with the alignment research community to assess your characterization of the situation, but I appreciate your willingness to call attention to potentially unpopular hypotheses. +1
+1 for this whole conversation, including Adam pushing back re prosaic alignment / trying to articulate disagreements! I agree that this is an important thing to talk about more.
I like the ‘give more concrete feedback on specific research directions’ idea, especially if it helps clarify generators for Eliezer’s pessimism. If Eliezer is pessimistic about a bunch of different research approaches simultaneously, and you’re simultaneously optimistic about all those approaches, then there must be some more basic disagreement(s) behind that.
From my perspective, the OP discussion is the opening salvo in ‘MIRI does a lot more model-sharing and discussion’. It’s more like a preface than a conclusion, and the next topic we plan to focus on is why Eliezer-cluster people think alignment is hard, how we’re thinking about AGI, etc. In the meantime, I’m strongly in favor of arguing about this a bunch in the comments, sharing thoughts and reflections on your own models, etc. -- going straight for the meaty central disagreements now, not waiting to hash this out later.
Someone privately contacted me to express confusion, because they thought my ‘+1’ meant that I think adamShimi’s initial comment was unusually great. That’s not the case. The reasons I commented positively are:
I think this overall exchange went well—it raised good points that might have otherwise been neglected, and everyone quickly reached agreement about the real crux.
I want to try to cancel out any impression that criticizing / pushing back on Eliezer-stuff is unwelcome, since Adam expressed worries about a “taboo on criticizing MIRI and EY too hard”.
On a more abstract level, I like seeing people ‘blurt out what they’re actually thinking’ (if done with enough restraint and willingness-to-update to mostly avoid demon threads), even if I disagree with the content of their thought. I think disagreements are often tied up in emotions, or pattern-recognition, or intuitive senses of ‘what a person/group/forum is like’. This can make it harder to epistemically converge about tough topics, because there’s a temptation to pretend your cruxes are simpler and more legible than they really are, which leads to talking about non-cruxy things.
Separately, I endorse Ben Pace’s question (“Can you make a positive case here for how the work being done on prosaic alignment leads to success?”) as the thing to focus on.
Thanks for the kind answer, even if we probably disagree about most points in this thread. I think messages like yours really help in making everyone aware that such topics can actually be discussed publicly without a big backlash.
That sounds amazing! I definitely want to extract some of the epistemic strategies that EY uses to generate criticisms and break proposals. :)
Thanks for taking the time to ask a question about the discussion even if you lack expertise on the topic. ;)
Excited about that!