I don’t want to speak for Nate, and I also don’t want to particularly defend my own behavior here, but I have kind of done similar things around trying to engage with the “AI is easy to control” stuff.
I found it quite hard to engage with directly. I have read the post, but I would not claim to be close to passing an ITT of its authors, and I bounced off a few times. I don’t currently expect direct conversation with Quintin or Nora to be that productive (though I would still be up for it and would give it a try).
So I have been talking about the stuff with friends and other people in my social circle who I have a history of communicating well with, and I think that’s been valuable to me. Many of them had similar experiences, so in some sense it did feel like a group of blind men groping around an elephant, but I don’t have a much better alternative. I did not find the original post easy to understand, or the kind of thing I felt capable of responding to.
I would kind of appreciate better suggestions. I have not found just forcing myself to engage more with the original post to help me much. Dialogues like this do actually seem helpful to me (and I found reading this valuable).
How much have you read about deep learning from “normal” (non-xrisk-aware) AI academics? Belrose’s Tweet-length argument against deceptive alignment sounds really compelling to the sort of person who’s read (e.g.) Simon Prince’s textbook but not this website. (This is a claim about what sounds compelling to which readers rather than about the reality of alignment, but if xrisk-reducers don’t understand why an argument would sound compelling to normal AI practitioners in the current paradigm, that’s less dignified than understanding it well enough to confirm or refute it.)
I think I could pass the ITTs of Quintin/Nora sufficiently to have a productive conversation while also having interesting points of disagreement. If that’s the bottleneck, I’d be interested in participating in some dialogues, if it’s a “people genuinely trying to understand each other’s views” vibe rather than a “tribalistically duking it out for the One True Belief” vibe.
This is really interesting, because I find Quintin and Nora’s content incredibly clear and easy to understand.
As one hypothesis (which I’m not claiming is true for you, just a thing to consider): when someone is pointing out a valid flaw in my views or claims, I personally find the critique harder to “understand” at first. (I know this because I’m considering the times where I later agreed the critique was valid, even though it was “hard to understand” at the time.) I think this “difficulty” is basically motivated cognition.
I am a bit stressed right now, and so maybe am reading your comment too much as a “gotcha”, but on the margin I would like to avoid psychologizing of me here (I think it’s sometimes fine, but the above already felt a bit vulnerable and this direction feels like it disincentivizes that). I generally like sharing the intricacies and details of my motivations and cognition, but this is much harder if this immediately causes people to show up to dissect my motivations to prove their point.
More on the object level, I don’t think this is the result of motivated cognition, though it’s of course hard to rule out. I would prefer that this kind of thing not become a liability to say out loud in contexts like this; I expect that will make conversations where people try to understand where other people are coming from go better.
Sorry if I overreacted in this comment. I do think in a different context, on maybe a different day I would be up for poking at my motivations and cognition and see whether indeed they are flawed in this way (which they very well might be), but I don’t currently feel like it’s the right move in this context.
I think it’s sometimes fine, but the above already felt a bit vulnerable and this direction feels like it disincentivizes that
FWIW, I think your original comment was good and I’m glad you made it, and want to give you some points for it. (I guess that’s what the upvote buttons are for!)
Fwiw, I generally find Quintin’s writing unclear and difficult to read (I bounce a lot) and Nora’s clear and easy, even though I agree with Quintin slightly more (although I disagree with both of them substantially).
I do think there is something to “views that are very different from one’s own” being difficult to understand, sometimes, although I think this can be for a number of reasons. Like, for me at least, understanding someone with very different beliefs can be both time-intensive and cognitively demanding—I usually have to sit down and iterate on “make up a hypothesis of what I think they’re saying, then go back and check if that’s right, update hypothesis, etc.” This process can take hours or days, as the cruxes tend to be deep and not immediately obvious.
Usually before I’ve spent significant time on understanding writing in this way, e.g. during the first few reads, I feel like I’m bouncing, or otherwise find myself wanting to leave. But I think the bouncing feeling is (in part) tracking that the disagreement is really pervasive and that I’m going to have to put in a bunch of effort if I actually want to understand it, rather than that I just don’t like that they disagree with me.
Because of this, I personally get a lot of value out of interacting with friends who have done the “translating it closer to my ontology” step—it reduces the understanding cost a lot for me, which tends to be higher the further from my worldview the writing is.
Yeah, for me the early development of shard theory work was confusing for similar reasons. Quintin framed values as contextual decision influences and thought these were fundamental, while I’d absorbed from Eliezer that values were like a utility function. They just think in very different frames. This is why science is so confusing until one frame proves useful and is established as a Kuhnian paradigm.