I engage too much w/ generalizations about AI alignment researchers.
Noticing this behavior seems useful for analyzing it and strategizing around it.
A particular pattern to be on the lookout for is “AI alignment researchers make mistake X” or “AI alignment researchers are wrong about Y”. In the extreme, I get pretty activated/triggered by this, which causes me to engage with it more than I otherwise would.
This engagement is probably encouraging more of it to happen, so pausing and reflecting before responding would probably lead to better outcomes.
It’s worth acknowledging that this comes from a deep sense of caring about AI alignment research and how important it feels to me. I spend a lot of my time (both work time and free time) trying to foster and grow the field. It seems reasonable that I want people to have correct beliefs about it.
It’s also worth acknowledging that there will be some cases where my disagreement is wrong. I definitely don’t know all AI alignment researchers, and there will be cases where broad field-wide phenomena exist that I’ve missed. However, I think this is rare, and most of the people I interact with will probably have less experience w/ the field of AI alignment.
Another confounder is that the field is both pretty jargon-heavy and very confused about a lot of things. This can lead to semantic confusion masking other intellectual progress. I’m definitely not at the “words have meanings, dammit” extreme, and maybe I can do a better job asking people to clarify and define the terms I think are getting confused.
A takeaway I have right now from reflecting on this is that “I disagree about <sweeping generalization about AI alignment researchers>” feels like a simple and neutral statement to me, but isn’t a simple and neutral statement in a dialog.
Thinking about the good parts of scout mindset, here are some things I could do instead that would be better:
I can do a better job conserving my disagreement. I like the model of treating this as a limited resource, and ensuring it is well-spent seems good.
I can probably get better mileage out of pointing out areas of agreement than going straight to highlighting disagreements (Crocker’s rules be damned).
I am very optimistic about double-crux, both as a practice that should be used more widely and as a specific technique to deploy in these circumstances. I am confused about why I don’t see more of it happening.
I think that’s all for the reflection on this for now.