In particular, in a fast takeoff world, AI takeover risk never looks much more obvious than it does now, and so x-risk-motivated people should be assumed to cause the majority of the research on alignment that happens.
I strongly disagree with that, and I don’t think it follows from the premise. I think that by most reasonable definitions of alignment, most of the research is already not done by x-risk-motivated people.
Furthermore, I think it reflects poorly on this community that this sort of sentiment seems to be common.
It’s possible that a lot of our disagreement is due to different definitions of “research on alignment”, where you would only count things that (e.g.) 1) are specifically about alignment that likely scales to superintelligent systems, or 2) are motivated by x-risk.
To push back on that a little bit...
RE (1): It’s not obvious what will scale, and I think historically this community has been too pessimistic (i.e. almost completely dismissive) about approaches that seem hacky or heuristic.
RE (2): This is basically circular.
I disagree, so I’m curious: what are, for you, great examples of good research on alignment that is not done by x-risk-motivated people? (Not being dismissive, I’m genuinely curious, and discussing specifics sounds more promising than downvoting you to oblivion and not having a conversation at all.)
Examples would be interesting, certainly. Concerning the post’s point, I’d say the relevant claim is that [the type of alignment research that’ll be increasingly done in slow takeoff scenarios] is already being done by non-x-risk-motivated people.
I guess the hope is that at some point there are clear-to-everyone problems with no hacky solutions, so that incentives align to look for fundamental fixes—but I wouldn’t want to rely on this.
I also stumbled over this sentence.
1) I think even non-obvious issues can get much more research traction than AI safety does today. And I don’t even think that catastrophic risks from AI are particularly non-obvious?
2) Not sure how broadly “cause the majority of research” is defined here, but I have some hope that we can find ways to turn money into relevant research.