The view that I think you’re referring to is somewhat more nuanced. “Studying AI” and “trying to develop AI” refer to fairly wide classes of activities, which may have very different risk profiles. If one buys the general class of arguments for AI risk, then “trying to develop AI” almost certainly means advancing AI capabilities, which shortens timelines (which is bad). “Studying AI” could mean anything from “doing alignment research” (probably good) to “doing capabilities research” (probably bad), or something else entirely (the expected value of which would depend on the specifics).
I think you can lump them together for this conversation
Why do you think this?
It seems to me that reading books about deep learning is a perfectly fine thing to do, but that publishing papers that push forward the frontier of deep learning is plausibly quite bad. These seem like such different activities that I’m not at all inclined to lump them together for the purposes of this question.
I wouldn’t call it an infohazard; that term generally refers to information that’s harmful simply to know, rather than information that’s harmful because it might, e.g., advance timelines.
There are arguments to be made about how much overlap there is between capabilities research and alignment research, but I think that by default, most things that would be classified as capabilities research do not meaningfully advance AI alignment. For the opposite to be true, you’d need >50% of all capabilities work to advance alignment “by default” (without requiring any active effort to “translate” that capabilities work into something helpful for alignment), since the relative levels of effort invested are so heavily skewed toward capabilities. See also https://www.lesswrong.com/tag/differential-intellectual-progress.
I think there’s probably value in being on an alignment team at a “capabilities” org, or even embedded in a capabilities team if the role itself doesn’t involve work that contributes to capabilities (either via first-order or second-order effects).
I think that the “in the room” argument might start to make sense when there’s actually a plan for alignment that’s in a sufficiently ready state to be operationalized. AFAICT nobody has such a plan yet. For that reason, I think maintaining & improving lines of communication is very important, but if I had to guess, I’d say you could get most of the anticipated benefit there without directly doing capabilities work.
Yes, this does seem to be happening. It also appears to be unavoidable.
Our state of knowledge is nowhere near being able to guarantee that any AGI we develop will not kill us all. We are already developing AI that is superhuman in increasingly many aspects. Those who are actively working right now to bring the rest of the capabilities up to and above human levels obviously can’t be sufficiently concerned, or they would not be doing it.