I’m curious about the disagree votes as well; it would be useful to hear from those disagreeing. Making the public more aware of the harmful capabilities of current models is valuable in my view, because it helps make slowdowns and other safety legislation more viable. One could argue that this provides a blueprint for misuse, but it seems unlikely that misuse is bottlenecked on how-to resources; it’s not difficult to find information on how to jailbreak models (eg it’s all over Twitter).
That said, while I do think it’s important to ensure that the public is aware of both current and future risks, unilaterally pointing the media in the direction of potentially sensationalizable individual studies is probably not the best way to go about that. In retrospect my suggestion to consider that was itself ill-considered, and I retract it.
What would a better way look like?
I’m not sure. My second thoughts were eg, ‘Interactions with the media often don’t go the way people expect’ and ‘Sensationalizable research often gets spun into pre-existing narratives and can end up having net-negative consequences.’ It’s possible that my original suggestion makes sense, but my uncertainty is high enough that on reflection I’m not comfortable endorsing it, especially given my own lack of experience dealing with the media.