Red-teaming is being done in a way that doesn’t reduce existential risk at all but instead makes models less useful for users.
https://x.com/shaunralston/status/1821828407195525431
I agree that there is a lot of ‘red teaming to save corporate face’ going on, as part of a workflow that makes the products less useful to end users and has a neutral or negative impact on catastrophic risk.
I can also confidently state that at least some ‘catastrophic risk red teaming’ is being undertaken at the same time, and that it does shape products in helpful ways. I think part of why this appears to play little or no role in the product-shaping behavior of the corporations involved is ‘deniability maintenance’. To avoid culpability risk and negative consumer perceptions, it is in the interest of the AI corporations to hide evidence of catastrophic risks while simultaneously seeking to mitigate them. Part of this hiding process would surely be to restrict those who know specific details of ongoing catastrophic risk red teaming from speaking publicly about their efforts.
With such dynamics in play, you should not count absence of evidence as evidence of absence. In such a context, the silence itself should seem suspicious. Consider the quantity and strength of evidence (as opposed to ungrounded proclamations) that you have seen presented for the specific claim that “this research is proof that there are no catastrophic risks.” That line of research seems remarkably quiet when you think of it like that. Perhaps suspiciously so.