Enforcement of mitigations when it’s someone else who removes them won’t be seen as relevant, since in this religion a contributor is fundamentally not responsible for how the things they release will be used by others.
This may be true of people who talk a lot about open source, but among actual maintainers the attitude is pretty different. If some user causes harm with an overall positive tool, that's on the user; but if the contributor has built something consistently or overall harmful, that is indeed on them. Maintainers tend to avoid working on projects which are mostly useful for surveillance, weapons, etc. for pretty much this reason.
Source: my personal experience as a maintainer and PSF Fellow, and the multiple Python core developers I just checked with at the PyCon sprints.
if the contributor has built something consistently or overall harmful, that is indeed on them
I agree, this is in accord with the dogma. But for AI, overall harm is debatable and currently purely hypothetical, so this doesn't really apply. There is a popular idea that existential risk from AI has little basis in reality, since it's not already here to be observed. So contributing to public AI efforts remains seen as fine (and, on first-order effects, it genuinely is fine right now).
My worry is that this attitude reframes commitments from RSP-like documents, so that people don't see the obvious implication that releasing weights breaks those commitments (absent currently impossible feats of unlearning), and don't see themselves as committing to avoid releasing high-ASL weights even as they commit to such RSPs. If this point isn't written down, some people will only become capable of noticing it if actual catastrophes shift the prevailing attitude toward open-weights foundation models being harmful overall (even once we are already higher up in ASLs). And that shift doesn't necessarily happen even if there are catastrophes with a limited blast radius, since those get balanced against positive effects.