I think compute monitoring at the level of literally 8x 4090s is very likely to be way too hard against a reasonably committed adversary. (The hardware is already way too easy to access and broadly distributed. Also, if hardware at this level is a problem, you might also have issues with people’s MacBooks, which would be super costly.)
My overall guess is that for realistically sized models (0.1-100 trillion parameters), we can’t prevent an ML-competent adversary from doing a small amount of fine-tuning and inference.
But maybe there is some hope for models which are more like 100 trillion parameters? (At minimum, I think 8x 4090s aren’t going to work well for models of this size.)
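To put rough numbers on that last point, here is a back-of-envelope sketch (my own assumptions, not from the thread: 24 GB of VRAM per 4090, counting weights only, ignoring activations, KV cache, and offloading to CPU RAM or disk):

```python
# Rough back-of-envelope: can a 100-trillion-parameter model fit on 8x RTX 4090?
# Assumptions: 24 GB VRAM per card, weights only (no activations, KV cache,
# or optimizer state), fp16 and 4-bit quantized weight formats.

params = 100e12                      # 100 trillion parameters
bytes_per_param_fp16 = 2
bytes_per_param_int4 = 0.5

vram_total = 8 * 24e9                # 8 cards x 24 GB = 192 GB

weights_fp16 = params * bytes_per_param_fp16   # ~200 TB
weights_int4 = params * bytes_per_param_int4   # ~50 TB

print(f"fp16 weights: {weights_fp16 / 1e12:.0f} TB")
print(f"int4 weights: {weights_int4 / 1e12:.0f} TB")
print(f"8x 4090 VRAM: {vram_total / 1e9:.0f} GB")
print(f"Shortfall vs int4: ~{weights_int4 / vram_total:.0f}x too little memory")
```

Even with aggressive 4-bit quantization, the weights alone come out a couple of hundred times larger than the available VRAM, so inference at that scale would have to stream weights from much slower storage.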
Yeah, the question of where the threshold for dangerous capabilities lies seems very important to whether we can hope to make compute governance part of restricting bad actors from using fine-tuning and inference to do bad things.
The reason this is important to try to predict is that we really don’t want to release dangerously capable open-source models that bad actors could misuse. Once such models have been released, there’s no taking them back. So if there’s a risk that a given model would be dangerous once released openly, where bad actors could keep a private copy to fine-tune, then the intervention point is to disallow releasing the model in the first place.