Nathan Helm-Burger comments on The case for ensuring that powerful AIs are controlled

Nathan Helm-Burger 29 Jan 2024 4:15 UTC
2 points
0
I agree with pretty much all of this and appreciate your clear framing of the issues at hand. It seems like where our concerns differ is around these two issues. First, I believe that the offense-defense balance for AI-enabled biorisk is such that a bad actor with an open-weights fine-tuned model could kill billions of people with less than $100k. Second, I don’t think that the fine-tuning or inference would require more than a single server with 8x GPUs (potentially even just 8x 4090s). So unless the compute regulations are monitoring individual 4090 GPUs, you aren’t blocking inference or fine-tuning. Training takes a bunch of servers (often millions or billions of dollars’ worth of hardware), and thus seems more plausible to monitor.

I haven’t heard any concrete proposals for compute monitoring at the level of 8x 4090 GPUs, have you?
- ryan_greenblatt 29 Jan 2024 17:37 UTC
  2 points
  0
  I think compute monitoring for literally 8x 4090 is very likely to be way too hard against a reasonably committed adversary. (The hardware is already way too easy to access and too broadly distributed. Also, if 8x 4090 causes problems, you might also have issues with people’s MacBooks, which would be super costly to monitor.)
  
  My overall guess is that for realistically sized models (0.1-100 trillion parameters), we can’t prevent an ML-competent adversary from doing a small amount of fine-tuning and inference.
  
  But, maybe there is some hope for models which are more like 100 trillion parameters? (Minimally, I think 8x 4090 isn’t going to work well for models of this size.)
  - Nathan Helm-Burger 29 Jan 2024 17:47 UTC
    2 points
    0
    Yeah, the question of where the threshold of dangerous capabilities is seems very important to whether we can hope to make compute governance a part of restricting bad actors from using fine-tuning & inference to do bad things.
    This is important to try to predict because we really want to avoid releasing powerfully dangerous open-source models which bad actors could use for bad things. Once those models have been released, there’s no taking them back. So if there’s a risk that a given model would be dangerous if released openly, such that bad actors could have a private copy to fine-tune, then the intervention point is to disallow the release of the model.
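
The hardware question running through this exchange (whether 8x 4090 is enough for models of a given size) can be roughed out with simple memory arithmetic. The sketch below is only a back-of-the-envelope illustration, not something either commenter wrote. It assumes 24 GB of memory per RTX 4090 and counts only the bytes needed to hold the weights, ignoring activations, KV cache, gradients, and optimizer state, so it is a lower bound (fine-tuning in particular needs more). The names `weight_memory_gb` and `fits` and the precision table are hypothetical helpers introduced for illustration.

```python
# Back-of-the-envelope sketch: do a model's weights alone fit in 8x RTX 4090?
# Assumptions (not from the thread): 24 GB per GPU, weights only,
# no activations, KV cache, gradients, or optimizer state.

GPU_MEMORY_GB = 24                    # assumed memory per RTX 4090
NUM_GPUS = 8
TOTAL_GB = GPU_MEMORY_GB * NUM_GPUS   # 192 GB across the server

# Bytes per parameter at a few common weight precisions (illustrative).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory needed to store the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits(num_params: float, precision: str) -> bool:
    """True if the weights alone fit in the combined GPU memory."""
    return weight_memory_gb(num_params, precision) <= TOTAL_GB


if __name__ == "__main__":
    # Model sizes spanning the range discussed above: 0.1 to 100 trillion.
    for params in (0.1e12, 1e12, 10e12, 100e12):
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, precision)
            verdict = "fits" if fits(params, precision) else "does not fit"
            print(f"{params / 1e12:g}T params @ {precision}: "
                  f"{gb:,.0f} GB, {verdict} in {TOTAL_GB} GB")
```

Under these assumptions, a 0.1 trillion parameter model needs about 200 GB at fp16 (just over the 192 GB available) but fits once quantized to 8 or 4 bits, while a 100 trillion parameter model needs tens of terabytes even at 4 bits. Offloading weights to CPU memory or disk would change the picture at the cost of speed, and this sketch does not model it.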