Note that in this post, the scheme we’re proposing doesn’t necessarily involve disconnecting the inference data center from the internet, just limiting total uploads. (Though some sort of isolation like I describe in this comment could also be worthwhile, but perhaps more costly.)
However, we are proposing that the model weights don’t end up on general-purpose employee workstations that aren’t in a specific inference data center. I think this security measure should be pretty workable and is very likely to happen by default.
I guess I was thinking that the number of bytes going from the data center to a general-purpose employee workstation could be large. For example, maybe the employee wants to run automated tests that involve lots of LLM queries and then analyze all the output data in a Jupyter notebook. Or maybe the employee wants to download the server logs and analyze them in a Jupyter notebook. Or whatever. Sorry if I’m saying something stupid, feel free to ignore.
Yep, employee usage could be a high fraction of all output data. I think by default it’s probably still comparable to the size of the model weights (or smaller) for the same reasons noted elsewhere. (Like, there doesn’t seem to be a specific reason to think employee usage will be an exception to the arguments I make elsewhere, so the total data volume seems probably manageable given some precautions around uploading massive server logs or similar.)
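To get a feel for why total output might be comparable to the weights, here’s a back-of-envelope sketch. All numbers (model size, bytes per token) are illustrative assumptions, not figures from the post:

```python
# Back-of-envelope: how much text output would it take to rival the
# model weights in size? (Illustrative numbers only.)

params = 1e12                 # assume a 1T-parameter model
bytes_per_param = 2           # fp16/bf16 weights
weight_bytes = params * bytes_per_param      # 2e12 bytes (~2 TB)

bytes_per_token = 4           # rough average for generated text
tokens_to_match = weight_bytes / bytes_per_token   # 5e11 tokens

print(f"weights: {weight_bytes / 1e12:.0f} TB")
print(f"tokens of output needed to match weight size: {tokens_to_match:.1e}")
```

On these assumptions, employee usage would need to pull out hundreds of billions of tokens before the raw output volume even matched the weights, which is why bulk items like full server logs are the main thing to watch.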
If employee usage is a high fraction of overall usage, we might be able to further restrict uploads by co-locating a bunch of employee usage in an adjacent data center and applying safeguards similar to the ones I discuss here to that co-located data center. Of course, this doesn’t help rule out exfiltration to the employee-usage data center, but we might prefer that exfiltration over arbitrary exfiltration (e.g., because we control the GPUs in the employee data center, and we could possibly limit the number of GPUs or the interconnect in that co-located data center depending on what employees need).
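The "limit total uploads" idea, with a looser budget for the co-located employee data center than for the open internet, could be sketched roughly like this. The class and parameter names are hypothetical, a toy illustration of the policy rather than any real system:

```python
class UploadBudget:
    """Toy sketch: cap total bytes leaving the inference data center,
    with a separate (larger) budget for the co-located employee DC."""

    def __init__(self, external_limit: int, colocated_limit: int):
        self.limits = {"external": external_limit, "colocated": colocated_limit}
        self.used = {"external": 0, "colocated": 0}

    def request(self, destination: str, nbytes: int) -> bool:
        """Approve and record the transfer only if it fits the budget."""
        if self.used[destination] + nbytes > self.limits[destination]:
            return False
        self.used[destination] += nbytes
        return True


# 1 GB may leave to the internet; 1 TB may go to the employee DC.
budget = UploadBudget(external_limit=10**9, colocated_limit=10**12)
assert budget.request("colocated", 10**11)      # bulk employee analysis: fits
assert not budget.request("external", 10**10)   # 10 GB exceeds the external cap
```

The point of the asymmetric budgets is exactly the trade-off above: exfiltration to the employee data center is still possible in principle, but it lands on hardware we control.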