I think limiting attack surface using various types of isolation seems promising, but orthogonal to the intervention we’re describing in this post.
The issue is that for a model to work it needs a lot of I/O bandwidth. You’re thinking: OK, if all it needs to do is emit output tokens to a user window, and those tokens are small in total size over time relative to the model weights, then you can restrict uploads.
But consider how many parallel sessions there are now (millions per month) or will be in the future. Leaks could be coordinated, or randomized across parts of the weights, so that each parallel session leaks a little more.
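As a rough illustration (mine, not from the comment above): if each session leaks a fixed-size chunk, coordinated assignment covers the weights in about one session per chunk, while picking chunks at random costs roughly an extra ln(N) factor (coupon collector). All numbers here are the ones assumed later in the thread.

```python
import math

# Illustrative numbers only, matching the BOTEC below.
model_bytes = 0.64e12       # ~0.64 TB of weights
leak_per_session = 128      # bytes exfiltrated per session

n_chunks = model_bytes / leak_per_session

# Coordinated leak: session i exfiltrates chunk i, so full coverage
# takes about n_chunks sessions.
coordinated_sessions = n_chunks

# Randomized leak: each session picks a chunk uniformly at random;
# expected sessions to cover every chunk is ~ n ln n (coupon collector).
randomized_sessions = n_chunks * math.log(n_chunks)

print(f"chunks to cover:             {n_chunks:.2e}")
print(f"coordinated sessions needed: {coordinated_sessions:.2e}")
print(f"randomized sessions needed:  {randomized_sessions:.2e}")
```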
Everyone is using a common golden model as well, and market dynamics push toward a future with just a few models that everyone uses.
Suppose there are 10 million user sessions a month and a leak of 128 bytes per session. The model is 0.64 terabytes (GPT-4, assuming it’s distilled 10x for the T model). Then it takes about 550 months to leak.
If there are 1 billion sessions a month, it leaks in about 5.5 months.
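A quick script reproducing the BOTEC above. The 0.64 TB and 128 bytes/session figures are taken from the comment; reading 0.64 TB as binary tebibytes gives the ~550-month figure, while decimal terabytes would give ~500 months.

```python
# Leak-time BOTEC: months until the full weights are exfiltrated.
model_bytes = 0.64 * 2**40     # ~0.64 TiB; use 0.64e12 for decimal TB (~500 months)
leak_per_session = 128         # bytes leaked per session

for sessions_per_month in (10e6, 1e9):
    bytes_per_month = leak_per_session * sessions_per_month
    months_to_leak = model_bytes / bytes_per_month
    print(f"{sessions_per_month:,.0f} sessions/month -> "
          f"{months_to_leak:.1f} months to leak the full weights")
```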
TLDR: we actually think it’s reasonably likely that the total data outflow is of comparable scale to model weights (for an AI lab’s most capable model) under normal commercial operation.
That these are of comparable scale is a core assumption for the upload-limiting method to make sense (absent the introduction of some more exotic approaches).
Also, I added a BOTEC about viability for GPT-6, showing that naively it seems like we could be fine even with GPT-6 generating an entire book for every human on Earth.
It’s better than nothing.
See this section of the post for commentary.