Another potential option (that could stack with text compression) is to filter outgoing data. In particular, if we only allow outgoing data to be language tokens, and we filter the outgoing data to have low perplexity as measured by a small, secret language model, then it could become very difficult to encode terabytes of model weights in a way that has low perplexity.
I discuss this idea more here.
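For concreteness, here is a minimal sketch of what such a filter might look like, assuming a HuggingFace causal LM is used as the scorer. "gpt2" is only a stand-in for the small, secret model the proposal calls for, and the threshold is an arbitrary placeholder that would need calibration on real traffic.

```python
# Minimal sketch of the proposed egress filter: score each outgoing message
# with a small language model and block anything whose perplexity is too high
# to plausibly be ordinary text. "gpt2" stands in for the small, secret model;
# the threshold of 100 is a placeholder, not a recommended value.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the filter model (exp of mean token loss)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

def allow_outgoing(text: str, threshold: float = 100.0) -> bool:
    """Pass a message only if it looks like plausible natural language."""
    return perplexity(text) <= threshold

# Natural text scores far lower than random-looking strings, which is what
# makes it hard to smuggle arbitrary bits out as low-perplexity tokens.
for msg in ["Here is a short summary of the results from today's training run.",
            "qWx7 #rrN0 zk41 bbP0 9tGm"]:
    print(f"{perplexity(msg):10.1f}  {msg!r}")
```

One nice property of thresholding perplexity: any length-n token sequence with perplexity at most k has probability at least k^-n under the filter model, so there are at most k^n such sequences, which caps the covert channel at roughly log2(k) bits per token even for an adversary who knows the filter model exactly.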
I don’t think this stacks with compression: if you compress data, it is no longer low perplexity. Data that has been compressed as well as possible should look random to you (rather than predictable).
I think filtering like this is strictly worse than compression for reasons I discuss in my response here.
(But I appreciate the idea!)
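A quick byte-level illustration of this point: well-compressed output has near-maximal empirical entropy (it looks random), while the raw text it came from is highly predictable. The toy corpus below is made up purely for demonstration.

```python
# Compare the empirical byte entropy of predictable text with that of its
# compressed form: the compressed bytes are close to uniform (near 8 bits per
# byte), i.e. they would not look like low-perplexity language to a filter.
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of the byte distribution, in bits per byte."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

random.seed(0)
words = "the of and to in a is that it for model weights data filter token".split()
text = " ".join(random.choice(words) for _ in range(20000)).encode()
compressed = zlib.compress(text, level=9)

print(f"raw text:   {byte_entropy(text):.2f} bits/byte (predictable, well below 8)")
print(f"compressed: {byte_entropy(compressed):.2f} bits/byte (close to 8, near-random)")
```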
I think it does stack if the perplexity filter is applied before the compression step (sketched below). This relies on the underlying assumption that it’s easier to secure the internet connection choke point than to secure the entire data center.
I will try to continue the discussion here for simplicity.
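For what it’s worth, here is a self-contained sketch of the ordering being claimed, with the perplexity check replaced by a trivial stand-in (looks_like_text) so it runs without the model from the earlier sketch. The point is only that the filter sees the plaintext at the choke point, and compression is applied afterwards to traffic that has already passed.

```python
# Sketch of filter-then-compress at the internet choke point. looks_like_text
# is a crude stand-in for the small-LM perplexity filter sketched earlier,
# kept trivial so this example is runnable on its own.
import zlib

def looks_like_text(message: str) -> bool:
    """Stand-in check: mostly letters, digits, spaces and basic punctuation."""
    if not message:
        return False
    ok = sum(ch.isalnum() or ch.isspace() or ch in ".,;:!?'\"()-" for ch in message)
    return ok / len(message) > 0.95

def egress(message: str) -> bytes:
    # 1. The filter runs on the plaintext, before any transformation.
    if not looks_like_text(message):
        raise PermissionError("blocked: outgoing data does not look like natural language")
    # 2. Compression happens only after the message has been approved.
    return zlib.compress(message.encode("utf-8"))

payload = egress("Results from today's evaluation run are summarized below.")
print(len(payload), "compressed bytes sent")
```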