I’m not sure that would be particularly reassuring to me (writing as one of the contributors). First, how would one check that the agreement had been adhered to (maybe it’s possible, I don’t know)? Second, people in my experience often don’t notice they are training on data (as mentioned in a post above by ozziegooen).
I agree entirely that it would not be very reassuring, for the reasons you explained. But I would still consider it a mildly interesting signal to see if OpenAI would be willing to provide such an agreement in writing, and maybe make a public statement on the precise way they used the data so far.
Also: if they make a legally binding commitment, and evidence later shows up that they violated the terms of this agreement (e.g. via whistleblowers), I do think that this poses a bigger legal risk for them than breaching some fuzzy verbal agreement.
Get that agreement in writing.
I am happy to bet 1:1 that OpenAI will refuse to make a written agreement not to use the problems or the answers for training.
You have done work that contributes to AI capabilities, and you have misled the mathematicians who contributed to that work about its nature.