Well, I’d sure like to know whether you are planning to give the dataset to OpenAI or any other frontier companies! It might influence my opinion of whether this work is net positive or net negative.
I can’t make any confident claims or promises right now, but my best guess is that we will make sure this new benchmark stays entirely private and under Epoch’s control, to the extent this is feasible for us. However, I want to emphasize that by saying this, I’m not making a public commitment on behalf of Epoch.
Was [keeping FrontierMath entirely private and under Epoch’s control] feasible for Epoch in the same sense of “feasible” you are using here?
I’m not completely sure, since I was not personally involved in the relevant negotiations for FrontierMath. However, what I can say is that Tamay already indicated that Epoch should have tried harder to obtain different contract terms that enabled us to have greater transparency. I don’t think it makes sense for him to say that unless he believes it was feasible to have achieved a different outcome.
Also, I want to clarify that this new benchmark is separate from FrontierMath, and we are under different constraints with regard to it.