That used to work, but as of March you can only get the pre-logit_bias logprobs back. They didn’t announce the change, but it’s discussed in the OpenAI forums, e.g. here. I noticed the change when all my code suddenly broke; you can still see remnants of that approach in the code.
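To make "pre-logit_bias logprobs" concrete, here is a toy sketch of the distinction. It does not call the OpenAI API; the three-token vocabulary, the logits, and the bias value are all made up for illustration, and `log_softmax` is just a local helper.

```python
import math

def log_softmax(logits):
    """Convert a {token: logit} dict into {token: logprob}."""
    m = max(logits.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return {tok: v - log_z for tok, v in logits.items()}

# Hypothetical logits for a three-token vocabulary (made-up numbers).
logits = {"yes": 2.0, "no": 1.0, "maybe": 0.0}

# Hypothetical bias, in the style of a logit_bias argument.
logit_bias = {"maybe": 5.0}

# Logprobs computed before the bias is added.
pre_bias = log_softmax(logits)

# Logprobs computed after the bias is added to the logits.
biased = {tok: v + logit_bias.get(tok, 0.0) for tok, v in logits.items()}
post_bias = log_softmax(biased)

for tok in logits:
    print(f"{tok:>5}  pre={pre_bias[tok]:.3f}  post={post_bias[tok]:.3f}")
```

In this toy setup the bias moves "maybe" to the top of the post-bias distribution, while the pre-bias logprobs are the same whatever bias is passed. That is why an approach that depends on seeing the bias reflected in the returned logprobs stops working once only the pre-bias values come back.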
They emailed some people about this: https://x.com/brianryhuang/status/1763438814515843119
The reason is that it may allow unembedding matrix weight stealing: https://arxiv.org/abs/2403.06634
I’m aware of the paper because of the impact it had. I might personally not have chosen to draw their attention to the issue, since the main effect seems to be making some research significantly more difficult, and I haven’t heard of any attempts to deliberately exfiltrate weights that this would be preventing.
On reflection I somewhat endorse pointing the risk out after discovering it, in the spirit of open collaboration, as you did. It was just really frustrating when all my experiments suddenly broke for no apparent reason. But that’s mostly on OpenAI for not announcing the change to their API (other than emails sent to a few people). Apologies for grouching in your direction.