Maybe they are preparing to switch from merely encouraging their main model to do CoT (the old technique) to a full RL-based reasoning model. I recently saw this before the GUI aborted and said the model was over capacity:
In that case, it would no longer make sense to have the non-reasoning model attempt CoT.
I have also seen this.