It’ll be interesting to see whether the process supervision approach that OpenAI are reputedly taking with ‘Strawberry’ will make a big difference to that. It’s a different framing (rewarding good intermediate steps) but seems arguably equivalent.