"No amount of inference compute can make GPT-2 solve AIME."
That’s because GPT-2 isn’t CoT fine-tuned. Plenty of people predict it may be possible to get GPT-4-level performance out of a GPT-2-sized model with chain-of-thought. How confident are you that they’re wrong? (o1-mini is dramatically better than GPT-4 and is likely only 30b-70b parameters.)