Is this actually the case? Not explicitly disagreeing, but I want to point out that there is still a niche community that prefers using the oldest available gpt-4-0314 checkpoint via the API. Incidentally, it is still priced almost the same as 4.5, hardware improvements notwithstanding, and it is pretty much the only remaining way to access a model that presumably uses the full ~1.8 trillion parameters the 4th-gen GPT was reportedly trained with.
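(For anyone who wants to run this comparison themselves: the old checkpoint has to be pinned explicitly by its dated snapshot name, since the bare "gpt-4" alias has pointed at newer snapshots over time. A minimal sketch using the official openai Python client; whether gpt-4-0314 still resolves depends on your account having had prior access to it.)

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Pin the original March 2023 snapshot rather than the "gpt-4" alias,
# which no longer guarantees the full-size original-release model.
response = client.chat.completions.create(
    model="gpt-4-0314",  # oldest publicly dated gpt-4 checkpoint
    messages=[
        {"role": "user", "content": "Write a short poem about scaling laws."}
    ],
)
print(response.choices[0].message.content)
```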
Speaking of conflation, you see it everywhere in papers: most people now entirely conflate gpt-4 with gpt-4 turbo, which replaced the full gpt-4 on ChatGPT very quickly. They forget that there were many complaints at the time that the faster (shrinking) model iterations were losing the “big model smell”, even as they climbed the benchmarks.
And so when lots of people describe 4.5’s advantages over 4o as coming down to “big model smell”, I think it is important to remember that 4-turbo and later 4o were clearly optimized for speed, price, and benchmarks far more than the original-release gpt-4 was. Comparisons on taste/aesthetics/intangibles may be more fitting against the original, non-Goodharted, full-scale gpt-4 model. At the very least, it should fully and properly represent what a clean ~10x less training compute than 4.5 looks like.