Unclear if going beyond GPT-5 will be crucial; at that point researchers might become more relevant than compute again. GPT-4 level models (especially the newer ones) can understand complicated non-specialized text (now I can be certain some of my more obscure comments are Objectively Understandable), so GPT-5 level models will understand very robustly. If that understanding provides sufficient signal to get RL-like methods off the ground (automating most labeling with superhuman quality, usefully scaling post-training to the level of pre-training), more scale won’t necessarily help on the currently-somewhat-routine pre-training side.