My impression is that, for all of these proposals, however many resources you've already put into training, putting more resources into training will continue to improve performance.
I think this is incorrect. Most training setups eventually flatline, or close to it (e.g. see AlphaZero's Elo curve), and need algorithmic or other improvements to do better.
For individual ML models, sure, but not for classes of similar models. E.g. GPT-3 was presumably more expensive to train than GPT-2, as part of the cost of getting better results. For each of the proposals in the OP, training costs constrain how complex a model you can train, which in turn affects performance.
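To make the distinction concrete, here is a toy numerical sketch (illustrative numbers only, not fit to any real training curves): a single fixed-size model saturates as compute grows, while a family of models, where each compute budget trains the largest model it can afford, keeps improving roughly as a power law.

```python
import numpy as np

# Toy illustration of the distinction above. All constants below are
# made up for the example; they are not fit to AlphaZero, GPT-2, GPT-3,
# or any other real system.

compute = np.logspace(0, 6, 7)  # arbitrary units of training compute

# A single model of fixed size: performance approaches a ceiling set by
# the model's capacity, so extra compute eventually stops helping.
capacity_ceiling = 70.0  # hypothetical maximum score for this model size
fixed_model = capacity_ceiling * (1 - np.exp(-compute / 1e3))

# A class of similar models: at each compute budget, train the largest
# model that budget allows; performance keeps improving (here modeled
# with an illustrative power law, no hard ceiling in this range).
model_family = 100.0 - 50.0 * compute ** -0.1

for c, fixed, family in zip(compute, fixed_model, model_family):
    print(f"compute={c:>9.0f}  fixed-size model={fixed:6.1f}  scaled-up family={family:6.1f}")
```

The fixed-size column flatlines around its ceiling, while the scaled-up column keeps climbing, which is the sense in which training cost constrains how complex a model you can train and therefore how well the overall approach can perform.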