Thanks for this comment! I think it raises a very fair point. I do expect algorithmic improvements to make a significant difference: even if AGIs start off thinking more slowly than humans, algorithmic improvements would then allow us to train a smaller model that reaches the same level of performance while running much faster. (This paper isn't quite measuring the same thing, but "efficiency doubling every 16 months" seems like a reasonable baseline to me.) Techniques like model distillation would also help with that.
However, I think the claim as originally written was probably misleading, so I’ve rephrased it to focus on algorithmic improvements.