We’re starting to have enough experience with the size of improvements produced by fine-tuning, scaffolding, prompting techniques, RAG, advances etc to be able to guesstimate the plausible size of further improvements (and amount of effort involved), so that we can try to leave some appropriate safety margin for it. That doesn’t rule out the possibility of something out-of-distribution coming along, but it does at least reduce it.
We’re starting to have enough experience with the size of improvements produced by fine-tuning, scaffolding, prompting techniques, RAG, advances etc to be able to guesstimate the plausible size of further improvements (and amount of effort involved), so that we can try to leave some appropriate safety margin for it. That doesn’t rule out the possibility of something out-of-distribution coming along, but it does at least reduce it.