Here we’re saying that continual fine-tuning might not resolve the causal confusion within the model; instead, it helps the model learn the (new) spurious correlations so that it still performs well on the test data. This assumes that continual fine-tuning uses a similar ERM-based method (e.g. the same pretraining objective, but on the new data distribution). In hindsight, we probably should have written “continual training” rather than specifically “continual fine-tuning”. If you could continually train online in the deployment environment, that would be better. Whether it’s enough is closely tied to whether online training is enough, which is one of the key open questions we mention.
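To make the point about ERM-based fine-tuning concrete, here is a minimal toy sketch. It is an illustration only, not an experiment from the post: the two-feature setup, the names `make_data`, `erm_train`, and `accuracy`, and all hyperparameters are assumptions chosen for the example. A logistic regression is trained by plain ERM on data where a spurious feature agrees with the label, then fine-tuned with the same objective on a shifted distribution where that feature's sign is flipped.

```python
import math
import random


def make_data(n, spurious_sign, rng):
    """Toy data (assumed setup): x_c is causal and stable across environments;
    x_s is spurious, and its relation to the label depends on the environment."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        label = 2 * y - 1  # map {0, 1} to {-1, +1}
        x_c = label + rng.gauss(0.0, 1.0)                  # noisy causal feature
        x_s = spurious_sign * label + rng.gauss(0.0, 0.5)  # cleaner spurious feature
        data.append((x_c, x_s, y))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def erm_train(data, w, steps=200, lr=0.5):
    """Full-batch gradient descent on the average logistic loss (plain ERM)."""
    w_c, w_s = w
    n = len(data)
    for _ in range(steps):
        g_c = g_s = 0.0
        for x_c, x_s, y in data:
            err = sigmoid(w_c * x_c + w_s * x_s) - y
            g_c += err * x_c
            g_s += err * x_s
        w_c -= lr * g_c / n
        w_s -= lr * g_s / n
    return (w_c, w_s)


def accuracy(data, w):
    w_c, w_s = w
    hits = sum((w_c * x_c + w_s * x_s > 0) == (y == 1) for x_c, x_s, y in data)
    return hits / len(data)


rng = random.Random(0)
train_env = make_data(500, +1, rng)    # original training distribution
deploy_env = make_data(500, -1, rng)   # shifted distribution: spurious sign flipped

w_pre = erm_train(train_env, (0.0, 0.0))   # "pretraining" with ERM
acc_before = accuracy(deploy_env, w_pre)   # evaluated after the shift
w_ft = erm_train(deploy_env, w_pre)        # continual fine-tuning, same objective
acc_after = accuracy(deploy_env, w_ft)

print("weights (causal, spurious) before fine-tuning:", w_pre)
print("weights (causal, spurious) after fine-tuning: ", w_ft)
print("accuracy on shifted data before/after:", acc_before, acc_after)
```

In this sketch, fine-tuning recovers accuracy on the shifted data, but it does so by flipping the sign of the weight on the spurious feature rather than driving that weight towards zero. The model ends up relying on the new spurious correlation, which is the behaviour described above.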