I have sometimes heard it claimed that as a consequence of this result, as we move to doing machine learning with ever larger datasets and ever bigger models, the impact of our training processes’ inductive biases will become negligible.
I’m confused by this. As I see it, the consequence of that result is that as you move to larger datasets, holding model size fixed, the impact of inductive biases will decrease. This is consistent with double descent: as the dataset grows larger and larger, you move into the underparameterized regime, which follows the normal story of more data = better.
As you increase the size of your models, holding dataset size fixed, the impact of inductive biases should increase: you need more information to pick out the right model from the larger space of models, but the data provides exactly the same amount of information. And that’s consistent with what happens in the overparameterized regime.
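To make the two regimes concrete, here’s a minimal sketch (my own illustration, not from the result being discussed): ridgeless random-feature regression on a fixed synthetic dataset, sweeping the number of random features (model size) past the interpolation threshold. The specific setup (a sine target, random ReLU features, the minimum-norm solution as a stand-in for what gradient descent’s inductive bias picks out) is an assumption made purely for illustration.

```python
# Sketch: double descent in random-feature regression with the dataset held fixed.
# Assumed setup (illustrative only): sine target, random ReLU features,
# minimum-norm least squares as the trained readout.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, noise=0.1):
    x = rng.uniform(-1, 1, size=(n, 1))
    y = np.sin(2 * np.pi * x[:, 0]) + noise * rng.standard_normal(n)
    return x, y

def random_features(x, W, b):
    # Fixed random ReLU projection; only the linear readout is fit.
    return np.maximum(x @ W + b, 0.0)

def fit_min_norm(Phi, y):
    # Minimum-norm least squares: in the overparameterized regime this is the
    # interpolating solution, i.e. the one the implicit bias selects.
    return np.linalg.pinv(Phi) @ y

n_train, n_test = 40, 1000
x_tr, y_tr = make_data(n_train)
x_te, y_te = make_data(n_test, noise=0.0)

for p in [5, 10, 20, 40, 80, 200, 1000]:  # model size sweep, dataset fixed
    W = rng.standard_normal((1, p))
    b = rng.uniform(-1, 1, size=p)
    Phi_tr = random_features(x_tr, W, b)
    Phi_te = random_features(x_te, W, b)
    w = fit_min_norm(Phi_tr, y_tr)
    test_mse = np.mean((Phi_te @ w - y_te) ** 2)
    print(f"features={p:5d}  test MSE={test_mse:.3f}")
```

In runs like this, test error typically spikes near features ≈ number of training points and then falls again as the model grows, with the minimum-norm bias, not the data, deciding which of the many interpolating fits you end up with; exact numbers depend on the seed and setup.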