Actually, in high-dimensional social science problems it is rather common to have parametrized models with hundreds or more parameters, especially when one has hundreds of thousands or more data points. For example, one often uses “fixed effects” for individual times or spatial regions, and matrices of interaction terms between basic effects. Folks definitely use parametric statistical methods for such things.
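(Aside: a minimal sketch, with purely hypothetical data and variable names, of the kind of many-parameter but still parametric model described above: a regression with fixed effects for each region and each year plus a matrix of region-by-year interactions, fit here with statsmodels as a convenient stand-in.)

```python
# Hypothetical illustration: a fully parametric regression whose fixed effects
# and interaction terms push the parameter count into the hundreds.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 20_000
df = pd.DataFrame({
    "region": rng.integers(0, 30, n),   # 30 spatial fixed effects
    "year":   rng.integers(0, 12, n),   # 12 time fixed effects
    "x":      rng.normal(size=n),
})
df["y"] = 0.5 * df["x"] + 0.1 * df["region"] + 0.05 * df["year"] + rng.normal(size=n)

# One dummy per region and per year, plus a region-by-year interaction matrix:
# roughly 30 + 12 + 30*12 coefficients, i.e. several hundred parameters,
# yet the functional form is fixed in advance -- the model is parametric.
fit = smf.ols("y ~ x + C(region) + C(year) + C(region):C(year)", data=df).fit()
print(len(fit.params))   # several hundred estimated coefficients
```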
Sure, lots of locally parametric statistics isn’t the same thing as having so many global parameters as to make few assumptions about the shape of the curve. Still, I think this is where we both nod and agree that there’s no absolute border between “parametric” and “nonparametric”?
Well, there are clearly many ways to define that distinction. But regarding the costs of communicating and checking, the issue is whether one communicates the model, or the data set plus a metric. Academics usually prefer to communicate a model, and I’m guessing that, given their purposes, this is probably usually best.
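(Aside: a hedged sketch of the “communicate the model vs. the data set plus a metric” contrast, using made-up data and scikit-learn estimators as stand-ins.)

```python
# Hypothetical illustration of what gets communicated in each case.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(5_000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=5_000)

# Parametric: the thing you hand to a colleague is a few numbers.
lin = LinearRegression().fit(X, y)
print(lin.coef_, lin.intercept_)            # 4 numbers summarize the whole fit

# Nonparametric (in this sense): what you hand over is the data set plus a
# metric; prediction is a weighted lookup over the stored points.
knn = KNeighborsRegressor(n_neighbors=25, metric="euclidean").fit(X, y)
print(knn.predict(X[:3]))                   # answering requires all 5,000 stored rows
```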
Sure. Though I note that if you’re already communicating a regional map with thousands of locally fit parameters, you’re already sending a file, and at that point it’s pretty much as easy to send 10MB as 10KB these days. But there are all sorts of other reasons why parametric models are more useful, for things like rendering causal predictions, relating results to other knowledge, and so on. I’m not objecting to that per se, although in some cases it provides a motive to oversimplify and draw lines through graphs that don’t look like lines...
...but I’m not sure that’s relevant to the original point. From my perspective, the key question is to what degree a statistical method assumes that the underlying generator is simple, versus not imposing many assumptions of its own about the shape of the curve.