Aren’t true theories defined by how useful they are in some application?
Perhaps surprisingly, statistics has an answer, and that answer is no. If, in your application, the usefulness of a statistical model is equivalent to its predictive performance, then choose your model using cross-validation, which directly optimizes for predictive performance. When that gets too expensive, use the AIC, which becomes equivalent to leave-one-out cross-validation as the amount of data grows without bound. But even when the true model is among the candidates, neither AIC nor cross-validation is guaranteed to pick it out as the amount of data grows without bound.
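For concreteness, here is a minimal sketch of the two criteria side by side. The polynomial-regression setup, the Gaussian AIC formula, and all constants are my illustrative assumptions, not something from the thread; the point is just that both criteria score predictive fit, not truth.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-2.0, 2.0, n)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, n)  # data truly comes from a degree-1 model

folds = np.array_split(rng.permutation(n), 5)  # one fixed 5-fold split, shared by all candidates

def sse(deg, x_tr, y_tr, x_te, y_te):
    """Sum of squared errors of a degree-`deg` polynomial fit."""
    coeffs = np.polyfit(x_tr, y_tr, deg)
    return float(np.sum((np.polyval(coeffs, x_te) - y_te) ** 2))

def aic(deg):
    # Gaussian AIC up to an additive constant: n*log(RSS/n) + 2*(number of parameters)
    return n * np.log(sse(deg, x, y, x, y) / n) + 2 * (deg + 1)

def cv_mse(deg):
    # Mean held-out squared error across the five folds
    return np.mean([sse(deg, np.delete(x, te), np.delete(y, te), x[te], y[te]) / len(te)
                    for te in folds])

degrees = range(1, 8)
print("AIC picks degree", min(degrees, key=aic))
print("5-fold CV picks degree", min(degrees, key=cv_mse))
```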
define: A theory’s “truthfulness” as how much probability mass it holds after an appropriate choice of prior and repeated application of Bayes’ theorem. This works as a good measure of a theory’s “usefulness” so long as resource limitations and psychological side effects aren’t important.
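A tiny illustration of “truthfulness” as posterior probability mass; the two candidate theories, the uniform prior, and the likelihoods are made-up numbers chosen purely to show the mechanics:

```python
prior = {"theory_A": 0.5, "theory_B": 0.5}
likelihood = {"theory_A": 0.8, "theory_B": 0.2}  # P(observed data | theory)

# One application of Bayes' theorem: posterior ∝ prior * likelihood
unnormalized = {t: prior[t] * likelihood[t] for t in prior}
z = sum(unnormalized.values())
truthfulness = {t: u / z for t, u in unnormalized.items()}
print(truthfulness)  # {'theory_A': 0.8, 'theory_B': 0.2}
```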
define: A theory’s “usefulness” as a function of the resources needed to calculate its predictions to a given degree of accuracy, the “truthfulness” of the theory itself, and its side effects. Squinting at it, I get something roughly like:

usefulness(truthfulness, resources, side effects) = truthfulness * accuracy(resources) + messiness(side effects)
So I define “usefulness” as a function and “truthfulness” as its limiting value as side effects go to 0 and resources go to infinity. Notice how I shaped the definition of “usefulness” to avoid any mention of context-specific utilities; I purposely avoided making it domain-specific or tying it to what the theory is trying to predict. I did this to maintain generality.
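A toy instantiation of that definition. The functional forms of accuracy() and messiness() here are my assumptions; the limiting claim above only pins down that accuracy approaches 1 as resources grow and that messiness vanishes as side effects go to 0:

```python
import math

def accuracy(resources):
    return 1.0 - math.exp(-resources)      # climbs toward 1 as resources grow

def messiness(side_effects):
    return -abs(side_effects)              # non-positive: side effects only hurt

def usefulness(truthfulness, resources, side_effects):
    return truthfulness * accuracy(resources) + messiness(side_effects)

# In the limit (side effects -> 0, resources -> infinity),
# usefulness collapses to truthfulness:
print(usefulness(0.9, resources=50.0, side_effects=0.0))  # ~0.9
```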
(Note: for now I’m glossing over the issue of how to abstract over concrete hypotheses and how to integrate the properties of that abstraction with these definitions.)
Your definition of usefulness fails to include the utility of the predictions made, which is the most important factor. A theory is useful if there is a chain of inference from it to a concrete application, and its degree of usefulness depends on the utility of that application, on whether the application could have been reached without the theory, and on the resources required to follow that chain of inference. Measuring usefulness requires entangling theories with applications and decisions, whereas measuring truthfulness does not. Consequently, it’s incorrect to treat truthfulness as a special case of usefulness or vice versa.
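One way to formalize this comment (my reading, not the commenter’s own formula): the usefulness a theory derives from one application is the utility gained over the best theory-free alternative, minus the cost of following the inference chain.

```python
def usefulness_from_application(utility, counterfactual_utility, inference_cost):
    # Utility gained over the best route that skips the theory,
    # net of the cost of the chain of inference.
    return (utility - counterfactual_utility) - inference_cost

# A theory that cheaply unlocks a valuable application:
print(usefulness_from_application(10.0, 2.0, 1.0))   # 7.0
# The same application, reachable almost as well without the theory:
print(usefulness_from_application(10.0, 9.5, 1.0))   # -0.5
```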
Thank you—that’s an excellent summary.
From pwno: “Aren’t true theories defined by how useful they are in some application?”
My definition of “usefulness” was built with the express purpose of relating the truth of theories to how useful they are, and it is very much a context-specific, temporary definition (hence “define:”). If I had tried to deal with usefulness directly, I would have ended up with something uselessly messy and incomplete, or I could have used a true but uninformative expectation-based approach and hidden all of the complexity. Instead, I experimented and tried to force the two concepts to unify in some way. To do so I stretched the definition of usefulness pretty much to the breaking point and omitted any direct relation to utility functions. I found it a useful thought to think, and I hope you do as well, even if you take issue with my use of the name “usefulness”.