I strongly disagree. “Fitting” data is not a theory-neutral process. As khafra points out, if you just have two time series, you can run a linear regression to see whether they seem correlated, and make predictions based on that. But for this to work, you need lots of assumptions about the world (one might even call them a “theory”). For how this can go wrong, see the pirate theory of global warming.
Conversely, “first principles” as they exist in reality are usually grounded in experiment. This is most glaring in the case of climate models. What does their code implement? Conservation of mass? Experimental result. Heat transfer? Experimental result. Cloud formation? Experiment. Optical properties of gases? Experiment. Solar spectrum? Experiment. Black-body radiation? Experiment. Earth’s geography? Experiment. Seasonal cycles? Experiment. This is all data! Using this data is just as much “just looking at the data” as linear regression.
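To make the regression point concrete, here is a minimal sketch in Python. The two series are entirely made up (they are not real pirate or temperature figures), and `fit_line` and `correlation` are hypothetical helper names. All the series share is a trend over time, yet the fit looks excellent and will happily "predict" one from the other.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(50)

# Synthetic, illustrative data only: one series falls over time, the
# other rises. Neither is generated from the other.
series_a = [100.0 - 1.5 * t + rng.gauss(0, 3) for t in years]
series_b = [14.0 + 0.02 * t + rng.gauss(0, 0.05) for t in years]

slope, intercept = fit_line(series_a, series_b)
r = correlation(series_a, series_b)
print(f"slope={slope:.4f} intercept={intercept:.2f} r={r:.3f}")

# Using the fit as a causal model: what series_b "would be" if we
# pushed series_a back up to 100. The number is only meaningful if
# series_a actually drives series_b, which the data cannot tell us.
print(f"'predicted' series_b at series_a=100: {slope * 100.0 + intercept:.2f}")
```

The strong negative correlation comes out of the shared time trend alone. Reading the slope as "more of A causes less of B" is an assumption about the world that the regression itself never checks.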
Point taken, and I agree. I’ll try to better formulate what I meant:
Some theories are developed using data about the system you want to study. E.g., past climate data.
And some theories are developed using data about other systems: either similar but causally unrelated ones (e.g., the greenhouse effect in an actual greenhouse), or models so simplified that there’s a serious worry they may be too simplified to apply to the original system (e.g., black-body radiation). Theories of this second kind also have the advantage that, if they work on the system you want to study, they let you explain it in terms of other things you already understand.
On an abstract Bayesian level, they’re all the same; we don’t compartmentalize data about past climate from data about the optical properties of gases. But for humans who work in different fields, the difference matters.
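As a toy illustration of the "no compartments" point, here is a sketch of a conjugate normal update in Python. Everything in it is an assumption for the sake of the example: the numbers are invented, `update` and `update_all` are hypothetical helpers, and the model simply takes for granted that both kinds of measurement bear on the same unknown quantity with known noise.

```python
import math

def update(prior_mean, prior_var, obs, obs_var):
    """Normal prior, normal likelihood with known variance: one observation."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var

def update_all(prior, observations):
    """Fold a list of (value, noise_variance) observations into the prior."""
    mean, var = prior
    for obs, obs_var in observations:
        mean, var = update(mean, var, obs, obs_var)
    return mean, var

prior = (0.0, 100.0)  # vague prior: mean 0, variance 100

# Invented numbers. "field" stands in for data about the system itself,
# "lab" for data about another system assumed to share the parameter.
field_data = [(3.4, 1.0), (2.6, 1.0)]
lab_data = [(3.1, 0.25)]

field_first = update_all(prior, field_data + lab_data)
lab_first = update_all(prior, lab_data + field_data)

print(field_first, lab_first)
# The posterior does not care which "kind" of data arrived first.
same = (math.isclose(field_first[0], lab_first[0])
        and math.isclose(field_first[1], lab_first[1]))
print(same)
```

The update treats every observation the same way; the only place the two kinds of data differ is in the noise variance we assign them. Whether the lab measurement really bears on the same quantity is exactly the judgment call that stays with the human modeler.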