I have this vague idea that sometime in our past, people thought that knowledge was like an almanac: a repository of zillions of tiny true facts that summed up to being able to predict stuff about stuff, but without a general understanding of how things work. There was no general understanding because any heuristic that would begin to explain how things work would immediately be discounted by the single tiny fact, easily found, that contradicted it. Concern with details, minutiae, and complexity is actually anti-science for this reason. It’s not that details and complexity aren’t important, but you make no progress if you consider them from the beginning.
And then I wondered: is this knee-jerk reaction to dismiss any challenge of the keep-it-simple conventional wisdom the reason why we’re not making more progress in complex fields like biology?
For classical physics it has been the case that the simpler the hypothetical model you verify, the more you cash out in terms of understanding physics. The simpler the hypothesis you test, the easier it is to establish if the hypothesis is true and the more you learn about physics if it is true. However, what considering and verifying simpler and simpler hypotheses actually does is transfer the difficulty of understanding the real-world problem to the experimental set-up. To verify your super-simple hypothesis, you need to eliminate confounding factors from the experiment. Success in classical physics has occurred because when experiments were done, confounding factors could be eliminated through a well-designed set-up or were small enough to neglect. (Consider Galileo’s purported experiment of dropping two objects from a height – in real life that particular experiment doesn’t work because the lighter object may fall more slowly.)
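The parenthetical about Galileo can be made concrete with a rough numerical sketch. Everything specific here is an assumption chosen for illustration, not something from the post: two spheres of identical size, masses of 5 kg and 0.5 kg, a 55 m drop, quadratic air drag with a textbook drag coefficient for a sphere, and a plain fixed-step integration.

```python
import math

G = 9.81            # gravitational acceleration, m/s^2
RHO_AIR = 1.225     # approximate air density at sea level, kg/m^3
DRAG_COEFF = 0.47   # typical textbook drag coefficient for a smooth sphere


def fall_time(mass_kg, radius_m, height_m, dt=1e-4):
    """Time for a sphere to fall height_m through air, by simple time stepping."""
    area = math.pi * radius_m ** 2
    k = 0.5 * RHO_AIR * DRAG_COEFF * area  # drag force is modeled as k * v^2
    t = v = y = 0.0
    while y < height_m:
        a = G - (k / mass_kg) * v * v  # gravity minus drag deceleration
        v += a * dt
        y += v * dt
        t += dt
    return t


HEIGHT = 55.0  # hypothetical drop height in metres
t_vacuum = math.sqrt(2 * HEIGHT / G)     # the idealized, drag-free answer
t_heavy = fall_time(5.0, 0.05, HEIGHT)   # hypothetical 5 kg sphere
t_light = fall_time(0.5, 0.05, HEIGHT)   # hypothetical 0.5 kg sphere, same size
print(f"no air: {t_vacuum:.2f} s, heavy: {t_heavy:.2f} s, light: {t_light:.2f} s")
```

Under these assumptions the lighter sphere lands later than the heavier one, and both land later than the drag-free prediction. The simple hypothesis only checks out once the confounding factor, air resistance, is removed or made negligible.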
In complex fields this type of modeling via simplification doesn’t seem to cash out as well, because it’s more difficult to control the experimental set-up and the confounding effects aren’t negligible. So while I’ve always believed that models need to be simple, I would consider a different paradigm if it could work. How could understanding the world work any other way than through simple models?
Some trends in biological methods: high throughput, random searches, brute force, etc.
I must disagree with the premise that biology is not making progress while physics is. As far as I can tell biology is making progress many orders of magnitude larger and more practically significant than physics at the moment.
And it requires this messy, complex paradigm of accumulating plenty of data and mining it for complicated regularities. Even the closest things biology has to “physical laws”, like the Central Dogma or how DNA sequences translate to protein sequences, each have enough exceptions and footnotes to fill a small book.
The world isn’t simple. Simple models are usually very wrong. Exceptions to this pattern like basic physics are extremely unusual, and shouldn’t be taken as a paradigm for all science.
The catch is that complex models are also usually very wrong. Most possible models of reality are wrong, because there is an infinite legion of models and only one reality. And if you try too hard to create a perfectly nuanced and detailed model, because you fear your bias in favor of simple mathematical models, there’s a risk. You can fall prey to the opposing bias: the temptation to add an epicycle to your model instead of rethinking your premises. As one of the wiser teachers of one of my wiser teachers said, you can always come up with a function that fits 100 data points perfectly… if you use a 99th-order polynomial.
Naturally, this does not mean that the data are accurately described by a 99th-order polynomial, or that the polynomial has any predictive power worth giving a second glance. Tacking on more complexity and free parameters doesn’t guarantee a good theory any more than abstracting them out does.
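A scaled-down sketch of the polynomial point, using 10 made-up data points and a 9th-order polynomial instead of 100 and 99 so that exact rational arithmetic stays cheap. The data (a straight line plus alternating noise) and the helper names `interpolate` and `fit_line` are invented for illustration.

```python
from fractions import Fraction


def interpolate(xs, ys, x):
    """Value at x of the unique degree-(n-1) polynomial through n points (Lagrange form)."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data: the trend y = 2x with alternating +1 / -1 "noise".
xs = list(range(10))
ys = [2 * x + (1 if x % 2 == 0 else -1) for x in xs]

# The 9th-order polynomial reproduces every observed point exactly...
exact_fit = all(interpolate(xs, ys, x) == y for x, y in zip(xs, ys))

# ...but compare its prediction one step past the data with the two-parameter line.
poly_next = interpolate(xs, ys, 10)
slope, intercept = fit_line(xs, ys)
line_next = slope * 10 + intercept
print(exact_fit, float(poly_next), float(line_next))
```

The polynomial has zero error on the observed points, yet its prediction for the next point lands nowhere near the trend. The line misses every observed point slightly and still predicts the next one sensibly.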
I actually entirely agree with you. Biology is making terrific progress, and shouldn’t be overly compared with physics. Two supporting comments:
First, when biology is judged as nascent, this may be because it is being overly compared with physics. Success in physics meant finding and describing the most fundamental relationship between variables analytically, but this doesn’t seem to be what the answers look like in biology. (As Simon Jester wrote here, describing the low-level rules is just the beginning, not the end.) And the relatively simple big ideas, like the theory of evolution and the genetic code, are still often judged as inferior in some way as scientific principles. Perhaps because they’re not so closely identified with mathematical equations.
Second, the scientific culture that measures progress in biology using the physics paradigm may still be slowing down our progress. While we are making good progress, I also feel a resistance: the reality of biology doesn’t seem to be responding well to the scientific epistemology we are throwing at it. But I’m still open-minded: maybe our epistemology needs to be updated, or maybe our epistemology is fine and we just need to keep forging on.
Rather than describing the difference between physics and biology as “simple models” vs. “complex models”, describe them in terms of expected information content.
Physicists generally expect an eventual Grand Unified Theory to be small in information content (one or a few pages of very dense differential equations, maybe as small as this: http://www.cs.ru.nl/~freek/sm/sm4.gif ). On the order of kilobytes, plus maybe some free parameters.
Biologists generally expect an eventual understanding of a species to be much much bigger. At the very least, the compressed human genome alone is almost a gigabyte; a theory describing how it works would be (conservatively) of the same order of magnitude.
All things being equal, would biologists prefer a yottabyte-sized theory to a zettabyte-sized theory? No, absolutely not! The scientific preference is still MOSTLY in the direction of simplicity.
There are a lot of sizes out there, and the fact that gigabyte-sized theories seem likely to defeat kilobyte-sized theories in the biological domain shouldn’t be construed as a violation of the general “prefer simplicity” rule.
The uncompressed human genome is about 750 megabytes.
Thanks, and I apologize for the error.
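For anyone who wants to check that figure, here is a back-of-the-envelope sketch. It assumes roughly 3.1 billion base pairs in one copy of the genome and two bits per base (four possible letters), and it ignores everything else a real file format would store.

```python
BASE_PAIRS = 3_100_000_000   # assumed approximate length of one genome copy
BITS_PER_BASE = 2            # four possible bases: A, C, G, T

size_bytes = BASE_PAIRS * BITS_PER_BASE // 8
size_mb = size_bytes / 1_000_000
print(f"{size_mb:.0f} MB")
```

That lands in the same ballpark as the number quoted above; the exact value depends on which base-pair count and which definition of a megabyte you use.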
Biology is a special case of physics. Physicists may at some point arrive at a Grand Unified Theory of Everything that theoretically implies all of biology.
Biology is the classification and understanding of the complicated results of physics, so it is in many ways basically an almanac.
I hope that when we understand biology better, it won’t seem like an almanac. I predict that our understanding of what “understanding” means will shift dramatically as we continue to make progress in biology. For example (just speculating), perhaps we will feel like we understand something if we can compute it. Perhaps we will develop and run models of biological phenomena as trivially as using a calculator, so that such knowledge seems like an extension of what we “know”. And then understanding will mean identifying the underlying rules, while the almanac part will just be the nitty-gritty output, like doing a physics calculation for specific forces. (For example, it’s pretty neat that WHO is using modeling in real time to generate information about the H1N1 pandemic.)
My use of the word “almanac” was more of a reference to the breadth of the area covered by biology, rather than a comment on the difficulty or content of the information.
It’s funny that you mention predictive modeling—one of the main functions of an Almanac is to provide predictions based on models.
From http://en.wikipedia.org/wiki/Almanac: “Modern almanacs include a comprehensive presentation of statistical and descriptive data covering the entire world. Contents also include discussions of topical developments and a summary of recent historical events.”
Yes, I noticed that I was still nevertheless describing biology as an almanac, as a library of information (predictions) that we will feel like we own because we can generate it. I suppose the best way to say what I was trying to say is that I hope that when we have a better understanding of biology, the term “almanac” won’t seem pejorative, but the legitimate way of understanding something that has large numbers of similar interacting components.
This is profoundly misleading. Physicists already have a good handle on how the things biological systems are made of work, but it’s a moot point because trying to explain the details of how living things operate in terms of subatomic particles is a waste of time. Unless you’ve got a thousand tons of computronium tucked away in your back pocket, you’re never going to be able to produce useful results in biology purely by using the results of physics.
Therefore, the actual study of biology is largely separate from physics, except for the very indirect route of quantum physics ⇒ molecular chemistry ⇒ biochemistry ⇒ biology. Most of the research in the field has little to do with those paths, and each step in the indirect chain is another level of abstraction that allows you to ignore more of the details of how the physics itself works.
The ultimate goal of physics is to break things down until we discover the simplest, most basic rules that govern the universe.
The goals of biology do not lead down what you call the “indirect route.” As you state, biology abstracts away the low-level physics and tries to understand the extremely complicated interactions that take place at a higher level.
Biology attempts to classify and understand all of the species, their systems, their subsystems, their biochemistry, and their interspecies and environmental interactions. The possible sum total of biological knowledge is an essentially limitless dataset, what I might call the “Almanac of Life.”
I’m not sure quite where you think we disagree. I don’t see anything in our two posts that’s contradictory—unless you find the use of the word “Almanac” disparaging to biologists? I hope it’s clear that it wasn’t a literal use—biology clearly isn’t a yearly book of tabular data, so perhaps the simile is inapt.
The way you put it does seem to disparage biologists, yes. The biologists are doing work that is qualitatively different from what physicists do, and that produces results the physicists never will (without the aforementioned thousand tons of computronium, at least). In a very real sense, biologists are exploring an entirely different ideaspace from the one the physicists live in. No amount of investigation into physics in isolation would have given us the theory of evolution, for instance.
And weirdly, I’m not a biologist; I’m an apprentice physicist. I still recognize that they’re doing something I’m not, rather than something that I might get around to by just doing enough physics to make their results obvious.