The Efficient Market Hypothesis in Research
A classic economics joke goes like this:
Two economists are walking down a road, when one of them notices a $20 bill on the ground. He turns to his friend and exclaims: “Look, a $20 bill!” The other replies: “Nah, if there were a $20 bill on the ground, someone would’ve picked it up already.”
The economists in the joke believe in the Efficient Market Hypothesis (EMH), which roughly says that financial markets are efficient and there’s no way to “beat the market” by making intelligent trades.
If the EMH were true, then why is there still a trillion-dollar finance industry with active mutual funds and hedge funds? In reality, the EMH is not a universal law of economics (like the law of gravity), but more of an approximation. There may exist inefficiencies in markets, where stock prices follow a predictable pattern and there is profit to be made (e.g., stock prices fall when it’s cloudy in New York). However, as soon as someone notices the pattern and starts exploiting it (say, by building a trading algorithm based on weather data), the inefficiency disappears: the next person will find zero correlation between the weather in New York and stock prices.
There is a close parallel in academic research. Here, the “market” is generally efficient: most problems that are solvable are already solved. There are still “inefficiencies”: open problems that can reasonably be solved, and one “exploits” such a problem by solving it and publishing a paper. Once exploited, the opportunity is no longer available: nobody else can publish the same paper solving the same problem.
Where does this leave the EMH? In my view, the EMH is a useful approximation, but its accuracy depends on your skill and expertise. For non-experts, the EMH is pretty much universally true: it’s unlikely that you’ve found an inefficiency that everyone else has missed. For experts, the EMH is less often true: when you’re working in highly specialized areas that only a handful of people understand, you begin to notice more inefficiencies that are still unexploited.
A large inefficiency is like a $20 bill on the ground: it gets picked up very quickly. An example of this is when a new tool is invented that can straightforwardly be applied to a wide range of problems. When the BERT model was released in 2018, breaking the state-of-the-art on all the NLP benchmarks, there was instantly an explosion of activity as researchers raced to apply it to all the important NLP problems and be the first to publish. By mid-2019, all the straightforward applications of BERT were done, and the $20 bill was no more.
Above: Representation of the EMH in research. To outsiders, there are no inefficiencies; to experts, inefficiencies exist briefly before they are exploited. Loosely inspired by this diagram by Matt Might.
The EMH implies various heuristics that I use to guide my daily research. If I have a research idea that’s relatively obvious, and the tools to attack it have existed for a while (say, >= 3 years), then probably one of the following is true:
Someone already published it 3 years ago.
The idea doesn’t work very well.
The result is not that useful or interesting.
One of the basic assumptions is wrong, so the idea doesn’t even make sense.
Etc.
Conversely, a research idea is much more likely to be fruitful (i.e., a true inefficiency) if the tools to solve it have only existed for a few months, if it requires data and resources that nobody else has access to, or if it requires a rare combination of insights that conceivably nobody else has thought of.
Outside the realm of the known (the red area in my diagram), there are many questions that are unanswerable. These include the hard problems of consciousness and free will, whether P = NP, and so on, as well as more mundane problems where our current methods are simply not strong enough. To an outsider, these might seem like inefficiencies, but it would be wise to assume they’re not: the EMH ensures that true inefficiencies are quickly picked up.
To give a more relatable example, take the apps Uber (launched in 2009) and Instagram (launched in 2010). Many of the apps on your phone probably launched around the same time. For Uber and Instagram to work, people needed smartphones connected to the internet, with GPS (for Uber) and decent-quality cameras (for Instagram). Neither of these ideas would’ve been possible in 2005, but thanks to the EMH, as soon as smartphone adoption took off, we didn’t have to wait very long for all the viable use cases of the new technology to emerge.
Originally posted on my blog: Lucky’s Notes.
The research community is very far from being efficient.
One of my own fields of research is Markov chain Monte Carlo methods, and their applications in computations for Bayesian models. Markov chain Monte Carlo (MCMC) was invented in the early 1950s, for use in statistical physics. It was not used by Bayesian statisticians until around 1990. There was no reason that it could not have been used before then—the methods of the 1950s could have been directly applied to many Bayesian inference problems.
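To make concrete how directly the 1950s methods map onto a Bayesian problem, here is a minimal sketch of a random-walk Metropolis sampler applied to a toy posterior. The model, prior, data, and tuning constants below are my own invented illustration, not taken from any particular application.

```python
import numpy as np

# Toy Bayesian problem (illustrative only): infer the mean mu of a Normal(mu, 1)
# likelihood from a few observations, with a Normal(0, 10) prior on mu.
rng = np.random.default_rng(0)
data = np.array([1.2, 0.7, 1.9, 1.1])

def log_posterior(mu):
    log_prior = -0.5 * (mu / 10.0) ** 2        # Normal(0, 10) prior, up to a constant
    log_lik = -0.5 * np.sum((data - mu) ** 2)  # Normal(mu, 1) likelihood, up to a constant
    return log_prior + log_lik

# Random-walk Metropolis: propose a symmetric Gaussian step and accept with
# probability min(1, posterior ratio). This is essentially the 1953 algorithm.
def metropolis(n_samples, step=0.5, mu0=0.0):
    samples = np.empty(n_samples)
    mu = mu0
    lp = log_posterior(mu)
    for i in range(n_samples):
        prop = mu + step * rng.normal()
        lp_prop = log_posterior(prop)
        if np.log(rng.random()) < lp_prop - lp:  # accept/reject step
            mu, lp = prop, lp_prop
        samples[i] = mu
    return samples

draws = metropolis(20000)[5000:]  # discard burn-in
print("posterior mean of mu ~", draws.mean())
```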
In 1970, a paper generalizing the most common MCMC algorithm (the “Metropolis” method) was published in Biometrika, one of the top statistics journals. This didn’t prompt anyone to start using it for Bayesian inference.
In the early 1980s, MCMC was used by some engineers and computer scientists (e.g., by Geoffrey Hinton for maximum likelihood inference for log-linear models with latent variables, also known as “Boltzmann machines”). This also didn’t prompt anyone to start using it for Bayesian inference.
After a form of MCMC started being used by Bayesian statisticians around 1990, it took many years for the literature on MCMC methods used by physicists to actually be adopted by statisticians. This is despite the fact that in 1993 I wrote a review paper describing just about all of these methods in terms readily accessible to statisticians.
In 1992, I started using the Hamiltonian Monte Carlo method (aka, hybrid Monte Carlo, or HMC) for Bayesian inference for neural network models. This method was invented by physicists in 1987. (It could have been invented in the 1950s, but just wasn’t.) I demonstrated that HMC was often hundreds or thousands of times faster than simpler methods, gave talks on this at conferences, and wrote my thesis (later book) on Bayesian learning in which this was a major theme. It wasn’t much used by other statisticians until after I wrote another review paper in 2010, which for some reason led to it catching on. It is now widely used in packages such as Stan.
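For readers who haven’t seen HMC, here is a bare-bones sketch of the idea on a toy target. The step size, trajectory length, and target distribution are arbitrary illustrative choices; real implementations (such as Stan’s) tune these automatically.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy target (illustrative): log-density of a 2-D standard Gaussian.
def log_p(q):
    return -0.5 * np.dot(q, q)

def grad_log_p(q):
    return -q

# One HMC transition: draw a momentum, simulate Hamiltonian dynamics with the
# leapfrog integrator, then accept or reject based on the change in total energy.
def hmc_step(q, step=0.1, n_leapfrog=20):
    p = rng.normal(size=q.shape)                       # fresh momentum
    current_H = -log_p(q) + 0.5 * np.dot(p, p)
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * step * grad_log_p(q_new)            # half step for momentum
    for _ in range(n_leapfrog - 1):
        q_new += step * p_new                          # full step for position
        p_new += step * grad_log_p(q_new)              # full step for momentum
    q_new += step * p_new
    p_new += 0.5 * step * grad_log_p(q_new)            # final half step
    proposed_H = -log_p(q_new) + 0.5 * np.dot(p_new, p_new)
    if np.log(rng.random()) < current_H - proposed_H:  # Metropolis acceptance
        return q_new
    return q

q = np.zeros(2)
samples = []
for _ in range(5000):
    q = hmc_step(q)
    samples.append(q)
print("sample mean:", np.mean(samples, axis=0))
```

The gradient-guided trajectories are what let HMC make large moves with high acceptance rates, which is where the large speedups over random-walk methods come from on high-dimensional problems.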
Another of my research areas is error-correcting codes. In 1948, Claude Shannon proved his noisy coding theorem, establishing the theoretical (but not practical) limits of error correction. In 1963, Robert Gallager invented Low Density Parity Check (LDPC) codes. For many years after this, standard textbooks stated that the theoretical limit Shannon proved to be achievable was unlikely to ever be closely approached by codes with practical encoding and decoding algorithms. In 1996, David MacKay and I showed that a slight variation on Gallager’s LDPC codes comes very close to achieving the Shannon limit on performance. (A few years before then, “Turbo codes” had achieved similar performance.) These and related codes are now very widely used.
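As a rough illustration of what “the Shannon limit” means quantitatively (the channel and numbers here are an arbitrary textbook-style example, not from our paper): for a binary symmetric channel that flips each bit with probability p, the capacity is 1 − H2(p) bits per channel use, and Shannon’s theorem says reliable communication is possible at any code rate below that, and impossible above it.

```python
import math

def binary_entropy(p):
    """H2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

# Example: a channel that flips 5% of the bits.
p = 0.05
C = bsc_capacity(p)
print(f"capacity at p={p}: {C:.3f} bits/use")
# Shannon: codes with rate R < C can drive the error rate arbitrarily low,
# while R > C cannot. A rate-1/2 code is comfortably below capacity here,
# which is the regime good LDPC (and Turbo) codes come close to exploiting.
print("rate 1/2 is", "below" if 0.5 < C else "above", "capacity")
```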
These are examples of good ideas that took far longer to be widely used than one would expect in an efficient research community. There are also many bad ideas that persist for far longer than they should.
I think both problems are at least partly the result of the perverse incentives facing researchers.
Lots of research is very incremental: what you describe as “...there was instantly an explosion of activity as researchers raced to apply it to all the important NLP problems and be the first to publish”. Sometimes, of course, this explosion of activity is useful. But often it is not: the idea isn’t actually very good, it’s just the sort of idea on which it is easy to write more and more papers, often precisely because it isn’t very good. And sometimes this explosion of activity doesn’t happen when it would have been useful, because the activity required is not the sort that leads to easy papers. For example, the needed activity is to apply the idea to practical problems, but that isn’t the “novel” research that leads to tenure; or the idea requires learning some new tools, and that’s too much trouble; or the way forward is messy empirical work that doesn’t look as impressive as proving theorems (even if the theorems are actually pointless); or extending an idea that someone else came up with doesn’t seem like as good a career move as developing your own ideas (even when your ideas aren’t as good).
The easy rewards from incremental research may mean that researchers don’t spend much, or any, time on thinking about actual original ideas. Getting such ideas may require reading extensively in diverse fields, and getting one’s hands dirty with the low-level work that is necessary to develop real intuition about how things work, and what is important. Academic researchers can’t easily find time for this, and may be forced into either doing incremental research, or becoming research managers rather than actual researchers.
In my case, the best research environment was when I was a PhD student (with Geoffrey Hinton). But I’m not sure things are still as good for PhD students. The level of competition for short-term rewards may be higher than back in the 1990s.
Yeah, I agree that the EMH holds true more for incremental research than for truly groundbreaking ideas. I’m not too familiar with MCMC or Bayesian inference, so correct me if I’m wrong, but I’m guessing these advancements required combining ideas that nobody expected would work? The deep learning revolution could probably have happened sooner (in the sense that all the prerequisite tools existed), but few people before 2010 expected neural networks to work, so the inefficiencies there remained undiscovered.
At the same time, I wouldn’t denigrate research that you might view as “incremental”, because most research is of that nature. By this I mean, for every paper published in the ACL / EMNLP conferences, if the authors hadn’t published it, someone else would almost certainly have published something very similar within 1-2 years. Exceptions to this are few and far between—science advances via an accumulation of many small contributions.
I think the problem with MCMC is that it’s an incredibly dirty thing from the perspective of a mathematician. It’s too practically useful, as opposed to being about mathematical theorems. MCMC is about finding an efficient way to do calculations, and doing calculations is low status for mathematicians. It has ugly randomness in it.
I was personally taught MCMC when studying bioinformatics, and I was a bit surprised when talking with a friend, a math PhD, who had a problem where MCMC would have worked very well for a subproblem but it was completely off his radar.
MCMC was something that came out of computer science and not out of the statistics community. Most people in statistics cared about statistical significance. The math community already looks down on the statistics community, and MCMC seems even worse from that perspective.
My statistics prof said that in principle bioinformatics could have been a subfield of statistics, but its way of doing things was initially rejected by the statistics community, so bioinformatics had to become its own field (and it’s the field where MCMC gets used a lot, because you actually need it for the problems that bioinformatics cares about).
Certainly some incremental research is very useful. But much of it isn’t. I’m not familiar with the ACL and EMNLP conferences, but for ML and statistics, there are large numbers of papers that don’t really contribute much (and these aren’t failed attempts at breakthroughs). You can see that this must be true from the sheer volume of papers now—there can’t possibly be that many actual advances.
For LDPC codes, it certainly was true that for years people didn’t realize their potential. But there wasn’t any good reason not to investigate—it’s sort of like nobody pointing a telescope at Saturn because Venus turned out to be rather featureless, and why would Saturn be different? There was a bit of tunnel vision, with an unjustified belief that one couldn’t really expect much more than what the codes being investigated delivered—though one could of course publish lots of papers on a new variation in sequential decoding of convolutional codes. (There was good evidence that this would never lead to the Shannon limit—but that of course must surely be unobtainable...)
Regarding MCMC and Bayesian inference, I think there was just nobody making the connection—nobody who actually knew what the methods from physics could do, and also knew what the computational obstacles for Bayesian inference were. I don’t think anyone thought of applying the Metropolis algorithm to Bayesian inference and then said, “but surely that wouldn’t work...”. It’s obviously worth a try.
That’s false. For most problems that are solvable, nobody cares about solving them, because it’s not valuable to solve them.
Larry McEnerney’s LEADERSHIP LAB: The Craft of Writing Effectively has had a huge impact on how I think about research. The research that gets done depends heavily on what’s valued in particular research communities.
To look at two of my recent research questions:
After reading about how fibrin fixates fascia, because it’s present in fascial fluid and accumulates when the flow of fascial fluid is blocked, I wanted to know the protein composition of fascial fluid. I don’t think there’s any available paper that gives me the answer. That’s not because it would be very hard to do the relevant research, but because few people care about the protein composition of fascial fluid. This is not research I can do myself, but someone who had a lab could easily do it.
My other recent research idea came after reading Core Pathways of Aging. I had a feeling like “Why did I believe the bullshit they told me in Microbiology 101 (might also have been 102) about transposons?” Then I started running an evolutionary simulation of transposons in human hunter-gatherers to better understand what’s going on. When I searched around for papers on other simulations of hunter-gatherers, I found that nobody really does this; people instead use differential equations from the ’60s with fancy math, from a time before the compute to do good simulations was available.
While I’m still in the middle of the project and the outcome isn’t final, it’s very interesting that my model currently produces the 51% male birth rate that’s the literature value, and not the 50% I would have naively expected. It seems to me that the research topic is underexplored because it didn’t fall into what any researcher in the field considered interesting.
True, I guess a more precise statement is “most problems that are important and solvable are already solved”. There are lots of small gaps in my research as well, like “what if we make a minor adjustment to method X”: whatever the outcome, it would be below the bar for a publication, so they’re generally left untouched.
No, there are plenty of important problems that nobody has an incentive to solve. See Eliezer’s Inadequate Equilibria. What’s central is whether there’s a research community that cares about the problem.
Take Ivermectin pre-COVID. It worked very well for getting rid of parasites after being discovered in 1975, well enough to lead to a Nobel prize. In 2018, a paper was published on Ivermectin as a possible antiviral against influenza.
The question of whether Ivermectin is a viable treatment against influenza, and maybe a broad-spectrum antiviral, is an important problem. On the other hand, it’s not a very valuable problem for anyone to answer, given that Ivermectin is long off patent.
The way the last sentence of the paper is formulated is very interesting. As far as influenza being important: the fact that we have a lot of influenza deaths every year should be enough to demonstrate that it’s an important problem. The community that produces regular drugs, however, doesn’t really care about repurposing a generic.
On the other hand, there’s a community that cares about pandemic preparedness. The pandemic preparedness community cares less about whether it’s possible to patent treatments and more about health outcomes, so the author pitches the question as being valuable for the pandemic preparedness community.
The tools to find out whether or not you can use Ivermectin as an antiviral against influenza, and also against coronaviruses, have existed since it hit the market in 1981. It was just never valuable enough for anybody to find out, until some people thought about running small trials for all the substances that might help against COVID-19.
The people who did consider it valuable were also mostly small funders, so we still don’t have highly powered trials that tell us with high certainty about the effects of Ivermectin. The big healthcare funders didn’t consider it valuable to fund the studies early in the pandemic, but that doesn’t mean running the studies wasn’t important.
MIRI’s attempt to publish its ideas in academic venues ran into the problem that there isn’t really an existing academic community that values what they do. That doesn’t mean that MIRI’s work is not important. It just means nobody in academia cares.
Important work that has no field that values it has a hard time getting produced.
To be fair, almost nobody considered a pandemic to be a serious possibility prior to 2020, so it is understandable that pandemic preparedness research was a low-priority area. There may be lots of open and answerable questions in unpopular topics, but if the topic is obscure, the payoff for making a discovery is small (in terms of reputation and recognition).
Of course, COVID-19 has proven to us that pandemic research is important, and immediately researchers poured in from everywhere to work on various facets of the problem (e.g., I even joined in an effort to build a ventilator simulator). The payoff increased, so the inefficiencies quickly disappeared.
Now you can argue that pandemic research should’ve been more prioritized before. That is obvious in hindsight but was not at all obvious in 2019. Out of the zillions of low-priority research areas that nobody cares about now, how will you decide which one will become important? Unless you have a time machine to see into the future, it remains a low-payoff endeavor.
In the US alone, depending on the year, there are somewhere between 10,000 and 60,000 flu deaths, plus a lot of additional harm from people being ill. Whether or not pandemics are a concern, dealing with that is an important problem.
There was money in pandemic preparedness. The Gates Foundation and organizations like CEPI were interested in it. They let themselves be conned by mRNA researchers and as a result funded mRNA research, and there’s a good chance that this did net harm, since it made us focus our vaccine trials on mRNA vaccines instead of on well-understood existing vaccine platforms that are easy to scale up and come with fewer side effects.
The 2018 study I referred to is written in a way that advocates for part of this money going into studying Ivermectin for influenza. With the benefit of hindsight, that would have been more important.
In any case, my main point here is that what was prioritized (or was found to be valuable, in Larry McEnerney’s terms) and what was important were two different things.
If you want to do important research, and not just research that’s prioritized (found to be valuable by a particular community), it’s important to be able to mentally distinguish the two. Paradigm-changing research, for example, generally isn’t valuable to the community that operates within the existing paradigm.
Sydney Brenner, for example, who was one of the people who started the field of molecular biology, is on record saying that the kind of paradigm-creating work they did back then would have a hard time getting funded in today’s environment.
Given that there’s an efficient market for producing work that’s valued by established funders, but not an efficient market for creating important work, any researcher who actually wants to do important work, and not just work that’s perceived as valuable, has to keep the two apart. The efficient market hypothesis implies that most of the open opportunities to do important work are not seen as valuable by existing research communities.