Yeah, I agree that the EMH holds true more for incremental research than for truly groundbreaking ideas. I’m not too familiar with MCMC or Bayesian inference, so correct me if I’m wrong, but I’m guessing these advances required combining ideas that nobody expected would work? The deep learning revolution could probably have happened sooner (in the sense that all the prerequisite tools existed), but few people before 2010 expected neural networks to work, so the inefficiencies there remained undiscovered.
At the same time, I wouldn’t denigrate research that you might view as “incremental”, because most research is of that nature. By this I mean, for every paper published in the ACL / EMNLP conferences, if the authors hadn’t published it, someone else would almost certainly have published something very similar within 1-2 years. Exceptions to this are few and far between—science advances via an accumulation of many small contributions.
I think the problem with MCMC is that it’s an incredibly dirty thing from the perspective of a mathematician. It’s too practically useful, as opposed to being about mathematical theorems. MCMC is about finding an efficient way to do a calculation, and doing calculations is low status for mathematicians. It also has ugly randomness in it.
I was personally taught MCMC when studying bioinformatics, and I was a bit surprised when talking with a friend, a math PhD, who had a problem where MCMC would have worked very well for a subproblem, but it was completely off his radar.
MCMC was something that came out of computer science and not out of the statistics community. Most people in statistics cared about statistical significance. The math community already looks down on the statistics community, and MCMC seems even worse from that perspective.
My statistics prof said that in principle bioinformatics could have been a subfield of statistics, but its way of doing things was initially rejected by the statistics community, so bioinformatics had to become its own field (and it’s the field where MCMC gets used a lot, because you actually need it for the problems bioinformatics cares about).
Certainly some incremental research is very useful. But much of it isn’t. I’m not familiar with the ACL and EMNLP conferences, but for ML and statistics, there are large numbers of papers that don’t really contribute much (and these aren’t failed attempts at breakthroughs). You can see that this must be true from the sheer volume of papers now—there can’t possibly be that many actual advances.
For LDPC codes, it certainly was true that for years people didn’t realize their potential. But there wasn’t any good reason not to investigate—it’s sort of like nobody pointing a telescope at Saturn because Venus turned out to be rather featureless, and why would Saturn be different? There was a bit of tunnel vision, with an unjustified belief that one couldn’t really expect much more than what the codes being investigated delivered—though one could of course publish lots of papers on a new variation in sequential decoding of convolutional codes. (There was good evidence that this would never lead to the Shannon limit—but that of course must surely be unobtainable...)
Regarding MCMC and Bayesian inference, I think there was just nobody making the connection—nobody who actually knew what the methods from physics could do, and also knew what the computational obstacles for Bayesian inference were. I don’t think anyone thought of applying the Metropolis algorithm to Bayesian inference and then said, “but surely that wouldn’t work...”. It’s obviously worth a try.
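To make the connection concrete, here’s a minimal sketch of what “applying the Metropolis algorithm to Bayesian inference” amounts to: random-walk Metropolis only needs the posterior density up to a normalizing constant, which is exactly the quantity that’s easy to write down and hard to integrate. (The toy Gaussian-mean example and all the numbers below are just my illustration, not anything from the original papers.)

```python
# Toy example (illustrative only): random-walk Metropolis sampling from the
# posterior of a Gaussian mean mu with known sigma=1 and a N(0, 10^2) prior.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=50)    # simulated observations

def log_posterior(mu, sigma=1.0, prior_sd=10.0):
    # Log prior plus log likelihood, up to an additive constant --
    # the normalizing constant is never needed by Metropolis.
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = -0.5 * np.sum((data - mu) ** 2) / sigma ** 2
    return log_prior + log_lik

samples = []
mu = 0.0                                          # arbitrary starting point
for _ in range(10_000):
    proposal = mu + rng.normal(scale=0.5)         # symmetric random-walk proposal
    # Accept with probability min(1, posterior ratio), done on the log scale.
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(mu):
        mu = proposal
    samples.append(mu)

print(np.mean(samples[2000:]))                    # posterior-mean estimate after burn-in
```

Nothing here depends on the problem coming from physics; all the sampler asks for is an unnormalized density it can evaluate, which any Bayesian posterior provides.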