Also note that Barnett said “any novel predictions”, which is not part of the Wikipedia definition of falsifiability, right? The Wikipedia definition doesn’t make reference to an existing community of scientists who already made predictions, such that a new hypothesis can be said to have made novel vs. non-novel predictions.
I totally agree btw that it matters sociologically who is making novel predictions and who is sticking with the crowd. And I do in fact ding MIRI points for this relative to some other groups. However I think relative to most elite opinion-formers on AGI matters, MIRI performs better than average on this metric.
But note that this ‘novel predictions’ metric is about people/institutions, not about hypotheses.
Agree with this, with the caveat that I think essentially all of their rightness relative to others came from two beliefs: that short timelines were plausible enough, and that AI would be by far the biggest force of the 21st century compared to other technologies. A lot of their other specific predictions are likely to be pretty wrong.
I like this comment as a useful comparison point to MIRI: physicists were right about the Higgs boson existing, but wrong about theories like supersymmetry, where people expected the Higgs mass to be naturally stabilized. Even assuming supersymmetry is correct for our universe, the theory cannot stabilize the mass of the Higgs or solve the hierarchy problem:
https://www.lesswrong.com/posts/ZLAnH5epD8TmotZHj/you-can-in-fact-bamboozle-an-unaligned-ai-into-sparing-your#Ha9hfFHzJQn68Zuhq
I think I agree with this—but do you see how it makes me frustrated to hear people dunk on MIRI’s doomy views as unfalsifiable? Here’s what happened in a nutshell:
MIRI: “AGI is coming and it will kill everyone.”
Everyone else: “AGI is not coming and if it did it wouldn’t kill everyone.”
time passes, evidence accumulates...
Everyone else: “OK, AGI is coming, but it won’t kill everyone.”
Everyone else: “Also, the hypothesis that it won’t kill everyone is unfalsifiable so we shouldn’t believe it.”
Yeah, I think this is actually a problem, though admittedly I often see the hypotheses vaguely formulated, and I kind of agree with Jotto999 that verbal forecasts give far too much room for leeway here:
I like Eli Tyre’s comment here:
https://www.lesswrong.com/posts/ZEgQGAjQm5rTAnGuM/beware-boasting-about-non-existent-forecasting-track-records#Dv7aTjGXEZh6ALmZn
I like that metric, but the metric I’m discussing is more:
Are they proposing clear hypotheses?
Do their hypotheses make novel testable predictions?
Are they making those predictions explicit?
So for example, look at MIRI’s very first blog post, from 2007: The Power of Intelligence. I used the first post just to avoid cherry-picking.
Hypothesis: intelligence is powerful. (yes it is)
This hypothesis is a necessary precondition for what we’re calling “MIRI doom theory” here. If intelligence is weak then AI is weak and we are not doomed by AI.
Predictions that I extract:
An AI can do interesting things over the Internet without a robot body.
An AI can get money.
An AI can be charismatic.
An AI can send a ship to Mars.
An AI can invent a grand unified theory of physics.
An AI can prove the Riemann Hypothesis.
An AI can cure obesity, cancer, aging, and stupidity.
Not a novel hypothesis, nor novel predictions, but also not widely accepted in 2007. As predictions they have aged very well, but they were unfalsifiable: each is an existential “an AI can do X” claim, which observation can verify but never refute. If 2025 Claude had no charisma, it would not falsify the prediction that an AI can be charismatic.
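To make that asymmetry concrete, here is a minimal first-order sketch (my formalization, not anything from the post). Read “an AI can be charismatic” as an existential claim over agents and an open-ended future:

$$\exists x\, \exists t\; \big(\mathrm{AI}(x,t) \wedge \mathrm{Charismatic}(x,t)\big)$$

A single charismatic AI at any time verifies it, but no finite record of uncharismatic AIs refutes it, because the quantifier over $t$ never closes. A falsifiable variant needs a deadline, e.g. “some AI will be charismatic by 2030”, which the arrival of 2030 without such an AI would refute.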
I don’t mean to ding MIRI any points here, relative or otherwise, it’s just one blog post, I don’t claim it supports Barnett’s complaint by itself. I mostly joined the thread to defend the concept of asymmetric falsifiability.
Martin Randall extracted the practical consequences of this here:
In that context, if a hypothesis makes no novel predictions, and the predictions it makes are a superset of the predictions of other hypotheses, it’s less empirically vulnerable, and in some relative sense “unfalsifiable”, compared to those other hypotheses.
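One way to cash this out (my formalization, not Martin Randall’s): read “the predictions a hypothesis makes” as the set of outcomes it is compatible with, and let $R(H)$ be the complement, the observations $H$ rules out. If $H_1$’s compatible outcomes are a superset of $H_2$’s, then

$$R(H_1) \subseteq R(H_2),$$

so every observation that would refute $H_1$ also refutes $H_2$, but not conversely. $H_1$ is strictly less exposed to the evidence, which is the comparative sense in which it is “unfalsifiable”.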