If you don’t believe me about that aspect of heroic epistemology, feel free not to believe me about not multiplying small probabilities either.
Multiplying small probabilities seems fine to me, whereas I really don’t get “heroic epistemology”.
You seem to be suggesting that “heroic epistemology” and “multiplying small probabilities” both lead to the same conclusion: support MIRI’s work on FAI. But that is the case only if working on FAI has no negative consequences. In that case, “small chance of success” plus “multiplying small probabilities” warrants working on FAI, just as “medium probability of success” plus “not multiplying small probabilities” does. But since working on FAI does have negative consequences, namely shortening AI timelines and (in the later stages) possibly directly causing the creation of an UFAI, merely allowing multiplication by small probabilities is not sufficient to warrant working on FAI if the probability of success is low.
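To make the arithmetic behind this objection concrete, here is a minimal sketch with made-up numbers (none of them come from this discussion): a small success probability times a large benefit dominates the calculation only when the backfire term is negligible.

```python
# Hypothetical illustration only: the probabilities and payoffs below are
# invented for the sake of the example, not estimates anyone has endorsed.

def expected_value(p_success, benefit, p_backfire, harm):
    """Net expected value of an intervention that can also backfire."""
    return p_success * benefit - p_backfire * harm

# With no negative consequences, even a tiny success probability looks good:
print(expected_value(p_success=1e-4, benefit=1e9, p_backfire=0.0, harm=0.0))   # 100000.0

# With a backfire term (e.g., shortening AI timelines), the sign can flip
# once the success probability is low enough:
print(expected_value(p_success=1e-4, benefit=1e9, p_backfire=1e-3, harm=1e9))  # -900000.0
```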
I am really worried that you are justifying your current course of action through a novel epistemology of your own invention, which has not been widely vetted (or even widely understood). Most new ideas are wrong, and I think you ought to treat your own new ideas with deeper suspicion.
I’m a reactionary, not an innovator, dammit! Reacting against this newfangled antiheroic ‘reference class’ claim that says we ought to let the world burn because we don’t have enough of a hero license!
Ahem.
I’m also really unconvinced by the claim that this work could reasonably have expected net negative consequences. I’m worried about the dynamics and evidence of GiveDirectly, but I don’t think GD has negative consequences; that would be a huge stretch. It’s possible, maybe, but it’s certainly not the arithmetic expectation. With that said, I worry that this ‘maybe negative’ stuff is impeding EA motivation generally. There is much that is ineffectual to be wary of, and there are missed opportunity costs, but trying to warn people against reverse or negative effects seems pretty perverse for anything that has made it onto GiveWell’s Top 3, or CFAR, or FHI, or MIRI. Info that shortens AI timelines should mostly just not be released publicly, and I don’t see any particularly plausible way for a planet to survive without having some equivalent of MIRI doing MIRI’s job, and the math thereof should be started as early as feasible.
I’m a reactionary, not an innovator, dammit! Reacting against this newfangled antiheroic ‘reference class’ claim that says we ought to let the world burn because we don’t have enough of a hero license!
“Reference class” to me is just an intuitive way of thinking about updating on certain types of evidence. It seems like you’re saying that in some cases we ought to use the inside view, or weigh object-level evidence more heavily, but 1) I don’t understand why you are not worried about “inside view” reasoning typically producing overconfidence or why you don’t think it’s likely to produce overconfidence in this case, and 2) according to my inside view, the probability of a team like the kind you’re envisioning solving FAI is low, and a typical MIRI donor or potential donor can’t be said to have much of an inside view on this matter, and has to use “reference class” reasoning. So what is your argument here?
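For readers who want the “updating” framing spelled out: one crude way to formalize combining an inside view with a reference class is to pool the two probability estimates in log-odds, with the weight on the inside view reflecting how much you trust it. This is only an illustrative sketch; the pooling rule, the weights, and the numbers are all assumptions of mine, not anything proposed in the thread.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def pooled_estimate(p_inside, p_reference, w_inside):
    """Log-odds pooling of an inside-view estimate with a reference-class
    base rate; w_inside in [0, 1] is how much weight the inside view gets."""
    x = w_inside * logit(p_inside) + (1.0 - w_inside) * logit(p_reference)
    return inv_logit(x)

# Hypothetical numbers: an optimistic inside view (50%), a pessimistic
# reference-class base rate (1%), and only modest trust in the inside view.
print(pooled_estimate(p_inside=0.5, p_reference=0.01, w_inside=0.25))  # ~0.03
```

The point of the sketch is just that unless you place heavy weight on the inside view, the reference-class base rate dominates the pooled estimate.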
I’m also really unconvinced by the claim that this work could reasonably have expected net negative consequences.
Every AGI researcher is unconvinced by that, about their own work.
but trying to warn people against reverse or negative effects seems pretty perverse for anything that has made it onto GiveWell’s Top 3, or CFAR, or FHI, or MIRI
CFAR and MIRI were created by you, to help you build FAI. If FHI has endorsed your plan for building FAI (as opposed to endorsing MIRI as an organization that’s a force for good overall, which I’d agree with, and I’ve actually provided various forms of support to MIRI because of that), I’m not aware of it. I also think I’ve thought enough about this topic to give some weight to my own judgments, so even if FHI does endorse your plan, I’d want to see their reasoning (which I definitely have not seen) and not just take their word for it. I note that GiveWell does publish its analyses and is not asking people to just trust it.
Info that shortens AI timelines should mostly just not be released publicly
My model of FAI development says that you have to get most of the way to being able to build an AGI just to be able to start working on many Friendliness-specific problems, and solving those problems would take a long time relative to finishing the rest of the AGI capability work. Unless you’re flying completely below the radar, which is incompatible with your plan for funding via public donations, what is stopping your unpublished results from being stolen or leaked in the meantime? And just gathering 10 to 50 world-class talents to work on FAI is likely to spur competition and speed up AGI progress. The fact that you seem to be overconfident about your chance of success also suggests that you are likely to be overconfident in other areas, and indicates a high risk of accidental UFAI creation (relative to the probability of success, not necessarily high in absolute terms).
My model of FAI development says that you have to get most of the way to being able to build an AGI just to be able to start working on many Friendliness-specific problems, and solving those problems would take a long time relative to finishing the rest of the AGI capability work.
Agree, though luckily there are other Friendliness-specific problems that we can start solving right now.
Unless you’re flying completely below the radar, which is incompatible with your plan for funding via public donations, what is stopping your unpublished results from being stolen or leaked in the meantime?
Presumably, security technology similar to what has mostly worked for the Manhattan Project, secret NSA projects, etc. But yeah, it’s a big worry. But what did you have in mind about flying completely under the radar? There are versions of an FAI team that could be funded pretty discreetly by just one person.
Agree, though luckily there are other Friendliness-specific problems that we can start solving right now.
I listed some in another comment, but they are not the current focus of MIRI research. Instead, MIRI is focusing on FAI-relevant problems that do shorten AI timelines (i.e., working on “get most of the way to being able to build an AGI”), such as decision theory and logical uncertainty.
Presumably, security technology similar to what has mostly worked for the Manhattan Project, secret NSA projects, etc.
As I noted in previous comments, the economics of information security seems to greatly favor the offense, so you have to spend far more resources than your attackers in order to maintain secrets.
But what did you have in mind about flying completely under the radar? There are versions of an FAI team that could be funded pretty discreetly by just one person.
That’s probably the best bet as far as avoiding having your results stolen, but it introduces other problems, such as how to attract talent, and whether you can fund a large enough team that way. (Small teams might increase the chances of accidental UFAI creation, since there would be fewer people to look out for errors.) And given that Eliezer is probably already on the radar of most AGI researchers, you’d have to find a replacement for him on this “under the radar” team.
I should ask this question now rather than later: Is there a concrete policy alternative being considered by you?
Every AGI researcher is unconvinced by that, about their own work.
And on one obvious ‘outside view’, they’d be right—it’s a very strange and unusual situation, which took me years to acknowledge, that this one particular class of science research could have perverse results. There are many attempted good deeds which have no effect, but complete backfires make the news because they’re rare.
(Hey, maybe the priors in favor of good outcomes from the broad reference class of scientific research are so high that we should just ignore the inside view which says that AGI research will have a different result!)
And even AGI research doesn’t end up making it less likely that AGI will be developed, please note—it’s not that perverse in its outcome.
Is there a concrete policy alternative being considered by you?
I’m currently in favor of the following:
research on strategies for navigating the intelligence explosion (what I called “Singularity Strategies”)
pushing for human intelligence enhancement
pushing for a government to try to take an insurmountable tech lead via large scale intelligence enhancement
research into a subset of FAI-related problems that do not shorten AI timelines (at least as far as we can tell), such as consciousness, normative ethics, metaethics, metaphilosophy
advocacy/PR/academic outreach on the dangers of AGI progress
There are many attempted good deeds which have no effect, but complete backfires make the news because they’re rare.
What about continuing physics research possibly leading to a physics disaster or new superweapons, biotech research leading to biotech disasters, nanotech research leading to nanotech disasters, WBE research leading to value drift and Malthusian outcomes, computing hardware research leading to deliberate or accidental creation of massive simulated suffering (aside from UFAI)? In addition, I thought you believed that faster economic growth made a good outcome less likely, which would imply that most scientific research is bad?
And even AGI research doesn’t end up making it less likely that AGI will be developed, please note—it’s not that perverse in its outcome.
Many AGI researchers seem to think that their research will result in a benevolent AGI, and I’m assuming you agree that their research does make it less likely that such an AGI will eventually be developed.
It seems odd to insist that someone explicitly working on benevolence should consider themselves to be in the same reference class as someone who thinks they just need to take care of the AGI and the benevolence will pretty much take care of itself.
I wasn’t intending to use “AGI researchers” as a reference class to show that Eliezer’s work is likely to have net negative consequences, but to show that people whose work can reasonably be expected to have net negative consequences (of whom AGI researchers is a prominent class) still tend not to believe such claims, and therefore Eliezer’s failure to be convinced is not of much evidential value to others.
The reference class I usually do have in mind when I think of Eliezer is philosophers who think they have the right answer to some philosophical problem (virtually all of whom end up being wrong or at least incomplete even if they are headed in the right direction).
ETA: I’ve written a post that expands on this comment.