I strongly agree with this comment, and also have a response to Eliezer’s response to it. While I share TheOtherDave’s views, as TheOtherDave noted, he doesn’t necessarily share mine!
It’s not the large consequences that make it a priori unlikely that an organization is really good at mitigating existential risks—it’s the objectively small probabilities and lack of opportunity to learn by trial and error.
If your goal is to prevent heart attacks in chronically obese, elderly people, then you’re dealing with reasonably large probabilities. For example, the AHA estimates that a 60-year-old, 5′8″ man weighing 220 pounds has a 10% chance of having a heart attack in the next 10 years. You can fiddle with their calculator here. This is convenient, because you can learn by trial and error whether your strategies are succeeding. If only 5% of a group of the elderly obese under your treatment have heart attacks over the next 10 years, then you’re probably doing a good job. If 12% have heart attacks, you should probably try another tactic. These are realistic swings to expect from an effective treatment; it might really be possible to cut the rate of heart attacks in half among a particular population. This study, for example, reports a 25% relative risk reduction. If an organization claims to be doing really well at preventing heart attacks, that’s a credible signal: if they weren’t doing well, someone could check their results and prove it, which would be embarrassing for the organization. So, that kind of claim only needs a little bit of evidence to support it.
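To make the “checkable” point concrete, here’s a minimal sketch of the arithmetic in Python. The cohort size of 1,000 is made up for illustration; the baseline and observed rates are the ones from the paragraph above. It just asks how many standard errors separate an observed 5% rate from the 10% baseline.

```python
import math

# Hypothetical cohort: 1,000 chronically obese, elderly patients followed for 10 years.
n = 1000
baseline = 0.10   # 10-year heart-attack risk without the intervention (AHA-style figure)
observed = 0.05   # rate actually seen in the treated cohort

# Standard error of a proportion around the baseline rate
se = math.sqrt(baseline * (1 - baseline) / n)

# How many standard errors separate the observed rate from the baseline?
z = (baseline - observed) / se
print(f"standard error: {se:.4f}")  # about 0.0095
print(f"z-score: {z:.1f}")          # about 5.3 -- easily distinguishable from chance
```

With even a thousand patients, a real halving of the rate sticks out by several standard errors, so an outside observer can actually audit the organization’s claim.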
On the other hand, any given existential risk has a small chance of happening, a smaller chance of being mitigated, and, by definition, little or no opportunity to learn by trial and error. For example, the odds of an artificial intelligence explosion in the next 10 years might be 1%. A team of genius mathematicians funded with $5 million over the next 10 years might be able to reduce that risk to 0.8%. However, this would be an extraordinarily difficult thing to estimate. These numbers come from back-of-the-envelope Fermi calculations, not from hard data. They can’t come from hard data—by definition, existential risks haven’t happened yet. Suppose 10 years go by, and the Singularity Institute gets plenty of funding, and they declare that they successfully reduced the risk of unfriendly AI down to 0.5%, and that they are on track to do the same for the next decade. How would anyone even go about checking this claim?
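Here is the back-of-the-envelope arithmetic from the paragraph above written out explicitly, purely for illustration; every input below is a guess rather than data, which is exactly the problem.

```python
# Every number here is a back-of-the-envelope guess, not data -- which is the point.
p_baseline = 0.01       # guessed odds of an AI intelligence explosion in the next 10 years
p_with_funding = 0.008  # guessed odds if a genius team gets $5 million over those 10 years
funding = 5_000_000     # dollars over 10 years

risk_reduction = p_baseline - p_with_funding        # 0.002, i.e. 0.2 percentage points
cost_per_point = funding / (risk_reduction * 100)   # dollars per percentage point of risk removed
print(f"risk reduction: {risk_reduction:.3%}")               # 0.200%
print(f"cost per percentage point: ${cost_per_point:,.0f}")  # $25,000,000
```

The output looks precise, but unlike the heart-attack figures, none of these inputs or outputs can be checked against observed frequencies.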
An unfriendly intelligence explosion, by its very nature, will use tactics and exploit weaknesses that we are not presently aware of. If we learn about some of these weaknesses and correct them, an unfriendly AI would simply exploit others. The Singularity Institute wants to promote the development of a provably friendly AI; the thought is that if the AI’s source code can be shown mathematically to be friendly, then, as long as the proof is correct and the code is faithfully implemented by the programmers and engineers, we can achieve absolute protection against uFAI, because the FAI will be smart enough to figure out how to protect us from any unfriendly AI. But while it’s very plausible to think that we will face significant AI risk in the next 30 years (i.e., the risk arises under a disjunctive list of conditions), it’s much less likely that we will face AI risk, and that the AI will turn out to have the capacity to exponentially self-improve, and that there is a theoretical piece of source code that would be friendly, and that at least one such code can provably be shown to be friendly, and that a team of genius mathematicians will actually find that proof, and that these mathematicians will prevail upon a group of engineers to build the FAI before anyone else builds a competing model. This is a conjunctive scenario.
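To see why the conjunction matters, here’s a toy calculation in the same spirit; the per-condition probabilities are invented and deliberately generous, chosen only to show how quickly a chain of “and”s shrinks the joint probability.

```python
# Toy numbers only: each probability below is invented, and deliberately generous,
# just to show how fast a conjunction of conditions shrinks the joint probability.
conditions = {
    "significant AI risk actually arises":           0.5,
    "that AI can exponentially self-improve":        0.5,
    "a friendly source code exists in theory":       0.5,
    "friendliness of some such code is provable":    0.3,
    "a team of mathematicians finds the proof":      0.3,
    "engineers build the FAI before a rival model":  0.3,
}

joint = 1.0
for claim, p in conditions.items():
    joint *= p

print(f"joint probability: {joint:.4f}")  # about 0.0034, even with generous inputs
```

Even granting each link a coin-flip-or-better chance, the joint probability lands well under one percent.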
It’s not at all clear to me how just generally having a team of researchers who are moderately familiar with the properties of the mathematical objects that determine the friendliness of AI could do anything to reduce existential risk if this conjunctive scenario doesn’t come to pass. In other words, if we get self-replicating autonomous moderately intelligent AIs, or if it turns out that there’s no such thing as a mathematical proof of friendliness, or if AI first comes about by way of whole brain emulation, then I don’t understand how the Singularity Institute proposes to make itself useful. It’s not a crazy thought that having a ready-made team of seasoned amateurs ready to tackle the problems of AI would yield better results than having to improvise a response team from scratch...but there are other charitable proposals (including proposals to reduce other kinds of x-risk) that I find considerably more compelling. If you want me to donate to the Singularity Institute, you’ll have to come up with a better plan than “This incredibly specific scenario might come to pass and we have a small chance of being able to mitigate the consequences if it does, and even if the scenario doesn’t come to pass, it would still probably be good to have people like us on hand to cope with unspecified similar problems in unspecified ways.”
By way of analogy, a group of forward-thinking humanitarians in 1910 could have plausibly argued that somebody ought to start getting ready to think about ways to help protect the world against the unknown risks of new discoveries in theoretical physics...but they probably would have been better off thinking up interesting ways of stopping World War I or a recurrence of the dreaded Russian Flu of 1889–90. The odds that even a genius team of humanitarian physicists, starting from baseline knowledge of the era’s early models of the atom and Marie Curie’s work on radioactivity, would have anticipated the specific course that cutting-edge physics would take (radioactivity, chain reactions, uranium enrichment, and implosion bombs) are already incredibly low. The further odds that they would have taken useful steps, in the 1910s, to devise and execute an effective plan to stop the development of nuclear weapons, or even to ensure that they were not used irresponsibly, seem astronomically low. The team might have managed, in a general way, to help improve the security controls on known radioactive materials; but, as actually happened, new materials were found to be radioactive, new ways were found of artificially enhancing the radioactivity of a substance, and in any event most governments had secret stockpiles of fissile material that would not have been reached by ordinary security controls.
Today, we know a little something about computer science, and it’s understandable to want to develop expertise in how to keep computers safe. But we can’t anticipate the specific course of discoveries in cutting-edge computer science, and even if we could, it’s unlikely that we’ll be able to take action now that would help us cope with them. And if our guesses about the future prove to be close but not exactly accurate, then it’s even more unlikely that the plans we make now, based on those guesses, will wind up being useful.
That’s why I prefer to donate to charities that are either (a) attempting to alleviate suffering that is currently and verifiably happening, e.g., Deworm the World, or (b) obviously useful for preventing existential risks in a disjunctive way, e.g., the Millennium Seed Bank. I have nothing against the SI; I wish you well and hope you grow and succeed. I think you’re doing better than the vast majority of charities out there. I just also think there are even better uses for my money.
EDIT: Clarified that my views may be different from TheOtherDave’s, even though I agree with his views.
I should say, incidentally (since this was framed as agreement to my comment) that Mass_Driver’s point is rather different from mine.