Does the fact that evolution inefficiently “designed” something more intelligent than itself imply that we can efficiently design something less intelligent than ourselves, which can in turn efficiently design something much more intelligent than its creators?
Either we can design a human-level AGI (without WBE) or we cannot. If we cannot, this entire discussion about safety protocols is irrelevant. Maybe we need some safety protocols for experiments with WBE, but that is a different story. If we can, then it seems likely that there exists a subhuman AGI able to design a superhuman AGI, because there is no reason to believe human-level intelligence is a special point, and because this weaker intelligence will be better optimized for designing AGI than humans are. Such a self-improvement process creates a positive feedback loop which might lead to a very rapid rise in intelligence.
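To make the feedback-loop claim more concrete, here is a deliberately crude toy recursion of my own (arbitrary numbers, not a forecast): each generation designs the next, and everything hinges on whether the returns to design ability stay proportional to intelligence or diminish.

```python
import math

# Toy model (my own illustration, not anyone's forecast): the "intelligence"
# of generation n+1 as a function of generation n's level. Units are arbitrary.

def run(step, generations=30, start=1.0):
    levels = [start]
    for _ in range(generations):
        levels.append(step(levels[-1]))
    return levels

# Scenario 1: each generation improves on itself by a fixed fraction of its
# own ability, so the loop compounds exponentially.
explosive = run(lambda level: level * 1.2)

# Scenario 2: returns diminish sharply with level, so the loop stalls.
plateau = run(lambda level: level + 1.0 / (1.0 + math.log1p(level)))

print(f"proportional returns after 30 generations: {explosive[-1]:.1f}")
print(f"diminishing returns after 30 generations:  {plateau[-1]:.1f}")
```

Which of the two regimes actually applies is, of course, the substantive question.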
People already suck at telling whether Vitamin D is good for you, yet some people seem to believe that they can have non-negligible confidence about the power and behavior of artificial general intelligence.
Low confidence means stronger safety requirements, not the other way around.
For important abilities, such as persuasion, there are good reasons to believe that there are no minds much better than humans.
One of the arguments I heard for humans being the bare minimum level of intelligence for a technological civilization is that there existed no further evolutionary pressure to select for even higher levels of general intelligence.
You just claim that there can be levels of intelligence below us that are better than us at designing levels of intelligence above us and that we can create such intelligences. In my opinion such a belief requires strong justification.
People already suck at telling whether Vitamin D is good for you, yet some people seem to believe that they can have non-negligible confidence about the power and behavior of artificial general intelligence.
Low confidence means stronger safety requirements, not the other way around.
Yes. Something is very wrong with this line of reasoning. I hope GiveWell succeeds at writing a post on this soon. My technical skills are not sufficient to formalize my doubts.
I’ll just say this much: I am not going to spend resources on the possibility of catching some exotic disease, even though it could kill me in a horrible way, when there are other more likely risks that could cripple me.
What are these reasons?
I list some caveats here. Even humans hit diminishing returns on many tasks and just stop exploring and start exploiting. For persuasion this should be pretty obvious. Improving a sentence you want to send to your gatekeeper for a million subjective years does not make it one hundred thousand times more persuasive than improving it for 10 subjective years.
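To put a rough number on that, assume, purely for illustration, that persuasiveness grows logarithmically with optimization time; the curve is a stand-in of my own, not an established fact about persuasion:

```python
import math

# Toy illustration: with log-shaped returns (an assumption for this example),
# a 100,000x increase in subjective optimization time buys only a modest gain.

def persuasiveness(subjective_years):
    # Arbitrary logarithmic returns curve; only its shape matters here.
    return 1.0 + math.log10(subjective_years)

ten_years = persuasiveness(10)
million_years = persuasiveness(1_000_000)

print(f"after 10 subjective years:        {ten_years:.1f}")
print(f"after 1,000,000 subjective years: {million_years:.1f}")
print(f"improvement factor: {million_years / ten_years:.1f}x, not 100,000x")
```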
In a fist fight, strategy gives you only a small advantage if your opponent is much stronger. An AI trying to take over the world would have to account for its fragility when fighting humans, who are adapted to living outside the box.
Taking over the world requires either excellent persuasion skills or raw power. That an AI could somehow become good at persuasion, given its huge inferential distance, its lack of direct insight, and its lack of a theory of mind, is in my opinion nearly impossible. And regarding the acquisition of raw power, you will have to show how the AI is likely to obtain it without just conjecturing technological magic.
At the time of the first AI, the global infrastructure will still require humans to keep it running. You need to show that the AI is independent enough of this infrastructure that it can risk the infrastructure’s collapse in a confrontation with humans.
There are a huge number of questions looming in the background. How would the AI hide its motivations and make predictions about human countermeasures? Why would it be given unsupervised control of the equipment necessary to construct molecular factories?
I can of course imagine science fiction stories where an AI does anything. That proves nothing.
I am not going to spend resources on the possibility of catching some exotic disease, even though it could kill me in a horrible way, when there are other more likely risks that could cripple me.
Allow me to make a different analogy. Suppose that someone is planning to build a particle accelerator of unprecedented power. Some experts claim the accelerator is going to create a black hole which will destroy Earth. Other experts think differently. Everyone agrees (in stark contrast to what happened with the LHC) that our understanding of processes at these energies is very poor. In these conditions, do you think it would be a good idea to build the accelerator?
In these conditions, do you think it would be a good idea to build the accelerator?
It would not be a good idea. Ideally, you should then try to raise your confidence that it won’t destroy the world to the point where the expected benefits of building it outweigh the risks. But that’s probably not feasible, and I have no idea where to draw the line.
If you can already build something, and there are good reasons to be cautious, then it has passed the threshold at which I can afford to care, without risking wasting my limited attention on risks approaching Pascal’s-mugging-type scenarios.
I like to compare an extinction-class asteroid, spotted with telescopes and calculated to have a .001 probability of hitting Earth in 2040, with a 50% probability of extinction by unfriendly AI over the same period. The former calculation is based on hard facts and empirical evidence, while the latter is purely inference-based and therefore very unstable.
In other words, one may assign 50% probability to “a coin will come up heads” and “there is intelligent life on other planets,” but one’s knowledge about the two scenarios is different in important ways.
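Here is a toy sensitivity check of what I mean by unstable (all numbers are invented for illustration and are nobody’s actual estimates): an estimate built from a chain of guessed factors swings far more under plausible errors than one built on a direct measurement.

```python
# Asteroid: a single measured probability with a small relative error.
asteroid_point = 0.001
asteroid_low, asteroid_high = asteroid_point / 1.2, asteroid_point * 1.2

# Unfriendly AI: a product of several guessed step-probabilities, each of
# which could plausibly be off by a factor of 3 in either direction.
# (The step values and the error factor are invented for this illustration.)
steps = [0.9, 0.8, 0.9, 0.86]
error_factor = 3

ufai_point, ufai_low, ufai_high = 1.0, 1.0, 1.0
for p in steps:
    ufai_point *= p
    ufai_low *= p / error_factor
    ufai_high *= min(p * error_factor, 1.0)

print(f"asteroid: {asteroid_low:.5f} .. {asteroid_high:.5f} (point {asteroid_point})")
print(f"UFAI:     {ufai_low:.5f} .. {ufai_high:.5f} (point {ufai_point:.2f})")
# The asteroid interval spans a factor of about 1.4; the UFAI interval spans
# more than two orders of magnitude. That spread is what "brittle" means here.
```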
ETA:
Suppose there are 4 risks. One mundane risk has a probability of 1/10, and you assign 20 utils to its prevention. Another, less likely risk has a probability of 1/100, but you assign 1000 utils to its prevention. Yet another risk is very unlikely, having a probability of 1/1000, but you assign 1 million utils to its prevention. The fourth risk is extremely unlikely, having a probability of 10^-10000, but you assign 10^10006 utils to its prevention. All else equal, which one would you choose to prevent and why?
If you wouldn’t choose risk 4, then why wouldn’t the same line of reasoning, or intuition, be similarly valid in choosing risk 1 over risks 2 or 3? And if you would choose risk 4, do you also give money to a Pascalian mugger?
The important difference between an AI risk charity and a deworming charity can’t be its expected utility, because that results in Pascal’s mugging. Nor can the difference be that deworming is more probable than AI risk, because that argument also works against deworming: just choose a cause that is even more probable than deworming.
And in case you are saying that AI risk is the most probable underfunded risk, then what is the greatest lower bound for “probable” here, and how do you formally define it? In other words, “probable” in conjunction with “underfunded” doesn’t work either, because any case of Pascal’s mugging is underfunded as well. You’d have to formally define and justify some well-grounded minimum for “probable”.
Unfriendly AI does not pass this threshold. The probability of unfriendly AI is too low, and the evidence is too “brittle”.
Earlier you said: “People already suck at telling whether Vitamin D is good for you, yet some people seem to believe that they can have non-negligible confidence about the power and behavior of artificial general intelligence.” Now you’re making high-confidence claims about AGI. Also, I remind you that the discussion started from my criticism of the proposed AGI safety protocols. If there is no UFAI risk, then the safety protocols are pointless.
In other words, one may assign 50% probability to “a coin will come up heads” and “there is intelligent life on other planets,” but one’s knowledge about the two scenarios is different in important ways.
Not in ways that have to do with expected utility calculation.
Suppose there are 4 risks. One mundane risk has a probability of 1/10, and you assign 20 utils to its prevention. Another, less likely risk has a probability of 1/100, but you assign 1000 utils to its prevention. Yet another risk is very unlikely, having a probability of 1/1000, but you assign 1 million utils to its prevention. The fourth risk is extremely unlikely, having a probability of 10^-10000, but you assign 10^10006 utils to its prevention. All else equal, which one would you choose to prevent and why?
Risk 4, since it corresponds to the highest expected utility.
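For concreteness, the arithmetic behind that answer, using exact rational arithmetic because 10^-10000 underflows ordinary floating point:

```python
from fractions import Fraction

# Expected utility of preventing each risk from the thought experiment:
# probability of the risk times the utils assigned to its prevention.
risks = {
    "risk 1": (Fraction(1, 10), 20),
    "risk 2": (Fraction(1, 100), 1_000),
    "risk 3": (Fraction(1, 1_000), 10**6),
    "risk 4": (Fraction(1, 10**10_000), 10**10_006),
}

for name, (probability, utils) in risks.items():
    print(f"{name}: expected utility = {probability * utils}")
# Prints 2, 10, 1000, and 1000000: risk 4 has the highest expected utility,
# since 10^-10000 * 10^10006 = 10^6.
```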
And if you would choose risk 4, do you also give money to a Pascalian mugger?
My utility function is bounded (I think), so you can only Pascal-mug me so much.
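A sketch of what the bound does to a mugger’s offer; the bound and the credence below are arbitrary values chosen for illustration:

```python
# With a bounded utility function, the promised payoff is clipped at the
# bound, so inflating the promise stops raising the expected value of paying.
U_MAX = 10**9       # utility bound (arbitrary, for illustration)
p_mugger = 1e-20    # credence that the mugger can actually deliver (arbitrary)

def expected_gain(probability, promised_utils, bound=U_MAX):
    return probability * min(promised_utils, bound)

for promise in (10**6, 10**30, 10**1000):
    print(f"promised 10^{len(str(promise)) - 1} utils: "
          f"expected gain = {expected_gain(p_mugger, promise):.2e}")
# The expected gain never exceeds p_mugger * U_MAX = 1e-11, no matter how
# extravagant the promise becomes.
```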
And in case you are saying that AI risk is the most probable underfunded risk...
I have no idea whether it is underfunded. I can try to think about it, but it has little to do with the present discussion.