(These are my personal views and do not reflect MIRI’s official position; I don’t even work there anymore.)
The concept of an intelligence explosion, which is itself a logical implication, should not be used to make further inferences and estimations without additional evidence.
Not sure how to interpret this. What does the “further inferences and estimations” refer to?
The likelihood of a gradual and controllable development versus the likelihood of an intelligence explosion.
See this comment for references to sources that discuss this.
But note that an intelligence explosion is sufficient but not necessary for AGI to be risky: just because development is gradual doesn’t mean that it will be safe. The Chernobyl power plant was the result of gradual development in nuclear engineering. Countless other disasters have likewise been caused by technologies that were developed gradually.
The likelihood of unfriendly AI versus friendly AI as the outcome of practical AI research.
Hard to say for sure, but note that few technologies are safe unless people work to make them safe, and the more complex the technology, the more effort is needed to ensure that no unexpected situations crop up where it turns out to be unsafe after all. See also section 5.1.1. of Responses to Catastrophic AGI Risk for a brief discussion about various incentives that may pressure people to deploy increasingly autonomous AI systems into domains where their enemies or competitors are doing the same, even if it isn’t necessarily safe.
The ability of superhuman intelligence and cognitive flexibility as characteristics alone to constitute a serious risk given the absence of enabling technologies like advanced nanotechnology.
We’re already giving computers considerable power in the economy, even without nanotechnology: see automated stock trading (and the resulting 2010 Flash Crash), various military drones, visions for replacing all cars (and ships) with self-driving ones, the amount of purchases that are carried out electronically via credit/debit cards or PayPal versus the ones that are done in old-fashioned cash, and so on and so on. See also section 2.1. of Responses to Catastrophic AGI Risk, as well as the previously mentioned section 5.1.1., for some discussion of why these trends are only likely to continue.
That some highly intelligent people who are aware of the Singularity Institute’s position do not accept it.
Expert disagreement is a viable reason to put reduced weight on the arguments, true, but this bullet point doesn’t indicate exactly what parts they disagree on. So it’s hard to comment further.
How is an AI going to become a master of dark arts and social engineering in order to persuade and deceive humans?
Some possibilities:
It’s built with a general skill-learning capability, and the collected psychology literature, together with the accounts of their lives that people have posted online, is sufficient for building up the skill, especially if it gets to practice enough.
It’s an AI expressly designed for that purpose, because it was developed for political, marketing, or military applications.
It doesn’t and it doesn’t need to, because it does damage via some other (possibly unforeseen) method.
How is an AI going to coordinate a large scale conspiracy or deception, given its initial resources, without making any suspicious mistakes along the way?
This seems to presuppose that the AI is going to coordinate a large-scale conspiracy. Which might happen, or it might not. If it does, possibly the first six AIs that try it make various mistakes and are stopped, but the seventh one learns from their mistakes and does things differently. Or maybe an AI is created by a company like Google that already wields massive resources, so it doesn’t need to coordinate a huge conspiracy to obtain lots of resources. Or maybe the AI is just a really hard worker and sells its services to people and accumulates lots of money and power that way. Or...
This is what frustrates me about a lot of Kruel’s comments: often they seem to be presupposing some awfully narrow and specific scenario, when in reality there are countless different ways by which AIs might become dangerous.
Are those computational resources that can be hacked applicable to improve the general intelligence of an AI?
Nobody knows, but note that this also depends a lot on how you define “general intelligence”. For instance, suppose that if you control five computers rather than just one, you can’t become qualitatively more intelligent, but you can do five times as many things at the same time, and of course require your enemies to knock out five times as many computers if they want to incapacitate you. You can do a lot of stuff with general-purpose hardware, of which improving your own intelligence is but one (albeit very useful) possibility.
Does throwing more computational resources at important problems, like building new and better computational substrates, allow an AI to come up with better architectures so much faster as to outweigh the expenditure of obtaining those resources, without hitting diminishing returns?
This question is weird. “Diminishing returns” just means that if you initially get X units of benefit per unit invested, then at some point you’ll get Y units of benefit per unit invested, where X > Y. But this can still be a profitable investment regardless.
I guess this means something like “will there be a point where it won’t be useful for the AI to invest in self-improvement anymore”. If you frame it that way, the answer is obviously yes: you can’t improve forever. But that’s not an interesting question: the interesting question is whether the AI will hit that point before it has obtained any considerable advantage over humans.
As for that question, well, evolution is basically a brute-force search algorithm that can easily become stuck in local optima, which cannot plan ahead, which has mainly optimized humans for living in a hunter-gatherer environment, and which has been forced to work within the constraints of biological cells and similar building materials. Is there any reason to assume that such a process would have produced creatures with no major room for improvement?
Moravec’s Pigs in Cyberspace is also relevant, the last four paragraphs in particular.
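To make the diminishing-returns point above concrete, here is a toy sketch in Python. The numbers are made up and aren’t meant to model any actual AI: each additional unit of investment yields a smaller benefit than the previous one, yet investing keeps paying off for a long time, because the marginal benefit still exceeds the marginal cost.

```python
# Toy example: diminishing returns that nonetheless remain profitable.
# The marginal benefit of the n-th unit invested shrinks as 100 / sqrt(n),
# while each unit costs a flat 10. All numbers are arbitrary, for illustration only.
from math import sqrt

unit_cost = 10.0
total_benefit = total_cost = 0.0

for n in range(1, 1001):
    marginal_benefit = 100.0 / sqrt(n)
    if marginal_benefit <= unit_cost:  # stop once a unit no longer pays for itself
        break
    total_benefit += marginal_benefit
    total_cost += unit_cost

print(f"kept investing up to unit {n - 1}, net gain = {total_benefit - total_cost:.1f}")
```

The return on each successive unit falls the whole time (X > Y in the terms above), yet the first 99 units are all worth buying and leave a sizable net gain.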
How does an AI brute-force the discovery of unknown unknowns?
Not sure what’s meant by this.
How is an AI going to solve important problems without real-world experimentation and slow environmental feedback?
Your “maybe by simulating the real world environment” is indeed one possible answer. Also, who’s to say that the AI couldn’t do real-world experimentation?
How is an AI going to build new computational substrates and obtain control of those resources without making use of existing infrastructure?
How is an AI going to cloak its actions, i.e. its energy consumption etc.?
A theorem that there likely exists an information-theoretically simple, physically and economically realizable algorithm that can be improved to self-improve explosively. Prove that there likely are no strongly diminishing intelligence returns for additional compute power.
More narrow assumptions with no explanation given. Why isn’t the AI allowed to make use of existing infrastructure? Why does it necessarily need to hide its energy consumption? Why does the AI’s algorithm need to be information-theoretically simple?
The existence of a robot that could navigate autonomously in a real-world environment and survive real-world threats and attacks with approximately the skill of C. elegans. A machine that can quickly learn to play Go on its own, unassisted by humans, and beat the best human players.
Self-driving cars are getting there, as are Go AIs.
Show how something like expected utility maximization would actually work out in practice.
What does this mean? Expected utility maximization is a standard AI technique already.
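For what it’s worth, here is a minimal sketch of expected utility maximization as the standard technique: the agent picks whichever action has the highest probability-weighted utility over its possible outcomes. The actions, probabilities, and utilities below are invented purely for illustration and have nothing to do with any particular AI design.

```python
# Minimal expected utility maximization: score each action by the
# probability-weighted utility of its outcomes, then pick the best one.
# All numbers are invented for the sake of the example.

actions = {
    "take_umbrella":  [(0.3, 5), (0.7, 8)],    # (probability of outcome, utility)
    "leave_umbrella": [(0.3, -10), (0.7, 10)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best_action = max(actions, key=lambda a: expected_utility(actions[a]))
print(best_action)  # -> take_umbrella (expected utility 7.1 vs. 4.0)
```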
Conclusive evidence that current research will actually lead to the creation of superhuman AI designs equipped with the relevant drives that are necessary to disregard any explicit or implicit spatio-temporal scope boundaries and resource limits.
It’s true that this would be nice to have.
What does the “further inferences and estimations” refer to?
Basically the hundreds of hours it would take MIRI to close the inferential distance between them and AI experts. See e.g. this comment by Luke Muehlhauser:
I agree with Eliezer that the main difficulty is in getting top-quality, relatively rational people to spend hundreds of hours being educated, working through the arguments, etc.
If your arguments are this complex then you are probably wrong.
But note that an intelligence explosion is sufficient but not necessary for AGI to be risky: just because development is gradual doesn’t mean that it will be safe.
I do not disagree with that kind of AI risk. If MIRI is working on mitigating AI risks that do not require an intelligence explosion, a certain set of AI drives, and a bunch of (from my perspective) very unlikely developments...then I was not aware of that.
Hard to say for sure, but note that few technologies are safe unless people work to make them safe, and the more complex the technology, the more effort is needed to ensure that no unexpected situations crop up where it turns out to be unsafe after all.
This seems very misleading. We are after all talking about a technology that has to work perfectly well in order to be actively unsafe. You have to get lots of things right, e.g. that the AI cares to take over the world, knows how to improve itself, and manages to hide its true intentions before it can do so, etc.
Expert disagreement is a viable reason to put reduced weight on the arguments, true, but this bullet point doesn’t indicate exactly what parts they disagree on.
There is a reason why MIRI doesn’t know this. Look at the latest interviews with experts conducted by Luke Muehlhauser. He doesn’t even try to figure out if they disagree with Xenu, but only asks uncontroversial questions.
This is what frustrates me about a lot of Kruel’s comments: often they seem to be presupposing some awfully narrow and specific scenario...
Crazy... this is exactly why I am criticizing MIRI: a focus on an awfully narrow and specific scenario rather than on AI risks in general.
...suppose that if you control five computers rather than just one, you can’t become qualitatively more intelligent, but you can do five times as many things at the same time...
Consider that the U.S. had many more and smarter people than the Taliban. The bottom line is that the U.S. had to devote a lot more output per man-hour to defeat a completely inferior enemy; its advantage apparently scaled sublinearly.
I guess this means something like “will there be a point where it won’t be useful for the AI to invest in self-improvement anymore”. If you frame it that way, the answer is obviously yes: you can’t improve forever. But that’s not an interesting question: the interesting question is whether the AI will hit that point before it has obtained any considerable advantage over humans.
I do not disagree that there are minds better at social engineering than that of e.g. Hitler, but I strongly doubt that there are minds which are vastly better. Optimizing a political speech for 10 versus a million subjective years won’t make it one hundred thousand times more persuasive.
Is there any reason to assume that such a process would have produced creatures with no major room for improvement?
The question is whether humans, just because they are much smarter and stronger, can actually wipe out mosquitoes. Well, they can...but it is either very difficult or will harm humans.
Also, who’s to say that the AI couldn’t do real-world experimentation?
You already need to build huge particle accelerators to gain new physical insights, and you need a whole technological civilization in order to build an iPhone. You can’t just get around that easily or overnight.
Everything else you wrote I already discuss in detail in various posts.