This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory, may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. Perhaps we will not be able to build intuition for how a super-intelligence thinks; that is not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just as string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.
This may be retreating to the motte, so to speak, but I don’t think anyone seriously thinks that a superintelligence would be literally impossible to understand. The worry is that there will be such a huge gulf between how superintelligences reason versus how we reason that it would take prohibitively long to understand them.
I think a laptop is a good example. There probably isn’t any single human on earth who knows how to build a modern laptop from scratch. There are computer scientists who know how the operating system is put together—how it is programmed, how memory is written to and retrieved from the various buses; there are other computer scientists and electrical engineers who designed the chips themselves, who arrayed circuits efficiently to dissipate heat and optimize signal latency. Even further, there are materials scientists and physicists who designed the transistors and chip fabrication processes, and so on.
So, as an individual human, I don’t know what it’s like to know everything about a laptop all at once in my head, at a glance. I can zoom in on an individual piece and learn about it, but I don’t know all the nuances for each piece—just a sort of executive summary. The fundamental objects with which I can reason have a sort of characteristic size in mindspace—I can imagine 5, maybe 6 balls moving around with distinct trajectories (even then, I tend to group them into smaller subgroups). But I can’t individually imagine a hundred (I could sit down and trace out the paths of a hundred balls individually, of course, but not all at once).
This is the sense in which a superintelligence could be “dangerously” unpredictable. If the fundamental structures it uses for reasoning greatly exceed a human’s characteristic size of mindspace, it would be difficult to tease out its chain of logic. And this only gets worse the more intelligent it gets.
Now, I’ll grant you that the lesswrong community likes to sweep under the rug the great competition of timescales and “size”-scales that is going on here. It might be prohibitively difficult, for fundamental reasons, to move from working-mind-RAM of size 5 to size 10. It may be that artificial intelligence research progresses so slowly that we never even see an intelligence explosion—just a gently sloped intelligence rise over the next few millennia. But I do think it’s maybe not a mistake, but certainly naive, to just proclaim, “Of course we’ll be able to understand them, we are generalized reasoners!”
Edit: I should add that this is already a problem for, ironically, computer-assisted theorem proving. If a computer produces a 10,000,000 page “proof” of a mathematical theorem (i.e., something far longer than any human could check by hand), you’re putting a huge amount of trust in the correctness of the theorem-proving-software itself.
No, you just need to trust a proof-checking program, which can be quite small and simple, in contrast with the theorem proving program, which can be arbitrarily complex and obscure.
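This is sometimes called the de Bruijn criterion: the checker, not the prover, is the only thing you have to trust. Purely as an illustrative sketch (my own toy encoding, not any real system’s kernel), here is a checker for Hilbert-style propositional proofs; note that it stays this small no matter how enormous or how cleverly generated the proof it is asked to verify.

```python
# Toy proof checker: each proof line must be an instance of one of two axiom
# schemas or follow from two earlier lines by modus ponens. Formulas are
# either variable names (strings) or implications encoded as ("->", a, b).

def is_axiom(f):
    """Instance of  A -> (B -> A)  or  (A -> (B -> C)) -> ((A -> B) -> (A -> C))."""
    if not (isinstance(f, tuple) and f[0] == "->"):
        return False
    _, a, b = f
    # Schema 1: A -> (B -> A)
    if isinstance(b, tuple) and b[0] == "->" and b[2] == a:
        return True
    # Schema 2: (A -> (B -> C)) -> ((A -> B) -> (A -> C))
    return (isinstance(a, tuple) and a[0] == "->"
            and isinstance(a[2], tuple) and a[2][0] == "->"
            and isinstance(b, tuple) and b[0] == "->"
            and b[1] == ("->", a[1], a[2][1])
            and b[2] == ("->", a[1], a[2][2]))

def check(proof):
    """Return True iff every line is an axiom instance or follows by modus ponens."""
    proved = []
    for f in proof:
        if not (is_axiom(f) or any(g == ("->", h, f) for g in proved for h in proved)):
            return False
        proved.append(f)
    return True

# The classic five-line derivation of  p -> p  from the two schemas:
p = "p"
proof = [
    ("->", ("->", p, ("->", ("->", p, p), p)),
     ("->", ("->", p, ("->", p, p)), ("->", p, p))),  # schema 2 instance
    ("->", p, ("->", ("->", p, p), p)),               # schema 1 instance
    ("->", ("->", p, ("->", p, p)), ("->", p, p)),    # modus ponens (1, 2)
    ("->", p, ("->", p, p)),                          # schema 1 instance
    ("->", p, p),                                     # modus ponens (3, 4)
]
print(check(proof))  # True
```

Real proof assistants follow the same division of labour: an arbitrarily elaborate search procedure produces the proof, and a small kernel re-checks every step.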
Isn’t using a laptop as a metaphor exactly an example of “Most often reasoning by analogy”?
I think one of the points being made was that because we have this uncertainty about how a superintelligence would work, we can’t accurately predict anything without more data.
So maybe the next step in AI should be to create an “Aquarium,” a self-contained network with no actuators and no way to access the internet, but enough processing power to support a superintelligence. We then observe what that superintelligence does in the aquarium before deciding how to resolve further uncertainties.
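Just to make the interface concrete, here is a minimal sketch of what “observe only, no actuators” might look like (the names here are hypothetical, and real containment would have to be enforced at the network/OS/hardware level, not in application code):

```python
# Toy sketch of an "observe only, no actuators" harness. Hypothetical names;
# real isolation must come from the infrastructure (air-gapped network,
# no peripherals), not from this code.

from typing import Callable, List, Tuple

Observation = str
ProposedAction = str

def run_in_aquarium(agent: Callable[[Observation], ProposedAction],
                    environment: List[Observation]) -> List[Tuple[Observation, ProposedAction]]:
    """Feed the agent simulated observations and record what it *would* do.

    Proposed actions are logged for human study but never executed:
    nothing here touches the network, the filesystem, or any actuator.
    """
    transcript = []
    for obs in environment:
        proposal = agent(obs)  # the agent only ever returns data
        transcript.append((obs, proposal))
    return transcript

# Example with a stand-in "agent":
if __name__ == "__main__":
    toy_agent = lambda obs: f"(would respond to {obs!r})"
    for obs, proposal in run_in_aquarium(toy_agent, ["sensor ping", "user query"]):
        print(obs, "->", proposal)
```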
There is a difference between argument by analogy and using an example. The relevant difference here is that examples illustrate arguments that are made separately, like how calef spent paragraphs 4 and 5 restating the arguments sans laptop.
If anything, the argument from analogy here is in the comparison between human working memory and computer RAM and a nebulous “size in mindspace,” because it is used as an important part of the argument but is not supported separately. But don’t fall for the fallacy fallacy—just because something isn’t modus ponens doesn’t mean it can’t be Bayesian evidence.
Isn’t using a laptop as a metaphor exactly an example
The sentence could have stopped there. If someone makes a claim like “∀ x, p(x)”, it is entirely valid to disprove it via “~p(y)”, and it is not valid to complain that the first proposition is general but the second is specific.
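For instance, in Lean (a generic illustration of the quantifier point, nothing specific to this thread), a single counterexample is a complete refutation of a universal claim:

```lean
-- One counterexample (n = 0) refutes the universal claim that every Nat is positive.
example : ¬ (∀ n : Nat, 0 < n) :=
  fun h => Nat.lt_irrefl 0 (h 0)
```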
Moving from the general to the specific myself, that laptop example is perfect. It is utterly baffling to me that people can insist we will be able to safely reason about the safety of AGI when we have yet to so much as produce a consumer operating system that is safe from remote exploits or crashes. Are Microsoft employees uniquely incapable of “fully general intelligent behavior”? Are the OpenSSL developers especially imperfectly “capable of understanding the logical implications of models”?
If you argue that it is “nonsense” to believe that humans won’t naturally understand the complex things they devise, then that argument fails to predict the present, much less the future. If you argue that it is “nonsense” to believe that humans can’t eventually understand the complex things they devise after sufficient time and effort, then that’s more defensible, but that argument is pro-FAI-research, not anti-.
Flaws in computer operating systems do not make them do arbitrary things unless someone consciously exploits those flaws to make them do arbitrary things. If Windows were a metaphor for unfriendly AI, then it would be possible for AIs to halt in situations where they were intended to work, but they would only turn hostile if someone intentionally programmed them to become hostile. Unfriendly AI as discussed here is not a matter of someone intentionally programming the AI to become hostile.
Isn’t using a laptop as a metaphor exactly an example of “Most often reasoning by analogy”?
Precisely correct, thank you for catching that.
I think one of the points being made was that because we have this uncertainty about how a superintelligence would work, we can’t accurately predict anything without more data.
Also a correct reading of my intent. The “aquarium” idea is basically what I have advocated and would continue to advocate for: continue developing AGI technology within the confines of a safe experimental setup. By learning more about the types of programs which can perform limited general intelligence tasks in sandbox environments, we learn more about their various strengths and limitations in context, and from that experience we can construct suitable safeguards for larger deployments.
The worry is that there will be such a huge gulf between how superintelligences reason versus how we reason that it would take prohibitively long to understand them.
That may be a valid concern, but it requires evidence as it is not the default conclusion. Note that quantum physics is sufficiently different that human intuitions do not apply, but it does not take a physicist a “prohibitively long” time to understand quantum mechanical problems and their solutions.
As to your laptop example, I’m not sure what you are attempting to prove. Even if no single engineer understands how every component of a laptop works, we are nevertheless very much able to reason about the systems-level operation of laptops, or the development trajectory of the global laptop market. When there are issues, we are able to debug and fix them in context. If anything, the example shows how humanity as a whole is able to complete complex projects like the creation of a modern computational machine without being constrained to any one individual understanding the whole.
Edit: gaaaah. Thanks Sable. I fell for the very trap of reasoning by analogy I opined against. Habitual modes of thought are hard to break.
As far as I can tell, you’re responding to the claim, “A group of humans can’t figure out complicated ideas given enough time.” But this isn’t my claim at all. My claim is, “One or many superintelligences would be difficult to predict/model/understand because they have a fundamentally more powerful way to reason about reality.” This is trivially true once the number of machines which are “smarter” than humans exceeds the total number of humans. The extent to which it is difficult to predict/model the “smarter” machines is a matter of contention. The precise number of “smarter” machines and how much “smarter” they need be before we should be “worried” is also a matter of contention. (How “worried” we should be is a matter of contention!)
But all of these points of contention are exactly the sorts of things that people at MIRI like to think about.
One or many superintelligences would be difficult to predict/model/understand because they have a fundamentally more powerful way to reason about reality.
Whatever reasoning technique is available to a super-intelligence is available to humans as well. No one is mandating that humans who build an AGI check their work with pencil and paper.
I mean, sure, but this observation (i.e., “We have tools that allow us to study the AI”) is only helpful if your reasoning techniques allow you to keep the AI in the box.
Which is, like, the entire point of contention, here (i.e., whether or not this can be done safely a priori).
I think that you think MIRI’s claim is “This cannot be done safely.” And I think your claim is “This obviously can be done safely” or perhaps “The onus is on MIRI to prove that this cannot be done safely.”
But, again, MIRI’s whole mission is to figure out the extent to which this can be done safely.