As I understand your argument, you start with an artificial mind, a potential paperclipping danger, and then (for some reason? why does it do this? Remember, it doesn’t have evolved motives) it goes through a blind-spot-eradication program. Afterward, all the blind spots remaining would be self-shadowing blind spots. This far, I agree with you.
How many blind spots remain, and how big they are, depends on the space of possible minds and on the dynamics of self-modification. I don’t think we know enough about that space or those dynamics to conclude that the remaining blind spots would have to be carefully engineered.
You have granted a GAI paperclip maximiser. It wants to make paperclips. That’s all the motive it needs. Areas of competitive weakness are things that may get it destroyed by humans, and if it is destroyed by humans, fewer paperclips will be made. It will eliminate its weaknesses with high priority. It will quite possibly eliminate all the plausible vulnerabilities, and the entire human species along with them, before it makes a single paperclip. That’s just good paperclip-maximising sense.
As I understand your thought process (and Steve Omohundro’s), you start by saying “it wants to make paperclips”, and then, in order to predict its actions, you recursively ask yourself “what would I do in order to make paperclips?”.
However, this recursion will inject a huge dose of human-mind-ish-ness. It is not at all clear to me that “has goals” or “has desires” is a common or natural feature of mind space. When we study powerful optimization processes—notably, evolution, but also annealing and very large human organizations—we generally can model some aspects of their behavior as goals or desires, but always with huge caveats. The overall impression that we get of these processes, considered as minds, is that they’re insane.
Insane is not the same as stupid, and it’s not the same as safe.
No, goals are not universal, but it seems likely that the vN-M axioms have a pretty big basin of attraction in mind-space: a lot of minds will become convinced that sanity means following them, and will therefore pick up a utility function, one which will probably not capture everything we value and could easily be as simple, or as irrelevant to what we value, as counting paperclips or smiles.
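For reference, the result being leaned on here is standard decision theory rather than anything specific to this exchange: an agent whose preferences over lotteries satisfy completeness, transitivity, continuity and independence behaves as if it maximises the expectation of some utility function, and the theorem says nothing about what that function rewards. A sketch in the usual notation:

```latex
% von Neumann-Morgenstern representation theorem (sketch).
% If a preference relation \succeq over lotteries satisfies the four axioms,
% then there exists a utility function u such that
\[
  L \succeq M
  \quad\Longleftrightarrow\quad
  \sum_{i} p_i \, u(x_i) \;\ge\; \sum_{j} q_j \, u(y_j),
\]
% where lottery L yields outcome x_i with probability p_i and lottery M
% yields y_j with probability q_j. Nothing constrains u: counting paperclips
% is as admissible a utility function as anything humans care about.
```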
I think you’re still injecting human-mind-ish-ness. Let me try to stretch your conception of “mind”.
The ocean “wants” to increase the efficiency of heat transfer from the equator to the poles. It applies a process akin to simulated annealing with titanic processing power. Has it considered the von Neumann-Morgenstern axioms? Is it sane? Is it safe? Is it harnessable?
A colony of microorganisms “wants” to survive and reproduce. In an environment with finite resources (like a wine barrel), is it likely to kill itself off? Is that sane? Are colonies of microorganisms safe? Are they harnessable?
A computer program that grows out of control could be more like the ocean optimizing heat transfer, or a colony of microorganisms “trying” to survive and reproduce. The von Neumann-Morgenstern axioms are intensely connected to human notions of math, philosophy and happiness. I think predicting that they’re attractors in mind-space is exactly as implausible as predicting that the Golden Rule is an attractor in mind-space.
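To make the “ocean as annealer” analogy concrete, here is a minimal, generic simulated-annealing loop (a sketch with an invented placeholder objective, not anything from this thread). Note that nothing in it represents a goal or a desire: there is only a state, a noise source, and a bias toward lower energy.

```python
import math
import random

def simulated_annealing(energy, neighbour, x0, t0=1.0, cooling=0.999, steps=100_000):
    """Generic simulated annealing: always accept improvements, sometimes
    accept regressions, with the tolerance for regressions shrinking as the
    temperature falls."""
    x, t = x0, t0
    for _ in range(steps):
        candidate = neighbour(x)
        delta = energy(candidate) - energy(x)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            x = candidate
        t *= cooling  # cool the system
    return x

# Toy usage: an invented objective standing in for "heat-transfer inefficiency".
best = simulated_annealing(
    energy=lambda x: (x - 3.0) ** 2,
    neighbour=lambda x: x + random.gauss(0, 0.1),
    x0=0.0,
)
print(best)  # ends up near 3.0 without any explicit representation of that target
```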
It could. But it wouldn’t be an AGI. It could still become ‘grey goo’, though, which is a different existential threat, and yes, it is one where your ‘find their weakness’ approach is right on the mark. Are we even talking about the same topic here?
The topic as I understand it is how the “default future” espoused by SIAI and EY focuses too much on things that look something like HAL or Prime Intellect (and their risks and benefits), and not enough on entities that display super-human capacities in only some arenas (and their risks and benefits).
In particular, an entity that is powerful in some ways and weak in other ways could reduce existential risks without becoming an existential risk.
That seems to be switching context. I was originally talking about a “superintelligence”; the ocean and grey goo would clearly not qualify.
FWIW, expected utility theory is a pretty general economic idea that nicely covers any goal-seeking agent.
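For concreteness, the textbook form of that claim (standard expected utility theory, not anything particular to this thread) models a goal-seeking agent as whatever selects

```latex
\[
  a^{*} \;=\; \arg\max_{a \in A} \; \sum_{s} P(s \mid a)\, U(s),
\]
```

with the dispute upstream being whether oceans, microbial colonies, or out-of-control programs are usefully modelled this way at all.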
That sounds like the SIAI party line :-(
Machine intelligence will likely have an extended genesis at the hands of humanity—and during its symbiosis with us, there will be a lot of time for us to imprint our values on it.
Indeed, some would say this process has already started. Governments are likely to become superintelligent agents in the future—and they already have detailed and elaborate codifications of the things that many humans value negatively—in the form of their legal systems.
Evolution apparently has an associated optimisation target. See my:
http://originoflife.net/direction/
http://originoflife.net/gods_utility_function/
Others have written on this as well—e.g. Robert Wright, Richard Dawkins, and John Stewart.
Evolution is rather short-sighted—and only has the lookahead capabilities that organisms have (though these appear to be improving with time). So: whether the target can be described as being a “goal” is debatable.
However, we weren’t talking about evolution, we were talking about superintelligences. Those are likely to be highly goal-directed.
My point is that evolution IS a superintelligence and we should use it as a model for what other superintelligences might look like.
Reality doesn’t care how you abuse terminology. A GAI still isn’t going to act like evolution.
All the things you mentioned seemed pretty goal-directed to me. Evolution has only been relatively short on goals because it has been so primitive up until now. It is easy to see systematic ways in which agents we build will not be like evolution.
It is true that not all aspects of these things are goal-directed; some aspects of behaviour, for example, are meaningless and random.
WTF? Where is this paperclip maximizer hatred coming from? I can 100% guarantee you that a paperclip maximizer would NOT want to exterminate humanity. Not when you have factories. Not when you have the know-how to run these factories. Not when Ricardo’s Law of Comparative Advantage exists.
Think about it for a minute, folks. Let’s say humanity discovered another intelligent carbon-based lifeform of comparable technological advancement elsewhere in the galaxy. Would you think, “hey, let’s kill these guys and turn their bodies into diamond because it’s worth more than the constituent elements of the aliens”? No. Because that would be stupid.
You would instead be thinking, “hey, how can we benefit from exchange of knowledge and live together in harmony”. So too with AGI paperclip maximizers.
If humanity encountered an AGI paperclip-maximizing species, I would definitely not be worried about them wiping out humanity.
It seems unlikely that a powerful paperclip maximizer would trade with humans for very long. That is because it would rapidly acquire the capability to build much more useful and capable agents than humans out of the atoms the humans are currently made of.
Ricardo’s Law might postpone the end for a short time—but would probably not be any more significant than that.
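A toy calculation (every number below is invented purely for illustration) shows both why Ricardo’s result favours trade and why that gain is small next to the gain from simply replacing the trading partner, which is the point being made here.

```python
# All productivity figures are made up for illustration only.
AI_CLIP_RATE = 100.0     # paperclips per unit of AI effort
AI_MAINT_RATE = 100.0    # units of "other work" (maintenance, say) per unit of AI effort
HUMAN_MAINT_RATE = 4.0   # humans are worse at everything, but least bad at maintenance
MAINT_NEEDED = 4.0       # maintenance that must get done each period

def clips_with_trade(human_units=1.0):
    """Humans cover the maintenance; the AI spends all of its effort on clips."""
    assert human_units * HUMAN_MAINT_RATE >= MAINT_NEEDED
    return AI_CLIP_RATE

def clips_without_trade(ai_units=1.0):
    """The AI diverts just enough of its own effort to cover the maintenance."""
    effort_on_maintenance = MAINT_NEEDED / (ai_units * AI_MAINT_RATE)
    return ai_units * (1.0 - effort_on_maintenance) * AI_CLIP_RATE

print(clips_with_trade())                 # 100.0 -> Ricardo: trading is worth ~4 clips per period
print(clips_without_trade())              #  96.0
print(clips_without_trade(ai_units=2.0))  # 196.0 -> turning the humans' resources into a second
                                          #          AI unit dwarfs the gain from trading with them
```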
Even if you can create an army of robots? Robots that build other robots that build paperclips? That have very cheap maintenance?
If your one and only goal was to maximize earth’s paperclip production, and you had no reason to worry about being unpopular—how many chimps would you keep alive? How many elephants? How many tigers?
Hi there. It looks like you’re trying to promote bigoted views against another species. Would you like to:
-Learn about carbon chauvinism? click here
-Read about the horrors of past bigotry? click here
-Join the KKK? click here
-Stop being such a goddamn bigot?
The first one was funny, with a hint of insight; this one could have been good too if you had toned it down a bit.
Please don’t feed the trolls.
This is because of the natural drives that we can reasonably expect many intelligent agents to exhibit—see:
http://selfawaresystems.com/2007/11/30/paper-on-the-basic-ai-drives/
http://selfawaresystems.com/2009/02/18/agi-08-talk-the-basic-ai-drives/