As I understand your thought process (and Steve Omohundro’s), you start by saying “it wants to make paperclips”, and then, in order to predict its actions, you recursively ask yourself “what would I do in order to make paperclips?”.
However, this recursion will inject a huge dose of human-mind-ish-ness. It is not at all clear to me that “has goals” or “has desires” is a common or natural feature of mind space. When we study powerful optimization processes—notably, evolution, but also annealing and very large human organizations—we generally can model some aspects of their behavior as goals or desires, but always with huge caveats. The overall impression that we get of these processes, considered as minds, is that they’re insane.
Insane is not the same as stupid, and it’s not the same as safe.
No, goals are not universal, but it seems likely that the vN-M axioms have a pretty big basin of attraction in mind-space: a lot of minds will become convinced that sanity means following them, and so will pick up a utility function, one which will probably not capture everything we value and could easily be as simple, or as irrelevant to what we value, as counting paperclips or smiles.
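For reference, the vN-M result being leaned on here, in its standard textbook form, is roughly the following (a sketch, not anything specific to this discussion):

```latex
% von Neumann-Morgenstern representation theorem (standard statement):
% if a preference relation \succeq over lotteries satisfies completeness,
% transitivity, continuity and independence, then there is a utility
% function u, unique up to positive affine rescaling, such that
\[
  L \succeq L'
  \quad\Longleftrightarrow\quad
  \sum_i p_i\, u(x_i) \;\ge\; \sum_j p'_j\, u(x'_j),
\]
% where L assigns probability p_i to outcome x_i and L' assigns p'_j to x'_j.
% "Picking up a utility function" then means ending up with some such u,
% whatever that u happens to reward.
```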
I think you’re still injecting human-mind-ish-ness. Let me try to stretch your conception of “mind”.
The ocean “wants” to increase the efficiency of heat transfer from the equator to the poles. It applies a process akin to simulated annealing with titanic processing power. Has it considered the von Neumann-Morgenstern axioms? Is it sane? Is it safe? Is it harnessable?
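Concretely, simulated annealing is roughly the following; this is a minimal, hypothetical sketch in which the energy function, the neighbour rule and the cooling schedule are all placeholders, not a model of an ocean:

```python
import math
import random

def simulated_annealing(energy, neighbour, x0, t0=1.0, cooling=0.999, steps=100_000):
    """Minimise `energy` by repeatedly proposing a neighbouring state and
    accepting it with the Metropolis rule. Nothing in the state represents
    a goal; the pull toward low energy lives entirely in the acceptance rule."""
    x, temperature = x0, t0
    e = energy(x)
    for _ in range(steps):
        candidate = neighbour(x)
        e_new = energy(candidate)
        # Always accept improvements; accept worsenings with a probability
        # that shrinks as the temperature cools.
        if e_new <= e or random.random() < math.exp((e - e_new) / temperature):
            x, e = candidate, e_new
        temperature *= cooling
    return x, e

# Toy usage: wander toward a low point of a bumpy 1-D "energy landscape".
if __name__ == "__main__":
    bumpy = lambda x: x * x + 3.0 * math.sin(5.0 * x)    # placeholder energy
    jitter = lambda x: x + random.uniform(-0.5, 0.5)     # placeholder neighbour
    print(simulated_annealing(bumpy, jitter, x0=10.0))
```

The point of the analogy is only that such a process optimizes without representing a goal anywhere; whether that is a fair model of an ocean, or of a runaway program, is exactly what is in dispute here.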
A colony of microorganisms “wants” to survive and reproduce. In an environment with finite resources (like a wine barrel), is it likely to kill itself off? Is that sane? Are colonies of microorganisms safe? Are they harnessable?
A computer program that grows out of control could be more like the ocean optimizing heat transfer, or a colony of microorganisms “trying” to survive and reproduce. The von Neumann-Morgenstern axioms are intensely connected to human notions of math, philosophy and happiness. I think predicting that they’re attractors in mind-space is exactly as implausible as predicting that the Golden Rule is an attractor in mind-space.
It could. But it wouldn’t be an AGI. It could still become ‘grey goo’, though, which is a different existential threat, and yes, it is one where your ‘find their weakness’ point is right on the mark. Are we even talking about the same topic here?
The topic as I understand it is how the “default future” espoused by SIAI and EY focuses too much on things that look something like HAL or Prime Intellect (and their risks and benefits), and not enough on entities that display super-human capacities in only some arenas (and their risks and benefits).
In particular, an entity that is powerful in some ways and weak in other ways could reduce existential risks without becoming an existential risk.
That seems to be switching context. I was originally talking about a “superintelligence”; the ocean and grey goo would clearly not qualify.
FWIW, expected utility theory is a pretty general economic idea that nicely covers any goal-seeking agent.
That sounds like the SIAI party line :-(
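For what it means for expected utility theory to “cover” a goal-seeking agent, here is a minimal sketch of the recipe: weight each outcome’s utility by its probability and take the action with the highest expectation. The actions, probabilities and utilities below are made-up placeholders:

```python
# Expected-utility choice: EU(a) = sum over outcomes o of P(o | a) * U(o).
# Every number here is an illustrative placeholder, not a model of any real agent.

lotteries = {
    # action: list of (probability, utility) pairs
    "press_lever": [(0.7, 10.0), (0.3, -5.0)],
    "do_nothing":  [(1.0, 0.0)],
    "gamble":      [(0.1, 100.0), (0.9, -20.0)],
}

def expected_utility(lottery):
    return sum(p * u for p, u in lottery)

best = max(lotteries, key=lambda action: expected_utility(lotteries[action]))
print(best, expected_utility(lotteries[best]))
# EU(press_lever) = 5.5, EU(do_nothing) = 0.0, EU(gamble) = -8.0,
# so "press_lever" is chosen.
```

The generality being claimed is that any agent whose choices satisfy the vN-M axioms acts as if it were running some such calculation for some utility function; the framework is silent about what that function rewards.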
Machine intelligence will likely have an extended genesis at the hands of humanity—and during its symbiosis with us, there will be a lot of time for us to imprint our values on it.
Indeed, some would say this process has already started. Governments are likely to become superintelligent agents in the future—and they already have detailed and elaborate codifications of the things that many humans value negatively—in the form of their legal systems.
Evolution apparently has an associated optimisation target. See my:
http://originoflife.net/direction/
http://originoflife.net/gods_utility_function/
Others have written on this as well—e.g. Robert Wright, Richard Dawkins, John Stewart; one standard formalisation of the idea is sketched below.
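One standard way to make “evolution has an associated optimisation target” precise, offered as an illustration rather than as a summary of the linked pages, is the replicator dynamics, under which mean fitness never decreases:

```latex
% Replicator dynamics for types i with frequencies x_i (\sum_i x_i = 1)
% and fixed fitnesses f_i; mean fitness \bar{f} = \sum_j x_j f_j:
\[
  \dot{x}_i = x_i\,(f_i - \bar{f}),
  \qquad
  \frac{d\bar{f}}{dt}
  = \sum_i f_i\,\dot{x}_i
  = \sum_i x_i f_i^{\,2} - \bar{f}^{\,2}
  = \operatorname{Var}(f) \;\ge\; 0.
\]
% In this idealised setting selection never decreases mean fitness, which is
% one formal sense of an "optimisation target"; with changing or
% frequency-dependent fitness the target drifts, which is the
% short-sightedness noted just below.
```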
Evolution is rather short-sighted—and only has the lookahead capabilities that organisms have (though these appear to be improving with time). So whether the target can be described as a “goal” is debatable.
However, we weren’t talking about evolution, we were talking about superintelligences. Those are likely to be highly goal-directed.
My point is that evolution IS a superintelligence and we should use it as a model for what other superintelligences might look like.
Reality doesn’t care how you abuse terminology. An AGI still isn’t going to act like evolution.
All the things you mentioned seemed pretty goal-directed to me. Evolution has been relatively short on goals only because it has been so primitive up until now. It is easy to see systematic ways in which the agents we build will not be like evolution.
It is true that not all aspects of these things are goal-directed; some aspects of behaviour, for example, are meaningless and random.