I certainly agree that creating an optimization process which provably advances a set of values under a wide variety of taxing circumstances is hard.
We know that perfect solutions to even quite simple optimization problems are computationally hard in a different sense: many of them are NP-hard, and we have good reason to suspect that this is an essential property of reality and that we will never be able to solve such problems efficiently. The kinds of problems we are talking about seem likely to be harder still. In other words, if (and it is a big if) it is possible to create an optimization process that provably advances a set of values (let’s call it ‘friendly’), it is unlikely to be a perfect optimization process. It seems likely to me that such ‘friendly’ optimization processes will represent a subset of all possible optimization processes and that it is quite likely that some ‘non-friendly’ optimization processes will be better optimizers. I see no reason to suppose that the most effective optimizers will happily fall into the ‘friendly’ subset.
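To make the hardness point concrete, here is a toy sketch (my own illustrative example, not part of the original argument) using the 0/1 knapsack problem, about as simple as optimization problems get, yet exact solution is NP-hard. The brute-force search below examines all 2^n subsets, so it becomes infeasible long before n reaches the size of any realistic planning problem:

```python
# Toy illustration: exact 0/1 knapsack by brute force.
# The exact search enumerates all 2^n subsets of items, so the
# running time grows exponentially with n.
from itertools import combinations

def knapsack_exact(items, capacity):
    """items: list of (value, weight) pairs; returns the best total
    value achievable without exceeding capacity."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            weight = sum(w for _, w in subset)
            if weight <= capacity:
                best = max(best, sum(v for v, _ in subset))
    return best

items = [(60, 10), (100, 20), (120, 30)]  # (value, weight)
print(knapsack_exact(items, 50))  # -> 220
```

An agent that must act under resource and time limits will necessarily use approximations and heuristics for problems like this, which is the sense in which any realizable optimization process, friendly or not, falls short of perfection.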
The intelligence explosion means that a properly designed AI will be able to quickly take control of its immediate environment.
I don’t consider this hypothesis proved or self-evident. It is at least plausible, but I can think of lots of reasons why it might not be true. Taking an outside view, we do not see much evidence from evolution or human societies of ‘winner takes all’ being a common outcome (we see much diversity in nature and human society), nor of first-mover advantage always leading to an insurmountable lead. And yes, I know there are lots of reasons why ‘self-improving AI is different’, but I don’t consider the matter settled. It is a realistic enough concern for me to broadly support SIAI’s efforts, but it is by no means the only possible outcome.
Why would an AI with a stable goal allow another mind to be created, gain resources and threaten it? It wouldn’t. It would crush all possible rivals, because to do otherwise is to invite disaster.
Why does any goal-directed agent ‘allow’ other agents to conflict with its goals? Because it isn’t strong enough to prevent them. We know of no counterexamples in all of history to the hypothesis that all goal-directed agents have limits. This does not rule out the possibility that a self-improving AI would be the first counterexample, but neither does it make me as sure of that claim as many here seem to be.
But true AIs, of the type that Eliezer wants to build, will not run into the problems you would expect from the majority of minds.
I understand the claim. I am not yet convinced it is possible or likely.
It seems likely to me that such ‘friendly’ optimization processes will represent a subset of all possible optimization processes and that it is quite likely that some ‘non-friendly’ optimization processes will be better optimizers.
I agree that human values are unlikely to be the easiest to maximize. However, for another mind to optimize our universe, it needs to be created. This is why SIAI advocates creating an AI friendly to humans before other optimization processes are created.
It seems to me that your true objection to what I am saying is contained within the statement that “it is at the very least possible for an intelligence to not take over its immediate environment before another, with possibly inimical goals, is created.” Does this agree with your assessment? Would a convincing argument for the intelligence explosion cause you to change your mind?
It seems to me that your true objection to what I am saying is contained within the statement that “it is at the very least possible for an intelligence to not take over its immediate environment before another, with possibly inimical goals, is created.” Does this agree with your assessment?
More or less, though I actually lean towards it being likely rather than merely possible. I am also making the related claim that a widely spatially dispersed entity with a single coherent goal system may be a highly unstable configuration.
Would a convincing argument for the intelligence explosion cause you to change your mind?
On the first point, yes. I don’t believe I’ve seen my points addressed in detail, though it sounds like Eliezer’s debate with Robin Hanson that was linked earlier might cover the same ground. I will take some time to follow up on that later.
it sounds like Eliezer’s debate with Robin Hanson that was linked earlier might cover the same ground.
I’m working my way through it and indeed it does. Robin Hanson’s post Dreams of Autarky is close to my position. I think there are other computational, economic and physical arguments in this direction as well.