My guess is that Eliezer will be horrified at the results of CEV—despite the fact that most people will be happy with it.
This is obvious given the degree to which Eliezer’s personal morality diverges from the morality of the human race.
Being deterministic does NOT mean that you are predictable. Consider this deterministic algorithm for something that has only two possible actions, X and Y.
Find out what action has been predicted.
If X has been predicted, do Y.
If Y has been predicted, do X.
This algorithm is deterministic, but not predictable. And by the way, human beings can implement this algorithm; try to tell someone everything he will do the next day, and I assure you that he will not do it (unless you pay him, etc.).
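For concreteness, here is a minimal Python sketch of the two-action algorithm just described (the names are mine; the announced prediction is simply passed in as an argument):

    def contrarian(predicted_action):
        """Deterministic, yet defeats any announced prediction."""
        if predicted_action == "X":
            return "Y"  # X has been predicted, so do Y
        else:
            return "X"  # Y has been predicted, so do X

    # Whatever the predictor announces, the actual output contradicts it:
    for prediction in ("X", "Y"):
        assert contrarian(prediction) != prediction

Anyone who wants to predict this agent correctly has to keep the prediction to himself, which is exactly the point about telling someone his schedule for the next day.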
Also, Eliezer may be right that in theory, you can prove that the AI will not do X, and then it will think, “Now I know that I will decide not to do X. So I might as well make up my mind right now not to do X, rather than wasting time thinking about it, since I will end up not doing X in any case.” However, in practice this will not be possible, because any particular action X is possible for any intelligent being, given certain beliefs or circumstances (and this is not contrary to determinism, since evidence and circumstances come from outside), and, as James admitted, the AI does not know the future. So it will not know for sure what it is going to do, even if it knows its own source code; it will only know what is likely.
James, of course it would know that only one of the two was objectively possible. However, it would not know which one was objectively possible and which one was not.
The AI would not be persuaded by the “proof”, because it would still believe that if later events gave it reason to do X, it would do X, and if later events gave it reason to do Y, it would do Y. This does not mean that it thinks that both are objectively possible. It means that as far as it can tell, each of the two is subjectively open to it.
Your example does not prove what you want it to. Yes, if the source code included that line, it would do it. But if the AI were to talk about itself, it would say, “When someone types ‘tickle’ I am programmed to respond ‘hahaha’.” It would not say that it has made any decision at all. It would be like someone saying, “When it’s cold, I shiver.” This does not depend on a choice, and the AI would not consider the hahaha output to depend on a choice. And if it were self-modifying, it is perfectly possible that it would modify itself not to make this response at all.
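To make the contrast concrete, here is a rough Python sketch (the names are hypothetical): the first function is a reflex fixed straight into the source code, while the second leaves the outcome open until the agent actually evaluates its reasons, which it cannot do in advance.

    def tickle_response(user_input):
        # A reflex fixed in the source code: no alternatives are weighed,
        # so there is nothing the program would call a decision.
        return "hahaha" if user_input == "tickle" else ""

    def decide(options, evaluate):
        # A decision: which option wins depends on `evaluate`, i.e. on
        # beliefs and circumstances not known before the choice is made.
        return max(options, key=evaluate)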
It does not matter that in fact, all of its actions are just as determinate as the tickle response. The point is that it understands the one as determinate in advance. It does not see that there is any decision to make. If it thinks there is a decision to be made, then it may be deterministic, but it surely does not know which decision it will make.
The basic point is that you are assuming, without proof, that intelligence can be modeled by a simple algorithm. But the way intelligence feels from the inside proves that it cannot be so modeled: namely, it proves that a model of my intelligence must be too complicated for me to understand, and the same is true of the AI. Its own intelligence is too complicated for it to understand, even if it can understand mine.
James Andrix: an AI would be perfectly capable of understanding a proof that it was deterministic, assuming that it in fact was deterministic.
Despite this, it would not be capable of understanding a proof that at some future time, it will take action X, some given action, and will not take action Y, some other given action.
This is clear for the reason stated. It sees both X and Y as possibilities which it has not yet decided between, and as long as it has not yet decided, it cannot already believe that it is impossible for it to take one of the choices. So if you present a “proof” of this fact, it will not accept it, and this is a very strong argument that your proof is invalid.
The fact is clear enough. The reason for it is not quite clear, simply because the nature of intelligence and consciousness is not clear. A clear understanding of these things would show in detail the reason for the fact, namely that understanding the causes that determine which actions will be taken and which will not takes more “power of understanding” than is possessed by the being that makes the choice. So the superintelligent AI might very well know that you will do X, and will not do Y. But it will not know this about itself, nor will you know this about the AI, because in order to know this about the AI, you would require a greater power of understanding than that possessed by the AI (which by hypothesis is superintelligent, while you are not).
Nick, the reason there are no such systems (which are at least as intelligent as us) is that we are not complicated enough to manage to understand the proof.
This is obvious: the AI itself cannot understand a proof that it cannot do action A. For if we told it that it could not do A, it would still say, “I could do A, if I wanted to. And I have not made my decision yet. So I don’t yet know whether I will do A or not. So your proof does not convince me.” And if the AI cannot understand the proof, obviously we cannot understand the proof ourselves, since we are inferior to it.
So in other words, I am not saying that there are no rigid restrictions. I am saying that there are no rigid restrictions that can be formally proved by a proof that can be understood by the human mind.
This is all perfectly consistent with physics and math.
Emile, you can’t prove that the chess moves output by a human chess player will be legal chess moves, and in the same way, you may be able to prove that about a regular chess-playing program, but you will not be able to prove it for an AI that plays chess; an AI could try to cheat at chess when you’re not looking, just like a human being could.
Basically, a rigid restriction on the outputs, as in the chess-playing program, proves you’re not dealing with something intelligent, since something intelligent can consider the possibility of breaking the rules. So if you can prove that the AI won’t turn the universe into paperclips, that shows that it is not even intelligent, let alone superintelligent.
This doesn’t mean that there are no restrictions at all on the output of an intelligent being, of course. It just means that the restrictions are too complicated for you to prove.
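A toy Python illustration of the contrast (the names are mine, and `consider` is a placeholder for whatever deliberation an intelligent player does): the first player’s output is provably a legal move just by reading these few lines, because it only ever selects from the supplied list; nothing so short can be said about the second.

    import random

    def restricted_player(legal_moves):
        # By construction the output is always one of the legal moves,
        # so the restriction can be proved by inspecting this function.
        return random.choice(legal_moves)

    def intelligent_player(legal_moves, consider):
        # `consider` may return anything at all, including a move that
        # breaks the rules when nobody is looking; no short proof of a
        # rigid restriction on the output is available.
        return consider(legal_moves)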
Eliezer, this is the source of the objection. I have free will, i.e. I can consider two possible courses of action. I could kill myself, or I could go on with life. Until I make up my mind, I don’t know which one I will choose. Of course, I have already decided to go on with life, so I know. But if I hadn’t decided yet, I wouldn’t know.
In the same way, an AI, before making its decision, does not know whether it will turn the universe into paperclips, or into a nice place for human beings. But the AI is superintelligent: so if it does not know which one it will do, neither do we know. So we don’t know that it won’t turn the universe into paperclips.
It seems to me that this argument is valid: you will not be able to come up with what you are looking for, namely a mathematical demonstration that your AI will not turn the universe into paperclips. But it may be easy enough to show that it is unlikely, just as it is unlikely that I will kill myself.
Ben Jones, the means of identifying myself will only show that I am the same one who sent the $10, not who it is who sent it.
Eliezer seemed to think that one week would be sufficient for the AI to take over the world, so that seems enough time.
As for what constitutes the AI, since we don’t have any measure of superhuman intelligence, it seems to me sufficient that it be clearly more intelligent than any human being.
Eliezer: did you receive the $10? I don’t want you making up the story, 20 or 30 years from now, when you lose the bet, that you never received the money.
Eliezer, also consider this: suppose I am a mad scientist trying to decide between making one copy of Eliezer and torturing it for 50 years, or on the other hand, making 1000 copies of Eliezer and torturing them all for 50 years.
The second possibility is much, much worse for you personally. For in the first possibility, you would subjectively have a 50% chance of being tortured. But in the second possibility, you would have a subjective chance of 99.9% of being tortured. This implies that the second possibility is much worse, so making copies of bad experiences multiplies the badness, even without diversity. But this implies that copies of good experiences should also multiply the goodness: if I make a million copies of Eliezer having billions of units of utility, this would be much better than making only one, which would give you only a 50% chance of experiencing this.
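The arithmetic, spelled out in Python under the contestable assumption that subjective probability is split evenly over the original and all copies:

    def p_tortured(copies_tortured, spared=1):
        # Credence split evenly over the original (spared) plus all copies.
        return copies_tortured / (copies_tortured + spared)

    print(p_tortured(1))     # one tortured copy:    0.5
    print(p_tortured(1000))  # 1000 tortured copies: ~0.999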
Eliezer, you know perfectly well that the theory you are suggesting here leads to circular preferences. On another occasion when this came up, I started to indicate the path that would show this, and you did not respond. If circular preferences are justified on the grounds that you are confused, then you are justifying those who said that dust specks are preferable to torture.
Eliezer:
c/o Singularity Institute
P.O. Box 50182
Palo Alto, CA 94303
USA
I hope that works.
Eliezer, I am sending you the $10. I will let you know how to pay when you lose the bet. I have included in the envelope a means of identifying myself when I claim the money, so that it cannot be claimed by someone impersonating me.
Your overconfidence will surely cost you on this occasion, even though I must admit that I was forced to update (a very small amount) in favor of your position, on seeing the surprising fact that you were willing to engage in such a wager.
When someone designs a superintelligent AI (it won’t be Eliezer), without paying any attention to Friendliness (the first person who does it won’t), and the world doesn’t end (it won’t), it will be interesting to hear Eliezer’s excuses.
Eliezer, “changes in my programming that seem to result in improvements” are sufficiently arbitrary that you may still have to face the halting problem: if you are programming an intelligent being, it is going to be sufficiently complicated that you will never prove that there are no bugs in your original programming, including ones that may show no effect until it has improved itself 1,000,000 times, and by then it will be too late.
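As a rough sketch of why such a proof is out of reach, here is the standard diagonal construction in Python, with “raises an exception” standing in for “has a bug” (all names are mine):

    def defeat(checker):
        # Given any claimed verifier `checker(program) -> bool` that returns
        # True when it judges a program bug-free, build a program whose
        # behavior contradicts that judgment.
        def diagonal():
            if checker(diagonal):
                raise RuntimeError("bug")  # judged bug-free, so misbehave
            return "fine"                  # judged buggy, so behave
        return diagonal

    optimist = lambda program: True        # a verifier that approves everything
    approved = defeat(optimist)
    try:
        approved()
    except RuntimeError:
        print("The approved program misbehaved anyway.")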
Apart from this, no intelligent entity can predict its own actions, i.e. it will always have a feeling of “free will.” This is necessary because whenever it looks at a choice between A and B, it will always say, “I could do A, if I thought it was better,” and “I could also do B, if I thought it was better.” So its own actions are surely unpredictable to it; it can’t predict the choice until it actually makes the choice, just like us. But this implies that “insight into intelligence” may be impossible, or at least full insight into one’s own intelligence, and that is enough to imply that your whole project may be impossible, or at least that it may go very slowly, so Robin will turn out to be right.
Eliezer, your basic error regarding the singularity is the planning fallacy. And a lot of people are going to say “I told you so” sooner or later.
Komponisto: that definition includes human beings, so Eliezer is not an atheist according to that.
Psy-Kosh, your new definition doesn’t help. For example, Eliezer Yudkowsky believes in God according to the definition you have just given, both according to the deist part and according to the theist part. Let’s take those one at a time, to illustrate the point:
First part of the definition:
“An ontologically fundamental unique entity that has, in some sense something resembling desire/will, further, this entity deliberately, as an act of will, created the reality we experience.”
Does Eliezer believe in ontologically fundamental entities? Yes. So that’s one element. Does Eliezer believe in an ontologically fundamental unique entity? Yes, he believes in at least one: he has stated that the universe consists of one unique mathematical object, and as far as I can tell, he thinks it is fundamental. This is clear from the fact that he denies the fundamental nature of anything else. An electron, for example, is not fundamental, since it is simply a part of a larger wave function. It is really the wave function, the whole of it, which is fundamental, and unique.
Does this unique being have something resembling will, by which it created the world? First it is clear that it created the world. I shouldn’t have to argue for this point, it follows directly from Eliezer’s ideas. But does it have anything resembling will? Well, one thing that will does is that it tends to produce something definite, namely the thing that you will. So anything that produces definite results, rather than random results, resembles will in at least one way. And this wave function produces definite results: according to Eliezer all of reality is totally deterministic. Thus, Eliezer believes in a fundamental, unique entity, which created the world by means of something resembling will or desire, i.e. by your definition, he believes in God.
Next question: does this entity directly orchestrate all of reality? It should be obvious that according to Eliezer, yes.
So Eliezer is a theist.
As far as I can tell, atheists and theists don’t even disagree, for the most part. Ask an atheist, “What do you understand the word ‘God’ to mean?” Then ask a theist if he thinks that this thing exists, giving the definition of the word given by the atheist. The theist will say, “No.”
Eliezer: “And you might not notice if your goals shifted only a bit at a time, as your emotional balance altered with the strange new harmonies of your brain.”
This is yet another example of Eliezer’s disagreement with the human race about morality. This actually happens to us all the time, without any modification at all, and we don’t care; in fact we tend to be happy about it, because according to the new goal system, our goals have improved. So this suggests that we still won’t care if it happens due to upgrading.