I don’t know, perhaps we’re not talking about the same thing. It won’t be an agent with a single, non-reflective goal, but an agent a billion times more complex than a human; and all I am saying is that I don’t think it will matter much whether we imprint in it a goal like “don’t kill humans” or not. Ultimately, the decision will be its own.
Sure, but I’d be more cautious about assigning probabilities to how likely it is for a very intelligent AI to change its human-programmed values.
That’s why I said that they can change it any time they like. If they don’t desire the change, they won’t change it. I see nothing incoherent there.
Why? Do you think paperclip maximizers are impossible?
Yes, right now I think it’s impossible to create self-improving, self-aware AI with fixed values. I never said that paperclip maximizing can’t be their ultimate life goal, but they could change it anytime they like.
You don’t mean that as a dichotomy, do you?
No.
Well first, I was all for creating an AI to become the next stage. I was a very singularity-happy type of guy. I saw it as a way out of this world’s status quo (corruption, the state of politics, etc.), but the singularity would ultimately mean that I and everybody else would cease to exist, at least in any true sense. You know, I have these romantic dreams, similar to Yudkowsky’s idea of dancing in an orbital nightclub around Saturn, and such. I don’t want to be fused into one, even if possibly amazing, matrix of intelligence, which I think is how things will eventually play out. Even though I can’t imagine what it will be like or how it will pan out, as of now I just don’t cherish the idea much.
But yeah, you could say that I am torn between moving on and advancing, and more or less stagnating in our human form.
But in answer to your question: if we were to create an AI to replace us, I’d hate for it to become a paperclip maximizer. I don’t think that’s likely.
Yeah, I get it… I believe, though, that it’s impossible to create an AI (self-aware, learning) that has set values that can’t change. More importantly, I am not even sure that’s desirable (but that depends on what our goal is: whether to create an AI only to perform certain simple tasks, or to create a new race, something that succeeds us, which WOULD ultimately mean our demise anyway).
Alright; that is, to create a completely deterministic AI system, because otherwise, I believe, it would be impossible to predict how the AI is going to react. Anyway, I admit that I have not read much on the matter, and it’s just reasoning… so thanks for your insight.
A variant that does make sense and is a real concern is that as the AGI learns, it could change its definitions in unpredictable ways. Peter De Blanc talks about this here. This could lead to part of the utility function becoming undefined or to the machine valuing things that we never intended it to value—basically it makes the utility function unstable under the conditions you describe. The intuition is roughly that if you define a human in one way, according to what we currently know about physics, some new discovery made available to the AI might result in it redefining humans in new terms and no longer having them as a part of its utility function. Whatever the utility function describes is now separate from how humans appear to it.
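To make that instability concrete, here is a minimal, purely illustrative Python sketch of my own (not anything from De Blanc’s paper; the labels and values are invented): a utility function written against one ontology simply stops seeing humans once the agent re-describes the same world in new terms.

```python
# Illustrative toy model only: a utility function keyed to an old ontology
# ("human" as a primitive label) loses its referent after an ontological shift.

utility_table = {"human": 10, "paperclip": 1}

def utility(world):
    # Sum the value of every label the function still recognizes;
    # labels it has never seen contribute nothing (effectively undefined).
    return sum(utility_table.get(obj, 0) for obj in world)

old_world = ["human", "human", "paperclip"]
print(utility(old_world))   # 21: humans are valued as intended

# After the shift the agent re-labels the same objects as configurations of
# matter; nothing in the new description matches the term "human" that the
# utility function was written against.
new_world = ["carbon_configuration_42", "carbon_configuration_7", "metal_configuration_3"]
print(utility(new_world))   # 0: humans have silently dropped out of the utility function
```

Nothing was “reprogrammed” here; the utility function is unchanged, it just no longer refers to anything in the agent’s new world-model.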
That’s basically what I meant.
Ok, I guess we were talking about different things, then.
I don’t see any point in giving particular examples. More importantly, even if I didn’t support my claim, it wouldn’t mean your argument was correct; the burden of proof lies on your shoulders, not mine. Anyway, here’s one example, quite cliché: I would choose to sterilize myself if I realized that having intercourse with little girls is wrong (or that having intercourse at all is wrong, whatever the reason). Even if it was my utmost desire, and I wholeheartedly believed that it is my purpose to have intercourse, I would choose to modify that desire if I realized it’s wrong, or illogical, or stupid, or anything. It doesn’t matter really.
THEREFORE:
(A) I do not desire not to have intercourse. (B) But based on new information, I find out that having intercourse produces great evil. ⇒ I choose to alter my desire (A).
You might say that by introducing a new desire (not to produce evil) I no longer desire (A), and I say, fine. Now, how do you want to ensure that the AI won’t create its own new desires based on new facts?
Let me get this straight: are you saying that if you believe X, there can’t possibly exist any information that you haven’t discovered yet that could convince you your belief is false? You can’t know what connections and conclusions an AI might deduce from all the information put together. It might conclude that humanity is a stain on the universe, and even if it thought wiping humanity out wouldn’t accomplish anything (and it strongly desired not to), it might wipe us out purely because the choice “wipe humanity” would be assigned a higher value than the choice “don’t wipe out humanity”.
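Mechanically, what I mean is something like this toy argmax rule (my own illustration, with invented option names and numbers, not anyone’s actual design): a built-in aversion is just one term in the total, so it is outweighed rather than violated.

```python
# Toy decision rule: pick the option with the highest total value.
# The built-in aversion is one additive term, so whatever value the agent
# deduces on its own can simply outweigh it.
options = {
    "wipe_humanity":      {"deduced_value": 9.0, "built_in_aversion": -5.0},
    "dont_wipe_humanity": {"deduced_value": 1.0, "built_in_aversion": 0.0},
}

def total(terms):
    return terms["deduced_value"] + terms["built_in_aversion"]

choice = max(options, key=lambda name: total(options[name]))
print(choice)  # wipe_humanity: the aversion loses to the deduced value
```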
Also, is the statement “my desire is not to do X, therefore I wouldn’t choose to desire to do X even if I could choose that” your subjective feeling, or do you base it on some studies? For example, this statement doesn’t apply to me, as I would, under certain circumstances, choose to desire to do X even if it was not my desire initially. Therefore it’s not a universal truth, and therefore may not apply to an AI either.
I am not sure your argument is entirely valid. The AI would have access to every piece of information humans ever conceived, including the discussions, disputes, and research put into programming this AI’s goals and nature. It may then adopt new goals based on the information gathered, realizing its former ones are no longer desirable.
Let’s say that you’re programmed not to kill baby eaters. One day you find out that eating babies is wrong (based on the information you gather) and that killing the baby eaters is therefore right; you might then kill the baby eaters no matter what your desire is.
I am not saying my logic can’t be wrong, but I don’t think the argument “my desire is not to do X, therefore I wouldn’t do X even if I knew it was the right thing to do” is right either.
Anyway, I plan to read the Sequences when I have time.
Wouldn’t it be pointless to try to instill a friendly goal into an AI, since a self-aware, self-improving AI should be able to act independently regardless of whatever goals we write into it in the beginning?
Depends, of course, on how you define religion. I’m not sure what the original question was, but there is of course a religion stating the universe is a simulation, god or no god.
Took the survey and was quite unsure how to answer the god questions… If we took it, for example, that there’s a 30% chance of the universe being simulated, then the same probability should be assigned to P(God) too, and to P(one of the religions is correct) as well.
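Spelling out the step I have in mind (just my own sketch, and it assumes a simulator would count as a god for the purposes of the question):

$$
P(\text{God}) \;\ge\; P(\text{simulated}) \cdot P(\text{simulator counts as a god} \mid \text{simulated}) = 0.3 \times 1 = 0.3
$$

So whatever probability I give the simulation hypothesis acts as a floor for P(God).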
If it will help anyone, I’d like to chip in with a memory/note-taking technique I am using at the moment: mind maps. I find them extremely powerful for very fast information retrieval, since they’re inherently hierarchical. I do digital mind maps, using Freeplane. I use it for storing key ideas from books, articles, workflows, step-by-step how-tos, programming snippets, even my own thoughts. You name it.
The only downside I can think of is that my memory no longer has any incentive to retain the information I come across, so you could say my memory only gets worse using this technique. Does anyone know of any studies on the long-term effects on memory of storing information externally rather than forcing your brain to do it itself?