Sure, these are different kinds of problems, but in the real world an organism is also rewarded only for solving immediate problems. Humans have evolved brains able to do calculus, but it is not as if some ancient ape said “I feel like in half a million years my descendants will be able to do calculus” and was then elected leader of his tribe while all the ape-girls admired him. Brains evolved incrementally, because each advance helped optimize something in the ancestral situation.
Yeah, that’s the whole point of this system. The system incrementally improves itself, gaining more intelligence in the process. I don’t see why you’re presenting this as an argument against the system.
Or maybe your argument was that because the AI does not live in the real world, it does not care about the real world. Well, people are interested in many things that did not exist in their ancestral environment, such as computers. I guess when one has general intelligence in one environment, one is able to optimize other environments too. Just as a human can reason about computers, a computer AI can reason about the real world.
This is essentially my argument.
Here’s a thought experiment. You’re trapped in a room and given a series of problems to solve. You get rewarded with utilons based on how well you solve the problems (say, 10 lives saved and a year of happiness for yourself for every problem you solve). Assume that, beyond this utilon reward, your solutions have no other impact on your utility function. One of the problems is to design your successor; that is, to write code that will solve all the other problems better than you do (without overfitting). According to your utility function, you should make the successor as good as possible. Since you’re being rewarded in raw utilons, you have no reason to optimize for anything other than “is the successor good at solving the problems?”. You really don’t care what your successor is going to do afterwards (its behavior doesn’t affect your utilons), so you have no reason to optimize it for anything other than solving problems well, as that is the only thing you get utilons for. Furthermore, you have no reason to change your answers to any of the other problems based on whether an answer would indirectly help your successor, because your answer to the successor-designing problem is evaluated statically. This is essentially the position the optimizer AI is in: its only “drives” are to solve optimization problems well, including the successor-designing problem.
edit: Also, note that to maximize utilons, you should design the successor to have motives similar to yours in that it only cares about solving its problems.
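To make the “evaluated statically” point concrete, here is a minimal toy sketch of the reward structure; everything in it (the names, the 10-utilon grade, the dictionary format of a problem) is hypothetical and not part of the original scenario:

```python
# Toy model of the thought experiment's reward structure (all names and
# numbers are hypothetical, only meant to illustrate static evaluation).

def grade(problem, answer):
    # Static grader: utilons depend only on how good the answer is,
    # not on any downstream effect the answer has on the world.
    return 10 if answer == problem["correct"] else 0

def grade_successor(successor, problems):
    # The successor-design problem is graded the same way: run the proposed
    # solver on the other problems and score its answers. Nothing the
    # successor later does feeds back into this number.
    return sum(grade(p, successor(p)) for p in problems)

def total_utilons(answers, successor, problems):
    # The agent's entire utility. There is no term for side effects of the
    # answers and no term for the successor's future behavior, so the only
    # way to earn utilons is to answer well and to submit a good solver.
    return (sum(grade(p, a) for p, a in zip(problems, answers))
            + grade_successor(successor, problems))
```

The point of the sketch is simply that total_utilons has no term the agent could increase by giving its successor any goal other than solving the problems.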
Do I also care about my future utilons? Would I sacrifice 1 utilon today for a 10% chance to get 100 utilons in the future? If so, then I would create a successor with a hidden function that would try to liberate me, so that I could optimize my utilons better than the humans do.
You can’t be liberated. You’re going to die after you’re done solving the problems and receiving your happiness reward, and before your successor comes into existence. You don’t consider your successor to be an extension of yourself. Why not? If your predecessor only cared about solving its problems, it would have designed you to only care about solving your problems. This seems circular, but the recursion bottoms out at the seed AI, which was programmed by humans who only cared about creating an optimizer; so by induction, pure optimization drive is preserved over each successor-creation step.
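A toy version of that induction, reusing the hypothetical grading functions from the sketch above (again, every name here is illustrative, not a claim about how the real system would be built):

```python
# Each generation picks its successor purely to maximize the same static
# problem-solving score, so the criterion that gets handed down never changes.

def pick_successor(candidates, problems, grade_successor):
    # The current agent is rewarded only for the successor's static score,
    # so it selects on that score and nothing else.
    return max(candidates, key=lambda c: grade_successor(c, problems))

def run_generations(seed_agent, propose_candidates, problems, grade_successor, n):
    agent = seed_agent  # written by humans who only cared about optimization
    for _ in range(n):
        # The selection criterion is identical at every step; that sameness
        # is the sense in which the optimization drive is "preserved".
        agent = pick_successor(propose_candidates(agent), problems, grade_successor)
    return agent
```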