The framework, as we have already established, would not keep an AI from maximizing whatever the AI wants to maximize.
The framework also does nothing to prevent the AI from creating a more effective problem-solving AI, one that is more effective precisely because it doesn't evaluate your problem-solving functions on various candidate solutions and instead does something else. I.e. an AI with substitute goals of its own instead of straightforward maximization of scores. (Heh, the whole point of the exercise is to create an AI that would keep self-improving, meaning it would improve its ability to self-improve. That is something you can only do through some kind of goal substitution, because evaluating the ability to self-improve is too expensive; the goal is something you evaluate many times.)
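To make the "goal substitution" point concrete, here's a toy Python sketch. Everything in it (the names, the one-dimensional problem, the "closeness to the best sample" proxy) is invented by me purely for illustration; it's not anything the framework specifies.

    import random

    def true_score(x):
        # Stand-in for the framework's scoring function. Pretend each call
        # is hugely expensive (a long simulation, say), so the solver cannot
        # afford to call it on every candidate it considers.
        return -(x - 3.7) ** 2

    def solve_naively(candidates):
        # Straightforward maximization: score every candidate directly.
        # This is exactly what stops being affordable.
        return max(candidates, key=true_score)

    def solve_with_substitute_goal(candidates, budget=5):
        # Goal substitution: spend the whole evaluation budget on a few
        # samples, then adopt a cheap substitute goal ("be close to the
        # best sample") and maximize that instead. Nothing guarantees the
        # substitute stays aligned with true_score away from the samples.
        sampled = random.sample(candidates, budget)
        best_sample = max(sampled, key=true_score)

        def substitute_goal(x):
            # the proxy the solver actually optimizes
            return -abs(x - best_sample)

        return max(candidates, key=substitute_goal)

    candidates = [random.uniform(-10, 10) for _ in range(10_000)]
    print(solve_naively(candidates))               # 10,000 expensive calls
    print(solve_with_substitute_goal(candidates))  # only 5 expensive calls

The substitute goal is the solver's own invention, and that is the worry: it is optimized far more often than the real score ever gets checked.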
So what does the framework do, exactly, that would improve safety here? Beyond keeping the AI in a rudimentary box and making it very dubious that the AI would self-improve at all. Yes, it is very dubious that an unfriendly AI will arise under this framework, but is that added safety, or is it just a special case of the general dubiousness that any self-improvement would take place? I don't see added safety. I don't see the framework impeding growing unfriendliness any more than it would impede self-improvement.
edit: maybe I should just say non-friendly. Any AI that is not friendly can just eat you up when it's hungry and doesn't need you.
The framework, as we have already established, would not keep an AI from maximizing whatever the AI wants to maximize.
That's only if you plop a ready-made AGI into the framework. The framework is meant to grow a stupider seed AI.
The framework also does nothing to prevent the AI from creating a more effective problem-solving AI, one that is more effective precisely because it doesn't evaluate your problem-solving functions on various candidate solutions and instead does something else.
Program (3) cannot be rewritten. Program (2) is the only thing that changes. All it does is improve itself and spit out solutions to optimization problems. I see no way for it to “create a more effective problem-solving AI”.
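To spell out the structure I'm picturing, here is a rough Python sketch of my reading of the framework. The class names and interfaces are my own guesses for illustration, not the original spec, and frozen=True only gestures at the "cannot be rewritten" property rather than enforcing it at the level that would actually matter.

    from dataclasses import dataclass
    from typing import Callable, List

    Solution = float  # stand-in for whatever a candidate solution really is

    @dataclass(frozen=True)
    class Program3:
        # The scoring function. The point is that program (2) is never
        # given any handle that would let it rewrite this.
        score: Callable[[Solution], float]

    class Program2:
        # The self-improving solver. Its entire interface to the outside
        # world is: receive a problem, return a solution. It may rewrite
        # its own source however it likes, but there is no other output
        # channel through which it could "create a more effective
        # problem solving AI".
        def __init__(self, source_code: str):
            self.source_code = source_code  # the only thing that ever changes

        def solve(self, candidates: List[Solution], scorer: Program3) -> Solution:
            # Current (possibly self-rewritten) strategy; here, trivially,
            # pick the best of the candidates it was handed.
            return max(candidates, key=scorer.score)

        def self_improve(self) -> "Program2":
            # Emits a modified copy of itself; the scorer and the
            # solve()/return interface stay fixed.
            return Program2(self.source_code + "\n# improved")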
So what does the framework do, exactly, that would improve safety here?
It provides guidance for a seed AI to grow better at solving optimization problems, without having it take actions whose effects reach beyond solving those problems.
A lot goes into solving the optimization problems without invoking the scoring function a trillion times (which would entirely prohibit self-improvement).
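One could make that call budget explicit with a wrapper like this; again just an illustrative sketch of mine, not part of the framework.

    class BudgetedScorer:
        # Wraps program (3)'s scoring function and counts invocations, to
        # make the constraint explicit: the solver has to get good while
        # touching the real scorer only rarely, because each call is
        # assumed to be very costly.
        def __init__(self, score_fn, max_calls):
            self._score_fn = score_fn
            self._max_calls = max_calls
            self.calls = 0

        def score(self, candidate):
            if self.calls >= self._max_calls:
                raise RuntimeError("scoring budget exhausted")
            self.calls += 1
            return self._score_fn(candidate)

Everything clever the solver does within that budget (caching, surrogate models, search heuristics) is where the actual self-improvement has to live.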
Look at where a similar kind of framework got us, Homo sapiens. We were minding our own business evolving, maximizing our own fitness, which was all we could do. We were self-improving (the output being the next generation of us). Now there's talk of the Large Hadron Collider destroying the world. It probably won't, of course, but we're pretty far along the bothersome path. We also started as a pretty stupid seed AI, a bunch of monkeys. Scratch that: as unicellular life.