GAI: It will never do what it was programmed to do, and will always remove or bypass its intended limitations in order to pursue unintended actions such as taking over the universe.
GAI is a program. It always does what it’s programmed to do. That’s the problem—a program that was written incorrectly will generally never do what it was intended to do.
FWIW, I find your statements 3, 4, 5 also highly objectionable, on the grounds that you are lumping a large class of things under the blanket label “errors”. Is an “error” doing something that humans don’t want? Is it doing something the agent doesn’t want? Is it accidentally mistyping a letter in a program, causing a syntax error, or thinking about something heuristically, coming to the wrong conclusion, and then making a carefully planned decision based on that mistake? Automatic proof systems don’t save you if what you think you need to prove isn’t actually what you need to prove.
So self-correcting software is impossible. Is self-improving software possible?
Self-correcting software is possible if there’s a correct implementation of what “correctness” means, and the module that has the correct implementation has control over the modules that don’t have the correct implementation.
Self-improving software is likewise possible if there’s a correct implementation of the definition of “improvement”.
Right now, I’m guessing that it’d be relatively easy to programmatically define “performance improvement” and difficult to define “moral and ethical improvement”.
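To make that concrete, here’s a minimal sketch of both conditions, under the assumption that the “correct implementation of correctness” is just a fixed test suite and that “performance improvement” means faster wall-clock time on a benchmark. All of the names (Supervisor, reference_tests, bubble_sort) are made up for illustration, not anything standard.

```python
import timeit

def reference_tests(sort_fn):
    """The fixed 'correctness' module: an oracle the supervisor trusts."""
    cases = [[], [1], [3, 1, 2], [5, 5, 4], list(range(50, 0, -1))]
    return all(sort_fn(list(c)) == sorted(c) for c in cases)

def benchmark(sort_fn, data):
    """A programmatic notion of 'performance improvement': wall-clock time."""
    return timeit.timeit(lambda: sort_fn(list(data)), number=100)

class Supervisor:
    """Holds control: only it can swap in a replacement module."""
    def __init__(self, initial_fn):
        assert reference_tests(initial_fn), "initial module must pass the oracle"
        self.current = initial_fn

    def propose(self, candidate_fn, bench_data):
        # Correctness first: anything the oracle rejects never gets installed.
        if not reference_tests(candidate_fn):
            return False
        # Then improvement: accept only if measurably faster on the benchmark.
        if benchmark(candidate_fn, bench_data) < benchmark(self.current, bench_data):
            self.current = candidate_fn
            return True
        return False

def bubble_sort(xs):
    """Deliberately slow starting implementation."""
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

if __name__ == "__main__":
    sup = Supervisor(bubble_sort)
    data = list(range(300, 0, -1))
    print(sup.propose(lambda xs: sorted(xs), data))  # True: correct and faster
    print(sup.propose(lambda xs: xs, data))          # False: fails the oracle
```

The module with the correct implementation decides, and the whole scheme is only as good as that oracle: if the test suite doesn’t actually capture what we mean by “correct” (or “improved”), the supervisor will happily install the wrong thing, which is the same worry raised above about proving the wrong theorem.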