EDIT: I’ve realized that some misinterpretation of my arguments has been due to disagreements in terminology. I define “expert systems” as systems designed to address a specific class of well-defined problems, capable of logical reasoning and probabilistic inference given a set of “axiom-like” rules, and updating their knowledge database with specific kinds of information.
AGI I define specifically as AI which has human or extra-human level capabilities, or the potential to reach those capabilities.
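To make the first definition concrete, here is a toy sketch (Python; the rules and facts are invented, and the probabilistic-inference side is left out) of the kind of system I mean: a fixed set of axiom-like rules, forward-chaining inference over a fact base, and updates that can only add facts of a pre-defined kind.

```python
# Axiom-like rules: (premises, conclusion). The rule set itself is fixed.
RULES = [
    ({"fever", "cough"}, "suspect_flu"),
    ({"suspect_flu", "high_risk_patient"}, "recommend_antivirals"),
]

def infer(facts):
    """Forward-chain: apply rules until no new conclusions can be drawn."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# "Updating its knowledge database" just means adding facts like these;
# the system never touches its own rules or code.
print(infer({"fever", "cough", "high_risk_patient"}))
```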
Now my response to the above:
“Expert AI systems are already used in hospitals, and will surely be used more and more as the technology progresses. There isn’t a single point where AI is suddenly better than humans at all aspects of a field. Current AIs are already better than doctors in some areas, but worse in many others. As the range of AI expertise increases doctors will shift more towards managerial roles, understanding the strengths and weakness of the myriad expert systems, refereeing between them and knowing when to overrule them.”
I agree with all of these.
“By the time true AGI arrives narrow AI will probably be pervasive enough that the line between the two will be too fuzzy to allow for a naive ban on AGI.”
To me it seems the greatest enabler of AI catastrophe is ignorance. But by the time narrow AI becomes pervasive, it’s also likely that people will possess much more of the technical understanding needed to comprehend the threat that AGI poses.
“Moreover, I highly doubt people are going to vote to save jobs (especially jobs of the affluent) at the expense of human life.”
Ban all self-modifying code and you should be in the clear.
So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.
Yes, you can forbid that too, but you didn’t think to, and you only get one shot. And then it can decide to arrange a bunch of transistors into a pattern that it predicts will produce a state of the universe it prefers.
The problem here is that you are trying to use ad hoc constraints on a creative intelligence that is motivated to get around the constraints.
I know that the FAI argument is that the only way to prevent disaster is to make the agent “want” to not modify itself. But I’m arguing that for an agent to even be dangerous, it has to “want” to modify itself. There is no plausible scenario where an agent solving a specific problem decides that the most efficient path to the solution involves upgrading its own capabilities. It’s certainly not going to stumble upon a self-improvement randomly.
You don’t think that a sufficiently powerful seed AI would, if self-modification were clearly the most efficient way to reach its goal, discover the idea of self-modification? Humans have independently discovered self-improvement many times.
EDIT: Sorry, I’m specifically not talking about seed AIs. I’m talking about the (non-)possibility of commercial programs designed for specific applications “going rogue”.
To adopt self-modification as a strategy, it would have to have knowledge of itself. And then, in order to pursue the strategy, it would have to decide that the cost of discovering self-improvements was an efficient use of its resources, assuming it could even estimate how long it would take to discover an actual improvement to its system.
Intelligence can’t just instantly come up with the right answer by applying heuristics. It has to go through a cycle of heuristics (narrowing the search space), random search, and testing (or proof).
Self-improvement is very costly in terms of these cycles. To even confirm that a modification is a self-improvement, a system has to simulate its modified performance on a variety of test problems. If a system is designed to solve problems that take X amount of time, it would take at least X per test problem just to get an empirical sample of whether a proposed modification is worth it (and likely much longer for a proof). And with no prior knowledge, most proposed modifications would not be improvements.
AI ethics is not necessary to constrain such systems, just a strict pruning process (which would be required anyway for efficiency on ordinary problems).
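To make the cost argument concrete, here is a minimal sketch of that cycle applied to self-modification (Python; every function name and number is invented for illustration). Each candidate modification has to be re-benchmarked on full test problems, so vetting proposals is priced in units of X, and a strict pruning threshold discards anything that isn’t clearly better.

```python
import random

SOLVE_CALLS = 0   # each call stands for one full problem solve, i.e. one unit of X

def solve(problem, params):
    """Stand-in for the system solving one ordinary problem (cost: one X)."""
    global SOLVE_CALLS
    SOLVE_CALLS += 1
    return abs(problem - params["guess"])      # lower error is better

def benchmark(params, test_problems):
    """TEST step: empirically score a (possibly modified) configuration.
    Costs len(test_problems) units of X every time it runs."""
    return sum(solve(p, params) for p in test_problems)

def propose_modification(params):
    """Heuristic/random-search step: most proposals will not be improvements."""
    candidate = dict(params)
    candidate["guess"] += random.uniform(-1.0, 1.0)
    return candidate

def self_improvement_cycle(params, test_problems, budget=20):
    """Strict pruning: a candidate is kept only if it is clearly better."""
    baseline = benchmark(params, test_problems)
    for _ in range(budget):
        candidate = propose_modification(params)
        score = benchmark(candidate, test_problems)
        if score < 0.9 * baseline:             # the strict pruning threshold
            params, baseline = candidate, score
    return params

best = self_improvement_cycle({"guess": 0.0}, [3.0, 4.0, 5.0])
print(best, "cost:", SOLVE_CALLS, "problem-solves")
```

Running it spends 63 full problem-solves just to vet 20 candidate tweaks to a single parameter; a system that had to vet changes to its own reasoning machinery would pay far more per candidate.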
You are talking about an AI that was designed to self-examine and optimize itself. Otherwise it will never ever be a full AGI. We are not smart enough to build one from scratch. The trick, if possible, is to get it to not modify the fundamental Friendliness goal during its self-modifications.
There are algorithms in narrow AI that do learning and modify algorithm specifics, or choose among algorithms or combinations of algorithms. There are algorithms that search for better algorithms. In some languages (the LISP family) there is little or no difference between code and data, so code modifying code is a common working methodology for human Lisp programmers. Crossing from code/data space into hardware space is sufficient for such an AI to redesign the hardware it runs on as well. Such goals can either be hardwired or arise under the general goal of improvement, plus an adequate knowledge of hardware or the ability to acquire it.
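As a minimal illustration of the code-is-data point (Python standing in for Lisp here; the function, the mutation rule, and the scoring objective are all invented), a program can treat one of its own routines as a data structure, generate modified variants, and keep whichever variant scores best:

```python
import ast, inspect, textwrap

def step(x):
    """Candidate routine the program is allowed to rewrite."""
    return x * 2 + 1

class ConstantTweaker(ast.NodeTransformer):
    """Rewrites integer constants in the syntax tree: code modifying code."""
    def __init__(self, delta):
        self.delta = delta
    def visit_Constant(self, node):
        if isinstance(node.value, int):
            return ast.copy_location(ast.Constant(node.value + self.delta), node)
        return node

def make_variant(func, delta):
    """Parse func's source into an AST, mutate it, and compile a new function."""
    tree = ast.parse(textwrap.dedent(inspect.getsource(func)))
    tree = ast.fix_missing_locations(ConstantTweaker(delta).visit(tree))
    namespace = {}
    exec(compile(tree, "<variant>", "exec"), namespace)
    return namespace[func.__name__]

def score(f):
    """Toy objective: how close f(3) gets to a target value of 10."""
    return -abs(f(3) - 10)

candidates = [step] + [make_variant(step, d) for d in (-1, 1, 2)]
best = max(candidates, key=score)
print(best(3), score(best))
```

The same kind of rewriting, pointed at a hardware description instead of a toy function, is what the crossover into hardware space would look like.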
We ourselves are general purpose machines that happen to be biological and seek to some degree to understand ourselves enough to self-modify to become better.
I am talking about AIs designed for solving specific bounded problems. In this case the goal of the AI, which is to solve the problem efficiently, is as much of a constraint as its technical capabilities. Even if the AI has fundamental self-modification routines at its disposal, I can hardly envisage a scenario in which it decides that using those routines would be an efficient use of its time for solving its specific problem.
“So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.”
“But by the time narrow AI becomes pervasive, it’s also likely that people will possess much more of the technical understanding needed to comprehend the threat that AGI poses.”
Or perhaps it’s the contrary: pervasive narrow AI fosters an undue sense of security. People become comfortable through familiarity, whether it’s justified or not. This morning I was peering down a 50-foot cliff, halfway up, suspended by nothing but a half-inch-wide rope. No fear, no hesitation, perfect familiarity. Luckily, knowing of the numerous deaths of past climbers, I can maintain a conscious alertness to safety and stave off complacency. But in the case of AI, what overt catastrophes will similarly stave off complacency toward existential risk, short of an existential catastrophe itself?
“Moreover, I highly doubt people are going to vote to save jobs (especially jobs of the affluent) at the expense of human life.”
You are being too idealistic here.
“So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.”
Isn’t that the same as self-modifying code?