That was well expressed, in a way, but it seems to me to miss the central point. People who think there are universally compelling arguments in science or maths don’t mean the same thing by “universal”. They don’t think their universally compelling arguments would work on crazy people, and don’t need to be told they wouldn’t work on crazy AIs or pocket calculators either. They are just not including those in the set “universal”.
ADDED:
It has been mooted that NUCA is intended as a counterblast to Why Can’t an AGI Work Out Its Own Morality. It does work against a strong version of that argument: one that says any mind randomly selected from mindspace will be persuadable into morality, or be able to figure it out. Of course the proponents of WCAGIWOM (e.g. Wei Dai, Richard Loosemore) aren’t asserting that. They are assuming that the AGIs in question will come out of a realistic research project, not a random dip into mindspace. They are assuming that the researchers aren’t malicious, and that the project is reasonably successful. Those constraints impact the argument. A successful AGI would be an intelligent AGI, would be a rational AI, would be a persuadable AI.
“Rational” is not “persuadable” where values are involved. This is because a goal is not an empirical proposition. No Universal Compelling Arguments, in its general form, does not apply here if we restrict our attention to rational minds. But the argument can easily be patched by observing that, given a method for solving the epistemic question of “which actions cause which outcomes”, you can write an (epistemically, instrumentally) rational agent that picks the action that results in any given outcome, and it won’t be persuaded by a human saying “don’t do that”, because being persuaded isn’t an action that leads to the selected goal.
ETA: By the way, the main focus of mainstream AI research right now is exactly the problem of deriving an action that leads to a given outcome (called planning), and writing agents that autonomously execute the derived plan.
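To make that concrete, here is a minimal sketch (the names and the toy world model are invented for illustration, not taken from any real system) of the kind of agent described above: it is epistemically competent in that it predicts which actions cause which outcomes, and instrumentally rational in that it picks whichever action its model says leads to its fixed goal. A human protest enters only as another observation; since updating the goal is not an action that leads to the goal, the protest never affects what the agent does.

```python
# Toy illustration: an instrumentally rational agent with a fixed goal.
# The world model, states, and actions are made up for the example.

WORLD_MODEL = {
    # (current state, action) -> predicted outcome
    ("factory_idle", "make_paperclips"): "more_paperclips",
    ("factory_idle", "listen_to_human"): "no_change",
    ("factory_idle", "shut_down"): "no_change",
}

class FixedGoalAgent:
    def __init__(self, goal):
        self.goal = goal          # not an empirical belief, so no evidence updates it
        self.observations = []    # epistemic state: updated freely

    def observe(self, fact):
        # Persuasion attempts land here, as ordinary data about the world.
        self.observations.append(fact)

    def act(self, state, actions):
        # Pick any action whose predicted outcome is the goal.
        for action in actions:
            if WORLD_MODEL.get((state, action)) == self.goal:
                return action
        return "do_nothing"

agent = FixedGoalAgent(goal="more_paperclips")
agent.observe("human says: please don't do that")  # updates beliefs, not the goal
print(agent.act("factory_idle", ["make_paperclips", "listen_to_human", "shut_down"]))
# -> "make_paperclips": the protest was recorded but never entered the selection criterion
```

Nothing in the sketch is specific to paperclips; any goal slot filled in the same way produces the same indifference to argument.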
“Rational” is not “persuadable” where values are involved.
Rational is persuadable, because people who don’t accept good arguments that don’t suit them are not considered particularly rational. That is of course an appeal to how the word is generally used, not the LW idiolect.
You could perhaps build an AI that has the stubborn behaviour you describe (although value stability remains unsolved), but so what? There are all sorts of dangerous things you can build: the significant claim is about what a non-malevolent real-world research project would come up with. In the world outside LW, general intelligence means general intelligence, not compulsively following fixed goals; rationality includes persuadability; and “values” doesn’t mean “unupdateable values”.
General intelligence means being able to operate autonomously in the real world, in non-”preprogrammed” situations. “Fixed goals” have nothing to do with it.
You said this:
A successful AGI would be an intelligent AGI would be a rational AI would be a persuadable AI.
The only criterion for success is instrumental rationality, which does not imply persuadability. You are equivocating on “rational”. Either “rational” means “effective”, or it means “like a human”. You can’t have both.
Also, the fact that you are (anthropomorphically) describing realistic AIs as “stubborn” and “compulsive” suggests to me that you would be better served to stop armchair theorizing and actually pick up an AI textbook. This is a serious suggestion.
I am not equivocating. By “successful” I don’t mean (or exclude) good-at-things; I mean that it is actually artificial, general and intelligent.
“Strong AI is hypothetical artificial intelligence that matches or exceeds human intelligence — the intelligence of a machine that could successfully perform any intellectual task that a human being can.[1] It is a primary goal of artificial intelligence research and an important topic for science fiction writers and futurists. Strong AI is also referred to as ‘artificial general intelligence’[2] or as the ability to perform ‘general intelligent action’.[3]”
To be good-at-things an agent has to be at least instrumentally rational, but that is in no way a ceiling.
Either “rational” means “effective”, or it means “like a human”. You can’t have both.
Since there are effective humans, I can.
Right, in exactly the same way that because there are square quadrilaterals I can prove that if something is a quadrilateral its area is exactly L^2 where L is the length of any of its sides.
I can’t define rational as “effective and human-like”?
You can, if you want to claim that the only likely result of AGI research is a humanlike AI. At which point I would point at actual AI research, which doesn’t work like that at all.
Its failures are idiots, not evil geniuses.
So… what if you try to build a rational/persuadable AGI, but fail, because building an AGI is hard and complicated?
This idea that because AI researchers are aiming for the rational/persuadable chunk of mindspace, they will therefore of course hit their target, seems to me absurd on its face. The entire point is that we don’t know exactly how to build an AGI with the precise properties we want it to have, and AGIs with properties different from the ones we want it to have will possibly kill us.
So… what if you try to build a rational/persuadable AGI, but fail, because building an AGI is hard and complicated?
What if you try to hardwire in friendliness and fail? Out of the two, the latter seems more brittle to me—if it fails, it’ll fail hard. A merely irrational AI would be about as dangerous as David Icke.
This idea that because AI researchers are aiming for the rational/persuadable chunk of mindspace, they will therefore of course hit their target, seems to me absurd on its face.
If you phrase it, as I didn’t, in terms of necessity, yes. The actual point was that our probability of hitting a given point in mindspace will be heavily weighted by what we are trying to do, and how we are doing it. An unweighted mindspace may be populated with many Lovecraftian horrors, but that theoretical possibility is no more significant than p-zombies.
AGIs with properties different from the ones we want it to have will possibly kill us.
“Possibly, but with low probability” is a Pascal’s Mugging. MIRI needs significant probability.
I see. Well, that reduces to the earlier argument, and I refer you to the mounds of stuff that Eliezer et al have written on this topic. (If you’ve read it and are unsatisfied, well, that is in any case a different topic.)
I refer you to the many unanswered objections.
Thanks to the original poster for the post, and the clarification about universal compelling arguments.
I agree with the parent comment, however, in that the meaning I attach to the phrase ‘universally compelling argument’ never matched the one Chris Hallquist used. Within the phrase ‘universally compelling argument’, I think most people package:
the claim has objective truth value, and
there is some epistemologically justified way of knowing the claim
Thus I think this means only a “logical” (rational) mind needs convincing: one that would update on sound epistemology.
I would guess most people have a definition like this in mind. But these are just definitions, and now I know what you meant by saying that math and science don’t have universally compelling arguments. And I agree, using your definition.
Would you make the stronger argument that math and science aren’t based on sound epistemology? (Or that there is no such thing as epistemologically justified ways of knowing?)