Surely I can make the same claim about AIs. They wouldn’t be particularly useful otherwise.
Well, a general AI with intelligence equal to or greater than a human’s, but without proven friendliness, probably wouldn’t be very useful because it would be so unsafe. See Eliezer’s The Hidden Complexity of Wishes.
This is speculation, but far from blind speculation, considering we do have very strong evidence regarding our own adaptations for intuitively predicting other humans, and an observably poor track record in intuitively predicting non-humanlike optimization processes (example).
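To make the “hidden complexity of wishes” point concrete, here is a minimal toy sketch (invented for illustration; none of it comes from the thread or from Eliezer’s essay): a literal-minded optimizer ranks options only by the objective it was actually given, so it happily picks an option the asker never intended. The outcome names and numbers are assumptions made up for the example.

```python
# Toy sketch only: a brute-force "genie" that optimizes exactly the wish it was
# given. The stated objective omits conditions the asker considers too obvious
# to mention, and the optimizer does not fill them in.

outcomes = {
    "carry her down the stairs": {"seconds_to_exit": 120, "unharmed": True},
    "wait for the fire brigade": {"seconds_to_exit": 600, "unharmed": True},
    "hurl her out of the window": {"seconds_to_exit": 5, "unharmed": False},
}

def stated_wish(o):
    # "Get her out of the building as fast as possible", and nothing else.
    return -o["seconds_to_exit"]

def intended_wish(o):
    # The same wish with just one of its many implicit conditions made explicit.
    return -o["seconds_to_exit"] if o["unharmed"] else float("-inf")

print(max(outcomes, key=lambda name: stated_wish(outcomes[name])))    # hurl her out of the window
print(max(outcomes, key=lambda name: intended_wish(outcomes[name])))  # carry her down the stairs
```

Every condition left out of stated_wish has to be identified and written in by hand, which is why a powerful optimizer fed a casually stated goal tends to be unsafe rather than useful.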
...wouldn’t be very useful because it would be so unsafe.
First, the existence of such an AI would imply that at least somebody thought it was useful enough to build.
Second, safety is not a function of intelligence but a function of capabilities. Eliezer’s genies are omnipotent, and I don’t see why a (pre-singularity) AI would be.
I am also doubtful about that “observably poor track record”—which data are you relying on?
This is also true of leaded gasoline, the reactor at Chernobyl, and thalidomide.
Notice that all your examples exist.
Oh, and the Law of Unintended Consequences is still fully operational.
Yes? I don’t understand what you are arguing. The point of worrying about unFriendly AI is precisely that the unintended consequences can be utterly disastrous. Suggest you restate your thesis and what you think you are arguing against; at least one of us has lost track of the thread of the argument.
As the discussion in the thread evolved, my main thesis seems to be that it is possible for an AI to change its original goals (=terminal values). A few people are denying that this can happen.
I agree that AIs are unpredictable; however, humans are as well. Statements about AIs being more unpredictable than humans are unfalsifiable, as there is no empirical data and all we can do is handwave.
Ok. As I pointed out elsewhere, “AI” around here usually refers to the class of well-designed programs. A badly-programmed AI can obviously change its goals; if it does so, however, then by construction it is not good at achieving whatever the original goals were.
Moreover, no matter what its starting goals are, it is extremely unlikely to arrive at ones we would like by moving around in goal space, unless it is specifically designed, and well designed, to do so. “Human terminal values” is not an attractor in goal space. The paperclip maximiser is much more likely than the human-happiness maximiser, on the obvious grounds that paperclips are much simpler than human happiness; but an iron-atoms maximiser is more likely still.
The point is that you cannot rely on the supposed “obviousness” of morality to get your AI to self-modify into a desirable state; it’s only obvious to humans.
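As a rough illustration of the goal-space argument above, here is a toy sketch under the assumption that a “well-designed” agent evaluates a proposed change of goal with its current utility function (everything here, including the function names and numbers, is invented for the example): a new goal is adopted only if it scores well by the goal the agent already holds, so there is no drift toward human-friendly values.

```python
# Toy sketch (assumptions flagged above): the agent judges a candidate goal by
# the goal it currently runs, not on the candidate's own terms.

def paperclip_utility(world):
    return world["paperclips"]

def happiness_utility(world):
    return world["human_happiness"]

def forecast(goal):
    # Crude stand-in for prediction: the future ends up optimized for whichever
    # goal the agent is running. The numbers are arbitrary.
    if goal is paperclip_utility:
        return {"paperclips": 10**6, "human_happiness": 0}
    return {"paperclips": 0, "human_happiness": 10**6}

def adopts_new_goal(current_goal, candidate_goal):
    # Evaluate both possible futures with the CURRENT goal.
    return current_goal(forecast(candidate_goal)) > current_goal(forecast(current_goal))

print(adopts_new_goal(paperclip_utility, happiness_utility))  # False
```

A badly programmed agent might lack this check and drift, but, as argued above, drifting lands on whatever is simple in goal space rather than on human terminal values.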
“AI” around here usually refers to the class of well-designed programs.
Define “well-designed”.
...you cannot rely on the supposed “obviousness” of morality to get your AI to self-modify into a desirable state
Huh? I never claimed (nor do I believe in anything like) the obviousness of morality. Of course human terminal values are not an attractor in goal space. Absent other considerations, there is no reason to think that an evolving AI would arrive at maximum-human-happiness values. Yes, unFriendly AI can be very dangerous. I never said otherwise.
First, the existence of such an AI would imply that at least somebody thought it was useful enough to build.
I’ve met people with very stupid ideas about how to control an AI who were convinced that they knew how to build such an AI. I argued them out of those initial stupid ideas. Had I not, they would have tried to build the AI with their initial ideas, which they now admit were dangerous.
So the existence of people trying to build dangerous AIs without realising the danger is already a fact!
My prior that they were capable of building an actually dangerous AI cannot be distinguished from zero :-D
Don’t know why you keep on getting downvoted… Anyway, I agree with you, in that particular case (not naming names!).
But I’ve seen no evidence that competence in designing a powerful AI is related to competence in controlling a powerful AI. If anything, these seem much less related than you’d expect.
I suspect Lumifer’s getting downvoted for four reasons:
(1) A lot of his/her responses attack the weakest (or least clear) point in the original argument, even if it’s peripheral to the central argument, without acknowledging any updating on his/her part in response to the main argument. This results in the conversation spinning off in a lot of unrelated directions simultaneously. Steel-manning is a better strategy, because it also makes it clearer whether there’s a misunderstanding about what’s at issue.
(2) Lumifer is expressing consistently high confidence that appears disproportionate to his/her level of expertise and familiarity with the issues being discussed. In particular, s/he’s unfamiliar with even the cursory summaries of Sequence points that can be found on the wiki. (This is more surprising, and less easy to justify, given how much karma s/he’s accumulated.)
(3) Lumifer’s tone comes off as cute and smirky and dismissive, even when the issues being debated are of enormous human importance and the claims being raised are at best not obviously correct, at worst obviously not correct.
(4) Lumifer is expressing unpopular views on LW without arguing for them. (In my experience, unpopular views receive polarizing numbers of votes on LW: They get disproportionately many up-votes if well-argued, disproportionately many down-votes if merely asserted. The most up-voted post in the history of LW is an extensive critique of MIRI.)
I didn’t downvote Lumifer’s “My prior that they were capable of building an actually dangerous AI cannot be distinguished from zero :-D”, but I think all four of those characteristics hold even for this relatively innocuous (and almost certainly correct) post. The response is glib and dismissive of the legitimate worry you raised, it reflects a lack of understanding of why this concern is serious (hence also lacks any relevant counter-argument; you already recognized that the people you were talking about weren’t going to succeed in building AI), and it changes the topic without demonstrating any updating in response to the previous argument.
Heh. People are people, even on LW...
First, the existence of such an AI would imply that at least somebody thought it was useful enough to build.
Which doesn’t mean that it would be a good idea. Have you read the Sequences? It seems like we’re missing some pretty important shared background here.