However, this is exactly the issue I’m trying to discuss. It looks as though, if we take the threat of uncaring AI seriously, this is a real problem and it demands a real solution. The only solution that I can see is morally abhorrent, and I’m trying to open a discussion looking for a better one. Any suggestions on how to do this would be appreciated.
As I understand it from reading the sequences, Eliezer’s position roughly boils down to “most AI researchers are dilettantes and no danger to anyone at the moment. Anyone capable of solving the problems in AI at the moment will have to be bright enough, and gain enough insights from their work, that they’ll probably have to solve Friendliness as part of it—or at least be competent enough that if SIAI shout loud enough about Friendliness they’ll listen. The problem comes if Friendliness isn’t solved before the point where it becomes possible to build an AI without any special insight, just by throwing computing power at it along with a load of out-of-the-box software and getting ‘lucky’.”
In other words, if you’re convinced by the argument that Friendly AI is the most important problem facing us, the thing to do is work on Friendly AI rather than prevent other people working on unFriendly AI. Find an area of the problem no-one else is working on, and do that. That might sound hard, but it’s infinitely more productive than finding the baddies and shooting at them.
Anyone smart enough to be dangerous is smart enough to be safe? I’m skeptical: folksy wisdom tells me that being smart doesn’t protect you from being stupid.
But in general, yes: the threat becomes more and more tangible as the barrier to AI gets lower and the number of players increases. At the moment, it seems pretty intangible, but I haven’t actually gone out and counted dangerously smart AI researchers, so I might be surprised by how many there are.
To be clear, I was NOT trying to imply that we should form the Turing Police right now.
As I understand it, the argument (roughly) is that if you build an AI from scratch, using just tools available now, you will have to specify its utility function, in a way that the program can understand, as part of that process. Anyone actually trying to work out a utility function that can be programmed would have to have a fairly deep understanding—you can’t just type “make nice things happen and no bad things”, but have to think in terms that can be converted into C or Perl or whatever. In doing so, you would have to have some kind of understanding in your own head of what you’re telling the computer to do, and would be likely to avoid at least the most obvious failure modes.
However, in (say) twenty years that might not be the case—it might be (as an example) that we have natural language processing programs that can take a sentence like ‘make people happy’ and have some form of ‘understanding’ of it, while still not being Turing-test-passing, self-modification-capable fully general AIs. It could then get to the stage that some half-clever person could think “Hmm… If I put this and this and this together, I’ll have a self-modifying AI. And then I’ll just tell it to make everyone smile. What could go wrong?”
In any case, non-abhorrent solutions include “work on FAI” and “talk to AGI researchers, some of whom will listen (especially if you don’t start off with how we’re all going to die unless they repent, even though that’s the natural first thought)”.
Edited, in the interest of caution.
It’s already been linked to a couple times under this post, but: have you read http://lesswrong.com/lw/v1/ethical_injunctions/ and the posts it links to?