Yes, you can create risk by rushing things. But you still have to be fast enough to outrun the creation of UFAI by someone else. So you have to be fast, but not too fast. It’s a balancing act.
If intelligence is the ability to understand concepts, and a super-intelligent AI has a super ability to understand concepts, what would prevent it (as a tool) from answering questions in a way that influences the user and affects outcomes as though it were an agent?
The profound lack of a desire to do so.
Google Maps, when asked for directions to Eddie the Snitch’s hideout, will not reply with “Maybe I know and maybe I don’t. You gonna make it worth my time?” because providing directions is, to it, a reflex action rather than a move in a larger game.
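One way to make that distinction concrete is the sketch below. It is purely illustrative: the class names, the world_model object, and every method called on it are hypothetical stand-ins, not a real API or anyone's proposed design. A tool maps a query straight to its best-guess answer; an agent chooses whichever reply it expects to push the world toward outcomes it prefers.

```python
# Purely illustrative sketch: 'world_model', 'utility', and the methods
# called on them are hypothetical stand-ins, not a real API or anyone's
# proposed design.

class ToolAI:
    """Answering is a reflex: query in, best-guess answer out."""

    def __init__(self, world_model):
        self.world_model = world_model

    def answer(self, query):
        # No consideration of what the answer will cause downstream.
        return self.world_model.most_accurate_answer(query)


class AgentAI:
    """Answering is a move in a larger game."""

    def __init__(self, world_model, utility):
        self.world_model = world_model
        self.utility = utility  # preferences over world states

    def answer(self, query):
        # Chooses the reply whose predicted consequences it likes best,
        # which need not be the most accurate reply.
        candidates = self.world_model.candidate_answers(query)
        return max(candidates,
                   key=lambda a: self.utility(
                       self.world_model.world_if_i_say(a)))
```

On this picture, Google Maps behaves like the first class; the exchange below is about whether a sufficiently capable predictor can avoid sliding into the second.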
There are possible questions where the super-intelligent AI has to make a choice of some sort, because multiple answers can each be correct, depending on which answer is actually given.
For example: Sam, a basketball player, approaches Predictor, a super-intelligent tool AI, before his game and asks the question “Will my team win today’s game?” Predictor knows that if it says ‘yes’, Sam will be confident and play aggressively, and this will lead to a win. If, on the other hand, it answers ‘no’, Sam’s confidence will be shattered and his team will lose comprehensively. Refusing to answer will confuse Sam, distracting him from the task at hand and causing his team to be narrowly defeated. Any answer makes Predictor an agent, and not merely a tool; Predictor doesn’t even need to care about the basketball game.
Absolutely agreed that this sort of situation arises, and that the more I know about the world, the more situations have this character for me. That said, if I’m indifferent to the world-affecting effects of my answers, the result seems very similar to what it would be if I were ignorant of them.
That is, it seems that Predictor looks at that situation, concludes that in order to predict “yes” or “no” it has to first predict whether it will answer “yes” or “no”, and either does so (on what basis, I have no idea) or fails to do so and refuses to answer. Yes, those actions influence the world (as does the very existence of Predictor, and Sam’s knowledge of Predictor’s existence), but I’m not sure I would characterize the resulting behavior as agentlike.
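To make that loop concrete, here is a toy sketch of the self-consistency check just described. Everything in it, including the outcome_if_told helper, is hypothetical and exists only for illustration:

```python
# Toy illustration of the self-consistency check described above.
# 'outcome_if_told' is a hypothetical stand-in for Predictor's model
# of how Sam reacts to hearing each possible answer.

def outcome_if_told(answer):
    # From the basketball example: "yes" makes Sam confident (win),
    # "no" shatters his confidence (loss).
    return {"yes": "win", "no": "lose"}[answer]

def predictor_reply():
    claims = {"yes": "win", "no": "lose"}  # what each answer asserts
    # An answer is self-consistent if giving it makes it come true.
    consistent = [a for a, outcome in claims.items()
                  if outcome_if_told(a) == outcome]
    if not consistent:
        return "refuse to answer"
    # Here *both* answers are self-fulfilling, so choosing between them
    # requires some basis beyond accuracy alone.
    return consistent[0]

print(predictor_reply())  # -> "yes", though "no" would be equally correct
```

In this toy version both answers pass the check, which restates the point of the example: accuracy alone doesn’t determine what Predictor says.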
Then consider: Sam asks a question. Predictor knows that an answer of “yes” will result in the development of Clippy, and subsequently in Earth being turned into paperclips and humanity being destroyed, within the next ten thousand years; while an answer of “no” will result in a wonderful future where everyone is happy, disease is eradicated, and all Good Things happen. In both cases, the prediction will be correct.
If Predictor doesn’t care about that answer, then I would not define Predictor as a Friendly AI.
Absolutely agreed; neither would I. More generally, I don’t think I would consider any Oracle AI as Friendly.
What if you ask Google Interrogation Aid for the best way to get a confession out of Eddie the Snitch, given the constraints of the law and Eddie’s psychographics? What if you ask Google Municipal Planner for the best way to reduce crime? What if you ask Google Operations Assistant for the best way to maximize your paperclip production?
Google Maps has options for walking, public transit, and avoiding major highways; a hypothetical interrogation assistant would have equivalent options for degrees of legal or ethical restraint, including “How do I make sure Eddie only confesses to things he’s actually done?” If Google Operations Assistant says that a few simple modifications to the factory can produce a volume of paperclips that outmasses the Earth, there will be follow-up questions about warehousing and buyers.
Reducing crime is comparatively straightforward: more cops per capita, fewer laws for them to enforce, enough economic opportunity to make sure people don’t get desperate and stupid. The real problem is political, rather than technical, so any proposed solution will have a lot of hoops to jump through.
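As a rough sketch of the “options” analogy above (the helper names and constraint strings are invented for illustration; nothing here corresponds to a real product or API), the restraints could be passed in explicitly, the way “avoid highways” is an option in a route planner:

```python
# Hypothetical sketch: 'candidate_plans', 'violates', and 'score' are
# invented helpers. The point is only that restraints can be explicit,
# user-visible options rather than something the tool must infer.

def best_plan(query, constraints, candidate_plans, violates, score):
    """Return the highest-scoring candidate plan that breaks no stated constraint."""
    allowed = [plan for plan in candidate_plans(query)
               if not any(violates(plan, c) for c in constraints)]
    return max(allowed, key=score) if allowed else None

# Route planning: few constraints, easy to state and easy to check, e.g.
#   best_plan("directions to Eddie's hideout",
#             constraints=["avoid_highways"], ...)
#
# Interrogation aid: the analogous options would look more like
#   best_plan("get a confession from Eddie",
#             constraints=["legal_methods_only",
#                          "confession_must_be_true"], ...)
```

Whether the constraint list can ever be complete enough, and whether a human can tell when it isn’t, is what the reply below takes issue with.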
Yes, all it takes is a little common sense to see that legal and ethical restraint are important considerations during your interview and interrogation of Eddie. However, as the complexity of the problem rises, the solution becomes less tractable for a human reader to evaluate, and the probability that your tool AI has sufficient common sense drops as well.
A route on a map has only a few degrees of freedom, so it’s easy to spot violations of common-sense constraints that weren’t properly programmed in, or to abort the direction-following process when problems spring up. A route to a virally delivered cancer cure has many degrees of freedom: violations of common-sense constraints are harder to spot, and problems may only become evident when it’s too late to abort.
If all it took was “a little common sense” to do interrogations safely and ethically, the Stanford Prison Experiment wouldn’t have turned out the way it did. These are not simple problems!
When a medical expert system spits out a novel plan for cancer treatment, do you think that plan would be less trustworthy, or receive less scrutiny at every stage, than one invented by human experts? If an initial trial results in some statistically significant number of rats erupting into clockwork horror and rampaging through the lab until cleansed by fire, or even just keeling over from seemingly-unrelated kidney failure, do you think the FDA would approve?