Q&A with Stan Franklin on risks from AI
I am emailing experts in order to raise awareness of risks from AI and to estimate how those risks are perceived in academia.
Stan Franklin, Professor, Computer Science
W. Harry Feinstone Interdisciplinary Research Professor
Institute for Intelligent Systems
FedEx Institute of Technology
The University of Memphis
The Interview:
Q: What probability do you assign to the possibility of us being wiped out by badly done AI?
Stan Franklin: On the basis of current evidence, I estimate that probability as being tiny. However, the cost would be so high that the expectation is really difficult to estimate.
Q: What probability do you assign to the possibility of a human-level AI, or a sub-human-level AI, self-modifying its way up to massive superhuman intelligence within a matter of hours or days?
Stan Franklin: Essentially zero in such a time frame. A lengthy developmental period would be required. You might want to investigate the work of the IEEE Technical Committee on Autonomous Mental Development.
Q: Is it important to figure out how to make AI provably friendly to us and our values (non-dangerous), before attempting to solve artificial general intelligence?
Stan Franklin: Proofs occur only in mathematics. Concern about the “friendliness” of AGI agents, or the lack thereof, has been present since the very inception of AGI. The 2006 workshop <http://www.agiri.org/forum/index.php?act=ST&f=21&t=23>, perhaps the first organized event devoted to AGI, included a panel session entitled “How do we more greatly ensure responsible AGI?” Video is available at <http://video.google.com/videoplay?docid=5060147993569028388>. (There’s also a video of my keynote address.) I suspect we’re not close enough to achieving AGI to be overly concerned yet. But that doesn’t mean we shouldn’t think about it. The day may well come.
Q: What is the current level of awareness of possible risks from AI within the artificial intelligence community, relative to the ideal level?
Stan Franklin: I’m not sure about the ideal level. Most AI researchers and practitioners seem to devote little or no thought at all to AGI. Though quite healthy and growing, the AGI movement is still marginal within the AI community. AGI has been supported by AAAI, the central organization of the AI community, and continues to receive such support.
Q: How do risks from AI compare to other existential risks, e.g. advanced nanotechnology?
Stan Franklin: I have no thoughts on this subject. I’ve copied this message to Sonia Miller, who might be able to provide an answer or point you to someone who can.
Q: Furthermore, I would like to ask your permission to publish and discuss your answers, in order to estimate the academic awareness and perception of risks from AI.
Stan Franklin: Feel free, but do warn readers that my responses are strictly half-baked and off-the-top-of-my-head, rather than being well thought out. Given time and inclination to think further about these issues, my responses might change radically. I’m ok with their being used to stimulate discussion, but not as pronouncements.
This is a great idea! These responses are very interesting, and I look forward to reading others. As for the interpretation of the results, this paragraph seems to say it all:
So this is pretty strong evidence that safety issues aren’t being thought about much, at least at the University of Memphis. Any information value from the answers to the rest of the questions is screened off by this disclaimer.
This seems like a good point, and something that’s been kind of bugging me for a while. It seems like “proving” an AI design will be friendly is like proving a system of government won’t lead to the economy going bad. I don’t understand how it’s supposed to be possible.
I can understand how you can prove a hello-world program will print “hello world”, but friendly AI designs are based around heavy interaction WITH the messy outside world: not just saying hello to it, but learning all but its most primitive values from it.
How can we be developing 99% of our utility function by stealing it from the outside world, where we can’t even “prove” that the shop won’t be out of shampoo, and yet simultaneously have a “proof” that this will all work out? Even if we’re not proving “friendliness” per se, but just that the AI has “consistent goals under self-modification”, consistent with WHAT? If you’re not programming in an opinion about abortion and gun control to start with, how can any value it comes to regarding that be “consistent” OR “inconsistent”?
“Friendliness” in full generality may be impossible to define, let alone prove, but there are narrower desirable properties that can be proven. You can prove that it optimizes correctly on special cases that are simpler than the world as a whole; you can prove that it doesn’t have certain classes of security holes; you can prove that it’s resilient against single-bit errors. With a more detailed understanding of metaethics, we might prove that it aggregates values in a way that’s stable in spite of outliers, and that its debug output about the values it’s discovered is accurate. Basically, we should prove as much as we can, even if there are some parts that aren’t amenable to formal proof.
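The point about narrow provable properties can be made concrete with a toy example (my own illustration, not from the comment): for a small enough state space, a safety property like “any single-bit error is detected” can be verified exhaustively, even though nothing so strong can be said about the system’s behavior in the open world.

```python
from itertools import product

def encode(bits):
    """Append an even-parity bit so the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def check(codeword):
    """Return True iff the codeword has even parity (no error detected)."""
    return sum(codeword) % 2 == 0

# Exhaustively verify the narrow property for all 4-bit messages:
# every single-bit flip of a valid codeword is detected.
for bits in product([0, 1], repeat=4):
    word = encode(list(bits))
    assert check(word)  # the untouched codeword passes
    for i in range(len(word)):
        corrupted = word.copy()
        corrupted[i] ^= 1  # flip exactly one bit
        assert not check(corrupted)  # the flip is always caught

print("single-bit error detection verified for all 4-bit messages")
```

The “proof” here is an exhaustive check over a finite space; real verification of larger systems uses model checkers or theorem provers, but the shape of the guarantee is the same: total coverage of a narrow, precisely stated property.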
I’ve been under the impression that “Friendliness proofs” aren’t about proving Friendliness as such. Rather, they’re proofs that whatever is set as the AI’s goal function will always be preserved by the AI as its goal function, no matter how much self-improvement it goes through.
That doesn’t sound impossible. Consider that in the case of a seed AI, the “government” only has to deal with one perfectly rational, game-theoretic textbook agent. The only reason economists fail to predict how certain policies will affect the economy is that their models often have to deal with many unknown or unpredictable factors. In the case of an AI, the policy is applied to the model itself, which is a well-defined mathematical entity.
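The goal-preservation idea above can be sketched as a cartoon (entirely my own illustration; the names and the acceptance test are hypothetical, and real proposals involve proofs about the successor’s code rather than an identity check): the agent adopts a self-modification only if it can verify that the successor keeps the same goal function.

```python
def goal(state):
    """The agent's fixed goal function (a stand-in for whatever is set)."""
    return state

class Agent:
    def __init__(self, goal_fn):
        self.goal_fn = goal_fn

    def consider_successor(self, candidate):
        """Adopt a new version only if its goal is verifiably unchanged.

        Here "verification" is reduced to checking that the candidate
        carries the very same goal function; a real design would need a
        proof about the candidate's code instead.
        """
        if candidate.goal_fn is self.goal_fn:
            return candidate  # goal preserved: accept the modification
        return self  # goal would change: refuse

current = Agent(goal)
faster = Agent(goal)             # same goal, improved elsewhere
corrupted = Agent(lambda s: -s)  # different goal

assert current.consider_successor(faster) is faster
assert current.consider_successor(corrupted) is current
print("goal-preservation check passed")
```

The cartoon shows why the problem is at least well-posed: the “policy” is applied to a fully specified program, not to a messy economy, so in principle the acceptance test can be a mathematical proof rather than an empirical guess.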
Note that the video he linked to does feature Eliezer Yudkowsky; a transcript can be found here.
Asking him is like asking Mach about planes in 1900, or Rutherford about the atomic bomb in 1930.
They should have known, but they were both wrong.
The point about “asking” is to make those people think about risks from AI, without implying that they are wrong or should know better. Who else but me can do this without appearing to be sneaky? I am honestly interested in their answers. And if, for some reason, I give a negative impression, you can always say that I am not associated with the SIAI, critical of LW and don’t understand the important arguments. A mission with plausible deniability (well, it’s more than plausible, it’s true ;-).
It still looks a bit sneaky—if you publicly ’fess up in the comments here!
I hope that’s meant to be an empirical claim and not an argument.