[Template] Questions regarding possible risks from artificial intelligence
I am emailing experts in order to gauge and raise academic awareness and perception of risks from AI. Below are some questions I am going to ask. Please help refine the questions or suggest new and better ones.
(Thanks go to paulfchristiano, Steve Rayhawk and Manfred.)
Q1: Assuming beneficially political and economic development and that no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of artificial intelligence that is roughly as good as humans at science, mathematics, engineering and programming?
Q2: Once we build AI that is roughly as good as humans at science, mathematics, engineering and programming, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better at those activities than humans?
Q3: Do you ever expect artificial intelligence to overwhelmingly outperform humans at typical academic research, in the way that they may soon overwhelmingly outperform humans at trivia contests, or do you expect that humans will always play an important role in scientific progress?
Q4: What probability do you assign to the possibility that an AI with initially (professional) human-level competence at general reasoning (including science, mathematics, engineering and programming) could self-modify its way up to vastly superhuman capabilities within a matter of hours/days/< 5 years?
Q5: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at general reasoning (including science, mathematics, engineering and programming) to undergo radical self-modification?
Q6: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?
My preferred rewrite, without spending too much time on it:
Q1a: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of human-level machine intelligence? Feel free to answer ‘never’ if you believe such a milestone will never be reached. [reason: this matches question #1 of FHI’s machine intelligence survey.]
Q1b: Once we build AIs that are as skilled at technology design and general reasoning as humans are, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better than humans at technology design and general engineering?
Q2a: Do you ever expect AIs to overwhelmingly outperform humans at typical academic research, in the way that they may soon overwhelmingly outperform humans at trivia contests, or do you expect that humans will always play an important role in scientific progress?
Q2b: [delete to make questions list less dauntingly long, and increase response rate]
Q2c: What probability do you assign to the possibility that an AI with initially (professional) human-level competence at technology design and general reasoning could use its capacities to self-modify its way up to vastly superhuman general capabilities within a matter of hours/days/< 5 years? (‘Self-modification’ may include the first AI creating improved child AIs, which create further-improved child AIs, etc.)
Q3a: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at technology design and general reasoning to undergo radical self-modification?
Q3b: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?
Q3c: [delete to reduce length of questions list]
Q4: [delete to reduce length of questions list]
Q5: [delete to reduce length of questions list; AI experts are unlikely to be experts on other x-risks]
Q6: [delete to reduce length of questions list; I haven’t seen, and don’t anticipate, useful answers here]
Q7: [delete to reduce length of questions list]
I endorse the “question deletion” idea.
Are these two expressions supposed (or assumed) to be equivalent?
I updated the original post. Maybe we could agree on those questions. Be back tomorrow.
I stand by my preferred rewrites above, but of course it’s up to you.
I agree with deleting Q5 and Q6: not only would I not expect useful responses, but it may also come off as “extremist” if any respondents are not already familiar with UFAI concepts (or if they are familiar and overtly dismissive of them).
You get a lot of “human level—WTF” comments.
To avoid those, perhaps you could say what you actually mean:
More than “100” on IQ tests, pass the Turing test—or whatever.
IQ tests seem to be tests of how well humans can do things that you would already expect a computer to be better at! The difficult part seems to be parsing the question and translating it from natural language into a format the computer can tackle. No mean feat but not one requiring general intelligence! I’m not entirely sure it would be a more difficult task than having an everyday conversation at the level of a 70 IQ human. (This isn’t to say that ‘pass for human’ is at all equivalent to ‘human level’ either.)
“About as good as an average intelligence but well trained human is at doing scientific research” seems to be approximately what ‘human level’ is getting at.
Maybe. Machines can outperform humans in some parts of IQ tests.
...but they don’t get good scores overall yet.
An IQ 100 machine would be quite something. An IQ 150 machine would be even more interesting.
What I would put as question 1 (with three parts):
(a) What does the (concept/phrase) of “human-level machine intelligence” mean to you? (b) What forms of machine intelligence are you most optimistic about? (c) What forms do you think could be the most dangerous?
Rationale: it seems to me that the most useful part of Nilsson’s response was his alternate definition of human-level intelligence. Moving AI experts from the ridiculous mode of “what probability do you place on Terminator occurring?” to the serious mode of “what could go wrong with a potential design?” both signals your seriousness as a thinker and primes them to take AI risks seriously, since they came up with the doomsday scenario. It also seems like getting a sense of what direction AI experts think AI will take is useful: if experts are optimistic about machine intelligence hardware design, then FOOMing is more likely. (It might even be useful to ask about areas they’re pessimistic about, since that’s a different question than danger, but four questions for the first question seems like a bit much.)
Drawback: what you’re interested in is cross-domain optimization competence. If people give you numbers based on when machine intelligence will be able to pass a Turing test, those numbers will be mostly useless. Even the numbers Nilsson gave for his ‘employable AI’ are difficult to compare to numbers other people are giving. But it seems to me that knowing better what they mean is more important than easy comparisons.
Overall, I feel a bit better about lukeprog’s rewrite than I do about the original. I do think at least one question about AI risk countermeasures should be preserved; probably something like this:
Q4. What is the ideal level of awareness and funding for AI risk reduction?
Possibly with a clarification that they can either give a dollar number or just compare it to some other cause (like a particular variety of cancer, other existential risks, etc.).
The whole of question 3 seems problematic to me.
Concerning parts (a) and (b), I doubt that researchers will know what you have in mind by “provably friendly.” For that matter, I myself don’t know what you have in mind by “provably friendly,” despite having read a number of relevant posts on Less Wrong.
Concerning part (c), I doubt that experts are thinking in terms of money needed to possibly mitigate AI risks at all; presumably, in most cases, if they saw this as a high-priority and tractable issue, they would have written about it already.
Not only that, 3b seems to presuppose that the only dangerous AI is a recursively self-improving one.
Q6 is confusing. Are you asking for P(human extinction by AI that is capable of self-modification and not provably non-dangerous), P(human extinction by AI | AI capable of self-modification and not provably non-dangerous is created), or P(human extinction by first such AI | AI capable of self-modification and not provably non-dangerous is created)?
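For what it’s worth, the three readings really do come apart numerically. A toy sketch, where every probability is a purely made-up illustration rather than anyone’s estimate:

```python
# All numbers are hypothetical, chosen only to show the three readings of Q6 differ.
p_ai_built = 0.8            # P(an AI capable of self-modification, not provably
                            #   non-dangerous, is ever created)
p_ext_given_ai = 0.3        # P(human extinction by some such AI | one is created)
p_ext_first_given_ai = 0.1  # P(extinction by the *first* such AI | one is created)

# Unconditional reading: the joint probability of creation AND extinction.
p_joint = p_ai_built * p_ext_given_ai

assert p_joint < p_ext_given_ai               # joint vs. conditional readings differ
assert p_ext_first_given_ai < p_ext_given_ai  # "first such AI" is a stricter event
print(round(p_joint, 2))  # 0.24
```

A respondent who has the joint reading in mind can thus give a much lower number than one answering the conditional question, even if their underlying beliefs are identical.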
I think you should replace “superhuman AI” with something like “a singular AI capable of having a catastrophic global impact.” Anything that isn’t sourced from nerd culture, basically. I also preferred “provably non-dangerous” to “provably friendly.”
Q2 is too nebulous. What do you mean by “how much more difficult”? How do you measure “difficulty”?
Q5 glosses over the main problem: we don’t know what “our values” even are. There’s wide disagreement on this topic among practically all communities.
Q6 is not entirely clear on whether you’re asking for cumulative probability, or a single random variable. You also do not define what “extinction” is.
I am not sure whether anyone thinks that is true. If you look at the claims by E. Yudkowsky they typically say something like:
Yudkowsky appears to be hedging his bets on when this is going to happen—by saying: “at some point”. There’s not much sign of anything like: “initially (professional) human-level competence”.
Does anyone believe such a thing will happen then? At first glance the claim makes little sense: we already know how fast progress goes with agents of “professional human-level competence” knocking around—and it just isn’t that fast.
Q1: I think ‘beneficially’ should be ‘beneficial’.
You might want to include some context, especially about why you think AIs might pose a threat. Yes, there are reasons to not do this but some people seem to not have considered the issue at all, or immediately jump to sci-fi tropes involving robots with human-like desires for power/revenge/...
The main concern I would have here is emailing a busy expert to say “here’s a bunch of background material you may or may not be familiar with, please read it then answer some questions” seems like a poor way to get responses.
2018/2025/2045
Not difficult at all. It follows nearly automatically.
Science is too important to be left to humans. Those systems will outperform humans, of course. By a LARGE margin.
Not to a great extent. Can be done from scratch if need be.
10%/40%/99%
There will be no time to prove it mathematically in advance.
99% that humans will be wiped out. We may survive as non-humans − 50%.
Billion, maybe.
I am glad that there is no mass hysteria about it.
Yes.
Several.
Quite a bit.
As a FOOM skeptic, can I ask you to show your reasoning a little more? Thinking faster is great, but there’s a lower bound on the time it takes to solve certain types of hard problems.
Wait, what? At the very least, consider the implications of the chronophone.
Off-topic
This intuition may be wrong, but if I thought there was a 50% chance of GAI (human level) by 2025, I’d estimate a 10% chance essentially now (2012-2014). I guess this shows that our estimated shape of the probability distribution (what we think sigma is) is very different. Interesting.
The next FOOM will be only the faster phase of the already functioning one. The one from the primordial Earth to now. Or the one from the Big Bang to now.
Nothing new, except the speed.
Respectfully, “like the Big Bang, only faster” does nothing to answer my question. I’m hardly committed to believing AI will go FOOM based on my belief in the Big Bang. Likewise with my belief in the evolution of life on Earth.
Not “like the Big Bang, only faster”, but “like from the Big Bang to today, only faster” or “like from the Roman Empire to today, only faster”. Or “like from the first cell to an ape, only faster”.
How fast a transformation goes is a matter of degree within what physics allows. But if something “evolves” very fast, you can call it FOOM more easily. Only that.
Now, what makes me think that some “intelligent” program could change its hardware as well? And fast?
Because there is no real dichotomy here. Every bit has its physical imprint, and every calculation is also a physical process. Nothing forbids quite a large influence on the surrounding matter, and a positive feedback loop.
Does it?
Yes, there are things that forbid this. Typically when we design a CPU, one of the design requirements is that no sequence of instructions can alter the hardware in irreversible ways. A reset should really put it back to a consistent state. Yes, it’s possible that the hardware has the potential for unexpected alteration from software, but I wouldn’t bet on that as a magic capability without real evidence. It takes a lot of energy to alter silicon and digital logic circuits just don’t have that kind of power.
So, given a correctly-designed CPU, any positive-feedback loop here has to go off-chip, which usually means “through humans”. And humans are slow and error-prone, so that imposes a lot of lag in the feedback loop.
I believe that a human-machine system will steadily improve over time. But it doesn’t seem, based on past experience, that there’s unlimited positive feedback. We’ve hit limits in hardware performance, despite using sophisticated machines and algorithms for design. We’ve hit limits in software performance—some problems really are intractable and others are undecidable.
So where’s the evidence that a single software program can improve its capabilities in an uncontrolled fashion, much more quickly than the surrounding society?
Just to make sure I understand you: if A is a program that has full access to its source code and the specifications of the hardware it’s running on, and A designs a new machine infrastructure and applies pressure to the world (e.g., through money or blackmail or whatever works) to induce humans to build an instance of that machine, B, such that B allows software-mediated hardware modification (for example, by having an automated chip-manufacturing plant attached to it), you would say that B is an “incorrectly-designed” CPU that might allow for a positive feedback loop.
Is that right?
Put a different way: this argument assumes that the existence of intelligent software doesn’t alter our predictions that CPUs will all be “correctly designed.” That might be true, or might not be.
No, this is not a case of an incorrectly designed CPU. This is a case where there’s a human in the loop and where the process of evolution will therefore be slow. It’s not a FOOM if it takes years between improvements, during which time the rest of the world is also improving.
We are very far from having a wholly-automated CPU-builder-plus-machine-assembly-and-install system. This is not a process that I expect a mildly-superhuman intelligence to be able to short-circuit.
Ah, OK.
Agreed that IF it turns out that existing hardware is incapable of supporting software capable of designing a wholly automated chip factory, THEN humans are a necessary part of the self-improvement cycle for as many iterations as it takes to come up with hardware that is capable of that (plus one final iteration).
I’m not as confident of that premise as you sound, but it’s certainly possible.
Existing hardware might be capable of supporting software capable of designing an automated chip factory. But the assumption required for the FOOM scenario is much stronger than that.
To get an automated self-improving system, it’s not enough to design—you have to actually build. And the necessary factory has to build a lot more than chips. I’m certain that existing hardware attached to general purpose computers is insufficient to build much of anything. And the sort of robotic actuators required to build a wholly automated factory are pretty far from what’s available today. There’s really a lot of manufacturing required to get from clever software to a flexible robotic factory.
I am skeptical that these steps can be done all that quickly or that a merely superhuman AI won’t make costly mistakes along the way. There are lots and lots of details to get right and the AI won’t typically have access to all the relevant facts.
At least you need to build eventually. That’s after you’ve harvested what resources you can from the internet. Which is a lot. I.e., all the early iterations would probably just be software improvements. Hardware improvements can wait until the self-improving system is already smart enough to make such tasks simple.
How do you know how much scope there is for software-only optimization? If I understand right, you are assuming that an AGI is able to reliably write the code for a much more capable AGI.
I’m sure this isn’t true in general. At some point you max out the hardware. Before you get to that point, I’d expect the amount of cleverness needed to find more improvements exceeds the ability of the machine. Intractable problems stay intractable no matter how smart you are.
Just how much room do you think there is for iterative software-only reengineering of an AGI, and why?
Not every program, of course not. But one complex enough that it can search through the space of all possibilities fast enough to find a hole, if there is one.
Nobody thought that in chess a king with two knights is doomed against a king with two bishops. The most brilliant human minds never suspected it. Then a simple software program found this hole in FIDE’s rule of “50 moves without check”. The million or so best human minds hadn’t. People are able to explore only a small part of the solution space.
I’m trying to find a reference for that but I can’t find any mention of that endgame. Do you have a reference or maybe another detail which could narrow the google search down?
Isn’t the 50 move rule “50 moves without a pawn moved or a piece captured”? Just requiring a check wouldn’t (always) prevent the problem the rule is trying to prevent.
here
Quote:
It is almost all I can find online. But I will keep trying.
The “50-move rule” has changed several times.
This doesn’t seem to mention two knights vs two bishops. Is that specifically something you recall seeing elsewhere?
I read this about 25 years ago in a magazine.
But do try this:
and this
Google? Yes, I tried that. I found no confirmation. I still haven’t found said confirmation. I now doubt the claim.
Who do you think proved this? A human? Do you have a supporting link?
Do you think it isn’t proven?
If there was such a proof it would have been found by a computer.
I initially just believed you and wanted to find out more. But it turns out there isn’t any mention of it in the places where I expected it to be mentioned. A winning endgame between a combination so similar in material would almost certainly be mentioned if it existed. Absence of evidence (that should exist) is evidence of absence! Perhaps there was another similar result in the magazine?
The most interesting endgame I found in my searching was two knights vs king and pawn, which is (depending on the pawn) a win. This is in contrast to the two knights vs the lone king, which is an easy draw. On a related (better-to-be-worse) note, there was a high-ranked game in which a player underpromoted (pawns to knights) twice in one game, and in each case the underpromotion was the unambiguously correct play.
Here
Somebody recalls a slightly different version than I do.
I second wedrifid’s request, please provide a link to the two knights against two bishops problem. It sounds interesting. Also, it’s indeed not “50 moves without check” but rather “50 moves without a capture or a pawn move”.
Sure. Machines are good at systematically checking cases and at combinatorial optimization, once the state space is set up properly. But this isn’t a good model for general-purpose intelligence. In fact, this sort of systematic checking is precisely why I think we can build correct hardware.
The way systematic verification works is that designers write a specification and then run moderately-complex programs to check that the design meets the spec. Model-checking software or hardware doesn’t require a general-purpose intelligence. It requires good algorithms and plenty of horsepower, but nothing remotely self-modifying or even particularly adaptive.
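As a minimal sketch of that style of verification (the transition system and the invariant here are invented purely for illustration), checking that a design satisfies a spec is just exhaustive enumeration of reachable states with a worklist and a visited set; nothing adaptive or self-modifying is involved:

```python
from collections import deque

# A tiny hypothetical transition system: states are integers 0..7 (a 3-bit
# register); each step either increments modulo 8 or resets to 0.
def successors(state):
    return {(state + 1) % 8, 0}

def check_invariant(initial, invariant):
    """Breadth-first exploration of every reachable state;
    returns True iff the invariant holds in all of them."""
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return False  # counterexample state found
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True

# The "spec": the register never leaves the range 0..7.
print(check_invariant(0, lambda s: 0 <= s < 8))  # True
# A stricter, false spec is refuted as soon as state 4 is reached.
print(check_invariant(0, lambda s: s < 4))       # False
```

Real model checkers add clever state-space reductions and symbolic representations, but the underlying loop is this kind of brute-force case checking.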
Yes, that’s basically what going FOOM means. Why do you think it will happen?
Well, that’s not true. Many computational problems have well-understood upper limits on how fast they can be solved. If you make those problems sufficiently large, they are just as intractable to a fast computer as to a smart human. You seem to think that “sufficiently large” is not a likely size of problems we will want to solve in the future. Why do you think that?
It means that maybe a self-optimizing program will first only recompile itself more optimally. Then it will make itself parallel. Then it will find a way to level the voltage. Then it will find undocumented ops. Then it will harness some quantum effects in the processor or in RAM or elsewhere to get a boost. Then it will outsource itself to the neighboring devices. Then it will make some small changes on the “quantum level”.
Soon we will call it—a FOOMer.
On given hardware. Another reason it may want to FOOM a little.
Again, this is not what I mean.
Please note that I’m asking WHY you think your assertions are true.
I thought it was clear. A program whose only goal is to improve itself as much as possible CAN, when advanced enough, influence its hardware. I don’t know exactly what the best way to do it would be, but I imagine that some tinkering with the electrical currents inside the CPU might alter it in a nondestructive way as well.
The “well understood upper limit” on calculating pi will wait for improved hardware. Improved using the whole Earth, for example.
Search lesswrong.com and Yudkowsky on this; it is one of the few things on which I agree with them.
XiXiDu posted these questions for the purpose of getting feedback on how to revise them.
But since you answered the questions: Are you an AI expert? What is your full name? Is your CV available online?
I don’t want to be regarded as an “AI expert”. Especially not one of those who are interviewed by XiXiDu.
I just read and post here, from time to time.
Still, you can follow the link in my profile and judge for yourself what we do.