Reframing the Problem of AI Progress
“Fascinating! You should definitely look into this. Fortunately, my own research has no chance of producing a super intelligent AGI, so I’ll continue. Good luck son! The government should give you more money.”
Stuart Armstrong paraphrasing a typical AI researcher
I forgot to mention in my last post why “AI risk” might be a bad phrase even to denote the problem of UFAI. It brings to mind analogies like physics catastrophes or astronomical disasters, and lets AI researchers think that their work is ok as long as they have little chance of immediately destroying Earth. But the real problem we face is how to build or become a superintelligence that shares our values, and given that this seems very difficult, any progress that doesn’t contribute to the solution but brings forward the date by which we must solve it (or be stuck with something very suboptimal even if it doesn’t kill us), is bad. The word “risk” connotes a small chance of something bad suddenly happening, but slow steady progress towards losing the future is just as worrisome.
The usual way of stating the problem also invites lots of debate that is largely beside the point (as far as determining how serious the problem is): whether intelligence explosion is possible, whether a superintelligence can have arbitrary goals, or how sure we are that a non-Friendly superintelligence will destroy human civilization. Someone who wants to question the importance of facing this problem really needs to argue instead that a superintelligence isn't possible (not even a modest one), or that the future will turn out to be close to the best possible just by everyone pushing forward their own research without any concern for the big picture, or perhaps that we really don't care very much about the far future and distant strangers and should pursue AI progress just for the immediate benefits.
(This is an expanded version of a previous comment.)
Assume you manage to communicate this idea to the typical AI researcher. What do you expect him to do next? It’s absurd to think that the typical researcher will quit his field and work on strategies for mitigating intelligence explosion or on foundations of value. You might be able to convince him to work on some topic within AI instead of another. However, while some topics seem more likely to advance AI capabilities than others, this is difficult to tell in advance. More perniciously, what the field rewards are demonstrations of impressive capabilities. Researchers who avoid directions that lead to such demos will end up with less prestigious jobs, i.e., jobs where they are less able to influence the top students of the next generation of researchers. This isn’t what the typical AI researcher wants either. So, what’s he to do?
This probably deserves a discussion post of its own, but here are some ideas that I came up with. We can:
persuade more AI researchers to lend credibility to the argument against AI progress, and to support whatever projects we decide upon to try to achieve a positive Singularity
convince the most promising AI researchers (especially promising young researchers) to seek different careers
hire the most promising AI researchers to do research in secret
use the argument on funding agencies and policy makers
publicize the argument enough so that the most promising researchers don’t go into AI in the first place
So … the name is misleading—it’s actually the Singularity Institute against Artificial Intelligence.
See this thread.
The Singularity Institute For or Against Artificial Intelligence Depending on Which Seems to Be a Better Idea Upon Due Consideration.
or for exclusively friendly AI.
You (as a group) need “street cred” to be persuasive. To a typical person you look like a modern day version of a doomsday cult. Publishing recognized AI work would be a good place to start.
Publishing AI work would help increase credibility, but it’s a costly way of doing so since it directly promotes AI progress. At least some mainstream AI researchers already take SIAI seriously. (Evidence: 1 2) So I suggest bringing better arguments to them and convincing them to lend further credibility.
By the way, what counts as "AI progress"? Do you consider statistics and machine learning part of "AI progress"? Is theoretical work okay? What about building self-driving cars or speech recognition software? Where is, as someone here would call it, the Schelling point?
Do you consider stopping “AI progress” important enough to put something on the line besides talking about it?
You raise a very good question. There doesn’t seem to be a natural Schelling point, and actually the argument can be generalized to cover other areas of technological development that wouldn’t ordinarily be considered to fall under AI at all, for example computer hardware. So somebody can always say “Hey, all those other areas are just as dangerous. Why are you picking on me?” I’m not sure what to do about this.
I’m not totally sure what you mean by “put something on the line” but for example I’ve turned down offers to co-author academic papers on UDT and argued against such papers being written/published, even though I’d like to see my name and ideas in print as much as anybody. BTW, realistically I don’t expect to stop AI progress, but just hope to slow it down some.
My understanding of Schelling points is that there are, by definition, no natural Schelling points. You pick an arbitrary point to defend as a strategy against slippery slopes. In Yvain's post he picked an arbitrary percentage, I think 95%.
There is a slippery slope here. Where will you defend?
The issue is that it is a doomsday cult if one is expected to treat an extreme outlier (on doom beliefs), who had never done anything notable beyond being a popular blogger, as the best person to listen to. That is an incredibly unlikely situation for a genuine risk. Bonus cultism points for knowing Bayesian inference but not applying it here. Regardless of how real the AI risk is, and regardless of how truly qualified that one outlier may be, it is an incredibly unlikely world-state in which the best warning about AI risk comes from someone like that. No matter how fucked up the scientific review process is, it is incredibly unlikely that the world's best thinking on AI is someone's first notable contribution.
These are interesting suggestions, but they don’t exactly address the problem I was getting at: leaving a line of retreat for the typical AI researcher who comes to believe that his work likely contributes to harm.
My anecdotal impression is that the number of younger researchers who take arguments for AI risk seriously has grown substantially in recent years, but, apart from spreading the arguments and the option of career change, it is not clear how this knowledge should affect their actions.
If the risk of indifferent AI is to be averted, I expect that a gradual shift in what is considered important work is necessary in the minds of the AI community. The most viable path I see towards such a shift involves giving individual researchers an option to express their change in beliefs in their work—in a way that makes use of their existing skillset and doesn’t kill their careers.
Ok, I had completely missed what you were getting at, and instead interpreted your comment as saying that there’s not much point in coming up with better arguments, since we can’t expect AI researchers to change their behaviors anyway.
This seems like a hard problem, but certainly worth thinking about.
Relinquishment? My estimate of the effectiveness of that hovers around zero. I don't see any reason to think it has any hope of being effective.
Especially not if the pitch is: YOU guys all relinquish the technology—AND LET US DEVELOP IT!!!
That will just smack of complete hypocrisy.
Cosmetically splitting the organisation into the neo-luddite activists and the actual development team might help to mitigate this potential PR problem.
Surely secret progress is the worst kind—most likely to lead to a disruptive and unpleasant outcome for the majority—and to uncaught mistakes.
How do I tell whether a small group doing secret research will be better or worse at saving the world than the global science/military complex? Does anyone have strong arguments either way?
I haven’t heard of any justification for why it might only take “nine people and a brain in a box in a basement”. I think some people are too convinced of the AIXI approximation route and therefore believe that it is just a math problem that only takes some thinking and one or two deep insights.
Every success in AI so far has relied on a huge team: IBM Watson, Siri, Big Dog, the various self-driving cars.
It takes a company like IBM to design such a narrow AI, with more than 100 algorithms. Could it have been done without a lot of computational and intellectual resources?
The basement approach seems ridiculous given the above.
IBM Watson started with a rather small team (2-3 people); IBM started dumping resources on them once they saw serious potential.
I didn’t mean to endorse that. What I was thinking when I wrote “hire the most promising AI researchers to do research in secret” was that if there are any extremely promising AI researchers who are convinced by the argument but don’t want to give up their life’s work, we could hire them to continue in secret just to keep the results away from the public domain. And also to activate suitable contingency plans as needed.
My thoughts on what the main effort should be is still described in Some Thoughts on Singularity Strategies.
Inductive inference is “just a math problem”. That’s the part that models the world—which is what our brain spends most of its time doing. However, it’s probably not “one or two deep insights”. Inductive inference systems seem to be complex and challenging to build.
Everything is a math problem. But that doesn't mean you can build a brain by sitting in your basement and literally thinking it up.
A well-specified math problem, then. By contrast with fusion or space travel.
How is intelligence well specified compared to space travel? We know physics well enough, and we know we want to get from point A to point B. With intelligence, we don't even quite know what exactly we want from it. We know of some ridiculously slow method (towers of exponents), which means precisely nothing.
The claim was: inductive inference is just a math problem. If we knew how to build a good quality, general-purpose stream compressor, the problem would be solved.
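A toy sketch of the compressor-as-predictor idea: rank candidate continuations of a stream by how well each compresses together with the context, on the (Solomonoff-flavored) assumption that shorter combined encodings correspond to higher probability. This uses zlib purely as a crude stand-in for a "good quality, general-purpose stream compressor"; the function names are illustrative, not from any particular proposal.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length of data after zlib compression at the maximum level."""
    return len(zlib.compress(data, 9))

def predict_next(context: str, candidates: list[str]) -> str:
    """Pick the candidate continuation whose concatenation with the
    context compresses best. Shorter combined encoding is treated as
    a crude proxy for higher probability under the compressor's
    implicit model of the stream."""
    return min(
        candidates,
        key=lambda c: compressed_size((context + c).encode()),
    )

# A highly repetitive context: the compressor favors the continuation
# that extends the existing pattern over one that introduces novelty.
context = "abc" * 20
print(predict_next(context, ["abc", "xyz"]))
```

Of course, zlib only captures simple repetition; the claim in the thread is precisely that building a compressor good enough to model the world in general is the hard part.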
A small group doing secret research sounds pretty screwed to me—with its main hope being an acquisition or a merger.
Now that SI has been rebranded into MIRI, I’ve had “figure out new framing for AI risk talk, and concise answers to common questions” on my to-do list for several months, but haven’t gotten to it yet. I would certainly appreciate your help with that, if you’re willing.
Partly, I’m using “Effective Altruism and the End of the World” as a tool for testing out different framings of some of the key issues. I’ll be giving the talk many times, and I’m iterating the talk between each presentation, and taking notes on which questions people ask most frequently, and which framings and explanations seem to get the best response.
Christiano has been testing different framings of things, too, mostly with the upper crust of cognitive ability. Maybe we should have a side-meeting about framing issues when you’re in town for MIRI’s September workshop?
Taped for non-CA folks?
Eventually, once it’s good enough.
This is similar in spirit to my complaint about the focus on intelligence explosion. Your framing, though, requires accepting a consequentialist optimization view of decision making (so that "good enough" is not considered good enough if it's possible to do better), and that there is a significant difference between the better outcomes and the "default" outcomes.
For this, high risk of indifferent UFAI is a strong argument (when accepted), which in turn depends on orthogonality of values and optimization power. So while it’s true that this particular argument doesn’t have to hold for the problem to remain serious, it looks like one of the best available arguments for the seriousness of the problem. (It also has the virtue of describing relatively concrete features of the possible future.)
That said, I agree that there should exist a more abstract argument for ensuring the control of human value over the future, that doesn’t depend on any particular scenario. It’s harder to make this argument convincing, as it seems to depend on acceptance of some decision-theoretic/epistemic/metaethical background. Since it’s not necessary to accept this argument (if you accept other arguments, such as intelligence explosion), for the same reasons as it’s not necessary to accept plausibility of intelligence explosion, it seems to me that the current focus should be on making a good case for the strongest arguments, but once the low-hanging fruit on the other arguments is collected, it might be a good idea to develop this more general case as well.
Yes, agreed. On the other hand it may especially appeal to some AI researchers who seem really taken with the notion of optimality. :)
But it seems too early to conclude “orthogonality” at this point. So if you say “AI is dangerous because values are orthogonal to optimization power”, that may just invite people to dismiss you.
If it’s not immediately obvious to someone that the default outcome is not likely to be close to optimal, I think maybe we should emphasize the Malthusian scenario first. Or maybe use a weaker version of the orthogonality thesis (and not call it “orthogonality” which sounds like claiming full independence). And also emphasize that there are multiple lines of argument and not get stuck debating a particular one.
I’d be curious to know if that’s actually the case.
Right, at least mentioning that there is a more abstract argument that doesn’t depend on particular scenarios could be useful (for example, in Luke’s Facing the Singularity).
The robust part of “orthogonality” seems to be the idea that with most approaches to AGI (including neuromorphic or evolved, with very few exceptions such as WBE, which I wouldn’t call AGI, just faster humans with more dangerous tools for creating an AGI), it’s improbable that we end up with something close to human values, even if we try, and that greater optimization power of a design doesn’t address this issue (while aggravating the consequences, potentially all the way to a fatal intelligence explosion). I don’t think it’s too early to draw this weaker conclusion (and stronger statements seem mostly irrelevant for the argument).
This version is essentially Eliezer’s “complexity and fragility of values”, right? I suggest we keep calling it that, instead of “orthogonality” which again sounds like a too strong claim which makes it less likely for people to consider it seriously.
Basically, but there is a separate point here that greater optimization power doesn’t help with the problem and instead makes it worse. I agree that the word “orthogonality” is somewhat misleading.
David Dalrymple was nice enough to illustrate my concern with “orthogonality” just as we’re talking about it. :)
...which also presented an opportunity to make a consequentialist argument for FAI under the assumption that all AGIs are good.
I still don’t see how you can solve something that you virtually know nothing about. I think it will take real progress towards dangerous AGI before one can make it safe.
I just don't see how someone could have designed provably secure online banking software before the advent of the Internet.
This post should answer your questions. Let me know if it doesn’t.
False dilemma. For example, someone may think that superintelligences cannot arise quickly. Or they may think that improvement of our own intelligence will turn us into effective superintelligences well before we solve the AI problem (because it is just that tricky).
The point is the eventual possibility of an intelligence significantly stronger than that of current humans, with "humans growing up" a special case of that. The latter doesn't resolve the problem, because "growing out of humans" doesn't automatically preserve values; this is a problem that must be solved in any case where vanilla humans are left behind, no matter in what manner or how slowly that happens.
Do you mean that the set of possible objections I gave isn’t complete? If so, I didn’t mean to imply that it was.
And therefore we’re powerless to do anything to prevent the default outcome? What about the Modest Superintelligences post that I linked to?
If someone has a strong intuition to that effect, then I’d ask them to consider how to safely improve our own intelligence.
You could ask Robin Hanson whether people really care about the far future or whether it's just signalling.
“Both” is the answer I would expect. The ‘just’ is misleading.
Upvoted for making me shudder.
This has me thinking: which organisations have the biggest vested interests in slowing progress towards machine intelligence in the English-speaking world?
I figure it has got to be the American and Chinese governments.
The NSA did manage to delay the commercial adoption of cryptography a little, by identifying it as a "weapon". We may see the same kind of strategy used with machine intelligence. We are certainly seeing intelligent machines being identified as weapons, though the source doesn't appear to be the government; then again, probably nobody would believe such propaganda if they thought it came from the government.
Good luck if you want to put the burden of proof on your opponents.
Does it really? I already explained that if someone makes an automated engineering tool, all users of that tool are at least as powerful as some (U)FAI based upon this engineering tool. Adding an independent will to a tank doesn't suddenly let it win a war against a much larger force of tanks with no independent will.
You are rationalizing the position here. If you actually reason forwards, it is clear that the creation of such tools may instead be the life-saver when someone who thought he had solved morality unleashes some horror upon the world. (Or, sometime in the future, hardware gets so good that very simple evolution-simulator-like systems could self-improve to the point of super-intelligence by evolving, albeit that is very far off.)
Suppose I were to convince you of the butterfly effect, and explain that your sneezing could kill people months later. And suppose you couldn't see that not sneezing has the same probability of doing so. You'd be trying real hard not to sneeze, for nothing, avoiding sudden bright lights (if bright light triggers your sneeze reflex), and so on.
The engineering super-intelligences fail to share our values to such a profound extent that they don't even share the desire to 'do something' in the real world. The same goes for the engineering intelligence inside my own skull, as far as I can tell. I build designs in real life because I have rent to pay, or because I am not sure enough a design will work and don't trust the internal simulator I use for design (i.e., imagining) [and that's because my hardware is very flawed]. This is also the case with all my friends who are good engineers.
The issue here is that you conflate things into ‘human level AI’. There’s at least three distinct aspects to AI:
1: Engineering, and other problem solving. This is a creation of designs in abstract design space.
2: Will to do something in real world in real time.
3: Morality.
People here see the first two as inseparable, while seeing the third as unrelated.
Think of the tool and its human user as a single system. As long as the system is limited by the human's intelligence, it will not be as powerful as a system consisting of the same tool driven by a superhuman intelligence. And if the system isn't limited by the human's intelligence, then the tool is making decisions; it is an AI, and we're facing the problem of making it follow the operator's will. (And didn't you mean to say "as powerful as any (U)FAI"?)
In general, it doesn’t make much sense to draw a sharp distinction between tools and wills that use them. How do you draw the line in the case of a self-modifying AI?
Reasoning by cooked anecdote? Why speak of tanks and not, for example, automated biochemistry labs? I can imagine such existing in the future. And one of those could win the war against all the other biochemistry labs in the world and the rest of the biosphere too, if it were driven by a superior intelligence.