[Link] If we knew about all the ways an Intelligence Explosion could go wrong, would we be able to avoid them?
I submitted this a while back to the lesswrong subreddit, but it occurs to me now that most LWers probably don’t actually check the sub. So here it is again in case anyone who’s interested didn’t see it.
I would have preferred that you just copy and paste it into LW. Why did you post a link?
As a sidenote, why is there a lesswrong subreddit?
Because some guy on reddit (now inactive) made the subreddit 4 years ago.
Edit: I’m serious, that’s actually why it exists. u/therationalfuturist made the subreddit, hasn’t posted in 4 years, and was apparently a student at Yale. Most of the posts made from the account were quotes of things people said on LW (4 years ago).
Because I don’t yet have enough karma to post here.
Answer: clearly, no. If you know all the ways things can go wrong, but don’t know how to make them go right, then your knowledge is useless for anything except worrying.
Thanks for the comment. I will reply as follows:
Knowing how things could go wrong gives useful knowledge about scenarios/pathways to avoid
Our knowledge of how to make things go right is not zero
My intention with the article is to draw attention to some broader non-technical difficulties in implementing FAI. One worrying theme in the responses I’ve gotten is a conflation between knowledge of AGI risk and building an FAI. I think they are separate projects, and that success of the second relies on comprehensive prior knowledge of the first. Apparently MIRI’s approach doesn’t really acknowledge the two as separate.
May I recommend the concept of risk management to you? It’s very useful.
It’s generally easier to gain the knowledge of how to make things go right when your research is anchored by potential problems.
Yes.
Eliezer has expressed that ultimately, the goal of MIRI is not just to research how to make FAI, but to be the ones to make it.
In many ways it’s a race. While the public is squabbling, someone is going to build the first recursively self-improving system. We’re trying to maneuver the situation so that the people who do it first are the people who know what they’re doing.
Hmm... I wasn’t aware of that. Is there any source for that statement? Is MIRI actually doing any general AI research? I don’t think you can easily jump from one specific field of AI research (ethics) to general AI research and design.
From here.
Also,
I get the sense that Eliezer wants to be one of the nine people in that basement, if he can be, but I might be stretching the evidence a little to say “Eliezer has expressed that ultimately, the goal of MIRI is not just to research how to make FAI, but to be the ones to make it.”
Thanks! Haven’t seen that before. I still think it would be better to specialize in the ethics issue and then apply its results to an AGI system developed by another (hopefully friendly) party. But it would be awesome if someone who is genuinely ethical develops AGI first. I’m really hoping that some big org that has gone furthest in AI research, like Google, decides to cooperate with MIRI on that issue when it reaches the critical point in AGI buildup.
This is something that I think is neglected (in part because it’s not the relevant problem yet) in thinking about friendly AI. Even if we had solved all of the problems of stable goal systems, there could still be trouble, depending on whose goals are implemented. If it’s a fast take-off, whoever cracks recursive self-improvement first basically gets Godlike powers (in the form of a genie that reshapes the world according to their wish). They define the whole future of the expanding visible universe. There are a lot of institutions that I do not trust to have the foresight to think “We can create utopia beyond anyone’s wildest dreams” rather than defaulting to “We’ll skewer the competition in the next quarter.”
However, there are unsubstantiated rumors that Google has taken on some ex-MIRI people to work on a project of some kind.
I’m pretty new to this, although I’ve read Kurzweil’s book and Bostrom’s Superintelligence and have a couple of years’ worth of mostly lurking on LW, so if there’s a shitload of existing thinking about this I hope to be corrected civilly.
If friendly AI is to be not just a substitute for but our guardian against unfriendly AI, won’t we end up thinking of all sorts of unfriendly AI tactics, and putting them into the friendly AI so it can anticipate and thwart them? If so, is there any chance of self-modification in the friendly AI turning all that against us? Ultimately, we’d count on the friendly AI itself trying to imagine and develop countermeasures against unfriendly AI tactics that are beyond our imagination, but then maybe the same problem arises.
I’ve been pondering for some time, especially prompted by the book Boyd: The Fighter Pilot Who Changed the Art of War by Robert Coram, how one might distinguish qualities of possible knowledge that make it more or less likely to be of general benefit to humankind. Conflict knowledge seems to have a general problem. It is often developed under the optimistic assumption that it will give “us,” who are well-intentioned, the ability to make everybody else behave; or, what is close to the same thing, it is developed under existential threat such that it is difficult to think a few years out: we need it or the evil ones will annihilate us. Hence the US and the atom bomb from 1945-49. Note that this kind of situation also motivates some (who have anticipated where I’m going) to insist, “We have this advantage today; we probably won’t have it a few years from now; let’s maximize our advantage while we have it” (i.e. bomb the hell out of the USSR in 1946).
Another kind of knowledge might be called value-added knowledge: knowledge that disproves assumptions about economics being a zero-sum game. Better agriculture, house construction, health measures… One can always come up with counterexamples, and some are quite non-trivial: the Internet facilitates the formation of terrorist groups and other “echo chambers” of people with destructive or somehow non-benign belief systems. Maybe media development does indeed fall in some middle ground between value-added and conflict-oriented knowledge. Almost anything that can be considered “beneficial to humankind” might just advantage one supremely evil person, but I still think we can meaningfully speak of its general tendency to be beneficial, while the tendency of conflict knowledge seems, in the long run, to be neutral at best.
Boyd, while developing a radically new philosophy of war-fighting, got few rewards in the way of promotion and was always embattled within the military establishment. But he collected around him a few strong acolytes, and he really did (if inadequately) affect the design of fighter planes and their tactics. His thought grew more and more ambitious until it embraced the art of war generally, and Coram strongly suggests he was at the side of the planners of the first Gulf War and had a huge impact on how it was waged.
Unfortunately, as the book was being written several years later, there was speculation that groups like Al Qaeda had incorporated some of the lessons of the new warfare doctrines developed by the US.
It is generally problematic to predict where knowledge construction is going, because by definition we are making predictions about things whose nature we don’t understand, since they haven’t been thought up yet. Yet it seems we had better try, and Moore’s law gives one bit of encouragement. MIRI seems to be, in part, a huge exercise in this problematic sort of thinking.
If I have anything to offer beyond “food for thought,” it may be the suggestion to look for general tendencies (perhaps unprovable tendencies, like Moore’s law) in the way different kinds of knowledge affect conflict.