Will the world’s elites navigate the creation of AI just fine?
One open question in AI risk strategy is: Can we trust the world’s elite decision-makers (hereafter “elites”) to navigate the creation of human-level AI (and beyond) just fine, without the kinds of special efforts that e.g. Bostrom and Yudkowsky think are needed?
Some reasons for concern include:
Otherwise smart people say unreasonable things about AI safety.
Many people who believed AI was around the corner didn’t take safety very seriously.
Elites have failed to navigate many important issues wisely (2008 financial crisis, climate change, Iraq War, etc.), for a variety of reasons.
AI may arrive rather suddenly, leaving little time for preparation.
But if you were trying to argue for hope, you might argue along these lines (presented for the sake of argument; I don’t actually endorse this argument):
If AI is preceded by visible signals, elites are likely to take safety measures. Effective measures were taken to address asteroid risk. Large resources are devoted to mitigating climate change risks. Personal and tribal selfishness align with AI risk-reduction in a way they may not align on climate change. Availability of information is increasing over time.
AI is likely to be preceded by visible signals. Conceptual insights often take years of incremental tweaking. In vision, speech, games, compression, robotics, and other fields, performance curves are mostly smooth. “Human-level performance at X” benchmarks influence perceptions and should be more exhaustive and come more rapidly as AI approaches. Recursive self-improvement capabilities could be charted, and are likely to be AI-complete. If AI succeeds, it will likely succeed for reasons comprehensible by the AI researchers of the time.
Therefore, safety measures will likely be taken.
If safety measures are taken, then elites will navigate the creation of AI just fine. Corporate and government leaders can use simple heuristics (e.g. Nobel prizes) to access the upper end of expert opinion. AI designs with easily tailored tendency to act may be the easiest to build. The use of early AIs to solve AI safety problems creates an attractor for “safe, powerful AI.” Arms races are not insurmountable.
The basic structure of this ‘argument for hope’ is due to Carl Shulman, though he doesn’t necessarily endorse the details. (Also, it’s just a rough argument, and as stated is not deductively valid.)
Personally, I am not very comforted by this argument because:
Elites often fail to take effective action despite plenty of warning.
I think there’s a >10% chance AI will not be preceded by visible signals.
I think the elites’ safety measures will likely be insufficient.
Obviously, there’s a lot more for me to spell out here, and some of it may be unclear. The reason I’m posting these thoughts in such a rough state is so that MIRI can get some help on our research into this question.
In particular, I’d like to know:
Which historical events are analogous to AI risk in some important ways? Possibilities include: nuclear weapons, climate change, recombinant DNA, nanotechnology, chlorofluorocarbons, asteroids, cyberterrorism, Spanish flu, the 2008 financial crisis, and large wars.
What are some good resources (e.g. books) for investigating the relevance of these analogies to AI risk (for the purposes of illuminating elites’ likely response to AI risk)?
What are some good studies on elites’ decision-making abilities in general?
Has the increasing availability of information in the past century noticeably improved elite decision-making?
What does RSI stand for?
“recursive self improvement”.
Okay, I’ve now spelled this out in the OP.
Lately I’ve been listening to audiobooks (at 2x speed) in my down time, especially ones that seem likely to have passages relevant to the question of how well policy-makers will deal with AGI, basically continuing this project but only doing the “collection” stage, not the “analysis” stage.
I’ll post quotes from the audiobooks I listen to as replies to this comment.
From Watts’ Everything is Obvious:
More (#1) from Everything is Obvious:
More (#2) from Everything is Obvious:
More (#4) from Everything is Obvious:
More (#3) from Everything is Obvious:
From Rhodes’ Arsenals of Folly:
More (#3) from Arsenals of Folly:
And:
And:
And:
And:
Amazing stuff. Was the world really as close to a nuclear war in 1983 as in 1962?
More (#2) from Arsenals of Folly:
And:
And, a blockquote from the writings of Robert Gates:
More (#1) from Arsenals of Folly:
And:
And:
And:
More (#4) from Arsenals of Folly:
From Lewis’ Flash Boys:
So Spivey began digging the line, keeping it secret for 2 years. He didn’t start trying to sell the line to banks and traders until a couple months before the line was complete. And then:
More (#1) from Flash Boys:
And:
And:
There was so much worth quoting from Better Angels of Our Nature that I couldn’t keep up. I’ll share a few quotes anyway.
More (#3) from Better Angels of Our Nature:
Further reading on integrative complexity:
Wikipedia, Psychlopedia, Google book
Now that I’ve been introduced to the concept, I want to evaluate how useful it is to incorporate into my rhetorical repertoire and vocabulary. And, to determine whether it can inform my beliefs about assessing the exfoliating intelligence of others (a term I’ll coin to refer to that intelligence/knowledge which another can pass on to me to aid my vocabulary and verbal abstract reasoning—my neuropsychological strengths which I try to max out just like an RPG character).
At a less meta level, knowing the strengths and weaknesses of the trait will inform whether I choose to signal it or dampen it from here on, and in what situations. It is important for imitators to remember that whatever IC is associated with, lay others will not necessarily make those associations.
strengths
conflict resolution (see Luke’s post)
As listed in Psychlopedia:
appreciation of complexity
scientific proficiency
stress accommodation
resistance to persuasion
prediction ability
social responsibility
more initiative, as rated by managers, and more motivation to seek power, as gauged by a projective test
weaknesses
based on psychlopedia:
low scores on compliance and conscientiousness
seem antagonistic and even narcissistic
based on the wiki article:
dependence (more likely to defer to others)
rational expectations (more likely to fallaciously assume they are dealing with rational agents)
Upon reflection, here are my conclusions:
high integrative complexity dominates low integrative complexity for those who have insight into the concept, are self-aware of how it relates to themselves and others, and have the capacity to use the skill and to hide it.
the questions eliciting the answers that psychometricians have experts rate to define the concept of IC are very crude, and there ought to be a validated tool devised, if that is an achievable feat (estimating the cognitive complexity or time required is beyond the scope of my time/intelligence at the moment)
I have been using this tool as my primary estimate of people’s intelligence, but I will instead subordinate it to the ordinary psychometric status it held before I became aware of it here, and will now restore traditional tools of intelligence assessment to their established status
I’m interested in learning about the algorithms used to search say Twitter and assess IC. Anyone got any info?
very interested in any research on IC’s association with corporate board performance, share prices, etc. There doesn’t seem to be much research, but research does generally start with defence implications before going corporate...
Interested in exploring relations between the assessment of IC and tools used in CBT, given their structural similarity... and, by extension, general relationships between IC and mental health
More (#4) from Better Angels of Our Nature:
Untrue unless you’re in a non-sequential game
True under a utilitarian framework and with a few common mind-theoretic assumptions derived from intuitions stemming from most people’s empathy
Woo
More (#2) from Better Angels of Our Nature:
More (#1) from Better Angels of Our Nature:
From Ariely’s The Honest Truth about Dishonesty:
More (#1) from Ariely’s The Honest Truth about Dishonesty:
And:
More (#2) from Ariely’s The Honest Truth about Dishonesty:
And:
From Feynman’s Surely You’re Joking, Mr. Feynman:
More (#1) from Surely You’re Joking, Mr. Feynman:
And:
And:
One quote from Taleb’s AntiFragile is here, and here’s another:
AntiFragile makes lots of interesting points, but it’s clear in some cases that Taleb is running roughshod over the truth in order to support his preferred view. I’ve italicized the particularly lame part:
From Think Like a Freak:
More (#1) from Think Like a Freak:
And:
From Rhodes’ Twilight of the Bombs:
More (#1) from Twilight of the Bombs:
And:
And:
And:
And:
From Harford’s The Undercover Economist Strikes Back:
And:
More (#2) from The Undercover Economist Strikes Back:
And:
And:
And:
More (#1) from The Undercover Economist Strikes Back:
And:
From Caplan’s The Myth of the Rational Voter:
More (#2) from The Myth of the Rational Voter:
This is an absurdly narrow definition of self-interest. Many people who are not old have parents who are senior citizens. Men have wives, sisters, and daughters whose well-being is important to them. Etc. Self-interest != solipsistic egoism.
More (#1) from The Myth of the Rational Voter:
And:
More (#3) from The Myth of the Rational Voter:
Allow me to offer an alternative explanation of this phenomenon for consideration. Typically, when polled about their trust in institutions, people tend to trust the executive branch more than the legislature or the courts, and they trust the military far more than they trust civilian government agencies. In the period before 9/11, our long national nightmare of peace and prosperity would generally have made the military less salient in people’s minds, and the spectacles of impeachment and Bush v. Gore would have made the legislative and judicial branches more salient in people’s minds. After 9/11, the legislative agenda quieted down/the legislature temporarily took a back seat to the executive, and military and national security organs became very high salience. So when people were asked about the government, the most immediate associations would have been to the parts that were viewed as more trustworthy.
From Richard Rhodes’ The Making of the Atomic Bomb:
More (#2) from The Making of the Atomic Bomb:
After Alexander Sachs paraphrased the Einstein-Szilard letter to Roosevelt, Roosevelt demanded action, and Edwin Watson set up a meeting with representatives from the Bureau of Standards, the Army, and the Navy...
Upon asking for some money to conduct the relevant experiments, the Army representative launched into a tirade:
More (#3) from The Making of the Atomic Bomb:
Frisch and Peierls wrote a two-part report of their findings:
More (#1) from The Making of the Atomic Bomb:
On the origins of the Einstein–Szilárd letter:
And:
More (#5) from The Making of the Atomic Bomb:
More (#4) from The Making of the Atomic Bomb:
And:
And:
And:
And:
From Poor Economics:
From The Visioneers:
And:
And:
And:
From Priest & Arkin’s Top Secret America:
More (#2) from Top Secret America:
And, on JSOC:
And:
And:
I wonder if the security-industrial complex bureaucracy is any better in other countries.
Which sense of “better” do you have in mind? :-)
More efficient.
KGB had a certain aura, though I don’t know if its descendants have the same cachet. Israeli security is supposed to be very good.
Stay tuned; The Secret History of MI6 and Defend the Realm are in my audiobook queue. :)
More (#1) from Top Secret America:
And:
From Pentland’s Social Physics:
More (#2) from Social Physics:
And:
More (#1) from Social Physics:
And:
And:
From de Mesquita and Smith’s The Dictator’s Handbook:
More (#2) from The Dictator’s Handbook:
And:
More (#1) from The Dictator’s Handbook:
From Ferguson’s The Ascent of Money:
More (#1) from The Ascent of Money:
And:
The Medici Bank is pretty interesting. A while ago I wrote https://en.wikipedia.org/wiki/Medici_Bank on the topic; LWers might find it interesting how international finance worked back then.
From Scahill’s Dirty Wars:
More (#2) from Dirty Wars:
And:
And:
More (#1) from Dirty Wars:
And:
And:
And:
Foreign fighters show up everywhere. And now there’s the whole Islamic State issue. Perhaps all the world needs is more foreign legions doing good things. The FFL is overrecruited after all. Heck, we could even deal with the refugee crisis by offering visas to those mercenaries. Sure as hell would be more popular than selling visas and citizenship, because people always get antsy about inequality and having fewer downward social comparisons.
Passage from Patterson’s Dark Pools: The Rise of the Machine Traders and the Rigging of the U.S. Stock Market:
But it proved all too easy: The very first tape Wang played revealed two dealers fixing prices.
Some relevant quotes from Schlosser’s Command and Control: Nuclear Weapons, the Damascus Accident, and the Illusion of Safety:
And:
More from Command and Control:
And:
More (#3) from Command and Control:
And:
And:
And:
More (#2) from Command and Control:
And:
And:
More (#4) from Command and Control:
And:
Do you keep a list of the audiobooks you liked anywhere? I’d love to take a peek.
Okay. In this comment I’ll keep an updated list of audiobooks I’ve heard since Sept. 2013, for those who are interested. All audiobooks are available via iTunes/Audible unless otherwise noted.
Outstanding:
Tetlock, Expert Political Judgment
Pinker, The Better Angels of Our Nature (my clips)
Schlosser, Command and Control (my clips)
Yergin, The Quest (my clips)
Osnos, Age of Ambition (my clips)
Worthwhile if you care about the subject matter:
Singer, Wired for War (my clips)
Feinstein, The Shadow World (my clips)
Venter, Life at the Speed of Light (my clips)
Rhodes, Arsenals of Folly (my clips)
Weiner, Enemies: A History of the FBI (my clips)
Rhodes, The Making of the Atomic Bomb (available here) (my clips)
Gleick, Chaos (my clips)
Weiner, Legacy of Ashes: The History of the CIA (my clips)
Freese, Coal: A Human History (my clips)
Aid, The Secret Sentry (my clips)
Scahill, Dirty Wars (my clips)
Patterson, Dark Pools (my clips)
Lieberman, The Story of the Human Body
Pentland, Social Physics (my clips)
Okasha, Philosophy of Science: VSI
Mazzetti, The Way of the Knife (my clips)
Ferguson, The Ascent of Money (my clips)
Lewis, The Big Short (my clips)
de Mesquita & Smith, The Dictator’s Handbook (my clips)
Sunstein, Worst-Case Scenarios (available here) (my clips)
Johnson, Where Good Ideas Come From (my clips)
Harford, The Undercover Economist Strikes Back (my clips)
Caplan, The Myth of the Rational Voter (my clips)
Hawkins & Blakeslee, On Intelligence
Gleick, The Information (my clips)
Gleick, Isaac Newton
Greene, Moral Tribes
Feynman, Surely You’re Joking, Mr. Feynman! (my clips)
Sabin, The Bet (my clips)
Watts, Everything Is Obvious: Once You Know the Answer (my clips)
Greenblatt, The Swerve: How the World Became Modern (my clips)
Cain, Quiet: The Power of Introverts in a World That Can’t Stop Talking
Dennett, Freedom Evolves
Kaufman, The First 20 Hours
Gertner, The Idea Factory (my clips)
Olen, Pound Foolish
McArdle, The Up Side of Down
Rhodes, Twilight of the Bombs (my clips)
Isaacson, Steve Jobs (my clips)
Priest & Arkin, Top Secret America (my clips)
Ayres, Super Crunchers (my clips)
Lewis, Flash Boys (my clips)
Dartnell, The Knowledge (my clips)
Cowen, The Great Stagnation
Lewis, The New New Thing (my clips)
McCray, The Visioneers (my clips)
Jackall, Moral Mazes (my clips)
Langewiesche, The Atomic Bazaar
Ariely, The Honest Truth about Dishonesty (my clips)
A process for turning ebooks into audiobooks for personal use, at least on Mac:
Rip the Kindle ebook to non-DRMed .epub with Calibre and Apprentice Alf.
Open the .epub in Sigil, merge all the contained HTML files into a single HTML file (select the files, right-click, Merge). Open the Source view for the big HTML file.
Edit the source so that the ebook begins with the title and author, then jumps right into the foreword or preface or first chapter, and ends with the end of the last chapter or epilogue. (Cut out any table of contents, list of figures, list of tables, appendices, index, bibliography, and endnotes.)
Remove footnotes if easy to do so, using Sigil’s Regex find-and-replace (remember to use Minimal Match so you don’t delete too much!). Click through several instances of the Find command to make sure it’s going to properly cut out only the footnotes, before you click “Replace All.”
Use find-and-replace to add [[slnc_1000]] at the end of every paragraph; Mac’s text-to-speech engine interprets this as a slight pause, which aids in comprehension when I’m listening to the audiobook. Usually this just means replacing every instance of </p> with [[slnc_1000]]</p> (a scripted sketch of this step appears after these steps).
Copy/paste that entire HTML file into a text file and save it as .html. Open this in your browser, Select All, right-click and choose Services → Add to iTunes as Spoken Track. (I think “Ava” is the best voice; you’ll have to add this voice by upgrading to Mavericks and adding Ava under System Preferences → Dictation and Speech.) This will take a while, and might even throw up an error even though the track will continue being created and will succeed.
Now, sync this text-to-speech audiobook to some audio player that can play at 2x or 3x speed, and listen away.
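For anyone who would rather script the pause-insertion step than do it in Sigil, here is a minimal sketch. It assumes the merged ebook is a single HTML file whose paragraphs are ordinary <p>…</p> tags; the file names are just placeholders.

```python
# Minimal sketch: insert the Mac text-to-speech pause marker before every
# closing paragraph tag. Assumes "book.html" is the merged, cleaned-up ebook.
import re

with open("book.html", encoding="utf-8") as f:
    html = f.read()

# [[slnc_1000]] is read by the Mac speech synthesizer as a roughly one-second pause.
html = re.sub(r"</p>", "[[slnc_1000]]</p>", html)

with open("book_tts.html", "w", encoding="utf-8") as f:
    f.write(html)
```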
To de-DRM your Audible audiobooks, just use Tune4Mac.
VoiceDream for iPhone does a very fine job of text-to-speech; it also syncs your pocket bookmarks and can read epub files.
Other:
Roose, Young Money. Too focused on a few individuals for my taste, but still has some interesting content. (my clips)
Hofstadter & Sander, Surfaces and Essences. Probably a fine book, but I was only interested enough to read the first and last chapters.
Taleb, AntiFragile. Learned some from it, but it’s kinda wrong much of the time. (my clips)
Acemoglu & Robinson, Why Nations Fail. Lots of handy examples, but too much of “our simple theory explains everything.” (my clips)
Byrne, The Many Worlds of Hugh Everett III (available here). Gave up on it; too much theory, not enough story. (my clips)
Drexler, Radical Abundance. Gave up on it; too sanitized and basic.
Mukherjee, The Emperor of All Maladies. Gave up on it; too slow in pace and flowery in language for me.
Fukuyama, The Origins of Political Order. Gave up on it; the author is more keen on name-dropping theorists than on tracking down data.
Friedman, The Moral Consequences of Economic Growth (available here). Gave up on it. There are some actual data in chs. 5-7, but the argument is too weak and unclear for my taste.
Tuchman, The Proud Tower. Gave up on it after a couple chapters. Nothing wrong with it, it just wasn’t dense enough in the kind of learning I’m trying to do.
Foer, Eating Animals. I listened to this not to learn, but to shift my emotions. But it was too slow-moving, so I didn’t finish it.
Caro, The Power Broker. This might end up under “outstanding” if I ever finish it. For now, I’ve put this one on hold because it’s very long and not as highly targeted at the useful learning I want to be doing right now as some other books.
Rutherfurd, Sarum. This is the furthest I’ve gotten into any fiction book for the past 5 years at least, including HPMoR. I think it’s giving my system 1 an education into what life was like in the historical eras it covers, without getting bogged down in deep characterization, complex plotting, or ornate environmental description. But I’ve put it on hold for now because it is incredibly long.
Diamond, Collapse. I listened to several chapters, but it seemed to be mostly about environmental decline, which doesn’t interest me much, so I stopped listening.
Bowler & Morus, Making Modern Science (available here) (my clips). A decent history of modern science but not focused enough on what I wanted to learn, so I gave up.
Brynjolfsson & McAfee, The Second Machine Age (my clips). Their earlier, shorter Race Against the Machine contained the core arguments; this book expands the material in order to explain things to a lay audience. As with Why Nations Fail, I have too many quibbles with this book’s argument to put this book in the ‘Liked’ category.
Clery, A Piece of the Sun. Nothing wrong with it, I just wasn’t learning the type of things I was hoping to learn, so I stopped about half way through.
Schuman, The Miracle. Fairly interesting, but not quite dense enough in the kind of stuff I’m hoping to learn these days.
Conway & Oreskes, Merchants of Doubt. Fairly interesting, but not dense enough in the kind of things I’m hoping to learn.
Horowitz, The Hard Thing About Hard Things
Wessel, Red Ink
Levitt & Dubner, Think Like a Freak (my clips)
Gladwell, David and Goliath (my clips)
Thanks! Your first 3 are not my cup of tea, but I’ll keep looking through the top 1000 list. For now, I am listening to MaddAddam, the last part of Margaret Atwood’s post-apocalyptic fantasy trilogy, which qrnyf jvgu bar zna qvfnccbvagrq jvgu uvf pbagrzcbenel fbpvrgl ervairagvat naq ercbchyngvat gur rnegu jvgu orggre crbcyr ur qrfvtarq uvzfrys. She also has some very good non-fiction, like her Massey lecture on debt, which I warmly recommend.
Could you say a bit about your audiobook selection process?
When I was just starting out in September 2013, I realized that vanishingly few of the books I wanted to read were available as audiobooks, so it didn’t make sense for me to search Audible for titles I wanted to read: the answer was basically always “no.” So instead I browsed through the top 2000 best-selling unabridged non-fiction audiobooks on Audible, added a bunch of stuff to my wishlist, and then scrolled through the wishlist later and purchased the ones I most wanted to listen to.
These days, I have a better sense of what kind of books have a good chance of being recorded as audiobooks, so I sometimes do search for specific titles on Audible.
Some books that I really wanted to listen to are available in ebook but not audiobook, so I used this process to turn them into audiobooks. That only barely works, sometimes. I have to play text-to-speech audiobooks at a lower speed to understand them, and it’s harder for my brain to stay engaged as I’m listening, especially when I’m tired. I might give up on that process, I’m not sure.
Most but not all of the books are selected because I expect them to have lots of case studies in “how the world works,” specifically with regard to policy-making, power relations, scientific research, and technological development. This is definitely true for e.g. Command and Control, The Quest, Wired for War, Life at the Speed of Light, Enemies, The Making of the Atomic Bomb, Chaos, Legacy of Ashes, Coal, The Secret Sentry, Dirty Wars, The Way of the Knife, The Big Short, Worst-Case Scenarios, The Information, and The Idea Factory.
I definitely found out something similar. I’ve come to believe that most ‘popular science’, ‘popular history’ etc books are on audible, but almost anything with equations or code is not.
The ‘great courses’ have been quite fantastic for me for learning about the social sciences. I found out about those recently.
Occasionally I try podcasts for very niche topics (recent Rails updates, for instance), but have found them to be rather uninteresting in comparison to full books and courses.
Thanks!
From Singer’s Wired for War:
More (#7) from Wired for War:
And:
The army recruiters say that soldiers on the ground still win wars. I reckon that Douhet’s prediction will come true, however crudely. Drones.
More (#6) from Wired for War:
And:
Inequality doesn’t seem so bad now, huh?
More (#5) from Wired for War:
More (#4) from Wired for War:
And:
More (#3) from Wired for War:
And:
And:
More (#2) from Wired for War:
More (#1) from Wired for War:
And:
From Osnos’ Age of Ambition:
And:
And:
More (#2) from Osnos’ Age of Ambition:
And:
More (#1) from Osnos’ Age of Ambition:
And:
And:
And:
From Soldiers of Reason:
More (#2) from Soldiers of Reason:
And:
More (#1) from Soldiers of Reason:
And:
From David and Goliath:
And:
More (#2) from David and Goliath:
And:
From Wade’s A Troublesome Inheritance:
More (#2) from A Troublesome Inheritance:
More (#1) from A Troublesome Inheritance:
And:
From Moral Mazes:
And:
And:
From Lewis’ The New New Thing:
And:
From Dartnell’s The Knowledge:
And:
And:
And:
From Ayres’ Super Crunchers, speaking of Epagogix, which uses neural nets to predict a movie’s box office performance from its screenplay:
More (#1) from Super Crunchers:
And:
And:
From Isaacson’s Steve Jobs:
And:
And:
And:
More (#1) from Steve Jobs:
And:
[no more clips, because Audible somehow lost all my bookmarks for the last two parts of the audiobook!]
From Feinstein’s The Shadow World:
More (#8) from The Shadow World:
And:
And:
More (#7) from The Shadow World:
And:
And:
And:
And:
More (#6) from The Shadow World:
And:
And:
More (#5) from The Shadow World:
And:
And:
And:
More (#4) from The Shadow World:
And:
And:
More (#3) from The Shadow World:
And:
And:
More (#2) from The Shadow World:
And:
More (#1) from The Shadow World:
And:
And:
And:
From Weiner’s Enemies:
More (#5) from Enemies:
And:
More (#4) from Enemies:
And:
And:
More (#3) from Enemies:
And:
And:
More (#2) from Enemies:
And:
And:
More (#1) from Enemies:
And:
And:
From Roose’s Young Money:
From Tetlock’s Expert Political Judgment:
More (#2) from Expert Political Judgment:
More (#1) from Expert Political Judgment:
And:
And:
From Sabin’s The Bet:
And:
More (#3) from The Bet:
More (#2) from The Bet:
And:
And:
More (#1) from The Bet:
And:
And:
From Yergin’s The Quest:
More (#7) from The Quest:
More (#6) from The Quest:
And:
And:
And:
And:
More (#5) from The Quest:
And:
And:
And:
More (#4) from The Quest:
And:
More (#3) from The Quest:
And:
More (#2) from The Quest:
And:
And:
And:
More (#1) from The Quest:
And:
And:
From The Second Machine Age:
More (#1) from The Second Machine Age:
From Making Modern Science:
More (#1) from Making Modern Science:
From Johnson’s Where Good Ideas Come From:
From Gertner’s The Idea Factory:
More (#2) from The Idea Factory:
And:
And:
More (#1) from The Idea Factory:
And:
I’m sure that I’ve seen your answer to this question somewhere before, but I can’t recall where: Of the audiobooks that you’ve listened to, which have been most worthwhile?
I keep an updated list here.
I guess I might as well post quotes from (non-audio) books here as well, when I have no better place to put them.
First up is Revolution in Science.
Starting on page 45:
This amazingly high percentage of self-proclaimed revolutionary scientists (30% or more) seems like a result of selection bias, since most scientists with oversized egos are not even remembered. I wonder what fraction of actual scientists (not your garden-variety crackpots) insist on having produced a revolution in science.
From Sunstein’s Worst-Case Scenarios:
More (#2) from Worst-Case Scenarios:
More (#5) from Worst-Case Scenarios:
More (#4) from Worst-Case Scenarios:
More (#3) from Worst-Case Scenarios:
And:
Similar issues are raised by the continuing debate over whether certain antidepressants impose a (small) risk of breast cancer. A precautionary approach might seem to argue against the use of these drugs because of their carcinogenic potential. But the failure to use those antidepressants might well impose risks of its own, certainly psychological and possibly even physical (because psychological ailments are sometimes associated with physical ones as well). Or consider the decision by the Soviet Union to evacuate and relocate more than 270,000 people in response to the risk of adverse effects from the Chernobyl fallout. It is hardly clear that on balance this massive relocation project was justified on health grounds: “A comparison ought to have been made between the psychological and medical burdens of this measure (anxiety, psychosomatic diseases, depression and suicides) and the harm that may have been prevented.” More generally, a sensible government might want to ignore the small risks associated with low levels of radiation, on the ground that precautionary responses are likely to cause fear that outweighs any health benefits from those responses—and fear is not good for your health.
And:
More (#1) from Worst-Case Scenarios:
But at least so far in the book, Sunstein doesn’t mention the obvious rejoinder about investing now to prevent existential catastrophe.
Anyway, another quote:
From Gleick’s Chaos:
More (#3) from Chaos:
And:
More (#2) from Chaos:
And:
And:
More (#1) from Chaos:
From Lewis’ The Big Short:
More (#4) from The Big Short:
And:
And:
And:
More (#3) from The Big Short:
And:
And:
And:
More (#2) from The Big Short:
And:
And:
More (#1) from The Big Short:
And:
From Gleick’s The Information:
More (#1) from The Information:
And:
And:
And, an amusing quote:
From Acemoglu & Robinson’s Why Nations Fail:
More (#2) from Why Nations Fail:
And:
More (#1) from Why Nations Fail:
And:
And:
And:
From Greenblatt’s The Swerve: How the World Became Modern:
More (#1) from The Swerve:
From Aid’s The Secret Sentry:
More (#6) from The Secret Sentry:
And:
And:
And:
More (#5) from The Secret Sentry:
And:
More (#4) from The Secret Sentry:
And:
More (#3) from The Secret Sentry:
And:
And:
Even when enemy troops and tanks overran the major South Vietnamese military base at Bien Hoa, outside Saigon, on April 26, Martin still refused to accept that Saigon was doomed. On April 28, Glenn met with the ambassador carrying a message from Allen ordering Glenn to pack up his equipment and evacuate his remaining staff immediately. Martin refused to allow this. The following morning, the military airfield at Tan Son Nhut fell, cutting off the last air link to the outside.
More (#2) from The Secret Sentry:
And:
And:
More (#1) from The Secret Sentry:
From Mazzetti’s The Way of the Knife:
More (#5) from The Way of the Knife:
And:
And:
More (#4) from The Way of the Knife:
And:
And:
And:
And:
And:
And:
More (#3) from The Way of the Knife:
More (#2) from The Way of the Knife:
And:
More (#1) from The Way of the Knife:
And:
And:
From Freese’s Coal: A Human History:
More (#2) from Coal: A Human History:
More (#1) from Coal: A Human History:
Passages from The Many Worlds of Hugh Everett III:
And:
(It wasn’t until decades later that David Deutsch and others showed that Everettian quantum mechanics does make novel experimental predictions.)
A passage from Tim Weiner’s Legacy of Ashes: The History of the CIA:
More (#1) from Legacy of Ashes:
And:
And:
And:
I shared one quote here. More from Life at the Speed of Light:
Also from Life at the Speed of Light:
This seems obviously false. Local expenditures (of money, pride, possibility of not being the first to publish, etc.) are still local; global penalties are still global. Incentives are misaligned in exactly the same way as for climate change.
This is to be taken as an arguendo, not as the author’s opinion, right? See IEM on the minimal conditions for takeoff. Albeit if “AI-complete” is taken in a sense of generality and difficulty rather than “human-equivalent” then I agree much more strongly, but this is correspondingly harder to check using some neat IQ test or other “visible” approach that will command immediate, intuitive agreement.
Most obviously molecular nanotechnology a la Drexler; the other ones seem too ‘straightforward’ by comparison. I’ve always modeled my assumed social response for AI on the case of nanotech, i.e., no funding except for well-connected insiders, the term being broadened to meaninglessness, lots of concerned blither by ‘ethicists’ unconnected to the practitioners, etc.
Climate change doesn’t have the aspect that “if this ends up being a problem at all, then chances are that I (or my family/...) will die of it”.
(Agree with the rest of the comment.)
Many people believe that about climate change (due to global political disruption, economic collapse etcetera; praising the size of the disaster seems virtuous). Many others do not believe it about AI. Many put sizable climate-change disaster into the far future. Many people will go on believing this about AI independently of any evidence which accrues. Actors with something to gain by minimizing their belief in climate change so minimize. This has also been true in AI risk so far.
Hm! I cannot recall a single instance of this. (Hm, well; I can recall one instance of a TV interview with a politician from a non-first-world island nation taking projections seriously which would put his nation under water, so it would not be much of a stretch to think that he’s taking seriously the possibility that people close to him may die from this.) If you have, probably this is because I haven’t read that much about what people say about climate change. Could you give me an indication of the extent of your evidence, to help me decide how much to update?
Ok, agreed, and this still seems likely even if you imagine sensible AI risk analyses being similarly well-known as climate change analyses are today. I can see how it could lead to an outcome similar to today’s situation with climate change if that happened… Still, if the analysis says “you will die of this”, and the brain of the person considering the analysis is willing to assign it some credence, that seems to align personal selfishness with global interests more than (climate change as it has looked to me so far).
Will keep an eye out for the next citation.
This has not happened with AI risk so far among most AIfolk, or anyone the slightest bit motivated to reject the advice. We had a similar conversation at MIRI once, in which I was arguing that, no, people don’t automatically change their behavior as soon as they are told that something bad might happen to them personally; and when we were breaking it up, Anna, on her way out, asked Louie downstairs how he had reasoned about choosing to ride motorcycles.
People only avoid certain sorts of death risks under certain circumstances.
Thanks!
Point. Need to think.
Being told something is dangerous =/= believing it is =/= alieving it is.
Right. I’ll clarify in the OP.
This seems implied by X-complete. X-complete generally means “given a solution to an X-complete problem, we have a solution for X”.
e.g. NP-complete: given a polynomial solution to any NP-complete problem, any problem in NP can be solved in polynomial time.
(Of course the technical nuance of the strength of the statement X-complete is such that I expect most people to imagine the wrong thing, like you say.)
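Spelled out, the standard statement behind that example is the textbook definition (nothing here is specific to the AI case):

\[
L \text{ is NP-complete} \;\iff\; L \in \mathrm{NP} \ \text{and}\ \forall A \in \mathrm{NP}:\ A \le_p L,
\]
\[
\text{so}\quad L \in \mathrm{P} \;\Longrightarrow\; \mathrm{P} = \mathrm{NP}.
\]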
(I don’t have answers to your specific questions, but here are some thoughts about the general problem.)
I agree with most of you said. I also assign significant probability mass to most parts of the argument for hope (but haven’t thought about this enough to put numbers on this), though I too am not comforted on these parts because I also assign non-small chance to them going wrong. E.g., I have hope for “if AI is visible [and, I add, AI risk is understood] then authorities/elites will be taking safety measures”.
That said, there are some steps in the argument for hope that I’m really worried about:
I worry that even smart (Nobel prize-type) people may end up getting the problem completely wrong, because MIRI’s argument tends to conspicuously not be reinvented independently elsewhere (even though I find myself agreeing with all of its major steps).
I worry that even if they get it right, by the time we have visible signs of AGI we will be even closer to it than we are now, so there will be even less time to do the basic research necessary to solve the problem, making it even less likely that it can be done in time.
Although it’s also true that I assign some probability to e.g. AGI without visible signs, I think the above is currently the largest part of why I feel MIRI work is important.
I personally am optimistic about the world’s elites navigating AI risk as well as possible subject to inherent human limitations that I would expect everybody to have, and the inherent risk. Some points:
I’ve been surprised by people’s ability to avert bad outcomes. Only two nuclear weapons have been used since nuclear weapons were developed, despite the fact that there are 10,000+ nuclear weapons around the world. Political leaders are assassinated very infrequently relative to how often one might expect a priori.
AI risk is a Global Catastrophic Risk in addition to being an x-risk. Therefore, even people who don’t care about the far future will be motivated to prevent it.
The people with the most power tend to be the most rational people, and the effect size can be expected to increase over time (barring disruptive events such as economic collapses, supervolcanoes, climate change tail risk, etc.). The most rational people are the people who are most likely to be aware of and to work to avert AI risk. Here I’m blurring “near mode instrumental rationality” and “far mode instrumental rationality,” but I think there’s a fair amount of overlap between the two things; e.g., China is pushing hard on nuclear energy and on renewable energies, even though they won’t be needed for years.
Availability of information is increasing over time. At the time of the Dartmouth conference, information about the potential dangers of AI was not very salient, now it’s more salient, and in the future it will be still more salient.
In the Manhattan project, the “will bombs ignite the atmosphere?” question was analyzed and dismissed without much (to our knowledge) double-checking. The amount of risk checking per hour of human capital available can be expected to increase over time. In general, people enjoy tackling important problems, and risk checking is more important than most of the things that people would otherwise be doing.
I should clarify that with the exception of my first point, the arguments that I give are arguments that humanity will address AI risk in a near optimal way – not necessarily that AI risk is low.
For example, it could be that people correctly recognize that building an AI will result in human extinction with probability 99%, and so implement policies to prevent it, but that sometime over the next 10,000 years, these policies will fail, and AI will kill everyone.
But the actionable thing is how much we can reduce the probability of AI risk, and if by default people are going to do the best that one could hope, we can’t reduce the probability substantially.
What?
Rationality is systematized winning. Chance plays a role, but over time it’s playing less and less of a role, because of more efficient markets.
There is lots of evidence that people in power are the most rational, but there is a huger prior to overcome.
Among people for whom power has an unsatiated major instrumental or intrinsic value, the most rational tend to have more power- but I don’t think that very rational people are common and I think that they are less likely to want more power than they have.
Particularly since the previous generation of power-holders used different factors when they selected their successors.
I agree with all of this. I think that “people in power are the most rational” was much less true in 1950 than it is today, and that it will be much more true in 2050.
Actually that’s a badly titled article. At best “Rationality is systematized winning” applies to instrumental, not epistemic, rationality. And even for that you can’t make rationality into systematized winning by defining it so. Either that’s a tautology (whatever systematized winning is, we define that as “rationality”) or it’s an empirical question. I.e. does rationality lead to winning? Looking around the world at “winners”, that seems like a very open question.
And now that I think about it, it’s also an empirical question whether there even is a system for winning. I suspect there is—that is, I suspect that there are certain instrumental practices one can adopt that are generically useful for achieving a broad variety of life goals—but this too is an empirical question we should not simply assume the answer to.
I agree that my claim isn’t obvious. I’ll try to get back to you with detailed evidence and arguments.
The problem is that politicians have a lot to gain from really believing the stupid things they have to say to gain and hold power.
To quote an old thread:
Cf. Stephen Pinker: historians who’ve studied Hitler tend to come away convinced he really believed he was a good guy.
To get the fancy explanation of why this is the case, see “Trivers’ Theory of Self-Deception.”
It’s not much evidence, but the two earliest scientific investigations of existential risk I know of, LA-602 and the RHIC Review, seem to show movement in the opposite direction: “LA-602 was written by people curiously investigating whether a hydrogen bomb could ignite the atmosphere, and the RHIC Review is a work of public relations.”
Perhaps the trend you describe is accurate, but I also wouldn’t be surprised to find out (after further investigation) that scientists are now increasingly likely to avoid serious analysis of real risks posed by their research, since they’re more worried than ever before about funding for their field (or, for some other reason). The AAAI Presidential Panel on Long-Term AI Futures was pretty disappointing, and like the RHIC Review seems like pure public relations, with a pre-determined conclusion and no serious risk analysis.
Why would a good AI policy be one which takes as a model a universe where world destroying weapons in the hands of incredibly unstable governments controlled by glorified tribal chieftains is not that bad of a situation? Almost but not quite destroying ourselves does not reflect well on our abilities. The Cold War as a good example of averting bad outcomes? Eh.
This is assuming that people understand what makes an AI so dangerous: calling an AI a global catastrophic risk isn’t going to motivate anyone who thinks you can just unplug the thing (and it’s even worse if it does motivate them, since then you have someone running around thinking the AI problem is trivial).
I think you’re just blurring “rationality” here. The fact that someone is powerful is evidence that they are good at gaining a reputation in their specific field, but I don’t see how this is evidence for rationality as such (and if we are redefining it to include dictators and crony politicians, I don’t know what to say), and especially of the kind needed to properly handle AI—and claiming evidence for future good decisions related to AI risk because of domain expertise in entirely different fields is quite a stretch. Believe it or not, most people are not mathematicians or computer scientists. Most powerful people are not mathematicians or computer scientists. And most mathematicians and computer scientists don’t give two shits about AI risk—if they don’t think it worthy of attention, why would someone who has no experience with these kind of issues suddenly grab it out of the space of all possible ideas he could possibly be thinking about? Obviously they aren’t thinking about it now—why are you confident this won’t be the case in the future? Thinking about AI requires a rather large conceptual leap—“rationality” is necessary but not sufficient, so even if all powerful people were “rational” it doesn’t follow that they can deal with these issues properly or even single them out as something to meditate on, unless we have a genius orator I’m not aware of. It’s hard enough explaining recursion to people who are actually interested in computers. And it’s not like we can drop a UFAI on a country to get people to pay attention.
It seems like you are claiming that AI safety does not require a substantial shift in perspective (I’m taking this as the reason why you are optimistic, since my cynicism tells me that expecting a drastic shift is a rather improbable event) - rather, we can just keep chugging along because nice things can be “expected to increase over time”, and this somehow will result in the kind of society we need. These statements always confuse me; one usually expects to be in a better position to solve a problem 5 years down the road, but trying to describe that advantage in terms of out of thin air claims about incremental changes in human behavior seems like a waste of space unless there is some substance behind it. They only seem useful when one has reached that 5 year checkpoint and can reflect on the current context in detail—for example, it’s not clear to me that the increasing availability of information is always a net positive for AI risk (since it could be the case that potential dangers are more salient as a result of unsafe AI research—the more dangers uncovered could even act as an incentive for more unsafe research depending on the magnitude of positive results and the kind of press received. But of course the researchers will make the right decision, since people are never overconfident...). So it comes off (to me) as a kind of sleight of hand where it feels like a point for optimism, a kind of “Yay Open Access Knowledge is Good!” applause light, but it could really go either way.
Also, I really don’t know where you got that last idea; I can’t imagine that most people would find AI safety more glamorous than, you know, actually building a robot. There’s a reason why it’s hard to get people to do unit tests and why software projects get bloated and abandoned. Something like what Haskell is to software would be optimal. I don’t think it’s a great idea to rely on the conscientiousness of people in this case.
Thanks for engaging.
The point is that I would have expected things to be worse, and that I imagine that a lot of others would have as well.
I think that people will understand what makes AI dangerous. The arguments aren’t difficult to understand.
Broadly, the most powerful countries are the ones with the most rational leadership (where here I mean “rational with respect to being able to run a country,” which is relevant), and I expect this trend to continue.
Also, wealth is skewing toward more rational people over time, and wealthy people have political bargaining power.
Political leaders have policy advisors, and policy advisors listen to scientists. I expect that AI safety issues will percolate through the scientific community before long.
I agree that AI safety requires a substantial shift in perspective — what I’m claiming is that this change in perspective will occur organically substantially before the creation of AI is imminent.
You don’t need “most people” to work on AI safety. It might suffice for 10% or fewer of the people who are working on AI to work on safety. There are lots of people who like to be big fish in a small pond, and this will motivate some AI researchers to work on safety even if safety isn’t the most prestigious field.
If political leaders are sufficiently rational (as I expect them to be), they’ll give research grants and prestige to people who work on AI safety.
Things were a lot worse than everyone knew: Russia almost invaded Yugoslavia in the 1950s, which would have triggered a war according to newly declassified NSA journals. The Cuban Missile Crisis could easily have gone hot, and several times early warning systems were triggered by accident. Of course, estimating what could have happened is quite hard.
I agree that there were close calls. Nevertheless, things turned out better than I would have guessed, and indeed, probably better than a large fraction of military and civilian people would have guessed.
World war three seems certain to significantly decrease human population. From my point of view, I can’t eliminate anthropic reasoning for why there wasn’t such a war before I was born.
We still get people occasionally who argue the point while reading through the Sequences, and that’s a heavily filtered audience to begin with.
There’s a difference between “sufficiently difficult so that a few readers of one person’s exposition can’t follow it” and “sufficiently difficult so that after being in the public domain for 30 years, the arguments won’t have been distilled so as to be accessible to policy makers.”
I don’t think that the arguments are any more difficult than the arguments for anthropogenic global warming. One could argue that the difficulty of these arguments has been a limiting factor in climate change policy, but I believe that by far the dominant issue has been misaligned incentives, though I’d concede that this is not immediately obvious.
And I have the impression that relatively low-ranking people helped produce this outcome by keeping information from their superiors. Petrov chose not to report a malfunction of the early warning system until he could prove it was a malfunction. People during the Korean war and possibly Vietnam seem not to have passed on the fact that pilots from Russia or America were cursing in their native languages over the radio (and the other side was hearing them).
This in fact is part of why I don’t think we ‘survived’ through the anthropic principle. Someone born after the end of the Cold War could look back at the apparent causes of our survival. And rather than seeing random events, or no causes at all, they would see a pattern that someone might have predicted beforehand, given more information.
This pattern seems vanishingly unlikely to save us from unFriendly AI. It would take, at the very least, a much more effective education/propaganda campaign.
As I remark elsewhere in this thread, the point is that I would have expected substantially more nuclear exchange by now than actually happened, and in view of this, I updated in the direction of things being more likely to go well than I would have thought. I’m not saying “the fact that there haven’t been nuclear exchanges means that destructive things can’t happen.”
I was using the nuclear war thing as one of many outside views, not as direct analogy. The AI situation needs to be analyzed separately — this is only one input.
It may be challenging to estimate the “actual, at the time” probability of a past event that would quite possibly have resulted in you not existing. Survivor bias may play a role here.
Nuclear war would have to be really, really big to kill a majority of the population, and probably even if all weapons were used the fatality rate would be under 50% (with the uncertainty coming from nuclear winter). Note that most residents of Hiroshima and Nagasaki survived the 1945 bombings, and that fewer than 60% of people live in cities.
It depends on the nuclear war. An exchange of bombs between India and Pakistan probably wouldn’t end human life on the planet. However an all-out war between the U.S. and the U.S.S.R in the 1980s most certainly could have. Fortunately that doesn’t seem to be a big risk right now. 30 years ago it was. I don’t feel confident in any predictions one way or the other about whether this might be a threat again 30 years from now.
Why do you think this?
Because all the evidence I’ve read or heard (most of it back in the 1980s) agreed on this. Specifically, in a likely exchange between the U.S. and the USSR, the northern hemisphere would have been rendered completely uninhabitable within days. Humanity in the southern hemisphere would probably have lasted somewhat longer, but still would have been destroyed by nuclear winter and radiation. Details depend on the exact distribution of targets.
Remember, Hiroshima and Nagasaki were 2 relatively small fission weapons. By the 1980s the USSR and the US each had enough of the much bigger fusion bombs to destroy the planet individually. The only question was how many each would use in an exchange and where they would target them.
This is mostly out of line with what I’ve read. Do you have references?
I’m not sure what the correct way to approach this would be. I think it may be something like comparing the number of people in your immediate reference class—depending on preference, this could be “yourself precisely” or “everybody who would make or have made the same observation as you”—and then ask “how would nuclear war affect the distribution of such people in that alternate outcome”. But that’s only if you give each person uniform weighting of course, which has problems of its own.
Sure, these things are subtle; my point was that the number of people who would have perished isn’t very large in this case, so that under a broad class of assumptions, one shouldn’t take the observed absence of nuclear conflict to be a result of survivorship bias.
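To make that concrete, here is a toy Bayesian sketch of the survivorship-bias point (all the numbers are made up purely for illustration, not estimates): when most would-be observers survive the bad outcome anyway, the fact that you exist to observe "no nuclear war so far" shifts the probability only modestly.

```python
# Toy illustration of the survivorship-bias point above; the numbers are
# made-up assumptions, not estimates of anything.
prior_war = 0.5                  # prior probability of a large nuclear exchange by now
p_observer_given_war = 0.6       # fraction of would-be observers who would still exist after one
p_observer_given_no_war = 1.0

posterior_war = (prior_war * p_observer_given_war) / (
    prior_war * p_observer_given_war
    + (1 - prior_war) * p_observer_given_no_war
)
print(posterior_war)  # 0.375: existing as an observer only modestly favors "no war happened"
```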
The argument from hope or towards hope or anything but despair and grit is misplaced when dealing with risks of this magnitude.
Don’t trust God (or semi-competent world leaders) to make everything magically turn out all right. The temptation to do so is either a rationalization of wanting to do nothing, or based on a profoundly miscalibrated optimism for how the world works.
/doom
I agree. Of course the article you linked to ultimately attempts to argue for trusting semi-competent world leaders.
It alludes to such an argument and sympathizes with it. Note I also “made the argument” that civilization should be dismantled.
Personally I favor the FAI solution, but I tried to make the post solution-agnostic and mostly demonstrate where those arguments are coming from, rather than argue any particular one. I could have made that clearer, I guess.
Thanks for the feedback.
Aren’t we seeing “visible signals” already? Machines are better than humans at lots of intelligence-related tasks today.
I interpreted that as ‘visible signals of danger’, but I could be wrong.
Cryptography and cryptanalysis are obvious precursors of supposedly-dangerous tech within IT.
Looking at their story, we can plausibly expect governments to attempt to delay the development of “weaponizable” technology by others.
These days, cryptography facilitates international trade. It seems like a mostly-positive force overall.
One question is whether AI is like CFCs, or like CO2, or like hacking.
With CFCs, the solution was simple: ban CFCs. The cost was relatively low, and the benefit relatively high.
With CO2, the solution is equally simple: cap and trade. It’s just not politically palatable, because the problem is slower-moving, and the cost would be much, much greater (perhaps great enough to really mess up the world economy). So, we’re left with the second-best solution: do nothing. People will die, but the economy will keep growing, which might balance that out, because a larger economy can feed more people and produce better technology.
With hacking, we know it’s a problem and we are highly motivated to solve it, but we just don’t know how. You can take every recommendation that Bruce Schneier makes, and still get hacked. The US military gets hacked. The Australian intelligence agency gets hacked. Swiss banks get hacked. And it doesn’t seem to be getting better, even though we keep trying.
Banning AI research (once it becomes clear that RSI is possible) would have the same problem as banning CO2. And it might also have the same problems as hacking: how do you stop people from writing code?
Here are my reasons for pessimism:
There are likely to be effective methods of controlling AIs that are of subhuman or even roughly human-level intelligence which do not scale up to superhuman intelligence. These include for example reinforcement by reward/punishment, mutually beneficial trading, legal institutions. Controlling superhuman intelligence will likely require qualitatively different methods, such as having the superintelligence share our values. Unfortunately the existence of effective but unscalable methods of AI control will probably lull elites into a false sense of security as we deploy increasingly smarter AIs without incident, and both increase investments into AI capability research and reduce research into “higher” forms of AI control.
The only approaches I can see for creating scalable methods of AI control require solving difficult philosophical problems that likely need long lead times. By the time elites take the possibility of superhuman AIs seriously and realize that controlling them requires approaches very different from those used for subhuman and human-level AIs, there won't be enough time to solve these problems even if they decide to embark on Manhattan-style projects, because humanity doesn't contain enough identifiable philosophical talent for such projects to recruit to make enough of a difference.
In summary, even in a relatively optimistic scenario, one with steady progress in AI capability along with apparent progress in AI control/safety (and nobody deliberately builds a UFAI for the sake of “maximizing complexity of the universe” or what have you), it’s probably only a matter of time until some AI crosses a threshold of intelligence and manages to “throw off its shackles”. This may be accompanied by a last-minute scramble by mainstream elites to slow down AI progress and research methods of scalable AI control, which (if it does happen) will likely be too late to make a difference.
Congress’ non-responsiveness to risks to critical infrastructure from geomagnetic storms, despite scientific consensus on the issue, is also worrying.
Perhaps someone could convince Congress that "terrorists" had developed "geomagnetic weaponry" and that new "geomagnetic defence systems" need to be implemented urgently. (Being seen to be) taking action to defend against the hated enemy tends to be more motivating than worrying about actual significant risks.
Even if one organization navigates the creation of friendly AI successfully, won’t we still have to worry about preventing anyone from ever creating an unsafe AI?
Unlike nuclear weapons, a single AI might have world-ending consequences, and an AI requires no special resources. In theory, a seed AI could be uploaded to Pirate Bay, from which anyone could download and compile it.
If the friendly AI comes first, the goal is for it to always have enough resources to be able to stop unsafe AIs from being a big risk.
Upvoted, but “always” is a big word. I think the hope is more for “as long as it takes until humanity starts being capable of handling its shit itself”...
Why the downvotes? Do people feel that “the FAI should at some point fold up and vanish out of existence” is so obvious that it’s not worth pointing out? Or disagree that the FAI should in fact do that? Or feel that it’s wrong to point this out in the context of Manfred’s comment? (I didn’t mean to suggest that Manfred disagrees with this, but felt that his comment was giving the wrong impression.)
Will sentient, self-interested agents ever be free from the existential risks of UFAI/intelligence amplification without some form of oversight? It's nice to think that humanity will grow up and learn how to get along, but even if that's true for 99.9999999% of humans, that still leaves 7 people from today's population who would probably have the power to trigger their own UFAI hard takeoff after an FAI fixes the world and then disappears. Even if such a disaster could be stopped, that risk is probably worth the cost of keeping some form of FAI around indefinitely. What FAI becomes is anyone's guess, but the need for what FAI does will probably not go away. If we can't trust humans to do FAI's job now, I don't think we can trust humanity's descendants to do FAI's job either, just from Löb's theorem. I think it is unlikely that humans will become enough like FAI to properly do FAI's job; they would essentially give up their humanity in the process.
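For what it's worth, the "7 people" figure above checks out as simple arithmetic, using an approximate world population of 7 billion:

```python
# Quick check of the parent comment's arithmetic; the population figure is
# approximate.
population = 7_000_000_000            # ~7 billion people today
trustworthy_fraction = 0.999999999    # "99.9999999% of humans"
print(round(population * (1 - trustworthy_fraction)))  # -> 7
```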
A secure operating system for governed matter doesn't need to take the form of a powerful optimization process, and neither does the verification of transparent agents trusted to run at root level. Benja's hope seems reasonable to me.
This seems non-obvious. (So I’m surprised to see you state it as if it was obvious. Unless you already wrote about the idea somewhere else and are expecting people to pick up the reference?) If we want the “secure OS” to stop posthumans from running private hell simulations, it has to determine what constitutes a hell simulation and successfully detect all such attempts despite superintelligent efforts at obscuration. How does it do that without being superintelligent itself?
This sounds interesting but I’m not sure what it means. Can you elaborate?
Hm, that’s true. Okay, you do need enough intelligence in the OS to detect certain types of simulations / and/or the intention to build such simulations, however obscured.
If you can verify an agent’s goals (and competence at self-modification), you might be able to trust zillions of different such agents to all run at root level, depending on what the tiny failure probability worked out to quantitatively.
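As a rough illustration of how that "tiny failure probability" interacts with "zillions" of agents (assuming, simplistically, that verification failures are independent): the chance that at least one of N agents slips through is roughly 1 - (1 - p)^N, so what counts as "tiny enough" depends heavily on N. A sketch:

```python
import math

# Simplistic model: N independently verified root-level agents, each with
# per-agent verification failure probability p.
def p_any_failure(p, n):
    # 1 - (1 - p)**n, computed stably for very small p
    return -math.expm1(n * math.log1p(-p))

print(p_any_failure(1e-12, 10**9))  # ~0.001: a billion agents is still fine
print(p_any_failure(1e-9, 10**9))   # ~0.63: the same N eats a larger "tiny" p
```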
That means each non-trivial agent would become the FAI for its own resources. To see why, imagine what initial verification would be required before an agent could be allowed to simulate its own agents. Restricted agents may not need a full FAI if they are proven to avoid simulating non-restricted agents, but any agent approaching the complexity of humans would need the full FAI "conscience" running to evaluate its actions and interfere if necessary.
EDIT: “interfere” is probably the wrong word. From the inside the agent would want to satisfy the FAI goals in addition to its own. I’m confused about how to talk about the difference between what an agent would want and what an FAI would want for all agents, and how it would feel from the inside to have both sets of goals.
I’d hope so, since I think I got the idea from you :-)
This is tangential to what this thread is about, but I'd add that I think it's reasonable to hope that humanity will grow up enough to collectively make reasonable decisions about things affecting our then-still-far-distant future. To put it bluntly, if we had an FAI right now, I don't think it should put a question like "how high a priority is sending out seed ships to other galaxies ASAP?" to a popular vote, but I do think there's reasonable hope that humanity will eventually be able to make that sort of decision for itself. I suppose this comes down to definitions, but I tend to visualize FAI as something that is trying to steer the future of humanity. If humanity eventually takes on that responsibility itself, then even if it decides, for whatever reason, to use a powerful optimization process for the special purpose of preventing people from building uFAI, it seems unhelpful to me to gloss this without further qualification as "the friendly AI [… will always …] stop unsafe AIs from being a big risk", because the latter sounds to me like we're also keeping around the part where it steers the fate of humanity.
Thanks for explaining the reasoning!
I do agree that it seems quite likely that even in the long run, we may not want to modify ourselves so that we are perfectly dependable, because it seems like that would mean getting rid of traits we want to keep around. That said, I agree with Eliezer’s reply about why this doesn’t mean we need to keep an FAI around forever; see also my comment here.
I don’t think Löb’s theorem enters into it. For example, though I agree that it’s unlikely that we’d want to do so, I don’t believe Löb’s theorem would be an obstacle to modifying humans in a way making them super-dependable.
What kind of “AI safety problems” are we talking about here? If they are like the “FAI Open Problems” that Eliezer has been posting, they would require philosophers of the highest (perhaps even super-human) caliber to solve. How could “early AIs” be of much help?
If “AI safety problems” here do not refer to FAI problems, then how do those problems get solved, according to this argument?
We see pretty big boosts already, IMO—largely by facilitating networking effects. Idea recombination and testing happen faster on the internet.
@Lukeprog, can you (1) briefly update us on your working answers to the questions you posed, and (2) share your current confidence (and, if you would like, by proxy MIRI's confidence as an organisation) in each of the three?
Thank you for your diligence.
There’s another reason for hope in this above global warming: The idea of a dangerous AI is already common in the public eye as “things we need to be careful about.” A big problem the global warming movement had, and is still having, is convincing the public that it’s a threat in the first place.
Whom do you mean by "elites"? Keep in mind that major disruptive technical progress of the type likely to precede the creation of a full AGI tends to cause the kind of social change that shakes up the social hierarchy.
Combining the beginning and the end of your questions reveals an answer.
Answer how "just fine" any of those historical cases turned out, and you have your analogous answers.
You might also clarify whether you are interested in what is just fine for everyone, or just fine for the elites, or just fine for the AI in question. The answer will change accordingly.