Not easily. Antiantiheroic epistemology might be a better term, i.e., I think that a merely accurate epistemology doesn’t have a built-in mechanism which prevents people from thinking they can do things because the outside view says it’s nonvirtuous to try to distinguish yourself within reference class blah. Antiantiheroic epistemology doesn’t say that it’s possible to distinguish yourself within reference class blah so much as it thinks that the whole issue is asking the wrong question and you should mostly be worrying about staying engaged with the object-level problem because this is how you learn more and gain the ability to take opportunities as they arrive. An antiheroic epistemology that throws up some reference class or other saying this is impossible will regard you as trying to distinguish yourself within this reference class, but this is not what the antiantiheroic epistemology is actually about; that’s an external indictment of nonvirtuosity arrived at by additional modus ponens to conclusions on which antiantiheroic epistemology sees no reason to expend cognitive effort.
Obviously from my perspective non-antiheroic epistemology cancels out to mere epistemology, simpler for the lack of all this outside-view-social-modesty wasted motion, but to just go around telling you “That’s not how epistemology works, of course!” would be presuming a known standard which is logically rude (I think you are doing this, though not too flagrantly).
An archetypal example of antiantiheroic epistemology is Harry in Methods of Rationality, who never bothers to think about any of this reference-class stuff or whether he’s being immodest; he just thinks about his object-level problems in taking over the universe, except once when Hermione challenges him on it and Harry manages to do one thing a normal wizard can’t. Harry doesn’t try to convince himself of anything along those lines, or think about it without Hermione’s prompting. It just doesn’t occur to him that it might be a useful thought process.
I don’t think it’s a useful thought process either, and rationalizing elaborate reasons why I’m allowed to be a hero wouldn’t be useful either (Occam’s Imaginary Razor: decorating my thought processes with supportive tinsel will just slow down any changes I need to make), which is why I tend to be annoyed by the entire subject and wish people would get back to the object level instead of meta demands for modesty that come with no useful policy suggestions about ways to do anything better. Tell me a better object-level way to save the world and we can talk about my doing that instead.
“Antiantiheroic epistemology might be a better term, i.e., I think that a merely accurate epistemology doesn’t have a built-in mechanism which prevents people from thinking they can do things because the outside view says it’s nonvirtuous to try to distinguish yourself within reference class blah.”
Taken literally, I can’t possibly disagree with this, but it doesn’t seem to answer my question, which is “where is the positive evidence that one is not supposed to ignore.” I favor combining many different kinds of evidence, including sparse data. And that can and does lead to very high expectations for particular individuals.
For example, several of my college fraternity brothers are now billionaires. Before Facebook, Mark Zuckerberg was clearly the person I knew with the highest entrepreneurial potential, based on his intelligence, motivation, ambition, and past achievements in programming, business, and academics. People described him to me as resembling a young Bill Gates. His expected future wealth, estimated from that data on the assumption that he pursued entrepreneurship, and informed by what I could track about how those characteristics relate to entrepreneurial outcomes, was in the 9-figure range. Then add in that Facebook was a very promising startup (I did some market-sizing estimates for it, and people who looked at it and its early results were reliably impressed).
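To make the arithmetic behind that kind of estimate explicit, here is a minimal sketch with entirely hypothetical probabilities and payoffs (not the data Carl actually used), showing how a 9-figure expectation can arise even when failure is the most likely single outcome:

```python
# Toy expected-value calculation with made-up numbers; probabilities and payoffs
# are illustrative placeholders, not estimates anyone in this thread endorsed.
outcomes = [
    (0.60, 0),      # venture fails or exits near zero
    (0.25, 5e6),    # modest exit
    (0.10, 1e8),    # large exit
    (0.04, 1e9),    # very large outcome
    (0.01, 2e10),   # Gates/Zuckerberg-scale outcome
]
expected_wealth = sum(p * w for p, w in outcomes)
print(f"Expected future wealth: ${expected_wealth:,.0f}")  # ~$250 million with these inputs
```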
Moving from entrepreneurship to politics, one can predict success to a remarkable degree with evidence like “Eton graduate, Oxford PPE graduate with a first-class degree, Oxford Union leader, interested in politics, starts in an entry-level political adviser job with a party.” See this post or this paper. Most of the distance in log odds to reliably becoming Prime Minister, let alone Member of Cabinet or Parliament, can be crossed with objective indicators. Throw in a bit more data about early progress, media mentions, and the like, and the prediction improves still more.
I would then throw in other evidence, like the impressiveness of the person’s public speaking relative to other similar people, and the number and influence of their friends and contacts in high places relative to other similar types (indicating both social capital and skill at getting more), which could improve or worsen the picture. There is still a sizable chunk of randomness in log terms, as political careers are buffeted by switches in party control, the economy, the rise and fall of factions that carry members with them, and other hard-to-control factors at many stages. So I can and do come to expect that someone will probably attain national political office, and have a good shot at Cabinet, and less so for PM. But within the real distribution of characteristics I won’t be convinced that a young person will probably become PM, which would require almost zero noise.
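As a sketch of what “crossing most of the distance in log odds” amounts to, here is a toy calculation in which a handful of objective indicators, each contributing a likelihood ratio, moves a tiny base rate most of the way to even odds. The base rate and likelihood ratios are invented, and the indicators are treated as independent purely for illustration; the analyses cited above fit such indicators to historical data rather than assuming values like these:

```python
import math

# Toy log-odds accumulation for "reaches high national political office".
base_rate = 1e-5  # hypothetical prior probability for a random citizen
log_odds = math.log(base_rate / (1 - base_rate))

# Hypothetical likelihood ratios for objective indicators (assumed independent).
likelihood_ratios = {
    "Oxford PPE, first-class degree": 30,
    "Oxford Union leader": 20,
    "entry-level party adviser job": 50,
    "early media mentions": 5,
}
for indicator, lr in likelihood_ratios.items():
    log_odds += math.log(lr)

posterior = 1 / (1 + math.exp(-log_odds))
print(f"Posterior probability: {posterior:.0%}")  # ~60%: most of the log-odds gap is crossed
```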
In science I can be convinced a young star is a good prospect for Nobel or Fields medal caliber work. But I would need stronger evidence than we have seen for anyone to expect that they would do this 10 times (since no one has done so). I am sympathetic to Wei Dai’s comment:
“Carl referenced ‘Staring Into the Singularity’ as an early indicator of your extraordinary verbal abilities (which explains much if not all of your subsequent successes). It suggests that’s how you initially attracted his attention. The same is certainly true for me. I distinctly recall saying to myself ‘I should definitely keep track of this guy’ when I read that, back in the extropian days. Is that enough for you to count as ‘people who you met because of their previous success’?
“To state my overall position on the topic being discussed, I think according to ‘non-heroic epistemology’, after someone achieves an ‘impossible success’, you update towards them being able to achieve further successes of roughly the same difficulty and in related fields that use similar skills, but the posterior probabilities of them solving much more difficult problems or in fields that use very different skills remain low (higher relative to the prior, but still low in an absolute sense). Given my understanding of the distribution of cognitive abilities in humans, I don’t see why I would ever ‘give up’ this epistemology, unless you achieved a level of success that made me suspect that you’re an alien avatar or something.”
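A minimal numerical illustration of Wei Dai’s point, with hypothetical numbers rather than anything specified in the thread: even a large likelihood ratio from one “impossible success” leaves the posterior for a far harder feat low in absolute terms.

```python
# Toy Bayesian update illustrating "higher relative to the prior, but still low
# in an absolute sense". All numbers are hypothetical.
prior = 1e-6            # prior probability of ability sufficient for the far harder feat
likelihood_ratio = 100  # the observed success is 100x likelier given that ability

prior_odds = prior / (1 - prior)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)
print(f"Posterior: {posterior:.6f}")  # ~0.0001: a hundredfold update, still small
```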
I would be quite surprised to see you reliably making personal mathematical contributions at the level of the best mathematicians and AI researchers. I would not be surprised to see MIRI workshop participants making progress on the problems at a level consistent with the prior evidence of their ability, and somewhat higher per unit time, because workshops harvest ideas generated over a longer period, are solely dedicated to research, have a lot of collaboration and cross-fertilization, and may benefit from improved motivation and some nice hacking of associated productivity variables. And I would not be surprised at a somewhat higher-than-typical rate of interesting (to me, etc.) results because of looking at strange problems.
I would be surprised if the strange problems systematically deliver relatively huge gains on actual AI problems (and this research line is supposed to deliver AGI as a subset of FAI before others get AGI, so it must have great utility in AGI design), i.e., if the strange problems are super-promising by the criteria that Pearl or Hinton or Ng or Norvig are using but have been neglected through blunder. I would be surprised if the distance to AGI is crossable in 20 years.
“I don’t think it’s a useful thought process either, and rationalizing elaborate reasons why I’m allowed to be a hero wouldn’t be useful either (Occam’s Imaginary Razor: decorating my thought processes with supportive tinsel will just slow down any changes I need to make), which is why I tend to be annoyed by the entire subject and wish people would get back to the object level instead of meta demands for modesty that come with no useful policy suggestions about ways to do anything better. Tell me a better object-level way to save the world and we can talk about my doing that instead.”
You are asking other people for their money and time, when they have other opportunities. To do that they need an estimate of the chance of MIRI succeeding, considering things like AI timelines, the speed of takeoff given powerful AI, the competence of other institutions, the usefulness of MIRI’s research track, the feasibility of all alternative solutions to AI risk/AI control problems, how much MIRI-type research will be duplicated over what timescales by researchers interested for other reasons, and many other factors, including the ability to execute given the difficulty of the problems and the likelihood of relevance (a toy decomposition of such an estimate is sketched just after the list below). So they need adequate object-level arguments about those contributing factors, or some extraordinary evidence for trusting your estimates of all of them over the estimates of others without a clear object-level case. Some of the other opportunities available to them, which they need to compare against MIRI:
Build up general altruistic capacities through things like the effective altruist movement or GiveWell’s investigation of catastrophic risks (which can address AI in many ways, including ones not now visible, and benefit from much greater resources as well as greater understanding from operating closer in time to AI); noting that these seem to scale faster and spill over
Invest money in an investment fund for the future which can invest more (in total and as a share of effort) when there are better opportunities, either by the discovery of new options, or the formation of better organizations or people (which can receive seed funding from such a trust)
Enhance decision-making and forecasting capabilities with things like the IARPA forecasting tournaments, science courts, etc., to improve reactions to developments including AI and others (recalling that most of the value of MIRI in your model comes from major institutions being collectively foolish or ignorant regarding AI going forward)
Prediction markets, meta-research, and other institutional changes
Work like Bostrom’s, seeking out crucial considerations and laying out analysis of issues such as AI risk for the world to engage with and to let key actors see the best arguments and reasons bearing on the problems
Pursue cognitive enhancement technologies or education methods (you cite CFAR in this domain) to improve societal reaction to such problems
Find the most effective options for synthetic biology threats (GiveWell will be looking through them) and see if that is a more promising angle
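As flagged above, here is a toy sketch of the kind of multi-factor estimate a donor would need to make about MIRI. Everything in it is an assumption for illustration: the choice of factors, every probability, and the independence used to keep the arithmetic simple.

```python
# Toy decomposition of the estimate described before the list above.
# The factor list is abbreviated, every probability is an arbitrary placeholder,
# and independence is assumed only to keep the arithmetic simple.
factors = {
    "AI developed on the relevant timescale": 0.5,
    "fast takeoff makes prior FAI work decisive": 0.4,
    "other institutions fail to handle the problem": 0.5,
    "MIRI's research direction proves relevant": 0.3,
    "MIRI executes well enough given the difficulty": 0.2,
}
p_success = 1.0
for factor, p in factors.items():
    p_success *= p
print(f"Implied chance of success: {p_success:.3f}")  # 0.006 with these made-up inputs
```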
“You are asking other people for their money and time, when they have other opportunities. To do that they need an estimate of the chance of MIRI succeeding”
No they don’t; they could be checking relative plausibility of causing an OK outcome without trying to put absolute numbers on a probability estimate, and this is reasonable due to the following circumstances:
The life lesson I’ve learned is that by the time you really get anywhere, if you get anywhere, you’ll have encountered some positive miracles, some negative miracles, your plans will have changed, you’ll have found that the parts which took the longest weren’t what you thought they would be, and that other things proved to be much easier than expected. Your successes won’t come through foreseen avenues, and neither will your failures. But running through it all will be the fundamental realization that everything you accomplished, and all the unforeseen opportunities you took advantage of, were things that you would never have received if you hadn’t attacked the hardest part of the problem that you knew about straight-on, without distraction.
How do you estimate probabilities like that? I honestly haven’t a clue. Now, we all still have to maximize expected utility, but the heuristic I’m applying to do that (which at the meta level I think is the planning heuristic with the best chance of actually working) is to ask “Is there any better way of attacking the hardest part of the problem?” or “Is there any better course of action which doesn’t rely on someone else performing a miracle?” So far as I can tell, these other proposed courses of action don’t attack the hardest part of the problem for humanity’s survival, but rely on someone else performing a miracle. I cannot make myself believe that this would really actually work. (And System 2 agrees that System 1’s inability to really believe seems well-founded.)
Since I’m acting on such reasons and heuristics as “If you don’t attack the hardest part of the problem, no one else will” and “Beware of taking the easy way out” and “Don’t rely on someone else to perform a miracle”, I am indeed willing to term what I’m doing “heroic epistemology”. It’s just that I think such reasoning is, you know, actually correct and normative under these conditions.
If you don’t mind mixing the meta-level and the object-level, then I find any reasoning along the lines of “The probability of our contributing to solving FAI is too low, maybe we can have a larger impact by working on synthetic biology defense and hoping a miracle happens elsewhere” much less convincing than the meta-level observation, “That’s a complete Hail Mary pass, if there’s something you think is going to wipe out humanity then just work on that directly as your highest priority.” All the side cleverness, on my view, just adds up to losing the chance that you get by engaging directly with the problem and everything unforeseen that happens from there.
Another way of phrasing this is that if we actually win, I fully expect the counterfactual still-arguing-about-this version of 2013-Carl to say, “But we succeeded through avenue X, while you were then advocating avenue Y, which I was right to say wouldn’t work.” And to this the counterfactual reply of Eliezer will be, “But Carl, if I’d taken your advice back then, I wouldn’t have stayed engaged with the problem long enough to discover and comprehend avenue X and seize that opportunity, and this part of our later conversation was totally foreseeable in advance.” Hypothetical oblivious!Carl then replies, “But the foreseeable probability should still have been very low” or “Maybe you or someone else would’ve tried Y without that detour, if you’d worked on Z earlier” where Z was not actually uniquely suggested as the single best alternative course of action at the time. If there’s a reply that counterfactual non-oblivious Carl can make, I can’t foresee it from here, under those hypothetical circumstances unfolding as I describe (and you shouldn’t really be trying to justify yourself under those hypothetical circumstances, any more than I should be making excuses in advance for what counterfactual Eliezer says after failing, besides “Oops”).
My reasoning here is, from my internal perspective, very crude, because I’m not sure I really actually trust non-crude reasoning. There’s this killer problem that’s going to make all that other stuff pointless. I see a way to make progress on it, on the object level; the next problem up is visible and can be attacked. (Even this wasn’t always true, and I stuck with the problem anyway long enough to get to the point where I could state the tiling problem.) Resources should go to attacking this visible next step on the hardest problem. An exception to this as top priority maximization was CFAR, via “teaching rationality demonstrably channels more resources toward FAI; and CFAR which will later be self-sustaining is just starting up; plus CFAR might be useful for a general saving throw bonus; plus if a rational EA community had existed in 1996 it would have shaved ten years off the timeline and we could easily run into that situation again; plus I’m not sure MIRI will survive without CFAR”. Generalizing, young but hopefully self-sustaining initiatives can be plausibly competitive with MIRI for small numbers of marginal dollars, provided that they’re sufficiently directly linked to FAI down the road. Short of that, it doesn’t really make sense to ignore the big killer problem and hope somebody else handles it later. Not really actually.
If the year were 1960, which would you rather have?
10 smart people trying to build FAI for 20 years, 1960-1980
A billion dollars, a large supporting movement, prediction markets and science courts that make the state of the evidence on AI transparent, and teams working on FAI, brain emulations, cognitive enhancement, and more but starting in 1990 (in expectation closer to AI)
At any given time there are many problems whose solutions are very important but where the time isn’t yet right to act on the problems themselves, rather than on the capabilities to act on them, and on the capability to deal with the individually unexpected problems that come along so regularly. Investment-driven and movement-building-driven discount rates are relevant even for existential risk.
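A minimal sketch of how such discount rates cash out in the 1960 thought experiment, under an invented growth rate: capacity that compounds for thirty years can dwarf the starting stock, at the price of beginning (in expectation) closer to AI.

```python
# Toy illustration of investment- and movement-building-driven discount rates.
# The growth rate is an arbitrary assumption; only the shape of the tradeoff matters.
growth_rate = 0.15                    # assumed annual growth of money/people/insight
multiplier = (1 + growth_rate) ** 30  # relative capacity by 1990 versus 1960
print(f"Capacity multiplier after 30 years: {multiplier:.0f}x")  # ~66x at this rate
```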
GiveWell has grown in influence much faster than the x-risk community while working on global health, and is now in the process of investigating and pivoting towards higher-leverage causes, with global catastrophic risk among the top three under consideration.
I’d rather have both, hence diverting some marginal resources to CFAR until it was launched, then switching back to MIRI. Is there a third thing that MIRI should divert marginal resources to right now?
I have just spent a month in England interacting extensively with the EA movement here (maybe your impressions from the California EA summit differ, I’d be curious to hear). Donors interested in the far future are also considering donations to the following (all of these are from talks with actual people making concrete short-term choices; in addition to donations, people are also considering career choices post-college):
80,000 Hours, CEA, and other movement-building and capacity-increasing organizations (including CFAR), which also increase non-charity options (e.g. 80k helping people go into scientific funding agencies and political careers where they will be in a position to affect research and policy reactions to technologies relevant to x-risk and other trajectory changes)
AMF/GiveWell charities to keep GiveWell and the EA movement growing while actors like GiveWell, Paul Christiano, Nick Beckstead, and others at FHI investigate the intervention options and cause prioritization, followed by organization-by-organization analysis of the GiveWell variety, laying the groundwork for massive support for the top far-future charities and organizations identified by said processes
Finding ways to fund such evaluations with RFMF (room for more funding), e.g. by paying for FHI or CEA hires to work on them
The FHI’s other work
A donor-advised fund investing its returns until such evaluations or more promising opportunities present themselves or are elicited by the fund (possibilities include Drexler’s nanotech panel, extensions of the DAGGRE methods, a Bayesian aggregation algorithm that greatly improves the extraction of scientific expert opinion, science courts that could mobilize much more talent and resources toward neglected problems with good cases, or some key steps in biotech enhancement; see the sketch after this list)
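To illustrate (not specify) the kind of expert-aggregation algorithm mentioned in the last item, here is a generic weighted log-odds pool over hypothetical forecasts; the experts, weights, and probabilities are all invented.

```python
import math

# Generic weighted log-odds ("logarithmic opinion") pool; a textbook method, offered
# only as a sketch of what Bayesian aggregation of expert opinion can look like.
def log_odds_pool(probs, weights):
    pooled = sum(w * math.log(p / (1 - p)) for p, w in zip(probs, weights)) / sum(weights)
    return 1 / (1 + math.exp(-pooled))

expert_forecasts = [0.2, 0.4, 0.7]  # hypothetical expert probabilities for some claim
weights = [1.0, 2.0, 1.0]           # hypothetical weights, e.g. from track records
print(f"Pooled estimate: {log_odds_pool(expert_forecasts, weights):.2f}")  # ~0.42
```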
That’s why Peter Hurford posted the OP: he’s an EA considering all these options and wants to compare them to MIRI.
That is a sort of discussion my brain puts in a completely different category. Peter and Carl, please always give me a concrete alternative policy option that (allegedly) depends on a debate, if such is available; my brain is then far less likely to label the conversation “annoying useless meta objections that I want to just get over with as fast as possible”.
“AMF/GiveWell charities to keep GiveWell and the EA movement growing while actors like GiveWell, Paul Christiano, Nick Beckstead, and others at FHI investigate the intervention options and cause prioritization, followed by organization-by-organization analysis of the GiveWell variety, laying the groundwork for massive support for the top far-future charities and organizations identified by said processes”
Cool; if MIRI keeps going, they might be able to show, with adequate evidence, that FAI is the top focus by the time all of this comes together.
Well, in collaboration with FHI. As soon as Bostrom’s Superintelligence is released, we’ll probably be building on and around that to make whatever cases we think are reasonable to make.
“they could be checking relative plausibility of causing an OK outcome without trying to put absolute numbers on a probability estimate, and this is reasonable due to the following circumstances”
“Build up general altruistic capacities through things like the effective altruist movement or GiveWell’s investigation of catastrophic risks”
I read every blog post they put out.
“Invest money in an investment fund for the future which can invest more [...] when there are better opportunities”
I figure I can use my retirement savings for this.
“(recalling that most of the value of MIRI in your model comes from major institutions being collectively foolish or ignorant regarding AI going forward)”
I thought it came from them being collectively foolish or ignorant regarding Friendliness rather than AGI.
“Prediction markets, meta-research, and other institutional changes”
Meh. Sounds like Lean Six Sigma or some other buzzword business process improvement plan.
“Work like Bostrom’s”
Luckily, Bostrom is already doing work like Bostrom’s.
“Pursue cognitive enhancement technologies or education methods”
Too indirect for my taste.
“Find the most effective options for synthetic biology threats”
Not very scary compared to AI. Lots of known methods to combat green goo.