Hm. I’m not sure if Scott Aaronson has any weird views on AI in particular, but if he’s basically mainstream-oriented we could potentially ask him to briefly skim the Tiling Agents paper and say if it’s roughly the sort of paper that it’s reasonable for an organization like MIRI to be working on if they want to get some work started on FAI. At the very least if he disagreed I’d expect he’d do so in a way I’d have better luck engaging conversationally, or if not then I’d have two votes for ‘please explore this issue’ rather than one.
I feel again like you’re trying to interpret the paper according to a different purpose from what it has. Like, I suspect that if you described what you thought a promising AGI research agenda was supposed to deliver on what sort of timescale, I’d say, “This paper isn’t supposed to do that.”
No, it’s clear that there have been many advances, for example in chess playing programs, auto-complete search technology, automated translation, driverless cars, and speech recognition.
But my impression is that this work has only made a small dent in the problem of general artificial intelligence.
This part is clearer and I think I may have a better idea of where you’re coming from, i.e., you really do think the entire field of AI hasn’t come any closer to AGI, in which case it’s much less surprising that you don’t think the Tiling Agents paper is the very first paper ever to come closer to AGI. But this sounds like a conversation that someone else could have with you, because it’s not MIRI-specific or FAI-specific. I also feel somewhat at a loss for where to proceed if I can’t say “But just look at the ideas behind Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, that’s obviously important conceptual progress because...” In other words, you see AI doing a bunch of things, we already mostly agree on what these sorts of surface real-world capabilities are, but after checking with some friends you’ve concluded that this doesn’t mean we’re less confused about AGI than we were in 1955. I don’t see how I can realistically address that except by persuading your authorities; I don’t see what kind of conversation we could have about that directly without being able to talk about specific AI things.
Meanwhile, if you specify “I’m not convinced that MIRI’s paper has a good chance of being relevant to FAI, but only for the same reasons I’m not convinced any other AI work done in the last 60 years is relevant to FAI” then this will make it clear to everyone where you’re coming from on this issue.
He wrote this about a year ago:

could an AI improve itself to something that was “as incomprehensibly far beyond humans as Turing machines are beyond finite automata”?
as I wrote in my “Singularity is Far” post, my strong guess (based, essentially, on the Church-Turing Thesis) is that the answer is no. I believe—as David Deutsch also argues in “The Beginning of Infinity”—that human beings are “qualitatively,” if not quantitatively, already at some sort of limit of intellectual expressive power. More precisely, I conjecture that for every AI that can exist in the physical world, there exists a constant k such that a reasonably-intelligent human could understand the AI perfectly well, provided the AI were slowed down by a factor of k. So then the issue is “merely” that k could be something like 10^20.
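To make the quantifier structure of that conjecture explicit, here is one rough way to write it down; the notation is an editorial gloss rather than Scott’s own, and “understand” is left exactly as informal as it is in the quote:

% A sketch formalization of the conjecture quoted above (assumptions: \mathcal{A} stands
% for the set of AIs that can exist in the physical world, and k is the slowdown factor).
\[
  \forall A \in \mathcal{A} \;\; \exists k \ge 1 :\;
  \text{a reasonably intelligent human could understand } A
  \text{ perfectly well, provided } A \text{ were slowed down by a factor of } k.
\]

On this reading the conjecture sits comfortably with the Church-Turing picture, and the practical force of the objection rests entirely on how large k can be; the quote itself allows k to be something like 10^20.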
And later:
I’m not sure how much I agree with Karnofsky’s “tool vs. agent” distinction, but his broader point is very similar to mine: namely, the uncertainties regarding “Friendly AI” are so staggering that it’s impossible to say with confidence whether any “research” we do today would be likelier to increase or decrease the chance of catastrophe (or just be completely irrelevant).
For that reason, I would advise donating to SIAI if, and only if, you find the tangible activities that they actually do today—most notably (as far as I can tell), the Singularity Summit and Eliezer’s always-interesting blog posts about “the art of rationality”—to be something you want to support.
Without further context I see nothing wrong here. Superintelligences are Turing machines, check. You might need a 10^20 slowdown before that becomes relevant, check. It’s possible that the argument proves too much by showing that a well-trained high-speed immortal dog can simulate Mathematica and therefore a dog is ‘intellectually expressive’ enough to understand integral calculus, but I don’t know if that’s what Scott means and principle of charity says I shouldn’t assume that without confirmation.
EDIT: Parent was edited, my reply was to the first part, not the second. The second part sounds like something to talk with Scott about. I really think the “You’re just as likely to get results in the opposite direction” argument is, on priors, overstated for most forms of research. Does Scott think that work we do today is just as likely to decrease our understanding of P/NP as increase it? We may be a long way off from proving an answer, but that’s not a reason to adopt such a strange prior.
As it happens, I’ve been chatting with Scott about this issue recently, due to some comments he made in his recent quantum Turing machine paper:
the uncomfortable truth is that it’s the Singularitarians who are the scientific conservatives, while those who reject their vision as fantasy are scientific radicals. For at some level, all the Singularitarians are doing is taking conventional thinking about physics and the brain to its logical conclusion. If the brain is a “meat computer,” then given the right technology, why shouldn’t we be able to copy its program from one physical substrate to another? And why couldn’t we then run multiple copies of the program in parallel...?
...Certainly, one could argue that the Singularitarians’ timescales might be wildly off… [Also,] suppose we conclude — as many Singularitarians have — that the greatest problem facing humanity today is how to ensure that, when superhuman AIs are finally built, those AIs will be “friendly” to human concerns. The difficulty is: given our current ignorance about AI, how on earth should we act on that conclusion? Indeed, how could we have any confidence that whatever steps we did take wouldn’t backfire, and increase the probability of an unfriendly AI?
I thought his second objection (“how could we know what to do about it?”) was independent of his first objection (“AI seems farther away than the singularitarians tend to think”), but when I asked him about it, he said his second objection just followed from the first. So given his view that AI is probably centuries away, it seems really hard to know what could possibly help w.r.t. FAI. And if I thought AI was several centuries away, I’d probably have mostly the same view.
I asked Scott: “Do you think you’d hold roughly the same view if you had roughly the probability distribution over year of AI creation as I gave in When Will AI Be Created? Or is this part of your view contingent on AI almost certainly being several centuries away?”
He replied: “No, if my distribution assigned any significant weight to AI in (say) a few decades, then my views about the most pressing tasks today would almost certainly be different.” But I haven’t followed up to get more specifics about how his views would change.
And yes, Scott said he was fine with quoting this conversation in public.
I think I’d be happy with a summary of persistent disagreement where Jonah or Scott said, “I don’t think MIRI’s efforts are valuable because we think that AI in general has made no progress on AGI for the last 60 years / I don’t think MIRI’s efforts are priorities because we don’t think we’ll get AGI for another 2-3 centuries, but aside from that MIRI isn’t doing anything wrong in particular, and it would be an admittedly different story if I thought that AI in general was making progress on AGI / AGI was due in the next 50 years”.
I think that your paraphrasing

I don’t think MIRI’s efforts are valuable because I think that AI in general has made no progress on AGI for the last 60 years, but aside from that MIRI isn’t doing anything wrong in particular, and it would be an admittedly different story if I thought that AI in general was making progress on AGI.
is pretty close to my position.
I would qualify it by saying:
I’d replace “no progress” with “not enough progress for there to be a known research program with a reasonable chance of success.”
I have high confidence that some of the recent advances in narrow AI will contribute (whether directly or indirectly) to the eventual creation of AGI (contingent on this event occurring), just not necessarily in a foreseeable way.
If I discover that there’s been significantly more progress on AGI than I had thought, then I’ll have to reevaluate my position entirely. I could imagine updating in the direction of MIRI’s FAI work being very high value, or I could imagine continuing to believe that MIRI’s FAI research isn’t a priority, for reasons different from my current ones.

Agreed-on summaries of persistent disagreement aren’t ideal, but they’re more conversational progress than usually happens, so… thanks!
I really think the “You’re just as likely to get results in the opposite direction” argument is, on priors, overstated for most forms of research. Does Scott think that work we do today is just as likely to decrease our understanding of P/NP as increase it? We may be a long way off from proving an answer, but that’s not a reason to adopt such a strange prior.
I’m doing some work for MIRI looking at the historical track record of predictions of the future and actions taken based on them, and whether such attempts have systematically done as much harm as good.
To this end, among other things, I’ve been reading Nate Silver’s The Signal and the Noise. In Chapter 5, he discusses how attempts to improve earthquake predictions have consistently yielded worse predictive models than the Gutenberg-Richter law. This has slight relevance.
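For readers who haven’t seen it, the Gutenberg-Richter law referred to here is just a log-linear frequency-magnitude relation; the form below is the standard textbook statement rather than anything specific to Silver’s presentation:

% Gutenberg-Richter relation: N(M) is the number of earthquakes of magnitude >= M expected
% in a given region and time window; a sets the overall rate, and b (empirically often
% close to 1) sets how quickly large earthquakes become rare relative to small ones.
\[
  \log_{10} N(M) = a - b\,M
\]

The relevance is that this very simple baseline is the model that more elaborate prediction attempts have consistently failed to improve on.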
Such examples notwithstanding, my current prior is that MIRI’s FAI research has positive expected value. I don’t think that the expected value of the research is zero or negative – only that it’s not competitive with the best of the other interventions on the table.
I really think the “You’re just as likely to get results in the opposite direction” argument is, on priors, overstated for most forms of research. Does Scott think that work we do today is just as likely to decrease our understanding of P/NP as increase it?
My own interpretation of Scott’s words here is that it’s unclear whether your research is actually helping in the “get Friendly AI before some idiot creates a powerful Unfriendly one” challenge. Fundamental progress in AI in general could just as easily benefit the fool trying to build an AGI without too much concern for Friendliness as it could benefit you. Thus, whether fundamental research helps avoid the UFAI catastrophe is unclear.
I’m not sure that interpretation works, given that he also wrote:
suppose we conclude — as many Singularitarians have — that the greatest problem facing humanity today is how to ensure that, when superhuman AIs are finally built, those AIs will be “friendly” to human concerns. The difficulty is: given our current ignorance about AI, how on earth should we act on that conclusion? Indeed, how could we have any confidence that whatever steps we did take wouldn’t backfire, and increase the probability of an unfriendly AI?
Since Scott was addressing steps taken to act on the conclusion that friendliness was supremely important, presumably he did not have in mind general AGI research.
Hm. I’m not sure if Scott Aaronson has any weird views on AI in particular, but if he’s basically mainstream-oriented we could potentially ask him to briefly skim the Tiling Agents paper and say if it’s roughly the sort of paper that it’s reasonable for an organization like MIRI to be working on if they want to get some work started on FAI.
Yes, I would welcome his perspective on this.
I feel again like you’re trying to interpret the paper according to a different purpose from what it has. Like, I suspect that if you described what you thought a promising AGI research agenda was supposed to deliver on what sort of timescale, I’d say, “This paper isn’t supposed to do that.”
I think I’ve understood your past comments on this point. My questions are about the implicit assumptions upon which the value of the research rests, rather than about what the research does or doesn’t succeed in arguing.
This part is clearer and I think I may have a better idea of where you’re coming from, i.e., you really do think the entire field of AI hasn’t come any closer to AGI, in which case it’s much less surprising that you don’t think the Tiling Agents paper is the very first paper ever to come closer to AGI. But this sounds like a conversation that someone else could have with you, because it’s not MIRI-specific or FAI-specific.
As I said in earlier comments, the case for the value of the research hinges on its potential relevance to AI safety, which in turn hinges on how good the model is for the sort of AI that will actually be built. Here I don’t mean “Is the model exactly right?” — I recognize that you’re not claiming it to be — the question is whether the model is in the right ballpark.
A case for the model being a good one requires pointing to a potentially promising AGI research program to which the model is relevant. This is the point that I feel hasn’t been addressed.
Some things that I see as analogous to the situation under discussion are:
A child psychology researcher who’s never interacted with children could write about good child-rearing practices without the research being at all relevant to how to raise children well.
An economist who hasn’t looked at real-world data about politics could study political dynamics using mathematical models without the research being at all relevant to politics in practice.
A philosopher who hasn’t studied math could write about the philosophy of math without the writing being relevant to math.
A therapist who’s never had experience with depression could give advice to a patient on overcoming depression without the advice being at all relevant to overcoming depression.
Similarly, somebody without knowledge of the type of AI that’s going to be built could research AI safety without the research being relevant to AI safety.
Does this help clarify where I’m coming from?
I also feel somewhat at a loss for where to proceed if I can’t say “But just look at the ideas behind Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, that’s obviously important conceptual progress because...” In other words, you see AI doing a bunch of things, we already mostly agree on what these sorts of surface real-world capabilities are, but after checking with some friends you’ve concluded that this doesn’t mean we’re less confused about AGI than we were in 1955. I don’t see how I can realistically address that except by persuading your authorities; I don’t see what kind of conversation we could have about that directly without being able to talk about specific AI things.
I’m open to learning object level material if I learn new information that convinces me that there’s a reasonable chance that MIRI’s FAI research is relevant to AI safety in practice.
Meanwhile, if you specify “I’m not convinced that MIRI’s paper has a good chance of being relevant to FAI, but only for the same reasons I’m not convinced any other AI work done in the last 60 years is relevant to FAI” then this will make it clear to everyone where you’re coming from on this issue.
Yes, this is where I’m coming from.