[deleted] comments on AI risk, new executive summary

[deleted] 21 Apr 2014 0:57 UTC
0 points

Sorry, didn’t mean to call you personally any of those adjectives :)

None taken then.

Pretty much, yes, I find it totally possible. I am not saying that I am confident that this is the case, just that I find it more likely than the alternative, which would require an additional reason why it isn’t so.

Well, tell me what you think of this argument:

Let’s divide the meta-language into two sets: P (the sentences that cannot be rendered in English) and Q (the sentences that can). If you expect Q to be empty, then let me know and we can talk about that case. But let’s assume for now that Q is not empty, since I assume we both think that an AGI will be able to handle human language quite easily. Q is, for all intents and purposes, a ‘human’ language itself.

Premise one is that translation is transitive: if I can translate language a into language b, and language b into language c, then I can translate language a into language c (maybe I need to use language b as an intermediate step, though).

Premise two: If I cannot translate a sentence in language a into an expression in language b, then there is no expression in language b that expresses the same thought as that sentence in language a.

Premise three: Any AGI would have to learn language originally from us, and thereafter either from us or from previous versions of itself.

So by stipulation, every sentence in Q can be rendered in English, and Q is non-empty. If any sentence in P cannot be rendered in English, then it follows from premise one that sentences in P cannot be rendered in sentences in Q (since then they could thereby be rendered into English). It also follows, if you accept premise two, that Q cannot express any sentence in P. So an AGI knowing only Q could never learn to express any sentence in P, since if it could, any speaker of Q (potentially any non-improved human) could in principle learn to express sentences in P (given an arbitrarily large amount of resources like time, questions and answers, etc.).

Hence, no AGI, beginning from a language like English, could go on to learn how to express any sentence in P. Therefore no AGI will ever know P.
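
The first inference here (from premise one) can be sketched as a toy model. This is only an illustration under stated assumptions: languages are treated as opaque labels, "can be translated into" is a set of directed pairs, and the helper name `transitive_closure` is hypothetical rather than anything from the discussion. It shows only that if P could be rendered in Q, and Q in English, then P could be rendered in English, which contradicts the definition of P.

```python
from itertools import product


def transitive_closure(direct):
    """Return every (source, target) pair reachable by chaining translations.

    `direct` is a set of (source, target) pairs, an assumed toy encoding of
    "sentences of `source` can be rendered in `target`".
    """
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        # Chain a -> b and b -> d into a -> d (premise one: transitivity).
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# By stipulation, every sentence in Q can be rendered in English.
stipulated = {("Q", "English")}
without_p_to_q = transitive_closure(stipulated)

# Suppose, for contradiction, that P could be rendered in Q.
with_p_to_q = transitive_closure(stipulated | {("P", "Q")})

# P -> English appears only under the supposition, so the supposition
# is incompatible with P being "not renderable in English".
print(("P", "English") in without_p_to_q)
print(("P", "English") in with_p_to_q)
```

The sketch says nothing about premises two or three, which carry the rest of the argument.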

I’m not super confident this argument is sound, but it seems to me to be at least plausible.

If you agree with Eliezer’s definition of intelligence as optimization power

Well, that’s a fine definition, but it’s tricky in this case. Because if intelligence is optimization power, and optimizing presupposes something to optimize, then intelligence (on that definition) isn’t strictly a factor in (ultimate) goal formation. If that’s right, then something’s being much more intelligent would (as I think someone else mentioned) just lead to very hard to understand instrumental goals. It would have no direct relationship with terminal goals.
- Armok_GoB 21 Apr 2014 14:15 UTC
  0 points
  Parent
  Premise one is false assuming finite memory.
  
  Premise 3 does not hold well either: many new words come from pointing out a pattern in the environment, not from defining in terms of previous words.
  - [deleted] 21 Apr 2014 14:29 UTC
    0 points
    Parent
    
    Premise one is false assuming finite memory.
    
    Well, maybe it’s not necessarily true assuming finite memory. Do you have reason to expect it to be false in the case we’re talking about?
    
    Many new words come from pointing out a pattern in the environment, not from defining in terms of previous words.
    
    I’m of course happy to grant that part of using a language involves developing neologisms. We do this all the time, of course, and generally we don’t think of it as departing from English. Do you think it’s possible to coin a neologism in a language like Q, such that the new term is in P (and inexpressible in any part of Q)? A user of this neologism would be unable to, say, taboo or explain what they mean by a term (even to themselves). How would the user distinguish their P-neologism from nonsense?
    - Armok_GoB 22 Apr 2014 2:04 UTC
      0 points
      Parent
      I expect the taboo/explanation to look like a list of 10^20 1000-hour-long clips of incomprehensible n-dimensional multimedia, each with a real number attached representing the amount of [untranslatable 92] it has, with a Jupiter brain being required to actually find any pattern.
      - [deleted] 22 Apr 2014 14:24 UTC
        0 points
        Parent
        
        I’m talking about the simplest possible in principle expression in the human language being that long and complex.
        
        Ah, I see. Even if that were a possibility, I’m not sure that would be such a problem. I’m happy to allow the AGI to spend a few centuries manipulating our culture, our literature, our public discourse etc. in the name of making its goals clear to us. Our understanding something doesn’t depend on us being able to understand a single complex expression of it, or to be able to produce such. It’s not like we all understood our own goals from day one either, and I’m not sure we totally understand them now. Terminal goals are basically pretty hard to understand, but I don’t see why we should expect the (terminal) goals of a super-intelligence to be harder.
        
        I expect it to be false in at least some cases talked about because it’s not 3 but 100 levels, and each one makes it 1000 times longer because complex explanations and examples are needed for almost every “word”.
        
        It may be that there’s a lot of inferential and semantic ground to cover. But again: practical problem. My point has been to show that we shouldn’t expect there to be a problem of in-principle untranslatability. I’m happy to admit there might be serious practical problems in translation. The question is now whether we should default to thinking ‘An AGI is going to solve those problems handily, given the resources it has for doing so’, or ‘An AGI’s thought is going to be so much more complex and sophisticated that it will be unable to solve the practical problem of communication’. I admit, I don’t have good ideas about how to come down on the issue. I was just trying to respond to Shmi’s point about untranslatable meta-languages.
        
        For my part, I don’t see any reason to expect the AGI’s terminal goals to be any more complex than ours, or any harder to communicate, so I see the practical problem as relatively trivial. Instrumental goals, forget about it. But terminal goals aren’t the sorts of things that seem to admit of very much complexity.
        Armok_GoB 22 Apr 2014 21:32 UTC
        0 points
        Parent
        
        For my part, I don’t see any reason to expect the AGI’s terminal goals to be any more complex than ours, or any harder to communicate, so I see the practical problem as relatively trivial. Instrumental goals, forget about it. But terminal goals aren’t the sorts of things that seem to admit of very much complexity.
        
        That the AI can have a simple goal is obvious; I never argued against that. The AI’s goal might be “maximize the amount of paperclips”, which is explained in that many words. I don’t expect the AI as a whole to have anything directly analogous to instrumental goals on the highest level either, so that’s a non-issue. I thought we were talking about the AI’s decision theory.
        
        On manipulating culture for centuries and solving it as a practical problem: or it could just install an implant or guide evolution to increase intelligence until we were smart enough. The implicit constraint of “translate” is that it’s to an already existing specific human, and they have to still be human at the end of the process. Not “could something that was once human come to understand it”.
        [deleted] 22 Apr 2014 21:44 UTC
        0 points
        Parent
        
        I thought we were talking about the AI’s decision theory.
        
        No, Shminux and I were talking about (I think) terminal goals: that is, we were talking about whether or not we could come to understand what an AGI was after, assuming it wanted us to know. We started talking about a specific part of this problem, namely translating concepts novel to the AGI’s outlook into our own language.
        
        I suppose my intuition, like yours, is that the AGI decision theory would be a much more serious problem, and not one subject to my linguistic argument. Since I expect we also agree that it’s the decision theory that’s really the core of the safety issue, my claim about terminal goals is not meant to undercut the concern for AGI safety. I agree that we could be radically ignorant about how safe an AGI is, even given a fairly clear understanding of its terminal goals.
        
        The implicit constraint of “translate” is that it’s to an already existing specific human, and they have to still be human at the end of the process.
        
        I’d actually like to remain indifferent to the question of how intelligent the end-user of the translation has to be. My concern was really just whether or not there are in principle any languages that are mutually untranslatable. I tried to argue that there may be, but they wouldn’t be mutually recognizable as languages anyway, and that if they are so recognizable, then they are at least partly inter-translatable, and that any two languages that are partly inter-translatable are in fact wholly inter-translatable. But this is a point about the nature of languages, not degrees of intelligence.
        TheAncientGeek 23 Apr 2014 21:54 UTC
        0 points
        Parent
        Human languages? Alien languages? Machine languages?
        [deleted] 24 Apr 2014 16:26 UTC
        0 points
        Parent
        I don’t think those distinctions really mean very much. Languages don’t come in types in any significant sense.
        TheAncientGeek 24 Apr 2014 16:59 UTC
        0 points
        Parent
        Yes they do. E.g. the Chomsky hierarchy, the agglutinative/synthetic/analytic distinction, etc.
        
        Also, we recognise maths as a language, but have no idea how to translate, as opposed to recode, English into it.
        Armok_GoB 23 Apr 2014 19:23 UTC
        0 points
        Parent
        So on one of the questions we actually agreed the whole time, and the other was just the semantics of “language” and “translate”. Oh well, discussion over.
        [deleted] 23 Apr 2014 21:32 UTC
        0 points
        Parent
        Ha! Well, I did argue that all languages (recognizable as such) were in principle inter-translatable for what could only be described as metaphysical reasons. I’d be surprised if you couldn’t find holes in an argument that ambitious and that unempirical. But it may be that some of the motivation is lost.
    - Armok_GoB 22 Apr 2014 1:58 UTC
      0 points
      Parent
      I expect it to be false in at least some cases talked about because it’s not 3 but 100 levels, and each one makes it 1000 times longer because complex explanations and examples are needed for almost every “word”.
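      
      Taking those figures at face value (100 levels and a factor of 1000 per level are rough estimates from the comment, not measured quantities), the overall blow-up can be worked out as a back-of-the-envelope calculation:
      
      ```latex
      \underbrace{1000 \times 1000 \times \cdots \times 1000}_{100\ \text{levels}}
        = 1000^{100} = \left(10^{3}\right)^{100} = 10^{300}
      ```
      
      So on these assumptions a single top-level expression would expand by a factor of about 10^300 by the time it is fully unpacked into the base language.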
- Shmi 21 Apr 2014 2:36 UTC
  0 points
  Parent
  
  So an AGI knowing only Q could never learn to express any sentence in P, since if it could, any speaker of Q (potentially any non-improved human) could in principle learn to express sentences in P (given an arbitrarily large amount of resources like time, questions and answers, etc.).
  
  Honestly, I expected you to do a bit more steelmanning with the examples I gave. Or maybe you have, and just didn’t post them here. Anyway, does the quote mean that any English sentence can be expressed in Chimp, since we evolved from a common ancestor? If you don’t claim that (I hope you don’t), then where did your logic stop applying to humans and chimps vs AGI and humans? Presumably it’s relying on Premise 3 that gets us a wrong conclusion in the English/Chimp example, since it is required to construct an unbroken chain of languages. What happened to humans over their evolution which made them create Q out of P where Q is not reducible to P? And if this is possible in the mindless evolutionary process, then would it not be even more likely during an intelligence explosion?
  
  If that’s right, then something’s being much more intelligent would (as I think someone else mentioned) just lead to very hard to understand instrumental goals. It would have no direct relationship with terminal goals.
  
  I don’t understand this point. I would expect the terminal goals to evolve as the evolving intelligence understands more and more about the world. For example, for many people here the original terminal goal was, ostensibly, “serve God”. Then they stopped believing and now their terminal goal is more like “do good”. Similarly, I would expect an evolving AGI to adjust its terminal goals as the ones it had before are obsoleted, not because they have been reached, but because they become meaningless.
  - [deleted] 21 Apr 2014 3:04 UTC
    0 points
    Parent
    
    Anyway, does the quote mean that any English sentence can be expressed in Chimp, since we evolved from a common ancestor?
    
    No, I said nothing about evolving from a common ancestor. The process of biological variation, selection, and retention of genes seems to me to be entirely irrelevant to this issue, since we don’t know languages in virtue of having specific sets of genes. We know languages by learning them from language-users. You might be referring to homo ancestors that developed language at some time in the past, and the history of linguistic development that led to modern languages. I think my argument does show (if it’s sound) that anything in our linguistic history that qualifies as a language is inter-translatable with a modern language (given arbitrary resources of time, interrogation, metaphor, neologism, etc.).
    
    It’s hard to say what qualifies as a language, but then it’s also hard to say when a child goes from being a non-language user to being a language user. It’s certainly after they learn their first word, but it’s not easy to say exactly when. But remember I’m arguing that we can always inter-translate two languages, not that we can somehow make the thoughts of a language user intelligible to a non-language user (without making them a language user). This is, incidentally, where I think your AGI:us::us:chimps analogy breaks down. I still see no reason to think it plausible. At any rate, I don’t need to draw a line between those homo that spoke languages and those that did not. I grant that the former could not be understood by the latter. I just don’t think the same goes for languages and ‘meta-languages’.
    
    I would expect the terminal goals to evolve as the evolving intelligence understands more and more about the world.
    
    Me too, but that would have nothing to do with intelligence on EY’s definition. If intelligence is optimizing power, then it can’t be used to reevaluate terminal goals. What would it optimize for? It can only be used to reevaluate instrumental goals so as to optimize for satisfying terminal goals. I don’t know how the hell we do reevaluate terminal goals anyway, but we do, so there you go.
    
    For example, for many people here the original terminal goal was, ostensibly, “serve God”. Then they stopped believing and now their terminal goal is more like “do good”.
    
    You might think they just mistook an instrumental goal (‘serve God’) for a terminal goal, when actually they wanted to ‘do good’ all along.
    - Shmi 21 Apr 2014 3:16 UTC
      0 points
      Parent
      
      At any rate, I don’t need to draw a line between those homo that spoke languages and those that did not. I grant that the former could not be understood by the latter. I just don’t think the same goes for languages and ‘meta-languages’.
      
      Ah. To me language is just a meta-grunt. That’s why I don’t think it’s different from the next level up. But I guess I don’t have any better arguments than those I have already made and they are clearly not convincing. So I will stop here.
      
      You might think they just mistook an instrumental goal (‘serve God’) for a terminal goal, when actually they wanted to ‘do good’ all along.
      
      Right, you might. Except they may not even have had the vocabulary to explain that underlying terminal goal. In this example my interpretation would be that their terminal goal evolved rather than was clarified. Again, I don’t have any better argument, so I will leave it at that.
      - [deleted] 21 Apr 2014 3:23 UTC
        0 points
        Parent
        
        language is just a meta-grunt.
        
        I see. If that is true, then I can’t dispute your point (for more than one reason).
- Jiro 21 Apr 2014 2:21 UTC
  0 points
  Parent
  By this reasoning no AGI beginning from English could ever know French either, for similar reasons. (Note that every language has sentences that cannot be rendered in another language, in the sense that someone who knows the truth value of the unrendered sentence can know the truth value of the rendered sentence; consider variations on Gödel-undecidable sentences.)
  - [deleted] 21 Apr 2014 2:39 UTC
    2 points
    Parent
    
    By this reasoning no AGI beginning from English could ever know French either, for similar reasons.
    
    This is true only if this...
    
    Note that every language has sentences that cannot be rendered in another language
    
    is true. But I don’t think it is. English and French, for instance, seem to me to be entirely inter-translatable. I don’t mean that we can assign, for every word in French, a word of equivalent meaning in English. But maybe it would be helpful if I made it more clear what I mean by ‘inter-translatable’. I think language L is inter-translatable with language M if for every sentence in language L, I can express the same thought using an arbitrarily complex expression in language M.
    
    By ‘arbitrarily complex’ I mean this: Say I have a sentence in L. In order to translate it into M, I am allowed to write in M an arbitrarily large number of sentences qualifying and triangulating the meaning of the sentence in L. I am allowed to write an arbitrarily large number of poems, novels, interpretive dances, etymological and linguistic papers, and encyclopedias discussing the meaning and spirit of that sentence in L. In other words, two languages are by my standard inter-translatable if for any expression in L of n bits, I can translate it into M in n’ bits, where n’ is allowed to be any positive number.
    
    I think, by this standard, French and English count as inter-translatable, as are any languages I can think of. I’m arguing, effectively, that for any language, either none of that language is inter-translatable with any language we know (in which case, I doubt we could recognize it as a language at all), or all of it is.
    
    Now, even if I have shown that we and an AGI will necessarily be able to understand each other entirely in principle, I certainly haven’t shown that it can be done in practice. However, I want to push the argument in the direction of a practical problem, just because in general, I think I can argue that AGI will be able to overcome practical problems of any reasonable difficulty.