I imagine so. One complication is that it can do more computation than you.
But once you let it do more computation, then it doesn’t have to know anything at all, right? Like, maybe the best go bot is, “Train an AlphaZero-like algorithm for a million years, and then use it to play.”
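(For concreteness, here is a rough sketch of the two-stage bot being described, with placeholder training logic throughout; the function names and details are illustrative inventions, not anything from the discussion. The split between the overall procedure and the trained model it produces is the distinction much of the rest of the thread turns on.)

```python
import random

# Schematic only: the "training" below is a placeholder random walk rather than
# a real AlphaZero-style update, and choose_move is an arbitrary stand-in policy.

def train(num_games):
    """The 'train an AlphaZero-like algorithm for a million years' phase."""
    model = [0.0] * 361                      # one value per point on a go board
    for _ in range(num_games):
        point = random.randrange(361)        # placeholder for a real self-play update
        model[point] += random.choice([-0.01, 0.01])
    return model

def choose_move(model, legal_moves):
    """Stand-in policy: pick the legal point the trained model values most."""
    return max(legal_moves, key=lambda point: model[point])

def best_go_bot(legal_moves):
    """The bot as a whole: run the long training phase, then use its product to play."""
    trained_model = train(num_games=10_000)  # stands in for "a million years"
    return choose_move(trained_model, legal_moves)

# Usage: ask the bot for a move when points 0, 3, and 72 happen to be legal.
print(best_go_bot([0, 3, 72]))
```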
I know more about go than that bot starts out knowing, but less than it will know after it does computation.
I wonder if, when you use the word “know”, you mean some kind of distilled, compressed, easily explained knowledge?
Perhaps the bot knows different things at different times and your job is to figure out (a) what it always knows and (b) a way to quickly find out everything it knows at a certain point in time.
I think at this point you’ve pushed the word “know” to a point where it’s not very well-defined; I’d encourage you to try to restate the original post while tabooing that word.
This seems particularly valuable because there are some versions of “know” for which the goal of knowing everything a complex model knows seems wildly unmanageable (for example, trying to convert a human athlete’s ingrained instincts into a set of propositions). So before people start trying to do what you suggested, it’d be good to explain why it’s actually a realistic target.
Hmmm. It does seem like I should probably rewrite this post. But to clarify things in the meantime:
- it’s not obvious to me that this is a realistic target, and I’d be surprised if it took fewer than 10 person-years to achieve.
- I do think the knowledge should ‘cover’ all the athlete’s ingrained instincts in your example, but I think the propositions are allowed to look like “it’s a good idea to do x in case y”.
Perhaps I should instead have said: it’d be good to explain to people why this might be a useful/realistic target. Because if you need propositions that cover all the instincts, then it seems like you’re basically asking for people to revive GOFAI (good old-fashioned AI).
(I’m being unusually critical of your post because it seems that a number of safety research agendas lately have become very reliant on highly optimistic expectations about progress on interpretability, so I want to make sure that people are forced to defend that assumption rather than starting an information cascade.)
OK, the parenthetical helped me understand where you’re coming from. I think a re-write of this post should (in part) make clear that I think a massive heroic effort would be necessary to make this happen, but sometimes massive heroic efforts work, and I have no special private info that makes it seem more plausible than it looks a priori.
Actually, hmm. My thoughts are not really in equilibrium here.
(Also: such a rewrite would be a combination of ‘what I really meant’ and ‘what the comments made me realize I should have really meant’.)
I would say that bot knows what the trained AlphaZero-like model knows.
Also it certainly knows the rules of go and the win condition.
As an additional reason for the importance of tabooing “know”, note that I disagree with all three of your claims about what the model “knows” in this comment and its parent.
(The definition of “know” I’m using is something like “knowing X means possessing a mental model which corresponds fairly well to reality, from which X can be fairly easily extracted”.)
In the parent, is your objection that the trained AlphaZero-like model plausibly knows nothing at all?
The trained AlphaZero model knows lots of things about Go, in a comparable way to how a dog knows lots of things about running.
But the algorithm that gives rise to that model can know arbitrarily few things. (After all, the laws of physics gave rise to us, but they know nothing at all.)
Ah, understood. I think this is basically covered by talking about what the go bot knows at various points in time, a la this comment—it seems pretty sensible to me to talk about knowledge as a property of the actual computation rather than the algorithm as a whole. But from your response there it seems that you think that this sense isn’t really well-defined.
I’m not sure what you mean by “actual computation rather than the algorithm as a whole”. I thought that I was talking about the knowledge of the trained model which actually does the “computation” of which move to play, and you were talking about the knowledge of the algorithm as a whole (i.e. the trained model plus the optimising bot).
On that definition, how does one train an AlphaZero-like algorithm without knowing the rules of the game and win condition?
The human knows the rules and the win condition. The optimisation algorithm doesn’t, for the same reason that evolution doesn’t “know” what dying is: neither are the types of entities to which you should ascribe knowledge.
Suppose you have a computer program that gets two neural networks, simulates a game of go between them, determines the winner, and uses the outcome to modify the neural networks. It seems to me that this program has a model of the ‘go world’, i.e. a simulator, and from that model you can fairly easily extract the rules and winning condition. Do you think that this is a model but not a mental model, or that it’s too exact to count as a model, or something else?
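(For concreteness, a toy, self-contained version of the kind of program described here, with a much simpler game standing in for go so it fits in a few lines; every name and detail is illustrative rather than taken from the thread. The structural point is that the simulator is where the rules and the win condition live.)

```python
import random

# Toy version of the program described above, with a trivial game standing in
# for go so the example stays short. The structure is the point: simulate_game
# encodes the rules and the win condition (the "go world"), and the outer loop
# uses each outcome to modify the two policies (stand-ins for neural networks).

def sample_move(policy):
    """Pick a move in 0..9 according to the policy's current preferences."""
    return random.choices(range(10), weights=policy)[0]

def simulate_game(policy_a, policy_b):
    """Rules: players alternate choosing a number for three turns each.
    Win condition: higher total wins (player A wins ties)."""
    totals, moves = [0, 0], ([], [])
    for turn in range(6):
        p = turn % 2
        m = sample_move(policy_a if p == 0 else policy_b)
        totals[p] += m
        moves[p].append(m)
    winner = 0 if totals[0] >= totals[1] else 1
    return winner, moves

def update(policy, own_moves, won):
    """Use the outcome to modify a policy: reinforce its moves if it won."""
    new = list(policy)
    for m in own_moves:
        new[m] *= 1.1 if won else 0.9
    return new

policy_a, policy_b = [1.0] * 10, [1.0] * 10
for _ in range(1000):
    winner, (moves_a, moves_b) = simulate_game(policy_a, policy_b)
    policy_a = update(policy_a, moves_a, winner == 0)
    policy_b = update(policy_b, moves_b, winner == 1)

# Everything about the rules and the win condition lives inside simulate_game;
# it can be read off that function without inspecting what the policies learned.
```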
I’d say that this is too simple and programmatic to be usefully described as a mental model. The amount of structure encoded in the computer program you describe is very small, compared with the amount of structure encoded in the neural networks themselves. (I agree that you can have arbitrarily simple models of very simple phenomena, but those aren’t the types of models I’m interested in here. I care about models which have some level of flexibility and generality, otherwise you can come up with dumb counterexamples like rocks “knowing” the laws of physics.)
As another analogy: would you say that the quicksort algorithm “knows” how to sort lists? I wouldn’t, because you can instead just say that the quicksort algorithm sorts lists, which conveys more information (because it avoids anthropomorphic implications). Similarly, the program you describe builds networks that are good at Go, and does so by making use of the rules of Go, but can’t do the sort of additional processing with respect to those rules which would make me want to talk about its knowledge of Go.