Like, you can’t make an “oracle chess AI” that tells you at the beginning of the game what moves to play, because even chess is too chaotic for that game tree to be feasibly representable. You’ve gotta keep running your chess AI on each new observation, to have any hope of getting the fragment of the game tree that you consider down to a manageable size.
It’s not obvious to me how generally true this is. You can’t literally specify every move at the beginning of the game, but it seems like there could be instructions that work for more specified chess tasks. Like, I imagine a talented human chess coach could generate a set of instructions in English that would work well for defeating me at chess at least once (maybe there already exist “how to beat noobs at chess” instructions that will work for this). I would be unsurprised if there exists a set of human-readable instructions of human-readable length that would give me better-than-even odds of defeating a pre-specified chess expert at least once, that can be generated just by noticing and exploiting as-yet-unnoticed regularities in either that expert’s play in particular or human-expert-level chess in general.
It’s possible my intuition here is related to my complete lack of expertise in chess, and I would not be surprised if Magnus-Carlsen-defeating instructions do not exist (at least, not without routing through a reasoner). Still, I think I assign greater credence to shallow-pattern-finding AI enabling a pivotal act than you do, and I’m wondering if the chess example is probing this difference in intuition.
As a casual chess player, it seems unlikely to me that there are any such instructions that would lead a beginner to beat even a halfway decent player. Chess is very dependent on calculation (mentally stepping through the game tree) and evaluation (recognising if a position is good or bad). Given the slow clock speed of the human brain (compared to computers), our calculations are slow, so we must lean heavily on a good learned evaluation function, which probably can’t be explicitly represented in a way that would be fast enough to execute manually. In other words, you’d end up taking hours to make a move or something.
There’s no shortcut like “just move these pawns 3 times in a mysterious pattern, they’ll never expect it”—“computer lines” that bamboozle humans require deep search that you won’t be able to do in realtime.
Edit: the Oracle’s best chance against an ok player would probably be to give you a list of trick openings that lead to “surprise” checkmate and hope that the opponent falls into one, but it’s a low percentage.
I’m not sure that this is true. (Depends a lot on what rating you define as “halfway decent”.) There are, in fact, rules that generalize over lots of board states, such as:
capture toward the center
don’t advance the pawns around your king
early on, focus on getting knight/bishop to squares from which they have many moves
etc.
If I had one day to make such a list, I don’t think a beginner could use it to beat a 1200 player in, say, a 30 minute game. But I’m very uncertain about the upper limit of usefulness of such a list. I wonder about stuff like that a lot, but it’s very hard to tell. (Have you read a book about chess principles?)
I’m not even confident that you couldn’t beat Magnus. It depends on a bunch of factors, but perhaps you could just choose a line that seems forcing for black and try to specify enough branches of the tree to give you > 50% chance that it covers the game with Magnus. You could call this cheating, but it’s unclear how to formalize the challenge to avoid it. If Magnus knows who he’s playing against, this would make it significantly harder.
I’m very confident that Magnus absolutely crushes a beginner who has been given a personal chess book, of normal book length, written by God. Magnus still has all the advantages.
Magnus can evaluate moves faster and has a deeper search tree.
The book of chess can provide optimal opening lines, but the beginner needs to memorize them, and Magnus has a greater capacity for memorizing openings.
The book of chess can provide optimal principles for evaluating moves, but the beginner has to apply them, and decide what to do when they point in different directions. This comes from practice. A book of normal size can only provide limited practice examples.
The beginner will have a higher rate of blunders. It is hard to “capture toward the center” when you don’t even see the capture.
Some intuitions from chess books: the book God would give to a beginner is different to the book God would give a 1200 player. After reading a chess book, it is normal for ability to initially go down, until the advice has been distilled and integrated with practice. Reading a chess book helps you improve faster, not be instantly better.
Some intuitions from chess programs: they lose a lot of power if you cut down their search time to simulate the ability of a beginner to calculate variations, and also cut down their opening database to simulate the ability of a beginner to memorize openings, and also give them a random error chance to simulate a beginner’s rate of blunders.
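For what it’s worth, that degradation is easy to try at home. Here’s a minimal sketch, assuming the python-chess library and a Stockfish binary on the PATH (both my assumptions, not anything from the thread): cap the search depth as a stand-in for a beginner’s calculation, never consult an opening book, and mix in random blunders.

```python
# Minimal sketch of a "beginner-handicapped" engine, assuming python-chess
# and a Stockfish binary on PATH. The handicap parameters are hypothetical.
import random

import chess
import chess.engine

BLUNDER_PROB = 0.15   # hypothetical blunder rate for a beginner
SHALLOW_DEPTH = 2     # hypothetical stand-in for a beginner's calculation depth

def beginner_move(engine: chess.engine.SimpleEngine, board: chess.Board) -> chess.Move:
    """Shallow search, no opening book consulted, occasional random blunder."""
    if random.random() < BLUNDER_PROB:
        return random.choice(list(board.legal_moves))  # the simulated blunder
    return engine.play(board, chess.engine.Limit(depth=SHALLOW_DEPTH)).move

if __name__ == "__main__":
    board = chess.Board()
    engine = chess.engine.SimpleEngine.popen_uci("stockfish")
    try:
        while not board.is_game_over():
            board.push(beginner_move(engine, board))  # both sides play as "beginners"
        print(board.result())
    finally:
        engine.quit()
```

Pitting this handicapped player against the unrestricted engine should show the large loss of playing strength the comment describes.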
Sorry for the double response, but a separate point here is that your method of estimating the effectiveness of the best possible book seems dubious to me. It seems to be “let’s take the best book we have; the perfect book won’t be that much better”. But why would this be true, at all? We have applied tons of optimization pressure to chess and probably know that the ceiling isn’t that far above Stockfish, but we haven’t applied tons of optimization pressure to distilling chess. How do you know that the best possible book won’t be superior by some large factor? Why can’t the principles be so simple that applying them is easy? (This is a more general question: how can you, e.g., estimate the effectiveness of the best possible textbook for some subfield of math?)
I’m a bit more sympathetic to this if we play Blitz, but for the most interesting argument, I think we should assume classical time format, where any beginner can see all possible captures.
Thanks for the double response. This line seems potentially important. If we could safely create an Oracle that can create a book of chess that massively boosts chess ability, then we could maybe possibly miraculously do the same thing to create a book that massively boosts AI safety research ability.
I agree that my argument above was pretty sketchy, just “intuitions” really. Here’s something a bit more solid, after further reflection.
I’m aware of adversarial examples and security vulnerabilities, so I’m not surprised if a superintelligence is able to severely degrade human performance via carefully selected input. A chess book that can make Magnus lose to a beginner wouldn’t surprise me. Neither would a chess book that degraded a beginner’s priorities such that they obsessed about chess, for however many Elo points that would be worth.
But mostly this problem is in the opposite direction: can we provide carefully curated input that allows an intelligence to learn much faster? In this direction the results seem much less dramatic. My impression is that the speed of learning is limited by both the inputs and the learner. If the book of chess is a perfect input, then the limiting factor is the reader, and an average reader won’t get outsized benefits from perfect inputs.
Possible counter-argument: supervised learning can outperform unsupervised learning by some large factor, data quality can likewise have a big impact. That’s fine, but every chess book I’ve read has been supervised learning, and chess books are already higher data quality than scraping r/chess. So those optimizations have already been made.
Possible counter-argument: few-shot learning in GPT-3? This seems more like surface knowledge that is already in the language model. So maybe a chess beginner already has the perfect chess algorithm somewhere in their brain, and the chess book just needs to surface that model and suppress all the flawed models that are competing with it? I don’t buy it, that’s not what it feels like learning chess from the inside, but maybe I need to give the idea some weight.
Possible counter-argument: maybe humans are actually really intelligent and really good learners and the reason we’re so flawed is that we have bad inputs? Eg from other flawed humans, random chance hiding things, biases in what we pay attention to, etc. I don’t buy this, but I don’t actually have a clear reason why.
But mostly this problem is in the opposite direction: can we provide carefully curated input that allows an intelligence to learn much faster? In this direction the results seem much less dramatic. My impression is that the speed of learning is limited by both the inputs and the learner. If the book of chess is a perfect input, then the limiting factor is the reader, and an average reader won’t get outsized benefits from perfect inputs.
Which results did you have in mind? The ‘machine teaching’ results are pretty dramatic and surprising, although one could question whether they have any practical implications.
I wasn’t aware of them. Thanks. Yes, that’s exactly the sort of thing I’d expect to see if there was a large possible upside in better teaching materials that an Oracle could produce. So I no longer disagree with Rafael & Richard on this.
But mostly this problem is in the opposite direction: can we provide carefully curated input that allows an intelligence to learn much faster? In this direction the results seem much less dramatic. My impression is that the speed of learning is limited by both the inputs and the learner. If the book of chess is a perfect input, then the limiting factor is the reader, and an average reader won’t get outsized benefits from perfect inputs.
My problem with this is that you’re treating the amount of material as fixed and abstracting it as “speed”; however, what makes me unsure about the power of the best possible book is that it may choose a completely different approach.
E.g., consider the “ontology” of high-level chess principles. We think in terms of “development” and “centralization [of pieces]” and “activity” and “pressure” and “attacking” and “discoveries” and so forth. Presumably, most of these are quite helpful; if you have no concept of discoveries, you will routinely place your queen or king on inconvenient squares and get punished. If you have no concept of pressure, you have no elegant way of pre-emptive reaction if your opponent starts aligning a lot of pieces toward your king, et cetera.
So, at the upper end of my probability distribution for how good a book would be, it may introduce a hundred more such concepts, each one highly useful to elegantly compress various states. It will explain them all in the maximally intuitive and illustrative way, such that they all effortlessly stick, in the same way that sometimes things you hear just make sense and fit your aesthetic, and you recall them effortlessly. After reading this book, a beginner will look at a bunch of moves of a 2000 Elo player and go “ah, these two moves clearly violate principle Y”. Even though this player has far less ability to calculate lines, they know so many elegant compressions that they may compensate in a direct match. Much in the same way that you may beat someone who has practiced twice as long as you but has no concept of pressure; they just can’t figure out how to spot situations from afar where their king is suddenly in trouble.
Isn’t it trivial for the beginner to beat Magnus using this book? God just needs to predict Magnus perfectly, and write down a single list of moves that the beginner needs to follow to beat him. Half a page is enough.
In general, you ignored this approach, which is the main reason why I’m unsure whether a book from a superintelligence could beat Magnus.
I read your idea of “a line that seems forcing for black”, and I interpreted it as being forcing for black in general, and responded in terms of memorizing optimal opening lines. It sounds like you meant a line that would cause Magnus in particular to respond in predictable ways? Sorry for missing that.
I can imagine a scenario with an uploaded beginner and an uploaded Magnus in a sealed virtual environment running on error-correcting hardware with a known initial state and a deterministic algorithm, and your argument goes through there, and in sufficiently similar scenarios.
Whereas I had in mind a much more chaotic scenario. For example, I expect Magnus’s moves to depend in part on the previous games he played, so predicting Magnus requires predicting all of those games, and thus the exponential tree of previous games. And I expect his moves to depend in part on his mood, eg how happy he’d be with a draw. So our disagreement could be mostly about the details of the hypothetical, such as how much time passes between creating the book and playing the game?
I read your idea of “a line that seems forcing for black”, and I interpreted it as being forcing for black in general
So to clarify: this interpretation was correct. I was assuming that a superintelligence cannot perfectly predict Magnus, pretty much for the reasons you mention (dependency on previous games, mood, etc.) But I then changed that standard when you said
I’m very confident that Magnus absolutely crushes a beginner who has been given a personal chess book, of normal book length, written by God.
Unlike a superintelligence, surely God could simulate Magnus perfectly no matter what; this is why I called the problem trivial—if you invoke God.
If you don’t invoke God (and thus can’t simulate Magnus), I remain unsure. There are already games where world champions play the top move recommended by the engine 10 times in a row, and those have not been optimized for forcing lines. You may overestimate how much uncertainty or variance there really is. (Though again, if Magnus knows what you’re doing, it gets much harder, since then he could just play a few deliberately bad moves and get you out of preparation.)
Yes, I used “God” to try to avoid ambiguity about (eg) how smart the superintelligence is, and ended up just introducing ambiguity about (eg) whether God plays dice. Oops. I think the God hypothetical ends up showing the usual thing: Oracles fail[1] at large/chaotic tasks, and succeed at small/narrow tasks. Sure, more things are small and narrow if you are God, but that’s not very illuminating.
So, back to an Oracle, not invoking God, writing a book of chess for a beginner, filling it with lines that are forcing for black, trying to get >50% of the tree. Why do we care, why are we discussing this? I think because chess is so much smaller and less chaotic than most domains we care about, so if an Oracle fails at chess, it’s probably going to also fail at AI alignment, theorem proving, pivotal acts, etc.
There are some simple failure cases we should get out of the way:
As you said, if Magnus knows or suspects what he’s playing against, he plays a few lower probability moves and gets out of the predicted tree. Eg, 1. e4 d6 is a 1% response from Magnus. Or, if Magnus thinks he’s playing a beginner, then he uses the opportunity to experiment, and becomes less predictable. So assume that he plays normally, predictably.
If Magnus keeps playing when he’s in a lost position, it’s really hard for a move to be “forced” if all moves lead to a loss with correct play. One chess principle I got from a book: don’t resign before the end game if you don’t know that your opponent can play the end game well. Well, assume that Magnus resigns a lost position.
What if the beginner misremembers something, and plays the wrong move? How many moves can a beginner remember, working from an Oracle-created book that has everything pre-staged with optimized mnemonics? I assume 1,000 moves, perfect recall: 10 moves per page for a 100-page book.
So we need to optimize for lines that are forcing, short, and winning[2]. Shortness is important because a 40-move line where each move is 98% forced is overall only ~45% forcing, and because we can fit more short lines into our beginner’s memory. If you search through all top-level chess games and find ones where the players play the engine-recommended move ten times in a row, that is optimizing for winning (from the players) and forcing (from the search). Ten moves isn’t long enough; we need ~30 moves for a typical game.
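As a quick worked check of the compounding and the memory budget (just the arithmetic from the paragraph above):

```python
# Why shortness matters: per-move forcing probabilities compound.
print(0.98 ** 40)   # ~0.446, so a 40-move line at 98% per move is ~45% forcing

# Memory budget from above: 1,000 memorized moves at ~30 moves per typical
# game gives roughly 33 full-length lines to try to cover >50% of the tree.
print(1000 // 30)   # 33
```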
Terrible estimate: with 500,000 games in chessgames.com, say there are 50 games with forcing lines of ten moves, a 10,000x reduction. An Oracle can search better, for games that haven’t been played yet. So maybe if Oracle searched through 5 trillion games it would find a game with a forcing line of 20 moves? At some point I question whether chess can be both low variance enough to have these long forcing lines, and also high variance enough to have so many potential games to search through. Of course chess has ample variance if you allow white to play bad moves, but then you’re not winning.
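Made explicit, with the independence assumption being mine rather than the comment’s:

```python
# Observed: ~50 of 500,000 games contain a ten-move fully forced stretch.
rate_10 = 50 / 500_000          # 1e-4

# If per-move forcing were independent, a twenty-move forced stretch would
# occur at roughly the square of that rate (an assumption, not a result):
rate_20 = rate_10 ** 2          # 1e-8
print(1 / rate_20)              # ~1e8 games to expect a single hit

# The comment's figure of 5 trillion games is more pessimistic still, which
# fits: forcing probabilities aren't independent, and the line must also be
# winning for white, not merely forced.
```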
Another approach, trying to find a forcing opening, running through the stats on chessgames.com in a greedy way, I get this “Nimzo-Indian, Sämisch” variation, which seems to be playable for both sides, but perhaps slightly favors black:
1. d4 Nf6 (73% forced—Magnus games)
2. c4 e6 (72% forced—Magnus games)
3. Nc3 Bb4 (83% forced—all games)
4. a3 Bxc3+ (100% forced—all games)
5. bxc3 c5 (55% forced—all games)
6. f3 d5 (85% forced—all games)
Multiplying that through gets 20% forcing over six moves. So maybe Oracle is amazingly lucky and there are hitherto undiscovered forcing lines directly from this well-known position to lost positions for black, missed by Stockfish, AlphaZero, and all humans. Well, then Oracle still needs to cover another 30% of the tree and get just as lucky a few more times. If that happens, I think I’m in crisis of faith mode where I have to reevaluate whether grandmaster chess was an elaborate hoax. So many positions we thought were even turn out to be winning for white, everyone missed it, what happened?
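Checking that multiplication on the percentages listed above:

```python
from math import prod

# Forcing probabilities for the six moves of the line above.
forcing = [0.73, 0.72, 0.83, 1.00, 0.55, 0.85]
print(prod(forcing))  # ~0.204: the six-move line is ~20% forcing overall
```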
[1] Where “fail” means “no plan found”, “memory and time exhausted”, “here’s a plan that involves running a reasoner in real-time” or “feed me observations in real-time and ask me only to generate a local and by-default-inscrutable action”, as listed by so8res above.
[2] It doesn’t help that chess players also search for lines that are forcing, short, and winning, at least some of the time.
You can consider me convinced that the “find forcing lines” approach isn’t going to work.
(How well the perfect book could “genuinely” teach someone is a different question, but that’s definitely not enough to beat Magnus.)
Yeah, this is part of what I was getting at. The narrowness of the task “write a set of instructions for a one-off victory against a particular player” is a crucial part of what makes it seem not-obviously-impossible to me. Fully simulating Magnus should be adequate, but then obviously you’re invoking a reasoner. What I’m uncertain about is if you can write such instructions without invoking a reasoner.
I agree that it’s plausible chess-plans can be compressed without invoking full reasoners (and with a more general point that there are degrees of compression you can do short of full-on ‘reasoner’, and with the more specific point that I was oversimplifying in my comment). My intent with my comment was to highlight how “but my AI only generates plans” is sorta orthogonal to the alignment question, which is pushed, in the oracle framework, over to “how did that plan get compressed, and what sort of cognition is involved in the plan, and why does running that cognition yield good outcomes”.
I have not yet found a pivotal act that seems to me to require only shallow realtime/reactive cognition, but I endorse the exercise of searching for highly specific and implausibly concrete pivotal acts with that property.