Martin Randall comments on What’s Up With Confusingly Pervasive Goal Directedness?

Martin Randall 18 Feb 2022 3:14 UTC
1 point
0
Thanks for the double response. This line seems potentially important. If we could safely create an Oracle that can create a book of chess that massively boosts chess ability, then we could maybe possibly miraculously do the same thing to create a book that massively boosts AI safety research ability.

I agree that my argument above was pretty sketchy, just “intuitions” really. Here’s something a bit more solid, after further reflection.

I’m aware of adversarial examples and security vulnerabilities, so I’m not surprised if a superintelligence is able to severely degrade human performance via carefully selected input. A chess book that can make Magnus lose to a beginner wouldn’t surprise me. Neither would a chess book that degraded a beginner’s priorities such that they obsessed about chess, for however many Elo points that would be worth.

But mostly this problem is in the opposite direction: can we provide carefully curated input that allows an intelligence to learn much faster? In this direction the results seem much less dramatic. My impression is that the speed of learning is limited by both the inputs and the learner. If the book of chess is a perfect input, then the limiting factor is the reader, and an average reader won’t get outsized benefits from perfect inputs.

Possible counter-argument: supervised learning can outperform unsupervised learning by some large factor, and data quality can likewise have a big impact. That’s fine, but every chess book I’ve read has been supervised learning, and chess books are already higher-quality data than a scrape of r/chess. So those optimizations have already been made.

Possible counter-argument: few-shot learning in GPT-3? This seems more like surfacing knowledge that is already in the language model. So maybe a chess beginner already has the perfect chess algorithm somewhere in their brain, and the chess book just needs to surface that model and suppress all the flawed models that are competing with it? I don’t buy it; that’s not what learning chess feels like from the inside, but maybe I need to give the idea some weight.

Possible counter-argument: maybe humans are actually really intelligent and really good learners, and the reason we’re so flawed is that we have bad inputs? E.g., from other flawed humans, random chance hiding things, biases in what we pay attention to, etc. I don’t buy this, but I don’t actually have a clear reason why.
- gwern 20 Feb 2022 15:50 UTC
  7 points
  0
  Parent
  
  “But mostly this problem is in the opposite direction: can we provide carefully curated input that allows an intelligence to learn much faster? In this direction the results seem much less dramatic. My impression is that the speed of learning is limited by both the inputs and the learner. If the book of chess is a perfect input, then the limiting factor is the reader, and an average reader won’t get outsized benefits from perfect inputs.”
  
  Which results did you have in mind? The ‘machine teaching’ results are pretty dramatic and surprising, although one could question whether they have any practical implications.
  - Martin Randall 13 Mar 2022 14:43 UTC
    1 point
    0
    Parent
    I wasn’t aware of them. Thanks. Yes, that’s exactly the sort of thing I’d expect to see if there was a large possible upside in better teaching materials that an Oracle could produce. So I no longer disagree with Rafael & Richard on this.
- Rafael Harth 19 Feb 2022 14:27 UTC
  2 points
  0
  Parent
  
  “But mostly this problem is in the opposite direction: can we provide carefully curated input that allows an intelligence to learn much faster? In this direction the results seem much less dramatic. My impression is that the speed of learning is limited by both the inputs and the learner. If the book of chess is a perfect input, then the limiting factor is the reader, and an average reader won’t get outsized benefits from perfect inputs.”
  
  My problem with this is that you’re treating the amount of material as fixed and abstracting it as “speed”; however, what makes me unsure about the power of the best possible book is that it may choose a completely different approach.
  
  E.g., consider the “ontology” of high-level chess principles. We think in terms of “development” and “centralization [of pieces]” and “activity” and “pressure” and “attacking” and “discoveries” and so forth. Presumably, most of these are quite helpful; if you have no concept of discoveries, you will routinely place your queen or king on inconvenient squares and get punished. If you have no concept of pressure, you have no elegant way to react pre-emptively when your opponent starts aligning a lot of pieces toward your king, et cetera.
  
  So, at the upper end of my probability distribution for how good a book would be, it may introduce a hundred more such concepts, each one highly useful to elegantly compress various states. It will explain them all in the maximally intuitive and illustrative way, such that they all effortlessly stick, in the same way that sometimes things you hear just make sense and fit your aesthetic, and you recall them effortlessly. After reading this book, a beginner will look at a bunch of moves of a 2000 Elo player, and go “ah, these two moves clearly violate principle Y”. Even though the beginner has far less ability to calculate lines, they know so many elegant compressions that they may compensate in a direct match. Much in the same way that you may beat someone who has practiced twice as long as you but has no concept of pressure; they just can’t figure out how to spot situations from afar where their king is suddenly in trouble.