jsteinhardt comments on What’s Up With Confusingly Pervasive Goal Directedness?

jsteinhardt Jan 23, 2022, 8:29 PM
4 points
I feel like you are arguing for a very strong claim here, which is that “as soon as you have an efficient way of determining whether a problem is solved, and any way of generating a correct solution some very small fraction of the time, you can just build an efficient solution that solves it all of the time”
Hm, this isn’t the claim I intended to make. Both because it overemphasizes on “efficient” and because it adds a lot of “for all” statements.
If I were trying to state my claim more clearly, it would be something like “generically, for the large majority of problems of the sort you would come across in ML, once you can distinguish good answers you can find good answers (modulo some amount of engineering work), because non-convex optimization generally works and there are a large number of techniques for solving the sparse rewards problem, which are also getting better over time”.
- habryka Jan 23, 2022, 11:21 PM
  3 points
  Parent
  If I were trying to state my claim more clearly, it would be something like “generically, for the large majority of problems of the sort you would come across in ML, once you can distinguish good answers you can find good answers (modulo some amount of engineering work), because non-convex optimization generally works and there are a large number of techniques for solving the sparse rewards problem, which are also getting better over time”.
  I am a bit confused by what we mean by “of the sort you would come across in ML”. Like, this situation, where we are trying to derive an algorithm that solves problems without optimizers, from an algorithm that solves problems with optimizers, is that “the sort of problem you would come across in ML”?. It feels pretty different to me from most usual ML problems.
  I also feel like in ML it’s quite hard to actually do this in practice. Like, it’s very easy to tell whether a self-driving car AI has an accident, but not very easy to actually get it to not have any accidents. It’s very easy to tell whether an AI can produce a Harry Potter-level quality novel, but not very easy to get it to produce one. It’s very easy to tell if an AI has successfully hacked some computer system, but very hard to get it to actually do so. I feel like the vast majority of real-world problems we want to solve do not currently follow the rule of “if you can distinguish good answers you can find good answers”. Of course, success in ML has been for the few subproblems where this turned out to be easy, but clearly our prior should be on this not working out, given the vast majority of problems where this turned out to be hard.
  (Also, to be clear, I think you are making a good point here, and I am pretty genuinely confused for which kind of problems the thing you are saying does turn out to be true, and appreciate your thoughts here)