Continuing the metaphor, what the authors are describing looks somewhat like stochastic gradient descent (which, in the maze analogy, would be the real way you minimize the distance to the finish).
Or A*, which is a much more computationally efficient and deterministic way to minimize the distance to the finish of the maze, if you have an appropriate heuristic. I don’t have an argument for it, but I feel like finding a good heuristic and leveraging it probably works very well as a generalizable strategy.
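To make the A* point concrete, here is a minimal sketch on a toy grid maze. Everything in it is my own assumption for illustration, not something from the authors: the grid encoding (`#` for walls), the function name `astar`, unit step costs, and Manhattan distance as the heuristic (which never overestimates on a 4-connected grid, so the result should be a shortest path length).

```python
import heapq


def astar(grid, start, goal):
    """Return the length of a shortest path from start to goal, or None.

    grid is a list of equal-length strings; '#' marks a wall (assumed encoding).
    start and goal are (row, col) tuples.
    """
    def heuristic(cell):
        # Manhattan distance: a lower bound on remaining steps on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best_cost = {start: 0}
    # Priority queue of (estimated total cost, cost so far, cell).
    frontier = [(heuristic(start), 0, start)]

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None  # goal is unreachable


MAZE = [
    "S.#.",
    ".#..",
    "...#",
    "#..G",
]

print(astar(MAZE, (0, 0), (3, 3)))
```

The heuristic is what steers the search toward the finish; swap in `lambda cell: 0` and you get plain uniform-cost search, which finds the same answer but typically expands more cells.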