Rohin Shah comments on Will humans build goal-directed agents?

Rohin Shah 5 Jan 2019 21:02 UTC
4 points
Yes, as long as you keep doing the MCTS + training. The value/policy networks by themselves are not goal-directed.
- Daniel Kokotajlo 7 Jan 2019 21:17 UTC
  3 points
  Parent
  I get why the MCTS is important, but what about the training? It seems to me that if we stop training AlphaGo (Zero) and I play a game against it, it’s goal-directed even though we have stopped training it.
  - Rohin Shah 8 Jan 2019 16:38 UTC
    2 points
    Parent
    Yeah, I agree that even without the training it would be goal-directed; that comes from the MCTS.
    Note, though, that if we stop training and also stop using MCTS, and you play a game against the networks alone, they will still beat you, and yet I would say that they are not goal-directed.
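
One way to picture the distinction drawn in this thread is a toy sketch, not AlphaGo itself. Everything here is an assumption made for illustration: the game (take 1 to 3 stones, taking the last stone wins), the name `raw_policy_move` as a stand-in for a trained policy network used without search, and `search_move` as a stand-in for MCTS, here replaced by plain exhaustive lookahead. Both choose the same moves, but only the second selects a move by evaluating the outcomes it leads to.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # toy game: take 1-3 stones; whoever takes the last stone wins


def raw_policy_move(stones: int) -> int:
    """Hypothetical stand-in for a policy network used without search.

    A fixed state -> move mapping: it plays well, but it never
    represents or compares future outcomes when it acts.
    """
    best = stones % 4
    return best if best in MOVES else 1


@lru_cache(maxsize=None)
def wins(stones: int) -> bool:
    """True if the player to move can force a win (exhaustive lookahead)."""
    return any(m <= stones and not wins(stones - m) for m in MOVES)


def search_move(stones: int) -> int:
    """Hypothetical stand-in for MCTS (plain lookahead, not Monte Carlo).

    Picks a move because of where it leads: it looks for a move
    that leaves the opponent in a losing position. Assumes stones >= 1.
    """
    for m in MOVES:
        if m <= stones and not wins(stones - m):
            return m
    return 1  # every move loses against perfect play; take one stone


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, raw_policy_move(n), search_move(n))
```

The two functions agree on every position, which mirrors the point that strong play alone does not settle the question; on the view expressed above, the difference lies in whether the move is produced by explicitly searching over outcomes.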