AGZ is only trained on the situations that actually arise in games it plays.
I agree with the point that “imitation learning from human games” will only make you play well on kinds of situations that arise in human games, and that self-play can do better by making you play well on a broader set of situations. You could also train on all the situations that arise in a bigger tree search (though AGZ did not) or against somewhat-random moves (which AGZ probably did).
(Though I don’t see this as affecting the basic point.)
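The distribution point can be illustrated with a toy sketch (this is not AGZ itself; the environment, policies, and numbers here are all hypothetical, chosen only to show how self-play with some randomness covers states that imitation data never contains):

```python
import random

random.seed(0)

# Toy "game": states are the integers 0..9; a policy picks the next state.
ALL_STATES = list(range(10))

def human_policy(state):
    # Hypothetical stand-in for human games: human play only ever
    # visits a narrow band of "reasonable" states near the center.
    return random.choice([s for s in ALL_STATES if abs(s - 5) <= 2])

def selfplay_policy(state):
    # Self-play with somewhat-random moves wanders over a broader
    # set of states than the human data contains.
    return random.choice(ALL_STATES)

def collect_states(policy, n_games=200, game_len=5):
    # Gather every state that actually arises in the games played --
    # this set is the only thing the learner is trained on.
    visited = set()
    for _ in range(n_games):
        state = 5
        for _ in range(game_len):
            state = policy(state)
            visited.add(state)
    return visited

human_states = collect_states(human_policy)      # narrow training set
selfplay_states = collect_states(selfplay_policy)  # broader training set
print(sorted(human_states))
print(sorted(selfplay_states))
```

Imitation learning on the first set leaves the learner untrained on the states only the second set reaches; training on situations from a bigger tree search would broaden coverage in the same way.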