Richard Korzekwa comments on AlphaStar: Impressive for RL progress, not for AGI progress

Richard Korzekwa 15 Nov 2019 14:14 UTC
1 point
I’m not sure how surprised to be about the middle of training versus the final RL policy. Are you saying that this sort of mistake should be learned quickly in RL?
- paulfchristiano 15 Nov 2019 17:12 UTC
  2 points
  Parent
  I don’t have a big difference in my model of mid vs. final: they have very similar MMR, and the difference between them is pretty small in the scheme of things (e.g. probably smaller than the impact of doubling model size). My picture isn’t refined enough to appreciate those differences. For any particular dumb mistake, I’d be surprised if the line between not making it and making it was in that particular doubling.