The first thing you mention does not learn to play Atari, and is in general trained quite differently from Atari-playing AIs (it relies on self-play to more or less automatically generate a curriculum of harder and harder tasks, at least for some of the more competitive tasks in XLand).
Sorry, was being kinda lazy and hoping someone had already thought about this.
This was the newer DeepMind one:
https://www.lesswrong.com/posts/mTGrrX8SZJ2tQDuqz/deepmind-generally-capable-agents-emerge-from-open-ended?commentId=bosARaWtGfR836shY#bosARaWtGfR836shY
I was motivated to post by this algorithm from China I heard about today:
https://www.facebook.com/nellwatson/posts/10159870157893559
I think this is the older DeepMind paper:
https://deepmind.com/research/publications/2019/playing-atari-deep-reinforcement-learning