The first thing you mention does not learn to play Atari, and is in general trained quite differently from Atari-playing AIs (it relies on self-play to kind of automatically generate a curriculum of harder and harder tasks, at least for some of the more competitive tasks in XLand).
Sorry, was being kinda lazy and hoping someone had already thought about this.
This was the newer DeepMind one:
https://www.lesswrong.com/posts/mTGrrX8SZJ2tQDuqz/deepmind-generally-capable-agents-emerge-from-open-ended?commentId=bosARaWtGfR836shY#bosARaWtGfR836shY
I was motivated to post by this algorithm from China I heard about today:
https://www.facebook.com/nellwatson/posts/10159870157893559
I think this is the older DeepMind paper:
https://deepmind.com/research/publications/2019/playing-atari-deep-reinforcement-learning