It looks like transformers started to eat reinforcement learning:
[Trajectory Transformer] Offline Reinforcement Learning as One Big Sequence Modeling Problem
Michael Janner, Qiyang Li, Sergey Levine
https://arxiv.org/abs/2106.02039
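To make the "one big sequence" framing concrete, here is a minimal sketch of the idea (not the authors' implementation): each state, action, and reward dimension is uniformly discretized into bins, the tokens are flattened in time order, and a vanilla causal transformer is trained on them with a next-token loss. The bin ranges, model sizes, and toy data below are assumptions for illustration only.

```python
# Sketch of offline RL as sequence modeling (Trajectory Transformer idea).
# All hyperparameters and the toy dataset are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB = 100    # discretization bins per dimension (assumed value)
EMBED = 128
CONTEXT = 64   # max tokens of context

class TrajectoryLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, EMBED)
        self.pos = nn.Embedding(CONTEXT, EMBED)
        layer = nn.TransformerEncoderLayer(EMBED, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(EMBED, VOCAB)

    def forward(self, tokens):  # tokens: (batch, time) integer bin indices
        B, T = tokens.shape
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        causal = torch.triu(  # standard additive causal mask
            torch.full((T, T), float("-inf"), device=tokens.device), diagonal=1)
        return self.head(self.backbone(x, mask=causal))  # (B, T, VOCAB) logits

def discretize(x, low=-1.0, high=1.0, bins=VOCAB):
    """Map continuous values to integer bins, one token per dimension."""
    return ((x - low) / (high - low) * (bins - 1)).long().clamp(0, bins - 1)

# Toy offline dataset: (state, action, reward) per step, flattened in
# temporal order into the "one big sequence" of the title.
states  = torch.rand(8, 10, 3) * 2 - 1   # (batch, steps, state_dim)
actions = torch.rand(8, 10, 1) * 2 - 1
rewards = torch.rand(8, 10, 1) * 2 - 1
tokens = torch.cat([discretize(states), discretize(actions),
                    discretize(rewards)], dim=-1).flatten(1)  # (8, 50)

model = TrajectoryLM()
logits = model(tokens[:, :-1])           # predict each next token
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB),
                                   tokens[:, 1:].reshape(-1))
loss.backward()                          # train exactly like a language model
```

In the paper, control then comes from decoding: plans are produced by beam search over this sequence model rather than from a separately learned policy, which is what makes the language-modeling analogy more than cosmetic.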
Karpathy had a Twitter thread about it: “The ongoing consolidation in AI is incredible. Thread: When I started ~decade ago vision, speech, natural language, reinforcement learning, etc. were completely separate; You couldn’t read papers across areas—the approaches were completely different, often not even ML based.”...
Yesterday: “Offline Pre-trained Multi-Agent Decision Transformer (MADT): One Big Sequence Model Conquers All StarCraft II Tasks”, Meng et al 2021.