I’m not familiar with LeCun’s ideas, but I don’t think the idea of combining an actor, a critic, and a world model is new in this paper. Actor-critic architectures have been common in RL for a long time, including in PPO, OpenAI’s old favorite. Model-based RL has been around for years as well, so plenty of projects have probably used an actor, a critic, and a world model together.
Even though the core idea isn’t novel, the fact that this paper gets good results might indicate that model-based RL is making more progress than expected. So if LeCun predicted that the future would look more like model-based RL, maybe he gets points for that.