RL algorithms don’t minimize costs; they maximize expected reward, which may well be unbounded, so it’s wrong to say that the ML field only minimizes cost.
Yann LeCun’s proposals are based on cost-minimization.
Do you expect LeCun to have been assuming that the entire field of RL would stop existing in order to focus on his specific vision?
I’m not sure he has coherent expectations, but I’d expect his vibe is some combination of “RL doesn’t currently work” and “fields generally implement safety standards”.