This seems akin to asking, ‘does RL scale like every other area of DL so far?’, and the answer is more or less yes: https://www.reddit.com/r/mlscaling/search?q=flair%3ARL&restrict_sr=on&include_over_18=on and https://gwern.net/doc/reinforcement-learning/scaling/index