The important thing is that both do active learning, decision-making, and search, i.e. RL.*
LLMs don’t do that. So the gain from doing that is huge.
"Synthetic data" is a bit of a weird term that gets thrown around a lot. There are fundamental limits on how much information resampling from the same data source can yield about completely different domains, so that seems a bit silly. Of course, sometimes by synthetic data people just mean doing rollouts, i.e. RL.
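One way to make that limit concrete is the data processing inequality. As a rough sketch, assume (this is illustrative notation, not something defined above) that T is whatever you want to learn about the new domain, D is the original data, and S is synthetic data sampled from a model fit only on D, so that T, D, S form a Markov chain:

```latex
T \to D \to S \quad\Longrightarrow\quad I(T; S) \le I(T; D)
```

Under that assumption, S cannot carry more information about T than D already did. Rollouts are a different case, because they query the environment and so S no longer depends on T only through D.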
*The term RL is sometimes taken to mean only a very specific reinforcement learning algorithm. Here I mean a very general class of algorithms that solve MDPs.