Veedrac comments on GPTs are Predictors, not Imitators

Veedrac 10 Apr 2023 0:04 UTC
1 point
0
They typically are uniform, but I think this is not the most useful place to argue minutiae, unless you have a cruxy point underneath that I’m not spotting. “The training process for LLMs can optimize for distributional correctness at the expense of sample plausibility, and is functionally different from processes like GANs in this regard” is a clarification with empirically relevant stakes, but I don’t know what the stakes are for this digression.
- David Johnston 10 Apr 2023 0:45 UTC
  2 points
  1
  I was just trying to clarify the limits of autoregressive vs other learning methods. Autoregressive learning is at an apparent disadvantage if $P(X_t \mid X_{t-1})$ is hard to compute and the reverse, $P(X_{t-1} \mid X_t)$, is easy and low entropy. It can “make up for this” somewhat if it can do a good job of predicting $X_t$ from $X_{t-2}$, but it’s still at a disadvantage if, for example, that’s relatively high entropy compared to predicting $X_{t-1}$ from $X_t$. That’s it, I’m satisfied.
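
To make the asymmetry concrete, here is a toy sketch in Python. It is not from the thread: the prime `P`, the base `G`, and the function names are all illustrative assumptions. The sequence emits `x_prev = G**x_next mod P` first and `x_next` second, so going from `x_next` back to `x_prev` takes one modular exponentiation, while predicting `x_next` from `x_prev` means solving a discrete log, done here by brute force. Both directions are deterministic in this toy, so it only shows the compute gap, not the entropy comparison.

```python
# Toy illustration (assumed setup, not from the thread).
P = 101  # small prime, chosen only so the search finishes instantly
G = 2    # a primitive root mod 101, so x -> G**x % P is a bijection on 0..99


def easy_reverse(x_next: int) -> int:
    """Compute x_prev from x_next: a single modular exponentiation."""
    return pow(G, x_next, P)


def hard_forward(x_prev: int) -> tuple[int, int]:
    """Recover x_next from x_prev by exhaustive search.

    Returns (x_next, number of candidates tried).
    """
    for trials, candidate in enumerate(range(P - 1), start=1):
        if pow(G, candidate, P) == x_prev:
            return candidate, trials
    raise ValueError("x_prev is not a power of G")


x_next = 57
x_prev = easy_reverse(x_next)             # cheap direction
recovered, trials = hard_forward(x_prev)  # expensive direction
print(f"x_prev={x_prev}, recovered x_next={recovered} after {trials} trials")
```

With a cryptographically sized prime the reverse direction stays cheap, while the exhaustive search becomes infeasible, which is the sense in which an autoregressive predictor is asked to do the harder job.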