The standard method for training LLMs is next-token prediction with teacher forcing, trained under the log loss (negative log-likelihood). This is exactly the right setup to elicit calibrated conditional probabilities, and exactly the “prequential problem” that Solomonoff induction was designed for. I don’t think this was motivated by decision theory, but it definitely makes perfect sense as an approximation to Bayesian inductive inference—the only missing ingredient is acting to optimize a utility function based on this belief distribution. So I think it’s too early to suppose that decision theory won’t play a role.
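For concreteness, here is a minimal sketch of that setup, assuming a PyTorch-style autoregressive `model` that maps token ids to per-position logits (the `model` here is a hypothetical stand-in, not any particular architecture). The point is just that the loss is the negative log-probability assigned to each observed next token given the true prefix, which is a proper scoring rule and hence minimized by the true conditional distribution:

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens):
    """tokens: (batch, seq_len) integer tensor of token ids."""
    # Teacher forcing: the model always conditions on the *true* prefix,
    # and is scored on the token that actually came next.
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # (batch, seq_len - 1, vocab_size)
    # Cross-entropy = negative log-likelihood of the observed next tokens,
    # i.e. the sequential (prequential) log loss summed over positions.
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))
```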