Owain_Evans comments on Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data

Owain_Evans 23 Jun 2024 20:05 UTC
LW: 5 AF: 4
I agree that there are ways to explain the results, and these points from Steven and Thane make sense. I will note that the models are significantly more reliable at learning in-distribution (i.e. predicting the training set) than they are at generalizing to the evaluations that involve verbalizing the latent state (and answering downstream questions about it). So it’s not the case that learning to predict the training set (or inputs very similar to training inputs) automatically results in generalization to the verbalized evaluations. We do see improvement in reliability with GPT-4 over GPT-3.5, but we don’t have enough information to draw any firm conclusions about scaling.