It’s worth noting that Jesse is mostly following the traditional “approximation, generalization, optimization” error decomposition from learning theory here—where “generalization” specifically refers to finite-sample generalization (gap between train/test loss), rather than something like OOD generalization. So e.g. a failure of transformers to solve recursive problems would be a failure of approximation, rather than a failure of generalization. Unless I misunderstood you?
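For concreteness, here is one standard way of writing that decomposition (my notation, not necessarily the exact formulation Jesse uses): let $R$ be the population risk, $R^*$ the Bayes risk, $h^*_{\mathcal{H}} = \arg\min_{h \in \mathcal{H}} R(h)$ the best hypothesis in the class, $\hat{h}_S$ the empirical risk minimizer on the sample $S$, and $\tilde{h}$ the hypothesis the optimizer actually returns. Then

$$R(\tilde{h}) - R^* = \underbrace{R(\tilde{h}) - R(\hat{h}_S)}_{\text{optimization}} + \underbrace{R(\hat{h}_S) - R(h^*_{\mathcal{H}})}_{\text{generalization (estimation)}} + \underbrace{R(h^*_{\mathcal{H}}) - R^*}_{\text{approximation}}.$$

The middle term is the finite-sample piece that the train/test gap controls, which is why a transformer that simply cannot express the recursive solution shows up in the approximation term rather than the generalization term.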
Ok, I understand now. You haven’t misunderstood me. I’m not sure what to do with my comment above now.
Thanks for raising that, it’s a good point. I’d appreciate it if you also cross-posted this to the approximation post here.
I’ll cross-post it soon.
I actually did it: https://www.lesswrong.com/posts/gq9GR6duzcuxyxZtD/?commentId=feuGTuRRAi6r6DRRK