Joseph Bloom comments on Interpreting Preference Models w/ Sparse Autoencoders

Joseph Bloom 2 Jul 2024 7:54 UTC
3 points
0
7B parameter PM
@Logan Riggs this link doesn’t work for me.
- Logan Riggs 2 Jul 2024 15:31 UTC
  2 points
  0
  Fixed! Thanks :)