I think @RogerDearnaley means Sparse Autoencoders (SAEs), see for example these papers and the SAE tag on LessWrong.
As I mentioned in my other comment, SAEs finds features that correspond to abstract features of words and text. That’s not the same as finding features that correspond to reality.
I think @RogerDearnaley means Sparse Autoencoders (SAEs), see for example these papers and the SAE tag on LessWrong.
As I mentioned in my other comment, SAEs finds features that correspond to abstract features of words and text. That’s not the same as finding features that correspond to reality.