Activation engineering (as unsupervised interp)
Much of this is now supervised, and [Roger questions how much value the unsupervised part brings](https://www.lesswrong.com/posts/bWxNPMy5MhPnQTzKz/what-discovering-latent-knowledge-did-and-did-not-find-4). So it might make sense to merge this with model edits in the next one.