Sounds like a cool project, I’m looking forward to seeing the results!
https://www.lesswrong.com/posts/pvbWz5mEFCS6DFaWD/progress-report-1-interpretability-experiments-and-learning
https://www.lesswrong.com/posts/nxLHgG5SCKCxM9oJX/progress-report-2
Now that I’ve received a grant from the Long Term Future Fund and quit my job to do interpretability research full time, I’m actually making progress on some of my ideas!