
StefanHex

Karma: 1,660

Stefan Heimersheim. Research Scientist at Apollo Research, Mechanistic Interpretability. The opinions expressed here are my own and do not necessarily reflect the views of my employer.

Try training token-level probes

StefanHex, Apr 14, 2025, 11:56 AM
46 points
4 comments, 8 min read, LW link

Proof-of-Concept Debugger for a Small LLM

Mar 17, 2025, 10:27 PM
27 points
0 comments, 11 min read, LW link