Lennart Buerger

Karma: 25

I studied Physics in Heidelberg and Oxford and am now doing research on AI Alignment and LLM Interpretability, currently as part of my Master’s thesis in Fred Hamprecht’s SciAI Lab (Heidelberg, Germany). If you want to discuss something, have questions or would like to collaborate, feel free to drop me a message!

Truth is Universal: Robust Detection of Lies in LLMs

Lennart Buerger, 19 Jul 2024 14:07 UTC
24 points
3 comments, 2 min read, LW link
(arxiv.org)