RSS

Drake Thomas

Karma: 672

Interested in math puzzles, fermi estimation, strange facts about the world, toy models of weird scenarios, unusual social technologies, and deep dives into the details of random phenomena.

Working on the pretraining team at Anthropic as of October 2024; before that I did independent alignment research of various flavors and worked in quantitative finance.

Au­dit­ing lan­guage mod­els for hid­den objectives

Mar 13, 2025, 7:18 PM
138 points
15 comments13 min readLW link