
Miles Turpin

Karma: 260

Research scientist at Scale AI on the SEAL team (safety)

Do models say what they learn?

Mar 22, 2025, 3:19 PM
113 points
9 comments, 13 min read, LW link

Reward hacking behavior can generalize across tasks

May 28, 2024, 4:33 PM
79 points
5 comments, 21 min read, LW link

Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought

Miles Turpin, Mar 11, 2024, 11:46 PM
16 points
0 comments, 1 min read, LW link
(arxiv.org)

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”

Miles Turpin, Oct 3, 2023, 2:22 AM
31 points
0 comments, 9 min read, LW link

Unfaithful Explanations in Chain-of-Thought Prompting

Miles Turpin, Jun 3, 2023, 12:22 AM
42 points
8 comments, 7 min read, LW link