
Evan R. Murphy

Karma: 1,142

I’m doing research and other work focused on AI safety/security, governance, and risk reduction. Currently my top projects are (last updated Feb 26, 2025):

My general areas of interest are AI safety strategy, comparative AI alignment research, prioritizing technical alignment work, analyzing the published alignment plans of major AI labs, interpretability, the Conditioning Predictive Models agenda, deconfusion research, and other AI safety-related topics. My work is currently self-funded.

Research that I’ve authored or co-authored:

Other recent work:

Before getting into AI safety, I was a software engineer for 11 years at Google and various startups. You can find details about my previous work on my LinkedIn.

I’m always happy to connect with other researchers or people interested in AI alignment and effective altruism. Feel free to send me a private message!

Evan R. Murphy’s Shortform

Evan R. Murphy · Feb 28, 2025, 12:56 AM
6 points
2 comments · 1 min read · LW link

Steven Pinker on ChatGPT and AGI (Feb 2023)

Evan R. Murphy · Mar 5, 2023, 9:34 PM
11 points
8 comments · 1 min read · LW link
(news.harvard.edu)

Steering Behaviour: Testing for (Non-)Myopia in Language Models

Dec 5, 2022, 8:28 PM
40 points
19 comments · 10 min read · LW link

Paper: Large Language Models Can Self-improve [Linkpost]

Evan R. Murphy · Oct 2, 2022, 1:29 AM
52 points
15 comments · 1 min read · LW link
(openreview.net)

Google AI integrates PaLM with robotics: SayCan update [Linkpost]

Evan R. Murphy · Aug 24, 2022, 8:54 PM
25 points
0 comments · 1 min read · LW link
(sites.research.google)

Surprised by ELK report’s counterexample to Debate, IDA

Evan R. Murphy · Aug 4, 2022, 2:12 AM
18 points
0 comments · 5 min read · LW link

New US Senate Bill on X-Risk Mitigation [Linkpost]

Evan R. Murphy · Jul 4, 2022, 1:25 AM
35 points
12 comments · 1 min read · LW link
(www.hsgac.senate.gov)

Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios

Evan R. Murphy · May 12, 2022, 8:01 PM
58 points
0 comments · 59 min read · LW link

Introduction to the sequence: Interpretability Research for the Most Important Century

Evan R. Murphy · May 12, 2022, 7:59 PM
16 points
0 comments · 8 min read · LW link

[Question] What is a training “step” vs. “episode” in machine learning?

Evan R. Murphy · Apr 28, 2022, 9:53 PM
10 points
4 comments · 1 min read · LW link

Action: Help expand funding for AI Safety by coordinating on NSF response

Evan R. Murphy · Jan 19, 2022, 10:47 PM
23 points
8 comments · 3 min read · LW link

Promising posts on AF that have fallen through the cracks

Evan R. Murphy · Jan 4, 2022, 3:39 PM
34 points
6 comments · 2 min read · LW link