gasteigerjo

Karma: 219

Working on Alignment Science at Anthropic

Paper Highlights, March ’25

gasteigerjo · Apr 7, 2025, 8:17 PM
8 points
0 comments · 9 min read · LW link
(aisafetyfrontier.substack.com)

Automated Researchers Can Subtly Sandbag

Mar 26, 2025, 7:13 PM
41 points
0 comments · 4 min read · LW link
(alignment.anthropic.com)

AI Safety at the Frontier: Paper Highlights, February ’25

gasteigerjo · Mar 3, 2025, 10:09 PM
7 points
0 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

AI Safety at the Frontier: Paper Highlights, January ’25

gasteigerjo · Feb 11, 2025, 4:14 PM
7 points
0 comments · 8 min read · LW link
(aisafetyfrontier.substack.com)

AI Safety at the Frontier: Paper Highlights, December ’24

gasteigerjo · Jan 11, 2025, 10:54 PM
7 points
2 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

Paper Highlights, November ’24

gasteigerjo · Dec 7, 2024, 7:15 PM
7 points
0 comments · 8 min read · LW link
(aisafetyfrontier.substack.com)

AI Safety at the Frontier: Paper Highlights, October ’24

gasteigerjo · Oct 31, 2024, 12:09 AM
3 points
0 comments · 9 min read · LW link
(aisafetyfrontier.substack.com)

AI Safety at the Frontier: Paper Highlights, September ’24

gasteigerjo · Oct 2, 2024, 9:49 AM
13 points
0 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

AI Safety at the Frontier: Paper Highlights, August ’24

gasteigerjo · Sep 3, 2024, 7:17 PM
28 points
0 comments · 6 min read · LW link
(aisafetyfrontier.substack.com)

AI Safety at the Frontier: Paper Highlights, July ’24

gasteigerjo · Aug 5, 2024, 1:00 PM
8 points
0 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

Discussion: Challenges with Unsupervised LLM Knowledge Discovery

Dec 18, 2023, 11:58 AM
147 points
21 comments · 10 min read · LW link