HoldenKarnofsky

Karma: 7,105

Sabotage Evaluations for Frontier Models

Oct 18, 2024, 10:33 PM
94 points
56 comments · 6 min read · LW link
(assets.anthropic.com)

Case studies on social-welfare-based standards in various industries

HoldenKarnofsky · Jun 20, 2024, 1:33 PM
42 points
0 comments · 1 min read · LW link

Good job opportunities for helping with the most important century

HoldenKarnofsky · Jan 18, 2024, 5:30 PM
36 points
0 comments · 4 min read · LW link
(www.cold-takes.com)

We’re Not Ready: thoughts on “pausing” and responsible scaling policies

HoldenKarnofsky · Oct 27, 2023, 3:19 PM
200 points
33 comments · 8 min read · LW link

3 levels of threat obfuscation

HoldenKarnofsky · Aug 2, 2023, 2:58 PM
69 points
14 comments · 7 min read · LW link