Jeffrey Ladish

Karma: 1,981

Bounty for Evidence on Some of Palisade Research’s Beliefs

23 Sep 2024 20:01 UTC
46 points
4 comments · 2 min read

Take SCIFs, it’s dangerous to go alone

1 May 2024 8:02 UTC
42 points
1 comment · 3 min read

Palisade is hiring Research Engineers

11 Nov 2023 3:09 UTC
23 points
0 comments · 3 min read

unRLHF—Efficiently undoing LLM safeguards

12 Oct 2023 19:58 UTC
117 points
15 comments · 20 min read

LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B

12 Oct 2023 19:58 UTC
151 points
29 comments · 14 min read

The Agency Overhang

Jeffrey Ladish · 21 Apr 2023 7:47 UTC
85 points
6 comments · 6 min read

Donation offsets for ChatGPT Plus subscriptions

Jeffrey Ladish · 16 Mar 2023 23:29 UTC
53 points
3 comments · 3 min read

To determine alignment difficulty, we need to know the absolute difficulty of alignment generalization

Jeffrey Ladish · 14 Mar 2023 3:52 UTC
12 points
3 comments · 2 min read

Thoughts on the OpenAI alignment plan: will AI research assistants be net-positive for AI existential risk?

Jeffrey Ladish · 10 Mar 2023 8:21 UTC
58 points
3 comments · 9 min read

AGI systems & humans will both need to solve the alignment problem

Jeffrey Ladish · 24 Feb 2023 3:29 UTC
59 points
14 comments · 4 min read

When you plan according to your AI timelines, should you put more weight on the median future, or the median future | eventual AI alignment success? ⚖️

Jeffrey Ladish · 5 Jan 2023 1:21 UTC
25 points
10 comments · 2 min read

Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments

Jeffrey Ladish · 11 Jul 2022 19:38 UTC
98 points
27 comments · 6 min read · 1 review

My vision of a good future, part I

Jeffrey Ladish · 6 Jul 2022 1:23 UTC
66 points
18 comments · 9 min read

Information security considerations for AI and the long term future

2 May 2022 20:54 UTC
76 points
6 comments · 10 min read

Don’t die with dignity; instead play to your outs

Jeffrey Ladish · 6 Apr 2022 7:53 UTC
280 points
60 comments · 5 min read

EA Hangout Prisoners’ Dilemma

Jeffrey Ladish · 27 Sep 2021 23:15 UTC
55 points
18 comments · 3 min read

Comment on the lab leak hypothesis

Jeffrey Ladish · 11 Jun 2021 22:49 UTC
63 points
14 comments · 4 min read

Nuclear war is unlikely to cause human extinction

Jeffrey Ladish · 7 Nov 2020 5:42 UTC
134 points
48 comments · 11 min read · 3 reviews

Was SARS-CoV-2 actually present in March 2019 wastewater samples?

Jeffrey Ladish · 7 Jul 2020 23:08 UTC
4 points
1 comment · 2 min read

landfish lab

Jeffrey Ladish · 20 Feb 2020 0:20 UTC
5 points
20 comments · 1 min read