Jeffrey Ladish
Karma: 1,980
Bounty for Evidence on Some of Palisade Research’s Beliefs
benwr and Jeffrey Ladish
Sep 23, 2024, 8:01 PM · 46 points · 4 comments · 2 min read

Take SCIFs, it’s dangerous to go alone
latterframe, Jeffrey Ladish and schroederdewitt
May 1, 2024, 8:02 AM · 43 points · 1 comment · 3 min read

Palisade is hiring Research Engineers
Charlie Rogers-Smith and Jeffrey Ladish
Nov 11, 2023, 3:09 AM · 23 points · 0 comments · 3 min read

unRLHF—Efficiently undoing LLM safeguards
Pranav Gade, Jeffrey Ladish and Simon Lermen
Oct 12, 2023, 7:58 PM · 117 points · 15 comments · 20 min read

LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B
Simon Lermen and Jeffrey Ladish
Oct 12, 2023, 7:58 PM · 151 points · 29 comments · 14 min read

The Agency Overhang
Jeffrey Ladish
Apr 21, 2023, 7:47 AM · 85 points · 6 comments · 6 min read

Donation offsets for ChatGPT Plus subscriptions
Jeffrey Ladish
Mar 16, 2023, 11:29 PM · 53 points · 3 comments · 3 min read

To determine alignment difficulty, we need to know the absolute difficulty of alignment generalization
Jeffrey Ladish
Mar 14, 2023, 3:52 AM · 12 points · 3 comments · 2 min read

Thoughts on the OpenAI alignment plan: will AI research assistants be net-positive for AI existential risk?
Jeffrey Ladish
Mar 10, 2023, 8:21 AM · 58 points · 3 comments · 9 min read

AGI systems & humans will both need to solve the alignment problem
Jeffrey Ladish
Feb 24, 2023, 3:29 AM · 59 points · 14 comments · 4 min read

When you plan according to your AI timelines, should you put more weight on the median future, or the median future | eventual AI alignment success? ⚖️
Jeffrey Ladish
Jan 5, 2023, 1:21 AM · 25 points · 10 comments · 2 min read

Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments
Jeffrey Ladish
Jul 11, 2022, 7:38 PM · 98 points · 27 comments · 6 min read · 1 review

My vision of a good future, part I
Jeffrey Ladish
Jul 6, 2022, 1:23 AM · 66 points · 18 comments · 9 min read

Information security considerations for AI and the long term future
Jeffrey Ladish and lennart
May 2, 2022, 8:54 PM · 76 points · 6 comments · 10 min read

Don’t die with dignity; instead play to your outs
Jeffrey Ladish
Apr 6, 2022, 7:53 AM · 281 points · 60 comments · 5 min read

EA Hangout Prisoners’ Dilemma
Jeffrey Ladish
Sep 27, 2021, 11:15 PM · 55 points · 18 comments · 3 min read

Comment on the lab leak hypothesis
Jeffrey Ladish
Jun 11, 2021, 10:49 PM · 63 points · 14 comments · 4 min read

Nuclear war is unlikely to cause human extinction
Jeffrey Ladish
Nov 7, 2020, 5:42 AM · 131 points · 48 comments · 11 min read · 3 reviews

Was SARS-CoV-2 actually present in March 2019 wastewater samples?
Jeffrey Ladish
Jul 7, 2020, 11:08 PM · 4 points · 1 comment · 2 min read

landfish lab
Jeffrey Ladish
Feb 20, 2020, 12:20 AM · 5 points · 20 comments