charlie_griffin
Karma: 333
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
Alex Mallen, charlie_griffin and Buck · Mar 24, 2025, 5:55 PM · 34 points · 0 comments · 8 min read
LASR Labs Spring 2025 applications are open!
Erin Robertson, charlie_griffin, joehardie and Justin Olive · Oct 4, 2024, 1:44 PM · 38 points · 0 comments · 4 min read
Games for AI Control
charlie_griffin and Buck · Jul 11, 2024, 6:40 PM · 45 points · 0 comments · 5 min read
Apply to LASR Labs: a London-based technical AI safety research programme
Erin Robertson, charlie_griffin and joehardie · Apr 9, 2024, 5:34 PM · 45 points · 1 comment · 3 min read
Scenario Forecasting Workshop: Materials and Learnings
elifland and charlie_griffin · Mar 8, 2024, 2:30 AM · 50 points · 3 comments · 2 min read
Five projects from AI Safety Hub Labs 2023
charlie_griffin · Nov 8, 2023, 7:19 PM · 47 points · 1 comment · 6 min read
(www.aisafetyhub.org)
Goodhart’s Law in Reinforcement Learning
jacek, Joar Skalse, OliverHayman, charlie_griffin and Xingjian Bai · Oct 16, 2023, 12:54 AM · 126 points · 22 comments · 7 min read