Kshitij Sachan
Karma: 343
Redwood Research
AI Control: Improving Safety Despite Intentional Subversion
Buck, Fabien Roger, ryan_greenblatt and Kshitij Sachan
Dec 13, 2023, 3:51 PM · 236 points · 24 comments · 10 min read · 4 reviews
LLMs are (mostly) not helped by filler tokens
Kshitij Sachan
Aug 10, 2023, 12:48 AM · 66 points · 35 comments · 6 min read
Polysemanticity and Capacity in Neural Networks
Buck, Adam Jermyn and Kshitij Sachan
Oct 7, 2022, 5:51 PM · 87 points · 14 comments · 3 min read