Jacob_Hilton
Karma: 1,510
Jacob_Hilton’s Shortform
Jacob_Hilton | May 1, 2025, 12:58 AM | 6 points | 1 comment

A bird’s eye view of ARC’s research (www.alignment.org)
Jacob_Hilton | Oct 23, 2024, 3:50 PM | 119 points | 12 comments | 7 min read

Backdoors as an analogy for deceptive alignment (www.alignment.org)
Jacob_Hilton and Mark Xu | Sep 6, 2024, 3:30 PM | 104 points | 2 comments | 8 min read

Formal verification, heuristic explanations and surprise accounting (www.alignment.org)
Jacob_Hilton | Jun 25, 2024, 3:40 PM | 156 points | 11 comments | 9 min read

ARC is hiring theoretical researchers (www.alignment.org)
paulfchristiano, Jacob_Hilton and Mark Xu | Jun 12, 2023, 6:50 PM | 126 points | 12 comments | 4 min read

The effect of horizon length on scaling laws (arxiv.org)
Jacob_Hilton | Feb 1, 2023, 3:59 AM | 23 points | 2 comments | 1 min read

Scaling Laws for Reward Model Overoptimization (arxiv.org)
leogao, John Schulman and Jacob_Hilton | Oct 20, 2022, 12:20 AM | 103 points | 13 comments | 1 min read

Common misconceptions about OpenAI
Jacob_Hilton | Aug 25, 2022, 2:02 PM | 238 points | 154 comments | 5 min read | 1 review

How much alignment data will we need in the long run?
Jacob_Hilton | Aug 10, 2022, 9:39 PM | 37 points | 15 comments | 4 min read

Deep learning curriculum for large language model alignment (github.com)
Jacob_Hilton | Jul 13, 2022, 9:58 PM | 57 points | 3 comments | 1 min read

Procedurally evaluating factual accuracy: a request for research
Jacob_Hilton | Mar 30, 2022, 4:37 PM | 25 points | 2 comments | 6 min read

Truthful LMs as a warm-up for aligned AGI
Jacob_Hilton | Jan 17, 2022, 4:49 PM | 65 points | 14 comments | 13 min read

Stationary algorithmic probability (www.jacobh.co.uk)
Jacob_Hilton | Apr 29, 2017, 5:23 PM | 3 points | 7 comments | 1 min read