“We also are using it to assist humans in evaluating AI outputs, starting the second phase in our alignment strategy.”
Probably something along the lines of RLAIF (reinforcement learning from AI feedback)? Anthropic’s Claude might be more robustly tuned because of it, though GPT-4 might already use something similar in its own training.