As far as why Anthropic should probably race, here’s @joshc’s take on it, using the fictional company Magma as an example:
https://www.lesswrong.com/posts/8vgi3fBWPFDLBBcAx/planning-for-extreme-ai-risks#5__Heuristic__1__Scale_aggressively_until_meaningful_AI_software_R_D_acceleration
The other winning pathways I can list are:
1. Unlearning becomes more effective, such that AI control strategies become much easier to apply.
2. We are truly in an alignment-is-easy world, where giving the AI data mostly straightforwardly changes its values.
3. We somehow muddle through, with an outcome that none of us expected.