Alignment, conflict, powerseeking
Is powerseeking instrumentally convergent?
- Having power is instrumentally useful for most things.
- Gaining power is a means to having power.
- Seeking power is a means to gaining power.
- But seeking power is also a means to producing conflict.
- Avoiding costly conflict is instrumentally useful for most things.
This cashes out to a few simple tradeoffs.
- Who has power right now matters. Are they aligned on outcome preferences?
- The risk of costly conflict matters. How much is at stake? What are the odds of things getting destroyed by conflict? Escalation and de-escalation are important factors here.
An overly simple visualisation[1] (with a code sketch below):
- preferences are direction vectors
- power balance produces magnitudes
- conflict spends some overall magnitude but changes power balance
- alignment is literal vector alignment
Blue and red are preferences. Arrow magnitudes show that blue has the balance of power. Blue dashed shows the projection of blue onto red's direction. Left: red has a lot to lose from conflict; blue is basically aligned. Right: red has little to lose from conflict; blue is very misaligned.
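Here's a minimal code sketch of that toy model, in the spirit of the figure. To be clear, the class, the function names, and all the specific numbers (powers, win probability, conflict cost) are illustrative choices of mine, not the footnoted maths.

```python
# Toy model only: names and numbers below are illustrative assumptions.
import numpy as np

def unit(v):
    """Normalise a preference direction."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

class Agent:
    def __init__(self, preference, power):
        self.direction = unit(preference)  # what the agent wants (a direction)
        self.power = power                 # how hard it can push (a magnitude)

def alignment(a, b):
    """Literal vector alignment: cosine similarity of preference directions."""
    return float(np.dot(a.direction, b.direction))

def go_along_value(agent, holder):
    """Value to `agent` of the status quo: its own push, plus the power-holder's
    push projected onto `agent`'s preferred direction."""
    return agent.power + holder.power * alignment(agent, holder)

def conflict_value(agent, holder, win_prob, cost):
    """Conflict spends `cost` of the overall magnitude but may change the power
    balance: whoever wins spends what's left on their own direction."""
    remaining = (1 - cost) * (agent.power + holder.power)
    return remaining * (win_prob + (1 - win_prob) * alignment(agent, holder))

# Left panel: blue holds the balance of power and is basically aligned with red,
# so red has a lot to lose from conflict.
red  = Agent(preference=[1.0, 0.0], power=1.0)
blue = Agent(preference=[0.8, 0.6], power=2.0)
print(go_along_value(red, blue))                          # ~2.6: red gets most of what it wants
print(conflict_value(red, blue, win_prob=0.5, cost=0.3))  # ~1.9: fighting burns too much

# Right panel: blue is very misaligned, so red has little to lose from conflict.
blue2 = Agent(preference=[-0.6, 0.8], power=2.0)
print(go_along_value(red, blue2))                          # ~-0.2: the status quo works against red
print(conflict_value(red, blue2, win_prob=0.5, cost=0.3))  # ~0.4: conflict now looks better
```

The tradeoffs above fall straight out of this: the more aligned the power-holder, and the more that conflict destroys, the more the weaker party stands to lose by fighting.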
Usually when we talk about instrumental convergence we're interested in the forward direction: what will X do, given that we know some limited amount about X? But in light of the above tradeoffs, we can also use it for the inverse problem of inference: given what X does, what can we infer about X (its preferences and beliefs)?
Of course, once it's known that behaviour can be used for this inverse inference, deception arises as an obvious strategy. (This is where surprise coups come from[2].)
When we observe entities seeking power, or gauge their willingness to escalate, we can learn something about (a toy version is sketched after this list):
- their estimate of the risk and cost of conflict/escalation
- their estimate of their alignment with the existing power balance
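Sticking with the same toy model (and the same caveat that the function name and parameters here, including the fixed 50% win probability and the power numbers, are illustrative assumptions of mine), we can ask the inverse question directly: for which combinations of perceived alignment and perceived conflict cost would escalation beat going along?

```python
# Inverse read of the toy model above; all parameters are illustrative.
def escalation_beats_going_along(align, cost, win_prob=0.5, my_power=1.0, holder_power=2.0):
    going_along = my_power + holder_power * align        # status-quo payoff
    remaining = (1 - cost) * (my_power + holder_power)   # magnitude left after conflict
    conflict = remaining * (win_prob + (1 - win_prob) * align)
    return conflict > going_along

for align in (0.0, 0.25, 0.5, 0.75, 1.0):
    viable = [c / 100 for c in range(0, 100, 5) if escalation_beats_going_along(align, c / 100)]
    top = f"{max(viable):.2f}" if viable else "never"
    print(f"perceived alignment {align:.2f}: escalation pays at conflict costs up to {top}")
```

Within this toy, observing someone willing to escalate is evidence that they estimate low alignment with the existing power balance, or cheap conflict, or both, which is exactly the pair of bullets above.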
Caveat: this post is secretly more about humans than AI and I want to pre-emptively clarify that I’m not trying to imply AI won’t understand or exploit the instrumental usefulness of gaining power.
1. This really is simplistic, but actually not a terrible fit for some more legit maths. ↩︎
2. Deceptively pretending to be misaligned should be systematically rarer: if aligned with the powers-that-be, who are you aiming to deceive? The 4-d chess version of this is seeking power when actually aligned, in order to signal misalignment for some reason, perhaps to gain support from some third/fourth party. ↩︎
A salient piece of news which prompted me to write this up was Sama vs the OpenAI board. I've kept this note out of the post itself because it's much less timeless.

It looks like Sam's subsequent willingness to escalate and do battle with the board is strong evidence that he considers his goals meaningfully misaligned with theirs, and their willingness to fire him suggests they feel the same way. It's unclear what this rests on: whether it's the genuine level of care about safety, or more about intermediate things like the tradeoffs around productisation and publicity, or other stuff.