ektimo comments on Individually incentivized safe Pareto improvements in open-source bargaining

ektimo Jul 18, 2024, 5:45 PM
1 point
0
A key question is how prosaic AI systems can be designed to satisfy the conditions under which the PMM is guaranteed (e.g., via implementing surrogate goals)

Is something like surrogate goals needed, such that the agent would need to maintain a substituted goal, for this to work? (I don’t currently fully understand the proposal but my sense was the goal of renegotiation programs is to not require this?)
- Anthony DiGiovanni Jul 19, 2024, 8:36 AM
  3 points
  1
  Parent
  Sorry this was unclear — surrogate goals indeed aren’t required to implement renegotiation. Renegotiation can be done just in the bargaining context without changing one’s goals generally (which might introduce unwanted side effects). We just meant to say that surrogate goals might be one way for an agent to self-modify so as to guarantee the PMM for themselves (from the perspective of the agent before they had the surrogate goal), without needing to implement a renegotiation program per se.
  I think renegotiation programs help provide a proof of concept for a rigorous argument that, given certain capabilities and beliefs, EU maximizers are incentivized ex ante to avoid the worst conflict. But I expect you’d be able to make an analogous argument, with different assumptions, that surrogate goals are an individually incentivized unilateral SPI.^[1]
  1. ^
    Though note that even though SPIs implemented with renegotiation programs are bilateral, our result is that each agent individually prefers to use a (PMP-extension) renegotiation program. Analogous to how “cooperate iff your source code == mine” only works bilaterally, but doesn’t require coordination. So it’s not clear that they require much stronger conditions in practice than surrogate goals.

Keyboard shortcuts

Keys shown in yellow (e.g., ]) are accesskeys, and require a browser-specific modifier key (or keys).

Keys shown in grey (e.g., ?) do not require any modifier keys.

General
? Show keyboard shortcuts
Esc Hide keyboard shortcuts

Site navigation
h Go to Home (a.k.a. “Frontpage”) view
f Go to Featured (a.k.a. “Curated”) view
a Go to All (a.k.a. “Community”) view
m Go to Meta view
v Go to Tags view
c Go to Recent Comments view
r Go to Archive view
q Go to Sequences view
t Go to About page
u Go to User or Login page
o Go to Inbox page

Page navigation
, Jump up to top of page
. Jump down to bottom of page
/ Jump to top of comments section
s Search

Page actions
n New post or comment
e Edit current post

Post/comment list views
. Focus next entry in list
, Focus previous entry in list
; Cycle between links in focused entry
Enter Go to currently focused entry
Esc Unfocus currently focused entry
] Go to next page
[ Go to previous page
\ Go to first page
e Edit currently focused post

Editor
k Bold text
i Italic text
l Insert hyperlink
q Blockquote text

Appearance
= Increase text size
- Decrease text size
0 Reset to default text size
′ Cycle through content width settings
1 Switch to default theme [A]
2 Switch to dark theme [B]
3 Switch to grey theme [C]
4 Switch to ultramodern theme [D]
5 Switch to simple theme [E]
6 Switch to brutalist theme [F]
7 Switch to ReadTheSequences theme [G]
8 Switch to classic Less Wrong theme [H]
9 Switch to modern Less Wrong theme [I]
; Open theme tweaker
Enter Save changes and close theme tweaker
Esc Close theme tweaker (without saving)

Slide shows
l Start/resume slideshow
Esc Exit slideshow
→↓ Next slide
←↑ Previous slide
Space Reset slide zoom

Miscellaneous
x Switch to next view on user page
z Switch to previous view on user page
` Toggle compact comment list view
g Toggle anti-kibitzer