Values handshakes

Tag. Last edit: Apr 6, 2021, 4:05 PM by Yoav Ravid

Values handshakes are a proposed form of trade between superintelligences. From "The Hour I First Believed" by Scott Alexander:

Suppose that humans make an AI which wants to convert the universe into paperclips. And suppose that aliens in the Andromeda Galaxy make an AI which wants to convert the universe into thumbtacks.

When they meet in the middle, they might be tempted to fight for the fate of the galaxy. But this has many disadvantages. First, there’s the usual risk of losing and being wiped out completely. Second, there’s the usual deadweight loss of war, devoting resources to military buildup instead of paperclip production or whatever. Third, there’s the risk of a Pyrrhic victory that leaves you weakened and easy prey for some third party. Fourth, nobody knows what kind of scorched-earth strategy a losing superintelligence might be able to use to thwart its conqueror, but it could potentially be really bad – eg initiating vacuum collapse and destroying the universe. Also, since both parties would have superintelligent prediction abilities, they might both know who would win the war and how before actually fighting. This would make the fighting redundant and kind of stupid.

Although they would have the usual peace treaty options, like giving half the universe to each of them, superintelligences that trusted each other would have an additional, more attractive option. They could merge into a superintelligence that shared the values of both parent intelligences in proportion to their strength (or chance of military victory, or whatever). So if there’s a 60% chance our AI would win, and a 40% chance their AI would win, and both AIs know and agree on these odds, they might both rewrite their own programming with that of a previously-agreed-upon child superintelligence trying to convert the universe to paperclips and thumbtacks in a 60-40 mix.

This has a lot of advantages over the half-the-universe-each treaty proposal. For one thing, if some resources were better for making paperclips, and others for making thumbtacks, both AIs could use all their resources maximally efficiently without having to trade. And if they were ever threatened by a third party, they would be able to present a completely unified front.
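The efficiency argument in the last quoted paragraph can be illustrated with a small toy calculation. This is only a sketch under made-up assumptions: the resource names, amounts, and per-unit yields below are hypothetical and do not come from the essay, and the merged agent is simplified to "send each resource to the product it suits best," which leaves out how the 60-40 weighting itself would be enforced.

```python
# Toy model. Every number and name here is a hypothetical illustration,
# not something taken from the quoted essay.
# Each resource type has an amount and a per-unit yield for each product.
RESOURCES = {
    "type_a": {"amount": 100, "paperclips": 2, "thumbtacks": 1},
    "type_b": {"amount": 100, "paperclips": 1, "thumbtacks": 2},
}


def territorial_split(resources, paperclip_percent):
    """Half-the-universe-style treaty: the paperclip AI gets a fixed
    percentage of every resource type, the thumbtack AI gets the rest,
    and each uses its share only for its own product."""
    paperclips = 0
    thumbtacks = 0
    for r in resources.values():
        clip_units = r["amount"] * paperclip_percent // 100
        tack_units = r["amount"] - clip_units
        paperclips += clip_units * r["paperclips"]
        thumbtacks += tack_units * r["thumbtacks"]
    return paperclips, thumbtacks


def merged_allocation(resources):
    """Simplified merged agent: each resource type goes entirely to
    whichever product it yields more of (ties go to paperclips)."""
    paperclips = 0
    thumbtacks = 0
    for r in resources.values():
        if r["paperclips"] >= r["thumbtacks"]:
            paperclips += r["amount"] * r["paperclips"]
        else:
            thumbtacks += r["amount"] * r["thumbtacks"]
    return paperclips, thumbtacks


split = territorial_split(RESOURCES, 60)
merged = merged_allocation(RESOURCES)
print("territorial split (paperclips, thumbtacks):", split)
print("merged allocation (paperclips, thumbtacks):", merged)
```

With these invented numbers, the merged allocation produces more of both goods than the 60-40 territorial split, which is the sense in which specialization without trade can leave both parent value systems better off. Different yields could change the picture, so this shows the shape of the argument and not a general result.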

Superrational Agents Kelly Bet Influence!

abramdemski, Apr 16, 2021, 10:08 PM

47 points

7 comments, 5 min read, LW link

[REPOST] The Demiurge’s Older Brother

Scott Alexander, Mar 22, 2017, 2:03 AM

97 points

2 comments, 6 min read, LW link

How LDT helps reduce the AI arms race

Tamsin Leake, Dec 10, 2023, 4:21 PM

65 points

13 comments, 4 min read, LW link

(carado.moe)

Expected Utility, Geometric Utility, and Other Equivalent Representations

StrivingForLegibility, Nov 20, 2024, 11:28 PM

10 points

0 comments, 11 min read, LW link

Acausal trade naturally results in the Nash bargaining solution

Christopher King, May 8, 2023, 6:13 PM

3 points

0 comments, 4 min read, LW link

Acausal Now: We could totally acausally bargain with aliens at our current tech level if desired

Christopher King, Aug 9, 2023, 12:50 AM

1 point

5 comments, 4 min read, LW link

Negotiating Up and Down the Simulation Hierarchy: Why We Might Survive the Unaligned Singularity

David Udell, May 4, 2022, 4:21 AM

26 points

14 comments, 2 min read, LW link

Threat-Resistant Bargaining Megapost: Introducing the ROSE Value

Diffractor, Sep 28, 2022, 1:20 AM

162 points

19 comments, 53 min read, LW link, 2 reviews

Could Roko’s basilisk acausally bargain with a paperclip maximizer?

Christopher King, Mar 13, 2023, 6:21 PM

1 point

8 comments, 1 min read, LW link

Even if we lose, we win

Morphism, Jan 15, 2024, 2:15 AM

24 points

17 comments, 4 min read, LW link

Geometric Utilitarianism (And Why It Matters)

StrivingForLegibility, May 12, 2024, 3:41 AM

34 points

2 comments, 11 min read, LW link
