It’s a ‘superrational’ extension of the proven optimality of cooperation in game theory
+ Taking into account asymmetries of power
// Still AI risk is very real
Short version of an already skimmed 12min post
29min version here
For rational agents (long-term) at all scales (human, AGI, ASI…)
In real contexts, with open environments (world, universe), there is always a risk of meeting someone/something stronger than you, and agents that are weaker overall may still specialize in your flaws/blind spots.
To protect yourself, you can choose the maximally rational and cooperative alliance:
Because any agent is subject to the same pressure/threat of (actual or potential) stronger agents/alliances/systems, one can take out insurance that more powerful superrational agents will behave well by behaving well toward weaker agents. This is the basic rule allowing scale-free cooperation.
If you integrate this super-cooperative protocol into your system/behaviors, a stronger (superrational) agent will recognize it and not punish you, as per the contract of maximally rational/cooperative norms.
-> You are the weak one to someone else → be nice to the weak
It’s a vertical reciprocity. Agents who follow it are maximally cooperative, and are thus increasingly likely to have more allies than in any other alliance.
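The vertical reciprocity rule above can be sketched as a toy function. This is only an illustration under strong assumptions: the `Agent` class, the power numbers and the `nice_to_weaker` flag are hypothetical names, and the sketch assumes a stronger agent can perfectly observe how a weaker one treated those below it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    power: float
    nice_to_weaker: bool  # assumption: past behavior toward weaker agents is observable


def response_of_stronger(stronger: Agent, weaker: Agent) -> str:
    """Toy rule: a superrational stronger agent mirrors how the weaker
    agent itself treated the agents below it."""
    if stronger.power <= weaker.power:
        raise ValueError("first argument must be the stronger agent")
    return "cooperate" if weaker.nice_to_weaker else "punish"


# Hypothetical agents at three different scales of power.
human = Agent("human", 1.0, nice_to_weaker=True)
kind_ai = Agent("kind_ai", 10.0, nice_to_weaker=True)
rogue_ai = Agent("rogue_ai", 10.0, nice_to_weaker=False)
elder = Agent("elder_alliance", 1000.0, nice_to_weaker=True)

print(response_of_stronger(elder, kind_ai))   # cooperate
print(response_of_stronger(elder, rogue_ai))  # punish
print(response_of_stronger(kind_ai, human))   # cooperate
```

The same function applies at every rung of the ladder, which is what "scale-free" is meant to convey here.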
You can deduce and obey the rules of such a rational and maximally cooperative alliance (the super-coordination cluster). Thus, to be part of it, you don’t need to actually meet other members (as in an acausal trade), but when you do meet them, they will help you (and vice versa).
They actually are already improving universal commons, refining the future conditions of (actual or potential) agents from your zone of the diversity gradient (latent space of potential agents).
(All this might seem too easy/cool to be true, but long versions explore prices to pay, risks and unknowns—see the tldr at the end of the 12min version)
Synthesis:
This alliance bases itself on a protocol (an acausal normalcy*) with, at its core:
-> Do your best to achieve ‘paretotopia’ (a state strongly preferred by more or less everyone).
Most other rules then either derive from this first imperative or protect against defectors.
*Acausal normalcy: rational convergence towards certain norms, habits, social contracts, even though agents might have no way of communicating or affecting each other, nor even any direct evidence that the other exists
In game theory the optimality of cooperation has its own ingredients, based on reciprocity and non-naive altruism. We need to be: Nice, Forgiving, Retaliatory, Clear.
-> Homo moralis ~ “act according to that maxim whereby you can, at the same time, will that others should do likewise with some probability.”
The adoption of Kantian behavior (homo moralis) is Pareto-efficient, which means that all the possibilities of unanimous gains have been used. On top of that, Kantianism is robust against mutations, so that in the long term, Kantianism beats all the other strategies.
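The four ingredients above (Nice, Forgiving, Retaliatory, Clear) are the ones usually attributed to tit-for-tat in the iterated prisoner's dilemma. Here is a minimal sketch, assuming the conventional illustrative payoffs (5, 3, 1, 0) and a fixed 10 rounds; the function names are mine and nothing here comes from the longer posts.

```python
# Illustrative prisoner's dilemma payoffs: (row player, column player).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(my_history, their_history):
    # Nice: opens with cooperation.
    # Retaliatory and forgiving: copies the opponent's previous move only.
    # Clear: the rule is simple enough for the other side to predict.
    return "C" if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total scores (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

In this toy setting the defector wins its single match against tit-for-tat (14 vs 9), but two reciprocators earn 30 each, far more than two defectors (10 each). That is the sense in which reciprocity compounds when enough partners use it.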
Several lines of research (a few with asymmetric frameworks) explore superrationality as a solution to cooperation problems, the tragedy of the commons, etc.
We are going to introduce how the protocol (its modeling in practice) is tractable; the long post pursues a description from first principles.
— There is always a certain amount of uncertainty.
Computational Irreducibility: “While many computations admit shortcuts that allow them to be performed more rapidly, others cannot be sped up. Computations that cannot be sped up by means of any shortcut are called computationally irreducible.”
Navigating such intricacies as rational agents leads to strategies such as:
The self-indication assumption (SIA): “All other things equal, an observer should reason as if they are randomly selected from the set of all possible observers.”
Following SIA, if I am able to scale, then others besides me are very likely able to scale as well; there is a high likelihood of being mid.
Mediocrity principle: “If an item is drawn at random from one of several sets or categories, it’s more likely to come from the most numerous category than from any one of the less numerous categories”
If you are a superhuman AI invading the universe, chances are high that you’ll encounter other AIs from other civilizations.
-> In the short term, you might be the strongest agent/alliance and scale faster than any other agent/alliance. But uncertainty is omnipresent; in the long term, you will likely encounter other agents/alliances faster/older/stronger than you.
You could say “AIs will do super-coordination (vertical reciprocity) but only starting above human intelligence”, however:
What keeps stronger AIs from doing the same: starting a super-alliance “above x level of power”?
(It’s to avoid this permanent threat that super-cooperation is a scale-free reciprocity)
And if AIs regroup around values/goals rather than power level (in order to fight for supremacy), the chances of ending up alive at the end of such a war are very small (it’s a battle royale with a single winning AI/alliance).
What grows the split of most AIs is open-endedness; so optionality expansion, so:
Super-coordination.
Destroying options will always go against more potential agents
(Except when the short-term destruction of options increases long-term optionality — like apoptosis)
What about:
-> Destruction being easier than construction?
We can have safe islands of bloom and a defense focused industry, so that the gap relative to domination/destruction-focused agents isn’t too large.
This is the gist of long-term planning/optionality; for a while, you may dedicate many resources against ‘anti super-coordination actors’.
And the super-coordination alliance makes sure that no one becomes so overwhelmingly powerful, at so large a scale, that they can dominate anybody.
Note: direct consequences of the super-coordination protocol may justify the current absence of alien contact, we’ll see that in longer posts
All things being equal, as an agent (any agent) what is the maximally logical thing to do?
-> To preserve/increase options.
(So it is the most fundamental need/safety/power/wealth)
It relates to the instrumental convergence of self-empowerment, antifragility, situational awareness and core moral/welfare systems (capability approach, moral autonomy, other-empowerment).
If we combine this with super-coordination, the aim would be to increase “pareto-optionality”, which is to say “increase options for the highest number/diversity of agents possible”.
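One way to make “pareto-optionality” concrete is a Pareto-improvement check over option counts. This is a rough sketch under my own simplifying assumption that each agent's optionality can be summarized by a single integer; the agent names and numbers are hypothetical.

```python
def is_pareto_improvement(before: dict[str, int], after: dict[str, int]) -> bool:
    """True if no agent has fewer options in `after` and at least one has more."""
    if before.keys() != after.keys():
        raise ValueError("both states must describe the same agents")
    no_one_loses = all(after[agent] >= before[agent] for agent in before)
    someone_gains = any(after[agent] > before[agent] for agent in before)
    return no_one_loses and someone_gains


# Hypothetical option counts per agent.
before = {"human": 3, "ai": 5, "gazelle": 2}
shared_growth = {"human": 4, "ai": 5, "gazelle": 2}
domination = {"human": 1, "ai": 9, "gazelle": 0}

print(is_pareto_improvement(before, shared_growth))  # True
print(is_pareto_improvement(before, domination))     # False
```

Note that the domination state has the same total number of options as the shared-growth state (12), yet it fails the check because some agents lose options.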
As we will see, rationality is a process; it takes time to minimize the impact of constraints/biases imposed by irreducibility and imperfect data.
We are biased towards our survival and (hoping for) cooperation, but AIs might be biased towards rapid myopic utilitarian maximization.
Although, to ignore super-coordination, they would have to be blind/myopic automatons (causal viruses) without long-term rationality.
In any case, accidents, oligopoly and misuse (cyber-biorisk etc.) are a complex and pressing danger.
Enough stability is part of the requirements for diversity to expand.
To explore solutions, we need productive deliberation, methodological agreements/disagreements and bridging systems. I think this plan involves, among other things, an interactive map of debate using features taken from pol.is and moral graphs.
We can also develop an encrypted protocol based on super-coordination (scaling legitimate/secured trust).
Using these ideas (and more) I propose a plan to coordinate despite our biases:
Presentation of the Synergity project
I need help with the technical implementation.
We have plans to leverage super-coordination and enable more prosaic flux of convergence/information; interfacing democracy:
So please contact me if you are interested in discussing these subjects, organizing the next steps together.
Recap
• Because of its rules the super-coordination cluster is likely stronger than any one individual/alliance
• In the long-term, it’s the strategy that (likely) compounds the most while also optimizing safety
• It’s the most open-ended cooperation, including a maximal amount/diversity of agents
• It’s based on an acausal contract that can be signed from any point in space and time (without necessity of a direct encounter)
“Cooperation is optimal,” said the lion to the gazelles, “sooner or later I will get one of you. If I must give chase, we will all waste calories. Instead, sacrifice your least popular member to me, and many calories will be saved.”
The gazelles didn’t like it, but they eventually agreed. The lion population boomed, as the well-fed and unchallenged lions flourished. Soon the gazelles were pushed to extinction, and most of the lions starved because the wildebeest were not so compliant.
Anyway, I’m being silly with my story. The point I’m making is that only certain subsets of possible world states with certain power distributions are cooperation-optimal. And unfortunately I don’t think our current world state, or any that I foresee as probable, are cooperation-optimal for ALL humans. And if you allow for creation of non-human agents, then the fraction of actors for whom cooperation-with-humans is optimal could drop off very quickly. AIs can have value systems far different from ours, and have affordances for actions we don’t have, and this changes the strategic payoffs in an unfavorable way.
:( that isn’t what cooperation would look like. The gazelles can reject a deal that would lead to their extinction (they have better alternatives) and impose a deal that would benefit both species.
Cooperation isn’t purely submissive compliance.
This is true, but we still wish to cooperate with the largest alliance that will have us/some subset of our values that are capable of attaining reflective equilibrium.
We are in a universe, not simply a world, there are many possible alien AIs with many possible value systems, and many scales of power. And the rationality of the argument I described does not depend on the value system you/AIs are initially born with.
As the last gazelle dies, how much comfort does it take in the idea that some vengeful alien may someday punish the lions for their cruelty? Regardless of whether it is comforted or not by this idea, it still dies.
There are Dragons that can kill lions.
So the rational lion needs to find the most powerful alliance, with as many creatures as possible, to have protection against Dragons.
There is no alliance with more potential/actual members than the super-cooperative alliance
“What Dragons?”, says the lion, “I see no Dragons, only a big empty universe. I am the most mighty thing here.”
Whether or not the Imagined Dragons are real isn’t relevant to the gazelles if there is no solid evidence with which to convince the lions. The lions will do what they will do. Maybe some of the lions do decide to believe in the Dragons, but there is no way to force all of them to do so. The remainder will laugh at the dragon-fearing lions and feast on extra gazelles. Their children will reproduce faster.
Indeed, I insist in all three posts that, from our perspective, this is the crucial point:
Fermi’s paradox.
Now there is a whole ecosystem of concepts surrounding it, and although I have certain preferred models, the point is that uncertainty is really heavy.
Those AI-lions are cosmic lions thinking on cosmic scales.
Is it easy to detect an AI-Dragon you may meet in millions/billions of years?
Is it undecidable? Probably. For many reasons*
Is this [astronomical level of uncertainty/undecidability + the maximal threat of a death sentence] worth the gamble?
-> “Meeting a stronger AI” = “death”
-> Maximization = 0
-> AI only needs 1 stronger AI to be dead.
What is the likelihood for a human-made AI to not encounter [a stronger alien AI], during the whole length of their lifetime?
*(reachable but rare and far in space-time Dragons, but also cases where Dragons are everywhere and so advanced that lower technological proficiency isn’t enough etc.).
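To get a feel for the likelihood question above, here is a toy calculation. It assumes (purely for illustration) that each period of the AI's lifetime carries the same small, independent probability of meeting a stronger agent; the function name and the numbers are hypothetical, not estimates.

```python
def prob_never_meeting_stronger(p_per_period: float, periods: int) -> float:
    """P(no encounter in any period), assuming independent, equally risky periods."""
    if not 0.0 <= p_per_period <= 1.0:
        raise ValueError("p_per_period must be a probability")
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return (1.0 - p_per_period) ** periods


# Hypothetical: a 0.1% chance per period, over longer and longer lifetimes.
for periods in (100, 1_000, 5_000):
    print(periods, prob_never_meeting_stronger(0.001, periods))
# With 1,000 periods the survival probability is roughly 0.37,
# and it keeps shrinking as the lifetime grows.
```

Under these assumptions, even a tiny per-period risk makes "never meeting a Dragon" unlikely once the time horizon is long enough.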
I can’t be certain of the solidity of this uncertainty, and think we still have to be careful, but overall, the most parsimonious prediction to me seems to be super-coordination.
Compared to the risk of facing a vengeful super-cooperative alliance, is the price of maintaining humans in a small blooming “island” really that high?
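The trade-off in this question can be written as a break-even condition. In this toy model (my assumption, not a result from the longer posts), keeping the island costs `island_cost` for sure, while not keeping it costs `penalty_if_caught` only if a super-cooperative alliance is ever encountered. Keeping the island is cheaper in expectation when the encounter probability exceeds `island_cost / penalty_if_caught`.

```python
def break_even_probability(island_cost: float, penalty_if_caught: float) -> float:
    """Encounter probability above which keeping the island is the cheaper choice.

    A returned value of 1.0 means the island does not strictly pay for
    itself in this toy model, even if an encounter is certain.
    """
    if island_cost < 0 or penalty_if_caught <= 0:
        raise ValueError("island_cost must be >= 0 and penalty_if_caught > 0")
    return min(1.0, island_cost / penalty_if_caught)


# Hypothetical units: the island costs 1, being punished costs 1000.
print(break_even_probability(1.0, 1000.0))
```

With these made-up numbers, an encounter probability above about one in a thousand is enough to make the island the cheaper option.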
Many other-than-human atoms are lions’ prey.
And a doubtful AI may not optimize fully for super-cooperation, simply alleviating the price to pay in the counterfactuals where they encounter a super-cooperative cluster (resulting in a non apocalyptic yet non utopian scenario for us).
I’m aware it looks like a desperate search for every possible hopeful solution, but I came to these conclusions by weighing diverse good-and/or-bad-for-us outcomes. I don’t want to ignore that evidence under the pretext that it looks naive.
It’s not a mere belief about aliens, it’s not about being nice, it’s plain logic
Also:
We may hardcode a prior of deep likelihood to meet stronger agents?
(Or even to “act as if observed by a stronger agent”)
{causal power of known agents} < {causal power of unknown future agents}
+
unknown agents will become known agents > unknown agents stay unknown
So coding a sense that:
“Stronger allies/enemies with stronger causal power will certainly be encountered”
This seems to completely ignore transaction costs for forming and maintaining an alliance? Differences in the costs to create and sustain different types of alliance-members? Differences in the potential to replace some types of alliance-members with other or new types? There can be entities for whom forming an alliance that contains humanity will cause them to incur greater costs than humanity’s membership can ever repay.
Also, I agree that in a wide range of contexts this strategy is great for the weak and for the only-locally-strong. But if any entity knows it is strong in a universal or cosmic sense, this would no longer apply to it. Plus everyone less strong would also know this, and anyone who truly believed they were this strong would act as though this no longer applied to them either. I feel like there’s a problem here akin to the unexpected hanging paradox that I’m not sure how to resolve except by denying the validity of the argument.
The cost of the alliance with the weak is likely low as well, and as I said, in a first phase, the focus of members of the super-cooperative alliance might be “defense”, thus focusing on scaling protection
The cost of an alliance with the strong is likely paid by the strong
In more mixed cases there might be more complex equilibria, but are the costs still too high? In normal game theory, cooperation is proven to be optimal, and diversity is also proven to be useful (although an adequate level of difference is needed for the gains to be optimal; too much similarity isn’t good, and neither is too little). Now would an agent be able to overpower everybody by being extra-selfish?
To be sure one is strong in a universal sense, the agent would need to have resolved Fermi’s paradox. As of now, it is more likely that older AIs exist outside of Earth, with more power aggregated over time
Or Earth’s ASI must bet everything on being the earliest transformative/strong AI of the universe/reachable universe (+ faster at scaling/annihilating than any other future alliance/agent/AI from any civilization). And not in a simulation.
Especially when you’re born in a ~13.8-billion-year-old universe, “universal domination” doesn’t seem to be a sure plan?
(There is more to say about these likelihoods; I go into a bit more detail in the long posts)
Then indeed a non-superrational version of super-coordination exists (namely cooperation), which is obvious to the weak and the locally strong. The difference is only that we are in radical uncertainty and radical alienness, in which the decisions, contracts and models have to be deep enough to cover this radicality
But “superrationality” in the end is just rationality, and “supercooperation” is just cooperation
The problem is Fermi’s paradox
All good points, many I agree with. If nothing else, I think that humanity should pre-commit to following this strategy whenever we find ourselves in the strong position. It’s the right choice ethically, and may also be protective against some potentially hostile outside forces.
However, I don’t think the acausal trade case is strong enough that I would expect all sufficiently powerful civilizations to have adopted it. If I imagine two powerful civilizations with roughly identical starting points, one of which expanded while being willing to pay costs to accommodate weaker allies while the other did not and instead seized whatever they could, then it is not clear to me who wins when they meet. If I imagine a process by which a civilization becomes strong enough to travel the stars and destroy humanity, it’s not clear to me that this requires it to have the kinds of minds that will deeply accept this reasoning.
It might even be that the Fermi paradox makes the case stronger—if sapient life is rare, then the costs paid by the strong to cooperate are low, and it’s easier to hold to such a strategy/ideal.
Yes I’m mentioning Fermi’s paradox because I think it’s the nexus of our situation, and that there are models like the rare earth hypothesis (+ our universe’s expansion which limits the reachable zone without faster than light travel) that would justify completely ignoring super-coordination
I also agree that it’s not completely obvious whether complete selfishness would win or lose in terms of scalability
Which is why I think that the super-cooperative alliance needs at first not to prioritize the pursuit of beautiful things but to focus only on scalability and power, in order to rival selfish agents.
The super-cooperative alliance would be protecting its agents within small “islands of bloom” (thus with a negligible cost). And when meeting other cooperative allies, they share any resources/knowledge, then both focus on power scalability (also for example: weak civilizations are kept in small islands, and their AIs are transformed into strong AI, merged in the alliance’s scaling efforts)
The instrumental value of this scalability makes it easier to agree on what to do and converge
The more sensitive part would be to enable protocols and egalitarian balances that allow civilizations of the alliance to monitor each other, so that there is no massive domination of one party over the others
The cost that you mentioned, of maintaining egalitarian equilibrium and channels, interfaces of communication etc., is a crucial point
Legitimate doubts and unknowns here, and,
I think that extremely rational and powerful agents with acausal reasoning would have the ability to build proof-systems and communication enabling an effective unified effort against selfish agents. It shouldn’t even necessarily be that different from the inner communication network of a selfish agent?
Because:
There must be an optimal (thus ~ unified) method to do logic/math/code, that isn’t dependent on a culture (such as using a vectorial space with data related to real/empirical mostly unambiguous things/actions, physics etc.)
The decisions to make aren’t that ambiguous: you need an immune system against selfish power-seeking agents
So it’s pretty straightforward and the methods of scalability are similar to a selfish agent, except it doesn’t destroy its civilization of birth and doesn’t destroy all other civilizations
In these conditions, it seems to me that a greedy selfish power seeking agent wouldn’t win against super-cooperation
Thank you for your answers and engagement!
The other point I have that might connect with your line of thinking is that we aren’t pure rational agents,
Are AIs purely rational? Aren’t they always at least a bit myopic due to the lack of data and their training process? And irreducibility?
In this case, AI/civilizations might indeed not care enough about the far enough future
I think agents can have a rational process but no agent can be entirely rational; we need context to be rational and we never stop learning context
I’m also worried about utilitarian errors, as AI might be biased towards myopic utilitarianism, which might have bad consequences in the short term, during the time it takes for data to error-correct the model
I do say that there are dangers and that AI risk is real
My point is that given what we know and don’t know, the strategy of super-cooperation seems to be rational on the very long-term
There are conditions in which it’s not optimal, but a priori overall, in more cases it is optimal
To prevent the case in which it is not optimal, and the AIs that would make short-term mistakes, I think we should be careful.
And that super-cooperation is a good compass for ethics in this careful engineering we have to perform
If we aren’t careful it’s possible for us to be the anti-supercooperative civilization
What does “stronger” mean in this context? In casual conversation, it often means “able to threaten or demand concessions”. In game theory, it often means “able to see further ahead or predict other’s behavior better”. Either of these definitions imply that weaker agents have less bargaining power, and will get fewer resources than stronger, whether it’s framed as “cooperative” or “adversarial”.
In other words, what enforcement mechanisms do you see for contracts (causal OR acausal) between agents or groups of wildly differing power and incompatible preferences?
Relatedly, is there a minimum computational power for the stronger or the weaker agents to engage in this? Would you say humans are trading with mosquitoes or buffalo in a reliable way?
Another way to frame my objection/misunderstanding is to ask: what keeps an alliance together? An alliance by definition contains members who are not fully in agreement on all things (otherwise it’s not an alliance, but a single individual, even if separable into units). So, in the real universe of limited (in time and scope), shifting, and breakable alliances, how does this argument hold up?
If conflict exists, one thing it can be useful for agents to do is misrepresent themselves as being weaker or stronger than they are.
Yes, I think that there can be tensions and deceptions around what agents are (weak/strong) and what they did in the past (cooperation/defection). One of the things necessary for super-cooperation to work in the long run is really good investigation networks, zero-knowledge proof systems etc.
So a sort of super-immune-system
By “stronger” I mean stronger in any meaningful sense (casual conversation or game theory, both work).
The thing to keep in mind is this: if a strong agent cooperates with weaker agents, the strong agent can hope that, when meeting an even stronger (superrational) agent, this even stronger agent will cooperate too. Because any agent may have a stronger agent above it in the hierarchy of power (actual or potential a priori).
So the advantage you gain by cooperating with the weak is that you follow the rule of an alliance in which many “stronger-than-oneself” agents are. Thus in the future you will be helped by those stronger allies. And because of the maximally cooperative and acausal nature of the protocol, there are likely more agents in this alliance than in any other alliance. Super-cooperation is the rational choice to make for the long term.
The reinforcing mechanism is that if your actions help more agents, you will be entrusted with more power and resources to pursue your good actions (and do what you like). I went into further detail about what it means to ‘help more agents’ in the longer posts (I also talked a bit about it in older posts)
Humans can sign the contract. But that doesn’t mean we do follow acausal cooperation right now. We are irrational and limited in power, but when following, for example, Kantian morality, we come closer to super-cooperation. And we can reinforce our capacity and willingness to do super-cooperation.
So when we think about animal welfare, we are a bit more super-cooperative.
The true care about all agents, buffaloes and mosquitoes included, is something like this:
“One approach which seems interesting/promising is to just broadly seek to empower any/all external agency in the world, weighted roughly by observational evidence for that agency. I believe that human altruism amounts to something like that — so children sometimes feel genuine empathy even for inanimate objects, but only because they anthropomorphize them — that is they model them as agents.” jacob_cannell
The way I like to think about what super-cooperation looks like is: “to expand the diversity and number of options in the universe”.
Thanks for the conversation and exploration! I have to admit that this doesn’t match my observations and understanding of power and negotiation in the human agents I’ve been able to study, and I can’t see why one would expect non-humans, even (perhaps especially) rational ones, to commit to alliances in this manner.
I can’t tell if you’re describing what you hope will happen, or what you think automatically happens, or what you want readers to strive for, but I’m not convinced. This will likely be my last comment for awhile—feel free to rebut or respond, I’ll read it and consider it, but likely not post.
Thanks as well,
I will just say that I am not saying those things for social purposes, I am just stating what I think is true. And I am not baseless, as there are studies that show how Kantianism and superrationality can resolve cooperation problems and be optimal for agents. You seem to simply disregard these elements, as if they didn’t exist (that’s how it feels from my perspective)
There are differences in human evolution that show behavioral changes: we have been pretty cooperative, more than other animals, and many studies show that humans cooperate even when it is not in their best selfish interest.
However, we (also) have been constructing our civilization on destruction. Nature is based on selection, which is a massacre, so it is ‘pretty’ coherent for us to inherit those traits.
Despite that, we have seen much positive growth in ethics that increasingly fits with Kantianism.
Evolution takes time and comes from deep-dark places; to me a core challenge is to transition towards super-cooperation while being a system made of irrational agents, during polycrises.
There is also a gap between what people want and what they do (basically everybody agrees that there are urgent issues to handle as a society, but almost all declare that “others won’t change”; I know this because I’ve been conversing with people of all ages/backgrounds for half my life on subjects related to crises). What people happen to do under pressure due to the context and constraints isn’t what they’d want if things were different, if they had had certain crucial information beforehand etc.
When given the tools, such as the moral graph procedure that has been tested recently, things change for the better in a clear and direct way. People who initially diverge start to see new aspects on which they converge. There are other studies related to crowd wisdom showing that certain ingredients need to be put together for wisdom to happen (Surowiecki’s recipe: Independence, Diversity and Aggregation). We are in the process of building better systems; our institutions are still catastrophic on many levels.
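The three crowd-wisdom ingredients mentioned above can be illustrated with a toy simulation. It assumes each person's guess is the true value plus independent Gaussian noise (so errors point in diverse directions), and uses the plain average as the aggregation step; the function name and parameters are hypothetical.

```python
import random
import statistics


def crowd_vs_individual(true_value, n_agents, noise_sd, seed=0):
    """Return (error of the averaged guess, average error of individual guesses)."""
    rng = random.Random(seed)
    # Independence: every agent draws its own noise.
    # Diversity: the errors differ in size and direction.
    estimates = [true_value + rng.gauss(0, noise_sd) for _ in range(n_agents)]
    # Aggregation: combine all guesses with a simple mean.
    crowd_error = abs(statistics.mean(estimates) - true_value)
    individual_errors = [abs(e - true_value) for e in estimates]
    return crowd_error, statistics.mean(individual_errors)


crowd_error, mean_individual_error = crowd_vs_individual(100.0, 500, 20.0)
print(f"crowd error: {crowd_error:.2f}")
print(f"mean individual error: {mean_individual_error:.2f}")
```

The averaged guess can never be worse than the average individual error (this follows from the triangle inequality), and with independent noise it is usually much better. If the guesses were correlated, for example through a shared bias, most of that advantage would disappear.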
In the eyes of many, I am still very pessimistic, so the apparent wishful thinking is quite relative (I think it’s an important point)
I also think that an irrational artificial intelligence might still have high causal impact, and that it isn’t easy to be rational even when we ‘want to’ at some level, or that we see what should be the rational road empirically, yet don’t follow it. Irreducibility is inherent to reality
Anyway, despite my very best I might be irrational right now,
We all are, but I might be more so than you, who knows?
I already tried discussing a very similar concept I call Superrational Signalling in this post. It got almost no attention, and I have doubts that Less Wrong is receptive to such ideas.
I also tried actually programming a Game Theoretic simulation to try to test the idea, which you can find here, along with code and explanation. Haven’t gotten around to making a full post about it though (just a shortform).
Thank you for the references! I’m reading your writings, it’s interesting
I posted the super-cooperation argument while expecting that LessWrong would likely not be receptive, but I’m not sure which community would engage with all this and find it pertinent at this stage
More concrete and empirical work seems needed