Eliezer claims that dath ilani never give in to threats. But I’m not sure I buy it.
The only reason people will make threats against you, the argument goes, is if those people expect that you might give in. If you have an iron-clad policy against acting in response to threats made against you, then there’s no point in making or enforcing the threats in the first place. There’s no reason for the threatener to bother, so they don’t. Which means in some sufficiently long run, refusing to submit to threats means you’re not subject to threats.
This seems a bit fishy to me. I have a lingering suspicion that this argument doesn’t apply, or at least doesn’t apply universally, in the real world.
I’m thinking here mainly of a prototypical case of an isolated farmer family (like the early farming families of the Greek peninsula, not absorbed into a polis) being accosted by some roving bandits, such as the soldiers of the local government. The bandits say “give us half your harvest, or we’ll just kill you.”
The argument above depends on a claim about the cost of executing on a threat. “There’s no reason to bother” implies that the threatener has a preference not to bother, if they know that the threat won’t work.
I don’t think that assumption particularly applies. For many cases, like the case above, the cost to the threatener of executing on the threat is negligible, or at least small relative to the available rewards. The bandits don’t particularly mind killing the farmers and taking their stuff, if the farmers don’t want to give it up. There isn’t a realistic chance that the bandits, warriors specializing in violence and outnumbering the farmers, will lose a physical altercation.
From the bandits’ perspective there are two options:
1. Showing up, threatening to kill the farmers, and taking away as much food as they can carry (and then maybe coming back to accost them again next year).
2. Showing up, threatening to kill the farmers, actually killing the farmers, and then taking away as much food as they can carry.
It might be easier and less costly for the bandits to get what they want by being scary rather than by being violent. But the plunder is definitely enough to make violence worth it if it comes to that. They prefer option 1, but they’re totally willing to fall back on option 2.
It seems like, in this situation, the farmers are probably better off cooperating with the bandits and giving them some food, even knowing that that means that the bandits will come back and demand “taxes” from them every harvest. They’re just better off submitting.
Maybe, decision-theoretically, this situation doesn’t count as a threat. The bandits are taking food from the farmers, one way or the other, and they’re killing the farmers if they try to stop that. They’re not killing the farmers so that they’ll give up their food.
But that seems fishy. Most of the time, the bandits don’t, in fact, have to resort to violence. Just showing up and threatening violence is enough to get what they want. The farmers do make the lives of the bandits easier by submitting and giving them much of the harvest without resistance. Doing otherwise would be straightforwardly worse for them.
Resisting the bandits out of a commitment to some notion of decision-theoretic rationality seems exactly analogous to two-boxing in Newcomb’s problem because of a commitment to (causal) decision-theoretic rationality.
You might not want to give in out of spite. “Fuck you. I’d rather die than help you steal from me.” But a dath ilani would say that that’s a matter of the utility function, not of decision theory. You just don’t like submitting to threats, and so will pay big costs to avoid it, not that you’re following a policy that maximizes your payoffs.
So, it seems like the policy has to be “don’t give in to threats that are sufficiently costly to execute that the threatener would prefer not to bother, if they knew in advance that you wouldn’t give in”. (And possibly with the additional caveat “if the subjunctive dependence between you and the threatener is sufficiently high.”)
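To make that refined policy concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Demand` class, the function names, and the payoff numbers are hypothetical, and the model ignores subjunctive dependence entirely. It only encodes the test “would the threatener still carry this out if they knew I would refuse?”

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    """A demand seen from the threatener's side (all payoffs are made-up units)."""
    name: str
    gain_if_executed: float           # what the threatener gets by carrying it out
    cost_to_execute: float            # what carrying it out costs the threatener
    payoff_if_walk_away: float = 0.0  # threatener's payoff from doing nothing


def would_execute_anyway(d: Demand) -> bool:
    """True if executing beats walking away even when the target is known to refuse."""
    return d.gain_if_executed - d.cost_to_execute > d.payoff_if_walk_away


def should_refuse(d: Demand) -> bool:
    """Refined policy: refuse only when the threatener would rather not bother."""
    return not would_execute_anyway(d)


# Hypothetical numbers: the bandits keep the plunder either way,
# while a blackmailer gains nothing by publishing.
bandits = Demand("half your harvest or we kill you",
                 gain_if_executed=10.0, cost_to_execute=1.0)
blackmail = Demand("pay or I publish",
                   gain_if_executed=0.0, cost_to_execute=1.0)

for d in (bandits, blackmail):
    print(f"{d.name}: refuse={should_refuse(d)}")
```

Under these assumed numbers the policy says to refuse the blackmailer but not the bandits. The hard part in practice is that the target has to estimate `gain_if_executed` and `cost_to_execute` from the outside, and the threatener has every reason to misreport them.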
But that’s a much more complicated policy. For one thing, it requires a person-being-threatened to accurately estimate how costly it would be for the threatener to execute their threat (and the threatener is thereby incentivized to deceive them about that).
Hm. But maybe that’s easy to estimate actually, in the cases where the threatener gets a payout of 0 if the person-being-threatened doesn’t cooperate with the threat? Which is the case for most blackmail attempts, for instance, but not necessarily “if you don’t give me some of your harvest, I’ll kill you.”
In lots of cases, it seems like it would be ambiguous, especially when there are large power disparities in favor of the threatener. When someone powerful threatens you, the cost of executing the threat is likely to be small for them, possibly small enough to be negligible. And in those cases, their own spite at you for resisting them might be more than enough reason to act on it.
[Ok. that’s enough for now.]
Eliezer, this is what you get for not writing up the planecrash threat lecture thread. We’ll keep bothering you with things like this until you give in to our threats and write it.
What you’ve hit upon is “BATNA,” or “Best alternative to a negotiated agreement.” Because the robbers can get what they want by just killing the farmers, the dath ilani will give in, and from what I understand, Yudkowsky therefore doesn’t classify the original request (give me half your wheat or die) as a threat.
This may not be crazy. It reminds me of the Ancient Greek social mores around hospitality, which seem insanely generous to a modern reader but I guess make sense if the equilibrium number of roving <s>bandits</s> honored guests is kept low by some other force.
This seems like it weakens the “don’t give in to threats” policy substantially, because it makes it much harder to tell what’s a threat-in-the-technical-sense, and the incentives push toward exaggeration and dishonesty about what is or isn’t a threat-in-the-technical-sense.
The bandits should always act as if they’re willing to kill the farmers and take their stuff, even if they’re bluffing about their willingness to do violence. The farmers need to estimate whether the bandits are bluffing, and either call the bluff, or submit to the demand-which-is-not-technically-a-threat.
That policy has notably more complexity than just “don’t give in to threats.”
What is the “don’t give in to threats” policy that this is more complex than? In particular, what are ‘threats’?
“Anytime someone credibly demands that you do X, otherwise they’ll do Y to you, you should not do X.” This is a simple reading of the “don’t give in to threats” policy.
What are the semantics of “otherwise”? Are they more like:
X otherwise Y ↦ X → ¬Y, or
X otherwise Y ↦ X ↔ ¬Y
Presumably you also want the policy to include that you don’t want “Y” and weren’t going to do “X” anyway?
Yes, to the first part, probably yes to the second part.
With a grain of salt,
There’s a sort of quiet assumption that should be louder about the dath ilan fiction, which is that it’s about a world where a bunch of theorems like “as systems of agents get sufficiently intelligent, they gain the ability to coordinate in prisoner’s-dilemma-like problems” have proofs. You could similarly write fiction set in a world where P=NP has a proof and all of cryptography collapses. I’m not sure whether EY would guess that sufficiently intelligent agents actually coordinate, just like I could write the P=NP fiction while being pretty sure that P≠NP.
Huh, the idea that Greek guest-friendship was an adaptation to warriors who would otherwise kill you and take your stuff is something that I had never considered before. Isn’t it generally depicted as a relationship between nobles who, presumably, would be able to repel roving bandits?
Threateners similarly can employ bindings, always enforcing regardless of local cost. A binding has an overall cost from following it in all relevant situations. Costs in individual situations go into estimating this overall cost, but individually they are not decision-relevant when deciding whether to commit to a global binding.
In this case opposing commitments effectively result in global enmity (threateners always enforce, targets never give in to threats), so if targets are collectively stronger than threateners, then threateners lose. But this collective strength (for the winning side) or vulnerability (for the losing side) is only channeled through targets or threateners who join their respective binding. If few people join, the faction is weak and loses.
But threateners don’t want to follow that policy, since in the resulting equilibrium they’re wasting a lot of their own resources.
The equilibrium depends on which faction is stronger. Threateners who don’t always enforce and targets who don’t always ignore threats are not parts of this game, so it’s not even about relative positions of threateners and targets; only those that commit are relevant. If the threateners win, targets start mostly giving in to threats, and so for threateners the cost of binding becomes low overall.
I’m talking about the equilibrium where targets are following their “don’t give in to threats” policy. Threateners don’t want to follow a policy of always executing threats in that world—really, they’d probably prefer to never make any threats in that world, since it’s strictly negative EV for them.
If the unyielding targets faction is stronger, the equilibrium is bad for committed enforcers. If the committed enforcer faction is stronger, the equilibrium doesn’t retain high cost of enforcement, and in that world the targets similarly wouldn’t prefer to be unyielding. I think the toy model where that fails leaves the winning enforcers with no pie, but that depends on enforcers not making use of their victory to set up systems for keeping targets relatively defenseless, taking the pie even without their consent. This would no longer be the same game (“it’s not a threat”), but it’s not a losing equilibrium for committed enforcers of the preceding game either.
This distinction of which demands are or aren’t decision-theoretic threats that rational agents shouldn’t give in to is a major theme of the last ~quarter of Planecrash (enormous spoilers in the spoiler text).
Keltham demands to the gods “Reduce the amount of suffering in Creation or I will destroy it”. But this is not a decision-theoretic threat, because Keltham honestly prefers destroying Creation to the status quo. If the gods don’t give in to his demand, carrying through with his promise is in his own interest.
If Nethys had made the same demand, it would have been a decision-theoretic threat. Nethys prefers the status quo to Creation being destroyed, so he would have no reason to make the demand other than the hope that the other gods would give in.
This theme is brought up many times, but there’s not one comprehensive explanation to link to. (The parable of the little bird is the closest I can think of.)
The assertion IIUC is not that it never makes sense for anyone to give in to a threat—that would clearly be an untrue assertion—but rather that it is possible for a society to reach a level of internal coordination where it starts to make sense to adopt a categorical policy of never giving in to a threat. That would mean for example that any society member that wants to live in dath ilan’s equivalent of an isolated farm would probably need to formally and publicly relinquish their citizenship to maintain dath ilan’s reputation for never giving in to a threat. Or dath ilan would make it very clear that they must not give in to any threats, and if they do and dath ilan finds out, then dath ilan will be the one that slaughters the whole family. The latter policy is a lot like how men’s prisons work at least in the US whereby the inmates are organized into groups (usually based on race or gang affiliation) and if anyone even hints (where others can hear) that you might give in to sexual extortion, you need to respond with violence because if you don’t, your own group (the main purpose of which is mutual protection from the members of the other groups) will beat you up.
That got a little grim. Should I add a trigger warning? Should I hide the grim parts behind a spoiler tag thingie?
Bandits have an obvious cost: if they kill all the farmers, from whom are they going to take stuff?
That’s not a cost.
At worst, all the farmers will relentlessly fight to the death; in that case the bandits get one year of food and have to figure something else out next year.
That outcome strictly dominates not stealing any food this year, and needing to figure something else out both this year and next year.
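A rough way to check that dominance claim is to write out the bandits’ two-year totals. This is only a sketch under stated assumptions: the units, `HARVEST`, `FIGHT_COST`, and the strategy names are all hypothetical, and the claim only holds while the cost of violence stays below the value of the plunder.

```python
# Bandits' total food over two years, in hypothetical "harvest" units.
HARVEST = 1.0      # food taken in one raid (assumed)
FIGHT_COST = 0.2   # cost of violence to the bandits, assumed small vs. HARVEST

# payoffs[bandit_strategy][farmer_response] = bandits' two-year total
payoffs = {
    "raid": {
        "submit": 2 * HARVEST,                   # tribute this year and next
        "fight_to_death": HARVEST - FIGHT_COST,  # one year of food, nothing next year
    },
    "abstain": {
        "submit": 0.0,          # no raid, so nothing is taken either way
        "fight_to_death": 0.0,
    },
}


def strictly_dominates(a: str, b: str) -> bool:
    """True if strategy a pays strictly more than b against every farmer response."""
    return all(payoffs[a][r] > payoffs[b][r] for r in payoffs[a])


print(strictly_dominates("raid", "abstain"))
```

With these numbers raiding comes out ahead whether the farmers submit or fight. The comparison would flip only if `FIGHT_COST` exceeded `HARVEST`, which is exactly the case the original post argues is unrealistic for professional warriors facing an isolated family.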
I don’t recall Eliezer claiming that dath ilani characters never give in to threats. *Dath ilani characters* claim they never give in to threats. My interpretation is that the characters *say* “We don’t give in to threats”, and *believe* it, but it’s not *true*. Rather it’s something between a self-fulfilling prophecy, a noble lie-told-to-children, and an aspiration.
There are few threats in dath ilan, partly because the conceit of dath ilan is that it’s mostly composed of people who are cooperative-libertarianish by nature and don’t want to threaten each other very much, but partly because it’s a political structure where it’s much harder to get threats to actually *work*. One component of that political structure is how people are educated to defy threats by reflex, and to expect their own threats to fail, by learning an idealized system of game theory in which threats are always defied.
However, humans don’t actually follow ideal game theory when circumstances get sufficiently extreme, even dath ilani humans. Peranza can in fact be “shattered in Hell beyond all hope of repair” in the bad timeline, for all that she might rationally “decide not to break”. Similarly when the Head Keeper commits suicide to make a point: “So if anybody did deliberately destroy their own brain in attempt to increase their credibility—then obviously, the only sensible response would be to ignore that, so as not create hideous system incentives. Any sensible person would reason out that sensible response, expect it, and not try the true-suicide tactic.” But despite all that the government sets aside the obvious and sensible policy because, come on, the Head Keeper just blew up her own brain, stop fucking around and get serious. And the Head Keeper, who knows truths about psychology which the members of government do not, *accurately predicted they would respond that way*.
So dath ilani are educated to believe that giving in to threats is irrational, and to believe that people don’t give in to threats. This plus their legal system means that there are few threats, and the threats usually fail, so their belief is usually correct, and the average dath ilani never sees it falsified. Those who think carefully about the subject will realize that threats can sometimes work, in circumstances which are rare in dath ilan, but they’ll also realize that it’s antisocial to go around telling everyone about the limits of their threat-resistance and keep it quiet. The viewpoint characters start believing the dath ilani propaganda but update pretty quickly when removed from dath ilan. Keltham has little trouble understanding the Golarian equilibrium of force and threats once he gets oriented. Thellim presumably pays taxes off camera once she settles in to Earth.
You need spoiler tags!
Downvoting until they’re added.