The word “threat” here is used as a loose translation for a mathematically much more precise concept in dath ilan. I am pretty sure that real-world game theory doesn’t have the mathematical sophistication to support such a definition, and perhaps such a distinction doesn’t actually make mathematical sense after all. Multipolar game theory is very far from solved.
On the other hand, the author’s phrase “foolish and dangerous to build a Civilization that would stop working if people started behaving more rationally” provides a guideline for how such a criterion might be intended to work, at least in simple cases. It seems to point toward classifying as a threat any attempt to get more than the best coordinating rational actors could get, by ensuring that other parties always get less.
“I will reject any offer below 50%” seems not to be a threat in this sense for the usual formulation of the Ultimatum Game, since it allows the respondent to get 50%, which is the best result that coordinating rational actors can get on average anyway. I’d still call it a borderline threat, in that arbitrarily small perturbations in the game could make it a threat. It’s too sharp and fragile.
“I will ignore all threats” also fails to be a threat by this criterion, since it allows both parties to attain the coordinating optimum, but it also seems overly sharp. “I will reject threats with enough probability that on average it’s not worth your while making them” seems like a smoother strategy.
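To make “reject with enough probability that it’s not worth your while” concrete, here is a minimal sketch in a 10-unit Ultimatum Game. The particular rule (cap the proposer’s expected take at the fair share) and the function name are my own illustration, not anything from the story:

```python
# Toy sketch: probabilistic rejection in a 10-unit Ultimatum Game.
# The rule below is an illustrative assumption, not a canonical strategy.

POT = 10
FAIR_SHARE = POT / 2  # 5 units each

def acceptance_probability(offer_to_responder: float) -> float:
    """Accept unfair offers just often enough that the proposer's
    expected take never exceeds what the fair split would give them."""
    if offer_to_responder >= FAIR_SHARE:
        return 1.0
    proposer_share = POT - offer_to_responder
    return FAIR_SHARE / proposer_share

for offer in [5, 4, 3, 1]:
    p = acceptance_probability(offer)
    print(f"offer {offer}: accept with probability {p:.3f}; "
          f"proposer expects {p * (POT - offer):.2f}")
```

Under this rule the proposer expects exactly 5 units no matter how unfair the offer, so unfairness never pays, while a fair offer is always accepted with certainty.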
There is a mathematically precise definition of “threat” in game theory. It’s approximately the one Yair semi-explicitly used above. Alice threatens Bob when Alice says that, if Bob performs some action X, then Alice will respond with action Y, where Y (a) harms Bob and (b) harms Alice. (If one wants to be “mathematical”, then one could say that each combination of actions is associated with a set of payoffs, and that “action Y harms Bob” == “[Bob’s payoff with Y] < [Bob’s payoff with not-Y]”.) The threat should successfully deter Bob if, and only if, (1) Bob believes Alice’s statement; (2) the harm inflicted on Bob by Y exceeds the benefit he gains from X; and (3) because of (b), Bob believes Alice wouldn’t just do Y anyway.
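For concreteness, here is a sketch of that definition written as a simple predicate over payoff pairs; the example numbers are invented:

```python
# Sketch: the game-theoretic "threat" definition as a payoff predicate.
# Payoff tuples are (alice, bob); the example numbers are invented.

def is_threat(payoff_y: tuple[float, float],
              payoff_not_y: tuple[float, float]) -> bool:
    """Alice's conditional response Y is a threat iff, relative to
    not-Y, it (a) harms Bob and (b) harms Alice herself."""
    alice_y, bob_y = payoff_y
    alice_not_y, bob_not_y = payoff_not_y
    return bob_y < bob_not_y and alice_y < alice_not_y

# "If you do X, I'll do Y", where carrying out Y costs Alice 2 and
# costs Bob 5, relative to both of them just walking away:
print(is_threat(payoff_y=(-2, -5), payoff_not_y=(0, 0)))  # True
```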
If Alice has an action Z that harms Bob and benefits her, then she can’t use it in a threat, because Bob would assume she’d do it anyway. But what she can do is make a promise: if Bob does what she wants, then she’ll take action Q, which (a) helps Bob and (b) harms her. In this case Q would be “refrain from Z”.
Of course, carrying out a threat or a promise is by definition irrational. But being able to change others’ behavior is useful, and that’s what creates evolutionary value in emotional responses like anger/revenge, gratitude/obligation, etc., and in other methods of self-compulsion.

(I learned this from the book “Game Theory and Strategy” by Straffin, but you can see the same definitions given in e.g. http://pi.math.cornell.edu/~mec/2008-2009/Anema/stategicmoves.htm .)
I would be surprised if dath ilan didn’t have the basic concepts of game-theoretic threats and promises; and if they do, I’m not sure what other names they would use for them. I’m not certain about this (I have only read one dath ilan story, and it wasn’t “mad investor chaos”), but I suspect the authors would avoid redefining terms from Earth economics and game theory that already have precise meanings.
> Alice threatens Bob when Alice says that, if Bob performs some action X, then Alice will respond with action Y, where Y (a) harms Bob and (b) harms Alice. (If one wants to be “mathematical”, then one could say that each combination of actions is associated with a set of payoffs, and that “action Y harms Bob” == “[Bob’s payoff with Y] < [Bob’s payoff with not-Y]”.)
Note that the dath ilan “negotiation algorithm” arguably fits this definition of “threat”:
> If Alis and Bohob both do an equal amount of labor to gain a previously unclaimed resource worth 10 value-units, and Alis has to propose a division of the resource, and Bohob can either accept that division or say they both get nothing, and Alis proposes that Alis get 6 units and Bohob get 4 units, Bohob should accept this proposal with probability < 5/6 so Alis’s expected gain from this unfair policy is less than her gain from proposing the fair division of 5 units apiece.
Because for X = “proposes that Alis get 6 units and Bohob get 4 units” and Y = “accepting the proposal with probability < 5/6”, if Alis performs X, then Y harms both Alis and Bohob relative to not-Y (accepting the proposal with probability 1).
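A quick worked check, using the numbers from the quoted example, that Y harms both parties relative to not-Y:

```python
# Expected payoffs at the boundary probability p = 5/6 (any p < 5/6
# makes both parties strictly worse off still).
p = 5 / 6

alis_not_y, bohob_not_y = 6, 4        # accept with probability 1
alis_y, bohob_y = 6 * p, 4 * p        # expected payoffs under Y

assert alis_y < alis_not_y      # (b) Y harms Alis:  5.00 < 6
assert bohob_y < bohob_not_y    # (a) Y harms Bohob: 3.33 < 4
```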
So I’m guessing that Eliezer is using some definition of “threat” that refers to “fairness”, such that “fair” actions do not count as threats according to his definition.
By this definition any statement that sets any conditions whatsoever in the Ultimatum Game is a threat. Or indeed any statement setting conditions under which you might withdraw from otherwise mutually beneficial trade.
This is true.

I think, if there is any way to interpret such statements as not being threats, it would be of the form: “I have already made my precommitments; I’ve already altered my brain so that I assign lower payoffs (due to psychological pain or whatever) to the outcomes where I fail to carry out my threat. I’m not making a new strategic move; I’m informing you of a past strategic move.” One could argue that the game is then no longer the Ultimatum Game, since the payoffs are no longer those of the Ultimatum Game.
Of course, both sides would like to do this, and to be “first” to do it. An extreme person in this vein could say “I’ve altered my brain so that I will reject anything less than 9-1 in my favor”, and this could even be true. Two such people would be guaranteed to have a bad time if they ran into one another, and a fairly bad time if they met a dath ilani; but one could choose to be such a person.
If both sides do set up their psychology well in advance of encountering the game, then the strategic moves are effectively made simultaneously. One can then think about the game of “making your strategic move”.
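One could sketch that meta-game like this: each side simultaneously commits to a minimum acceptable share of a 10-unit pot, and incompatible commitments leave both with nothing. (The even split of any surplus is one arbitrary modeling assumption among many.)

```python
# Sketch of the meta-game of simultaneous precommitment over a
# 10-unit split. All numbers and the surplus rule are illustrative.

POT = 10

def payoffs(min_a: int, min_b: int) -> tuple[int, int]:
    if min_a + min_b > POT:
        return 0, 0                       # demands don't fit: no deal
    surplus = POT - min_a - min_b
    return min_a + surplus // 2, min_b + (surplus - surplus // 2)

print(payoffs(5, 5))  # (5, 5): two fair-minded committers
print(payoffs(9, 1))  # (9, 1): the "9-1 in my favor" brain-alterer wins
print(payoffs(9, 9))  # (0, 0): two extremists have a guaranteed bad time
```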
> Eliezer is using some definition of “threat” that refers to “fairness”, such that “fair” actions do not count as threats
This seems likely. Much of Eliezer’s fiction exhibits a lot of typical-mind fallacy, and a seemingly willful ignorance of power dynamics and of the fact that “unfair” equilibria are the obvious outcome for unaligned agents with different starting conditions.
This kind of game-theory analysis is just silly unless it includes information about who has the stronger and more visible precommitments, and what impacts the actions will have outside the game. It’s actually quite surprising how deeply CDT is assumed in such analyses (that agents can freely choose their actions at the point in the narrative where the choice happens).
It’s hardly willful ignorance; it’s a deliberate rejection. A good decision theory should, by its nature, produce results that don’t depend on visible precommitments to achieve negotiation equilibrium, since an ideal negotiating agent ought to be able to accept postcommitment to anything it would predictably have wished to precommit to. And if a decision theory doesn’t allow you to hold out for fairness in the face of an uneven power dynamic, why even have one?
While I have seen the word used in that context in some game theory, it doesn’t fit the meaning intended in the story at all. It’s almost the exact opposite.
It also doesn’t fit the use of the term in more general practice, where a great many real-life threats are not “threats” under this very different game-theoretic definition.
Hmm, do you have examples of that? If a robber holds a gun to someone’s head and says “I’ll kill you if you don’t give me your stuff”, that’s clearly a threat, and I believe it also fits the game theory definition: most robbers would have at least a mild preference to not shoot the person (if only because of the mess it creates).
In the stated terms, Alice is the robber, Bob is the victim, X is “Bob resists Alice”, Y is “Alice kills Bob and takes his stuff anyway”, and not-Y is “Alice gives up”.
It is uncontroversial that Bob is worse off under Y than not-Y, but much less certain that Alice is also worse off. If Bob resists Alice and Alice gives up, then Alice is probably going to prison for a very long time. Alice seems much better off killing Bob and taking his stuff, so this was not a “threat” under the proposed definition.
Hmm, this depends on assumptions not stated. I was thinking of the situation where Alice has broken into Bob’s house, and there are neighbors who might hear a gunshot and call the cops, and might be able to describe Alice’s getaway car and possibly its license plate. In other words, Alice shooting Bob carries nontrivial risk of getting her caught.
If we imagine the opposite, that Alice shooting Bob decreases her chance of getting caught, then, after Bob gives her his stuff, why shouldn’t Alice just shoot Bob afterward? In which case why should Bob cooperate? To incentivize Bob, Alice would have to promise that she won’t shoot him after he cooperates, rather than threaten him. (And it’s harder for an apparently-willing-to-murder-you criminal to make a credible promise than a credible threat.)
So let’s flesh out the situation I imagined. If Bob cooperates and then Alice kills him, the cops will seriously investigate the murder. If Bob cooperates and Alice leaves him tied up and able to eventually free himself, then the cops won’t bother putting so much effort into finding Alice. Then Bob can really believe that, if he cooperates, Alice won’t want to shoot him. Now we consider the case where Bob refuses; does Alice prefer to shoot him?
If she does, then we could say that, if both parties understand the situation, then Alice doesn’t need to threaten anything. She may need to explain what she wants and show him her gun, but she doesn’t need to make herself look like a madman, a hothead, or otherwise irrational; she just needs to honestly convey information.[1] And Bob will benefit from learning this information; if he were deaf or uncomprehending, then Alice would have just killed him.
Whereas if Alice would rather not shoot Bob, then her attempts to convince Bob will involve either lying to him, or visibly making herself angry or otherwise trying to commit herself to the shoot-if-resist choice. In this case, Bob does not benefit from being in a position to receive Alice’s communications; if Bob were clearly deaf / didn’t know the language / otherwise couldn’t be communicated with, then Alice wouldn’t try to enrage herself and would probably just leave. (Technically, given that Alice thinks Bob can hear her, Bob benefits from actually hearing her.)
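The two cases can be summarized as a payoff comparison on Alice’s side alone, since Bob is worse off being shot either way; here is a sketch with invented numbers:

```python
# Sketch: is "comply or I shoot" a game-theoretic threat or a warning?
# It depends only on Alice's own payoff from shooting vs. giving up.
# All payoff numbers are invented for illustration.

def classify(alice_shoot: float, alice_give_up: float) -> str:
    return "threat" if alice_shoot < alice_give_up else "warning"

# Case where Alice prefers to shoot if Bob refuses (shooting lowers
# her chance of capture): she would do it anyway, so her words are
# honest information.
print(classify(alice_shoot=-1.0, alice_give_up=-10.0))  # warning

# Case where the gunshot brings neighbors and cops, so carrying out
# the shooting hurts Alice too: now the statement is a threat.
print(classify(alice_shoot=-10.0, alice_give_up=-1.0))  # threat
```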
There is an important distinction to be made here. The question is what words to use for each case. I do think it’s reasonably common for people to understand the distinction, and, when they are making a distinction, I think they use “threat” for the second case, while the first might be called “a fact” or possibly “a warning”.
For a less violent case, consider one company telling their vendor, “If you don’t drop your prices by 5% by next month, then we’ll stop buying from you.” If that’s actually in the company’s interest—e.g. because they found a competing seller whose prices are 5% lower—then, again, the vendor is glad to know; but if the company is just trying to get a better deal and really hopes they’re not put in a position where they have to either follow through or eat their words, then this is a very different thing. I do think that common parlance would say that the latter is a threat, and the former is a (possibly friendly!) warning.
Incidentally, it’s clear that people refer to “a thing that might seriously harm X” as “a threat to X”. In the “rational psychopath” case, Alice is a threat to Bob, but her words, her line of communication with Bob, are not—they actually help Bob. In the “wannabe madman” case, Alice’s words are themselves a threat (or, technically, the fact that Alice thinks Bob is comprehending her words). Likewise, the communication (perhaps a letter) from the company that says they’ll stop buying is itself a threat in the second case and not the first. One can also say that the wannabe-madman Alice and the aggressively negotiating company are making a threat—they are creating a danger (maybe fake, but real if they do commit themselves) where none existed.
Now, despite the above arguments, it is possible that the bare word “threat” is not the best term. The relevant Wikipedia article is called “Non-credible threat”. I don’t think that’s a good name, because if Alice truly is a madman (and has a reputation for shooting people who irritated her, and she’s managed to evade capture), then, when Alice tells you to do something or she’ll shoot you, it can be very credible. I would probably say “game-theoretic threat”.
[1] Though in practice she might need to convince Bob that she, unlike most people, is willing to kill him. Pointing a gun at him would be evidence of this, but I think people would also tend to say that’s “threatening”… though waving a gun around might indeed be “trying to convince them that you’re irrational enough to carry out an irrational threat”. I dunno. In game theory, one often prefers to start with situations in which all parties are rational...
It is definitely true that the author was not using the word “threat” in this way. Some of the explicitly given examples fit this definition and were provided as examples of strategies that were not considered threats.
Though also consider: the character there is not from Earth, does not know what words Earth people use, they are communicating via a translation system that is known to be imprecise, the target language is also not from Earth, and that language is already known to be missing words for many simple game theory concepts. The use of the word “threat” in the text is definitely not to be taken as having exactly the same meaning as it does in Earth game theory.
By this definition, the statement “I will not give in to threats” is a threat itself...