Isn’t the default just what would happen if the other person never communicated with you?
But they did communicate with you, as a result of a somewhat deterministic decision process, and not by random choice. How should you reason about this counterfactual? Why doesn’t the “false” assumption of their never communicating with you imply that the Moon is made out of cheese?
People engage in this kind of counterfactual reasoning all the time without declaring the moon to be made of cheese; I’m not sure why you’re questioning it here. If it makes it any easier, think of it as being about the change in expected value immediately after the communication vs. the expected value immediately before the communication—in other words, whether the communication is a positive or negative surprise.
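As a minimal numerical sketch (all numbers made up): the “surprise” is just the change in expected value from immediately before the communication to immediately after it.

```python
# Made-up numbers: a 10% prior chance the message was going to be a threat.
ev_before = 0.9 * 0 + 0.1 * (-50)   # expected value just before reading the message
ev_after = -50                      # value of the situation given that it was a threat
surprise = ev_after - ev_before
print(surprise)                     # -45.0 -> a negative surprise
```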
Indeed. How do they manage that? That’s one fascinating question.
I think they have an underlying premise that they will believe whatever is necessary to make their lives better, or at least not worse.
Their beliefs about what’s better and worse may never be examined, so some of their actions may be obviously sub-optimal. However, they won’t fall into thinking that one contradiction means they’re obligated to believe every piece of obvious nonsense.
When reasoning about counterfactuals a good principle is never to reach for a more distant* world than necessary.
*(less similar)
If you were to simulate the universe as it was before they contacted you, and make 1 single alteration (tapping their brain so they decide not to contact you) would the simulation’s moon be made of green cheese?
That universe is pretty much the closest possible universe to ours where they don’t contact you.
Why should merely similar worlds be relevant at all? There could be ways of approximately reasoning about the complicated definition of the actual world you care about, but actually normatively caring about worlds that you know not to be actual (rather than the one you actually care about) is a contradiction in terms.
Why should merely similar worlds be relevant at all?
You asked how to reason about counterfactuals.
I answered.
I’m not sure what you’re asking now. Could you please clarify?
The reason I think about counterfactuals is to understand cause and effect. If you change something, then anything to which it is a cause must also change.
But things (such as whether the moon is made of green cheese) which AREN’T caused by that something would not change.
You answered informally. It’s easy, and not what I wondered about.
The reason I think about counterfactuals is to understand cause and effect. If you change something, then anything to which it is a cause must also change.
Things don’t change. When you make a decision, you are not changing the future, you are deciding the future. The future is what it is given your actual decision, all else is fantasy (logically inconsistent even, because the structure of your own mind implies only one decision, when you ignore some unimportant cognitive noise), perhaps morally important fantasy whose structure we ought to understand, but still not the reality.
What did you wonder about? You seemed to be wondering why you shouldn’t just go “Well, if they’d done that, the moon would have been made of cheese”.
If you can’t think about counterfactuals such as “What will happen if I do X?” “What will happen if I do Y?” etc., you can’t make rational decisions.
You may wish to dismiss counterfactuals as fantasy, but does doing so help you come to good decisions? Or does it hinder you?
The goal is not to dismiss counterfactuals, but to understand where they come from, and how they are relevant for reasoning about the actual world. Distinguish inability to reason informally from lack of formal understanding of the structure of that informal reasoning.
They are a mode of thought. They come from the thinker.
They allow you to look at cause and effect. Without counterfactuals, you can’t reason about cause and effect, you can only reason about correlation.
Taboo cause, effect, taboo counterfactuals. That something is, doesn’t answer why it’s normatively useful (“they come from the thinker”).
Okay: Thinking about how things would differ now, or in the future, based on a slightly modified version of the past, allows us to accurately consider what the world could be like, in the future, based on our options in the present.
That something is, doesn’t answer why it’s normatively useful (“they come from the thinker”).
You said you sought to understand where they came from. That they come from the thinker is an answer to that. I answered how they’re relevant in the second part of the post (and the first part of this one).
You ignore these points and repeat something contradictory to them, which is wrong in a debate even if you don’t accept them. You (or I) need to find another path, and not rehash the same ground.
Okay, I’ll go point to point, and try and understand what you meant in that post, that you think I’m ignoring.
Things don’t change.
This is simply false, as a statement, so I won’t treat it on its own.
When you make a decision, you are not changing the future, you are deciding the future.
This is fine. Sure. My post works fine within such a structure.
The future is what it is given your actual decision,
True. But making choices requires that one accept that one doesn’t know what the future is, nor does one know what one’s decision will be. It requires the use of “if… then” thoughts, or counterfactuals.
So, nope, not ignored, just irrelevant.
all else is fantasy
Emotional dismissal, not an actual point.
(logically inconsistent even, because the structure of your own mind implies only one decision, when you ignore some unimportant cognitive noise)
A good counterfactual should be logically consistent. It isn’t the real world, but the real world isn’t the only logically consistent possible world.
Perhaps you’re making the same mistake as you made with the term “logically impossible” earlier?
perhaps morally important fantasy whose structure we ought to understand, but still not the reality.
Dismissal, not an actual point.
EDIT: So, which of those are you claiming I contradicted exactly?
This is simply false, as a statement, so I won’t treat it on its own.
It has an intended interpretation that isn’t false, which I referred to in the following statements which you’ve accepted. (It’s more of a summary than a separate point.)
The future is what it is given your actual decision,
True. But making choices requires that one accept that one doesn’t know what the future is, nor does one know what one’s decision will be. It requires the use of “if… then” thoughts, or counterfactuals.
Yes. If there is a (logical) fact about what your actual decision is, say it’s actually A, and you are uncertain about what it’ll be, then the assumption A=B is logically false, inconsistent, even if you don’t know that it is. When you reason about what happens if A=B, not knowing that it’s a false statement, you are reasoning from a false premise, and everything logically follows from a false premise. This is the relevance of this description.
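To make the explosion point concrete, here is a minimal sketch in Lean (the proposition name is purely illustrative): from the premise A = B together with the fact A ≠ B, any proposition whatsoever follows.

```lean
-- A minimal sketch of "everything follows from a false premise": given both
-- A = B (the assumed decision) and A ≠ B (the actual fact), any proposition
-- can be derived, including an arbitrary "moon_is_cheese".
example (A B : Nat) (hne : A ≠ B) (h : A = B) (moon_is_cheese : Prop) :
    moon_is_cheese :=
  absurd h hne
```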
all else is fantasy
Emotional dismissal, not an actual point.
Not emotional. What else is there? There is reality, and then all the thoughts you can have to reason about reality.
A good counterfactual should be logically consistent. It isn’t the real world, but the real world isn’t the only logically consistent possible world.
If there is a fact of the matter about what your action is, then assuming a possible action that is not actual is logically inconsistent. This is normal. If you are considering something that is not the real world, you need to explain what relation it has to the real world, how this particular not-real-world is different from all the other not-real-worlds, and what this not-real-world actually is, especially if it’s inconsistent. But even if it’s consistent, there is still the same question of what privileges it, since it’s not the real world, and the real world is what you want to reason about.
perhaps morally important fantasy whose structure we ought to understand, but still not the reality.
Dismissal, not an actual point.
It’s a point that it’s unclear what relation there is between counterfactuals and reality, given that counterfactuals are usually not the reality.
EDIT: So, which of those are you claiming I contradicted exactly?
You referred to “slightly modified version of the past”, and modifying things is starting to consider things other than reality, where it becomes unclear how considering those not-real things helps to understand reality. (Uncertainty is a much better concept than change in this context.) I would further qualify that you can’t change your notion of reality without moving away from your original notion of reality, and thus conceptualizing something other than reality, when what you wish to understand is reality, and not this not-reality you’ve constructed by modifying the concept.
Yes. If there is a fact about what your actual decision is, say it’s actually A, and you are uncertain about what it’ll be, then the assumption A=B is logically false, inconsistent, even if you don’t know that it is.
Where is the inconsistency?
If you assume both the actual action (A) and the possible, but not actual, action (B), you have an inconsistency.
But if you assume only B; i.e. you assume a world as similar to this one as possible where you are such that you will choose action B, then it is perfectly consistent.
When you reason about what happens if A=B, not knowing that it’s a false statement, you are reasoning from a false premise, and everything logically follows from a false premise. This is the relevance of this description.
But, as I have already explained, reasoning about counterfactuals is not, in fact, reasoning in such a way. The statement “If I am in front of you, I am punching you in the face” is true. The statement “If I were in front of you, I would be punching you in the face” is a different, false, statement.
Similarly, “If I did not post this comment, you are a pizza” is true, but “If I had not posted this comment, you would be a pizza” is false.
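To spell that difference out, here is a minimal sketch (toy semantics with made-up world descriptions, not a full analysis of counterfactuals):

```python
# The material conditional "if P then Q" is vacuously true whenever P is false,
# which is why "If I did not post this comment, you are a pizza" comes out true.
def material_conditional(p: bool, q: bool) -> bool:
    return (not p) or q

posted_comment = True     # what actually happened
you_are_a_pizza = False

print(material_conditional(not posted_comment, you_are_a_pizza))  # True

# The counterfactual conditional instead asks about the consequent in the
# closest world where the antecedent holds: deleting the comment does not
# turn anyone into a pizza, so it comes out false.
closest_world = {"posted_comment": False, "you_are_a_pizza": False}
print(closest_world["you_are_a_pizza"])  # False
```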
Until you are willing to grapple with your confusion on this issue, I don’t see this conversation being productive.
If you assume both the actual action (A) and the possible, but not actual, action (B), you have an inconsistency.
Yes.
But if you assume only B; i.e. you assume a world as similar to this one as possible where you are such that you will choose action B, then it is perfectly consistent.
You need to show that this other “similar” world has any relevance to reasoning about the actual world. You are not justified in considering a problem statement that stars a non-actual world unless you explain how that helps in reasoning about the actual world.
Of course we know intuitively that that’s how it works, but it’s not clear why, and “just so” doesn’t help in understanding this mystery. Also, it’s not clear how to generally construct those counterfactuals, even if we leave the question of their relevance aside. Where is the “you” that ought to be replaced in the environment? What about the thoughts about you in other people’s minds, should they be modified as well? If not, you run into the pitfalls of CDT.
Until you are willing to grapple with your confusion on this issue, I don’t see this conversation being productive.
I’m trying to argue that you should be confused, just as I am confused. Notice your own confusion and all.
You need to show that this other “similar” world has any relevance to reasoning about the actual world.
It allows you to think about cause and effect, and it is a necessity in making rational choices. You cannot make a rational choice without thinking through the consequences of different (counterfactual) possible choices.
You are not justified in considering a problem statement that stars a non-actual world unless you explain how that helps in reasoning about the actual world.
In reasoning with counterfactuals, the counterfactual worlds aren’t “stars” and they certainly aren’t “problem statements”. They’re tools for thinking about the world and making choices.
but it’s not clear why,
You keep on making it less clear for yourself, by bringing in things like the principle of explosion, which is irrelevant.
Also, it’s not clear how to generally construct those counterfactuals, even if we leave the question of their relevance aside.
I’ve explained a general method for constructing the counterfactuals already. You assume the smallest possible divergence that could lead to the specific divergence you’re interested in.
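As a minimal sketch of that recipe (the worlds and facts below are made up for illustration): among candidate worlds satisfying the divergence you care about, pick the one that differs from the actual world in the fewest facts.

```python
# Made-up descriptions of the actual world and of two candidate worlds.
ACTUAL = {"they_contact_you": True, "moon_is_cheese": False, "sun_rises": True}

CANDIDATES = [
    {"they_contact_you": False, "moon_is_cheese": False, "sun_rises": True},
    {"they_contact_you": False, "moon_is_cheese": True,  "sun_rises": True},
]

def distance(world: dict) -> int:
    """Number of facts on which a candidate world differs from the actual one."""
    return sum(world[k] != ACTUAL[k] for k in ACTUAL)

def closest(antecedent: str, value: bool) -> dict:
    """The smallest-divergence world in which the antecedent holds."""
    matching = [w for w in CANDIDATES if w[antecedent] == value]
    return min(matching, key=distance)

# The closest world where they don't contact you keeps the moon non-cheese:
print(closest("they_contact_you", False)["moon_is_cheese"])  # False
```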
Where is the “you” that ought to be replaced in the environment?
That’s a problem with personal identity, not counterfactuals. Sure, it’s an important problem, but adding more confusions to this discussion will not help you to understand counterfactuals.
What about the thoughts about you in other people’s minds, should they be modified as well?
To the extent that they were caused by the properties you had at the time.
I’m trying to argue that you should be confused, just as I am confused. Notice your own confusion and all.
But all the reasons you’ve given for your confusion seem to be trivially irrelevant, or incorrect. The reasons for your confusion seem to be confusions, not good reasons.
Curious. I was arguing the motivation for the study of TDT/UDT/ADT, without betraying any knowledge about results that are already known. And you’ve managed to rule out all combinations of confusions these theories are intended to resolve as being irrelevant. The general pattern I see here is that any individual question has an “obvious” intuitive answer, especially if you don’t go into detail, and you refuse to either consider multiple questions at once (since they are “unrelated”), or to go deeply enough into each of them individually (since if you assume the intuitive understanding of the other questions, they provide strong enough support for not being confused about this one too).
In other words, you are trapped in the net of intuitive understanding of multiple concepts that help in understanding each other, and are comfortable with this level of understanding, which makes any attempt to look deeper into their nature preposterous to you.
You were arguing from a position whereby you couldn’t tell the difference between the statements “If I had precommitted to not give in to blackmail, I wouldn’t have been blackmailed” and “If I had precommitted to call upon the FSM for help, the FSM would exist”.
That is a very confused position. I have since explained the difference between those things.
Every confusion you have actually brought to the fore, I have clarified; with the exception of the confusion of “what makes personal identity”, because that wasn’t the topic at hand. It’s a big, complicated, and separable issue. And yes, the personal identity issue leads to some changes in decision theory. But we’re not talking about decision theory at the moment.
If you think I haven’t clarified one of your confusions, please point it out? Because, honestly, you seem to be just plain ignoring any attempts at clarification.
Because, honestly, you seem to be just plain ignoring any attempts at clarification.
Informal, intuitive attempts at clarification. Attempts at clarification that don’t give a deep understanding of what’s going on, which is the standard of understanding I was aiming at, in particular by refusing to accept less formal explanations.
So you say, but you don’t point out any difficulty with them.
You simply dismiss them like that.
Your confusions are faux-logical (talking about worlds being logically impossible, when they’re not; talking about the principle of explosion, when it doesn’t apply); if you want a thorough clarification, give a thorough problem.
I am not sure that I am correct. But there seems to be another possibility.
If we assume that the world is a model of some formal theory, then counterfactuals are models of different formal theories, whose models have finite isomorphic subsets (reality accessible to the agent before it makes a decision).
Thus counterfactuals aren’t inconsistent, as they use different formal theories, and they are important because the agent cannot decide which one applies to the world before it makes a decision.
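A minimal sketch of what I mean, with made-up “theories” represented as assignments of truth values: the two extensions agree on the finite fragment available before the decision and only disagree beyond it, so neither is inconsistent on its own.

```python
# The finite fragment of reality accessible to the agent before deciding.
shared_past = {"observation_1": True, "observation_2": False}

# Two "theories" extending the same fragment with different decisions.
theory_A = {**shared_past, "my_decision": "A"}   # the one that turns out actual
theory_B = {**shared_past, "my_decision": "B"}   # the counterfactual one

# Each extension is consistent on its own; they only disagree outside the
# shared (isomorphic) fragment the agent can see before deciding.
assert all(theory_A[k] == theory_B[k] for k in shared_past)
print(theory_A, theory_B)
```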
I’m getting this more clearly figured out. In the language of ambient control, we have:
You-program, Mailer-program, World-program, Your utility, Mailer utility
“Mailer” here doesn’t mean anything. Anyone could be a mailer.
It is simpler with one mailer but this can be extended to a multiple-mailer situation.
We write your utility as a function of your actions and the mailer’s actions based on ambient control. This allows us to consider what would happen if you changed one action and left everything else constant. If you would have a lower utility, we define this to be a “sacrificial action”.
A “policy” is a strategy in which one plays a sacrificial action in a certain class of situation.
A “workable policy” is a policy where playing it will induce the mailer to model you as an agent that plays that policy for a significant proportion of the times you play together, either for:
causal reasons—they see you play the policy and deduce you will probably continue to play it, or they see you not play it and deduce that you probably won’t
acausal reasons—they accurately model you and predict that you will/won’t use the policy.
A “beneficial workable policy” is when this modeling will increase your utility.
Depending on the costs/benefits, a beneficial workable policy could be rational or irrational, determined using normal decision theory. The name people use for it is unrelated—people have given in to and stood up against blackmail, they have given in to and stood up against terrorism, they have helped those who helped them or not helped them.
Not responding to blackmail is a specific kind of policy that is frequently, when dealing with humans, workable. It deals with a conceptual category that humans create without fundamental decision-theoretic relevance.
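To illustrate these definitions, here is a toy sketch with made-up payoffs (not a worked-out decision theory): it just shows “refuse” coming out as a sacrificial action, and the policy of refusing being beneficial when being modelled as a refuser deters the blackmail in the first place.

```python
# Utilities are (yours, blackmailer's); all numbers are illustrative only.
PAYOFFS = {
    ("blackmail", "pay"):        (-1,  1),
    ("blackmail", "refuse"):     (-5, -2),   # threat carried out: bad for both
    ("no_blackmail", "pay"):     ( 0,  0),   # "pay" is moot if never threatened
    ("no_blackmail", "refuse"):  ( 0,  0),
}

def your_utility(their_action: str, your_action: str) -> int:
    return PAYOFFS[(their_action, your_action)][0]

# Holding the blackmailer's action fixed at "blackmail", refusing lowers your
# utility (-5 < -1), so "refuse" is a sacrificial action in the sense above.
sacrificial = your_utility("blackmail", "refuse") < your_utility("blackmail", "pay")

# The policy "always refuse" is beneficial if being modelled as a refuser makes
# blackmail not worth attempting, so you mostly face "no_blackmail" instead:
utility_if_modelled_as_refuser = your_utility("no_blackmail", "refuse")   # 0
utility_if_modelled_as_payer   = your_utility("blackmail", "pay")         # -1
print(sacrificial, utility_if_modelled_as_refuser > utility_if_modelled_as_payer)
```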
We write your utility as a function of your actions and the mailer’s actions based on ambient control. This allows us to consider what would happen if you changed one action and left everything else constant.
It doesn’t (at least not by varying one argument of that function), because of explicit dependence bias (this time I’m certain of it). Your action can acausally control the other agent’s action, so if you only resolve uncertainty about the parameter of the utility function that corresponds to your action, you are being logically rude by not taking into account possible inferences about the other agent’s actions (the same way as CDT is logically rude in only considering the inferences that align with the definition of physical causality). From this, “sacrificial action” is not well-defined.
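A minimal sketch of the objection, with a made-up predictor model: if the other agent’s action is recomputed from (a prediction of) your policy rather than held fixed, the action that looked “sacrificial” under one-argument variation can come out ahead.

```python
def blackmailer_action(predicted_your_policy: str) -> str:
    # An accurate predictor only bothers to blackmail someone predicted to pay.
    return "blackmail" if predicted_your_policy == "pay" else "no_blackmail"

def your_utility(their_action: str, your_action: str) -> int:
    payoffs = {("blackmail", "pay"): -1, ("blackmail", "refuse"): -5,
               ("no_blackmail", "pay"): 0, ("no_blackmail", "refuse"): 0}
    return payoffs[(their_action, your_action)]

# CDT-style variation: hold "blackmail" fixed and compare your two actions.
cdt_view = {a: your_utility("blackmail", a) for a in ("pay", "refuse")}

# Joint resolution: the blackmailer's action is recomputed from your policy.
joint_view = {a: your_utility(blackmailer_action(a), a) for a in ("pay", "refuse")}

print(cdt_view)    # {'pay': -1, 'refuse': -5}  -> "refuse" looks sacrificial
print(joint_view)  # {'pay': -1, 'refuse': 0}   -> "refuse" comes out ahead
```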
I think you’re mostly right. This suggests that a better policy than ‘don’t respond to blackmail’ is ‘don’t respond to blackmail if and only if you believe the blackmailer to be someone who is capable of accurately modelling you’.
Unfortunately this only works if you have perfect knowledge of blackmailers and cannot be fooled by one who pretends to be less intelligent than they actually are.
This also suggests a possible meta-strategy for blackmailers, namely “don’t allow considerations of whether someone will pay to affect your decision of whether to blackmail them”, since if blackmailers were known to do this then “don’t pay blackmailers” would no longer work.
I would also suggest that while blackmail works with some agents and not others, it isn’t human-specific. For example, poison arrow frogs seem like a good example of evolution using a similar strategy, having an adaptation that is in no way directly beneficial (and presumably is at least a little costly) that exists purely to minimize the utility of animals which do not do what it wants.
Unfortunately this only works if you have perfect knowledge of blackmailers and cannot be fooled by one who pretends to be less intelligent than they actually are.
Not perfect knowledge, just some knowledge together with awareness that you can’t reason from it in certain otherwise applicable heuristic ways because of the incentives to deceive.
Yes, that’s what I meant. I have a bad habit of saying ‘perfect knowledge’ where I mean ‘enough knowledge’.
Can I take it that since you criticized a criticism of this hypothesis without offering a criticism of your own, that you believe that this hypothesis is correct?
What hypothesis?
My comment was entirely local, targeting a popular argument that demands perfect knowledge where any knowledge would suffice, similarly to the rhetorical device of demanding absolute certainty where you were already presented with plenty of evidence.
It’s evidence that you have seen the comment that he’s replying to, in which I lay out my hypothesis for the answer to your original question. (You’ve provided an answer which seems incomplete.)