So, am I the only one perplexed by why people care about Newcomb’s Problem? As with most paradoxes, the confusion is entirely due to posing the problem in a confusing way; clean things up, and it becomes obvious. But it strikes me as difficult to get an explanation that’s less than a page long out there and taken seriously.
In addition to what the others have said, the class of “Newcomblike problems” does map to real-world scenarios. I do agree that insufficient effort has been spent describing such situations, though, which is why I’m compiling examples for a possible article. Here’s a peek at what I have so far:
The decision of whether to shoplift is a real-life Newcomb’s problem. It is easy to get away with, and your decision does not cause (in the technical sense) the opportunity to exist (the “box” to be “filled”). However, merchants only locate stores (“fill the box”) where they predict people won’t (in sufficient numbers) take this opportunity, and their accuracy is high enough for retailing to stay profitable in the aggregate.
Evolution and its “genetic” decision theories: You could just be selfish and not spend resources spreading your genes (thus like stiffing the rescuer in Parfit’s Hitchhiker); however, you would not be in the position to make such a choice unless you were already selected for your propensity not to make such a choice. (My article on the matter.)
Hazing, akrasia, and abuse cycles (where being abused motivates one to abuse others) are real-life examples of Counterfactual Mugging, since your decision within a “losing branch” has implications for (symmetric) versions of yourself in other branches.
Expensive punishment. Should you punish a criminal when the cost to do so exceeds the value of all future crimes they could ever commit? If you don’t, you save on the costs of administering the punishment, but if criminals expect that this is sufficient reason for you not to punish, they have no reason not to commit the crimes. The situation is parallel to the question of whether you should pay ransoms or give in to other forms of extortion. (This has a non-obvious connection to Newcomb’s problem that may require explanation—but I elaborate in the link.)
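(To make the last example concrete, a minimal sketch with made-up numbers; the payoff sizes and the penalty are purely hypothetical. The point is only that the after-the-fact causal calculation and the ex-ante, prediction-aware calculation come apart.)

    # Toy numbers, purely hypothetical:
    punish_cost = 100   # cost of administering the punishment
    crime_harm = 60     # harm the crime does to you
    crime_gain = 30     # what the criminal stands to gain
    penalty = 50        # what punishment costs the criminal

    def criminal_commits_crime(you_would_punish):
        # The criminal predicts your disposition and commits the crime
        # only if it is still profitable for them.
        expected_penalty = penalty if you_would_punish else 0
        return crime_gain - expected_penalty > 0

    for disposition in (False, True):
        crime = criminal_commits_crime(disposition)
        your_loss = (crime_harm + (punish_cost if disposition else 0)) if crime else 0
        print("disposition to punish:", disposition,
              "| crime committed:", crime, "| your loss:", your_loss)
    # The non-punisher suffers the crime (loss 60); the predictable punisher
    # never has to pay the punishment cost at all (loss 0), even though,
    # once a crime has happened, punishing looks like a pure net cost.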
I think that Newcomb’s Problem is a terrible central example to work off of, though. Most of those look like they can be instrumentalized by reputation far better, and then everyone gets the right answers.
I don’t think that works. Purely causal calculations of the costs/benefits, even accounting for reputation, can’t explain the winning answers in any of those cases except maybe expensive punishment. And even then, you can just use the harder version of the problem I gave in the linked discussion: what if, even accounting for future impact on the criminal and others who are deterred, the punishment still has a net cost?
Could you give me an idea of what you mean by e.g. a causal account of why:
shoplifting is easy,
most people don’t shoplift,
most merchants accurately predict shoplifting rates,
each person is made strictly better off by their local decision to shoplift?
It’s true that people usually give the winning answers to these problems (compared to what is possible), and without using TDT/UDT / Drescher’s decision theory. But that doesn’t answer the problem of finding a rigorous grounding for why they should do so.
Could you give me an idea of what you mean by e.g. a causal account of why:
People who don’t shoplift would lose more identity by shoplifting than they would gain in stolen product.
That means I disagree with the claim that each person is made strictly better off by their local decision to shoplift. If they were actually made better off, they would shoplift. Actions reveal preferences.
This is the issue I got into in the Parfitian filter article I wrote. (And later in some exchanges with Perplexed.)
Basically, the problem with your second paragraph is that actions do not uniquely determine preferences. (See in particular the a/b theory comparisons in the article.) There are an infinite number of preference sets—not to mention preference/belief sets—that can explain any given action. So, you have to use a few more constraints to explain behavior.
That, in turn, leads you to the question of whether an agent is pursuing a terminal value, or an instrumental value in the belief that it will satisfy a terminal value. And that’s also what makes it hard to say in what sense a shoplifter makes himself better off—does he satisfy a terminal value? Believe he’s satisfying an instrumental value? Correctly or incorrectly?
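(To make the indeterminacy concrete, a minimal sketch with made-up numbers; the dollar figures and probabilities are purely illustrative.)

    # One observed action: "did not shoplift a $20 item."
    # Two very different preference/belief sets are consistent with it.

    # (a) Purely monetary preferences, plus a belief in a 10% chance of being
    #     caught and fined $500:
    ev_a = 0.9 * 20 - 0.1 * 500   # = -32.0, so agent (a) refrains

    # (b) Certainty of getting away with it, but a terminal disvalue on
    #     stealing that the agent weighs at $25:
    ev_b = 20 - 25                # = -5, so agent (b) also refrains

    print(ev_a, ev_b)
    # The same action comes out of different preferences and beliefs, so the
    # action alone can't tell you in what sense the agent "would be better
    # off" shoplifting.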
However, I don’t know of a concise way to point to the (purported) benefits of shoplifting.
So we’re left with a number of hypotheses: it could be that people overestimate the risks of shoplifting. Or that they never consider it. Or that they have a more complex way of evaluating the benefits (which your “identity loss” approach is a good, insightful example of).
So, there’s more to it than a simple action → preference mapping, but likewise, there’s more to the decision theory than these “local monetary gains”. Regardless, we have a case where people are doing the equivalent of repeatedly playing Newcomb’s problem and one-boxing “even though the box is already filled or not”, and it would be interesting to look at the mechanisms at play in such a real-life situation.
It’s true that people usually give the winning answers to these problems …
I think that the source of the communication problem is that some people use the word “winning” for a result that other people would not characterize as a “win”.
It is also confusing that you ask for a causal account of something and then shift to a paragraph talking about normative theories and “winning”. I suppose it is possible to give an evolutionary account which incorporates both the normative and the causal—is this what you are asking for? Are you asking for an argument that not shoplifting is an ESS? I don’t think that one is possible in a society that eschews “punishment”.
Lots of separate issues getting tangled up here. Let me try to clarify what I mean:
1) I meant a win in the sense that, in the aggregate, people’s shoplifting decisions lead them to have opportunities that they would not have if they calculated the optimality of shoplifting as causal decision theorists. There is certainly a (corresponding, dual) sense in which people don’t win—specifically, the case where their recognition of certain rights is so lacking that they don’t actually ever get the opportunity to shoplift—or even buy—certain goods in the first place. These are the stores and business models that never come into existence and leave us in a Pareto-inferior position. (IP recognition, I’m looking in your general direction here.)
2) When I asked for a causal account above, what I meant was, “How do you explain, assuming everyone uses CDT, why most people don’t shoplift, given the constraints I listed?” That is, what CDT-type reasoning tells you not to shoplift when it’s trivial to get away with it?
3) I claim that it is possible—in fact, necessary—to give an evolutionary account of why people don’t act purely as causal decision theorists (and it’s not particularly important what you call the non-causal motivations behind their decisions), since people demonstrably differ from CDT. (My Parfitian filter article was an attempt, citing Drescher, to account for these non-causal, “moral” components of human reasoning through natural selection.)
4) However, I don’t think the issues of ESSes and shoplifting are necessarily connected in the sense that you have to explain the (absence of) the latter as the former. That said, I believe the opportunity to shoplift is a real-world example of Newcomb’s problem, in which people (do the analogue of) one-box, even though it’s certainly not because of TDT-type reasoning. This raises the question of why people use a decision theory that gives the same results as TDT would on a “contrived” problem.
How do you explain, assuming everyone uses CDT, why most people don’t shoplift, given the constraints I listed?
But that is an absurd request for explanation, because you are demanding that two false statements be accepted as hypotheses:
That shoplifting is risk-free.
That everyone adheres to a particular normative decision theory.
As to the definition of “winning”, I sense that there is still a failure to communicate here. Are you talking about winning individuals or winning societies? As I see it, given your unrealistic hypotheses, the winning strategy is to shoplift, but convince other people that they win by not shoplifting. The losing strategy seems to be the one you advocate—which is apparently to refrain from shoplifting, while encouraging others to shoplift by denying the efficacy of punishment.
But that is an absurd request for explanation, because you are demanding that two false statements be accepted as hypotheses:
That shoplifting is risk-free.
That everyone adheres to a particular normative decision theory.
No, I’m showing that they can’t both be true. (Btw, what does “normative” add to your meaning here?) (1) is false, but easily close enough to the truth for our purposes.
As I see it, given your unrealistic hypotheses, the winning strategy is to shoplift, but convince other people that they win by not shoplifting.
Hence the parallel to Newcomb’s problem, where the “winning” strategy is to two-box but convince Omega you’ll one-box, and hence the tension over whether the “individual” or the “society” perspective is correct here.
If you would deem it optimal to shoplift, worse stores are available in the first place, just as if you would deem it optimal to two-box, emptier boxes are available in the first place.
Something motivates people not to shoplift given present conditions, which is isomorphic to one-boxing “given” that Omega has left. So, I claim, it’s a real-life case of people consistently one-boxing. A world (or at least, community) in which people deem it more optimal to shoplift has different (and worse) opportunities than one in which people do not. Their decisions “in the moment” are not unrelated to what kind of community they are in to begin with.
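(As a sketch of that isomorphism in numbers, assuming the standard $1,000/$1,000,000 payoffs; the 0.9 predictor accuracy is an arbitrary assumption.)

    def expected_value(one_box, p=0.9):
        # p is the assumed accuracy of the predictor.
        if one_box:
            # Predicted correctly with probability p: the big box is full.
            return p * 1_000_000 + (1 - p) * 0
        else:
            # Predicted correctly with probability p: the big box is empty.
            return p * 1_000 + (1 - p) * 1_001_000

    print(expected_value(True))    # about 900,000
    print(expected_value(False))   # about 101,000
    # The two-boxer is right that, for any fixed box contents, two-boxing is
    # $1,000 better; but agents disposed to one-box systematically face full
    # boxes, just as communities of people disposed not to shoplift face
    # better-stocked stores.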
(1) is false, but easily close enough to the truth for our purposes.
I suspect that this claim is at the heart of the dispute. I think that it is far from close enough to the truth.
The reason people don’t shoplift is that they fear the consequences. There is no mystery to be explained. Except perhaps why people are sufficiently motivated by fear of being temporarily physically constrained by a store-owner or security guard and publicly shamed (typical punishment for a first offense).
Btw, what does “normative” add to your meaning here?
It serves to emphasize the type-error that I see in your request. You seem to be criticizing one normative theory (CDT) while promoting another (UDT/TDT). But you are doing so by asking whether the normative theory is satisfactory when used as a descriptive theory. And you are asking that it function descriptively in a fictitious universe in which shoplifters are rarely caught and mildly punished.
I agreed there is no need to invoke TDT/UDT to explain lack of shoplifting.
In addition to what Perplexed said, it seems to me that people tend to care more about their reputation than is evolutionarily adaptive today, probably because in our EEA, you couldn’t move to a new city and start over (or if you could move to another tribe, it was only at an extremely high cost), nor did you interact mostly with strangers. That would explain why people are sometimes deterred or motivated by reputation/shame when it doesn’t seem to make sense to be, without having to invoke TDT/UDT.
[“Shoplifting is risk-free”] is false, but easily close enough to the truth for our purposes.
I don’t think so. Shoplifting is more or less risky in different places and situations. I bet that the amount of shoplifting is monotone decreasing in the amount of risk, even when that risk is relatively small. If that’s true, then “why don’t people shoplift more” doesn’t require an explanation beyond “because they don’t want to take the risk.” Do you disagree?
Yes, I disagree. Keep in mind, there is a very wide variety of protection a store can have for its goods. That depends on the value of the goods, but also on the “kind of person” that exists in the area, and the latter factor is crucial to understanding the dynamic I’m trying to highlight.
For the same goods, a store will have more security measures in areas where the “kind of people” (decision-theory types) tend to steal more. But the population is never uniform. So, although the security measures account for some percentage of prevented shoplifting (and thus can be explained purely through causal consequences to shoplifters), there remains the group that differs from the typical person in the area. This group must stay sufficiently small for the store to stay profitable.
Therefore, the store is relying on a certain fraction of the population refraining from shoplifting even when they could get away with it.
But even if shoplifting is really kept low because of (mistaken) beliefs about its difficulty, that still doesn’t eliminate the newcomblike aspect. You still have to account for why this epistemic error happens in just the right way so as to increase total utility. And the explanation for that looks similar to the evolution case I discussed at the beginning of this subthread, but with memes replacing genes: basically, regions with “better norms” or “more systematic overestimation of shoplifting’s difficulty” will tend to flourish and outcompete those that don’t. Economic competition, then, acts as a sort of “Parfitian filter” in the same sense that evolution does.
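(A minimal simulation sketch of that filter; the margin, closure rule, and rates below are made up solely to illustrate the selection effect.)

    import random

    def surviving_stores(shoplift_rate, n_stores=100, margin=0.03, rounds=20):
        # Each round, each store closes with a probability equal to how far
        # local shrinkage exceeds what its margin can absorb (toy rule).
        stores = n_stores
        for _ in range(rounds):
            closure_prob = max(0.0, shoplift_rate - margin)
            stores -= sum(1 for _ in range(stores) if random.random() < closure_prob)
        return stores

    random.seed(0)
    print(surviving_stores(0.02))   # low-theft region: keeps essentially all stores
    print(surviving_stores(0.10))   # high-theft region: loses most of them
    # Whatever keeps the rate low in a region (norms, or systematic
    # overestimation of the risk), regions where that mechanism operates end
    # up with more opportunities, so the mechanism itself is selected for.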
It’s not interesting as a brain teaser but as a test case for decision theories. Newcomb’s is especially interesting because Newcomb’s-winning agents have the potential to reach Pareto efficient outcomes without needing precommitments or other outside help.
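(One way to see the Pareto point concretely: a sketch using standard Prisoner’s Dilemma payoffs, assuming the two agents know they are running the same decision procedure; the payoff numbers are conventional but arbitrary.)

    # Row player's payoff for each (my move, their move) pair:
    payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

    # A CDT-style agent holds the other's move fixed and notices D dominates,
    # so two such agents land on (D, D) and get 1 each.
    # An agent that knows its opponent runs the same deterministic procedure
    # treats "my output" and "their output" as a single variable:
    def twin_choice():
        return "C" if payoff[("C", "C")] > payoff[("D", "D")] else "D"

    move = twin_choice()
    print(move, payoff[(move, move)])   # C 3 -- the Pareto-efficient outcome,
                                        # reached without any precommitment device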
I think the confusion comes from the difference in importance between winning conflicts and making the correct decision. Many people who think about this problem go “ah, doing X is the obvious solution.” When asked to be more formal, they come up with decision theories. Other people then explore those theories and find their flaws. Newcomb’s problem is important because it led (maybe not directly, but I think it contributed) to the schism into evidential decision theory and causal decision theory. Both have different approaches to solving problems.
Newcomb’s problem is important because it led (maybe not directly, but I think it contributed) to the schism into evidential decision theory and causal decision theory.
As far as I can tell, that’s because the causal decision theorists are crippled by using magicless thinking in a magical problem. The only outcome is “huh, people who use all the information provided by a problem do better than people who ignore some of the information!” As schisms go, that seems pretty tame.
That does make it clearer why I’m a 0-boxer and uninterested in it, and suggests I should refrain from approaching it on a level as intense as Eliezer’s paper until I am interested in formality, since a correct one-page explanation is unlikely to be formal and the reason the problem is interesting lies in its formality.
Basically, the problem with your second paragraph is that actions do not uniquely determine preferences.
Why is this a problem? [edit] To be clearer, I get why actions do not uniquely determine preferences, but I don’t yet get why I should care.
Sorry for the thread necromancy, but this has an easy answer: read the rest of my comment, after the part you quoted.
So, am I the only one perplexed by why people care about Newcomb’s Problem?
Newcomb’s puzzle is an idealization of real-life puzzles like Parfit’s Hitchhiker. The linked paper discusses this in more detail.
The only outcome is “huh, people who use all the information provided by a problem do better than people who ignore some of the information!”
The issue is expressing formally the algorithm which uses all the information to get the right answer in Newcomb’s.