Because Alice sees that Bob’s source code is the same as hers except for a comment difference, and Bob sees that Alice’s source code is the same as his except for a comment difference, the situation is symmetric.
Newcomb:
Prisoner’s Dilemma:
Do you see the contradiction here?
Newcomb, Alice: The simulation’s source code and available information are literally exactly the same as Alice’s, so if Alice two-boxes, the simulation will too. There’s no way around it. So Alice one-boxes.
Newcomb, Bob: The simulation was in the situation described above. Bob thus predicts that it will one-box. Bob himself is in an entirely different situation, since he can see a source code difference, so if he two-boxes, it does not logically imply that the simulation will two-box. So Bob two-boxes and the simulation one-boxes.
Prisoner’s Dilemma: Alice sees Bob’s source code, and summarizes it as “identical to me except for a different comment”. Bob sees Alice’s source code, and summarizes it as “identical to me except for a different comment”. Both Alice and Bob run the same algorithm, and they now have the same input, so they must produce the same result. They figure this out, and cooperate.
Ignore Alice’s perspective for a second. Why is Bob acting differently? He’s seeing the same code both times.
Don’t ignore Alice’s perspective. Bob knows what Alice’s perspective is, so a difference in Alice’s perspective is, by extension, a difference in Bob’s perspective.
Bob looks at the same code both times. In the PD, he treats it as identical to his own. In NP, he treats it as different. Why?
The source code that Bob is looking at is the same in each case, but the source code that [the source code that Bob is looking at] is looking at is different in the two situations.
NP: Bob is looking at Alice, who is looking at Alice, who is looking at Alice, …
PD: Bob is looking at Alice, who is looking at Bob, who is looking at Alice, …
Clarifying edit: In both cases, Bob concludes that the source code he is looking at is functionally equivalent to his own. But in NP, Bob treats the input to the program he is looking at as different from his input, whereas in PD, Bob treats the input to the program he is looking at as functionally equivalent to his input.
But you said Bob concludes that their decision theories are functionally identical, and thus it reduces to:
PD: TDT is looking at TDT, who is looking at TDT, who is looking at TDT, …
And yet this does not occur in NP.
EDIT:
The point is that his judgement of the source code changes, from “some other agent” to “another TDT agent”.
Looks like my edit was poorly timed.
One way of describing it is that the comment is extra information that is distinct from the decision agent, and that Bob can make use of this information when making his decision.
Oops, didn’t see that.
What’s the point of adding comments if Bob’s just going to conclude their code is functionally identical anyway? Doesn’t that mean that you might as well use the same code for Bob and Alice, and call it TDT?
In NP, the comments are to provide Bob an excuse to two-box that does not result in the simulation two-boxing. In PD, the comments are there to illustrate that TDT needs a sophisticated algorithm for identifying copies of itself that can recognize different implementations of the same algorithm.
Do you understand why Bob acts differently in the two situations, now?
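A minimal sketch, in Python, of what “recognize different implementations of the same algorithm” could mean in this toy comment-only case. The normalize and same_algorithm helpers and the comment-stripping criterion are my own illustration, not part of any published TDT specification; a real functional-equivalence check would need far more than stripping comments.

```python
import re

def normalize(source: str) -> str:
    """Crude normalization: drop comments and blank lines.

    This captures only the toy case where two programs differ
    solely in an identifying comment.
    """
    kept = []
    for line in source.splitlines():
        line = re.sub(r"#.*", "", line).rstrip()  # strip Python-style comments
        if line:
            kept.append(line)
    return "\n".join(kept)

def same_algorithm(my_source: str, other_source: str) -> bool:
    """True if the two programs look like implementations of the same algorithm."""
    return normalize(my_source) == normalize(other_source)

# Identical except for an identifying comment:
alice_src = "# I am Alice\ndef decide(problem): ...\n"
bob_src = "# I am Bob\ndef decide(problem): ...\n"
assert same_algorithm(alice_src, bob_src)  # same algorithm...
assert alice_src != bob_src                # ...different implementation
```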
I was assuming Bob was an AI, lacking a ghost to look over his code for reasonableness. If he’s not, then he isn’t strictly implementing TDT, is he?
Bob is an AI. He’s programmed to look for similarities between other AIs and himself so that he can treat their action and his as logically linked when it is to his advantage to do so. I was arguing that a proper implementation of TDT should consider Bob’s and Alice’s decisions linked in PD and nonlinked in the NP variant. I don’t really understand your objection.
My objection is that an AI looking at the same question—is Alice functionally identical to me—can’t look for excuses why they’re not really the same when this would be useful, if they actually behave the same way. His answer should be the same in both cases, because they are either functionally identical or not.
The proper question is “In the context of the problems each of us face, is there a logical connection between my actions and Alice’s actions?”, not “Is Alice functionally identical to me?”
I think those terms both mean the same thing.
For reference, by “functionally identical” I meant “likely to choose the same way I do”. Thus, an agent that will abandon the test to eat beans is functionally identical when beans are unavailable.
I guess my previous response was unhelpful. Although “Is Alice functionally identical to me?” is not the question of primary concern, it is a relevant question. But another relevant question is “Is Alice facing the same problem that I am?” Two functionally identical agents facing different problems may make different choices.
In the architecture I’ve been envisioning, Alice and Bob can classify other agents as “identical to me in both algorithm and implementation” or “identical to me in algorithm, with differing implementation”, or one of many other categories. For each of the two categories I named, they would assume that an agent in that category will make the same decision as they would when presented with the same problem (so they would both be subcategories of “functionally identical”). In both situations, each agent classifies the other as identical in algorithm and differing in implementation.
In the prisoners’ dilemma, each agent is facing the same problem, that is, “I’m playing a prisoner’s dilemma with another agent that is identical to me in algorithm but differing in implementation”. So they treat their decisions as linked.
In the Newcomb’s problem variant, Alice’s problem is “I’m in Newcomb’s problem, and the predictor used a simulation that is identical to me in both algorithm and implementation, and which faced the same problem that I am facing.” Bob’s problem is “I’m in Newcomb’s problem, and the predictor used a simulation that is identical to me in algorithm but differing in implementation, and which faced the same situation as Alice.” There was a difference in the two problem descriptions even before the part about what problem the simulation faced, so when Bob notes that the simulation faced the same problem as Alice, he finds a difference between the problem that the simulation faced and the problem that he faces.
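Here is a minimal Python sketch of the architecture I’ve been describing. The Relation categories, the Agent type, and the naive equality test on problem descriptions are simplifying assumptions for illustration, not a full implementation:

```python
import re
from dataclasses import dataclass
from enum import Enum, auto

class Relation(Enum):
    SAME_ALGO_SAME_IMPL = auto()  # identical in both algorithm and implementation
    SAME_ALGO_DIFF_IMPL = auto()  # identical in algorithm, differing in implementation
    OTHER = auto()

def same_algorithm(a: str, b: str) -> bool:
    """Stand-in for the comment-stripping check sketched earlier."""
    strip = lambda s: [ln for ln in (re.sub(r"#.*", "", x).strip() for x in s.splitlines()) if ln]
    return strip(a) == strip(b)

@dataclass
class Agent:
    source: str

def classify(me: Agent, other: Agent) -> Relation:
    if me.source == other.source:
        return Relation.SAME_ALGO_SAME_IMPL
    if same_algorithm(me.source, other.source):
        return Relation.SAME_ALGO_DIFF_IMPL
    return Relation.OTHER

def decisions_linked(me: Agent, my_problem: str, other: Agent, others_problem: str) -> bool:
    """Treat the decisions as linked only if the other agent is functionally
    identical AND faces a problem functionally equivalent to mine."""
    functionally_identical = classify(me, other) in (
        Relation.SAME_ALGO_SAME_IMPL,
        Relation.SAME_ALGO_DIFF_IMPL,
    )
    return functionally_identical and my_problem == others_problem
```

On this sketch, the Prisoner’s Dilemma links because both players’ problem descriptions are the same (“PD against an agent identical to me in algorithm, differing in implementation”), while in the Newcomb’s problem variant Bob’s problem description and the simulation’s differ, so the check fails for Bob.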
Then why are we talking about “Bob” and “Alice” when they’re both just TDT agents?
Because if Bob does not ignore the implementation difference, he ends up with more money in the Newcomb’s problem variant.
But there is no difference between “Bob looking at Alice looking at Bob” and “Alice looking at Alice looking at Alice”. That’s the whole point of TDT.
There is a difference. In the first one, the agents have a slight difference in their source code. In the second one, the source code of the two agents is identical.
If you’re claiming that TDT does not pay attention to such differences, then we only have a definitional dispute, and by your definition, an agent programmed the way I described would not be TDT. But I can’t think of anything about the standard descriptions of TDT that would indicate such a restriction. It is certainly not the “whole point” of TDT.
For now, I’m going to call the thing you’re telling me TDT is “TDT1”, and I’m going to call the agent architecture I was describing “TDT2”. I’m not sure if this is good terminology, so let me know if you’d rather call them something else.
Anyway, consider the four programs Alice1, Bob1, Alice2, and Bob2. Alice1 and Bob1 are implementations of TDT1, and are identical except for having a different identifier in the comments (and this difference changes nothing). Alice2 and Bob2 are implementations of TDT2, and are identical except for having a different identifier in the comments.
Consider the Newcomb’s problem variant with the first pair of agents (Alice1 and Bob1). Alice1 is facing the standard Newcomb’s problem, so she one-boxes and gets $1,000,000. As far as Bob1 can tell, he also faces the standard Newcomb’s problem (there is a difference, but he ignores it), so he one-boxes and gets $1,000,000.
Now consider the same problem, but with all instances of Alice1 replaced with Alice2, and all instances of Bob1 replaced with Bob2. Alice2 still faces the standard Newcomb’s problem, so she one-boxes and gets $1,000,000. But Bob2 two-boxes and gets $1,001,000.
The problem seems pretty fair; it doesn’t specifically reference either TDT1 or TDT2 in an attempt to discriminate. However, when we replace the TDT1 agents with TDT2 agents, one of them does better and neither of them does worse, which seems to indicate a pretty serious deficiency in TDT1.
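To put numbers on that comparison, here is a small worked sketch, assuming the usual payoffs ($1,000 always in the transparent box, $1,000,000 in box B iff the simulation one-boxes) and assuming, as argued above, that the simulation one-boxes in both versions:

```python
BOX_A = 1_000        # transparent box, always present
BOX_B = 1_000_000    # opaque box, filled iff the predictor's simulation one-boxes

def payoff(two_box: bool, box_b_filled: bool) -> int:
    return (BOX_A if two_box else 0) + (BOX_B if box_b_filled else 0)

# The simulation is an exact copy of Alice and one-boxes in every version,
# so box B is filled for both Alice and Bob.
sim_one_boxes = True

# TDT1 pair: Bob1 ignores the implementation difference and one-boxes.
alice1 = payoff(two_box=False, box_b_filled=sim_one_boxes)  # 1,000,000
bob1   = payoff(two_box=False, box_b_filled=sim_one_boxes)  # 1,000,000

# TDT2 pair: Bob2 treats his decision as unlinked from the simulation's
# decision and two-boxes.
alice2 = payoff(two_box=False, box_b_filled=sim_one_boxes)  # 1,000,000
bob2   = payoff(two_box=True,  box_b_filled=sim_one_boxes)  # 1,001,000

assert bob2 > bob1 and alice2 == alice1
```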
Either TDT decides if something is identical based on its actions, in which case I am right, or its source code, in which case you are wrong, because such an agent would not cooperate in the Prisoner’s Dilemma.
They decide using the source code. I already explained why this results in them cooperating in the Prisoner’s Dilemma.
Wait! I think I get it! In a Prisoner’s Dilemma, both agents are facing another agent, whereas in Newcomb’s Problem, Alice is facing an infinite chain of herself, while Bob is facing an infinite chain of someone else. It’s like the “favorite number” example in the followup post.
Yes.
Well that took embarrassingly long.