Since the agent can deduce (by low-level simulation) what the predictor will do, the agent does not regard the prediction outcome as contingent on the agent’s computation.
I’m confused on this point; would you mind checking whether my thinking on it is correct?
My initial objection was that this seems to assume that the predictor doesn’t take anything into account, and that the agent was trying to predict what the predictor would do without trying to figure out what the predictor would predict the agent would choose.
Then I noticed that the predictor isn’t actually waiting for the agent to finish making its decision; it’s working from a higher-level representation of how the agent thinks. Taking this into account, the agent’s ability to simulate the predictor implicitly includes the ability to compute what the predictor predicts the agent will do.
So then I was confused about why this is still a problem. My intuition was banging its head against the wall insisting that the predictor still has to take into account the agent’s decision, and that the agent couldn’t model the predictor’s prediction as not contingent on the agent’s decision.
Then I noticed that the real issue isn’t any particular prediction the agent expects the predictor to make, so much as the fact that the agent assigns that prediction probability one, regardless of what it chooses to do. Since the agent already “knows” what the predictor will predict, it is free to choose to two-box, which always yields higher utility once you can’t causally affect the contents of the boxes.
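To make the dominance step concrete, here’s a toy check with the usual illustrative Newcomb payoffs (the numbers are my own, not from the post):

```python
# Toy check of the dominance argument: once the prediction is treated as
# already fixed, two-boxing beats one-boxing under either prediction.
# Payoffs are the standard illustrative ones, not taken from the post.
SMALL_BOX = 1_000        # transparent box, always contains $1,000
BIG_BOX = 1_000_000      # opaque box, filled only if one-boxing was predicted

for predicted_one_box in (True, False):
    one_box_payoff = BIG_BOX if predicted_one_box else 0
    two_box_payoff = one_box_payoff + SMALL_BOX
    print(predicted_one_box, one_box_payoff, two_box_payoff)
# In both cases two_box_payoff exceeds one_box_payoff by exactly SMALL_BOX.
```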
Your thinking seems to be sort of correct, but informal. I’m not sure if my proof in the second part of the post was parseable, but at least it’s an actual proof of the thing you want to know, so maybe you could try parsing it :-)
Thanks for the response, that was fast.
I can parse it, but I don’t really think that I understand it in a mathematical way.
A is a statement that makes sense to me, and I can see why the predictor needs to know that the agent’s proof system is consistent.
What I don’t get about it is why you specify that the predictor computes proofs up to length N, and then just say how the predictor will do its proof.
Basically, I have no formal mathematics education in fields that aren’t a direct prerequisite of basic multivariable calculus, and my informal mathematics education consists of Gödel, Escher, Bach, plus a Wikipedia education in game theory.
Do you have any suggestions on how to get started in this area? Like, introductory sources, or even just terms that I should look up on Wikipedia or Google?
What I don’t get about it is why you specify that the predictor computes proofs up to length N, and then just say how the predictor will do its proof.
If the outlined proof is less than N symbols long (which is true if N is large enough), the predictor will find it because it enumerates all proofs up to that length. Since the predictor’s proof system is consistent, it won’t find any other proofs contradicting this one.
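In other words, something with roughly this shape (a minimal sketch; the proof-system interface is a hypothetical placeholder, not anything from the post):

```python
# Rough sketch of the predictor's N-bounded proof search described above.
# The proof system is abstracted behind two callables passed in by the
# caller; they are hypothetical placeholders, not part of the post.
from typing import Callable, Iterable

def predict_and_fill(agent_source: str,
                     N: int,
                     enumerate_proofs: Callable[[int], Iterable[str]],
                     proves: Callable[[str, str], bool]) -> str:
    """Enumerate every proof of at most N symbols and act on the first verdict found."""
    for proof in enumerate_proofs(N):                      # all proofs up to length N
        if proves(proof, f"{agent_source} one-boxes"):
            return "fill both boxes"
        if proves(proof, f"{agent_source} two-boxes"):
            return "fill only the small box"
    # Nothing provable within the length bound: the fallback mentioned further down.
    return "fill only the small box"
```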
The N < M is necessary to guarantee that the agent predicts the predictor’s proof, right?
Yeah. Actually, N must be exponentially smaller than M, so the agent’s proofs can completely simulate the predictor’s execution.
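To spell out the bookkeeping behind that (my own back-of-the-envelope reading, not something stated in the post): with a proof alphabet of c symbols there are on the order of c^N proofs of length at most N, so a proof that simply traces the predictor’s entire search step by step is itself on the order of c^N symbols long. The agent can only find that trace-proof if M is at least on the order of c^N; equivalently, N has to be about log M or smaller.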
What happens if the outlined proof is more than N symbols long?
No idea. :-) Maybe the predictor will fail to prove anything, and fall back to filling only one box, I guess? Anyway, the outlined proof is quite short, so the problem already arises for not very large values of N.
What the agent tries to infer is the predictor’s behavior for each of the agent’s possible actions, not just the predictor’s unconditional behavior. Being able to infer the predictor’s decision without assuming the agent’s action is equivalent to concluding that the predictor’s decision is a constant function of the agent’s action (in the agent’s opinion, given the kind of decision-maker our agent is, which is something it should be able to control better, but which the current version of the theory doesn’t support).
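As a sketch of what that conditional inference looks like (my paraphrase; provable_within is a hypothetical stand-in for the agent’s bounded proof search, and nothing here is quoted from the post):

```python
# Sketch of the agent inferring the predictor's response to each of its
# possible actions, rather than a single unconditional prediction.
# `provable_within` is a hypothetical stand-in for the agent's M-bounded
# proof search; none of these names come from the post.
from typing import Callable, Dict

def infer_predictor_response(
        provable_within: Callable[[int, str], bool],
        M: int,
        actions=("one-box", "two-box"),
        responses=("fill both boxes", "fill only the small box")) -> Dict[str, str]:
    inferred = {}
    for a in actions:
        for r in responses:
            # What the agent wants is the conditional: "if I do a, the predictor does r".
            if provable_within(M, f"Agent() = {a} implies Predictor() = {r}"):
                inferred[a] = r
                break
    return inferred

# The failure mode described above: if "Predictor() = fill both boxes" is provable
# with no assumption about Agent() at all, the same response is inferred for every
# action, i.e. the agent ends up treating the predictor's decision as a constant
# function of its own choice.
```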