Having knowledge of the decision lying around is not a problem; the problem arises if it’s used in the construction of the decision itself in such a way that the resulting decision is not consequentialist. The diagonal rule breaks the dependence of your knowledge on your decision, if such a dependence hypothetically existed, so that it becomes easier to prove that the decision procedure produces a consequentialist decision.
Also, the decision can’t be “inconsistent”, as it’s part of the territory. The agent’s knowledge may be inconsistent, which makes it useless, but even then there is a fact of the matter about what its gibbering self decides.
I meant the agent (its proof system) becoming inconsistent, of course, not its decision. Bad wording on my part.
The problem, as I see it, is that the standard UDT agent (its proof system) is not allowed to prove that it will do a certain action (or that it will not do some action). If it did, it would prove stupid counterfactuals, which would make it change its decision, which would make its proof wrong, which would make its proof system inconsistent.
I think this is a serious limitation. Maybe it is impossible to define well-behaved consequentialist agents without this limitation, but I haven’t seen an actual proof.
This is not how it works. The inference system is consistent, and the decision can’t be changed, only determined in a stupid way. It’s not false that one-boxing implies that you get $31, if you in fact two-box; your inference system doesn’t need to be inconsistent to produce that argument.
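To make the point about “one-boxing implies that you get $31” concrete, here is a minimal sketch that reads the implication as plain material implication. The names `actual_action`, `actual_payoff` and `counterfactual_holds`, and the payoff of 1000, are made up for illustration; the thread does not specify them.

```python
def implies(antecedent: bool, consequent: bool) -> bool:
    """Material implication: false only when the antecedent is true and the consequent is false."""
    return (not antecedent) or consequent


# Assumptions for illustration only: the agent in fact two-boxes and gets 1000.
actual_action = "two-box"
actual_payoff = 1000


def counterfactual_holds(action: str, payoff: int) -> bool:
    """Truth value of the statement 'agent does `action` -> agent gets `payoff`'."""
    return implies(actual_action == action, actual_payoff == payoff)


# With a false antecedent, every payoff claim about one-boxing is true,
# so a consistent system can prove any of them.
print(counterfactual_holds("one-box", 31))
print(counterfactual_holds("one-box", -300))
# Claims about the action actually taken are constrained by the real payoff.
print(counterfactual_holds("two-box", 31))
```

Under this reading, the “stupid” counterfactuals are true statements about an action that is not taken, which is why producing them does not require an inconsistent system.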
But what if the system proves it will one-box, then forms a counterfactual that two-boxing will get it $10^10, and so two-boxes? This makes the thing that it proved false, which makes the system inconsistent.
If we know the inference system to be consistent, this proves that the line of reasoning you describe can’t happen. Indeed, this is essentially the way we prove that the diagonal step guarantees that the agent doesn’t infer its decision: if it did, that would make its inference system unsound, and we assume it’s not. So what happens is that if the system proves that it will one-box, it doesn’t prove that two-boxing leads to $10^10; instead it proves something that would make it one-box, such as that two-boxing leads to minus $300.
Hmmm. Wait, doesn’t the diagonal step immediately make the system inconsistent as soon as the system proves the agent will one-box?
The system is sound. Therefore, it doesn’t prove (before the agent decides) that the agent will one-box. The presence of the diagonal step guarantees that a proof of the agent one-boxing is not encountered (before the agent decides).
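A toy sketch of the diagonal step may help here. It assumes one common formulation (act against any statement proved about your own action before deciding) and replaces the proof search with a hand-written list of “theorems”. The tuple encoding and the names `ACTIONS` and `decide` are hypothetical, not taken from any actual UDT write-up.

```python
ACTIONS = ("one-box", "two-box")


def decide(theorems):
    """Scan theorems in the order a (hypothetical) proof search finds them
    and return the chosen action."""
    payoffs = {}
    for theorem in theorems:
        kind = theorem[0]
        if kind == "does":
            # Proved "agent does a": the diagonal step picks a different action.
            return next(x for x in ACTIONS if x != theorem[1])
        if kind == "does_not":
            # Proved "agent does not do a": the diagonal step does a.
            return theorem[1]
        if kind == "payoff":
            # Proved "agent does a -> payoff is u"; keep the first one found per action.
            _, action, utility = theorem
            payoffs.setdefault(action, utility)
            if len(payoffs) == len(ACTIONS):
                return max(ACTIONS, key=payoffs.get)
    return ACTIONS[0]  # arbitrary default if the search runs out


print(decide([("payoff", "one-box", 1_000_000), ("payoff", "two-box", 1_000)]))
print(decide([("does", "one-box"), ("payoff", "two-box", 10**10)]))
```

The second call feeds in a statement that the agent then falsifies. A sound system can never produce that input, which is the sense in which the diagonal step guarantees that no proof of the agent’s action is encountered before it decides.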
Well, exactly, that’s what I said: the agent is not allowed to prove that it will do a certain action before its decision is made. This is a limitation. My hypothesis: it is not a necessary limitation for a well-behaved consequentialist agent. Here is an attempt at writing an agent without this limitation.