OK, I get it. (Or at least I think I do.) And, duh, indeed it turns out (as you were too polite to say in so many words) that I was distinctly confused.
So: Using ordinary conditionals in planning your actions commits you to reasoning like “If (here in the actual world it turns out that) I choose to smoke this cigarette, then that makes it more likely that I have the weird genetic anomaly that causes both desire-to-smoke and lung cancer, so I’m more likely to die prematurely and horribly of lung cancer, so I shouldn’t smoke it”, and that sort of reasoning leads to wrong decisions. So you want to use some sort of conditional that doesn’t work that way and rather says something more like “suppose everything about the world up to now is exactly as it is in the actual world, but magically-but-without-the-existence-of-magic-having-consequences I decide to do X; what then?”. And this is what you’re calling decision-theoretic counterfactuals, and the question is exactly what they should be; EDT says no, just use ordinary conditionals; CDT says pretty much what I just said; etc. The “smoking lesion” shows that EDT can give implausible results; “Death in Damascus” shows that CDT can give implausible results; etc.
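To check that I have the contrast right, here is a toy numerical version of the smoking lesion (all the numbers are invented purely for illustration): the EDT-style calculation conditions on the act and so updates towards having the lesion, while the CDT-style calculation holds the lesion probability fixed, which is roughly what the intervention amounts to.

```python
# Toy smoking-lesion numbers, purely illustrative.
P_LESION = 0.01               # prior probability of the lesion
P_SMOKE_IF_LESION = 0.95      # the lesion makes you want to smoke
P_SMOKE_IF_NO_LESION = 0.05
P_CANCER_IF_LESION = 0.80     # only the lesion causes cancer; smoking doesn't
P_CANCER_IF_NO_LESION = 0.01
U_SMOKE = 1.0                 # small pleasure from smoking
U_CANCER = -100.0             # large disutility from cancer

def p_lesion_given_act(smoke):
    """Bayes update on the act itself -- the EDT-style conditional."""
    like_lesion = P_SMOKE_IF_LESION if smoke else 1 - P_SMOKE_IF_LESION
    like_clear = P_SMOKE_IF_NO_LESION if smoke else 1 - P_SMOKE_IF_NO_LESION
    joint = like_lesion * P_LESION
    return joint / (joint + like_clear * (1 - P_LESION))

def expected_utility(p_lesion, smoke):
    p_cancer = (p_lesion * P_CANCER_IF_LESION
                + (1 - p_lesion) * P_CANCER_IF_NO_LESION)
    return (U_SMOKE if smoke else 0.0) + p_cancer * U_CANCER

for smoke in (True, False):
    edt = expected_utility(p_lesion_given_act(smoke), smoke)  # condition on the act
    cdt = expected_utility(P_LESION, smoke)                   # act can't change the lesion
    print(f"smoke={smoke}: EDT EU = {edt:+.2f}, CDT EU = {cdt:+.2f}")
```

With these numbers EDT prefers not smoking (about −1.0 versus −12.7) while CDT prefers smoking (about −0.8 versus −1.8), which is exactly the divergence at issue.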
All of which I really should have remembered, since it’s all stuff I have known in the past, but I am a doofus. My apologies.
(But my error wasn’t being too mired in EDT, or at least I don’t think it was; I think EDT is wrong. My error was having the term “counterfactual” too strongly tied in my head to what you call linguistic counterfactuals. Plus not thinking clearly about any of the actual decision theory.)
It still feels to me as if your proof-based agents are unrealistically narrow. Sure, they can incorporate whatever beliefs they have about the real world as axioms for their proofs—but only if those axioms end up being consistent, which means having perfectly consistent beliefs. The beliefs may of course be probabilistic, but then that means that all those beliefs have to have perfectly consistent probabilities assigned to them. Do you really think it’s plausible that an agent capable of doing real things in the real world can have perfectly consistent beliefs in this fashion? (I am pretty sure, for instance, that no human being has perfectly consistent beliefs; if any of us tried to do what your proof-based agents are doing, we would arrive at a contradiction—or fail to do so only because we weren’t trying hard enough.) I think “agents that use logic at all on the basis of beliefs about the world that are perfectly internally consistent” is a much narrower class than “agents that use logic at all”.
(That probably sounds like a criticism, but once again I am extremely aware that it may be that this feels implausible to me only because I am lacking important context, or confused about important things. After all, that was the case last time around. So my question is more “help me resolve my confusion” than “let me point out to you how the stuff you’ve been studying for ages is wrongheaded”, and I appreciate that you may have other more valuable things to do with your time than help to resolve my confusion :-).)
> All of which I really should have remembered, since it’s all stuff I have known in the past, but I am a doofus. My apologies.
>
> (But my error wasn’t being too mired in EDT, or at least I don’t think it was; I think EDT is wrong. My error was having the term “counterfactual” too strongly tied in my head to what you call linguistic counterfactuals. Plus not thinking clearly about any of the actual decision theory.)
I’m glad I pointed out the difference between linguistic and DT counterfactuals, then!
> It still feels to me as if your proof-based agents are unrealistically narrow. Sure, they can incorporate whatever beliefs they have about the real world as axioms for their proofs—but only if those axioms end up being consistent, which means having perfectly consistent beliefs. The beliefs may of course be probabilistic, but then that means that all those beliefs have to have perfectly consistent probabilities assigned to them. Do you really think it’s plausible that an agent capable of doing real things in the real world can have perfectly consistent beliefs in this fashion?
I’m not at all suggesting that we use proof-based DT in this way. It’s just a model. I claim that it’s a pretty good model—that we can often carry over results to other, more complex, decision theories.
However, if we wanted to, then yes, I think we could… I agree that if we add beliefs as axioms, the axioms have to be perfectly consistent. But if we use probabilistic beliefs, the probabilities themselves don’t have to be perfectly consistent; only the axioms saying which probabilities we have do. So, for example, I could use a proof-based agent to approximate a logical-induction-based agent, by looking for proofs about what the market expectations are. This would be kind of convoluted, though.
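Very roughly, the pattern I have in mind looks something like the sketch below. The `provable` function is just a stand-in for an actual proof search over the agent’s axioms; the point is that those axioms can consistently assert things like “the market assigns expectation 0.7 to this sentence” even when the expectations being described are not themselves coherent.

```python
# Schematic sketch only: `provable` stands in for a real proof search over the
# agent's axioms, and nothing here is meant as a serious implementation.

def proof_based_choice(actions, candidate_bounds, provable):
    """Pick the action with the best provable lower bound on expected utility.

    `provable(sentence)` answers whether `sentence` follows from the agent's
    axioms.  Those axioms may include (mutually consistent) statements *about*
    the agent's possibly-incoherent probabilities or market expectations.
    """
    best_action, best_bound = None, float("-inf")
    for action in actions:
        # Try the strongest candidate utility bounds first.
        for bound in sorted(candidate_bounds, reverse=True):
            sentence = f"(action = {action!r}) -> (expected utility >= {bound})"
            if provable(sentence):
                if bound > best_bound:
                    best_action, best_bound = action, bound
                break  # strongest provable bound for this action found
    return best_action, best_bound
```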
I appreciate that it’s a model, but it seems—perhaps wrongly, since as already mentioned I am an ignorant doofus—as if at least some of what you’re doing with the model depends essentially on the strictly-logic-based nature of the agent. (E.g., the Troll Bridge problem as stated here seems that way, down to applying Löb’s theorem to the agent as a formal system.)
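(For concreteness, the Löb schema I mean is

$$\Box(\Box P \rightarrow P) \rightarrow \Box P$$

with $\Box$ read as “provable by the agent’s own proof machinery”; as I understand it, Troll Bridge applies this with $P$ a statement relating the agent’s action to the utility it gets.)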
Formal logic is very brittle; ex falso quodlibet, and all that; it (ignorantly and doofusily) looks to me as if you might be looking at a certain class of models and then finding problems (e.g., Troll Bridge) that are only problems because of specific features of the models that couldn’t realistically apply to the real world.
(In terms of the “rocket alignment problem” metaphor: suppose you start thinking about orbital mechanics, come up with exact-conic-section orbits as an interesting class of things to study, and prove some theorem that says that some class of things isn’t achievable for exact-conic-section orbits for a reason that comes down to something like dividing by the sum of squares of all the higher-order terms that are exactly zero for a perfect conic section orbit. That would be an interesting theorem, and it’s not hard to imagine how some less-rigid generalization of it might apply to real trajectories (“if the sum of squares of those coefficients is small then the trajectory is unstable and hard to get right” or something) -- but as it stands it doesn’t really tell you anything about real problems faced by real rockets whose trajectories are not perfect conic sections. And logic is generally much more brittle than orbital mechanics, chaos theory notwithstanding; there isn’t generally anything that corresponds to the sum of squares of coefficients being small; a proof that contains only a few small errors is not a proof at all.)
But, hmm, you reckon one could make a viable proof-based agent that has a consistent set of axioms describing a potentially-inconsistent set of probabilities. That’s an intriguing idea but I’m having trouble seeing how it would work. E.g., suppose I’m searching for proofs that my expected utility if I do X is at least Y units; that expectation obviously involves a whole lot of probabilities, and my actual probability assignments are inconsistent in various ways. How do I prove anything about my expected utilities if the probabilities involved might be inconsistent?
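To make the worry concrete (toy numbers, and presumably not how you would actually set things up): suppose the axioms assert the probabilities themselves, together with the usual monotonicity rule,

$$P(\text{rain}) = 0.6, \qquad P(\text{rain} \lor \text{wind}) = 0.5, \qquad A \vdash B \;\Rightarrow\; P(A) \le P(B).$$

From these the system derives $0.6 \le 0.5$, i.e. a contradiction, and then ex falso it proves “my expected utility if I do X is at least Y” for every Y at once, which is the sort of brittleness I have in mind.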
This is all a bit off topic on this particular post; it’s not especially about your account of decision-theoretic counterfactuals as such, but about the whole project of understanding decision theory in terms of agents whose decision processes involve trying to prove things about their own behaviour.