All of which I really should have remembered, since it’s all stuff I have known in the past, but I am a doofus. My apologies.
(But my error wasn’t being too mired in EDT, or at least I don’t think it was; I think EDT is wrong. My error was having the term “counterfactual” too strongly tied in my head to what you call linguistic counterfactuals. Plus not thinking clearly about any of the actual decision theory.)
I’m glad I pointed out the difference between linguistic and DT counterfactuals, then!
It still feels to me as if your proof-based agents are unrealistically narrow. Sure, they can incorporate whatever beliefs they have about the real world as axioms for their proofs—but only if those axioms end up being consistent, which means having perfectly consistent beliefs. The beliefs may of course be probabilistic, but then that means that all those beliefs have to have perfectly consistent probabilities assigned to them. Do you really think it’s plausible that an agent capable of doing real things in the real world can have perfectly consistent beliefs in this fashion?
I’m not at all suggesting that we use proof-based DT in this way. It’s just a model. I claim that it’s a pretty good model—that we can often carry over results to other, more complex, decision theories.
However, if we wanted to, then yes, I think we could… I agree that if we add beliefs as axioms, the axioms have to be perfectly consistent. But if we use probabilistic beliefs, those probabilities don’t have to be perfectly consistent; just the axioms saying which probabilities we have. So, for example, I could use a proof-based agent to approximate a logical-induction-based agent, by looking for proofs about what the market expectations are. This would be kind of convoluted, though.
I appreciate that it’s a model, but it seems—perhaps wrongly, since as already mentioned I am an ignorant doofus—as if at least some of what you’re doing with the model depends essentially on the strictly-logic-based nature of the agent. (E.g., the Troll Bridge problem as stated here seems that way, down to applying Löb’s theorem to the agent as formal system.)
Formal logic is very brittle; ex falso quodlibet, and all that; it (ignorantly and doofusily) looks to me as if you might be looking at a certain class of models and then finding problems (e.g., Troll Bridge) that are only problems because of specific features of the models that couldn’t realistically apply to the real world.
(In terms of the “rocket alignment problem” metaphor: suppose you start thinking about orbital mechanics, come up with exact-conic-section orbits as an interesting class of things to study, and prove some theorem that says that some class of things isn’t achievable for exact-conic-section orbits for a reason that comes down to something like dividing by the sum of squares of all the higher-order terms that are exactly zero for a perfect conic section orbit. That would be an interesting theorem, and it’s not hard to imagine how some less-rigid generalization of it might apply to real trajectories (“if the sum of squares of those coefficients is small then the trajectory is unstable and hard to get right” or something) -- but as it stands it doesn’t really tell you anything about real problems faced by real rockets whose trajectories are not perfect conic sections. And logic is generally much more brittle than orbital mechanics, chaos theory notwithstanding; there isn’t generally anything that corresponds to the sum of squares of coefficients being small; a proof that contains only a few small errors is not a proof at all.)
But, hmm, you reckon one could make a viable proof-based agent that has a consistent set of axioms describing a potentially-inconsistent set of probabilities. That’s an intriguing idea but I’m having trouble seeing how it would work. E.g., suppose I’m searching for proofs that my expected utility if I do X is at least Y units; that expectation obviously involves a whole lot of probabilities, and my actual probability assignments are inconsistent in various ways. How do I prove anything about my expected utilities if the probabilities involved might be inconsistent?
This is all a bit off topic on this particular post; it’s not especially about your account of decision-theoretic counterfactuals as such, but about the whole project of understanding decision theory in terms of agents whose decision processes involve trying to prove things about their own behaviour.
I’m glad I pointed out the difference between linguistic and DT counterfactuals, then!
I’m not at all suggesting that we use proof-based DT in this way. It’s just a model. I claim that it’s a pretty good model—that we can often carry over results to other, more complex, decision theories.
However, if we wanted to, then yes, I think we could… I agree that if we add beliefs as axioms, the axioms have to be perfectly consistent. But if we use probabilistic beliefs, those probabilities don’t have to be perfectly consistent; just the axioms saying which probabilities we have. So, for example, I could use a proof-based agent to approximate a logical-induction-based agent, by looking for proofs about what the market expectations are. This would be kind of convoluted, though.
I appreciate that it’s a model, but it seems—perhaps wrongly, since as already mentioned I am an ignorant doofus—as if at least some of what you’re doing with the model depends essentially on the strictly-logic-based nature of the agent. (E.g., the Troll Bridge problem as stated here seems that way, down to applying Löb’s theorem to the agent as formal system.)
Formal logic is very brittle; ex falso quodlibet, and all that; it (ignorantly and doofusily) looks to me as if you might be looking at a certain class of models and then finding problems (e.g., Troll Bridge) that are only problems because of specific features of the models that couldn’t realistically apply to the real world.
(In terms of the “rocket alignment problem” metaphor: suppose you start thinking about orbital mechanics, come up with exact-conic-section orbits as an interesting class of things to study, and prove some theorem that says that some class of things isn’t achievable for exact-conic-section orbits for a reason that comes down to something like dividing by the sum of squares of all the higher-order terms that are exactly zero for a perfect conic section orbit. That would be an interesting theorem, and it’s not hard to imagine how some less-rigid generalization of it might apply to real trajectories (“if the sum of squares of those coefficients is small then the trajectory is unstable and hard to get right” or something) -- but as it stands it doesn’t really tell you anything about real problems faced by real rockets whose trajectories are not perfect conic sections. And logic is generally much more brittle than orbital mechanics, chaos theory notwithstanding; there isn’t generally anything that corresponds to the sum of squares of coefficients being small; a proof that contains only a few small errors is not a proof at all.)
But, hmm, you reckon one could make a viable proof-based agent that has a consistent set of axioms describing a potentially-inconsistent set of probabilities. That’s an intriguing idea but I’m having trouble seeing how it would work. E.g., suppose I’m searching for proofs that my expected utility if I do X is at least Y units; that expectation obviously involves a whole lot of probabilities, and my actual probability assignments are inconsistent in various ways. How do I prove anything about my expected utilities if the probabilities involved might be inconsistent?
This is all a bit off topic on this particular post; it’s not especially about your account of decision-theoretic counterfactuals as such, but about the whole project of understanding decision theory in terms of agents whose decision processes involve trying to prove things about their own behaviour.