What makes you think that there’s a “right” prior? You want a “good” learning mechanism for counterfactuals. To be good, such a mechanism would have to learn to make the inferences we consider good, at least with the “right” prior. But we can’t pinpoint any wrong inference in Troll Bridge. It doesn’t seem like what’s stopping us from pinpointing the mistake in Troll Bridge is a lack of empirical data. So, a good mechanism would have to learn to be susceptible to Troll Bridge, especially with the “right” prior. I just don’t see what would be a good reason for thinking there’s a “right” prior that avoids Troll Bridge (other than “there just has to be some way of avoiding it”) that wouldn’t also let us tell directly how to think about Troll Bridge, no learning needed.
Now I feel like you’re trying to have it both ways; earlier you raised the concern that a proposal which doesn’t overtly respect logic could nonetheless learn a sort of logic internally, which could then be susceptible to Troll Bridge. I took this as a call for an explicit method of avoiding Troll Bridge, rather than merely making it possible with the right prior.
But now, you seem to be complaining that a method that explicitly avoids Troll Bridge would be too restrictive?
> To be good, such a mechanism would have to learn to make the inferences we consider good, at least with the “right” prior. But we can’t pinpoint any wrong inference in Troll Bridge.
I think there is a mistake somewhere in the chain of inference from “cross → −10” to a low expected value for crossing. Material implication is being conflated with counterfactual implication.
A strong candidate from my perspective is the inference from ¬(A∧B) to C(A|B)=0, where C represents the probabilistic/counterfactual conditional (whatever we are using to generate expectations for actions).
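Spelled out as a sketch (the step numbering is mine, and A, B are schematic: read A as something like “crossing goes fine” and B as “I cross”):

```latex
\begin{align*}
&(1)\quad \vdash \lnot(A \land B) && \text{derived inside the agent's logic}\\
&(2)\quad \vdash B \to \lnot A && \text{from (1); still material implication}\\
&(3)\quad C(A \mid B) = 0 && \text{the disputed step}\\
&(4)\quad \text{low expected value for } B && \text{from (3), taking expectations}
\end{align*}
```

On this reading, (1) and (2) are unobjectionable; the conflation happens at (3), where a fact about joint truth gets treated as a fact about what would happen.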
> So, a good mechanism would have to learn to be susceptible to Troll Bridge, especially with the “right” prior.
You seem to be arguing that being susceptible to Troll Bridge should be judged as a necessary/positive trait of a decision theory. But there are decision theories which don’t have this property, such as regular CDT, or TDT (depending on the logical-causality graph). Are you saying that those are all necessarily wrong, due to this?
> I just don’t see what would be a good reason for thinking there’s a “right” prior that avoids Troll Bridge (other than “there just has to be some way of avoiding it”) that wouldn’t also let us tell directly how to think about Troll Bridge, no learning needed.
I’m not sure quite what you meant by this. For example, I could have a lot of prior mass on “crossing gives me +10, not crossing gives me 0”. Then my +10 hypothesis would only be confirmed by experience. I could reason using counterfactuals, so that the Troll Bridge argument doesn’t come in and ruin things. So, there is definitely a way. And being born with this prior doesn’t seem like some kind of misunderstanding/delusion about the world.
So it also seems natural to try and design agents which reliably learn this, if they have repeated experience with Troll Bridge.
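For concreteness, here is a minimal runnable sketch of such an agent (the hypothesis names and the deterministic-likelihood update are just illustrative choices of mine, not a worked-out proposal):

```python
# A Bayesian agent whose prior favors "crossing gives +10, not crossing
# gives 0", and which acts on the expectations those hypotheses supply.

hypotheses = {
    "cross:+10": {"p": 0.6, "cross": 10, "stay": 0},
    "cross:-10": {"p": 0.4, "cross": -10, "stay": 0},
}

def expected_value(action):
    return sum(h["p"] * h[action] for h in hypotheses.values())

def update(action, reward):
    # Deterministic likelihoods for simplicity: a hypothesis survives
    # only if it predicted the observed reward exactly.
    for h in hypotheses.values():
        h["p"] *= 1.0 if h[action] == reward else 0.0
    total = sum(h["p"] for h in hypotheses.values())
    for h in hypotheses.values():
        h["p"] /= total

# EV(cross) = 0.6*10 + 0.4*(-10) = 2 > 0 = EV(stay), so the agent
# crosses; observing +10 then confirms the favored hypothesis.
action = max(["cross", "stay"], key=expected_value)
update(action, 10)
print(hypotheses["cross:+10"]["p"])  # 1.0
```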
> But now, you seem to be complaining that a method that explicitly avoids Troll Bridge would be too restrictive?
No, I think finding such a no-learning-needed method would be great. It just means your learning-based approach wouldn’t be needed.
> You seem to be arguing that being susceptible to Troll Bridge should be judged as a necessary/positive trait of a decision theory.
No. I’m saying if our “good” reasoning can’t tell us where in Troll Bridge the mistake is, then something that learns to make “good” inferences would have to fall for it.
> But there are decision theories which don’t have this property, such as regular CDT, or TDT (depending on the logical-causality graph). Are you saying that those are all necessarily wrong, due to this?
A CDT is only worth as much as its method of generating counterfactuals. We generally consider regular CDT (which I interpret as “getting its counterfactuals from something-like-epsilon-exploration”) to miss important logical connections. “TDT” doesn’t have such a method. There is a (logical) causality graph that makes you do the intuitively right thing on Troll Bridge, but how to find it formally?
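To make “something-like-epsilon-exploration” concrete, a rough sketch (this is my reading of that phrase; all names and numbers here are hypothetical):

```python
import random

# With small probability the agent acts randomly, and each action's value
# is estimated only from those forced explorations, so no argument about
# what the agent's own choice proves can zero the estimates out.

EPSILON = 0.05
ACTIONS = ["cross", "stay"]
observed = {a: [] for a in ACTIONS}  # rewards seen on exploration steps

def step(reward_of):
    if random.random() < EPSILON:
        # Exploration: act randomly and record the outcome as evidence
        # about "what happens if I take this action".
        action = random.choice(ACTIONS)
        observed[action].append(reward_of(action))
        return action
    # Exploitation: act on the exploration-based estimates.
    def estimate(a):
        return sum(observed[a]) / len(observed[a]) if observed[a] else 0.0
    return max(ACTIONS, key=estimate)

# Toy environment with no troll, just to show the mechanics.
for _ in range(1000):
    step(lambda a: 10 if a == "cross" else 0)
print({a: sum(r) / len(r) for a, r in observed.items() if r})
```

The worry named above is exactly that this kind of counterfactual is purely statistical: it cannot see logical connections between the agent’s action and the environment.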
> A strong candidate from my perspective is the inference from ¬(A∧B) to C(A|B)=0
Isn’t this just a rephrasing of your idea that the agent should act based on C(A|B) instead of B → A? I don’t see any occurrence of ¬(A∧B) in the Troll Bridge argument. Now, it is equivalent to B → ¬A, so perhaps you think one of the propositions that occur as implications in Troll Bridge should be parsed this way? My modified Troll Bridge parses them all as counterfactual implication.
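For reference, the purely material equivalences at stake (standard propositional logic):

```latex
\lnot(A \land B) \;\equiv\; B \to \lnot A \;\equiv\; A \to \lnot B
```

The counterfactual conditional does not in general validate these transformations (in particular, contraposition fails on the standard Lewis–Stalnaker treatment), so which occurrences get parsed materially and which counterfactually genuinely changes the argument.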
> For example, I could have a lot of prior mass on “crossing gives me +10, not crossing gives me 0”. Then my +10 hypothesis would only be confirmed by experience. I could reason using counterfactuals
I’ve said why I don’t think “using counterfactuals”, absent further specification, is a solution. For the simple “crossing is +10” belief… you’re right, it succeeds, and insofar as you just wanted to show that it’s rationally possible to cross, I suppose it does.
This… really didn’t fit into my intuitions about learning. Consider that there is also the alternative agent who believes that crossing is −10, and sticks to that. And the reason he sticks to that isn’t that he’s too afraid and the VOI isn’t worth it: while it’s true that he never empirically confirms it, he is right, and the bridge would blow up if he were to cross it. That method works because it ignores the information in the problem description, and has us insert the relevant takeaway, without any of the confusing stuff, directly into its prior. Are you really willing to say: yup, that’s basically the solution to counterfactuals, just a bit of formalism left to work out?