I’m not surprised by this reaction, seeing as I jumped straight into writing it rather than first checking that I understood your confusion. And I still don’t understand your confusion, so my best hope was to give a very clear, computational explanation of counterfactuals, with no circularity, in the hope that it helps.
Anyway, let’s have some back and forth right here. I’m having trouble teasing apart the different threads of thought that I’m reading.
After intervening on our decision node, do we just project forward, as per Causal Decision Theory, or do we want to do something like Functional Decision Theory, which allows back-projecting as well?
I think I’ll need to see some formulae to be sure I know what you’re talking about. I understand the core of decision theory to be about how to score potential actions, which seems like a pretty separate question from understanding counterfactuals.
More specifically, I understand that each decision theory provides two components: (1) a type of probabilistic model for modeling relevant scenarios, and (2) a probabilistic query that it says should be used to evaluate potential actions. Evidential decision theory uses an arbitrary probability distribution as its model, and evaluates actions by P(outcome | action). Causal decision theory uses a causal Bayes net (a set of interventional distributions) and the query P(outcome | do(action)). I understand FDT less well, but basically view it as similar to CDT, except that it intervenes on the input to a decision procedure rather than on the output.
But all this is separate from the question of how to compute counterfactuals, and I don’t understand why you bring this up.
Trying to answer these questions naturally leads us to ask, “What exactly are these counterfactual things anyway?”, and that path (in my opinion) leads to circularity.
I still understand this to be the core of your question. Can you explain what questions remain about “what is a counterfactual” after reading my post?
So my best hope was to give a very clear, computational explanation of counterfactuals, with no circularity, in the hope that it helps.
While I can see this working in theory, in practice it’s more complicated, as it isn’t obvious from immediate inspection to what extent an argument is or isn’t dependent on counterfactuals. I mean, counterfactuals are everywhere! Part of the problem is that the clearest explanation of such a scheme would likely make use of counterfactuals, even if it were later shown that these aren’t necessary.
I understand FDT less well, but basically view it as similar to CDT, except that it intervenes on the input to a decision procedure rather than on the output.
The best source for learning about FDT is this MIRI paper, but given its length, you might find the summary in this blog post answers your questions more quickly.
The key unanswered question (well, some people claim to have solutions) in Functional Decision Theory is how to construct the logical counterfactuals that it depends on. What do I mean by logical counterfactuals? MIRI models agents as programs, i.e. logic, so that imagining an agent taking an action other than the one it actually takes becomes imagining logic being such that a particular function returns a different output on a given input than it actually does. Now, I don’t quite agree with the logical-counterfactuals framing, but I have been working on the question of constructing appropriate counterfactuals for this situation.
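To make that concrete, here is a minimal sketch (the names are purely illustrative):
agent = (observation) => "one_box"           // the agent is a fixed, deterministic program
action = agent("two boxes, one opaque")      // so action = "one_box" is a logical fact, not a contingent one
// A logical counterfactual supposes that agent had returned "two_box" on this very input,
// i.e. it supposes something logically false, and asks what would follow from that.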
While I can see this working in theory, in practice it’s more complicated, as it isn’t obvious from immediate inspection to what extent an argument is or isn’t dependent on counterfactuals. I mean, counterfactuals are everywhere! Part of the problem is that the clearest explanation of such a scheme would likely make use of counterfactuals, even if it were later shown that these aren’t necessary.
Is the explanation in the “What is a Counterfactual” post linked above circular?
Is the explanation in the post somehow not an explanation of counterfactuals?
The key unanswered question (well, some people claim to have solutions) in Functional Decision Theory is how to construct the logical counterfactuals that it depends on.
I read a large chunk of the FDT paper while drafting my last comment.
The quoted sentence may hint at the root of the trouble that I and some others here seem to have in understanding what you want. You seem to be asking about the way “counterfactual” is used in a particular paper, not in general.
How to construct these logical counterfactuals is glossed over and not explained in full detail in the FDT paper, but it seems to mainly rely on extra constraints on allowable interventions, similar to the “super-specs” in one of my other papers: https://www.jameskoppel.com/files/papers/demystifying_dependence.pdf .
I’m going to go try to model Newcomb’s problem and some of the other FDT examples in Omega. If I’m successful, it’s evidence that there’s nothing more interesting going on than what’s in my causal hierarchy post.
Is the explanation in the post somehow not an explanation of counterfactuals?
Oh, it’s definitely an explanation of counterfactuals, but I wouldn’t say it’s a complete one, as it doesn’t handle exotic cases (i.e. Newcomb’s problem). I added some more background info after I posted the bounty, and maybe I should have included it from the start. But I posted the bounty on LW/Alignment Forum, which led me to take a certain background context as given; I can now see that I should have clarified this originally.
Is the explanation in the “What is a Counterfactual” post linked above circular?
It seems that way, although maybe this circular dependence isn’t essential.
Take for example the concept of prediction. This seems to involve imagining different outcomes. How can we do this without counterfactuals?
I guess I have the same question with interventions. This seems to depend on the notion that we could intervene or we could not intervene. Only one of these can happen—the other is a counterfactual.
I don’t understand what counterfactuals have to do with Newcomb’s problem. You decide either “I am a one-boxer” or “I am a two-boxer,” the boxes get filled according to a rule, and then you pick deterministically according to a rule. It’s all forward reasoning; it’s just a bit weird because the action in question happens way before you are faced with the boxes. I don’t see any updating on a factual world to infer outcomes in a counterfactual world.
“Prediction” in this context is a synonym for conditioning. P(x|y) is defined as P(x,y)/P(y).
If intervention sounds circular...I don’t know what to say other than read Chapter 1 of Pearl ( https://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X ).
To give a two-sentence technical explanation:
A structural causal model is a straight-line program with some random inputs. It looks like this:
u1 = randBool()                  // background (exogenous) variable
rain = u1
sprinkler = !rain                // the sprinkler runs only when it doesn't rain
wet_grass = rain || sprinkler
It’s usually written with nodes and graphs, but those are equivalent to straight-line programs, and one can translate easily between the two presentations.
In the basic Pearl setup, an intervention consists of replacing one of the assignments above with an assignment to a constant. Here is an intervention that turns the sprinkler off.
u1 = randBool()
rain = u1
sprinkler = false                // the intervention: this assignment replaces sprinkler = !rain
wet_grass = rain || sprinkler
From this, one can easily compute that P(wet_grass | do(sprinkler = false)) = 1/2, since wet_grass now reduces to rain, which is true half the time.
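For contrast, ordinary conditioning on the same event gives a different answer in the original program: observing sprinkler = false implies rain = true, so P(wet_grass | sprinkler = false) = 1, whereas P(wet_grass | do(sprinkler = false)) = 1/2. This is exactly the difference between the EDT-style query P(outcome | action) and the CDT-style query P(outcome | do(action)) from earlier.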
If you want the technical development of counterfactuals that my post is based on, read Pearl Chapter 7, or Google around for the “twin network construction.”
Or I’ll just show you in code below how you compute the counterfactual “I see the sprinkler is on, so, if it hadn’t come on, the grass would not be wet,” which is written P(wet_grass | sprinkler = true, do(sprinkler = false)) = 0.
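We construct a new program along these lines (following the twin-network construction described below: everything downstream of the intervention is duplicated into a factual and a counterfactual copy, and the copies share the background variable u1):
u1 = randBool()                              // background variable, shared by both copies
rain = u1
sprinkler_factual = !rain                    // the factual world, exactly as before
wet_grass_factual = rain || sprinkler_factual
sprinkler_counterfactual = false             // the counterfactual world, with the intervention applied
wet_grass_counterfactual = rain || sprinkler_counterfactual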
This is now reduced to a pure statistical problem. Run this program a bunch of times, filter down to only the runs where sprinkler_factual is true, and you’ll find that wet_grass_counterfactual is false in all of them.
If you write this program as a dataflow graph, you see everything that happens after the intervention point being duplicated, but the background variables (the rain) are shared between them. This graph is the twin network, and this technique is called the “twin network construction.” It can also be thought of as what the do(y | x → e) operator is doing in our Omega language.
Everyone agrees what you should do if you can precommit. The question becomes philosophically interesting when an agent faces this problem without having had the opportunity to precommit.
Okay, I see how that technique of breaking circularity in the model looks like precommitment.
I still don’t see what this has to do with counterfactuals though.
“You decide either “I am a one-boxer” or “I am a two-boxer,” the boxes get filled according to a rule, and then you pick deterministically according to a rule. It’s all forward reasoning; it’s just a bit weird because the action in question happens way before you are faced with the boxes.”
So you wouldn’t class this as precommitment?
I realize now that this expressed as a DAG looks identical to precommitment.
Except, I also think it’s a faithful representation of the typical Newcomb scenario.
A paradox only arises if you can say “I am a two-boxer” (by picking up two boxes) while having been predicted to be a one-boxer. This can only happen if there are multiple nodes for two-boxing set to different values.
But really, this is a problem of the kind solved by super-specs in my Onward! paper. There is a constraint that the prediction of two-boxing must be the same as the actual two-boxing. Traditional causal DAGs can only express this by making them literally the same node; super-specs allow more flexibility. I am unclear exactly how it’s handled in FDT, but it has a similar analysis of the problem (“CDT breaks correlations”).
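Concretely, here is a minimal sketch of that “same node” model in the style of the sprinkler programs above, assuming the standard $1,000,000 / $1,000 payoffs:
u1 = randBool()                              // the agent's disposition
one_box = u1                                 // a single node: the prediction and the actual choice both read from it
box_b_full = one_box                         // the predictor fills box B iff it predicts one-boxing
payout = one_box ? (box_b_full ? 1000000 : 0) : (box_b_full ? 1001000 : 1000)
// Two-boxing while having been predicted to one-box is inexpressible here, because there is only one node.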