I lean towards some kind of finitism or constructivism, and am skeptical of utility functions which involve unbounded quantifiers. But also, how does LI help with the procrastination paradox? I don’t think I’ve seen this result.
What I’m referring to is that LI gives a notion of rational uncertain expectation for the procrastination paradox, so it’s less a positive result and more a framework for thinking about what behavior is reasonable.
However, I also think LIDT solves the problem in practical terms:
In the pure procrastination-paradox problem, LIDT will eventually press the button if its logic is sound. If it did not, that would mean the conditional probability of ever pressing the button given not pressing it today remains forever higher than the conditional probability of ever pressing it given pressing it today. However, the expectation can be split into the probability that the button gets pressed today and the probability that it gets pressed on some day later than today. The logical inductor should eventually know that the conditional probability of ever pressing the button given pressing it today is arbitrarily close to 1. So in order to never press the button, the conditional probability of ever pressing it in the future (given not pressing today) would also have to go to 1, staying above the probability of it ever being pressed given pressing it today. I don’t think this can happen, since there will be some nonzero limit probability that the button is never pressed (that is, there will be such a nonzero limit, supposing the button is in fact never pressed).
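(To make that decomposition explicit, in notation I’m introducing just for this comment: write $B_t$ for “the button is pressed on day $t$”, $B$ for “the button is ever pressed”, and $\mathbb{P}_n$ for the logical inductor’s day-$n$ beliefs, with utility $1$ if $B$ holds and $0$ otherwise. At a decision point $n$ where the button hasn’t been pressed yet, $B$ comes to the same thing as $B_n \vee \exists t>n.\, B_t$, so, up to the inductor’s approximation error,

$$\mathbb{P}_n(B) \;=\; \mathbb{P}_n(B_n) \;+\; \mathbb{P}_n(\neg B_n \wedge \exists t>n.\, B_t).$$

Eventually $\mathbb{P}_n(B \mid B_n)$ is close to $1$, so procrastinating forever requires $\mathbb{P}_n(B \mid \neg B_n) > \mathbb{P}_n(B \mid B_n)$ at every decision point, and hence $\mathbb{P}_n(B \mid \neg B_n) \to 1$ as well; the claim is that this conflicts with the market settling on some nonzero limiting probability for “the button is never pressed” along the trace where it in fact never is.)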
In a situation where there is some actual reason to procrastinate (there are other sources of utility), but we place very high value on eventually pressing the button, it may be that the button never gets pressed. However, this will only happen if we’re subjectively confident that it will eventually be pressed, and we always have something better to do in the meantime. The second part seems pretty difficult to sustain. So maybe we can also prove that we eventually press the button in this case as well.
My basic argument is that we can model this sort of preference, so why rule it out as a possible human preference? You may be philosophically confident in finitist/constructivist values, but are you so confident that you’d want to lock unbounded quantifiers out of the space of possible values for value learning?
What is LIDT exactly? I can try to guess, but I’d rather make sure we’re both talking about the same thing.
I agree inasmuch as we actually can model this sort of preference, for a sufficiently strong meaning of “model”. I feel that it’s much harder to be confident about any detailed claim about human values than about the validity of a generic theory of rationality. Therefore, if the ultimate generic theory of rationality imposes some conditions on utility functions (while still leaving a very rich space of different utility functions), that will lead me to try formalizing human values within those constraints. Of course, given a candidate theory, we should poke around and see whether it can be extended to weaken the constraints.
Right, I agree with this. The situation as I see it is that there’s a concrete theory of rationality (logical induction) which I’m using in this way, and it is suggesting to me that your theory (InfraBayes) can still be extended somewhat.
My argument that we want this particular extension is basically as follows: human values can be thought of as the endpoint of human philosophical deliberation about values. (I am thinking of logical induction as a formalization of philosophical deliberation over time.) This endpoint seems limit-computable, but not necessarily computable. Now, it’s also possible that at this endpoint, humans would have a more compact (ie, computable) representation of values. However, why assume this?
(My hope is that by appealing to deliberation like this, my argument has more force than if I was only relying on the strength of logical induction as a theory of rationality. The idea of deliberation gives us a general reason to expect that limit-computable is the right place to look.)
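(By “limit-computable” here I mean the standard notion: a value $v$ with $v = \lim_{n\to\infty} f(n)$ for some total computable $f$, with no requirement of a computable rate of convergence. The logical inductor’s limiting beliefs $\mathbb{P}_\infty(\phi) := \lim_{n\to\infty} \mathbb{P}_n(\phi)$ have exactly this form, which is the sense in which the endpoint of deliberation would be limit-computable without necessarily being computable.)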
I’m not sure details matter very much here, but I’m provisionally happy to spell out LIDT as:
Specify some (bounded-value) LUV (logically uncertain variable) to use as “utility”
Make decisions by looking at conditional expectations of that LUV given actions.
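In symbols, roughly: with $U$ the bounded LUV chosen above, $\mathbb{E}_n$ the logical inductor’s expectation operator on day $n$, and $A_n = a$ standing for the sentence asserting that the agent takes action $a$ on day $n$, the decision rule is

$$a_n \;\in\; \operatorname*{arg\,max}_{a \in \mathcal{A}} \; \mathbb{E}_n\big[\, U \mid A_n = a \,\big],$$

modulo the usual caveats about conditioning on actions the market assigns very low probability.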
Concrete enough?
I would be convinced if you had a theory of rationality that is a Pareto improvement on IB (i.e. has all the good properties of IB + a more general class of utility functions). However, LI doesn’t provide this AFAICT. That said, I would be interested to see some rigorous theorem about LIDT solving procrastination-like problems.
As to philosophical deliberation, I feel some appeal in this point of view, but I can also easily entertain a different point of view: namely, that human values are more or less fixed and well-defined whereas philosophical deliberation is just a “show” for game theory reasons. Overall, I place much less weight on arguments that revolve around the presumed nature of human values compared to arguments grounded in abstract reasoning about rational agents.
I don’t believe that LI provides such a Pareto improvement, but I suspect that there’s a broader theory which contains the two.
Ah. I was going for the human-values argument because I thought you might not appreciate the rational-agent argument. After all, who cares what general rational agents can value, if human values happen to be well-represented by InfraBayes?
But for general rational agents, rather than make the abstract deliberation argument, I would again mention the case of LIDT in the procrastination paradox, which we’ve already discussed.
Or, I would make the radical probabilist argument against rigid updating, and the ‘orthodox’ argument against fixed utility functions. Combined, we get a picture of “values” which is basically a market for expected values, where prices can change over time (in a “radical” way that doesn’t necessarily spring from an update on a proposition), but which follow some coherence rules, such as “an expectation of an expectation is an expectation”. One formalization of this is Skyrms’. Another is your generalization of LI (iirc).
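(Spelling out that coherence rule: writing $\mathbb{E}_t$ for the market’s expectation at time $t$, the constraint is, roughly, that for any later time $t' > t$ and any quantity $X$,

$$\mathbb{E}_t\big[\, \mathbb{E}_{t'}[X] \,\big] \;=\; \mathbb{E}_t[X],$$

i.e., today’s price for “whatever the price of $X$ will be at $t'$” has to agree with today’s price for $X$ itself, even though the move from $\mathbb{E}_t$ to $\mathbb{E}_{t'}$ need not come from conditioning on any proposition.)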
So to sum it up, my argument for general rational agents is:
In general, we need not update in a rigid way; we can develop a meaningful theory of ‘fluid’ updates, so long as we respect some coherence constraints. In light of this generalization, the restriction to ‘rigid’ updates seems somewhat arbitrary (ie, there does not seem to be a strong motivation for the restriction from rationality alone).
Separately, there is no need to actually have a utility function if we have a coherent expectation.
Putting the two together, we can study coherent expectations where the notion of ‘coherence’ doesn’t assume rigid updates.
However, this argument of course does not account for InfraBayes. I suspect your real crux is the plausibility of coming up with a unifying theory which gets both radical-probabilism stuff and InfraBayes stuff. This does seem challenging, but I strongly suspect it to be possible. Indeed, it seems like it might have to do with the idea of a market which maintains a buy/sell spread rather than giving one price for a good.
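(Sketching what I mean by that: instead of a single price $\mathbb{E}_t[X]$ for each good $X$, the market would maintain a buying price $\underline{\mathbb{E}}_t[X]$ and a selling price $\overline{\mathbb{E}}_t[X]$, with

$$\underline{\mathbb{E}}_t[X] \;\le\; \overline{\mathbb{E}}_t[X], \qquad \underline{\mathbb{E}}_t[X] \;=\; -\overline{\mathbb{E}}_t[-X],$$

which is the same lower/upper-expectation structure that shows up in imprecise probability and, as I understand it, in the InfraBayes expectation functionals. The open question would then be what the radical-probabilist coherence conditions should look like for how this pair evolves over time.)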