ben_levinstein comments on Open-minded updatelessness

ben_levinstein Jul 25, 2023, 10:45 AM
8 points
1
I think the basic approach to commitment for the open-minded agent is right. Roughly, you don’t actually get to commit your future-self to things. Instead, you just do what you (in expectation) would have committed yourself to given some reconstructed prior.

Just as a literature pointer: If I recall correctly, Chris Meacham’s approach in “Binding and Its Consequences” is ultimately to estimate your initial credence function and perform the action from the plan with the highest EU according to that function. He doesn’t talk about awareness growth, but open-mindedness seems to fit in nicely within his framework (or at least the framework I recall him having).
- Daniel Kokotajlo Sep 29, 2023, 3:50 PM
  6 points
  4
  This whole reconstructed-prior business seems fishy to me. Let’s presuppose, as people seem to be doing, that there is a clean distinction between empirical evidence and ‘logical’ or ‘a priori’ evidence, such that we can scrub away our empirical evidence and reconstruct a prior, i.e. construct the probability distribution we would have had if we somehow had zero empirical evidence but all the logical evidence we currently have.
  Doesn’t the problem just recur? Literally this is what I was thinking when I wrote the original commitment races problem post; I was thinking that just ‘going updateless’, in the sense of acting according to the commitments that make sense from a reconstructed prior, didn’t solve the whole problem, just the empirical-evidence flavor of the problem. Maybe that’s still progress, of course...
  
  And then also there is the question of whether these two kinds of evidence really are that distinct anyway.
  - JesseClifton Sep 29, 2023, 5:15 PM
    4 points
    0
    Can you clarify what “the problem” is and why it “recurs”?
    
    My guess is that you are saying: Although OM updatelessness may work for propositions about empirical facts, it’s not clear that it works for logical propositions. For example, suppose I find myself in a logical Counterfactual Mugging regarding the truth value of a proposition P. Suppose I simultaneously become aware of P and learn a proof of P. OM updatelessness would want to say: “Instead of accounting for the fact that you learned that P is true in your decision, figure out what credence you would have assigned to P had you been aware of it at the outset, and do what you would have committed to do under that prior”. But, we don’t know how to assign logical priors.
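
To make the dependence on the logical prior concrete, here is a minimal sketch of the ex ante calculation in a logical Counterfactual Mugging. The stakes (`cost`, `reward`), the function names, and the payoff structure (pay `cost` if P is true, receive `reward` if P is false and you are the kind of agent who pays) are illustrative assumptions, not anything specified in the post.

```python
def ex_ante_value_of_paying(prior_p: float,
                            cost: float = 100.0,
                            reward: float = 10_000.0) -> float:
    """Expected utility of the policy 'pay when asked', evaluated before learning P.

    Assumed setup (hypothetical numbers):
      - with probability prior_p, P is true and the agent is asked to pay `cost`;
      - with probability 1 - prior_p, P is false and the agent receives `reward`
        because it is the kind of agent that would have paid.
    """
    return prior_p * (-cost) + (1.0 - prior_p) * reward


def should_pay(prior_p: float,
               cost: float = 100.0,
               reward: float = 10_000.0) -> bool:
    """Pay iff the paying policy beats refusing, which is worth 0 in both branches."""
    return ex_ante_value_of_paying(prior_p, cost, reward) > 0.0
```

With these made-up stakes, paying is ex ante optimal whenever the prior on P is below 10000/10100, roughly 0.99. The code takes `prior_p` as given, which is exactly the input we don’t know how to supply for logical propositions.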
    
    Is that the idea? If so, I agree that this is a problem. But it seems like a problem for decision theories that rely on logical priors in general, not OM updatelessness in particular. Maybe you are skeptical that any such theory could work, though.
    - Daniel Kokotajlo Sep 29, 2023, 5:42 PM
      2 points
      0
      OK, let’s suppose all relevant agents follow some sort of updatelessness, i.e. they constantly act according to the policy that would have been optimal to commit to, from the perspective of their reconstructed prior. But their reconstructed prior is changing constantly as they learn more, e.g. as they become aware of “crazy” possible strategies their opponents might use.
      
      Can the agents sometimes influence each other’s priors? Yes. For example by acting ‘crazy’ in some way they didn’t expect, you might cause them to revise their prior to include that possibility—indeed to include it with significant probability mass!
      
      OK, so then some of the agents will think “Aha, I can influence the behavior of the others in ways that I like, by committing to various ‘crazy’ strategies that place incentives on them. Once they become aware of my action, they’ll revise their prior, and then the optimal commitment for them to have made in light of that prior is to conform to the incentives I placed on them, so they will.”
      
      … I’ll stop there for now. Do you see what I mean? It’s literally the commitment races problem. Agents racing to make various commitments in order to influence each other, because they expect that the others might be so influenced.
      
      Now you might think that it’s generally harder to influence someone’s reconstructed prior than to influence their posterior; if you do something that was ‘in distribution’ for their current reconstructed prior, for example, then they won’t update their reconstructed prior at all, they’ll just update their posterior. I think this is plausible, but I’d want to see it spelled out in more detail how much of the problem it solves; certainly not all of it, or so says my current model, which might be wrong.
      - JesseClifton Sep 29, 2023, 6:37 PM
        2 points
        1
        If I understand correctly, you’re making the point that we discuss in the section on exploitability. It’s not clear to me yet why this kind of exploitability is objectionable. After all, had the agent in your example been aware of the possibility of crazy agents from the start, they would have wanted to swerve, and non-crazy agents would want to take advantage of this. So I don’t see how the situation is any worse than if the agents were making decisions under complete awareness.
        Daniel Kokotajlo Oct 4, 2023, 2:14 AM
        2 points
        0
        How is it less objectionable than regular ol’ exploitability? E.g. someone finds out that you give in to threats, so they threaten you, so you give in, and wish you had never been born—you are exploitable in the classic sense. But it’s true that if you had been aware from the beginning that you were going to be threatened, you would have wanted to give in.
        
        Part of what I’m doing here is trying to see if my understanding of your work is incorrect. To me, it seems like you are saying “Let’s call some kinds of changes-to-credences ‘updates’ and other kinds ‘awareness-growth.’ Here’s how to distinguish them. Now, we recommend the strategy of EA-OMU, which means you calculate what your credences would have been if you never made any updates but DID make the awareness-growth changes, and then calculate what policy is optimal according to those credences, and then do that.”
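
For concreteness, the recipe as paraphrased above might be sketched like this. This is only one way to model it: it assumes a hypothetical `full_prior` over every hypothesis the agent could ever conceive of, treats awareness as the subset conceived so far, and treats an update as conditioning on the set of hypotheses consistent with the evidence. The hypothesis names, payoffs, and function names are all made up for illustration, not taken from the post.

```python
from typing import Dict, Set

Prior = Dict[str, float]
Utility = Dict[str, Dict[str, float]]  # policy -> hypothesis -> payoff


def restrict(prior: Prior, keep: Set[str]) -> Prior:
    """Renormalize the prior over the hypotheses in `keep`."""
    kept = {h: p for h, p in prior.items() if h in keep}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no probability mass left on the kept hypotheses")
    return {h: p / total for h, p in kept.items()}


def best_policy(credences: Prior, utility: Utility) -> str:
    """Return the policy with the highest expected utility."""
    def expected_utility(policy: str) -> float:
        return sum(p * utility[policy][h] for h, p in credences.items())
    return max(utility, key=expected_utility)


def ea_omu_choice(full_prior: Prior, aware: Set[str], utility: Utility) -> str:
    # Awareness growth changes `aware`; evidence is deliberately ignored.
    return best_policy(restrict(full_prior, aware), utility)


def updateful_choice(full_prior: Prior, aware: Set[str],
                     evidence: Set[str], utility: Utility) -> str:
    # An ordinary updater also conditions on the evidence.
    return best_policy(restrict(full_prior, aware & evidence), utility)


# Hypothetical Counterfactual-Mugging-style toy: "pay" earns 200 on heads,
# costs 100 on tails, and costs 100 under a hypothesis ("no_reward") that
# the agent has not conceived of at first.
FULL_PRIOR = {"heads": 0.25, "tails": 0.25, "no_reward": 0.5}
UTILITY = {
    "pay": {"heads": 200.0, "tails": -100.0, "no_reward": -100.0},
    "refuse": {"heads": 0.0, "tails": 0.0, "no_reward": 0.0},
}
```

In this toy, learning that the coin landed tails does not move the EA-OMU agent off `pay` (an ordinary updater would switch to `refuse`), but becoming aware of `no_reward` does move it to `refuse`. So in this model the agent’s choice is insensitive to evidence but sensitive to awareness growth.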
        
        If that’s what you are saying, then the natural next question is: What if anything does this buy us? It doesn’t solve the commitment races problem, because the problem still remains so long as agents can strategically influence each other’s awareness growth process. E.g. “Ah, I see that you are an EA-OMU agent. I’m going to threaten you, and then when you find out, even though you won’t update, your awareness will grow, and so then you’ll cave. Bwahaha.”
        
        Also, how is this different from the “commitment races in logical time” situation? Like, when I wrote the original commitment races post it was after talking with Abram and realizing that going updateless didn’t solve the problem because agents aren’t logically omniscient, they need to gradually build up more hypotheses and more coherent priors over time. And even if they are updateless with respect to all empirical evidence, i.e. they never update their prior based on empirical evidence, their a priori reasoning probably still results in race dynamics. Or at least so it seemed to me.
        
        I don’t think I fully understand the proposal so it’s likely I’m missing something here.
        
        I do find it plausible that being updateless about empirical (but not logical) stuff at least ameliorates the problem somewhat, and as far as I can tell that’s basically equivalent to saying being EA-OMU is better than being a naive consequentialist at least. But I wish I understood the situation well enough to crisply articulate why.
        JesseClifton Oct 4, 2023, 11:18 AM
        3 points
        3
        
        “But it’s true that if you had been aware from the beginning that you were going to be threatened, you would have wanted to give in.”
        
        To clarify, I didn’t mean that if you were sure your counterpart would Dare from the beginning, you would’ve wanted to Swerve. I meant that if you were aware of the possibility of Crazy types from the beginning, you would’ve wanted to Swerve. (In this example.)
        
        I can’t tell if you think that (1) being willing to Swerve in the case that you’re fully aware from the outset (because you might have a sufficiently high prior on Crazy agents) is a problem, or (2) this somehow only becomes a problem in the open-minded setting (even though, once their awareness grows, the EA-OMU agent is acting according to the exact same prior as they would’ve if they started out fully aware).
        
        (The comment about regular ol’ exploitability suggests (1)? But does that mean you think agents shouldn’t ever Swerve, even given arbitrarily high prior mass on Crazy types?)
        
        “What if anything does this buy us?”
        
        In the example in this post, the ex ante utility-maximizing action for a fully aware agent is to Swerve. The agent starts out not fully aware, and so doesn’t Swerve unless they are open-minded. So it buys us being able to take actions that are ex ante optimal for our fully aware selves when we otherwise wouldn’t have due to unawareness. And being ex ante optimal from the fully aware perspective seems preferable to me to being, e.g., ex ante optimal from the less-aware perspective.
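
As a rough illustration of why Swerve can be ex ante optimal for the fully aware agent, here is a minimal sketch of the expected-utility comparison. The payoff numbers are hypothetical, not the ones from the post, and it assumes that a Crazy counterpart always Dares while a non-Crazy counterpart Swerves.

```python
from typing import Dict, Tuple

# Hypothetical payoffs to me, keyed by (my action, counterpart's action).
PAYOFFS: Dict[Tuple[str, str], float] = {
    ("dare", "dare"): -10.0,    # crash
    ("dare", "swerve"): 1.0,    # I win
    ("swerve", "dare"): -1.0,   # I back down
    ("swerve", "swerve"): 0.0,  # both back down
}


def expected_utilities(p_crazy: float,
                       payoffs: Dict[Tuple[str, str], float] = PAYOFFS
                       ) -> Dict[str, float]:
    """Expected utility of each action given prior mass p_crazy on Crazy types.

    Assumption: Crazy counterparts always Dare, all others Swerve.
    """
    return {
        mine: p_crazy * payoffs[(mine, "dare")]
        + (1.0 - p_crazy) * payoffs[(mine, "swerve")]
        for mine in ("dare", "swerve")
    }


def ex_ante_best(p_crazy: float) -> str:
    """Return the action with the highest expected utility under the prior."""
    eus = expected_utilities(p_crazy)
    return max(eus, key=eus.get)
```

With these numbers Swerve wins once the prior on Crazy types exceeds 0.1. An agent unaware of Crazy types effectively plans with `p_crazy = 0` and Dares; the same agent with the fully aware prior Swerves if that prior puts enough mass on Crazy types.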
        
        More generally, we are worried that agents will make commitments based on “dumb” priors (because they think it’s dangerous to think more and make their prior less dumb). And EA-OMU says: No, you can think more (in the sense of becoming aware of more possibilities), because the right notion of ex ante optimality is ex ante optimality with respect to your fully-aware prior. That’s what it buys us.
        
        And revising priors based on awareness growth differs from updating on empirical evidence because it only gives other agents incentives to make you aware of things you would’ve wanted to be aware of ex ante.
        
        “they need to gradually build up more hypotheses and more coherent priors over time”
        
        I’m not sure I understand—isn’t this exactly what open-mindedness is trying to (partially) address? I.e., how to be updateless when you need to build up hypotheses (and, as mentioned briefly, better principles for specifying priors).
- SMK Jul 25, 2023, 10:27 PM
  6 points
  0
  Thanks.
  
  “Roughly, you don’t actually get to commit your future-self to things. Instead, you just do what you (in expectation) would have committed yourself to given some reconstructed prior.”
  
  Agreed.
  
  “Just as a literature pointer: If I recall correctly, Chris Meacham’s approach in “Binding and Its Consequences” is ultimately to estimate your initial credence function and perform the action from the plan with the highest EU according to that function.”
  
  Yes, that’s a great paper! (I think we might have had a footnote on cohesive decision theory in a draft of this post.) Specifically, I think the third version of cohesive decision theory which Meacham formulates (in footnote 34), and variants thereof, are especially relevant to dynamic choice with changing awareness. The general idea (as I see it) would be that you optimize relative to your ur-priors, and we may understand the ur-prior function as the prior you would or should have had if you had been more aware. So when you experience awareness growth, the ur-priors change (and thus the evaluation of a given plan will often change as well).
  
  “He doesn’t talk about awareness growth, but open-mindedness seems to fit in nicely within his framework (or at least the framework I recall him having).”
  
  (Meacham actually applies the ur-prior concept and ur-prior conditionalization to awareness growth in this paper.)