Once the 1 has been processed, it might be too late. A single bit of irrelevant information seems easy to ignore, but what if the preferences of the agent after viewing the 1 are different from the preferences of the agent before viewing the 1? This might not be the case in this problem, but it is conceivable to me. Then the agent cannot and should not just forget the 1, unless it is forced to by some pre-commitment mechanism set up by the agent before viewing the 1.
I think that in this problem it makes sense for the agent to pretend it did not see the 1, but this might not be true in all cases.
For example: if the situation were that copy 1 would be terminated if the letters matched and copy 2 would be terminated if the letters did not match, I would choose randomly so that I could not be taken advantage of by my other copy.
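A minimal numerical check of that intuition, under assumptions added here purely for illustration (each copy picks one of two letters, A or B; copy 1 is terminated on a match, copy 2 on a mismatch; `match_probability` is a made-up helper name): a uniform random choice pins the match probability at 1/2 no matter what the other copy does, whereas a deterministic choice lets the other copy force whichever outcome it prefers.

```python
def match_probability(p_self_A: float, p_other_A: float) -> float:
    """Probability that both copies output the same letter,
    given each copy's probability of choosing 'A'."""
    return p_self_A * p_other_A + (1 - p_self_A) * (1 - p_other_A)

# Uniform random choice fixes the match probability at 0.5
# regardless of the other copy's strategy.
for p_other_A in (0.0, 0.25, 0.5, 1.0):
    assert abs(match_probability(0.5, p_other_A) - 0.5) < 1e-12

# A deterministic choice can be exploited: if I always pick 'A',
# the other copy can force a match (or a mismatch) with certainty.
print(match_probability(1.0, 1.0))  # 1.0 -> always a match
print(match_probability(1.0, 0.0))  # 0.0 -> never a match
```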
A single bit of irrelevant information seems easy to ignore, but what if the preferences of the agent after viewing the 1 are different from the preferences of the agent before viewing the 1? [...] Then the agent cannot and should not just forget the 1 …
To be clear, the input was given to the agent, but the agent didn’t “look at it” prior to choosing an input-output function. Imagine that the agent was handed the input, but, before looking at it, the agent stored the input in external memory. Before looking at what the input was, the agent chooses an optimal input-output function f. Only once such a function is chosen does the agent look at what the input was. The agent then outputs whatever f maps the input to. (A few years back, I wrote a brief write-up describing this in more detail.)
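Here is a minimal sketch of that procedure (in Python, my own illustration rather than anything from the write-up mentioned above; `possible_inputs`, `outputs`, and `expected_utility` are placeholder names, and the toy payoffs at the end are made up, not Wei Dai’s): the agent scores every input-output function by its expected utility over all possible inputs, picks the best one before reading the stored input, and only then applies it.

```python
from itertools import product

def choose_policy(possible_inputs, outputs, expected_utility):
    """Pick the input-output function with the highest expected utility,
    evaluated before the actual input is ever looked at."""
    best_f, best_value = None, float("-inf")
    # Enumerate every function from inputs to outputs as a dict.
    for assignment in product(outputs, repeat=len(possible_inputs)):
        f = dict(zip(possible_inputs, assignment))
        value = expected_utility(f)  # scores f across all possible inputs
        if value > best_value:
            best_f, best_value = f, value
    return best_f

def run_agent(stored_input, possible_inputs, outputs, expected_utility):
    # The stored input is not consulted while the policy is being chosen.
    f = choose_policy(possible_inputs, outputs, expected_utility)
    return f[stored_input]  # only now is the stored input looked at

# Toy example (payoffs made up): both copies run the same f, so the joint
# outcome is (f[1], f[2]); this utility rewards them for answering differently.
toy_utility = lambda f: 10 if f[1] != f[2] else 1
print(run_agent(1, [1, 2], ["A", "B"], toy_utility))  # -> 'A'
print(run_agent(2, [1, 2], ["A", "B"], toy_utility))  # -> 'B'
```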
Now, if, as you suggest, looking at the input will change the agent’s preferences, then this is all the more reason why the agent will want to choose its input-output map before looking at the input. For, suppose that the agent hasn’t looked at the input yet. (The input was stored, sight-unseen, in external memory.) If the agent’s future preferences will be different, then those future preferences will be worse than the present preferences, so far as the present agent is concerned. Any agent with different preferences might work at cross purposes to your own preferences, even if this agent is your own future self. Therefore, if you anticipate that your preferences are about to be changed, then that should encourage you to make all important decisions now, while you still have the “right” preferences.
(I assume that we’re talking about preferences over states of the world, and not “preferences” resulting from mere ignorance. My preference is to get a box with a diamond in it, not a box that I wrongly think has a diamond in it.)
… unless it is forced to by some pre-commitment mechanism set up by the agent before viewing the 1.
I don’t think that “pre-commitment” is the right way to think about this. The agent begins the scenario running a certain program. If that program has the agent setting aside the input sight unseen and choosing an input-output function prior to looking at the input, and then following that input-output function, then that is just what the agent will do — not because of force or pre-commitment, but just because that is its program.
(I wrote a post a while back speculating on how this might “feel” to the agent “on the inside”. It shouldn’t feel like being forced to follow a previous commitment.)
I don’t think we actually disagree on anything substantial.
I was partly going on the fact that, in Wei Dai’s example, the agent was told the number 1 before it was even told what the experiment was. I think our disagreement is only about how to interpret Wei Dai’s thought experiment.
Do you agree with the following statement?
“UDT1.1 is good, but we have to be clear about what the input is. The input is all information that you have not yet received (or not yet processed). All the other information that you have should be viewed as part of the source code of your decision procedure, and may change your probabilities and/or your utilities.”
“UDT1.1 is good, but we have to be clear about what the input is. The input is all information that you have not yet received (or not yet processed). All the other information that you have should be viewed as part of the source code of your decision procedure, and may change your probabilities and/or your utilities.”
Yes, I agree.
I could quibble with the wording of the part after the last comma. It seems more in line with the spirit of UDT to say that, if an agent’s probabilities or utilities “change”, then really what happened is that the agent was replaced by a different agent. After all, the “U” in “UDT” stands for “updateless”. Agents aren’t supposed to update their probabilities or utilities. But this is not a significant point.