The claim should be: most agents will not immediately break the vase.
I don’t see why that claim is correct either, for a similar reason. If you’re assuming here that most reward functions incentivize avoiding immediately breaking the vase, then I would argue that that assumption is incorrect; to support this, I would point to the same involution from my previous comment.
I’m not assuming that they incentivize anything. They just do! Here’s the proof sketch (for the full proof, you’d subtract a constant vector from each set, but that’s not relevant for the intuition).
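A minimal sketch of the counting argument, assuming the paper's visit-distribution setup (the set names $F_{\mathrm{keep}}$, $F_{\mathrm{break}}$ and the permutation $\phi$ below are illustrative notation, not the paper's):

```latex
% Illustrative sketch: why "most" reward functions prefer not breaking the vase,
% given a suitable state permutation (notation is illustrative, not the paper's).
\[
\begin{aligned}
&F_{\mathrm{keep}} = \{\, f^{\pi}(\gamma) : \pi \text{ does not immediately break the vase} \,\}, \quad
 F_{\mathrm{break}} = \{\, f^{\pi}(\gamma) : \pi \text{ immediately breaks the vase} \,\},\\[4pt]
&\text{where } f^{\pi}(\gamma) \in \mathbb{R}^{|S|} \text{ is the discounted state-visit distribution of } \pi
 \text{ and } V^{\pi}(r) = f^{\pi}(\gamma) \cdot r.\\[6pt]
&\text{Assume a state permutation } \phi \text{ with } \phi \cdot F_{\mathrm{break}} \subseteq F_{\mathrm{keep}}
 \quad (\text{every option after breaking has a counterpart after not breaking}).\\[6pt]
&\text{Then for any reward distribution } \mathcal{D} \text{ invariant under state permutations (e.g.\ IID rewards):}\\
&\qquad \Pr_{r \sim \mathcal{D}}\!\Big( \max_{f \in F_{\mathrm{keep}}} f \cdot r \;\ge\; \max_{f \in F_{\mathrm{break}}} f \cdot r \Big) \;\ge\; \tfrac{1}{2},\\[4pt]
&\text{because whenever } r \text{ strictly favors } F_{\mathrm{break}}, \text{ the permuted reward } \phi \cdot r \text{ favors } F_{\mathrm{keep}},\\
&\text{and } \phi \text{ pairs such rewards up without double counting (the constant-vector subtraction doesn't affect the pairing).}
\end{aligned}
\]
```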
You’re playing a tad fast and loose with your involution argument. Unlike the average-optimal case, you can’t just map one set of states to another for all-discount-rates reasoning.
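To unpack that a bit (a sketch under the same illustrative notation as above; the limit formula is the standard normalized-visit-distribution one, not a quote from the paper):

```latex
% Sketch: why an involution that works for average-optimality need not work for all gamma.
\[
\begin{aligned}
&\textbf{Average-optimal } (\gamma \to 1):\quad
 \text{value depends only on } d^{\pi} = \lim_{\gamma \to 1} (1-\gamma)\, f^{\pi}(\gamma),\\
&\qquad \text{the long-run state frequencies, so it suffices to map one option set's reachable}\\
&\qquad \text{recurrent-state distributions onto the other's.}\\[6pt]
&\textbf{All } \gamma \in [0,1]:\quad
 \text{value is } f^{\pi}(\gamma) \cdot r, \text{ and } f^{\pi}(\gamma) \text{ also records \emph{when} each state is visited,}\\
&\qquad \text{so the permutation has to respect the transition structure along the whole trajectory,}\\
&\qquad \text{not just which states are occupied in the limit.}
\end{aligned}
\]
```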
Thanks for the figure. I’m afraid I didn’t understand it. (I assume this is a gridworld environment; what does “standing near intact vase” mean? Can the robot stand in the same cell as the intact vase?)
You’re playing a tad fast and loose with your involution argument. Unlike the average-optimal case, you can’t just map one set of states to another for all-discount-rates reasoning.
I don’t follow. (To be clear, I was not trying to apply any theorem from the paper via that involution.) But does this mean you are NOT making that claim (“most agents will not immediately break the vase”) in the limit of the discount rate going to 1? My understanding is that the main claim in the abstract of the paper is meant to assume that setting, based on the following sentence from the paper:
Proposition 6.5 and proposition 6.9 are powerful because they apply to all γ∈[0,1], but they can only be applied given hard-to-satisfy environmental symmetries.