Coherence implies mutual information between actions. That is, to be coherent, your actions can’t be independent. This is true under several different definitions of coherence, and can be seen in the following circumstances:
When trading between resources (uncertainty over utility function). Given some prior distribution over your utility function, trading 3 apples for 2 bananas is information that you won’t trade 3 bananas for 2 apples (see the toy simulation below this list).
When taking multiple actions from the same utility function (uncertainty over utility function). Your actions will all have to act like a phased array pushing the variables you care about in some direction.
When taking multiple actions based on the same observation (uncertainty over observation / world-state). Suppose that you’re trying to juggle, and your vision is either reversed or not reversed. The actions of your left arm and right arm will have mutual information, because they both depend, in related ways, on whether your vision has been reversed.
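As a toy illustration of the first case (a minimal sketch; the log-normal prior over utility weights and the specific exchange rates are assumptions made up for the example): sample a utility function from a prior, have the agent accept each trade iff it raises utility, and estimate the mutual information between the two accept/reject decisions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed prior over the agent's utility function: independent log-normal
# weights on apples and bananas (purely illustrative).
u_apple = rng.lognormal(size=n)
u_banana = rng.lognormal(size=n)

# Action 1: accept "give 3 apples, get 2 bananas" iff it raises utility.
a1 = 2 * u_banana > 3 * u_apple
# Action 2: accept "give 3 bananas, get 2 apples" iff it raises utility.
a2 = 2 * u_apple > 3 * u_banana

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits for two binary arrays."""
    mi = 0.0
    for xv in (False, True):
        for yv in (False, True):
            p_xy = np.mean((x == xv) & (y == yv))
            p_x, p_y = np.mean(x == xv), np.mean(y == yv)
            if p_xy > 0:
                mi += p_xy * np.log2(p_xy / (p_x * p_y))
    return mi

print(mutual_information(a1, a2))                    # clearly positive
print(mutual_information(a1, rng.random(n) < 0.5))   # ~0 for an unrelated coin flip
```

The two decisions are never both “accept” (that would require 2·u_banana > 3·u_apple and 2·u_apple > 3·u_banana simultaneously), so each carries information about the other, while a coin flip unrelated to the utility function does not.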
This would be a full post, but I don’t think it’s important enough to write up.
> Your actions will all have to act like a phased array pushing the variables you care about in some direction.
The nice thing is that this should work even if you are a policy selected by a decision-making algorithm, but are not yourself a decision-making algorithm anymore. There is no preference in any single possible run of the policy at that point: you don’t care about anything now, you only know what you must do here, and not elsewhere. But if all possible runs of the policy are considered together (in the updateless sense of maps from epistemic situations to actions and future policies), the preference is there, in the shape of the whole thing across all epistemic counterfactuals. (Basically, you reassemble a function from the (from, to) pairs of things it maps, found in individual situations.)
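A minimal sketch of that reassembly step, with made-up trade offers and a toy hypothesis space over exchange rates (none of these specifics are from the comment above): no single (situation, action) pair pins down the preference, but the map collected across epistemic counterfactuals does.

```python
# Epistemic situations: trade offers of the form (apples given, bananas received).
offers = {"offer_A": (3, 2), "offer_B": (2, 3)}

# (from, to) pairs found in individual counterfactual runs of the policy.
observed_pairs = [("offer_A", "reject"), ("offer_B", "accept")]
policy = dict(observed_pairs)  # the reassembled map from situations to actions

# Hypotheses about the preference: a banana is worth r apples.
hypotheses = [0.5, 1.0, 2.0]

def best_action(r, offer):
    apples_given, bananas_received = offers[offer]
    return "accept" if bananas_received * r > apples_given else "reject"

def consistent(r, pairs):
    return all(best_action(r, s) == a for s, a in pairs)

# Each pair on its own leaves several hypotheses open...
for pair in observed_pairs:
    print(pair, [r for r in hypotheses if consistent(r, [pair])])
# ...but the whole policy, taken across both counterfactuals, narrows it to r = 1.0.
print([r for r in hypotheses if consistent(r, policy.items())])
```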
I guess the at-a-distance part could make use of composing an agent with some of its outer shells into a behavior that forgets internal interactions (within the agent, and between the agent and its proximate environment). The resulting “large agent” will still have basically the same preference with respect to distant targets in the environment, without a need to look inside the small agent’s head, provided the large agent’s external actions can be modeled in a sufficient range of epistemic situations. (These large agents exist in each individual possible situation; they are larger than the small agent within the situation, and they can be compared with other variants of the large agent from different possible situations.)
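A minimal sketch of the composition idea (the shell, the observation encoding, and the toy agents are all placeholder assumptions): the “large agent” is just the composite map from external observations to external actions, so two small agents with different internals that compose to the same map count as the same large-agent behavior.

```python
def large_agent(small_agent_policy, shell_perceive, shell_act):
    """Compose an outer shell with a small agent into a map from external
    observations to external actions; internal interactions are forgotten."""
    def behavior(external_observation):
        internal_obs = shell_perceive(external_observation)  # shell -> agent
        internal_act = small_agent_policy(internal_obs)      # inside the agent's head
        return shell_act(internal_act)                       # agent -> shell
    return behavior

# Two small agents with different internals...
agent_1 = lambda obs: obs.upper()
agent_2 = lambda obs: "".join(c.upper() for c in obs)

# ...wrapped in the same (toy) outer shell.
shell_perceive = lambda ext: ext.strip()
shell_act = lambda act: f"do({act})"

b1 = large_agent(agent_1, shell_perceive, shell_act)
b2 = large_agent(agent_2, shell_perceive, shell_act)

# The large agents can be compared purely by their external behavior over a range
# of epistemic situations, without looking inside the small agent's head.
situations = [" go left ", "go right"]
print(all(b1(s) == b2(s) for s in situations))  # True: same large-agent behavior
```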
It’s not clear what to do with the dependence on the epistemic situation of the small agent. It wants to reduce to dependence on a situation in terms of the large agent, but that doesn’t seem to work. Possibly this needs something like the telephone theorem, with any relevant-in-some-sense dependence of the large agent’s behavior on something becoming dependence on natural external observations (of the large agent) rather than on internal noise (or on the epistemic state of the small agent).