Your comment did clarify for me what Will was talking about. This is an important confusion (to untangle).
An agent’s counterfactual actions feel like the wrong joint to me. I expect an agent’s assertion of its own values has more to do with the interval between what’s known about the reasons for its decisions (including to the agent itself, where introspection and mutual introspection are deep) and the decisions themselves. This is the same principle that doesn’t let it know its decisions in advance of whenever the decisions “actually” happen (as opposed to being enacted on precommitments). In particular, counterfactual behavior can also be taken as decided upon at some point visible to those taking that property (expression of values) into account.