Hence, they would not have seen that their source code is “A”.
Unless something interfered with what they saw; observations need not be pure or true.
Instead, if the agent were to take action Y upon seeing that their source code is “A”, their source code must be something else, perhaps “B”.
And something might have an incentive to interfere if the agent would do X when it “saw its source code was A” and Y when it “saw its source code was B”. While A and B may be mutually exclusive, the actual policy might depend on observations of either.
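A minimal sketch of the point above, assuming a toy setup: the agent's policy reads a report of its own source code, and that report may be tampered with. The names `policy`, `honest_sensor`, `tampered_sensor`, and `act` are hypothetical, introduced only for illustration.

```python
from typing import Callable

def policy(observed_source: str) -> str:
    """Map the (possibly unreliable) observation of one's own source to an action."""
    return "X" if observed_source == "A" else "Y"

def honest_sensor(true_source: str) -> str:
    # Reports the true source label unchanged.
    return true_source

def tampered_sensor(true_source: str) -> str:
    # Assumption: an interfering party always reports "B", whatever the truth is.
    return "B"

def act(true_source: str, sensor: Callable[[str], str]) -> str:
    """The action depends on what is observed, not directly on what is true."""
    return policy(sensor(true_source))

print(act("A", honest_sensor))    # X
print(act("A", tampered_sensor))  # Y
```

The true source is “A” in both calls; only the observation differs, and the action follows the observation.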
Long:
[1] If a program takes long enough to run, it may never be discovered that it halts. In a sense, the fact that its output is determined does not mean it can (or will) be deduced.
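One way to picture this, as a rough sketch rather than anything from the original discussion: run a program under a step budget and report “unknown” when the budget runs out. The helper `run_with_budget` and the Collatz-style `collatz` program are assumptions chosen for illustration.

```python
from typing import Callable, Iterator, Optional

def collatz(n: int) -> Callable[[], Iterator[None]]:
    """Build a program that yields once per step and halts when it reaches 1."""
    def program() -> Iterator[None]:
        m = n
        while m != 1:
            m = m // 2 if m % 2 == 0 else 3 * m + 1
            yield
    return program

def run_with_budget(program: Callable[[], Iterator[None]],
                    max_steps: int) -> Optional[int]:
    """Return the step count if the program halts within the budget, else None."""
    steps = 0
    for _ in program():
        steps += 1
        if steps >= max_steps:
            return None  # undecided: it might halt later, or never
    return steps

print(run_with_budget(collatz(6), 100))  # 8
print(run_with_budget(collatz(6), 3))    # None
```

The program's behavior is fixed in both calls; what changes is whether the observer waited long enough to find out.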
there is no way for two different policies to be compatible with the same source code.
And the same set of inputs.
Formally, a policy is a function mapping an observation history to an action. It is distinct from source code, in that the source code specifies the implementation of the policy in some programming language, rather than itself being a policy function.
Logically, it is impossible for the same source code to generate two different policies. There is a fact of the matter about what action the source code outputs given an observation history (assuming the program halts). Hence there is no way for two different policies to be compatible with the same source code.
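The distinction in the quoted passage can be sketched as follows, assuming a toy policy written as Python source text. `SOURCE_A`, `compile_policy`, and the “cooperate”/“defect” actions are hypothetical; the point is only that compiling the same text twice yields functions that agree on every observation history tried.

```python
from typing import Callable, Sequence

# A policy: a function from an observation history to an action.
Policy = Callable[[Sequence[str]], str]

# Source code: text that implements a policy, not the policy function itself.
SOURCE_A = '''
def policy(history):
    # Cooperate until a defection has been observed.
    return "defect" if "defect" in history else "cooperate"
'''

def compile_policy(source: str) -> Policy:
    """Turn source text into the policy function it implements."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["policy"]

p1 = compile_policy(SOURCE_A)
p2 = compile_policy(SOURCE_A)

histories = [(), ("cooperate",), ("cooperate", "defect")]
# Same source, same history: same action.
print(all(p1(h) == p2(h) for h in histories))  # True
```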
Overall take:
Dynamic versus static:
Consider the numbers 3, 1, 2, 4.
There is more than one sequence of actions that ‘transforms’ the above into 1, 2, 3, 4.
(It can also be transformed into a sorted list by deleting the 3...)
A sorting method, however, does not always take a list and move the first element to the third position, nor does it necessarily do so in every case where the first element is three.
While deterministic, its behavior depends on its input. Given the input, the actions it will take are known (or follow from the source code in principle[1]).
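To make the sorting example concrete, here is a small sketch. It assumes a selection sort that records its swaps (`selection_sort_trace`) alongside a fixed action (`move_first_to_third`); both names are made up for this illustration.

```python
from typing import List, Tuple

def selection_sort_trace(items: List[int]) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Sort a copy of `items`; also return the index swaps that were performed."""
    a = list(items)
    swaps: List[Tuple[int, int]] = []
    for i in range(len(a)):
        # Index of the smallest remaining element.
        j = min(range(i, len(a)), key=a.__getitem__)
        if j != i:
            a[i], a[j] = a[j], a[i]
            swaps.append((i, j))
    return a, swaps

def move_first_to_third(items: List[int]) -> List[int]:
    """A fixed action: move the first element to the third position."""
    a = list(items)
    a.insert(2, a.pop(0))
    return a

print(selection_sort_trace([3, 1, 2, 4]))  # ([1, 2, 3, 4], [(0, 1), (1, 2)])
print(move_first_to_third([3, 1, 2, 4]))   # [1, 2, 3, 4]
print(selection_sort_trace([3, 2, 1, 4]))  # ([1, 2, 3, 4], [(0, 2)])
print(move_first_to_third([3, 2, 1, 4]))   # [2, 1, 3, 4]
```

On 3, 1, 2, 4 the fixed action and the sort reach the same result by different actions. On 3, 2, 1, 4 the fixed action fails while the sort takes a different set of swaps, even though the first element is three in both inputs.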
This can be generalized further to a sorting program that takes both a set of objects and a way of ordering them. Perhaps a program could even be written that reasons about some policy and makes its output conditional on what it finds. Thus the “logical counterfactual” does not exist per se; it is a way of thinking used to handle the different cases when it is not clear which one holds, even though only one may be possible.
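Both generalizations can be sketched briefly. `sort_with` takes the objects and the ordering as inputs; `respond_to` is a hypothetical program that checks what a given policy would do on each possible observation and branches on the result. Neither comes from the original discussion.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")

def sort_with(objects: Iterable[T], ordering: Callable[[T], object]) -> List[T]:
    """Sort `objects` by a caller-supplied ordering key."""
    return sorted(objects, key=ordering)

def respond_to(policy: Callable[[str], str]) -> str:
    """Work out what `policy` would do in each case, then choose an output."""
    would_do = {obs: policy(obs) for obs in ("A", "B")}
    if would_do["A"] == would_do["B"]:
        return "policy ignores the observation"
    return "policy depends on the observation"

print(sort_with([3, 1, 2, 4], lambda x: x))   # [1, 2, 3, 4]
print(sort_with([3, 1, 2, 4], lambda x: -x))  # [4, 3, 2, 1]
print(respond_to(lambda obs: "X"))                          # policy ignores the observation
print(respond_to(lambda obs: "X" if obs == "A" else "Y"))   # policy depends on the observation
```

`respond_to` evaluates both cases even though, for any actual run of the policy, only one observation occurs. That is the sense in which considering the cases is a way of reasoning, not a claim that both happen.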
More specific:
Formally, a policy is a function mapping an observation history to an action. It is distinct from source code, in that the source code specifies the implementation of the policy in some programming language, rather than itself being a policy function.
Though a policy may include or specify (simpler) policies, and by extension source code may as well, the different threads will probably be woven together.