Here’s one way of thinking about this. Imagine a long sequence of instances of ASP. Both the agent and predictor in a later instance know what happened in all the earlier instances (say, because the amount of compute available in later instances is much higher, such that all previous instances can be simulated). The predictor in ASP is a logical inductor predicting what the agent will do this time.
Looking at the problem this way, it looks pretty fair. Since logical inductors can do induction, if an agent takes actions according to a certain policy, then the predictor will eventually learn this, regardless of the agent’s source code. So only the policy matters, not the source code.
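As a rough illustration of the "only the policy matters" point, here is a toy sketch in Python. It is not a logical inductor: the predictor is a simple majority-vote guess over past actions, standing in for induction. The names `agent_a`, `agent_b`, `predict`, and `run`, and the action labels `"one-box"` and `"two-box"`, are made up for this example. The two agents have different source code but the same policy, and the predictor treats them identically.

```python
from collections import Counter
from typing import Callable, List

History = List[str]
Agent = Callable[[History], str]


def agent_a(history: History) -> str:
    """Always one-boxes, written directly."""
    return "one-box"


def agent_b(history: History) -> str:
    """Same policy as agent_a, implemented with different source code."""
    options = ["two-box", "one-box"]
    # (2n + 1) % 2 is always 1, so this always selects "one-box".
    return options[(len(history) * 2 + 1) % 2]


def predict(history: History, default: str = "two-box") -> str:
    """Toy stand-in for induction: guess the most frequent past action.

    This only sees past behaviour, never the agent's source code.
    """
    if not history:
        return default
    return Counter(history).most_common(1)[0][0]


def run(agent: Agent, rounds: int = 20) -> List[bool]:
    """Play a sequence of instances; record whether each prediction was right."""
    history: History = []
    correct: List[bool] = []
    for _ in range(rounds):
        prediction = predict(history)
        action = agent(history)
        correct.append(prediction == action)
        history.append(action)  # later instances know all earlier ones
    return correct


results_a = run(agent_a)
results_b = run(agent_b)
```

In this sketch the predictor is wrong on the first instance (it has no data and falls back to the default) and right on every later one, and the record is the same for both agents. A real logical inductor is far more general than majority voting, so this only shows the shape of the argument.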
There are some formal notions of fairness that include ASP. See Asymptotic Decision Theory.
See also In Logical Time, All Games are Iterated Games.