[Question] What are the causal effects of an agent's presence in a reinforcement learning environment?
Suppose we have an agent A trying to optimise a reward R in an environment S.
How can we tell that the presence of the agent does not affect the environment, and that the measurement (observation) depends on the environment and not only on the agent?
This is related to the measurement problem in quantum mechanics. There, a particle (playing the role of the agent) is in a quantum superposition. Consider an electron with two possible spin configurations, up and down, in the state a|↑⟩+b|↓⟩. When we measure the state, the wavefunction collapses to one definite classical outcome, up or down.
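To make the collapse concrete, here is a minimal sketch of what I mean, assuming the standard Born rule (outcome "up" occurs with probability |a|² once the state is normalised). The function names `measure_spin` and `estimate_p_up` are made up for illustration and do not come from any library.

```python
import random


def measure_spin(a: complex, b: complex, rng: random.Random) -> str:
    """Sample one measurement of the state a|up> + b|down>.

    Assumes the Born rule: P(up) = |a|^2 / (|a|^2 + |b|^2).
    """
    norm = abs(a) ** 2 + abs(b) ** 2
    if norm == 0:
        raise ValueError("a and b cannot both be zero")
    p_up = abs(a) ** 2 / norm
    # After the measurement only one classical outcome remains.
    return "up" if rng.random() < p_up else "down"


def estimate_p_up(a: complex, b: complex, shots: int = 10_000, seed: int = 0) -> float:
    """Estimate P(up) by repeating the measurement on freshly prepared states."""
    rng = random.Random(seed)
    ups = sum(measure_spin(a, b, rng) == "up" for _ in range(shots))
    return ups / shots
```

For example, `estimate_p_up(0.6, 0.8)` should come out close to 0.36, since |0.6|² = 0.36. Each single call to `measure_spin` still returns only "up" or "down", never the superposition itself.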
Moreover, the observer effect notes that measurements of certain systems cannot be made without affecting the system.
The uncertainty principle, in turn, states that certain pairs of quantities, such as position and momentum, cannot both be known to arbitrary precision.
Another way to state the problem: does measuring the state of the action affect the state of the environment?
How are the observations in physics different from the observations we make in RL?
Does the environment state S1 causally influence the next state S2?
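To pin down what I am asking in RL terms, here is a toy sketch under my own assumptions: a simulated environment where reading the state is passive, and where the next state depends only on the current state and the chosen action (the Markov assumption). The class `ChainEnv`, the helper `mean_next_state`, and the 0.9 success probability are all hypothetical and exist only for illustration.

```python
import random


class ChainEnv:
    """Hypothetical 1-D chain: the state is an integer position in [0, size - 1]."""

    def __init__(self, size: int = 5, start: int = 2, seed: int = 0):
        self.size = size
        self.state = start
        self.rng = random.Random(seed)

    def observe(self) -> int:
        # In this simulated setting, observing is passive: nothing changes.
        return self.state

    def step(self, action: int):
        # action is -1 (left) or +1 (right); the move succeeds with
        # probability 0.9 and is reversed otherwise (an assumed noise model).
        move = action if self.rng.random() < 0.9 else -action
        # The next state depends only on the current state and the action.
        self.state = min(self.size - 1, max(0, self.state + move))
        reward = 1.0 if self.state == self.size - 1 else 0.0
        return self.state, reward


def mean_next_state(action: int, trials: int = 2000, seed: int = 0) -> float:
    """Average next state when we intervene and force a given first action."""
    total = 0
    for i in range(trials):
        env = ChainEnv(seed=seed + i)
        next_state, _ = env.step(action)
        total += next_state
    return total / trials
```

In this sketch, calling `observe()` repeatedly leaves the state untouched, while comparing `mean_next_state(+1)` with `mean_next_state(-1)` shows how forcing different actions shifts the distribution of S2 given the same S1. My question is whether real RL settings always keep observation and action this cleanly separated, unlike the quantum case.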