Isn't this just the no free lunch theorem? For every computable decision procedure, you can construct an environment that predicts the procedure's exact output and reacts so as to do maximum damage, making the procedure perform worse than random action selection.
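To make the construction concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of any formal statement: the names `adversarial_reward`, `run`, and `greedy_policy`, the three-action setup, and the 0/1 rewards are all hypothetical. The environment simply runs the target procedure on the current history and pays nothing for the action it predicts. A seeded PRNG stands in for true randomness here, so strictly speaking the "random" agent is also computable and could be targeted by a different environment.

```python
import random
from typing import Callable, List, Tuple

History = List[Tuple[int, float]]   # (action, reward) pairs seen so far
Policy = Callable[[History], int]   # a decision procedure: history -> action

N_ACTIONS = 3  # assumed size of the action set, chosen for illustration


def adversarial_reward(target: Policy, history: History, action: int) -> float:
    """Environment built against `target`: simulate it on the current
    history and pay 0 for the action it would take, 1 for any other."""
    predicted = target(list(history))
    return 0.0 if action == predicted else 1.0


def run(agent: Policy, target: Policy, steps: int) -> float:
    """Let `agent` act in the environment constructed against `target`."""
    history: History = []
    total = 0.0
    for _ in range(steps):
        action = agent(list(history))
        reward = adversarial_reward(target, history, action)
        history.append((action, reward))
        total += reward
    return total


def greedy_policy(history: History) -> int:
    """A deterministic procedure: pick the action with the highest
    accumulated reward so far, breaking ties toward the lowest index."""
    totals = [0.0] * N_ACTIONS
    for action, reward in history:
        totals[action] += reward
    return max(range(N_ACTIONS), key=lambda a: totals[a])


def make_random_policy(seed: int) -> Policy:
    """Uniform random action selection (a seeded PRNG as a stand-in)."""
    rng = random.Random(seed)
    return lambda history: rng.randrange(N_ACTIONS)


if __name__ == "__main__":
    steps = 1000
    print("greedy:", run(greedy_policy, greedy_policy, steps))
    print("random:", run(make_random_policy(0), greedy_policy, steps))
```

The deterministic procedure is predicted on every step and so collects nothing, while the random agent in the same environment avoids the predicted action about two thirds of the time. The same trick works against any procedure the environment can simulate, which is the point of the comment above.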