To add a simple observation to more detailed analysis : human brains have real world noise affecting their computations. So the preference they are going to exhibit when their internal pretences are almost the same is going to be random. This is also the optimal strategy for a game like rock paper scissors: to randomly choose from the 3 classes, because any preference for a class can be exploited like you found out.
We can certainly make AI systems that exhibit randomness whenever 2 actions being considered are close together in value heuristic.
To add a simple observation to more detailed analysis : human brains have real world noise affecting their computations. So the preference they are going to exhibit when their internal pretences are almost the same is going to be random. This is also the optimal strategy for a game like rock paper scissors: to randomly choose from the 3 classes, because any preference for a class can be exploited like you found out.
We can certainly make AI systems that exhibit randomness whenever 2 actions being considered are close together in value heuristic.