Based on other comments, I realize I’m making an assumption about something you haven’t specified. How is uy chosen? If it’s random and independent, then my assertion holds. If it’s selected by an adversary who somehow knows the players’ full strategies, then R is just a way of keeping a secret from the adversary: sequence doesn’t matter, but knowledge does.
Claim 1 says there exists some uy value for which the algorithm gets high regret, so we might as well assume it’s chosen to maximize regret.
Claim 2 says the algorithm has low regret regardless of uy, so we might as well assume it’s chosen to maximize regret.