Oh, another worry: there may not be a stable equilibrium to converge to. Every time M approximates the final result well, Adv may be incentivized to switch to different arguments precisely to make M's predictions wrong. (Or rather, perhaps the stable equilibrium has to be a mixture over policies for this reason, in which case you only get the true answer with some probability.)
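The worry has the same shape as matching pennies: if M is rewarded for predicting which argument Adv makes and Adv is rewarded when M is wrong, no pure-strategy equilibrium exists, and in the unique mixed equilibrium M is right only half the time. A toy sketch (my analogy, not from the original discussion; the two-argument payoff structure is an assumption for illustration):

```python
from itertools import product

# Toy matching-pennies analogy: M predicts which of two arguments
# Adv will make; Adv is rewarded exactly when M's prediction is wrong.
ARGS = ["A", "B"]

def m_payoff(pred, arg):
    return 1 if pred == arg else 0  # M wins when its prediction matches

def adv_payoff(pred, arg):
    return 1 - m_payoff(pred, arg)  # zero-sum: Adv wins when M is wrong

def is_pure_equilibrium(pred, arg):
    # A pure equilibrium requires neither player gains by unilaterally deviating.
    m_ok = all(m_payoff(pred, arg) >= m_payoff(p, arg) for p in ARGS)
    adv_ok = all(adv_payoff(pred, arg) >= adv_payoff(pred, a) for a in ARGS)
    return m_ok and adv_ok

pure = [cell for cell in product(ARGS, ARGS) if is_pure_equilibrium(*cell)]
print("pure equilibria:", pure)  # → []: someone always wants to switch

# In the mixed equilibrium both sides randomize 50/50, so M's prediction
# matches Adv's chosen argument with probability q*p + (1-q)*(1-p) = 0.5.
p = q = 0.5
print("P(M correct):", q * p + (1 - q) * (1 - p))  # → 0.5
```

The empty list is the "no stable equilibrium to converge to" worry in miniature, and the 0.5 is the "true answer only with some probability" outcome under the mixed equilibrium.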