To be precise, in every case where the environment only cares about your actions and not what algorithm you use to produce them, any algorithm that can be improved by randomization can always be improved further by derandomization.
Isn’t this trivially true? Isn’t the most (time) efficient algorithm always a giant lookup table?
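For concreteness, here is a minimal sketch of the averaging argument behind the quoted claim, under assumptions of my own: the environment (`targets`), the guesser, and the scoring rule are all hypothetical and chosen only for illustration. The average score over seeds can never exceed the score of the best single seed, and fixing that seed leaves a deterministic list of actions, which is effectively a lookup table.

```python
import random

def randomized_guesser(seed, n_rounds=20):
    # Hypothetical randomized algorithm: guesses 0, 1 or 2 each round.
    rng = random.Random(seed)
    return [rng.randrange(3) for _ in range(n_rounds)]

def score(actions, targets):
    # The environment only looks at the actions, not at how they were produced.
    return sum(a == t for a, t in zip(actions, targets))

# Hypothetical fixed environment (an assumption made for this sketch).
targets = [i % 3 for i in range(20)]

seeds = range(50)
scores = [score(randomized_guesser(s), targets) for s in seeds]

# Expected performance of the randomized algorithm over these seeds.
average_score = sum(scores) / len(scores)

# "Derandomize" by fixing the best seed; an average cannot exceed its maximum.
best_seed = max(seeds, key=lambda s: scores[s])
best_score = scores[best_seed]

# With the seed fixed, the behaviour is just a precomputed list of actions.
lookup_table = randomized_guesser(best_seed)
```

This only shows that some fixed seed does at least as well as the average; it says nothing about how expensive it is to find that seed, or to build the table, which seems to be the part the "trivially true" objection leaves out.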