V_V comments on Versions of AIXI can be arbitrarily stupid

V_V 17 Aug 2015 10:26 UTC
3 points

In theory, changing the exploration rate and changing the prior are equivalent.

Not really. Standard AIXI is completely deterministic, while the usual exploration strategies for reinforcement learning, such as ε-greedy and softmax, are stochastic.
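
To make the distinction concrete, here is a minimal sketch in Python. It is not AIXI: the `greedy` function is only a stand-in for any deterministic argmax rule over value estimates, and the names `greedy`, `epsilon_greedy`, `softmax`, `q_values`, `epsilon` and `temperature` are all illustrative assumptions of mine, not anything from the original post.

```python
import math
import random


def greedy(q_values):
    """Deterministic rule: always return the index of the highest estimate.

    Ties are broken by taking the first maximal index, so the same input
    always yields the same action.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick an action uniformly at random,
    otherwise fall back to the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def softmax(q_values, temperature, rng):
    """Sample an action with probability proportional to exp(q / temperature)."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 0.5, 0.2]      # hypothetical value estimates for three actions
    rng = random.Random(0)   # seeded only so the demo is repeatable
    print({greedy(q) for _ in range(1000)})
    print({epsilon_greedy(q, 0.3, rng) for _ in range(1000)})
    print({softmax(q, 1.0, rng) for _ in range(1000)})
```

With fixed estimates, the greedy rule returns the same action on every call, whereas the two stochastic rules spread their choices over several actions. Changing the estimates (the analogue of changing the prior) moves which single action the greedy rule picks, but it never turns that rule into a randomized one.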