Do I understand correctly that in general the elements of A, B, C, are achievable probability distributions over the set of n possible outcomes? (But that in the examples given with the deterministic environments, these are all standard basis vectors / one-hot vectors / deterministic distributions ?)
Yes.
Then no permutation of the set of observations-histories would convert any element of A into an element of B, nor visa versa.
Nice catch. In the stochastic case, you do need a permutation-enforced similarity, as you say (see definition 6.1: similarity of visit distribution sets in the paper). They won’t apply for all A, B, because that would prove way too much.
Yes.
Nice catch. In the stochastic case, you do need a permutation-enforced similarity, as you say (see definition 6.1: similarity of visit distribution sets in the paper). They won’t apply for all A, B, because that would prove way too much.