I see a gap in the argument at the point where you “give” the agent a probability distribution, and it “gets” from that a probability distribution to use to make its choice. A probability distribution is an infinite object, in this case an assignment of a real number to each element of A. You can only give such a thing to a finite automaton by streaming it as a string of bits that progressively give more and more information about the distribution, approaching it in the limit. The automaton must calculate from this a stream of bits that similarly represents its own probability distribution, each bit of output depending on only a finite part of the input. However, it cannot use the distribution to calculate its action until it has the exact values, which it never will. (That is generally true even if the automaton is not calculating based on an input distribution. In practice, of course, one just uses the computer’s IEEE 754 “reals” and hopes the result will be close enough. But that will not do for Mathematics.)
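To make the difficulty concrete, here is a toy sketch of my own (the function `compare_streams` is hypothetical, not anything from the post): even the simplest decision over streamed reals, deciding which of two is larger, cannot be made from any finite prefix when the two reals happen to be equal.

```python
from itertools import count, islice

def compare_streams(a_digits, b_digits, max_digits=1000):
    """Try to decide which of two reals in [0, 1) is larger, reading one
    decimal digit of each per step.  A differing digit settles the
    question, but while the streams agree, no finite prefix can justify
    committing to an answer."""
    for a, b in islice(zip(a_digits, b_digits), max_digits):
        if a < b:
            return "<"
        if a > b:
            return ">"
    return None  # undecided; for equal reals this is the permanent state

# Two independent digit streams for 1/3 = 0.333...: equal as reals, so
# the comparison never resolves, however many digits are read.
verdict = compare_streams((3 for _ in count()), (3 for _ in count()))
```

The bound `max_digits` is only there so the demonstration halts; the point is that no choice of bound ever suffices.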
If it is going to repeatedly choose actions from the same distribution, then I imagine there may be ways for it to sample from increasingly good finite approximations to the distribution, correcting earlier errors as the approximation sharpens, e.g. by sampling an action with slightly higher probability than the distribution specifies to compensate for too low an approximation used earlier. But for a single choice, a finite automaton cannot sample from a probability distribution specified as a set of real numbers, unless you allow it a tolerance of some sort.
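One way such self-correction could work (a sketch under my own assumptions, not a construction from the discussion): at each draw, sample actions in proportion to their *deficit*, the gap between the target count implied by the currently known precision and the count actually drawn so far.

```python
import random

def corrected_sample(approx_probs, counts, t, rng):
    """Draw one action, biased toward actions whose observed count lags
    the target implied by the current (finite-precision) probabilities.
    Sampling in proportion to these deficits lets later, more precise
    approximations correct errors introduced by earlier, cruder ones."""
    deficits = [max(0.0, (t + 1) * p - c)
                for p, c in zip(approx_probs, counts)]
    total = sum(deficits)
    if total == 0.0:                     # degenerate case: no deficit left
        deficits, total = list(approx_probs), sum(approx_probs)
    r = rng.random() * total
    for action, d in enumerate(deficits):
        r -= d
        if r <= 0.0:
            return action
    return len(deficits) - 1

rng = random.Random(0)
p_true = [1 / 3, 2 / 3]
counts = [0, 0]
for t in range(30000):
    digits = min(1 + t // 1000, 12)              # precision grows over time
    approx = [round(x, digits) for x in p_true]  # what the agent sees at t
    counts[corrected_sample(approx, counts, t, rng)] += 1
freqs = [c / sum(counts) for c in counts]        # converges to p_true
```

The empirical frequencies converge to the true distribution even though no single draw ever saw the exact probabilities, which is exactly the repeated-choice loophole; a single draw gets no such guarantee.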
Furthermore, to do these calculations, some representation of the real numbers must be used that makes the operations computable. The operations must be continuous functions, because all computable functions on real numbers are continuous. This generally means using an ambiguous representation of the reals, one that gives more than one representation to most real numbers. Each real is represented by an infinite string drawn from some finite alphabet of “digits”. However, the topology on the space of digit-strings in this context has to be the one generated by the sets of strings sharing a given finite prefix. These basic open sets are also closed, making the space totally disconnected. Brouwer’s fixed-point theorem therefore does not apply, and there are trivial examples of continuous functions without fixed points, e.g. any map that permutes the set of digits position by position. Such a function may not correspond to any function on the reals, because multiple representations of the same real number might be mapped to representations of different real numbers.
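A tiny illustration of such a map (my own toy example, in base 3): cyclically permuting each digit is continuous in the prefix topology, fixes no string, and fails to induce any well-defined function on the reals.

```python
def rot(digits):
    """Cyclically permute ternary digits 0→1→2→0, position by position.
    Continuous in the prefix topology (the first n output digits depend
    only on the first n input digits), yet it has no fixed point: every
    digit of the output differs from the corresponding input digit."""
    return [(d + 1) % 3 for d in digits]

def ternary_value(digits):
    """The real number in [0, 1] named by a (truncated) ternary string."""
    return sum(d / 3 ** (i + 1) for i, d in enumerate(digits))

# 0.1000...₃ and 0.0222...₃ name the same real, 1/3 — but rot sends them
# to 0.2111...₃ ≈ 5/6 and 0.1000...₃ = 1/3, two different reals.  So this
# continuous, fixed-point-free map on strings is not a map on reals at all.
a = [1] + [0] * 40
b = [0] + [2] * 40
```

(The lists are 41-digit truncations of the infinite strings, long enough that the two readings of 1/3 agree to machine precision.)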
I think this is actually not as serious a problem as you make it out to be, for various reasons.
Finite automata don’t actually exist in the real world. This was my whole point about (for example) transistors not actually working as binary on/off switches. In the real world there are ways to give agents access to real numbers, just with some measurement error both on the part of the forecaster and on the part of the agent reading the input. The setup of giving a wave-function input to a QTM is where this works best: you can’t get the squared norms of the coefficients exactly right, but you can get them right “in probability”, in the sense that an error of more than ε has probability less than δ, etc.
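As a sketch of what “right in probability” means here (the sampling model and names are my own simplification, not the QTM formalism): estimate an outcome probability by repeated measurement, choosing the number of shots via Hoeffding’s inequality so that an error above ε occurs with probability below δ.

```python
import math
import random

def estimate_prob(p_true, shots, rng):
    """Estimate an outcome probability (standing in for a squared
    amplitude) by repeated measurement: each shot sees the outcome with
    probability p_true, and the estimate is the empirical frequency."""
    return sum(rng.random() < p_true for _ in range(shots)) / shots

# Hoeffding's inequality: shots >= ln(2/delta) / (2 * eps**2) ensures the
# estimate is within eps of the truth except with probability < delta.
p_true, eps, delta = 0.3, 0.05, 0.01
shots = math.ceil(math.log(2 / delta) / (2 * eps ** 2))

rng = random.Random(1)
errors = [abs(estimate_prob(p_true, shots, rng) - p_true)
          for _ in range(200)]
bad_fraction = sum(e > eps for e in errors) / len(errors)
```

The guarantee is exactly of the promised (ε, δ) form: no finite number of shots pins the value down, but any desired confidence is reachable at finite cost.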
Given that you don’t have a continuous function but (say) a convex combination of a continuous function with some random noise, Brouwer’s fixed point theorem still tells you that the continuous part has a fixed point, and then the random noise added on top of it just throws you off by some small distance. In other words, there isn’t a catastrophic failure where a little bit of noise on top of a continuous process totally throws you off; continuous function + some noise on top still has an “almost fixed point” with probability >1−δ.
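A minimal numerical sketch of this “almost fixed point” claim (cos on [0, 1] is my stand-in for an arbitrary continuous map): the fixed point of the continuous part is displaced by at most the noise bound, never catastrophically.

```python
import math
import random

def noisy_map(x, eta, rng):
    """A continuous map on [0, 1] — cos, whose unique fixed point is the
    Dottie number ≈ 0.739 — plus bounded random noise of size at most
    eta, clipped so the result stays in [0, 1]."""
    return min(1.0, max(0.0, math.cos(x) + rng.uniform(-eta, eta)))

# Locate the fixed point of the noiseless continuous part by iteration.
x_star = 0.5
for _ in range(200):
    x_star = math.cos(x_star)
residual = abs(math.cos(x_star) - x_star)  # ≈ 0: a genuine fixed point

# The noisy map moves x_star by at most eta: an "almost fixed point"
# that survives the perturbation rather than being destroyed by it.
rng = random.Random(0)
eta = 1e-3
displacements = [abs(noisy_map(x_star, eta, rng) - x_star)
                 for _ in range(1000)]
```

Here the noise is bounded rather than merely small with high probability, which makes the displacement bound deterministic; the probabilistic version just weakens “always within η” to “within η except with probability δ”.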
I agree with you that if you had a finite automaton, then this whole trick doesn’t work. The relevant notion of continuity for Turing machines is continuity on the Cantor set, which must hold because a TM that halts can only read finitely many bits of the input, and so can only establish that its input lies in some nonempty open subset of the Cantor set. However, as you point out, the Cantor set is totally disconnected, and there is no way to map that notion of continuity onto the one that uses real numbers.
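To spell that out (a toy model of my own, not a formal TM): a machine that halts after reading finitely many bits gives the same answer on the entire cylinder of inputs sharing that prefix, and that is precisely continuity with respect to the Cantor-set topology.

```python
from itertools import chain, repeat

def decide(bit_stream):
    """A machine that halts after reading 3 bits and outputs their
    parity.  Because the answer depends only on that finite prefix, it
    is constant on the whole cylinder set of streams extending the
    prefix — the preimage of each output is open, which is exactly
    continuity with respect to the Cantor-set topology."""
    prefix = [next(bit_stream) for _ in range(3)]
    return sum(prefix) % 2

# Streams sharing the prefix 1,0,1 but differing everywhere afterwards
# fall in the same cylinder and therefore get the same answer.
out_a = decide(chain([1, 0, 1], repeat(0)))
out_b = decide(chain([1, 0, 1], repeat(1)))
```

Since cylinders are clopen, nothing in this picture can exploit connectedness, which is why Brouwer-style arguments get no purchase.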
In contrast, quantum Turing machines that take inputs from a Hilbert space are actually continuous in the more traditional real- or complex-analytic sense of the term, so here we have no such problems.