This post is part of the Asymptotic Logical Uncertainty series. In this post, I will give an algorithm, BenfordLearner, that passes the Benford test. This algorithm further satisfies limN→∞,N∈SBenfordLearner(N)=p whenever S is an irreducible pattern with probability p.
In the following algorithm, we define logq to be 1 whenever q≤2.
BenfordLearner(N)
1: P=0
2: M=N
3: for j=0 to N
4: M_Y=0
5: for Y a TM expressible in K_Y <log N bits
6: M_X=N
7: for Y a TM expressible in K_X <log N bits
8: if UTM(X,N) and UTM(Y,N) both accept in time T(N)
9: A=0
10: R=0
11: i=1
12: while i<=N
13: if UTM(X,i) and UTM(Y,i) both accept in time T(i)
14: if UTM(L,i) accepts in time T(N)
15: A=A+1
16: else if UTM(L,i) rejects in time T(N)
17: R=R+1
18: else
19: i=N
20: i=i+1
21: F=A/(A+R)
22: Q=A+R
23: if max(K_X,|F-j/N|*sqrt(Q)/K_Y/sqrt(log log Q))<M_X
24: M_X=max(K_X,|F-j/N|*sqrt(Q)/K_Y/sqrt(log log Q))
25: if M_X>M_Y
26: M_Y=M_X
27: if M_Y<M
28: M=M_Y
29: P=j/N
30: return P
Let TM(N) be the set of all Turing machines X expressible in at most logN bits such that UTM(X,N) accepts in time at most T(N). The encoding of Turing machines must be prefix-free, which in particular means that no Turing machine is encoded in 0 bits. Let JN denote the set of rational numbers of the form jN with j=0,…,N.
For X and Y Turing machines, let K(X) be the number of bits necessary to encode X. Let S′(X,Y) be the subset of natural numbers i which are accepted by both UTM(X,i) and UTM(Y,i) in time at most T(i). Let QN(X,Y) be the greatest number less than or equal to N such that for every s in the first QN(X,Y) elements of S′, UTM(L,s) halts in time T(N). Let FN(X,Y) be the proportion of the first QN(X,Y) elements of S′ which L accepts. Let BN(X,Y,P)=max(K(X),|FN(X,Y)−P|√QN(X,Y)K(Y)√loglogQN(X,Y)).
Lemma: The output of BenfordLearner(N) is in
argminP∈JNmaxY∈TM(N)minX∈TM(N)BN(X,Y,P).
Proof: The algorithm has three for loops, the outer ranging over j=0,…N and the inner two ranging over Y and X respectively, both restricted to Turing machines expressible in logN bits. The condition on line 8 means that X and Y effectively range over all Turing machines in TM(N), and P=jN ranges over JN.
The inner while loop will increment the variables A or R a total of exactly QN(X,Y) times. Thus, Q is set to QN(X,Y) in line 22. Similarly, F is sent to FN(X,Y) in line 21. Clearly KX and KY are K(X) and K(Y) respectively. Therefore, the expression on lines 23 and 24 is BN(X,Y,P).
Considering the for loops from inner to outer, we minimize this quantity in X, maximize it in Y, and find P of the form j/N minimizing the whole quantity. The P returned is therefore a minimizer of maxY∈TM(N)minX∈TM(N)BN(X,Y,P).□
Our goal is to quickly assign logical probabilities. The code is not optimized to run efficiently. The following proposition is just to ensure that the run time is not far off from T(N).
Proposition: The run time of BenfordLearner(N) is in O(T(N)N4logT(N))).
Simulating UTM on any input for T time steps can be done in time T in time cTlogT for some fixed constant c by simulating U for T steps \cite{Hennie1966}. The bulk of the run time comes from simulating Turing machines on lines 8, 13, 14, and 16. Each of these lines takes at most cT(N)logT(N) time, and we enter each of these lines at most N4 times. Therefore, the program runs in time O(T(N)N4logT(N)).
□
In the next post, we will use the above Lemma and the notion of irreducible patterns to show that this algorithm passes the Benford Test.
Asymptotic Logical Uncertainty: A Benford Learner
This post is part of the Asymptotic Logical Uncertainty series. In this post, I will give an algorithm, BenfordLearner, that passes the Benford test. This algorithm further satisfies limN→∞,N∈SBenfordLearner(N)=p whenever S is an irreducible pattern with probability p.
In the following algorithm, we define logq to be 1 whenever q≤2.
Let TM(N) be the set of all Turing machines X expressible in at most logN bits such that UTM(X,N) accepts in time at most T(N). The encoding of Turing machines must be prefix-free, which in particular means that no Turing machine is encoded in 0 bits. Let JN denote the set of rational numbers of the form jN with j=0,…,N.
For X and Y Turing machines, let K(X) be the number of bits necessary to encode X. Let S′(X,Y) be the subset of natural numbers i which are accepted by both UTM(X,i) and UTM(Y,i) in time at most T(i). Let QN(X,Y) be the greatest number less than or equal to N such that for every s in the first QN(X,Y) elements of S′, UTM(L,s) halts in time T(N). Let FN(X,Y) be the proportion of the first QN(X,Y) elements of S′ which L accepts. Let BN(X,Y,P)=max(K(X),|FN(X,Y)−P|√QN(X,Y)K(Y)√loglogQN(X,Y)).
Lemma: The output of BenfordLearner(N) is in arg minP∈JNmaxY∈TM(N)minX∈TM(N)BN(X,Y,P).
Proof: The algorithm has three for loops, the outer ranging over j=0,…N and the inner two ranging over Y and X respectively, both restricted to Turing machines expressible in logN bits. The condition on line 8 means that X and Y effectively range over all Turing machines in TM(N), and P=jN ranges over JN.
The inner while loop will increment the variables A or R a total of exactly QN(X,Y) times. Thus, Q is set to QN(X,Y) in line 22. Similarly, F is sent to FN(X,Y) in line 21. Clearly KX and KY are K(X) and K(Y) respectively. Therefore, the expression on lines 23 and 24 is BN(X,Y,P).
Considering the for loops from inner to outer, we minimize this quantity in X, maximize it in Y, and find P of the form j/N minimizing the whole quantity. The P returned is therefore a minimizer of maxY∈TM(N)minX∈TM(N)BN(X,Y,P). □
Our goal is to quickly assign logical probabilities. The code is not optimized to run efficiently. The following proposition is just to ensure that the run time is not far off from T(N).
Proposition: The run time of BenfordLearner(N) is in O(T(N)N4logT(N))).
Simulating UTM on any input for T time steps can be done in time T in time cTlogT for some fixed constant c by simulating U for T steps \cite{Hennie1966}. The bulk of the run time comes from simulating Turing machines on lines 8, 13, 14, and 16. Each of these lines takes at most cT(N)logT(N) time, and we enter each of these lines at most N4 times. Therefore, the program runs in time O(T(N)N4logT(N)).
□
In the next post, we will use the above Lemma and the notion of irreducible patterns to show that this algorithm passes the Benford Test.