I think that re-reading your answer made something click, so thanks for that.
The observed data is not **random**, because randomness is not a property of the data itself.
The hypotheses that we want to evaluate are not random either, because we are analysing Turing machines that generate that data deterministically.
If the data is HTHTHT, we do not test a Python script that does:
random.choices(["H", "T"], k=6)
What we test instead is something more like
["H"] + ["T"] + ["H"] + ["T"] + ["H"] + ["T"]
And
["HT"] * 3
In this case, this last script is simpler (shorter) and, for that reason, will receive a higher prior.
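As a quick sanity check (my own addition, purely illustrative): once joined into a string, both snippets produce the same output, so only the lengths of the programs differ:

```python
a = "".join(["H"] + ["T"] + ["H"] + ["T"] + ["H"] + ["T"])
b = "".join(["HT"] * 3)
assert a == b == "HTHTHT"  # same output, different program lengths
```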
If we apply this in a Bayesian setting, the likelihood of all these hypotheses is necessarily 1, so the posterior probability just becomes the prior (divided by a normalising factor), which for a program of a given length is 2^(-length), so shorter programs get more weight. This makes sense because it is in agreement with Occam's razor.
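A toy sketch of that update (my own illustration, measuring program length in characters as a crude stand-in for the true Solomonoff prior):

```python
# Both programs reproduce HTHTHT exactly, so likelihood = 1 for each;
# the posterior is then just the renormalised prior.
programs = {
    "a": '["H"] + ["T"] + ["H"] + ["T"] + ["H"] + ["T"]',
    "b": '["HT"] * 3',
}

priors = {name: 2 ** -len(src) for name, src in programs.items()}  # 2^(-length)
likelihood = 1.0  # deterministic hypotheses consistent with the data

unnormalised = {name: likelihood * p for name, p in priors.items()}
z = sum(unnormalised.values())
posteriors = {name: u / z for name, u in unnormalised.items()}

print(posteriors)  # "b" dominates because it is the shorter program
```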
The thing I still struggle to see is how to connect this framework with the probabilistic hypotheses that I want to test, such as "the data was generated by a fair coin". One possibility that I see (but I am not sure it is the correct thing) is testing all the possible strings generated by an algorithm like this:
import random

i = 0
while True:
    random.seed(i)
    print(random.choices(["H", "T"], k=6))
    i += 1  # move on to the next seed
The likelihood of the observed data under seeds that generate strings like HHTHTH is 0, so we remove those, and we are then left only with the algorithms that are consistent with the data.
I am not totally sure of this last part, though.
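To make that filtering idea concrete, here is a minimal sketch (the bounded seed search instead of `while True` is just to keep it runnable):

```python
import random

DATA = "HTHTHT"

# Keep only the seeded "coin programs" whose output matches the data;
# every other seed gives the observed data likelihood 0 and is discarded.
consistent_seeds = []
for i in range(10_000):
    random.seed(i)
    outcome = "".join(random.choices(["H", "T"], k=6))
    if outcome == DATA:
        consistent_seeds.append(i)

print(f"{len(consistent_seeds)} of 10,000 seeds reproduce {DATA}")
# A fair coin makes all 2**6 = 64 strings equally likely, so roughly
# 1/64 of the seeds should survive the filter; that fraction is exactly
# the fair-coin likelihood of the observed sequence.
```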