So let’s see if I’m understanding you here. You treat a lottery output as a bitstring and ask about SI on it. We can imagine a completely naive agent with no previous observations; what will this ignoramus predict? Well, it seems reasonable that one of the top predictions will be for the initial bitstring to be repeated; this seems OK by Occam’s razor (events often repeating are necessary for induction), and I understand from empirical investigations of simple Turing machines that many (most? all?) terminating programs will repeat their output. It will definitely rank the ‘sequence repeats’ hypotheses above those of possible PRNGs, or of very complex physical theories encompassing atmospheric noise and balls dropping into baskets etc.
So far, so good.
I think I lose you when you go on to talk about inferring that you will always win and stuff like that. The repeating hypotheses aren’t contingent on whom they happen to. If the particular bitstring emitted by the lottery had also included ‘...and this number was picked by Jain Farstrider’, then SI would seem to also predict that this Jain will win the next one as well, by the same repeating logic. It certainly will not predict that the agent will win, and the hypothesis ‘the agent (usually) wins’ will drop in weight.
Remember that my trichotomy was that you need to either 1) invoke anthropics; 2) break Aumann via something like dishonesty/incompetence; or 3) actually have communicable knowledge.
These SI musings don’t seem to invoke anthropics or break Aumannian requirements, and looking at them, they seem communicable. ‘AIXI-MC-MML*, why do you think Jain will win the lottery a second time?’ ‘[translated from minimum-message-length model+message] Well, he won it last time, and since I am ignorant of everything else in the world, it seems reasonable that he will win it again.’ ‘Hmm, that’s a good point.’ And ditto if AIXI-MC-MML happened to be the beneficiary.
* I bring up minimum message length because Patrick Robotham is supposed to be working on a version of AIXI-MC using MML, so one would be able to examine the model of the world(s) a program has devised so far, and one could potentially ask ‘why’ it is making the predictions it is. Having a comprehensible approximation of SI would be pretty convenient for discussing what SI would or would not do.
> It will definitely rank the ‘sequence repeats’ hypotheses above those of possible PRNGs,
It doesn’t need PRNGs. The least confusing description of S.I. is as follows: the probability of a sequence S is the probability that a universal prefix Turing machine with three tapes (an input tape which can only be read from, its head advancing in only one direction; a work tape which can be read from and written to, initialized with zeroes; and an output tape which can only be written to) will output the sequence S when fed a never-ending string of random bits on the input tape.
The machine’s rule set is such that a program can be loaded via the input tape, and the program can then use the rest of the input tape as a source of data. This is important because a program can set up an interpreter emulating another Turing machine, which ensures a constant bound on the difference between code lengths for different machines.
(We predict using conditional probability: given that the machine outputs a sequence matching the previous observations, what is the probability that it will produce specific future observations?)
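To make that definition concrete, here is a minimal Python sketch. It substitutes a deliberately tiny, non-universal toy machine of my own devising for the universal machine (one ‘program’ bit choosing between ‘write zeroes forever’ and ‘relay the input to the output’), and Monte Carlo samples random input bits exactly as in the definition:

```python
import random

def toy_machine(bits, n_out):
    # A deliberately tiny, NON-universal stand-in for the prefix machine
    # described above. The first input bit is the "program":
    #   0 = write zeroes forever,
    #   1 = relay the remaining input bits to the output tape.
    program, data = bits[0], bits[1:]
    return [0] * n_out if program == 0 else data[:n_out]

def prob_outputs_prefix(prefix, trials=200_000):
    # Monte Carlo version of the definition: feed the machine random
    # input bits and count how often the output starts with `prefix`.
    hits = 0
    for _ in range(trials):
        bits = [random.getrandbits(1) for _ in range(len(prefix) + 1)]
        if toy_machine(bits, len(prefix)) == list(prefix):
            hits += 1
    return hits / trials

print(prob_outputs_prefix([0, 0, 0, 0]))  # ≈ 0.5 + 0.5/16 ≈ 0.53
print(prob_outputs_prefix([1, 0, 1, 1]))  # ≈ 0.5/16      ≈ 0.03
```

On this toy machine the all-zeroes prefix comes out roughly seventeen times as probable as an irregular prefix of the same length, for exactly the reason spelled out next.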
So if we are predicting, for example, perfect coin flips, an input string which begins with code setting up the machine to relay subsequent input bits straight to the output does the trick. This code requires the bits on the input tape to match the observations, meaning that for each observed bit, the length of the input string which has to be correct grows by 1 bit.
Meanwhile, a code that sets up the machine to output repeating zeroes does not require any more bits on the input tape to be correct. So when you are getting repeated zeroes, the code relaying random bits is lowered in weight by a factor of 2 with each observed bit, whereas the theory outputting zeroes stays the same (until, of course, you encounter a non-zero and it is eliminated).
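In numbers, the reweighting looks like this; the code lengths are made-up placeholders, since only the per-observation factor of 2 matters:

```python
# Bookkeeping for the two codes above. The code lengths are made-up
# placeholders; only the observation-dependent part matters.
len_zeros_code = 12   # assumed length of the "output zeroes" code, in bits
len_relay_code = 10   # assumed length of the "relay input bits" code, in bits

for n in (0, 5, 10, 20):                    # n = zeroes observed so far
    w_zeros = 2.0 ** -len_zeros_code        # needs no further correct input bits
    w_relay = 2.0 ** -(len_relay_code + n)  # must match n more input bits
    print(n, w_zeros / (w_zeros + w_relay))
# prints 0.2, then ~0.89, ~0.996, ~0.999996: the zeroes code takes over
```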
> I think I lose you when you go on to talk about inferring that you will always win and stuff like that. The repeating hypotheses aren’t contingent on whom they happen to. If the particular bitstring emitted by the lottery had also included ‘...and this number was picked by Jain Farstrider’, then SI would seem to also predict that this Jain will win the next one as well, by the same repeating logic.
You scratched your ticket and you saw a number. Correct codes have to match both the number on the ticket and the number winning the lottery. Some use the same string of input bits to match both; some use different pieces of the input string.
(I am assuming that S.I. cannot precisely predict the lottery. Even assuming a completely deterministic universe, the light from distant stars and incoming cosmic rays, all of that incoming information, ends up mixed into the grand hash of thermal noise and thermal fluctuations.)
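A back-of-the-envelope comparison of those two kinds of codes, leaving out the constant program-length overheads (which do not grow with the observation):

```python
import math

# Input-bit cost of the two kinds of codes for a 1000-digit number,
# ignoring the constant program-length overheads (one decimal digit
# costs about 3.32 bits).
bits_per_number = math.ceil(1000 * math.log2(10))   # 3322 bits

same_bits  = bits_per_number      # one input string feeds both numbers
indep_bits = 2 * bits_per_number  # ticket and winning number matched separately

print(same_bits, indep_bits)      # 3322 6644
# Once ticket and winning number are observed to coincide, the
# same-bits codes outweigh the independent-bits codes by ~2**3322,
# i.e. roughly 10**1000.
```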
Edit: to make it clearer. Suppose that the lottery has 1000 decimal digits; you scratch one ticket; then later the winning number is announced, and it matches your ticket. You will conclude that the lottery was rigged, with very good confidence, won’t you? In the absence of some rather curious anthropic reasoning, the existence or non-existence of 10^1000 − 1 other tickets, or of other conscious players, is entirely irrelevant (and in the presence of anthropics you have to figure out which ancestors of H. sapiens will change your answer and which won’t). With regard to Aumann’s agreement theorem, other people would agree that if they were in your shoes (sharing your data and priors) they’d arrive at the same conclusions, so it is not violated at all.
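The arithmetic behind that confidence, as a log-space sketch; the prior odds for ‘rigged’ are my assumption, and any remotely sane value is swamped by the likelihood ratio:

```python
# Log-space Bayesian bookkeeping for the rigged-lottery conclusion.
# The prior odds for "rigged" are an assumption, picked to be tiny.
digits = 1000
log10_p_match_if_fair   = -digits  # winning number uniform over 10**1000
log10_p_match_if_rigged = 0        # rigged to match your ticket: P ≈ 1
log10_prior_odds_rigged = -20      # assumed prior odds of 10**-20

log10_posterior_odds = (log10_prior_odds_rigged
                        + log10_p_match_if_rigged
                        - log10_p_match_if_fair)
print(log10_posterior_odds)  # 980: posterior odds ~10**980 favor "rigged"
```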
For more information, see the papers referenced at http://www.scholarpedia.org/article/Algorithmic_probability