Well, when it tries to guess the next bit it gets 50% of its guesses right, which is as good as anything else can do.
Would it be legitimate to ask the SI to estimate the probability that its guess is correct? I suppose that if it sums up its programs’ estimates for the next bit and finds itself predicting a 50% chance either way, then it at least understands that it is dealing with random data, and is merely being very persistent in looking for a pattern just in case the data only seemed random? That’s not as bad as I thought at first.
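One way to make that concrete (a sketch of my own, not anything formal): if p is the mixture’s probability that the next bit is 1, then the predictor’s own confidence in its best guess is just max(p, 1 − p), so at p = 0.5 it is explicitly reporting that it is at chance:

```python
# Sketch: if p is the mixture's probability that the next bit is 1, the
# predictor's own estimate that its best guess will be correct is max(p, 1-p).

def guess_confidence(p_one: float) -> float:
    """Probability that the argmax guess on the next bit is right."""
    return max(p_one, 1.0 - p_one)

print(guess_confidence(0.9))  # 0.9: a confident prediction
print(guess_confidence(0.5))  # 0.5: it reports that it is at chance
```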
But, given 1 terabyte of data, will it not generate a ~1 terabyte program as its hypothesis? Even if it is as accurate as the best answer, this seems like a flaw.
Remember that SI works by accounting for the infinite multitude of hypotheses that can generate the given string. Given an algorithmically random TB of data, SI will certainly give a ~1 TB hypothesis high probability, but it also takes into consideration all the larger hypotheses, with exponentially lower probabilities.
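To make the weighting concrete, here is a toy sketch of my own (glossing over the prefix-free coding that real SI uses): each hypothesis of length L bits gets prior weight 2^-L, so every extra byte costs a factor of 256. Log-weights keep terabyte-scale numbers finite:

```python
# Toy sketch of the Solomonoff prior over program lengths (illustration only):
# a program of length L bits gets prior weight 2^-L.

TB_BITS = 8 * 10**12  # one terabyte, in bits

def log2_prior(length_bits: int) -> int:
    """log2 of the 2^-L prior weight of a length-L program."""
    return -length_bits

shortest = log2_prior(TB_BITS)             # the ~1 TB hypothesis
one_byte_longer = log2_prior(TB_BITS + 8)  # a competitor one byte longer
print(shortest - one_byte_longer)          # 8: each extra byte costs a factor of 2^8 = 256
```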
OK, so it will assign different likelihoods to multiple different ~1 terabyte programs. I’d still rather it predict random{0,1}, at under 10 bytes, as the most probable hypothesis. Inability to recognize noise as noise seems like a fundamental problem.
random{0,1} is not an algorithm, so...
Explained here.
There is an SI that works over programs (which is what I was referring to originally), and there is an SI that works over computable distributions (which will indeed assign random{0,1} high probability).
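A back-of-the-envelope comparison of the two variants on an N-bit algorithmically random string x (the overhead constants below are made-up placeholders; only their difference matters): the shortest deterministic program is roughly “print x” at N + c bits, while a ~10-byte fair-coin model has a short description but assigns x likelihood 2^-N, so the two end up within a constant factor of each other:

```python
# Back-of-the-envelope comparison of the two SI variants on an N-bit
# algorithmically random string x. The overhead constants are assumptions
# invented for illustration; only their difference matters here.

N = 8 * 10**12   # bits in the observed string (~1 TB)
C_PRINT = 100    # assumed overhead, in bits, of a literal "print x" program
C_COIN = 80      # assumed length, in bits, of a fair-coin model (~10 bytes)

# Deterministic SI: prior 2^-(N + C_PRINT), likelihood 1 on x.
log2_literal = -(N + C_PRINT)

# Stochastic SI: prior 2^-C_COIN, likelihood 2^-N on x.
log2_coin = -C_COIN - N

# Both weights are ~2^-N; the coin model wins by the constant-factor gap.
print(log2_coin - log2_literal)  # 20, i.e. a factor of 2^20 in favor of the coin
```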
No, it will always make a prediction according to the infinitely many programs which are consistent with the observed string. If the observed string is 1 terabyte of uniform random noise, the shortest of these programs will most likely be ~1 terabyte long, but Solomonoff induction also considers the longer ones.
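As a toy sketch of what “prediction according to infinitely many programs” means (the hypotheses and lengths below are invented for illustration; the real sum is over all consistent programs, not five): each consistent program votes on the next bit with weight 2^-length:

```python
# Invented toy example: five programs consistent with the data so far,
# each predicting a next bit, weighted by 2^-length.

hypotheses = [
    # (predicted next bit, program length in bits)
    (1, 40),
    (0, 41),
    (1, 42),
    (0, 42),
    (1, 45),
]

def p_next_is_one(hyps):
    """Weight-normalized vote of the consistent programs for bit = 1."""
    total = sum(2.0 ** -length for _, length in hyps)
    ones = sum(2.0 ** -length for bit, length in hyps if bit == 1)
    return ones / total

print(p_next_is_one(hypotheses))  # ~0.63 for these made-up hypotheses
```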
I think it’s instructive to compare with the red card/blue card experiment (where the deck is ~70% blue and the human subjects tend to probability-match rather than maximize). Like the human subjects, Solomonoff induction will entertain very complicated hypotheses, but this is fine because it doesn’t select its next card based on just one of them. A decision rule like AIXI will settle down on ~70% probability of blue, and then pick blue every time.
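For reference, the arithmetic behind “pick blue every time”, using the 70% blue deck: probability matching gets 0.7² + 0.3² = 0.58 of its guesses right, while always guessing blue gets 0.70:

```python
# The standard arithmetic for the card experiment, with P(blue) = 0.7.
p_blue = 0.7

# Probability matching: guess blue only 70% of the time.
matching = p_blue * p_blue + (1 - p_blue) * (1 - p_blue)

# Maximizing (what AIXI does once its estimate settles): guess blue always.
maximizing = p_blue

print(matching)    # 0.58
print(maximizing)  # 0.7
```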