Have you read Occam’s Razor?
I just reread it; thank you for allowing me to see one of Eliezer’s posts in a new light. Always a pleasure.
However, I have other data at hand that seems to lend credence to the “God exists” theory; I don’t have to rely on the results of one test. If I did, then by that same logic, we would always have to assume that a coin once flipped would be 100% biased toward the side upon which it landed.
Your program, in order to describe the universe, has to be the best model of every single point in the universe. I’m sure there were people who argued that Newton’s equations were simpler than General Relativity. But the data cannot be denied.
I think there are two distinct concepts here: One of them is Bayesian reasoning, and the other is Solomonoff induction (which is basically Occam’s Razor taken to its logical extreme).
Bayesian reasoning is applicable when you have some prior beliefs, usually formalized as probabilities for various theories being true (e.g. 50% chance God did it, 50% amino acids did it), and then you encounter some evidence (e.g. observe angels descend from the sky), and you now want to update your beliefs to be consistent with the evidence you encountered (e.g. 90% chance God did it, 10% amino acids did it). To emphasize, Bayesian reasoning is simply not applicable unless you have some prior belief to update.
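To make the 50/50 to 90/10 update concrete, here is a minimal sketch of Bayes' rule over two theories. The likelihood values (how probable "angels descend" is under each theory) are made-up numbers, chosen only so that the result matches the 90%/10% example above; the function name `bayes_update` is likewise just illustrative.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(theory | evidence) for each theory.

    prior:      dict mapping theory name -> P(theory)
    likelihood: dict mapping theory name -> P(evidence | theory)
    """
    # Bayes' rule: posterior is proportional to prior * likelihood.
    unnormalized = {t: prior[t] * likelihood[t] for t in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every theory")
    return {t: w / total for t, w in unnormalized.items()}


prior = {"God did it": 0.5, "amino acids did it": 0.5}

# Hypothetical likelihoods of observing angels descending from the sky.
# These are assumptions for illustration, not measured quantities.
likelihood = {"God did it": 0.9, "amino acids did it": 0.1}

posterior = bayes_update(prior, likelihood)
for theory, p in posterior.items():
    print(f"{theory}: {p:.0%}")
```

Note that the function cannot run without `prior`: that is the sense in which Bayesian reasoning needs a belief to start from.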
You wrote: “However, I have other data at hand that seems to lend credence to the ‘God exists’ theory.” That sounds like you’re referring to Bayesian reasoning. You’re saying that without that “other data”, you have some probabilities for your various theories, but when you add in that data, you’re inclined to update your probabilities such that “God did it” becomes more probable.
In contrast, Occam’s Razor and Solomonoff induction do not work with “prior beliefs” (in fact, Solomonoff induction is often used, in theory, to bootstrap the Bayesian process, providing the “initial belief” from which you can start updating with Bayesian reasoning). When using Solomonoff induction, you enumerate all conceivable theories, and then for each theory, you check whether it is compatible with the data you currently have. You don’t think in terms of “this theory is more probable given data set 1, but that theory is more probable given data set 2”. You simply mark each theory as “compatible” or “not compatible”. Once you’ve done that, you eliminate all theories which are “not compatible” (or equivalently, assign them a probability of 0). Now all that remains is to assign probabilities to the theories that remain (i.e. the ones which are compatible with the data you have). One naive way to do that is to just assign uniform probability to all remaining theories. Solomonoff induction instead says that you should assign probabilities based on the complexity of each theory, with simpler theories receiving higher probability.
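The "filter, then weight by complexity" procedure can be sketched as follows. This is only a toy: real Solomonoff induction ranges over all programs for a universal Turing machine and is uncomputable, whereas here the theories, their complexity values in bits, and the function name are all invented for illustration. The weight 2^(-bits) is the usual way of turning description length into a probability.

```python
def complexity_weighted_posterior(theories, data):
    """Keep only theories compatible with the data, then weight by simplicity.

    theories: list of (name, complexity_in_bits, predict), where predict(i)
              returns the value the theory expects at position i.
    data:     list of observed values.
    """
    # Step 1: mark each theory "compatible" or "not compatible".
    compatible = [
        (name, bits)
        for name, bits, predict in theories
        if all(predict(i) == x for i, x in enumerate(data))
    ]
    if not compatible:
        raise ValueError("no theory is compatible with the data")

    # Step 2: simpler theories (fewer bits) get more weight: 2 ** -bits.
    weights = {name: 2.0 ** -bits for name, bits in compatible}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Toy theories about a binary sequence; the bit counts are made up.
theories = [
    ("all ones", 2, lambda i: 1),
    ("alternating 0, 1, 0, 1, ...", 3, lambda i: i % 2),
    ("ones, except a zero at position 5", 8, lambda i: 0 if i == 5 else 1),
]

observed = [1, 1, 1]
result = complexity_weighted_posterior(theories, observed)
for name, p in result.items():
    print(f"{name}: {p:.3f}")
```

The "alternating" theory is eliminated outright, and the two survivors both fit the data equally well; they differ only in how much probability their complexity earns them.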
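As for the claim that “we would always have to assume that a coin once flipped would be 100% biased toward the side upon which it landed”: that’s actually not true. Coincidentally, I wrote a web app which illustrates a similar point: http://nebupookins.github.io/binary-bayesian-update/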
Mentally relabel the button “Witnessed Failure” with “Saw a coin come up tails” and “Witnessed Success” with “Saw a coin come up heads”, then click the “Witnessed Success”/“Saw a coin come up heads” button.
Note that the result is not “You should assume that a coin is 100% biased towards heads.”
Instead, the result is “There’s a 0% chance that the coin is 100% biased towards tails, a tiny chance that the coin is 99% biased towards tails, a slightly larger chance that the coin is 98% biased towards tails” and so on until you reach “about a 2% chance the coin is 100% biased towards heads”, which is currently your most probable theory. But note that while “100% biased towards heads” is your most probable theory, you have very little confidence in that theory (only a 2% chance that the theory is true). You need to witness a lot more coin flips to increase your confidence (go ahead and click on the buttons a few more times).
Disclaimer: This web app actually uses the naive solution of initially assigning uniform probability to all possible theories, rather than the Solomonoff solution of assigning probability according to complexity.
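The coin-flip numbers above can be reproduced with a short sketch. This is not the web app's actual code; it assumes 101 candidate theories ("the coin lands heads 0% of the time", "1%", ..., "100%") with a uniform starting probability, which is the naive approach described in the disclaimer. The names `update`, `biases`, and `beliefs` are illustrative.

```python
def update(beliefs, biases, saw_heads):
    """One Bayesian update of the belief in each bias after a single flip."""
    # Likelihood of the observed flip under each candidate bias.
    weighted = [
        p * (b if saw_heads else 1 - b)
        for p, b in zip(beliefs, biases)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


# Candidate theories: P(heads) = 0.00, 0.01, ..., 1.00 (an assumed grid).
biases = [i / 100 for i in range(101)]

# Naive uniform starting point: every theory equally probable.
beliefs = [1 / 101] * 101

# Witness a single flip that comes up heads.
beliefs = update(beliefs, biases, saw_heads=True)

print(f"P(100% biased towards tails) = {beliefs[0]:.2%}")
print(f"P(100% biased towards heads) = {beliefs[100]:.2%}")
```

After one heads, "100% biased towards tails" drops to exactly 0 and "100% biased towards heads" becomes the single most probable theory, yet it holds only about 2% of the total probability. Calling `update` again for each further flip shifts the probability mass toward whichever biases best match the observed flips.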