Finally, consider the question of whether you can assign 100% certainty to a mathematical theorem for which a proof exists.
To ground this issue in more concrete terms, imagine you are writing an algorithm to compress images made up of 8-bit pixels. The algorithm plows through several rows until it comes to a pixel, and predicts that the distribution of that pixel's value is Gaussian with mean 128 and variance 0.1. The model probability that the real value of the pixel is 255 is then some astronomically small number—but the system must reserve some probability (and thus codespace) for that outcome. If it does not, it violates the general contract that a lossless compression algorithm should assign a code to every input, even though some inputs will end up being inflated. In other words, it risks breaking.
On the other hand, it is completely reasonable for it to assign zero probability to the outcome that the pixel value is 300. That all pixel values fall between 0 and 255 is a deductive consequence of the problem definition.
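One way to honor both requirements is to discretize the Gaussian over the 256 legal values and mix in a small uniform floor, so every legal value keeps nonzero codespace while values outside 0–255 are simply absent from the table. The sketch below illustrates this; the function names, the floor value, and the half-integer binning are illustrative choices, not part of any particular codec.

```python
import math

def gaussian_cdf(x, mu, sigma):
    # Gaussian CDF expressed via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def pixel_probabilities(mu=128.0, var=0.1, floor=1e-6):
    """Per-value probabilities for an 8-bit pixel under a Gaussian model.

    Each value v in 0..255 gets the Gaussian mass on [v - 0.5, v + 0.5];
    a small uniform floor is then mixed in so no legal value is assigned
    zero probability (and hence zero codespace). Values outside 0..255
    are not in the table at all: they get exactly zero probability.
    """
    sigma = math.sqrt(var)
    raw = [gaussian_cdf(v + 0.5, mu, sigma) - gaussian_cdf(v - 0.5, mu, sigma)
           for v in range(256)]
    # Mix with the floor and renormalize so the table still sums to 1.
    mixed = [(1.0 - 256 * floor) * p + floor for p in raw]
    total = sum(mixed)
    return [p / total for p in mixed]
```

Without the floor, the raw Gaussian mass at 255 underflows to zero in double precision, which is exactly the failure mode described above: an arithmetic coder driven by that table could not encode a pixel of 255 at all.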