“This process prefers to exactly follow the laws of physics, therefore future events and observations will turn out exactly as a natural physical system would evolve” seems to be a minimal-message-length description for predicting the behavior of any process, unless that process necessarily restricts its final output to a domain so tiny that it is shorter to describe than the initial state. For any description of a process’s current measurable state, together with a prediction of future state likelihoods, it seems it will always be simpler to describe the current state and then predict that it will exactly follow the natural laws.
Maybe it works if the likelihood of a given prediction is compared with its length? It’s easy to be trivially correct, but not so easy to make complex predictions that are also right. Take the probability that a random message of that length is correct and compare it with P(E|M), the probability that the event E described by message M occurs. If P(E|M) is higher than the probability of a random message of M’s length being correct, then M is a good description, but I am not convinced it’s a good description solely because of the properties of E. It still seems like M = “laws of physics” will beat other descriptions of optimizing processes. I am probably missing something.
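Roughly the comparison I have in mind, as a sketch (the 2^-|M| baseline for a random message and the function name are just my own illustrative assumptions, not a standard measure):

```python
import math

def bits_over_chance(p_event_given_message: float, message_length_bits: float) -> float:
    """How many bits better M does than a random guess of the same length.

    p_event_given_message: P(E|M), the probability that the event E described by M occurs.
    message_length_bits:   |M|, the message length in bits.

    Assumed baseline: a uniformly random |M|-bit message is correct with
    probability about 2**(-|M|), so the score is
        log2( P(E|M) / 2**(-|M|) ) = log2 P(E|M) + |M|.
    Positive scores mean M constrains the outcome more than chance would.
    """
    return math.log2(p_event_given_message) + message_length_bits

# A guaranteed 1-kilobit prediction beats chance by about 1000 bits:
print(bits_over_chance(1.0, 1_000))  # 1000.0
```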
It is frequently difficult to describe the current state, including the entire optimizer, in complete detail. Often the behavior of systems that include optimizers will be chaotic, in that arbitrarily small changes (especially to the optimizer itself) result in large changes in the output. In such cases a less precise description of the process is more useful, and an optimization-based description may be able to constrain the output state much more tightly than “laws of physics” applied over a broad range of input states and optimizer states.
It takes a very, very long message to describe the state of every quark in the human brain.
There are still short predictions like “matter and energy and momentum will be conserved, and entropy will not decrease” that are (vacuously) true and accurate. It may take a much longer message to describe the salient properties of the final state of an optimizing process, and it will be less precise than the former prediction. For simple optimizers, like a pump, the message could just describe the overall changes in gas volume, pressure, and temperature. For an optimizer that tiles the universe with highly complex information it may be incredibly difficult to compose a message describing that, especially if the information is encoded very efficiently. That’s why I was thinking about comparing the correctness of the message to its length; a minimal-length terabit message that is only 90% accurate (or even 10%) is probably way more informative than a kilobit message that’s 100% correct.
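Under that same made-up scoring rule (length in bits plus the log-probability of being right), the arithmetic comes out the way I’d expect:

```python
import math

# Hypothetical comparison: informativeness ~ |M| + log2 P(correct).
kilobit_exact  = 1e3  + math.log2(1.0)  # 100% correct, 1 kilobit  -> ~1e3 bits
terabit_ninety = 1e12 + math.log2(0.9)  # 90% correct,  1 terabit  -> ~1e12 bits
terabit_ten    = 1e12 + math.log2(0.1)  # 10% correct,  1 terabit  -> still ~1e12 bits

print(kilobit_exact, terabit_ninety, terabit_ten)
```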