It is frequently difficult to describe the current state, including the entire optimizer, in complete detail. Often the behavior of systems that include optimizers is chaotic, in that arbitrarily small changes (especially to the optimizer itself) result in large changes in the output. In such cases, a less precise description of the process is more useful, and an optimization-based description may be able to constrain the output state much more accurately than the “laws of physics” applied over a broad range of input states and optimizer states.
It takes a very, very long message to describe the state of every quark in the human brain.
There are still short predictions, like “matter, energy, and momentum will be conserved, and entropy will not decrease,” that are (vacuously) true and accurate. It may take a much longer message to describe the salient properties of the final state of an optimizing process, and that message will be less precise than the former prediction. For simple optimizers, like a pump, the message could just describe the overall changes in gas volume, pressure, and temperature. For an optimizer that tiles the universe with highly complex information, it may be incredibly difficult to compose a message describing that, especially if the information is encoded very efficiently. That’s why I was thinking about comparing the correctness of the message to its length; a minimal-length terabit message that is only 90% accurate (or even 10%) is probably way more informative than a kilobit message that’s 100% correct.
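One rough way to put numbers on that comparison, as a sketch rather than a settled definition: assume "X% accurate" means each bit of the message is independently correct with probability X, so the message behaves like the output of a binary symmetric channel. Under that assumption (which is mine, along with the function names below), each bit carries 1 - H(p) bits of information about the true state, where H is the binary entropy.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def informative_bits(length_bits, accuracy):
    """Information carried by a message under the ASSUMED per-bit error model.

    Each bit is taken to be independently correct with probability `accuracy`,
    so it contributes 1 - H(accuracy) bits about the true state.
    """
    return length_bits * (1.0 - binary_entropy(accuracy))

KILOBIT = 1e3
TERABIT = 1e12

short_exact = informative_bits(KILOBIT, 1.0)   # short, fully correct message
long_noisy = informative_bits(TERABIT, 0.9)    # long message, 90% accurate
long_inverted = informative_bits(TERABIT, 0.1) # long message, 10% accurate

print(f"kilobit at 100%: {short_exact:.3g} bits")
print(f"terabit at 90%:  {long_noisy:.3g} bits")
print(f"terabit at 10%:  {long_inverted:.3g} bits")
```

In this toy model the long message wins by many orders of magnitude, and 10% accuracy is exactly as informative as 90%, because a reader who knows the message is mostly wrong can simply flip every bit. The case that carries nothing is 50%. A real description of an optimizer's output would not have independent bit errors, so this only illustrates the shape of the tradeoff.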