Perhaps we could say that Gears-like models have low entropy? (Relative to the amount of territory covered.)
You can communicate the model in a small number of bits. That’s why you can re-derive a missing part (your test #3)--you only need a few key pieces to logically imply the rest.
This also implies you don’t have many degrees of freedom; [you can’t just change one detail without affecting others](https://www.lesswrong.com/posts/XTWkjCJScy2GFAgDt/dark-side-epistemology). This makes it (more likely to be) incoherent to imagine one variable being different while everything else is the same (your test #2).
Because the model itself is compact, you can also specify the current state of the system in a relatively small number of bits, inferring the remaining variables from the structure (your test #1). (Although the power here is really coming from the "...relative to the amount of territory covered" qualifier. That qualifier seems critical: it rewards a single model that explains many things over a swarm of tiny models that collectively explain the same set of things while being individually lower-entropy but collectively higher-entropy.)
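To make that qualifier concrete, here's a toy description-length comparison in the spirit of minimum description length (the specific bit counts are invented purely for illustration): suppose one gears model costs 50 bits to state and each of 10 phenomena then takes only 2 more bits to pin down, versus 10 separate mini-models at 15 bits each plus the same 2 bits per phenomenon.

$$\underbrace{50 + 10 \times 2}_{\text{one gears model}} = 70 \text{ bits} \qquad \text{vs.} \qquad \underbrace{10 \times 15 + 10 \times 2}_{\text{swarm of mini-models}} = 170 \text{ bits}$$

Each mini-model is individually cheaper (15 bits < 50 bits), but per unit of territory covered the single model wins.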
This line of thinking also reminds me of Occam’s Razor/Solomonoff Induction.
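(For reference, the Solomonoff prior makes that connection formal: the weight assigned to an observation string $x$ is roughly

$$m(x) = \sum_{p\,:\,U(p)=x} 2^{-\ell(p)},$$

where $U$ is a universal prefix machine and $\ell(p)$ is the length of program $p$ in bits, so hypotheses that can be communicated in fewer bits get exponentially more prior weight.)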