Let’s set aside daemons for a moment, and think about a process which does “try to” make accurate predictions, but also “tries to” perform the relevant calculations as efficiently as possible. If it’s successful in this regard, it will generate small (but probably not minimal) prediction circuits. Let’s call this an efficient-predictor process. The same intuitive argument used for daemons also applies to this new process: it seems like we can get a smaller circuit which makes the same predictions, by removing the optimizy parts.
This feels like a more natural setting for the problem than daemons, but it also feels like any useful result could carry back over to the daemon case.
The next step along this path: the efficient-predictor process is presumably quite general; it should be able to predict efficiently in many different environments. The “optimizy parts” are basically the parts needed for that generality. Over time, the object-level prediction circuit will hopefully stabilize (as the process adapts to its environment), so the optimizy parts mostly stop modifying the object-level parts. That’s something we could check for: after some warm-up time, we expect some chunk of the circuit (the optimizy part) to be mostly independent of the outputs, so we can get rid of that chunk of the circuit.
That seems very close to formalizable.
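Here's a toy sketch of what that check might look like (this is my own construction, not a worked-out proposal: the Gate/Circuit classes, the `prunable_gates` test, and the example circuit at the end are all hypothetical). Represent the circuit as a fixed DAG of boolean gates, run it on the inputs seen after warm-up, and flag any gate that can be pinned to a constant without changing a single output; those gates are the "independent of the outputs" chunk we'd want to discard.

```python
# Toy sketch: find gates whose values no longer influence the outputs on the
# inputs observed after warm-up, i.e. the candidate "optimizy" chunk to prune.
# Everything here is illustrative, not a real library.

from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Optional


@dataclass
class Gate:
    name: str
    inputs: List[str]            # names of circuit inputs or earlier gates
    fn: Callable[..., int]       # boolean function this gate computes


class Circuit:
    def __init__(self, n_inputs: int, gates: List[Gate], outputs: List[str]):
        self.n_inputs = n_inputs
        self.gates = gates       # assumed to be in topological order
        self.outputs = outputs

    def run(self, x: List[int], ablate: Optional[Dict[str, int]] = None) -> List[int]:
        """Evaluate the circuit; `ablate` pins the named gates to fixed values."""
        vals: Dict[str, int] = {f"x{i}": b for i, b in enumerate(x)}
        for g in self.gates:
            if ablate and g.name in ablate:
                vals[g.name] = ablate[g.name]
            else:
                vals[g.name] = g.fn(*(vals[name] for name in g.inputs))
        return [vals[name] for name in self.outputs]


def prunable_gates(circuit: Circuit, inputs_seen: List[List[int]]) -> List[str]:
    """Gates that can be pinned to some constant without changing any output
    on the inputs observed after warm-up."""
    baseline = [circuit.run(x) for x in inputs_seen]
    prunable = []
    for g in circuit.gates:
        if g.name in circuit.outputs:
            continue
        # Prunable if some constant value for this gate reproduces every
        # output we've seen since warm-up.
        if any(all(circuit.run(x, ablate={g.name: v}) == y
                   for x, y in zip(inputs_seen, baseline))
               for v in (0, 1)):
            prunable.append(g.name)
    return prunable


if __name__ == "__main__":
    # Toy circuit: the output is just AND(x0, x1); gate "opt" stands in for the
    # optimizy part -- it computes something, but the output ignores it.
    gates = [
        Gate("opt", ["x0", "x1"], lambda a, b: a ^ b),
        Gate("out", ["x0", "x1"], lambda a, b: a & b),
    ]
    c = Circuit(n_inputs=2, gates=gates, outputs=["out"])
    warmup_inputs = [list(bits) for bits in product([0, 1], repeat=2)]
    print(prunable_gates(c, warmup_inputs))   # -> ['opt']
```

Two obvious gaps between this sketch and the real problem: it tests gates one at a time rather than checking that the whole chunk can be removed jointly, and it treats the circuit as a fixed object, whereas the interesting case is a process whose optimizy parts are still (occasionally) rewriting the object-level parts.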
From this standpoint, the key property of daemons (or any other goal-driven process) is that it’s adaptive: it will pursue the goal with some success across multiple possible environments. Intuitively, we expect that adaptivity to come with a complexity cost, e.g. in terms of circuit size.
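One very loose way to write down that intuition (my own notation, not anything from the argument above): if a circuit $C$ achieves loss at most $\varepsilon$ in every environment $E$ in some family $\mathcal{E}$, and $C_E$ is a smallest circuit achieving loss at most $\varepsilon$ in the single environment $E$, then

$$|C| \;\ge\; \max_{E \in \mathcal{E}} |C_E| \,+\, \Delta,$$

where the max term is immediate from the definitions; the conjectural part is that the “generality overhead” $\Delta$ is strictly positive, i.e. that adaptivity across $\mathcal{E}$ costs circuit size beyond what any single environment demands.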