If you want to prove things about fixed points of HCH in an iterated function setting, consider it a function from policies to policies. Let M be the set of messages (say ascii strings < 10kb.) Given a giant look up table T that maps M to M, we can create another giant look up table. For each m in M , give a human in a box the string m, and unlimited query access to T. Record their output.
The fixed points of this are the same as the fixed points of HCH. “Human with query access to” is a function on the space of policies.
Sure, but the interesting thing to me isn’t fixed points in the input/output map, it’s properties (i.e. attractors that are allowed to be large sets) that propagate from the answers seen by a human in response to their queries, into their output.
Even if there’s a fixed point, you have to further prove that this fixed point is consistent—that it’s actually the answer to some askable question. I feel like this is sort of analogous to Hofstadter’s q-sequence.
In the giant lookup table space, HCH must converge to a cycle, although that convergence can be really slow. I think you have convergence to a stationary distribution if each layer is trained on a random mix of several previous layers. Of course, you can still have occilations in what is said within a policy fixed point.
If you want to prove things about fixed points of HCH in an iterated function setting, consider it a function from policies to policies. Let M be the set of messages (say ascii strings < 10kb.) Given a giant look up table T that maps M to M, we can create another giant look up table. For each m in M , give a human in a box the string m, and unlimited query access to T. Record their output.
The fixed points of this are the same as the fixed points of HCH. “Human with query access to” is a function on the space of policies.
Sure, but the interesting thing to me isn’t fixed points in the input/output map, it’s properties (i.e. attractors that are allowed to be large sets) that propagate from the answers seen by a human in response to their queries, into their output.
Even if there’s a fixed point, you have to further prove that this fixed point is consistent—that it’s actually the answer to some askable question. I feel like this is sort of analogous to Hofstadter’s q-sequence.
In the giant lookup table space, HCH must converge to a cycle, although that convergence can be really slow. I think you have convergence to a stationary distribution if each layer is trained on a random mix of several previous layers. Of course, you can still have occilations in what is said within a policy fixed point.