I think your description here is basically right, except for the very last sentence. If we just had one meta-level, then yeah, it would end up basically the same as explore/exploit but at a more abstract level. But by recursively applying the trick, it ends up working well on a much wider class of problems than explore/exploit by itself.
The really neat thing is that recursively applying this particular trick does not require moving up an infinite ladder of meta-levels. There are only two levels, and moving “up” alternates between them.
To see why, consider what it looks like on a math problem where we’re trying to prove some theorem:
A “constraint” would be something like a conditional counterexample, i.e. “for any proof which uses strategy X, here’s a counterexample.”
When searching for such counterexamples, our “constraints” are partial proofs, i.e. if we have a proof assuming X, then any counterexample must leverage/assume the falsehood of X.
So when we use constraints to guide our search for proofs, we look for counterexamples. When we use constraints to guide our search for counterexamples, we look for proofs.
More visually, consider solving a maze. We can find constraints like this:
When we’re searching for that constraint, what are the meta-constraints to that search? Well, they’re paths. A constraint is a path of connected walls; a meta-constraint is a path of connected spaces.
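To make the walls-vs-spaces picture concrete, here is a rough sketch on a toy grid maze. Everything in it is my own simplification, not part of the original setup: the goal is "get from the left edge to the right edge", open cells connect orthogonally, walls also connect diagonally, and names like `open_path` and `wall_barrier` are made up for the example. Under those assumptions, a chain of walls running top to bottom is exactly the kind of constraint that rules out every path, and the two searches are the same routine run on opposite cell types.

```python
from collections import deque

OPEN, WALL = ".", "#"

# Open cells connect through the 4 orthogonal steps.
ORTHOGONAL = [(-1, 0), (1, 0), (0, -1), (0, 1)]
# Walls also block when they only touch at a corner, so they get 8 neighbors.
DIAGONAL = [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def crossing(grid, kind, starts, is_goal, steps):
    """Breadth-first search over cells of `kind`; return a path or None."""
    rows, cols = len(grid), len(grid[0])
    parent = {s: None for s in starts if grid[s[0]][s[1]] == kind}
    queue = deque(parent)
    while queue:
        cell = queue.popleft()
        if is_goal(cell):
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for dr, dc in steps:
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and nxt not in parent and grid[nxt[0]][nxt[1]] == kind):
                parent[nxt] = cell
                queue.append(nxt)
    return None


def open_path(grid):
    """Path of connected spaces from the left edge to the right edge."""
    rows, cols = len(grid), len(grid[0])
    return crossing(grid, OPEN, [(r, 0) for r in range(rows)],
                    lambda cell: cell[1] == cols - 1, ORTHOGONAL)


def wall_barrier(grid):
    """Path of connected walls from the top edge to the bottom edge."""
    rows, cols = len(grid), len(grid[0])
    return crossing(grid, WALL, [(0, c) for c in range(cols)],
                    lambda cell: cell[0] == rows - 1, ORTHOGONAL + DIAGONAL)


solvable = ["..#",
            "#..",
            "#.#"]
blocked = [".#.",
           "..#",
           ".#."]
```

On `solvable`, `open_path` finds a route and `wall_barrier` returns `None`; on `blocked` it is the other way around. Any partial chain of walls tells the path search where not to go, and any partial chain of spaces tells the barrier search where not to go.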
In general, we’re switching between solving an optimization problem and solving the dual problem.
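As one standard illustration (my choice of example, not something the argument depends on), linear programming shows the same two-level structure. The symbols A, b, c, x, y below are just the usual textbook notation:

```latex
\begin{aligned}
\text{(P)}\quad & \max_{x \ge 0} \; c^\top x \quad \text{s.t. } A x \le b \\
\text{(D)}\quad & \min_{y \ge 0} \; b^\top y \quad \text{s.t. } A^\top y \ge c \\
& c^\top x \;\le\; (A^\top y)^\top x \;=\; y^\top A x \;\le\; y^\top b
\end{aligned}
```

Any feasible y is a constraint on how good x can be, and any feasible x is a constraint on how good y can be. Taking the dual of (D) gives back (P), which is why there is no third level to climb to.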
Of course, we may also use some random search at any given step before switching.