For this example, I think you can do this if you implement the additional query “How likely is the search on [partial solution] to return a complete solution?”. This is asked of all potential branches before recursing into them. A2 learns to answer the solution-probability query efficiently.
Then, in the amplification of A2 (the top level of B2), when looking for a solution to a problem of length d1, the root agent first asks “How likely is the search on [string starting with 0] to return a complete solution?” and “How likely is the search on [string starting with 1] to return a complete solution?”. It then recurses into whichever subtree is most likely to contain a solution. (This doesn’t improve worst-case running time, but it does improve average-case running time.)
This is analogous to running a value estimation network in tree search, and then picking the most promising node to query first.
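To make the control flow concrete, here is a minimal sketch of that guided search (my own illustration, not code from the discussion). The names `solution_probability` and `is_solution` are hypothetical stand-ins for A2’s learned answer to the probability query and for checking whether a completed string solves the problem.

```python
# Minimal sketch of the proposed search; names are illustrative assumptions.
# `solution_probability` stands in for A2's learned answer to
# "How likely is the search on [partial solution] to return a complete solution?"
# and `is_solution` stands in for checking a completed string against the problem.

def guided_search(prefix, depth, is_solution, solution_probability):
    """Search binary strings of length `depth`, asking the probability query
    of both branches before recursing, and trying the more promising one first."""
    if len(prefix) == depth:
        return prefix if is_solution(prefix) else None

    # Ask the query of all potential branches before recursing into them.
    branches = sorted((prefix + "0", prefix + "1"),
                      key=solution_probability, reverse=True)

    for branch in branches:
        result = guided_search(branch, depth, is_solution, solution_probability)
        if result is not None:
            return result
    return None
```

The worst case still visits all 2^d1 leaves, but if the estimates are even moderately accurate the expected number of recursive calls drops, which is the average-case improvement described above.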
This seems to require that the heuristic be of a certain form and that you know what that form is. What if it’s more general, like running algorithm G on t to produce a list of guesses for s, then checking the guesses in that order?
1. I don’t think that every heuristic A2 could use to solve problems of depth d0 needs to be applicable to performing the search of depth d1 - we only need enough heuristics to be usable to keep increasing the search depth efficiently at each amplification round. It’s possible that some of the value of heuristics like “the solution is likely to be an output of algorithm G” could be (imperfectly) captured through some small universal set of heuristics that we can specify how to learn and exploit. (I think that variations on “How likely is the search on [partial solution] to produce an answer?” might get us pretty far; a toy sketch of this follows below.)
The AlphaGo analogy is that the original supervised move prediction algorithm didn’t necessarily learn every heuristic that the experts used, but just learned enough to be able to efficiently guide the MCTS to better performance.
(Though I do think that imperfectly learning heuristics might cause alignment problems without a solution to the aligned search problem).
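As a toy illustration of how the universal query might (imperfectly) absorb a heuristic like “the solution is likely an output of algorithm G”: suppose the learned estimator ends up assigning high probability to partial solutions that are consistent with G’s guesses. The construction below is my own, not something from the discussion; `guesses` would be the hypothetical output of running G on t.

```python
def make_g_shaped_estimator(guesses, base_rate=1e-6):
    """Return a solution_probability function that prefers partial solutions
    consistent with `guesses` (a hypothetical list produced by algorithm G on t).
    This is roughly what an estimator trained on problems whose solutions
    came from G might converge to."""
    def solution_probability(prefix):
        matching = sum(1 for g in guesses if g.startswith(prefix))
        # Prefixes consistent with more of G's guesses look more promising;
        # everything else falls back to a small base rate.
        return matching / len(guesses) if matching else base_rate
    return solution_probability
```

Plugging this into the guided search sketched above makes it explore G’s guesses before anything else, though only imperfectly: it ignores G’s ordering of the guesses and rediscovers each one bit by bit, which is the sense in which only some of the heuristic’s value is captured.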
2. This isn’t a problem if, once the agent A2 can run algorithm G on t for problems of depth d0, it can directly generalize to applying G to problems of depth d1. Simple deep RL methods aren’t good at this kind of task, but architectures like the Neural Turing Machine are trying to do better on it. So the ability to learn efficient exponential search could be limited by the capability of the underlying agent: for some capability range, a problem could be solved directly by an unaligned agent but not by an aligned agent. This isn’t a problem if we can surpass that level of capability.
I’m not sure that these considerations fix the problem entirely, or whether Paul would take a different approach.
It also might be worth coming up with a concrete example where some heuristics are not straightforward to generalize from smaller to larger problems, and where it seems like this will prevent efficiently learning to solve large problems. The problem, however, would need to be something that humans can solve (e.g. finding a string that hashes to a particular value under a cryptographic hash function would be hard to generalize any heuristics from, but I don’t think humans could do it either, so it’s out of scope).