Even if this is the case, it seems like you’re giving up on being competitive on speed by saying “well, we could just use brute force search.”
The efficiency of the hypothetical amplification process doesn't have much direct effect on the efficiency of the training process. It affects the number of "rounds" of amplification you need to do, but the rate is probably limited mostly by the ability of the underlying ML to learn new stuff.
There needs to be an ordering on solutions-to-evaluate such that you can ensure the evaluators are pointed at different solutions and cover the whole solution space.
You can pick randomly.
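As a toy sketch of why no ordering is needed (all names and the search setup are hypothetical, not anyone's proposed implementation): if each evaluator scores an independent uniform random sample of candidates, the evaluators cover the solution space in expectation without any coordination.

```python
import random

def find_best(candidates, evaluate, num_evaluators, samples_each):
    """Toy brute-force search: each evaluator scores an independent random
    sample of candidates, so no global ordering over solutions is needed."""
    best, best_score = None, float("-inf")
    for _ in range(num_evaluators):
        for candidate in random.choices(candidates, k=samples_each):
            score = evaluate(candidate)  # stand-in for "securely evaluate whether a solution is good"
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score

# Example: among the integers 0..999, find the one closest to 700.
print(find_best(list(range(1000)), lambda x: -abs(x - 700),
                num_evaluators=20, samples_each=50))
```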
(It also seems to me like you're giving up on safety, as you point out later; one of the reasons why heuristic search methods for optimization seem promising to me is that you can also be doing safety evaluation there, such that more dangerous solutions are less likely to be considered in the first place.)
I agree that this merely reduces the problem of “find a good solution” to “securely evaluate whether a solution is good” (that’s what I was saying in the grandparent).
or we discover that we have too much state to successfully pass around, and thus there’s not really a meaningful sense in which we can have separate short-lived agents
The idea is to pass around state by distributing it across a large number of agents. Of course it's an open question whether that works; that's what we want to figure out.
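A minimal illustration of the idea (the sharding scheme and names are my own, purely for concreteness): split the state into shards held by many short-lived agents, and have a coordinator that holds no state itself and only routes each question to the agent whose shard could be relevant.

```python
def shard_state(knowledge, num_agents):
    """Split a large body of state into shards, one per short-lived agent."""
    shards = [dict() for _ in range(num_agents)]
    for key, value in knowledge.items():
        shards[hash(key) % num_agents][key] = value
    return shards

def answer(question, shards):
    """The coordinator holds no state of its own; it just routes each question
    to the one agent whose shard could contain the relevant piece."""
    return shards[hash(question) % len(shards)].get(question, "don't know")

knowledge = {"derivative of x^2": "2x", "integral of 1/x": "ln|x| + C"}
shards = shard_state(knowledge, num_agents=4)
print(answer("derivative of x^2", shards))
```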
(or not really a meaningful sense in which we can be competitive with agents that do maintain all that state)
Again, the hypothetical amplification process is not intended to be competitive; that's the whole point of iterated amplification.
But if we think that the different branches are mutually informative, then we want to have a linkage between those branches, which means horizontal links in this tree.
Only if we want to be competitive. Otherwise you can simulate horizontal links by just running the entire other subtree in a subcomputation. In the case of iterated amplification, that couldn't possibly change the speed of the training process, since only O(1) nodes are actually instantiated at a time anyway and the rest are distilled into the neural network. What would a horizontal link mean?
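Here is a sketch of what "simulate horizontal links by running the other subtree in a subcomputation" could look like (the decomposition and helper functions are made up for illustration): instead of branch A talking to a live sibling branch B, A simply spawns a fresh evaluation of B's subtree whenever it needs B's answer, so only one path of the tree ever needs to exist at a time.

```python
def amplify(question, depth):
    """Toy vertical tree: only the current path is ever instantiated."""
    if depth == 0:
        return base_answer(question)
    subanswers = []
    for sub in decompose(question):
        # A "horizontal link" to a sibling branch is simulated by simply
        # re-running that sibling's whole subtree as a subcomputation here.
        subanswers.append(amplify(sub, depth - 1))
    return combine(question, subanswers)

# Hypothetical helpers; in iterated amplification the leaf answers would come
# from the distilled network rather than a hand-written stub.
def base_answer(q): return f"guess({q})"
def decompose(q): return [f"{q}.left", f"{q}.right"]
def combine(q, subs): return f"{q}=(" + " + ".join(subs) + ")"

print(amplify("Q", depth=2))
```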
the intuition network as part of the state of each short-lived agent
The intuition network is a distillation of the vertical tree; it's not part of the amplification process at all.
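To make the distinction concrete (a schematic only; the decomposition is hypothetical and a lookup table stands in for the trained network): the network sits outside the tree, is trained to imitate the tree's outputs, and is then called at the leaves of the next round's tree.

```python
def amplified_answer(question, model, depth):
    """One round's 'vertical' tree: recurse on subquestions, calling the
    distilled model only at the leaves."""
    if depth == 0:
        return model(question)
    subs = [f"{question}/a", f"{question}/b"]  # hypothetical decomposition
    return question + ":[" + ",".join(
        amplified_answer(s, model, depth - 1) for s in subs) + "]"

def distill(pairs):
    """Stand-in for training a network on (question, amplified answer) pairs;
    a lookup table plays the role of the intuition network here."""
    table = dict(pairs)
    return lambda q: table.get(q, f"guess({q})")

def iterated_amplification(questions, num_rounds, depth):
    model = lambda q: f"guess({q})"  # untrained "intuition"
    for _ in range(num_rounds):
        pairs = [(q, amplified_answer(q, model, depth)) for q in questions]
        model = distill(pairs)       # the network is a distillation of the vertical tree
    return model

print(iterated_amplification(["Q1", "Q2"], num_rounds=2, depth=2)("Q1"))
```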
and that couldn’t be implemented without this horizontal linkage
I don’t think that’s right; I also don’t see how a ‘horizontal’ linkage would differ from a normal vertical linkage, since you can just unroll the computation.
are ones where this sort of wide state is relevant and not easily compressible
The main thing I’m looking for is examples of particular kinds of state that you think are incompressible. For example, do you think modern science has developed kinds of understanding that couldn’t be distributed across many short-lived individuals (in a way that would let you e.g. use that knowledge to answer questions that a long-lived human could answer using that knowledge)?
Last time this came up Eliezer used the example of calculus. But I claim that anything you can formalize can’t possibly have this character, since you can distribute those formal representations quite easily, with the role of intuition being to quickly reach conclusions that would take a long time using the formal machinery. That’s exactly the case where amplification works well. (This then led to the same problem with “if you just manipulate things formally, how can you tell that the hypothesis is just making predictions rather than doing something evil, e.g. can you tell that the theory isn’t itself an optimizer?”, which is what I mentioned in the grandparent.)