HCH is the result of a potentially infinite exponential process (see figure 1) and is therefore computationally intractable. In reality, we cannot break down any task into its smallest parts and solve these subtasks one after another, because that would take too much computation. This is why we need to iterate distillation and amplification and cannot just amplify.
In general, your post describes amplification (and HCH) as increasing the capability of the system, and distillation as saving on computation or making things more efficient. But my understanding, based on this conversation with Rohin Shah, is that amplification is also intended to save on computation (otherwise we could just try to imitate humans). In other words, the distillation procedure can learn more quickly by training on data provided by the amplified system than by training on data from the unamplified system. So I don’t like the phrasing that distillation is the part that’s there to save on computation, because both parts seem to be aimed at that.
(I am making this comment because I want to check my understanding with you, or make sure you understand this point, since it doesn’t seem to be stated in your post. It was one of the most confusing things about IDA to me, and I’m still not sure I fully understand it.)
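
To make the "exponential process" part of the quote concrete, here is a rough Python sketch of the call counts involved. It assumes a uniform HCH tree with a fixed branching factor and depth (real decompositions need not look like this), and the function names are my own, not anything from the post. It only counts calls. It says nothing about training cost or about how quickly distillation learns, which is the part I am unsure about.

```python
def hch_calls(branching: int, depth: int) -> int:
    """Count nodes in a full tree where every node asks `branching`
    subquestions, down to `depth` levels (hypothetical uniform tree)."""
    return sum(branching ** level for level in range(depth + 1))


def amplified_calls_per_step(branching: int) -> int:
    """One amplification step on a single question: one human call plus
    `branching` calls to the current distilled model (an assumption)."""
    return 1 + branching


if __name__ == "__main__":
    for depth in (1, 5, 10):
        print(
            f"depth={depth}: full tree={hch_calls(3, depth)} calls, "
            f"single amplification step={amplified_calls_per_step(3)} calls"
        )
```

Under these assumptions, the full tree grows geometrically with depth, while a single amplification step stays at a constant number of calls per question, because the subquestions go to the distilled model rather than to further subtrees.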