Question: Does ARC consider ELK-unlimited to be solved, where ELK-unlimited is ELK without the competitiveness restriction (computational resource requirements comparable to the unaligned benchmark)?
One might suppose that the “have AI help humans improve our understanding” strategy is a solution to ELK-unlimited because its counterexample in the report relies on the competitiveness requirement. However, there may still be other counterexamples that were less straightforward to formulate or explain.
I’m asking for clarification of this point because I notice most of my intuitions about counterexamples aren’t drawing heavily on the competitiveness requirement, and I suspect ELK-unlimited is still open. If ARC doesn’t think so, maybe this discrepancy will become a source of new counterexamples.
My guess is that “help humans improve their understanding” doesn’t work anyway, at least not without a lot of work, but it’s less obvious and the counterexamples get weirder.
It’s less clear whether ELK is a less natural subproblem for the unlimited version of the problem. That is, if you try to rely on something like “human deliberation scaled up” to solve ELK, you probably just have to solve the whole (unlimited) problem along the way.
It seems to me like the core troubles with this point are:
You still have finite training data, and we don’t have a scheme for collecting it. This can result in inner alignment problems (and it’s not clear those can be distinguished from other problems, e.g. you can’t avoid them with a low-stakes assumption).
It’s not clear that HCH ever figures out all the science, no matter how much time the humans spend (and having a guarantee that you eventually figure everything out seems kind of close to ELK, where the “have AI help humans improve our understanding” strategy is to some extent just punting to the humans+AI to figure out something).
Even if HCH were to work well, it would probably be overtaken by internal consequentialists, and I’m not sure how to address that without competitiveness. (Though you may need a weaker form of competitiveness.)
I’m generally interested in crisper counterexamples, since the ones above are a bit of a mess.