It seems to me that the meaning of the set C of cases drifts significantly between when it is first introduced and when it reappears in the “Implications” section. It further seems to me that clarifying what exactly C is supposed to be resolves the claimed tension between the existence of iterably improvable ontology identifiers and the difficulty of learning human concept boundaries.
Initially, C is taken to be a set of cases such that the question Q has an objective, unambiguous answer. Cases where the meaning of Q is ambiguous are meant to be discarded. For example, if Q is the question “Is the diamond in the vault?” then, on my understanding, C ought to exclude cases where something happens that renders the concepts “the diamond” and “the vault” ambiguous, e.g. cases where the diamond is ground into dust.
In contrast, in the section “Implications,” the existence of iterably improvable ontology identifiers is taken to imply that the resulting ontology identifier would be able to answer the question Q in a much larger set of cases C′, in which the very meaning of Q relies on unspecified facts about the state of the world and how they interact with human values.
(For example, it seems to me that the authors think it implausible that an ontology identifier would be able to answer a question like “Is the diamond in the vault?” in a case where the notion of “the vault” is ambiguous; the ontology identifier would need to first understand that what the human really wants to know is “Will I be able to spend my diamond?”, reinterpret the former question in light of the latter, and then answer. I agree that an ontology identifier shouldn’t be able to answer ambiguous, context-dependent questions like these, but it seems to me that such cases should have been excluded from the set C.)
To dig into where specifically I think the formal argument breaks down, let me write out my interpretation of the central claim on iterability in more detail. The claim is:
Claim: Suppose there exists an initial easy set E0⊆C such that for any E with E0⊆E⊊C, we can find a predictor that does useful computation with respect to E. Then we can find a reporter that answers all cases in C correctly.
This seems right to me (modulo more assumptions on “we can find,” not-too-largeness of the sets, etc.). But crucially, since the hypothesis quantifies over all sets E such that E0⊆E⊊C, this hypothesis becomes stronger the larger C is. In particular, if C were taken to include cases where the meaning of Q were fraught or context-dependent, then we should already have strong reason to doubt that this hypothesis is true (and therefore not be surprised when assuming the hypothesis produces counterintuitive results).
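To make the “stronger hypothesis” point precise, here is a minimal sketch in my own notation (not the authors’), writing Useful(E) as shorthand for “we can find a predictor that does useful computation with respect to E”:

```latex
\begin{align*}
H(C) &\;:\iff\; \forall E\,\bigl(E_0 \subseteq E \subsetneq C \implies \mathrm{Useful}(E)\bigr) \\
C \subseteq C' &\implies \{E : E_0 \subseteq E \subsetneq C\} \subseteq \{E : E_0 \subseteq E \subsetneq C'\} \\
&\implies \bigl(H(C') \implies H(C)\bigr)
\end{align*}
```

That is, enlarging the case set enlarges the family of sets E the hypothesis quantifies over, so the hypothesis for a larger C′ ⊇ C entails the hypothesis for C. This is the sense in which assuming the claim for an expansive, ambiguity-laden C′ is a strictly stronger assumption than assuming it for the restricted C.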
(Note that the ELK document is sensitive to concerns about questions being philosophically fraught, and only considers narrow ELK for cases where questions have unambiguous answers. It also seems important that part of the set-up of ELK is that the reporter must “know” the right answers and “understand” the meanings of the questions posed in natural language (for some values of “know” and “understand”) in order for us to talk about eliciting its knowledge at all.)