Thanks! I think I understand this now.
I will say some things that occurred to me while thinking more about this, and hope that someone will correct me if I get something wrong.
“Human imitation” is sometimes used to refer to the outward behavior of the system (e.g. “imitation learning”, and in posts like “Just Imitate Humans?”), and sometimes to refer to the model of the human inside the system (e.g. here when you say “the human imitation is telling the system what to do”).
A system that is more capable than a human can still be a “human imitation”, because the term is being used in the sense of “modeling humans inside the system” rather than “has the outward behavior of a human”.
There is a distinction between the counterfactual training procedure and the resulting system. “Counterfactual oracle” (singular) seems to refer to the resulting system, and Paul calls this “the system” in his “Human-in-the-counterfactual-loop” post. “Counterfactual oracles” (plural) is used both as the plural of the resulting system and as a label for the general training procedure. “Human-in-the-counterfactual-loop”, “counterfactual human oversight”, and “counterfactual oversight” all refer to the training procedure (but only when the procedure uses a model of the human).