Submission: Low-bandwidth oracle. Here is a description of one million experiments. Generate an ordinal ranking of all these experiments based on how much practical knowledge we would have gained had we conducted each one in isolation. After you generate this list, it will go to a computer program. This program will randomly pick three of these experiments and tell us humans which three it picked. This program will then store which of these three experiments you gave the highest and which the lowest rank to, and the names in random order of the ten experiments that you overall gave the highest rankings to. All other aspects of your list will be deleted. We will then perform the three experiments (before we learn of your rankings) and ask a panel of independent experts to rank them in terms of how much practical knowledge we gained from each. After this ranking the program will give us the information it has not deleted. You will get a positive score if and only if your ranking is consistent with how our experts ranked the three experiments.
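To make the information flow concrete, here is a minimal sketch of the filtering program described above. The names `filter_ranking` and `score` are hypothetical. I am also assuming that "consistent" means the oracle's highest and lowest picks among the three match the experts' highest and lowest, and that an inconsistent ranking scores zero; the submission itself only says the score is positive if and only if the ranking is consistent.

```python
import random


def filter_ranking(ranking, rng, sample_size=3, top_k=10):
    """Reduce the oracle's full ranking to the few facts the program keeps.

    ranking: list of experiment names, best first (the oracle's output).
    rng: a random.Random instance controlled by the program, not the oracle.
    """
    position = {name: i for i, name in enumerate(ranking)}

    # Pick the experiments to run; the humans are told these right away.
    sampled = rng.sample(ranking, sample_size)

    # Among the sampled experiments, keep only the oracle's best and worst.
    ordered = sorted(sampled, key=position.__getitem__)

    # Keep the overall top names, shuffled so their internal order is lost.
    top = ranking[:top_k]  # slicing copies, so the input list is untouched
    rng.shuffle(top)

    # Everything else in the ranking is discarded by not being returned.
    return {
        "announced": sampled,
        "sealed": {"highest": ordered[0], "lowest": ordered[-1], "top": top},
    }


def score(sealed, expert_order):
    """expert_order: the sampled experiments as ranked by the panel, best first.

    Assumption: 1 for a consistent ranking, 0 otherwise.
    """
    consistent = (
        sealed["highest"] == expert_order[0]
        and sealed["lowest"] == expert_order[-1]
    )
    return 1 if consistent else 0
```

With three sampled experiments, knowing the highest and lowest fixes the full order, so the sealed record is enough to check consistency. The sealed part is released only after the expert panel has ranked the three experiments.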
One difference between the above approach and the alternative, in which the oracle gives us its top ten and we then randomly pick one of these ten and one from the rest and compare them to see which is better, is that under the alternative the humans running the experiments might be influenced by which experiment they think the oracle picked, and the oracle would take this into account when picking. Under the above approach, the humans told to run the three experiments are given no information about how valuable the oracle thinks the experiments are. In contrast, under the alternative approach the humans know, while doing the experiments, that the oracle has judged one of them to be highly valuable.
Another difference is that, if you could keep the relevant information from the oracle, the above approach would let the humans perform the experiments before you turn on the oracle, rather than having to wait until after the oracle gives you its prediction. This difference would be critical if the oracle wouldn’t believe that you would actually go to the trouble of performing the experiments after it gives you its prediction, but would be able to tell whether you had already performed them.