I must admit that I did not understand everything in the paper, but I think this excerpt summarizes a crucial point:
“The key issue here is proper conditioning. The unbiasedness of the value estimates V_i discussed in §1 is unbiasedness conditional on mu. In contrast, we might think of the revised estimates ^v_i as being unbiased conditional on V. At the time we optimize and make the decision, we know V but we do not know mu, so proper conditioning dictates that we work with distributions and estimates conditional on V.”
The proposed “solution” converts n independent evaluations into n revised estimates that respect the selection process, but, as far as I can tell, those revised estimates still rest on a prior over the true values and prior knowledge about the uncertainty of the original estimates… Which means the “solution” at best limits the introduction of optimizer bias, and at worst… masks old mistakes baked into that prior?
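To make the conditioning point concrete, here is a minimal simulation sketch (my own, not from the paper), assuming normal priors with known variances: each V_i is unbiased for mu_i conditional on mu, yet the V_i of the chosen alternative overstates its mu_i; shrinking toward the prior mean (the posterior mean given V) removes that selection bias, precisely because it is unbiased conditional on V.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 10, 20000
tau, sigma = 1.0, 1.0                   # prior std of true values, noise std of estimates
shrink = tau**2 / (tau**2 + sigma**2)   # posterior-mean factor (prior mean 0)

naive_gap, revised_gap = [], []
for _ in range(trials):
    mu = rng.normal(0.0, tau, n)        # true values (unknown at decision time)
    V = mu + rng.normal(0.0, sigma, n)  # unbiased estimates conditional on mu
    i = np.argmax(V)                    # pick the apparently best alternative
    naive_gap.append(V[i] - mu[i])              # positive on average: optimizer's curse
    revised_gap.append(shrink * V[i] - mu[i])   # revised estimate, unbiased given V

print(f"naive   bias: {np.mean(naive_gap):+.3f}")    # clearly positive
print(f"revised bias: {np.mean(revised_gap):+.3f}")  # near zero
```

Note that the fix depends entirely on `tau` and `sigma` being right, which is exactly the worry above: if the assumed prior or noise level is wrong, the shrunken estimates are only as good as those old inputs.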