My reading of dadadarren is that you can use that method to make a decision, but you cannot use it to determine whether it is correct. How would you (the I in that situation) determine that? You can't. The I either gets 1000 or 999, and it learns whether it is an L or an R, but not with which probability. The formula gives an expected value over a lot of such interactions. Which ones count? If only those of the I count, then it will never be any wiser, even if it loses or wins all the time: it could just be one of the lucky or unlucky ones. Only by comparing against a group of other first persons can you evaluate that, but, as dadadarren says, then it is no longer about the I.