It is not obvious to me that people should be penalized so strongly for being wrong. In fact, I think that is a big part of the question I am asking. Would you rather be right 999 times and 100% sure but wrong the last time, or would you rather have no information on anything?
Would you rather be right 999 times and 100% sure but wrong the last time, or would you rather have no information on anything?
Is that a rhetorical question? Obviously it depends on the application domain: if we were talking about buying and selling stocks, I would certainly rather have no information about anything than experience a scenario where I was 100% sure and then wrong. In that scenario I would presumably have bet all my money, and maybe lots of my investors’ money, and then lost it all.
It does depend on the domain. I think the reason you want to be very risk-averse in stocks is that you have adversaries trying to take your money, so you get all the negatives of being wrong without all the positives of the 999 times you knew the stock would rise and were correct.
In other cases, such as deciding which route to take while traveling to save time, I’d rather be wrong every once in a while so that I could be right more often.
Both of these ideas are about instrumental rationality. So the question is: if you are trying to come up with a model of epistemic rationality that does not depend on utility functions, what type of scoring should you use?
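To make the "999 right, 1 wrong" comparison concrete, here is a rough sketch that scores both forecasters under two common rules. The choice of the quadratic (Brier) and logarithmic penalties is my own assumption for illustration, as are all the function names; nothing above commits to either rule.

```python
import math

def brier_penalty(p, outcome):
    """Quadratic (Brier) penalty for stating probability p on a binary event."""
    return (p - outcome) ** 2

def log_penalty(p, outcome):
    """Logarithmic penalty: -ln of the probability assigned to what happened."""
    q = p if outcome == 1 else 1.0 - p
    return math.inf if q == 0.0 else -math.log(q)

def total(penalty, forecasts):
    """Sum a penalty over a list of (stated probability, outcome) pairs."""
    return sum(penalty(p, o) for p, o in forecasts)

# Forecaster A: states 100% on 1000 questions, right on 999, wrong on 1.
confident = [(1.0, 1)] * 999 + [(1.0, 0)]
# Forecaster B: "no information", states 50% on every question.
uninformed = [(0.5, 1)] * 1000

print("Brier:", total(brier_penalty, confident), total(brier_penalty, uninformed))
print("Log:  ", total(log_penalty, confident), total(log_penalty, uninformed))
```

Under these assumptions the quadratic rule prefers the confident forecaster (total penalty 1 versus 250), while the logarithmic rule gives the confident forecaster an infinite penalty for the single miss, against roughly 693 for the uninformed one. So the answer to "which would you rather be" already depends on the scoring rule, before any utility function enters.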
This discussion suggests that the puzzles presented to the guesser should be associated with a “stake”: a numeric value that says how much you (the asker) care about this particular question being answered correctly (i.e. how risk-averse you are on this particular occasion). Can this somehow be incorporated into the reward function itself, or does it need to be a separate input? Is “I want to know if this stock will go up or down, and I care 10 times as much about this question as about whether it will rain today” the same thing as “Please estimate p for the following two questions, where the reward function for the first one is f(x)=10(x-x^2) and for the second is f(x)=x-x^2”? Does it require some additional output channel from the guesser (“I am 90% confident that p is 80%”, or maybe even “Here’s my distribution over the values of p \in (0,1)”), or does it collapse into one dimension anyway? Does “I am 90% confident that p is 80% and 10% confident that it’s 70%” collapse to “I think p is 79%”? Does a distribution over p collapse to its expected value?
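One way to probe the last two questions numerically, for a single yes/no question, is sketched below. I am assuming a stake-weighted quadratic reward here, which is not necessarily the f(x) written above, and brier_reward, expected_reward and best_report are hypothetical names. The sketch searches for the report that maximizes expected reward when the guesser's belief is a mixture over p.

```python
def brier_reward(report, outcome, stake=1.0):
    """Stake-weighted quadratic reward, higher is better (assumed form)."""
    return stake * (1.0 - (report - outcome) ** 2)

def expected_reward(report, belief, stake=1.0):
    """belief is a list of (weight, p) pairs: a distribution over the unknown p."""
    return sum(
        w * (p * brier_reward(report, 1, stake)
             + (1 - p) * brier_reward(report, 0, stake))
        for w, p in belief
    )

def best_report(belief, stake=1.0, steps=1000):
    """Grid search over reports 0.000, 0.001, ..., 1.000."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: expected_reward(r, belief, stake))

# "90% confident that p is 80% and 10% confident that it's 70%"
belief = [(0.9, 0.8), (0.1, 0.7)]
mean_p = sum(w * p for w, p in belief)

print("mean of belief:      ", mean_p)
print("best report, stake 1:", best_report(belief))
print("best report, stake 10:", best_report(belief, stake=10.0))
```

With this reward the best report lands on the mean of the belief, 0.79, and multiplying the reward by 10 does not move it. That is because the expected reward is linear in p, so only the expected value of p matters, and a positive constant factor never changes where a maximum sits. This suggests that, at least for one binary question under a rule of this kind, the stake changes how much the answer matters but not what the guesser should say, so it would have to influence things through some other channel.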