If you know the scores of two different golfers on day 1, then you know more than if you know the score of only one golfer on day 1. You can’t predict the direction in which regression to the mean will occur if your data set is a single point.
The following all have different answers:
I play a certain video game a lot. The last time I played it, my score was 39700. What’s your best guess for my score the next time I play it?
(The answer is 39700; I’m probably not going to improve with practice, and you have no way to know if 39700 is unusually good or unusually bad.)
My friend and I both play a certain video game a lot. The last time I played it, my score was 39700. The last time my friend played it, his score was 32100. What’s your best guess for my score the next time I play it?
(The answer is some number less than 39700; knowing that my friend got a lower score gives you a reason to believe that 39700 might be higher than normal.)
I played a video game for the first time yesterday. My score was 39700, and higher scores are better than lower ones. What’s your best guess for my score the next time I play it?
(The answer is some number higher than 39700, because I’m no longer an absolute beginner.)
True, a single data point can’t give you knowledge of regression effects. In the context of the original problem, Kahneman assumed that you had access to the average score of all the golfers on the first day.
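To make the role of the group average concrete, here is a minimal simulation sketch. It assumes a hypothetical model that is not from the original problem: each player has a fixed skill, and each day's score is that skill plus independent luck. The numbers (a mean of 35000, standard deviations of 3000, the 39000 cutoff) are made up for illustration.

```python
import random
from statistics import mean


def simulate(n_players=20000, skill_mean=35000.0, skill_sd=3000.0,
             luck_sd=3000.0, seed=0):
    """Return (day1, day2) score pairs under a skill-plus-luck model.

    All parameters are illustrative assumptions, not data from the thread.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_players):
        skill = rng.gauss(skill_mean, skill_sd)    # stays fixed across days
        day1 = skill + rng.gauss(0.0, luck_sd)     # luck is redrawn each day
        day2 = skill + rng.gauss(0.0, luck_sd)
        rows.append((day1, day2))
    return rows


rows = simulate()
overall = mean(d1 for d1, _ in rows)

# Players whose day-1 score was well above the group average.
high = [(d1, d2) for d1, d2 in rows if d1 >= 39000]
high_day1 = mean(d1 for d1, _ in high)
high_day2 = mean(d2 for _, d2 in high)

print(f"group average on day 1:      {overall:.0f}")
print(f"high scorers, day 1 average: {high_day1:.0f}")
print(f"high scorers, day 2 average: {high_day2:.0f}")
```

Under these assumptions, the high scorers' day-2 average falls between the group average and their own day-1 average. The prediction of a drop only becomes available once you know where the group average sits relative to the score.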
I played a video game for the first time yesterday. My score was 39700, and higher scores are better than lower ones. What’s your best guess for my score the next time I play it?
(The answer is some number higher than 39700, because I’m no longer an absolute beginner.)
I’m not sure it’s true that the answer is higher than 39700, in this case. It depends on whether you know how people generally improve, and whether your score is higher than average for an absolute beginner. Since unknown factors could adjust the score either up or down, I would probably just guess that it will be the same the next day.
The existence of factors which could adjust the score either up or down does not indicate which factors dominate. In this case, you have no information which suggests that 39700 is either above or below the median, and therefore these two cases must be assigned equal probability—canceling out any “regression to the mean” effects you could have predicted. Similar arguments apply to other effects which change the score.
So you estimate “regression to the mean” effects as zero, and base your estimate on any other effects you know about and how strong you think they are. That makes sense. Thanks for the correction!
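As a rough sketch of that rule, the three video game questions can be put into one function. Everything here is a hypothetical illustration: `best_guess`, the `reliability` factor (how much of a deviation from the group average is assumed to persist), and `practice_gain` are assumptions, not values anyone in this thread supplied.

```python
def best_guess(score, group_mean=None, reliability=0.5, practice_gain=0.0):
    """Guess the next score from the last one.

    group_mean:    average of comparable scores, if known (None = unknown).
    reliability:   assumed fraction of the deviation from group_mean that
                   persists (0 = all luck, 1 = all skill).
    practice_gain: assumed improvement from other known effects.
    """
    if group_mean is None:
        # No reference point: the regression effect is estimated as zero.
        expected = float(score)
    else:
        # Shrink the observed score toward the known average.
        expected = group_mean + reliability * (score - group_mean)
    return expected + practice_gain


# Case 1: experienced player, no other information.
print(best_guess(39700))

# Case 2: a friend scored 32100, so use the two-score average as a crude
# stand-in for the group mean (an assumption of this sketch).
print(best_guess(39700, group_mean=(39700 + 32100) / 2))

# Case 3: first-time player, with an assumed gain from practice.
print(best_guess(39700, practice_gain=500))
```

The first call returns the score unchanged, the second returns something below 39700, and the third returns something above it, matching the three answers given earlier. How far the second and third move depends entirely on the assumed `reliability` and `practice_gain`.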
In this case, you have no information which suggests that 39700 is either above or below the median, and therefore these two cases must be assigned equal probability
Not quite: you have some background information about the range of scores video games usually employ.
And, I suppose, information about the probability of people mentioning average scores. I concede that either factor could justify arguing that the score should decrease.