That’s a fairly straightforward question.
So let’s say there is a value which we can’t observe directly. Our indirect observations come with noise. To keep things simple let’s assume that our observations are iid and that the noise is zero-mean (unbiased) additive Gaussian. Let’s also assume that the noise is the same for you and for the other observer (if you don’t know how noisy the other guy’s observations are, you won’t be able to answer the question).
Let’s say you have n observations. Let’s call the standard deviation of the noise ‘sigma’ so that noise ~N(0, sigma). You care about “higher” so the significance is going to be one-tailed. You want 90% confidence and our noise is Gaussian, so your estimate needs to be about 1.3 standard errors (SE) above the threshold (the exact one-tailed 90% z-value is about 1.28).
The SE is just sigma / sqrt(n). This means that your estimate has to be greater than (0.8 + 1.3 * sigma / sqrt(n)) for you to have 90% confidence the true value is larger than 0.8.
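To make the threshold concrete, here is a minimal Python sketch of that calculation. The function name `min_estimate` and the example numbers (sigma = 0.5, n = 25) are made up for illustration; only the 0.8 threshold and the 90% level come from the question. It uses the exact z-value from the standard library rather than the rounded 1.3.

```python
from math import sqrt
from statistics import NormalDist


def min_estimate(threshold, sigma, n, confidence=0.90):
    """Smallest estimate giving one-tailed `confidence` that the
    true value exceeds `threshold`, assuming iid Gaussian noise."""
    z = NormalDist().inv_cdf(confidence)  # about 1.28 for 90%
    se = sigma / sqrt(n)                  # standard error of the mean
    return threshold + z * se


# Illustrative numbers only: sigma = 0.5 and n = 25 give SE = 0.1
example = min_estimate(0.8, sigma=0.5, n=25)
print(round(example, 4))
```

With SE = 0.1 the estimate has to clear roughly 0.93 instead of 0.8.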
But your question is different. You want 90% confidence not that the true value is >0.8, but that 90% of the samples will show 90% confidence that the value is >0.8. That’s not hard.
An estimate will provide 90% confidence if it is greater than 0.8 + 1.3 sigma / sqrt(n). The estimates’ standard deviation is sigma/sqrt(n), so we just repeat: you can “be more than 90% confident that the other guy is more than 90% confident” if your estimate is above (0.8 + 1.3 sigma / sqrt(n)) + (1.3 sigma / sqrt(n)) = 0.8 + 2.6 sigma / sqrt(n).
So if you see an estimate that’s more than 2.6 SEs greater than 0.8 (which would lead to your own confidence about the true value being in excess of 99%), you can be 90% sure that the other guy is 90% sure.
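A quick way to check these numbers is the sketch below. It follows the same simplification as the calculation above: your estimate is plugged in as the centre of the other observer's sampling distribution. The names and the example values (sigma = 0.5, n = 25) are illustrative assumptions, and with the exact z-value the multiplier comes out near 2.56 rather than the rounded 2.6.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()


def nested_min_estimate(threshold, sigma, n, confidence=0.90):
    """Estimate needed to be `confidence` sure that another observer
    (same sigma, same n) is `confidence` sure the value > threshold."""
    z = std_normal.inv_cdf(confidence)
    se = sigma / sqrt(n)
    return threshold + 2 * z * se  # one z*SE for each layer of confidence


sigma, n = 0.5, 25                 # illustrative numbers, SE = 0.1
se = sigma / sqrt(n)
z = std_normal.inv_cdf(0.90)
cutoff = nested_min_estimate(0.8, sigma, n)

# Plug-in step: centre the other observer's estimates on your cutoff.
other = NormalDist(mu=cutoff, sigma=se)
p_other_confident = 1 - other.cdf(0.8 + z * se)

# Your own one-tailed confidence that the true value exceeds 0.8.
own_confidence = std_normal.cdf((cutoff - 0.8) / se)

print(round(cutoff, 4), round(p_other_confident, 3), round(own_confidence, 4))
```

Under these assumptions `p_other_confident` comes out at 0.90 and `own_confidence` is a bit above 0.99, matching the “in excess of 99%” remark.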
Thanks!
Does it make sense to do the “this is what the world would look like if our sampling methodologies were the same” calculation when you have strong reasons to suspect that the sampling methodologies are not the same?
The calculation is more to convince members of the rationalist community that you need extremely strong evidence to believe that rationalist Trump voters thought Trump was racist. Assuming differing sampling methodologies would strengthen the result beyond the 2.6 SEs that Lumifer calculated.