jacobjacob once again seems too pessimistic; my posterior leans very heavily toward the view that when habryka makes a 60% yes prediction for a decision he has (partial) control over, about a functionality which the community has glommed onto thus far, the community is also justified in expressing ~60% belief that the feature ships. :)
Also, we aren’t selecting from “most possible features”!
This has now resolved false.
It seems important to me that he is in the top 100 on the Metaculus leaderboard and you are not.
As a fellow member of the top 100 Metaculus leaderboard I unfortunately have to tell you that it doesn’t measure how well someone is calibrated. You get onto it by making a lot of predictions on Metaculus.
I think this is only partly right. I’ve personally interfaced with most of the people in the top 20 -- some of them for 20+ hours -- and I’ve generally been deeply impressed with them, and expect the tails of the Metaculus rankings to track important stuff about people’s cognition.
But yeah, I also found that I could get to around rank 70 just by predicting the same as the community a bunch, and by finding questions that would resolve in just a few days and extremizing really hard (rough sketch of what I mean by extremizing below).
(That said, I still think I’m a reasonable forecaster, and have practiced it a lot independently. But I don’t think this Metaculus ranking is much evidence of that.)
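(In case “extremizing” is unfamiliar: here is a minimal, hypothetical sketch of the usual log-odds version of the idea. The alpha value and the example forecast are illustrative assumptions, not anything Metaculus actually computes.)

```python
import math

def extremize(p: float, alpha: float = 2.0) -> float:
    """Push probability p away from 0.5 by scaling its log-odds by alpha (> 1)."""
    log_odds = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-alpha * log_odds))

# A community forecast of 0.70, extremized with alpha = 2, becomes roughly 0.84.
print(extremize(0.70))
```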
You get Metaculus points by being active on Metaculus. People who have spent a lot of time on Metaculus have thought a lot about making predictions, and that’s valuable, but it’s not the same as it being a measure of calibration.
With the exception of jimrandomh, all the people in the top 20 have more than 1000 predictions (jimrandomh has 800). I’m ranked 87 at the moment with 195 predictions.
If you go on GJOpen you can see the calibration of any user and use it to judge how well they are calibrated. Metaculus doesn’t make that information publicly available (you need to have level 6 and pay Tachyons to access the track record of another user, and 50 Tachyons feels too expensive just for the sake of a discussion like this).
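(To make the distinction concrete, here is a rough, hypothetical sketch of the kind of calibration check GJOpen surfaces per user: bin a forecaster’s predictions and compare stated probabilities with observed frequencies. The binning and the sample data are made up for illustration; this is not the actual GJOpen or Metaculus computation.)

```python
from collections import defaultdict

def calibration_curve(predictions, bins=10):
    """predictions: list of (forecast_probability, resolved_yes) pairs."""
    buckets = defaultdict(list)
    for p, outcome in predictions:
        buckets[min(int(p * bins), bins - 1)].append(outcome)
    # Map each bin's midpoint to the observed frequency of "yes" resolutions.
    return {
        (b + 0.5) / bins: sum(outcomes) / len(outcomes)
        for b, outcomes in sorted(buckets.items())
    }

# A well-calibrated forecaster's observed frequencies track the bin midpoints.
sample = [(0.6, True), (0.6, False), (0.65, True), (0.9, True), (0.9, True)]
print(calibration_curve(sample))
```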
Argument screens off authority, and I’m interested in hearing arguments. While that information should be incorporated into your prior, I don’t see why it’s worth mentioning as a counterargument. (To be sure, I’m not claiming that jacobjacob isn’t a good predictor in general.)
It’s reasonable to be unsure whether the “people don’t ship things” consideration is stronger than the “people are excited” consideration. If you knew that the person who deployed the “people don’t ship things” consideration was generally a better predictor (which you don’t quite know here, but let’s simplify a bit), then that would suggest that the “people don’t ship things” consideration is in fact stronger.
(Actually downvoted Daniel for reasons similar to what TurnTrout mentions. Aumannian updating is so boring, even though it’s profitable when you’re betting all-things-considered… I also did give arguments above, but people mostly made jokes about my punctuation! #grumpy )
This is a timeless part of the LessWrong experience, my friend.
Aumann updating involves trying to inhabit the inside perspective of somebody else and guess what they saw that made them believe what they do—hardly seems boring to me! Also the thing I was doing was ranking my friends at skills, which I think is one of the classic interesting things.
I’m associating it with doing exactly not that: just using outside variables like “what do they believe” and “how generally competent do I expect them to be”. (I often see people going “but this great forecaster said 70%” and updating marginally closer, without even trying to build a model of that forecaster’s inside view.)
Your version sounds fun.
I guess I’m really making a bid for ‘Aumanning’ to refer to the thing that Aumann’s agreement theorem describes, rather than just partially deferring to somebody else.
Not a crux :)
Also, I’m betting on the prior, so I’ll have to accept taking a loss every now and then, expecting to do better, on average, in the long run.
To clarify: it’s not a lot of evidence that people say “yeah this thing is going to be great and we’ll work on it a lot”, in response to the very post where it was announced, only a few days afterward.
Like, I’d happily go to a bitcoin/Tesla party and take lots of bets against bitcoin/Tesla. I expect those places would give me some of the most generous odds for those bets.
Also, it’s some evidence that habryka predicted it, but man, people in general are notoriously bad at predicting how much they’ll get done with their time, so it’s certainly not super strong.
(That being said, I think this integration is awesome and kudos to everyone. Just keeping my priors sensible :)
I do not endorse this as a way to end parentheticals! Grrr!
You must understand—we have to ration our usage of parentheses, lest our strategic reserve again fails us in time of need…