Eliezer Yudkowsky comments on How not to sort by a complicated frequentist formula

Eliezer Yudkowsky 2 Jan 2013 0:15 UTC
12 points
The main thing you want to calculate here is expected-value-of-information. Otherwise new posts drop into the void. Trying to maximize upvotes in the long run means showing new posts that might have a high-upvoting parameter.
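
One rough way to see how a ranking can keep new posts from dropping into the void is Thompson sampling. To be clear, this is a stand-in heuristic, not a full expected-value-of-information calculation. The sketch below assumes each post has an unknown upvote probability with a Beta(1, 1) prior and ranks posts by a single random draw from each posterior. The function names, the prior, and the example vote counts are all illustrative assumptions.

```python
import random


def thompson_rank(items, rng, prior_up=1.0, prior_down=1.0):
    """Rank items by one random draw from each item's Beta posterior.

    items maps a name to (upvotes, downvotes). The Beta(1, 1) prior is an
    assumption made for this sketch, not something the comment specifies.
    """
    draws = {
        name: rng.betavariate(prior_up + up, prior_down + down)
        for name, (up, down) in items.items()
    }
    return sorted(draws, key=draws.get, reverse=True)


def top_slot_share(items, trials=10_000, seed=0):
    """Estimate how often each item lands in the top slot."""
    rng = random.Random(seed)
    wins = {name: 0 for name in items}
    for _ in range(trials):
        wins[thompson_rank(items, rng)[0]] += 1
    return {name: count / trials for name, count in wins.items()}


# Hypothetical data: one well-rated established post, one post with no votes.
items = {"established": (80, 20), "new": (0, 0)}
share = top_slot_share(items)
print(share)
```

Under these assumptions the unvoted post still takes the top slot a meaningful fraction of the time (roughly one time in five here), because its posterior is wide enough to occasionally outdraw the established post. That share shrinks on its own as votes accumulate and the posterior narrows.
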
- Manfred 2 Jan 2013 0:58 UTC
  16 points
  Well, you have to make somewhat of a secret sauce, because what you show on top should depend on what you want the top things, and the people who see them, to do. If you’re a site like reddit, putting highly rated stuff on the front page is good, but what you really want is highly rated new stuff. But if you’re urbandictionary, you don’t really care when something was submitted; you want to put the best answer at the top, except that if a word has multiple definitions, you want variety in your top definitions. Or maybe you’re amazon, and you want people to see stuff they’ve already bought so it’s easy for them to rate it. Etc.
- Meni_Rosenfeld 2 Jan 2013 10:26 UTC
  4 points
  This is interesting, especially considering that it favors low-data items, as opposed to both the confidence-interval-lower-bound and the notability adjustment factor, which penalize low-data items.
  
  You can try to optimize it in an explore-vs-exploit framework, but there would be a lot of modeling parameters, and additional kinds of data would need to be considered, specifically a measure of how many of those who viewed the item bothered to vote at all. Some comments will not get any votes simply because they are not that interesting, so if you keep placing them on top hoping to learn more about them, you’ll end up with very few total votes because you are showing people things they don’t care about.
  - Eliezer Yudkowsky 2 Jan 2013 10:29 UTC
    5 points
    Yep. You’d want to check or guess the size of the user’s monitor and where they were scrolling to, and calculate upvotes-per-actual-user-read. As things are read and not upvoted, your confidence that they’re not super-high-value items increases and the value of information from showing them again diminishes.
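
The point about reads without upvotes can be sketched numerically. The example below assumes the upvotes-per-actual-read rate has a Beta(1, 1) prior and scores an item by posterior mean plus two posterior standard deviations, a crude optimism bonus used here as a proxy for value of information rather than a real calculation of it. The function names, the prior, and the multiplier of two are illustrative assumptions.

```python
import math


def posterior_mean_std(upvotes, reads, prior_up=1.0, prior_down=1.0):
    """Mean and standard deviation of the Beta posterior over upvotes-per-read.

    Each read that did not produce an upvote counts as a failure.
    """
    a = prior_up + upvotes
    b = prior_down + (reads - upvotes)
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def optimistic_score(upvotes, reads, bonus=2.0):
    """Posterior mean plus an uncertainty bonus (a hypothetical scoring rule)."""
    mean, std = posterior_mean_std(upvotes, reads)
    return mean + bonus * std


# An item that keeps being read but never upvoted.
scores = [optimistic_score(upvotes=0, reads=n) for n in (0, 10, 100)]
for n, score in zip((0, 10, 100), scores):
    print(f"reads={n:3d}  score={score:.3f}")
```

With no upvotes, both the posterior mean and the uncertainty bonus fall as reads accumulate, so the item stops being worth showing for exploration. Note that the sketch takes `reads` as given; estimating it from monitor size and scroll position is a separate problem that this does not address.
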