Simple theory of IMDB bias

taw Jan 3, 2012, 10:20 AM

−7 points

IMDB top 250 list is dominated by old movies, which conflicts with my perception (shared by majority of people as far as I can tell) that new movies are far better than old movies (comparing either top with top or average with average).

I have a simple theory why IMDB is wrong:

For new movies, very wide population have seen it, many not fans of the genre. They vote on IMDB soon after watching.
For old movies, only narrow population of fans have seen it recently. The only people who vote on IMDB are those who’ve seen it recently (atypical fans), or have particularly good memories of it (atypical fans again). People who watched an old movie ages ago but don’t remember much about it are very unlikely to vote on IMDB.
Therefore it’s much more difficult for a new movie to get a good IMDB score than it is for an old movie.
Therefore a new movie with a given IMDB score is likely much better than an old movie with the identical score.
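
The selection effect argued for above can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the `fan_bonus`, the fan shares among voters, and the normally distributed noise are hypothetical parameters chosen for illustration, not measurements of IMDB voters.

```python
import random

def simulated_score(true_quality, fan_share_of_voters,
                    n_voters=10_000, fan_bonus=1.5, seed=0):
    """Average rating when a given share of the voters are genre fans.

    Assumption (hypothetical): fans rate a movie `fan_bonus` points higher
    than non-fans, and every rating carries Gaussian noise with sd 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_voters):
        is_fan = rng.random() < fan_share_of_voters
        rating = true_quality + (fan_bonus if is_fan else 0.0) + rng.gauss(0, 1)
        total += min(10.0, max(1.0, rating))  # clamp to the 1-10 rating scale
    return total / n_voters

# Two movies of identical underlying quality, differing only in who votes.
new_movie = simulated_score(7.0, fan_share_of_voters=0.2)  # wide audience
old_movie = simulated_score(7.0, fan_share_of_voters=0.9)  # mostly fans

print(f"new movie: {new_movie:.2f}, old movie: {old_movie:.2f}")
```

Under these assumptions the old movie ends up with the higher average even though both have the same `true_quality`. The sketch shows only that the mechanism could produce such a gap, not that it does so on IMDB.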

The “correct” procedure would of course be gathering random sample of people, showing them random movies, and asking for ratings just after the movie. For practical reasons this cannot really be done, so the next best thing we can do is ignoring old movies with unreasonably high IMDB scores.

What links here?

taw's comment on New Year’s Prediction Thread (2012) by gwern (Jan 3, 2012, 10:09 AM; 2 points)


timtyler Jan 3, 2012, 12:34 PM
13 points


IMDB top 250 list is dominated by old movies,

With 72 out of 250 movies from the last 12 years, it appears that there’s more of a bias towards the present. Perhaps more movies are being made these days, or perhaps people prefer current movies.
- magfrump Jan 3, 2012, 7:37 PM
  17 points
  Parent
  
  There are substantially more movies recently. Eyeballing it, it looks like about 150 thousand movies were made in the most recent 12 years of the chart, compared to about 350 thousand made in all previous years. That would imply that recent movies are underrepresented very slightly. Removing short films might smooth that slightly, but actually these numbers are about right.
  - Vaniver Jan 3, 2012, 9:18 PM
    3 points
    Parent
    
    Upvoted for doing the research. Woo null hypothesis!
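
magfrump's base-rate check in the thread above can be written out as a few lines of arithmetic. A minimal sketch, using only the eyeballed figures quoted in the comments; the function name `expected_on_list` is made up for illustration.

```python
# Figures as quoted in the thread (eyeballed, in thousands of movies).
RECENT_MOVIES_K = 150    # made in the most recent 12 years
OLDER_MOVIES_K = 350     # made in all earlier years
LIST_SIZE = 250          # size of the IMDB top list
OBSERVED_RECENT = 72     # timtyler's count of recent movies on the list

def expected_on_list(group_size, total_size, list_size):
    """Expected list slots for a group if quality does not depend on era."""
    return list_size * group_size / total_size

expected_recent = expected_on_list(
    RECENT_MOVIES_K, RECENT_MOVIES_K + OLDER_MOVIES_K, LIST_SIZE
)

print(f"expected: {expected_recent:.0f}, observed: {OBSERVED_RECENT}")
```

With these inputs the expectation is 75 recent movies against 72 observed, which is the "underrepresented very slightly" result described above.
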
Cthulhoo Jan 3, 2012, 1:36 PM
12 points

new movies are far better than old movies

Premise: I don’t necessarily agree with this point.

Rate Your Music shows a similar phenomenon, though. The enjoyment of music differs in some respects from the enjoyment of movies, and if the reason behind the behavior of the two charts is the same (the hypothesis sounds reasonably plausible), we can maybe identify what the two have in common. Your list of possible causes doesn’t look like it fully explains the origin of RYM’s chart too, but there are indeed some interesting considerations.

For new movies, very wide population have seen it, many not fans of the genre. They vote on IMDB soon after watching

Music fans are different beasts: they usually mix novelties with classics in their discovery of music. It’s still true, though, that you have to more actively search for older music, and therefore are inclined to listen to music that generally meets your tastes. This can be similarly true for movies: you usually go to actively search for an old movie if you like the director, some of the actors or the genre, and it’s probable that you will like the movie in the end. On the contrary, massive advertisement of new movies may convince you to see one that in the end was misrepresented by the trailers and doesn’t meet your tastes.

For old movies, only narrow population of fans have seen it recently. The only people who vote on IMDB are those who’ve seen it recently (atypical fans), or have particularly good memories of it (atypical fans again). People who watched an old movie ages ago but don’t remember much about it are very unlikely to vote on IMDB.

This isn’t usually true for albums. Music fans periodically listen to their favorite albums, even if they are really old. They should therefore always have a “fresh” (more on this later) impression of the recording. Something resembling these effects happens if you look at niche genres (e.g. extreme metal): in this case the only people who listen to those albums are the fans of the genre, and the scores for the albums are usually artificially high.

Other effects that I can see playing a role:
- cached beliefs: people tend to form an emotional bond to movies and records they loved in their teens. They tend to keep this kind of judgement even after years have passed, and tend to value newer movies/records less. “All this new music sucks!”
- social cached beliefs: if something is deemed to be a masterpiece, it’s more difficult to dismiss it with a bad rating. “Ok, I don’t really get Citizen Kane/The Velvet Underground and Nico, but I respect them /recognize their historical importance”.
- various levels of signaling/countersignaling: very common indeed. If you are seen as a fine critic and expert, how can you admit that you prefer Scary Movie 3 to Apocalypse Now? Or the Spice Girls to Coltrane? “How can you listen to that crappy pop music? Come with me and enjoy some good latin-avant-garde-techno-art-brutal-classic Jazz”.
- Manfred Jan 3, 2012, 9:47 PM
  0 points
  Parent
  
  The music analogy is a bit tricky because we’ve had a lot longer to explore music than we have to explore movies, and movies are harder production-wise. Modern movies are better than past movies for reasons more like the reasons that music after the pianoforte was better than music before the pianoforte, rather than the reasons some people prefer Michael Jackson to Louis Armstrong.
  - Prismattic Jan 4, 2012, 12:33 AM
    3 points
    Parent
    
    Musical preferences appear to be different from other artistic preferences.
    - Cthulhoo Jan 5, 2012, 6:56 PM
      0 points
      Parent
      
      Interesting link, it’s something I’ve always suspected to be true.
  - Cthulhoo Jan 5, 2012, 7:03 PM
    0 points
    Parent
    
    I’m really not discussing the absolute value of old movies vs. new ones; I’m definitely not expert enough on the subject. I was rather trying to point out other kinds of effects that I thought played a role in explaining the pattern pointed out in the OP. I used the parallel of music because I thought some of these effects were relevant in both cases.
Jack Jan 3, 2012, 11:30 AM
11 points

Disagree with the premise. New movies tend to have more plot holes, less characterization and worse writing. Improved effects only rarely make up that margin. I also find the following story just as plausible as yours: “New movies are over-represented on the IMDB top 250 because they get bolstered by excited fans who just saw the film and haven’t yet taken the time to digest the movie or see how it dates and who, often, haven’t seen the old movies on the list.” The Return of the King is not better than Blade Runner.

/done with my silly arguing for the day.
- gwern Jan 3, 2012, 9:30 PM
  4 points
  Parent
  
  For what it’s worth, when you look at rankings, they almost always exhibit a recency bias (Murray gives some examples in Human Accomplishment as he tries to correct for it), so I am pretty skeptical that the IMDB would exhibit an anti-recency bias.
- [deleted] Jan 3, 2012, 4:49 PM
  1 point
  Parent
  
  Parts of that argument are unfair. People voting on new movies often haven’t seen the old ones, but obviously most people who voted on the old ones haven’t seen the new ones, either. I’m not sure whether “see how it dates” is a good criterion either—it’s basically saying that what we think of as good movies changes over time, which isn’t ever an argument against any particular movie. If we want to keep the ratings more modern, we could weight new votes more than old votes.
  
  If IMDB exists 100 years from now, it will probably at least be effective at comparing non-recent movies from different time ranges. The two movies that got compared would get a chance both at the new-fan vote and the old-fan vote. Assuming value drift, it’s not clear that the comparison would be meaningful, but it would at least be fair.
  - magfrump Jan 3, 2012, 7:21 PM
    0 points
    Parent
    
    I don’t think it’s obvious that people voting on old movies haven’t seen new movies.
    
    It might be likely that people who watch many old movies are less likely to have seen new movies, but it could just be that these people watch more movies in general.
    - [deleted] Jan 4, 2012, 12:21 AM
      0 points
      Parent
      
      It’s obvious not because of any character trait that those people have, but because the majority of the voting on those movies was done before the new movies came out.
      - magfrump Jan 4, 2012, 3:41 AM
        0 points
        Parent
        
        I was going to say that this is likely more true for movies coming out in 2011 than 2000, which I still believe somewhat, but cursory research indicates that IMDB was started in 1990 and acquired by Amazon in 1998, so even in the year 2000 IMDB was fairly large, and therefore probably had reasonable traffic.
        
        When I thought about older movies I thought about movies from the ’50s and ’60s especially, where all of the reviews necessarily came out long after the movies were released, rather than movies from the ’90s, where the effect you mention should be pretty strong.
        
        So my new hypothesis in your vein would be that ’90s movies should be over-represented compared with ’00s movies.
        magfrump Jan 4, 2012, 3:45 AM
        0 points
        Parent
        
        Null hypothesis from the data I’ve referenced in my other comments: approximately 37 movies from the ’90s.
        
        Actual data: 40 movies from the 1990s in the top 250. So signs point to movie quality being essentially constant across time, at least at the decade level of granularity. (I’ll take another look at 1998-2003 specifically, the five years after the Amazon acquisition, in which the site presumably had the most traffic, but I feel like I’m privileging the hypothesis here.)
        magfrump Jan 4, 2012, 3:49 AM
        0 points
        Parent
        
        6 from 1998, 6 from 1999, 5 from 2000, basically exactly what I’d expect, nothing super high.
- spuckblase Jan 3, 2012, 1:17 PM
  0 points
  Parent
  
  I agree in large parts, but it seems likely that value drift plays a role, too.
magfrump Jan 3, 2012, 7:40 PM
6 points

Did a quick Google search on the number of movies made each year; it actually looks like modern movies versus old movies have about the representation you’d expect if the quality of movies were essentially constant (the number of extra modern movies on the list corresponds approximately to the number of extra movies made).
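
How close is "about the representation you’d expect"? One rough way to check is to compare each gap with the sampling noise of a binomial model. This is a sketch under a strong assumption: it treats the 250 list slots as independent draws, which a real ranking is not. The function name `binomial_z` is made up for illustration; the counts are the ones quoted in this thread (72 observed vs. 75 expected for the last 12 years, 40 observed vs. roughly 37 expected for the 1990s).

```python
import math

def binomial_z(observed, expected, list_size):
    """Gap between observed and expected counts, in binomial standard deviations."""
    p = expected / list_size                  # chance a given slot goes to the group
    sd = math.sqrt(list_size * p * (1 - p))   # sd of the group's count on the list
    return (observed - expected) / sd

z_recent = binomial_z(72, 75, 250)     # movies from the most recent 12 years
z_nineties = binomial_z(40, 37, 250)   # movies from the 1990s

print(f"recent: {z_recent:+.2f} sd, 1990s: {z_nineties:+.2f} sd")
```

Both gaps come out at well under one standard deviation under this model, which is consistent with the constant-quality null hypothesis but, given the eyeballed inputs, is not strong evidence for it.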
Morendil Jan 3, 2012, 12:16 PM
6 points


majority of people as far as I can tell

Low likelihood of sampling bias here, obviously.
Complaining Qoheleth May 19, 2020, 9:06 AM
1 point

Why is correct in scare quotes? If it isn’t the correct solution then what is it? On a website called Less Wrong correct doesn’t exist? Such is the mental schizophrenia that relativism brings.
- Raemon May 19, 2020, 7:51 PM
  2 points
  Parent
  
  This is explained in the next sentence: it is impractical.
Complaining Qoheleth May 19, 2020, 9:05 AM
1 point

Correct is in scare quotes but if what’s being recommended isn’t correct then what is it? On a website called Less Wrong there’s no such thing as correct? Such is the mental schizophrenia that relativism brings.
dlthomas Jan 3, 2012, 5:09 PM
0 points


For practical reasons this cannot really be done[.]

Really? It seems practical to me, at least for enough movies to demonstrate bias or lack thereof in IMDB ratings.