philh comments on May Monthly Bragging Thread

philh 4 May 2014 12:44 UTC
29 points
In the film industry, when we want to guess how much money a film is going to make, we think of similar films to compare it to and see how much money those made. Six months ago, I started a project to think of similar films for you. We rolled it out to our users a couple of weeks ago, and they love it. Quite apart from the films it recommends, my boss says that the way it displays the comparisons is by far the best that he’s ever seen. The database it draws from only goes back a few years, so we managed to get budgeting approval to hire a team of interns for a few weeks to fill in the history. My company is probably paying about £10,000 to collect data for a system that I wrote, by myself, in six months while working on other projects. (That’s not actually a lot of money, but my System 1 still thinks it is.)

Relatedly, I solved this bug while working on it, by delving deeper into the network stack than I ever have before.
- Joshua_Blaine 4 May 2014 22:25 UTC
  17 points
  You made a thing that’s being used by other people. People who are paying you to use it. That’s pretty great!
- William_Quixote 5 May 2014 17:46 UTC
  0 points
  To get data to feed your model, consider buying a data set from an industry provider like Box Office Mojo. Depending on what fields you need, they have a very solid data set with a long history that could probably be purchased for less than 10,000 euro, depending on the confidentiality and on-sell terms your company could agree to.
  - philh 5 May 2014 22:20 UTC
    0 points
    “Import” might have been more accurate than “collect”. We have access to the history from various sources, but we can’t yet import it automatically. I’m pretty sure the data needs to be hunted down, collated and cleaned up in difficult-to-automate ways, but I haven’t actually turned my attention to the problem yet.