John_Maxwell comments on Hacking on LessWrong Just Got Easier

John_Maxwell 4 Sep 2011 6:20 UTC
2 points
Awesome!

I don’t have time right now but someone should totally implement a recommendation engine for Less Wrong articles. Seems like it could be really high utility (is recency really the best criterion for determining the optimal Less Wrong article for a user to read?) as well as a great excuse to bone up on the latest in machine learning and narrow AI. (I’m a little leery of learning about a field without applying it to some project—if I’m able to apply knowledge successfully that makes me feel a lot more confident that I’ve mastered it. And having a project in mind helps resolve questions of how deeply I want to master a given piece of material—just master it well enough to do the project.) Plus if you fail at this there won’t be any stigma—the reddit team worked on their recommendation engine for years without it going much of anywhere.

Can anyone recommend any books they’ve read that would be useful to an endeavor such as this? Programming Collective Intelligence looks pretty interesting but I haven’t read it.
- jefftk 5 Sep 2011 20:09 UTC
  0 points
  Better than recency, perhaps the top-scoring posts of all time?
  
  A recommendation engine needs information about what posts you are glad to have read and ideally what posts you read but did not find useful. So if the engine knows for each user (1) the set of posts they’ve read [1] and (2) the set of posts that they’ve voted up, then we have an evaluation criterion: did we choose to show people posts they voted up?
  
  You’d then need to figure out features and write code for them so the learning algorithm could find user correlations. Set up an SVM or something similar to estimate the probability of upvoting given viewing. Then use some sort of multi-armed bandit algorithm on top of those estimates, so you keep gathering information about posts you would not otherwise show.
  
  [1] This isn’t perfect, because you can open a post without reading it. We could detect scrolling and log how much of it they actually read, but people might not like the privacy implications.
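
A rough sketch of the two pieces described above, under loud assumptions: the evaluation criterion (did people upvote what we showed them?) and an epsilon-greedy multi-armed bandit. Everything named here (`hit_rate`, `EpsilonGreedyRecommender`, the dict-of-sets data layout) is hypothetical and not part of the Less Wrong codebase. The SVM is swapped out for a much simpler smoothed per-post upvote rate, so this version is not personalized; a trained model's per-user probability could replace `estimated_rate`.

```python
import random


def hit_rate(shown, upvoted):
    """Fraction of (user, post) impressions that the user went on to upvote.

    shown and upvoted both map a user name to a set of post ids
    (an assumed layout, chosen only for this illustration).
    """
    impressions = [(user, post) for user, posts in shown.items() for post in posts]
    if not impressions:
        return 0.0
    hits = sum(1 for user, post in impressions if post in upvoted.get(user, set()))
    return hits / len(impressions)


class EpsilonGreedyRecommender:
    """Epsilon-greedy bandit: mostly exploit the best-looking post, sometimes explore."""

    def __init__(self, post_ids, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random()
        self.views = {post_id: 0 for post_id in post_ids}
        self.upvotes = {post_id: 0 for post_id in post_ids}

    def estimated_rate(self, post_id):
        # Smoothed upvote rate, so a post with no views starts at 0.5
        # instead of dividing by zero.
        return (self.upvotes[post_id] + 1) / (self.views[post_id] + 2)

    def recommend(self, already_read=()):
        candidates = [p for p in self.views if p not in already_read]
        if not candidates:
            return None
        if self.rng.random() < self.epsilon:
            # Explore: show a random post to keep gathering information.
            return self.rng.choice(candidates)
        # Exploit: show the post with the highest estimated upvote rate.
        return max(candidates, key=self.estimated_rate)

    def record(self, post_id, was_upvoted):
        # Log one view of the post and whether it earned an upvote.
        self.views[post_id] += 1
        if was_upvoted:
            self.upvotes[post_id] += 1
```

In use, you would call `recommend` with the posts a user has already read, show the result, and feed the outcome back through `record`. `hit_rate` over the logged impressions then gives a single number for comparing this against plain recency or top-scoring-of-all-time ordering.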