LW Frontpage Experiments! (aka “Take the wheel, Shoggoth!”)

Update: June 20th

After a few rounds of adjustments and careful examination of the data, we’ve decided to make the Enriched tab the default for all logged-in users[1]. Anyone who has never switched tabs will be set to the Enriched tab. If you dislike it, you can switch to the Latest tab, the prior default.

We’re not completely certain this is the correct choice long-term, but the results seem good enough to progress to a broader roll out for now, though we’ll keep monitoring.

We’ve also enabled a Recommended tab, so the available tabs are now:

  • Latest: 100% posts from the Latest algorithm (using karma and post age to sort)

  • Enriched (default): 50% posts from the Latest algorithm, 50% posts from the recommendations engine

  • Recommended: 100% posts from the recommendations engine, choosing posts specifically for you based on your history

  • Subscribed: a feed of posts and comments from users you have explicitly followed

  • Bookmarks: this tab appears if you have bookmarked any posts

Update: May 13th

If you’re reading this, it’s possible you just found yourself switched to the Enriched tab. Congratulations! You were randomly assigned to be fed to the Shoggoth, er, to a group of users automatically switched to the new posts list.

The Enriched posts list:

  • Is 50% posts from the same algorithm as Latest and 50% posts selected for you by an ML algorithm based on your post interaction history.

  • The sparkle icon next to the post title marks which posts were the result of personalized recommendations.

  • You can switch back at any time to the regular Latest tab if you don’t like the recommendations.

  • We changed the name “Recommended” to “Enriched” to better imply that it contains 50% of the regular Latest posts. (We will probably soon add a Recommended tab that is 100% recommendations.)
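The 50/50 mix described above can be sketched roughly as follows. This is an illustrative Python sketch, not the site's actual code: the names `FeedItem` and `build_enriched_feed`, the strict alternation between the two sources, and the rule for skipping duplicates are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FeedItem:
    post_id: str
    personalized: bool  # True means the post would get the sparkle icon


def build_enriched_feed(
    latest: Iterable[str], recommended: Iterable[str], limit: int
) -> List[FeedItem]:
    """Alternate between Latest and recommended posts, skipping duplicates.

    Hypothetical helper: strict alternation is one simple way to get a
    50/50 mix; the real feed may blend the two sources differently.
    """
    latest_iter, rec_iter = iter(latest), iter(recommended)
    seen = set()
    feed: List[FeedItem] = []
    exhausted = {False: False, True: False}  # keyed by "personalized"
    take_personalized = False  # start with a post from Latest

    while len(feed) < limit and not all(exhausted.values()):
        source = rec_iter if take_personalized else latest_iter
        # Pull the next post from this source that is not already in the feed.
        post_id = next((p for p in source if p not in seen), None)
        if post_id is None:
            exhausted[take_personalized] = True
        else:
            seen.add(post_id)
            feed.append(FeedItem(post_id, take_personalized))
        take_personalized = not take_personalized

    return feed
```

If one source runs out, this sketch keeps filling from the other, so the split is only 50/50 while both lists have posts left.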

You can read further discussion of the experiments in this comment.


Original Post, April 22nd

For the last month, @RobertM and I have been exploring the possible use of recommender systems on LessWrong. Today we launched our first site-wide experiment in that direction.

Behold, a tab with recommendations!

(In the course of our efforts, we also hit upon a frontpage refactor that we reckon is pretty good: tabs instead of a clutter of different sections. For now, only for logged-in users. Logged-out users see the “Latest” tab, which is the same-as-usual list of posts.)

Why algorithmic recommendations?

A core value of LessWrong is to be timeless and not news-driven. However, the central algorithm by which attention allocation happens on the site is the Hacker News algorithm[2], which basically only shows you things that were posted recently, and creates a strong incentive for discussion to always be centered around the latest content.

This seems very sad to me. When a new user shows up on LessWrong, it seems extremely unlikely that the most important posts for them to read were all written within the last week or two.

I do really like the simplicity and predictability of the Hacker News algorithm. More karma means more visibility, older means less visibility. Very simple. When I vote, I basically know the full effect this has on what is shown to other users or to myself.
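To make "more karma means more visibility, older means less visibility" concrete, here is a minimal Python sketch of a Hacker News-style score. The function name `frontpage_score`, the constants (a 2-hour age offset and an exponent of 1.8, the values commonly quoted for Hacker News), and the treatment of tag modifiers as additive karma adjustments are all assumptions for illustration; LessWrong's actual parameters may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Mapping, Optional

AGE_OFFSET_HOURS = 2.0  # assumption: commonly quoted Hacker News value
GRAVITY = 1.8           # assumption: commonly quoted Hacker News value


def frontpage_score(
    karma: float,
    posted_at: datetime,
    now: datetime,
    tags: Iterable[str] = (),
    tag_modifiers: Optional[Mapping[str, float]] = None,
) -> float:
    """Score a post: higher karma raises it, greater age lowers it."""
    modifiers = tag_modifiers or {}
    # Assumption: each matching tag adds (or subtracts) a fixed amount of karma.
    adjusted_karma = karma + sum(modifiers.get(tag, 0.0) for tag in tags)
    age_hours = max((now - posted_at).total_seconds() / 3600.0, 0.0)
    return adjusted_karma / (age_hours + AGE_OFFSET_HOURS) ** GRAVITY


now = datetime(2024, 4, 22, 12, 0, tzinfo=timezone.utc)
fresh = frontpage_score(50, now - timedelta(hours=2), now)
stale = frontpage_score(50, now - timedelta(hours=48), now)
boosted = frontpage_score(
    50, now - timedelta(hours=2), now,
    tags=["Rationality"], tag_modifiers={"Rationality": 25.0},
)
print(fresh > stale, boosted > fresh)
```

With equal karma, the newer post always outranks the older one, which is exactly the recency pressure discussed in this section.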

But I think the cost of that simplicity has become too high, especially as older content makes up a larger and larger fraction of the best content on the site, and people have been becoming ever more specialized in the research and articles they publish on the site.

So we are experimenting with changing things up. I don’t know whether these experiments will ultimately replace the Hacker News algorithm, but as the central attention allocation mechanism on the site, it definitely seems worth trying out and iterating on. We’ll be trying out a bunch of things, from reinforcement-learning-based personalized algorithms, to classical collaborative filtering algorithms, to a bunch of handcrafted heuristics that we’ll iterate on ourselves.

The Concrete Experiment

Our first experiment is Recombee, a recommendations SaaS, since spinning up our RL agent pipeline would be a lot of work. We feed it user view and vote history. So far, it seems that it can be really good when it’s good, often recommending posts that people are definitely into (and more so than posts in the existing feed). Unfortunately, for some reason it’s not reliable across users, and we’ve struggled to get it to reliably recommend the most important recent content, which is an important use-case we still want to serve.

Our current goal is to produce a recommendations feed that both makes people feel like they’re keeping up to date with what’s new (something many people care about) and also suggests great reads from across LessWrong’s entire archive.

The Recommended tab we just launched has a feed using Recombee recommendations. We’re also getting started using Google’s Vertex AI offering. A very early test makes it seem possibly better than Recombee. We’ll see.

(Some people on the team want to try throwing relevant user history and available posts into an LLM and seeing what it recommends, though cost might be prohibitive for now.)

Unless you switch to the “Recommended” tab, nothing changes for you. “Latest” is the default tab and is using the same old HN algorithm that you are used to. I’ll feel like we’ve succeeded when people switch to “Recommended” and tell us that they prefer it. At that point, we might make “Recommended” the default tab.

Preventing Bad Outcomes

I do think there are ways for recommendations to end up being pretty awful. I think many readers have encountered at least one content recommendation algorithm that isn’t giving them what they most endorse seeing, if not outright terrible, uninteresting content.

I think it’s particularly dangerous to ship something where (1) your target metric really is only a loose proxy for value, (2) you’re detached from actual user experience, i.e. you don’t see their recommendations and can’t easily hear from them, (3) your incentives are fine with this.

I hope that we can avoid getting swallowed by Shoggoth for now by putting a lot of thought into our optimization targets, and perhaps more importantly by staying in contact with the recommendation quality via multiple avenues (our own recommendations as users of the site, user interviews and easy affordances for feedback, a broad range of analytics).

Further costs of recommendations (common knowledge, Schelling discussion, author incentive)

As above, personalized algorithms mean we lose simplicity and interpretability of the site’s attention-allocation mechanism. It also means that we no longer have common-ish knowledge of which posts everyone else has seen. There’s a sense of [research] community in feeling like we all read the same “newspaper” today and if some people are discussing a new post, I probably at least read its title.

That’s value I think we lose a good deal of, though I think we should be able to find mechanisms to offset it at least a bit. (Curated posts, which will continue, are one way of creating common-ish knowledge around posts.)

Relatedly, with a shared frontpage focused on recent posts, discussion (commenting) gets focused around the same few posts. If people’s reading gets spread out over more posts, that could make it harder for conversations to happen. Maybe that will be fine and is worth attention going to the best posts overall for people. Also, I think we might be able to find other mechanisms of coordinating discussion. I like the idea of trying a combined post/​comment feed like Facebook/​Twitter[3] that shows users comments recently made by others when those comments are on a post, or by a user, that they are likely interested in[4]. Such a feed, if used by many, could allow discussion to spring up again on older posts too, which would be pretty cool.

I’ve had one team member comment that with personalized recommendations, they feel somewhat differently as an author because they don’t know when/​where/​for whom their post will show up, unlike with the current system. I think this is true, but it also doesn’t seem to stop people posting on Facebook or Twitter, so it’s likely not a dealbreaker. I do like the idea of providing analytics to authors showing how many people were displayed a post, clicked on it, etc., possibly serving as an escape valve to catch if the algorithm is doing something dumb.

Thoughts?

Please share anything you do/​don’t like from recommendations or any of the new frontpage tabs we’ve shipped. Especially great would be screenshots of your posts list with your reaction to them – lists of posts that are particularly great or terrible.

Also happy to get into thoughts about the general use of recommendations on LW in the comments here. Cheers.

  1. ^

    This is mostly because enabling recommendations for logged-out users requires some more technical work.

  2. ^

    Since the dawn of LessWrong 2.0, posts on the frontpage have been sorted according to the HackerNews algorithm:

    Each post is assigned a score that’s a function of how much karma it has and how old it is, with posts discounted over time. In the last few years, we’ve enabled customization by allowing users to manually boost or penalize the karma of posts in this algorithm based on tag. The site has default tag modifiers to boost Rationality and World Modeling content (introduced when it seemed like AI content was going to eat everything).

  3. ^

    We have Recent Discussion, which is a pure chronological feed of posting and commenting activity, but I find it’s a bit too much of a firehose with lots of low-interest stuff, so I don’t look at it much.

  4. ^

    Since trying out the “subscribe to user’s comments” feature that we shipped recently, I’ve found this to be an interesting way to discover posts to read. I’m motivated to read things people I like are discussing.

  5. ^

    For now, the tabs are only visible to logged-in users, though the frontpage redesign has been rolled out to everyone. Logged-out users see the contents of the “Latest” tab (which is what the previous frontpage showed under the “Latest Posts” section).