LessWrong search traffic doubles
LessWrong search traffic doubles… despite Google thinking our site is a pro-family pro-democracy astrology blog! More on that in a minute.
First, The Good News: Since I started doing SEO on LessWrong (10 months ago), search traffic from Google has doubled! It took researching >200 different techniques and actually implementing 14 of them (w/ help from Tricycle), 2 of which I think are responsible for most of the improvement:
Reversing titles (e.g., “Less Wrong - OMG Scholarship!” → “OMG Scholarship! - Less Wrong”)
No-Following / No-Indexing a complex set of duplicate content
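The title reversal can be sketched in a few lines of Python. This is only an illustration: the function name `reverse_title`, the constants, and the ` - ` separator are assumptions, not the code that actually runs on the site.

```python
SITE_NAME = "Less Wrong"   # assumed site prefix
SEPARATOR = " - "          # assumed separator between site name and post title


def reverse_title(title: str) -> str:
    """Move a leading site name to the end of the title.

    Titles that do not start with the site name are returned unchanged.
    """
    prefix = SITE_NAME + SEPARATOR
    if title.startswith(prefix):
        post_title = title[len(prefix):]
        return post_title + SEPARATOR + SITE_NAME
    return title


print(reverse_title("Less Wrong - OMG Scholarship!"))
```

The point of the change is that the words specific to each post come first in the title, with the site name trailing.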
Anyway, I’m really happy about this! This was the explicit goal I set for myself 10 months ago. It’s nice to achieve goals… especially unreasonably ambitious ones.
So… YAY!! :D
OK, Now, The Bad News: So I was trying to figure out why we never get any traction for search terms like “rationality” when I looked through Google Webmaster Tools. This is what Google thinks our site is about, keyword-wise:
Keyword | Occurrences |
vote | 196504 |
points | 152881 |
permalink | 95106 |
children | 84578 |
parent | 56374 |
people | 37047 |
it’s | 27082 |
march | 21846 |
february | 21520 |
january | 20425 |
human | 19587 |
december | 18005 |
september | 15695 |
august | 15667 |
password | 15377 |
april | 14714 |
october | 14011 |
seem | 12822 |
november | 11546 |
july | 11265 |
june | 9283 |
world | 8542 |
post | 8496 |
actual | 8251 |
probability | 8114 |
child | 7828 |
moral | 7787 |
work | 7143 |
might | 6250 |
new | 6156 |
theory | 5827 |
argument | 5639 |
read | 5278 |
utility | 5206 |
account | 5002 |
evident | 4777 |
belief | 4749 |
remember | 4691 |
recent | 4584 |
intelligent | 4582 |
science | 4424 |
eliezer | 4384 |
doesn’t | 4339 |
rationality | 4188 |
brain | 3969 |
decision | 3904 |
life | 3795 |
username | 3732 |
mind | 3721 |
All the keywords that I bolded are purely structural elements of the Less Wrong site layout. And it appears Google actually is punishing our site for this keyword density imbalance. Google really does think our site is about voting, parenting, and astrology. And while I find it somewhat hilarious that our top source of Google impressions (27,000/mo) is for the keyword “babies”, I also lament that the keyword “rationality” is our #3955 source of traffic. We should invert this.
So does anyone have any ideas? How do other sites solve this problem?
[joke] Change the names of the structural elements to keywords we consider important! For instance,
“Vote up / down” → “rationality up / down”
“points” → “paperclips”
“permalink” → “timeless commenting decision”
“password” → “the teacher’s password”
“username” → “code name in the Bayesian conspiracy”
EDIT: You know, I actually like the “points” → “paperclips” change for real.
+1 to points → paperclips :-D
I have previously suggested “Vote up/down” to “More like this/Less like this”, to generally positive reception.
parent/children → above/below? There should be something suitable.
When I put the word “rationality” into Google, the first hit is Wikipedia, the second is “Twelve Virtues of Rationality” and the third is LessWrong. How much of LW’s low traffic on the word can be attributed to people just not searching on the word much? Edit: This was an artifact of searching logged-in—not logged in, it’s not even on the front page.
Bending one’s site out of shape for an idiot Googlebot sorta sucks, really. But on my own sites, Google supplies 97% of the search engine traffic. So I suppose one must do what one has to if traffic is a goal.
RationalWiki doesn’t give a hoot about SEO, so has an accordingly poor showing and terrible pagerank. RW’s hit articles tend to be stuff that it covers well that doesn’t rate a Wikipedia article, e.g. Poe’s law, Project Blue Beam, European Union Times. The whole answer to succeeding as a wiki is “provide something Wikipedia can’t or won’t.”
Are you signed into google or not? When you’re signed in, it tailors the results to your search history.
D’oh! Well spotted—not logged in, LessWrong is not on the front page.
On the plus side, Harry Potter and the Methods of Rationality is the fourth response to Rationality, even signed out.
And Yudkowsky.net is result #6.
I am completely clueless about SEO, but the tag line “a community blog devoted to refining the art of human rationality” is part of an image file and as such invisible to Google, right? Making it equally prominently visible to Google as it is to humans seems like the sort of thing that would help. I don’t know what the best way to do that would be though, alt text?
Yes, looking at the source HTML, the image has the alt text “Less Wrong”/“Less Wrong Discussion”, but does not include the tag line, which it should.
Google is smart enough to know about this kind of “trick” and trying it will actually decrease your pagerank. Do not meddle in the ways of google… ;)
This is all inherited from Reddit, right? Does Reddit get a lot of search traffic for babies?
My best SEO advice would be to turn the structural links (vote, edit, etc.) into buttons (i.e., POST instead of GET). AFAIK, Google doesn’t consider buttons to be as “contenty” as ordinary links.
Actually, Less Wrong does have a fair amount of discussion about babies (mainly about killing them). And I would guess searches about babies are several orders of magnitude more frequent than searches about rationality.
Edit: Continuing this line of thought, maybe an effective strategy would be to figure out what potentially receptive people are searching for and write some posts about how to apply rationality to those things.
If someone wrote something like “Babies: A Rational Analysis”, our site’s current structuring would help it be unreasonably popular in Google. This would be analogous to Less Wrong “doing what it’s best at”.
CarlShulman’s articles about voting are overly-popular for the same reason… probably by accident.
Does “Babies and Bunnies: A Caution About Evo-Psych” show this effect?
yes
I suggest you make a post of suggested topics that spring to mind. You don’t have to write all the posts, but then someone inspired by the title can.
Can people please not write articles simply to improve Google ranking? That’s dark sidish and also easily leads to a decline in content quality.
It looks to me like this is just a raw count of word occurrences rather than what google thinks are the most relevant keywords, because I wouldn’t expect the latter to contain words like “it’s”. If I’m right then the list isn’t very informative.
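To make the "raw count" reading concrete, here is a minimal sketch of such a count. The sample comment text and the function name `raw_keyword_counts` are made up for illustration; nothing here reflects how Google actually builds its keyword list.

```python
import re
from collections import Counter


def raw_keyword_counts(page_text: str) -> Counter:
    """Count lowercase word occurrences, with no weighting or stop-word removal."""
    words = re.findall(r"[a-z']+", page_text.lower())
    return Counter(words)


# Hypothetical text of one comment as a crawler might see it:
# structural labels surround a single sentence of real content.
comment = (
    "Vote up Vote down 3 points Permalink Parent Children "
    "Rationality is about winning."
)
counts = raw_keyword_counts(comment)
print(counts.most_common(3))
```

With a count like this, a label repeated on every comment will outnumber any topical word, no matter what the comments say.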
Regarding words like “vote” and “parent”, I think one way to hide them would be to put them in buttons rather than links.
Google does do some word-ranking. From memory:
1) if it’s in the url—it’s more important
2) if it’s in headings (h1/h2 etc. tags) then it’s more important: the bigger the tag the better… but in descending order down the page (i.e., an h3 right at the top may be considered more important than an h1 at the bottom of the page)
3) google starts at the top of the page and works down. Stuff at the top is more important than stuff below that.
4) If it occurs more frequently, then it’s probably more relevant (thus vote and parent)
5) If other links that point at this site contain the same keywords, then they are more important
There’s plenty of other stuff that goes into this—most of which google keeps secret and it changes on a day by day basis. There are people who make whole careers (lucrative ones!) out of figuring it all out.
Are ‘Top’ and ‘Bottom’ defined as on the unstyled page? If so, sidebars may be getting undue weight...
Yes, defined as on the unstyled page, however, if you’re talking about the right-hand sidebar… it appears below the content on the page (I checked). The only things that appear “above” the content are the header-image, the top tabbed-navigation and that discussion blurb.
This probably would be bad for performance, but purely structural sections of the site could be loaded in no-indexed iframes.
If we were dealing with certain Russian search engines, structural sections could be no-indexed inline:
Unfortunately, I don’t see any indication that Google honors such a thing.
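As a rough sketch of the idea only: assume the structural parts are wrapped in a `<noindex>` element (the element name is borrowed from the "NOINDEX tag" mentioned in this thread, and the page fragment is invented). The Python below simulates what an engine that honored such a wrapper would be left with; how any real search engine treats it is not something this sketch can confirm.

```python
import re

# Hypothetical page fragment: the voting controls are wrapped in <noindex>.
PAGE = (
    "<div class='comment'>"
    "<noindex><a href='#'>Vote up</a> <a href='#'>Vote down</a> 3 points</noindex>"
    "<p>Rationality is about winning.</p>"
    "</div>"
)

NOINDEX_RE = re.compile(r"<noindex>.*?</noindex>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def indexable_text(html: str) -> str:
    """Return the text left after dropping <noindex> sections and all tags."""
    without_structure = NOINDEX_RE.sub(" ", html)
    text = TAG_RE.sub(" ", without_structure)
    return " ".join(text.split())


print(indexable_text(PAGE))
```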
If HTML is supposed to be about semantics of the page, the NOINDEX tag should have been a part of every HTML specification, at least since server-side scripting became popular.
There is a lot of repeated text on each page of many websites, that really isn’t part of the content, such as: “write your comment here”, “next page”, “previous page”, “username / password”, “permalink”, etc.
I wonder, if your website contains the word “permalink” on each page and comment, and there is one page that is really about permalinks, whether Google can tell the difference.
Your SEO problem with “votes” and “points” keywords is not entirely due to the comment-voting sections. It’s also because of the short blurb above the main article-title.
Google ranks things literally from top-down (in the HTML)… and that blurb starting “This part of the site is for the discussion of topics” (class = infobar) appears on most pages, and it appears above the H1 tag containing the article’s title. Thus Google thinks it’s MORE important than the main content of the article.
If you want that kind of thing to appear above the title… you can actually do funky things with CSS-positioning that will keep it below the article in the html, but appear to the humans as being at the top of the page.
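One way to check the source-order claim is to look at what text precedes the first H1 in the raw HTML, regardless of how CSS positions it on screen. The sketch below uses Python's standard `html.parser`; the two markup fragments are hypothetical stand-ins for a page before and after moving the blurb, and the class and function names are invented for the example.

```python
from html.parser import HTMLParser


class TextBeforeH1(HTMLParser):
    """Collect the text that appears before the first <h1> in source order."""

    def __init__(self):
        super().__init__()
        self.seen_h1 = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.seen_h1 = True

    def handle_data(self, data):
        if not self.seen_h1 and data.strip():
            self.chunks.append(data.strip())


def text_before_h1(html: str) -> str:
    parser = TextBeforeH1()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical markup: blurb first in the source ...
BLURB_FIRST = (
    "<div class='infobar'>This part of the site is for the discussion of topics</div>"
    "<h1>Article title</h1><p>Body</p>"
)
# ... versus blurb after the article, to be moved visually with CSS.
BLURB_LAST = (
    "<h1>Article title</h1><p>Body</p>"
    "<div class='infobar'>This part of the site is for the discussion of topics</div>"
)

print(repr(text_before_h1(BLURB_FIRST)))
print(repr(text_before_h1(BLURB_LAST)))
```

In the second fragment nothing precedes the title in the source, which is the state the CSS-positioning suggestion is aiming for.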
I just noticed that in the recent comments feed, article links on comment replies to “Philosophy: A Diseased Discipline” go to http://lesswrong.com/r/lukeprog-drafts/lw/4zs/philosophy_a_diseased_discipline/ , which is a broken link because it’s no longer a draft. That’s probably bad for their rank, and it might be a more general problem.
It’s a content vs. formatting issue. Words like vote, march, reply, points, etc are really formatting, but Google reads them as content.
To fix this, you could do a lot of JavaScript hacking so that the timestamps, etc are displayed using DHTML. The search engine robots won’t run JavaScript, so they’ll only see the content.
JS hacking will also make the page less stable, less accessible and more annoying to maintain. So it’s possible, but there is a significant cost involved.
Well done, sir.
Unfortunately, I know very little about SEO.
Would it do anything to make the title be:
Article Title - Less Wrong: a community blog devoted to refining the art of human rationality
googlehacking is a fine art… and too much can be just as detrimental as too little.
Utility, belief, intelligent, brain, decision and mind are also topical, aren’t they? Arguably moral, argument, theory and science as well. Except for the structural elements and rationality being too low it doesn’t look too bad.
From https://sites.google.com/site/webmasterhelpforum/en/faq--webmaster-tools :
I couldn’t find a more detailed estimation of the impact of such keywords, but we should consider the option of just ignoring the issue. Especially since according to this the only effective options are JavaScript or frames tricks, both of which would make LW significantly more annoying or slow to use.
taryneast’s idea of using CSS to pretend-shove the opening blurb to the bottom of the page could be rather painless, though.
Great job!
It occurs to me that those most frequent structural words are embedded in anchors that have URLs pointing back to lesswrong itself. Seems like a decent heuristic for peeling apart structure and ignoring it?
Edit: I suppose my theory is that Google would make efforts to ignore structural terms in analyzing topic, that this wouldn’t be all that hard, and that the ‘babies’ effect is a coincidence.
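A toy version of that heuristic, purely to show it is cheap to implement: drop the text of any link whose target is the site itself (or a relative URL) and keep everything else. The host constant, class name, and sample markup are assumptions for the example, and this says nothing about what Google really does.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

SITE_HOST = "lesswrong.com"  # host treated as "the site itself" (subdomains not handled)


class ContentText(HTMLParser):
    """Collect page words, skipping the text of links that point back to the site."""

    def __init__(self):
        super().__init__()
        self.in_internal_link = False
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            host = urlparse(href).netloc
            # Relative links have an empty host and count as internal.
            self.in_internal_link = host in ("", SITE_HOST)

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_internal_link = False

    def handle_data(self, data):
        if not self.in_internal_link:
            self.words.extend(data.split())


def content_words(html: str) -> list:
    parser = ContentText()
    parser.feed(html)
    parser.close()
    return parser.words


# Hypothetical comment markup with two structural links and one outbound link.
PAGE = (
    "<a href='http://lesswrong.com/lw/4zs/#c1'>Permalink</a> "
    "<a href='/lw/4zs/#parent'>Parent</a> "
    "<p>See <a href='http://example.org/bayes'>this essay</a> on probability.</p>"
)

print(content_words(PAGE))
```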
For the months: fix the date display so that the month isn’t written out.
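For instance, a numeric format keeps the same information without putting a month name in the page text. A minimal sketch, with the function name and the exact format string chosen only for illustration:

```python
from datetime import datetime


def numeric_timestamp(moment: datetime) -> str:
    """Format a timestamp without spelling out the month name."""
    return moment.strftime("%Y-%m-%d %H:%M")


posted = datetime(2011, 3, 14, 17, 8)
print(numeric_timestamp(posted))
```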
I assume both the right and left will think that we support their cause because we’re “rational”.