LessWrong search traffic doubles
post by Louie · 2011-03-25T22:01:46.130Z · LW · GW · Legacy · 35 comments
LessWrong search traffic doubles... despite Google thinking our site is a pro-family pro-democracy astrology blog! More on that in a minute.
First, The Good News: Since I started doing SEO on LessWrong 10 months ago, search traffic from Google has doubled! That took researching more than 200 different techniques and actually implementing 14 of them (with help from Tricycle). I think 2 of those are responsible for most of the improvement:
- Reversing titles (e.g., "Less Wrong - OMG Scholarship!" -> "OMG Scholarship! - Less Wrong")
- No-Following / No-Indexing a complex set of duplicate content
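As a rough illustration of the first technique, here is a minimal Python sketch of the title reversal. The function name `reverse_title` and its parameters are hypothetical, and this is not the code Tricycle actually deployed; it only shows the transformation from the example above.

```python
def reverse_title(title: str, site_name: str = "Less Wrong", sep: str = " - ") -> str:
    """Move a leading site name to the end of a page title.

    Hypothetical helper: the real change was made in the site's templates.
    """
    prefix = site_name + sep
    if title.startswith(prefix):
        # Put the article-specific words first, where search results show them.
        return title[len(prefix):] + sep + site_name
    # Already reversed, or no site prefix: leave the title alone.
    return title


print(reverse_title("Less Wrong - OMG Scholarship!"))
# prints: OMG Scholarship! - Less Wrong
```

The plausible benefit is that the distinctive words of each page come first in the title instead of the same site name on every page.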
Anyway, I'm really happy about this! This was the explicit goal I set for myself 10 months ago. It's nice to achieve goals... especially unreasonably ambitious ones.
So... YAY!! :D
OK, Now, The Bad News: So I was trying to figure out why we never get any traction for search terms like "rationality" when I looked through Google Webmaster Tools. This is what Google thinks our site is about, keyword-wise:
Keyword | Occurrences |
vote | 196504 |
points | 152881 |
permalink | 95106 |
children | 84578 |
parent | 56374 |
people | 37047 |
it's | 27082 |
march | 21846 |
february | 21520 |
january | 20425 |
human | 19587 |
december | 18005 |
september | 15695 |
august | 15667 |
password | 15377 |
april | 14714 |
october | 14011 |
seem | 12822 |
november | 11546 |
july | 11265 |
june | 9283 |
world | 8542 |
post | 8496 |
actual | 8251 |
probability | 8114 |
child | 7828 |
moral | 7787 |
work | 7143 |
might | 6250 |
new | 6156 |
theory | 5827 |
argument | 5639 |
read | 5278 |
utility | 5206 |
account | 5002 |
evident | 4777 |
belief | 4749 |
remember | 4691 |
recent | 4584 |
intelligent | 4582 |
science | 4424 |
eliezer | 4384 |
doesn't | 4339 |
rationality | 4188 |
brain | 3969 |
decision | 3904 |
life | 3795 |
username | 3732 |
mind | 3721 |
All the keywords that I bolded are purely structural elements of the Less Wrong site layout. And it appears Google actually is punishing our site for this keyword density imbalance. Google really does think our site is about voting, parenting, and astrology. And while I find it somewhat hilarious that our top source of Google impressions (27,000/mo) is for the keyword "babies", I also lament that the keyword "rationality" is our #3955 source of traffic. We should invert this.
So does anyone have any ideas? How do other sites solve this problem?
35 comments
Comments sorted by top scores.
comment by Kaj_Sotala · 2011-03-26T09:00:09.795Z · LW(p) · GW(p)
[joke] Change the names of the structural elements to keywords we consider important! For instance,
- "Vote up / down" -> "rationality up / down"
- "points" -> "paperclips"
- "permalink" -> "timeless commenting decision"
- "password" -> "the teacher's password"
- "username" -> "code name in the Bayesian conspiracy"
EDIT: You know, I actually like the "points" -> "paperclips" change for real.
Replies from: David_Gerard↑ comment by David_Gerard · 2011-03-26T09:36:31.103Z · LW(p) · GW(p)
+1 to points -> paperclips :-D
I have previously suggested "Vote up/down" to "More like this/Less like this", to generally positive reception.
parent/children -> above/below? There should be something suitable.
When I put the word "rationality" into Google, the first hit is Wikipedia, the second is "Twelve Virtues of Rationality" and the third is LessWrong. How much of LW's low traffic on the word can be attributed to people just not searching on the word much? Edit: This was an artifact of searching logged-in - not logged in, it's not even on the front page.
Bending one's site out of shape for an idiot Googlebot sorta sucks, really. But on my own sites, Google supplies 97% of the search engine traffic. So I suppose one must do what one has to if traffic is a goal.
RationalWiki doesn't give a hoot about SEO, so has an accordingly poor showing and terrible pagerank. RW's hit articles tend to be stuff that it covers well that doesn't rate a Wikipedia article, e.g. Poe's law, Project Blue Beam, European Union Times. The whole answer to succeeding as a wiki is "provide something Wikipedia can't or won't."
Replies from: Raemon↑ comment by Raemon · 2011-03-26T13:27:35.460Z · LW(p) · GW(p)
When I put the word "rationality" into Google, the first hit is Wikipedia, the second is "Twelve Virtues of Rationality" and the third is LessWrong. How much of LW's low traffic on the word can be attributed to people just not searching on the word much?
Are you signed into google or not? When you're signed in, it tailors the results to your search history.
Replies from: David_Gerard↑ comment by David_Gerard · 2011-03-26T14:44:58.825Z · LW(p) · GW(p)
D'oh! Well spotted - not logged in, LessWrong is not on the front page.
Replies from: Raemon↑ comment by Raemon · 2011-03-27T16:20:57.741Z · LW(p) · GW(p)
On the plus side, Harry Potter and the Methods of Rationality is the fourth response to Rationality, even signed out.
Replies from: Zachary_Kurtz↑ comment by Zachary_Kurtz · 2011-03-29T16:36:24.030Z · LW(p) · GW(p)
And Yudkowsky.net is result #6
comment by FAWS · 2011-03-25T22:54:45.763Z · LW(p) · GW(p)
I am completely clueless about SEO, but the tag line "a community blog devoted to refining the art of human rationality" is part of an image file and as such invisible to Google, right? Making it equally prominently visible to Google as it is to humans seems like the sort of thing that would help. I don't know what the best way to do that would be though, alt text?
Replies from: JGWeissman, taryneast↑ comment by JGWeissman · 2011-03-25T23:01:31.716Z · LW(p) · GW(p)
Yes looking at the source html, the image has the alt text "Less Wrong"/"Less Wrong Discussion", but does not include the tag line, which it should.
comment by Douglas_Knight · 2011-03-25T23:48:54.291Z · LW(p) · GW(p)
This is all inherited from Reddit, right? Does Reddit get a lot of search traffic for babies?
comment by saturn · 2011-03-26T02:24:59.947Z · LW(p) · GW(p)
Actually, Less Wrong does have a fair amount of discussion about babies (mainly about killing them). And I would guess searches about babies are several orders of magnitude more frequent than searches about rationality.
Edit: Continuing this line of thought, maybe an effective strategy would be to figure out what potentially receptive people are searching for and write some posts about how to apply rationality to those things.
Replies from: Louie↑ comment by Louie · 2011-03-26T04:24:23.990Z · LW(p) · GW(p)
If someone wrote something like "Babies: A Rational Analysis", our site's current structuring would help it be unreasonably popular in Google. This would be analogous to Less Wrong "doing what it's best at".
CarlShulman's articles about voting are overly-popular for the same reason... probably by accident.
Replies from: Alicorn, David_Gerard, JoshuaZ↑ comment by David_Gerard · 2011-03-26T09:38:10.495Z · LW(p) · GW(p)
This would be analogous to Less Wrong "doing what it's best at".
I suggest you make a post of suggested topics that spring to mind. You don't have to write all the posts, but then someone inspired by the title can.
comment by saturn · 2011-03-25T23:08:54.667Z · LW(p) · GW(p)
It looks to me like this is just a raw count of word occurrences rather than what google thinks are the most relevant keywords, because I wouldn't expect the latter to contain words like "it's". If I'm right then the list isn't very informative.
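A raw count of this kind is easy to reproduce. The sketch below is only an assumption about what such a tally looks like, not how Webmaster Tools is implemented; the `STRUCTURAL` set is a hypothetical hand-picked list based on the words at the top of the table in the post.

```python
import re
from collections import Counter

# Hypothetical list of layout words, chosen by hand from the table in the post.
STRUCTURAL = {"vote", "points", "permalink", "children", "parent"}


def keyword_counts(page_text: str) -> Counter:
    """Count every word on the page, with no notion of relevance."""
    words = re.findall(r"[a-z']+", page_text.lower())
    return Counter(words)


def split_counts(counts: Counter):
    """Separate layout words from everything else."""
    structural = {w: n for w, n in counts.items() if w in STRUCTURAL}
    content = {w: n for w, n in counts.items() if w not in STRUCTURAL}
    return structural, content


sample = "Vote up Vote down 3 points Permalink Parent rationality is about belief"
structural, content = split_counts(keyword_counts(sample))
print(structural)
print(sorted(content))
```

Because the same layout words repeat on every comment, they dominate a raw tally even when each page's actual content varies.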
Regarding words like "vote" and "parent", I think one way to hide them would be to put them in buttons rather than links.
Replies from: taryneast↑ comment by taryneast · 2011-03-26T18:23:47.415Z · LW(p) · GW(p)
Google does do some word-ranking. From memory:
1) if it's in the url - it's more important
2) if it's in headings (h1/h2 etc. tags) then it's more important - the bigger the tag the better... but in descending order down the page (i.e. an h3 right at the top may be considered more important than an h1 at the bottom of the page)
3) google starts at the top of the page and works down. Stuff at the top is more important than stuff below that.
4) If it occurs more frequently, then it's probably more relevant (thus vote and parent)
5) If other links that point at this site contain the same keywords, then those keywords are more important
There's plenty of other stuff that goes into this - most of which google keeps secret and it changes on a day by day basis. There are people who make whole careers (lucrative ones!) out of figuring it all out.
Replies from: Alexandros↑ comment by Alexandros · 2011-03-28T16:15:52.704Z · LW(p) · GW(p)
Are 'Top' and 'Bottom' defined as on the unstyled page? If so, sidebars may be getting undue weight...
Replies from: taryneast↑ comment by taryneast · 2011-03-28T17:13:18.318Z · LW(p) · GW(p)
Yes, defined as on the unstyled page, however, if you're talking about the right-hand sidebar... it appears below the content on the page (I checked). The only things that appear "above" the content are the header-image, the top tabbed-navigation and that discussion blurb.
comment by JGWeissman · 2011-03-25T22:23:08.418Z · LW(p) · GW(p)
This probably would be bad for performance, but purely structural sections of the site could be loaded in no-indexed iframes.
If we were dealing with certain Russian search engines, structural sections could be no-indexed inline:
Russian search engines Yandex and Rambler introduce a new tag which only prevents indexing of the content between the tags, not a whole Web page.
Do index this text block. <noindex>Don't index this text block</noindex>
Unfortunately, I don't see any indication that Google honors such a thing.
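To make the idea concrete, here is a minimal Python sketch of what a crawler honoring such a tag might do before counting words. It assumes the markers are written as `<noindex>` and `</noindex>`; the function name `strip_noindex` is hypothetical, and this is not a claim about how Yandex or Rambler actually implement it.

```python
import re

# Assumed tag spelling: <noindex> ... </noindex>, matched case-insensitively
# and across line breaks.
NOINDEX_RE = re.compile(r"<noindex>.*?</noindex>", re.IGNORECASE | re.DOTALL)


def strip_noindex(html: str) -> str:
    """Drop everything between noindex markers, keeping the rest of the page."""
    return NOINDEX_RE.sub("", html)


page = "Do index this text block. <noindex>Don't index this text block</noindex>"
print(strip_noindex(page).strip())
# prints: Do index this text block.
```

A regular expression is enough for a sketch, but it would not cope with nested or malformed markers; a real crawler would presumably use a proper HTML parser.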
Replies from: Viliam_Bur↑ comment by Viliam_Bur · 2011-10-15T13:29:26.795Z · LW(p) · GW(p)
If HTML is supposed to be about semantics of the page, the NOINDEX tag should have been a part of every HTML specification, at least since server-side scripting became popular.
There is a lot of repeated text on each page of many websites, that really isn't part of the content, such as: "write your comment here", "next page", "previous page", "username / password", "permalink", etc.
I wonder if your website contains a word "permalink" in each page and comment, and there is one page that is really about permalinks, whether Google can tell the difference.
comment by taryneast · 2011-03-26T18:14:28.212Z · LW(p) · GW(p)
Your SEO problem with "votes" and "points" keywords is not entirely due to the comment-voting sections. It's also because of the short blurb above the main article-title.
Google ranks things literally from top-down (in the html)... and that blurb starting "This part of the site is for the discussion of topics" (class = infobar) - appears on most pages, and it appears above the H1 tag containing the article's title. Thus google thinks it's MORE important than the main content of the article.
If you want that kind of thing to appear above the title... you can actually do funky things with CSS-positioning that will keep it below the article in the html, but appear to the humans as being at the top of the page.
comment by jimrandomh · 2011-03-29T21:03:54.509Z · LW(p) · GW(p)
I just noticed that in the recent comments feed, article links on comment replies to "Philosophy: A Diseased Discipline" go to http://lesswrong.com/r/lukeprog-drafts/lw/4zs/philosophy_a_diseased_discipline/ , which is a broken link because it's no longer a draft. That's probably bad for their rank, and it might be a more general problem.
comment by Daniel_Burfoot · 2011-03-26T01:08:02.958Z · LW(p) · GW(p)
It's a content vs. formatting issue. Words like vote, march, reply, points, etc are really formatting, but Google reads them as content.
To fix this, you could do a lot of JavaScript hacking so that the timestamps, etc are displayed using DHTML. The search engine robots won't run JavaScript, so they'll only see the content.
Replies from: taryneast
comment by NihilCredo · 2011-03-27T22:17:59.021Z · LW(p) · GW(p)
From https://sites.google.com/site/webmasterhelpforum/en/faq--webmaster-tools :
Q: Why do my Webmaster Tools stats show common phrases such as "buy now" that are not directly related to my site?
A: While some common words and phrases are filtered by Webmaster Tools, there may be some that you use which are not. Having these words or phrases listed in your Webmaster Tools account does not mean that our algorithms will view your site as being only relevant for those keywords. While Webmaster Tools mostly counts the occurrences of words on your site, our web-search algorithms use well over 200 other factors for crawling, indexing and ranking. In other words: don't worry if you see keywords like this listed in your Webmaster Tools account.
I couldn't find a more detailed estimation of the impact of such keywords, but we should consider the option of just ignoring the issue. Especially since according to this the only effective options are JavaScript or frames tricks, both of which would make LW significantly more annoying or slow to use.
taryneast's idea of using CSS to pretend-shove the opening blurb to the bottom of the page could be rather painless, though.
comment by Miller · 2011-03-25T23:00:13.473Z · LW(p) · GW(p)
It occurs to me that those most frequent structural words are embedded in anchors that have URLs back to lesswrong itself.. seems like a decent heuristic for peeling apart structure and ignoring it?
Edit: I suppose my theory is that Google would make efforts to ignore structural terms in analyzing topic, that this wouldn't be all that hard, and that the 'babies' effect is a coincidence.
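The heuristic is simple enough to sketch with Python's standard `html.parser`. This is only an illustration of the idea above, under the assumption that "structure" means text inside links whose target is relative or on the same host; the class and function names are hypothetical, and nothing here is known to match what Google does.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ContentText(HTMLParser):
    """Collect page text, skipping text inside links back to the same site."""

    def __init__(self, site_host: str):
        super().__init__()
        self.site_host = site_host
        self.skip_depth = 0  # greater than 0 while inside a same-site <a>
        self.chunks = []

    def _is_internal(self, href: str) -> bool:
        host = urlparse(href).netloc
        # Relative links (no host) and same-host links count as structure.
        return host == "" or host == self.site_host

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.skip_depth or self._is_internal(href):
                self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def content_text(html: str, site_host: str = "lesswrong.com") -> str:
    parser = ContentText(site_host)
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left behind by removed links.
    return " ".join("".join(parser.chunks).split())


sample = (
    '<p>Probability is in the mind. <a href="/lw/abc/#c1">Permalink</a> '
    '<a href="http://lesswrong.com/vote">Vote up</a> '
    'See <a href="http://example.com/bayes">this essay</a>.</p>'
)
print(content_text(sample))
# prints: Probability is in the mind. See this essay.
```

One cost of this simple version is that it also discards the anchor text of genuine internal cross-references between posts, which is real content.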
comment by [deleted] · 2011-03-25T22:58:38.009Z · LW(p) · GW(p)
For the months: fix the date display so that the month isn't written out.
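A minimal Python sketch of the difference, using the standard `datetime` module. The function names and the timestamp are hypothetical, and the site's actual templates may format dates differently; the point is only that a numeric format puts no month name on the page.

```python
from datetime import datetime


def long_date(ts: datetime) -> str:
    """Writes the month out as a word (in the current locale)."""
    return ts.strftime("%d %B %Y")


def numeric_date(ts: datetime) -> str:
    """ISO-style date: digits and hyphens only, so no month keyword."""
    return ts.strftime("%Y-%m-%d")


posted = datetime(2011, 3, 25, 22, 58)  # hypothetical comment timestamp
print(long_date(posted))
print(numeric_date(posted))
```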