Choosing Your SEO Testing Grounds

March 24, 2008 – 1:19 am

OK, so after a long time and a lot of testing and data crunching, it was about time I sat and wrote about my latest SEO adventures.

It seems like SEO testing is all over the place lately. First, there is the excellent post by our old favorite XMCP about all the things to keep in mind when setting up an SEO test (and my follow-up post). Then there is an interesting test about Google’s preference for different TLDs at the GoogleCache blog. Top it off with the “Google indexes only the first link” test from SEOmoz (or the same test done earlier by Michael VanDeMar) and you have a craze. That is without mentioning Michael Martinez’s SEO Theory, which consistently provides thought-provoking material. Hell, there is so much writing about SEO testing that maybe there should be a session on a similar topic at the next search conference? SMX Advanced maybe? (hint, hint)

So, you read all the excellent articles that describe different SEO tests and you want to test an idea that has been running circles in your head for months now. The question is: how do you start?

Image courtesy of cyranthu

You want your test to have clear-cut, reliable results that can be translated into actions applicable to your money/client’s websites. So obviously you will want to put a website of your own out there, tweak a thing or two and see how that affects your rankings. Now this is where the problems start. What keyphrase to optimize for? How much to optimize? … The main question here is what I am looking for in a testing ground. Well, there are two things that come to mind:

  1. Low noise level – we want to perform our experiments in surroundings that will not drown out our signal. This means that if I see my site drop 5 positions in the SERPs, this is due to the action I performed and not due to the fact that the 5 sites below me have increased their rankings.
  2. Low level of competition – this is needed for two reasons:
    1. Competitive phrases have a high level of noise due to the constant promotion work being done on competing websites: links added, text changed, meta tags improved, etc.
    2. It will be harder for you to actually bring your site to a position where changes affecting it can be analyzed.

Looking at the above mentioned articles, SEO testing ground choices that people have made can be summarized into three models:

  1. A virgin testing ground – this is a nonsense keyword that has no results in Google SERPs prior to the test. This is basically what was done in the SEOCache Google TLD preference test. They created new websites and promoted them as much as they wanted for the non-existing keyword, thus controlling all 10 results. This approach provides a great level of control over all the parameters of the competing sites, thus enabling the tester to accurately attribute every change in rankings to the action done on the websites. That said, this is as far from a real-life situation as you can get. Forget about semantic relevance, forget about the rate of incoming links, forget about the search history. The problems with this model are further stressed by the actual SEOCache experiment – they show that Google prefers .org TLDs since they ranked above all the other TLDs consistently. However, when I checked the results from Israel, .net TLDs came up above the .orgs.
  2. Semi-virgin testing ground – optimizing for nonsense keyphrases that have websites in the SERPs but are not used. Good examples I can think of are old SEO contests like [nigritude ultramarine] or [seraphim proudleduck]. These are actually quite good testing grounds, but again, no semantic relevance and other algo parameters that characterize a real SERP.
  3. Semi-promiscuous testing ground – these are SERPs for keyphrases that are made up of words that actually mean something separately but together are not in common use, like [small red cantaloupe chair] or [arrogant tennis epiphany]. This will provide you with a number of real websites to compete against which will provide the search history and link addition rate parameters. The real keywords used in the query will cover the semantic relevancy issue. There are, of course, problems with this ground, which I will elaborate on in a moment.

So, my somewhat biased description of the different testing grounds should tell you which one I personally prefer. Yep, the majority of my tests are done on the Semi-Promiscuous Testing Grounds (SPTG), due to the closest possible resemblance to real-life SERPs. There are some problems with SPTGs as well, as the actual examples below show.

Throwing theories up in the air is all well and good, however some examples need to be shown. So yours truly took 10 queries that define SPTGs and monitored the rankings for 20 days. Well, actually I did not monitor the rankings, SERP Archive did and I just gathered and analyzed the results.

So, I will provide just the chosen few phrases that, IMHO, represent the typical SPTGs and discuss the potential problems with these niches. I already apologize for not giving the complete details about the queries, I am running additional tests on some of them and do not want them spoiled just yet :).

So let’s look at the graph over time for the phrase #1:

As can be seen, the top 10 for this phrase is pretty stable. Notice how things go a bit haywire between the 7th and the 11th of February? Keep that in mind. Also see how Site 8 dropped out of the top 20 for a few days and then returned? Remember this too, it will come in useful when we get to the numerical analysis. Let’s take a look at phrase #5:

This one looks even more stable. The turmoil begins only at position 7 and lower, and even then it is not significant for sites #7, #8 and #9. Let’s take a shot at another niche defined by Phrase 3:

See the upheaval between the 7th and the 11th of February ? Just like in Phrase 1. Since these are two completely unrelated phrases, it makes sense that this is some kind of Google link recalculation/PR update/algo change. This is further emphasized by a similar pattern seen with other phrases not shown here.

Now, as a comparison, let’s take a look at a competitive phrase [personal loans]:

All hell breaks loose. No one is safe here: links are constantly being added, so no position stays constant.

Looking at colorful charts is nice and important, however it is not enough. If you want to automate the process of choosing the testing grounds, you need to have some numbers for your scripts to crunch, so some calculations need to be made. Here is the point where I am warning any casual reader that what I did when doing the calculations is based only on my common sense. I am aware of the fact that much more robust and logical statistical tests exist that should be applied to the data, but I am just not swimming well in that field. I am actually trying to set up a meeting with a statistics whiz who will guide me in these kinds of analyses, but that has not happened yet and I did not want to delay this post any longer. So take anything from this point onwards with more grains of salt than you are usually recommended when reading this blog (which is a lot).

So what I did was calculate, for each site, the absolute change in position between every two adjacent days and then average these changes over the testing period. This gave me an average change in position for each site. Then I averaged these values across the sites in order to get a value I called the Niche Stability Value (NSV). I put those in a bar chart and here is what I got:

So even though I am not sure (to say the least) about the reliability of my calculations, the above chart matches what I saw in the phrase charts. Phrases 6-10 were considered non-competitive, however they did include competitive words like “investment” or “outfit”, albeit in a non-conventional context. Since the queries were not done in quotes, it makes sense that some of the sites in the top 10 were being promoted, which would add to the level of noise.
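For anyone who wants to reproduce the NSV calculation described above, here is a minimal sketch of how I would put it into a script. The function names, the data layout (one list of daily positions per site) and the example numbers are all made up for illustration, and how to record a site that falls out of the tracked range is left open here.

```python
from statistics import mean


def average_daily_change(positions):
    """Mean absolute change in position between adjacent days for one site."""
    changes = [abs(today - yesterday)
               for yesterday, today in zip(positions, positions[1:])]
    return mean(changes)


def niche_stability_value(niche):
    """Average of the per-site average changes (lower means more stable).

    `niche` maps a site name to its list of daily positions.
    """
    return mean(average_daily_change(p) for p in niche.values())


# Hypothetical data: three sites tracked over five days.
example_niche = {
    "site_a": [1, 1, 2, 1, 1],
    "site_b": [2, 2, 1, 2, 2],
    "site_c": [3, 3, 3, 3, 3],
}

print(round(niche_stability_value(example_niche), 3))
```

A niche where nothing ever moves gets an NSV of 0, and the value grows as the sites shuffle around more from day to day.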

One of the weaknesses of my calculations is that they do not minimize the effects of temporary, reversible drops in the SERPs (like we saw with Site 8 for phrase #1), as they should. These drops do not represent a real devaluation of the site’s score. Actually, if I took those few days out of the calculations, the NSV for phrase #1 would drop to 0.2, which would make more sense. So, any statisticians out there, I would love to get some input and further improve my calculations.
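One possible way to take those days out of the calculation, sketched under my own assumptions: mark the days on which a site was outside the tracked range as `None` and skip every day-to-day comparison that touches such a day. The function name and the numbers are invented for illustration, and the sketch also assumes a position of 21 stands in for "out of the top 20" in the unfiltered version. This is not necessarily the best way to handle drop-outs, just the simplest one.

```python
from statistics import mean


def average_change_ignoring_dropouts(positions):
    """Mean absolute day-to-day change, skipping pairs with a missing day.

    A missing day (site outside the tracked range) is recorded as None.
    """
    pairs = [(a, b) for a, b in zip(positions, positions[1:])
             if a is not None and b is not None]
    if not pairs:
        return 0.0  # nothing comparable, treat as no measured movement
    return mean(abs(b - a) for a, b in pairs)


# Hypothetical site that drops out for two days and then returns.
with_placeholder = [8, 8, 21, 21, 8, 9]      # drop-out recorded as 21
with_missing = [8, 8, None, None, 8, 9]      # drop-out recorded as None

naive = mean(abs(b - a)
             for a, b in zip(with_placeholder, with_placeholder[1:]))
filtered = average_change_ignoring_dropouts(with_missing)
print(naive, filtered)
```

In this made-up case the two jumps in and out of the tracked range dominate the unfiltered average, while the filtered one only reflects the ordinary movement.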

So, what is the take home message ? How do we choose a niche to perform experiments in ?

  • From what I saw in the results of the experiment, the phrases that defined the most stable niches were scientific phrases. Any query that brings up a lot of PDF files from scientific magazines should fit this category. Furthermore, non-exact sciences make better niches for SEO testing than exact sciences. So go for comparative religion, literature, sociology etc.
  • It is important to do both a visual inspection of the location charts and the statistical analysis. The advantage of visual check is that you can spot the algo changes/PR updates/reversible drops that should be taken out of equation. The advantage of the statistical analysis is that it can provide you with the quick estimate of low-competitive niches. It may have some false negatives, however the chances of a false positive are rather small. If the statistical analysis singles out a niche as a non-competitive one, the chances are that it really is a good testing ground, while the niches marked as competitive could still be good testing grounds with non-significant quirks that skewed the calculations.
  • Do not rely on your estimation of what is a non-competitive phrase without doing the above analysis. As you can see, I thought that phrase #8 was non-competitive and it came out to be a terrible possible testing ground. As they say, assumption is the mother of all f*ckups. Yes, yes I know there is a nicer way of saying that. It sounds dorky though.
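The visual check for algo changes and PR updates mentioned above can be partly handed over to a script. Here is a rough sketch, assuming that an upheaval which shows up on the same days in several unrelated phrases is not caused by anything happening inside a single niche. The function names, the `factor` and `min_phrases` defaults and the sample data are arbitrary choices of mine, not values derived from the experiment.

```python
from statistics import median


def daily_churn(niche):
    """Total absolute position movement of all sites for each day-to-day step."""
    series = list(niche.values())
    steps = len(series[0]) - 1
    return [sum(abs(s[i + 1] - s[i]) for s in series) for i in range(steps)]


def shared_upheaval_steps(niches, factor=3.0, min_phrases=2):
    """Steps where at least `min_phrases` niches churn far above their median."""
    counts = {}
    for niche in niches.values():
        churn = daily_churn(niche)
        # Use at least 1 as the baseline so a perfectly flat niche
        # does not flag every tiny movement.
        threshold = factor * max(median(churn), 1)
        for i, value in enumerate(churn):
            if value > threshold:
                counts[i] = counts.get(i, 0) + 1
    return sorted(i for i, n in counts.items() if n >= min_phrases)


# Hypothetical data: two unrelated phrases, six days each.
niches = {
    "phrase_1": {"a": [1, 1, 1, 3, 1, 1],
                 "b": [2, 2, 2, 1, 2, 2],
                 "c": [3, 3, 3, 2, 3, 3]},
    "phrase_2": {"x": [1, 1, 1, 3, 3, 3],
                 "y": [2, 2, 2, 1, 1, 1],
                 "z": [3, 3, 3, 2, 2, 2]},
}

print(shared_upheaval_steps(niches))
```

Step `i` here means the change from day `i` to day `i + 1`. Steps flagged this way are candidates for being left out of the stability calculation, but I would still look at the charts before throwing any data away.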

To summarize this g-i-g-a-n-t-i-c post: choosing pristine niches for your testing is a good tactic, but it takes a lot of real-life parameters out of the equation. On the other hand, performing tests in real competitive SERPs will probably tell you nothing and will waste a lot of your time. Therefore, I do my tests in SERPs made out of illogical phrases consisting of real words. This, however, demands location monitoring for all the sites around my testing pages, which gives it an additional level of reliability by ruling out temporary hiccups in Google’s algo and other unrelated changes.

I don’t have a disclaimer on this blog (although I should put one up sometime), but this post maybe shows the need for it most clearly: the setup of the tests, the results and the interpretations (mostly the interpretations) are all a product of my experience and current knowledge. They can be spot on and they can be complete crap. For me, the most important thing is to put the material and the ideas out there for the public to judge, add to and shoot down. Only time and additional tests will tell whether there is some value to my ramblings here. So, if you have a different idea, find a significant logical failure or just strongly disagree with everything written here, please leave a comment. I don’t consider this a popularity contest.

Positive responses are also welcome. :))

PS. After re-reading the post and before publishing it, I noticed a possible mix-up that can happen: the stability of a niche should not be confused with its competitiveness, i.e. the difficulty of promoting a page to the top 10 for that phrase. It only shows the level of change in the locations of the top 10 websites.


  1. 11 Responses to “Choosing Your SEO Testing Grounds”

  2. Awesome post! It’s sparked a few ideas 😀

    I must say that finding the standard deviation for your NSV numbers would be immensely useful for testing out various testing grounds (ha!).

    I’d also be curious to know what your normal distribution looks like (how steep is your bell curve for the NSVs)…that, along with the std deviation, can help you identify scenarios which are statistically insignificant and figure out when you’ve done enough testing to come to a conclusion.

    This could be the start of a great study. Congrats!

    By kamo on Mar 24, 2008

  3. Hey Kamo,

    Thanks for stopping by and for the constructive comment. I would love to brainstorm with you on the significance of SD values before I post it. I’ll email you with the details.

    Regarding the bell curve, I thought of that, but I don’t think that 10 phrases is a sample large enough to draw any significant conclusions about the normal distribution of niches. I would need a good sample of seemingly non-competitive phrases as well as some competitive ones. Now that I think of it, just deciding what the sample is could skew the results of such investigation…

    By Neyne on Mar 24, 2008

  4. What an interesting and useful post you have there. I guess this is the real SEO scientist here to apply his scientific techniques. 😆

    By The_man on May 26, 2008

  5. Creating a sterile SEO testing environment is extremely difficult, no doubt about it. What I do in my projects to achieve good measurability is to set up several subdomains with exactly the same number of incoming links from exactly the same sites. Then I allow Google several weeks to show results for the keywords in question. It works for me quite well.

    By Security Bay on Jul 15, 2008

  6. Another great test and article. Really enjoyed the read. I don’t know if you can actually get it exactly correct with the ever changing landscape.

    By Bill Ross on Aug 20, 2008

  7. liked ;]
    thanks!

    By transportadora on Feb 12, 2009

  8. I’m using this service to monitor my website’s position – http://monitor.mazecore.com . They provide rank and uptime monitoring with alerts, but position monitoring on free account is enough for me. I recommend this service with free tariff for your website.

    By John on Oct 10, 2009

  9. Fantastic SEO research advice here, thanks for sharing.

    By Stuart Chester on Mar 1, 2011
