Optimizing For Average = Correlation

Author: Daniel Cuttridge

SEO content tools that make 'increase' and 'decrease' recommendations are largely unscientific. Most base their recommendations, in one way or another, on the average number of times your competitors use a given term or element. These tools have slight differences of course, and some of their features, I will admit, are better than others.

It is the core idea, however, that is a bit skewed. "Optimized content means matching the averages of your top competitors for as many factors as possible".
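To make that core idea concrete, here is a minimal sketch of what an average-based recommendation boils down to. The factor names, the counts, and the function `average_based_recommendations` are all hypothetical and invented for illustration; no real tool's logic or data is being reproduced here.

```python
from statistics import mean


def average_based_recommendations(your_counts, competitor_counts):
    """Compare each of your factor counts to the competitor mean.

    Returns 'increase', 'decrease', or 'keep' per factor. This is the
    'match the average' logic only; it says nothing about whether the
    factor actually influences rankings.
    """
    recommendations = {}
    for factor, yours in your_counts.items():
        target = mean(page[factor] for page in competitor_counts)
        if yours < target:
            recommendations[factor] = "increase"
        elif yours > target:
            recommendations[factor] = "decrease"
        else:
            recommendations[factor] = "keep"
    return recommendations


# Hypothetical counts for three top-ranking competitor pages.
competitors = [
    {"keyword_in_body": 12, "h2_count": 6},
    {"keyword_in_body": 8, "h2_count": 4},
    {"keyword_in_body": 10, "h2_count": 2},
]
# Hypothetical counts for your own page.
mine = {"keyword_in_body": 5, "h2_count": 7}

recs = average_based_recommendations(mine, competitors)
print(recs)
```

Notice that nothing in this logic asks whether a factor matters. It only measures how far you are from the middle of the pack.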


Truthfully, this works at times. After all, even a blind person throwing darts is going to hit the bullseye sometimes.

The 'core idea' though, is a load of rubbish. It just doesn't make sense, and it's not correct.

Correlation is not causation... Most high-schoolers have heard that before. So I find it hard to believe that the part of the search engine optimization industry that hypes these tools has never heard it.

Correlation and Dependence: In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or bivariate data. [1]

It's not so much that those people don't understand the problem with correlation; it is more that they can't identify it in the first place... Correlation has always been a trickster.

John changes 5 things on a page at once, and the page gets a good result. John thinks that thing #2 made all the difference.

I'm not going to insult your intelligence by telling you what is wrong with that, but it happens all the time and it's how most unfounded/plain wrong ideas in our industry started.


A good result, wrongly credited because of correlation, can quickly give way to cognitive biases such as confirmation bias. In practice it is less a case of 'can' and more a case of 'does'. This is also exactly why I have always maintained that one of the best ways to improve at SEO is to level-up your thinking.

Unfortunately, I can't help but think that the people who created these tools did it for one of two reasons, or possibly both.

  1. They had some good results and wrongly attributed the reasons why via correlation to the 'by averages' approach they recommend.
  2. They know it's a flawed idea, but they just don't care because $$$

Either way, they're not companies that I would put my faith in to help direct my campaigns.

My views on this do not change the fact that, amazingly, these tools have insidiously seduced a large portion of the industry, who now believe these tools to be scientific marvels.


The fact is that if you optimize for average, you will get average... The idea that you can get more is even etymologically incorrect - average has never meant optimal.

It is also worth noting that for some, average results constitute an overall improvement.

Take our example of John, the SEO who made 5 changes at once. He can't reasonably say what did or did not help... These tools have you doing too much at once because correlation has created the false belief that doing so is best.

The net result may still be positive. Say you change 15 things at once: 8 are positive improvements, and 7 are negative or neutral. The net result is an improvement in rankings.
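The arithmetic behind that scenario can be sketched in a few lines. The per-change effect sizes below are made-up numbers chosen only to match the 8-versus-7 split; in reality you never get to see them individually, which is the whole problem.

```python
# Hypothetical effect of each of the 15 changes on rankings
# (positive = helped, zero = neutral, negative = hurt).
# These values are invented for illustration only.
effects = [3, 2, 2, 1, 1, 1, 1, 1, -1, -1, -2, 0, 0, -1, 0]

helped = sum(1 for e in effects if e > 0)
did_not_help = sum(1 for e in effects if e <= 0)
net = sum(effects)

print(f"changes that helped: {helped}")
print(f"changes that were neutral or harmful: {did_not_help}")
print(f"net effect (the only thing you actually observe): {net}")
```

Only the last number is visible to the person making the changes. A positive net tells you nothing about which of the 15 changes deserve the credit.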

In real-world testing situations, it is normal to have results of the following types: True positive, False positive, False negative, and True negative. In SEO it is common to change too much at once and get a False positive, that is, to credit a change that did nothing or even did harm.
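As a rough sketch of how those four outcomes apply here, assume a bundle of 15 changes whose individual effects we could somehow see (we can't in practice; the numbers and the `classify` helper are hypothetical). Because the net result was positive, every change in the bundle gets credited as having helped.

```python
def classify(credited_as_helping: bool, actually_helped: bool) -> str:
    """Label one change using the four standard test outcomes."""
    if credited_as_helping and actually_helped:
        return "true positive"
    if credited_as_helping and not actually_helped:
        return "false positive"
    if not credited_as_helping and actually_helped:
        return "false negative"
    return "true negative"


# Hypothetical per-change effects, invented for illustration.
true_effects = [3, 2, 2, 1, 1, 1, 1, 1, -1, -1, -2, 0, 0, -1, 0]

# Rankings went up overall, so the whole bundle gets the credit.
bundle_credited = sum(true_effects) > 0

outcomes = [classify(bundle_credited, effect > 0) for effect in true_effects]
counts = {label: outcomes.count(label) for label in sorted(set(outcomes))}
print(counts)
```

Under these assumptions, nearly half of the changes end up as false positives while still looking like wins.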

It's easy to see how this can give the wrong impression.

Explained like this, it's also fairly obvious that using averages might hit the bullseye a few times for certain factors. But it might also lead to under- or over-optimization of other factors at the same time. A net result creates an illusion of total improvement, which then leads to the false, correlation-based belief that optimizing for averages is a sure way to improve rankings.

To me, though, a net improvement is not the same thing as a sound method. Optimizing for average is something I feel to be a deeply flawed strategy.


The one thing that I have not yet mentioned is the growing number of users who get inconsistent results with these tools because it is hit and miss when optimizing for average.

Once you work out what these tools are actually telling you when you implement their suggestions, it is that you have achieved the perfect average, whatever that means.

In Latin, the word for knowledge was 'scientia'. This is how we arrived at the word Science. Real optimization of your content should be scientific. Even if you can ignore the etymological issues, and the damning usage of correlation... How much do we really learn from these tools?

For all their warts, including the strong possibility that the creators failed scientifically to identify correlation when building the tools, they still hold plenty of useful raw data that can be used by someone who knows how.

So I'm not saying you should stop using these tools. Instead, I am saying you should use them in a responsible and informed manner, which is not what I have been seeing on the whole.

Correlation is not causation, and average is not optimal. Despite what you may have been told, we can do better, but that starts with awareness.