Google’s algorithm (or close to it)

First of all, the algorithm is the mathematical formula that Google uses to decide which website ranks first. Knowing this formula would be of great value, as it would make our web positioning job easier, but it is a closely guarded technical secret.

I have been collecting clues on the algorithm for a while, and running some quiet experiments. The latest unofficial disclosure of the algorithm comes from Rand Fishkin, on his seomoz.org site, which is obviously very well positioned for the SEO keyword. The article’s name is “A little piece of the Google algorithm revealed”.

And the formula is:

GoogScore = (KW Usage Score * 0.3) + (Domain Strength * 0.25) + (Inbound Link Score * 0.25) + (User Data * 0.1) + (Content Quality Score * 0.1) + (Manual Boosts) – (Automated & Manual Penalties)
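
To make the arithmetic concrete, here is a minimal Python sketch of the formula as a weighted sum. Only the five weights come from the formula above. The names `WEIGHTS` and `goog_score`, the idea of scoring each factor on a 0 to 1 scale, and the sample values are my own assumptions for illustration, not anything Google or Rand has published.

```python
# Weights copied from the formula above. Everything else in this sketch
# (names, the 0-to-1 scale, the sample page) is a hypothetical illustration.
WEIGHTS = {
    "kw_usage": 0.30,
    "domain_strength": 0.25,
    "inbound_links": 0.25,
    "user_data": 0.10,
    "content_quality": 0.10,
}


def goog_score(factors, manual_boosts=0.0, penalties=0.0):
    """Return the weighted sum of the factor scores, plus boosts, minus penalties."""
    missing = set(WEIGHTS) - set(factors)
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    return weighted + manual_boosts - penalties


# Made-up factor scores for an imaginary page, each assumed to lie in [0, 1].
page = {
    "kw_usage": 0.8,
    "domain_strength": 0.5,
    "inbound_links": 0.6,
    "user_data": 0.4,
    "content_quality": 0.7,
}

print(round(goog_score(page), 3))                 # 0.625
print(round(goog_score(page, penalties=0.2), 3))  # 0.425
```

Note that the five weights add up to 1.0, so boosts and penalties sit outside the weighted part: even a modest penalty can wipe out the gain from a strong factor.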

Each factor is made up of the following signals:

KW Usage Score
• KW in Title
• KW in headers H1, H2, H3…
• KW in document text
• KW in internal links pointing to the page
• KW in domain and/or URL

Domain Strength
• Registration history
• Domain age
• Strength of links pointing to the domain
• Topical neighbourhood of domain based on inlinks and outlinks
• Historical use and links pattern to domain

Inbound Link Score
• Age of links
• Quality of domains sending links
• Quality of pages sending links
• Anchor text of links
• Link quantity/weight metric (Pagerank or a variation)
• Subject matter of linking pages/sites

User Data
• Historical CTR to page in SERPs
• Time users spend on page
• Search requests for URL/domain
• Historical visits/use of URL/domain by users Google can monitor (toolbar, wifi, analytics, etc.)

Content Quality Score
• Potentially given by hand for popular queries/pages
• Provided by Google raters
• Machine-algos for rating text quality/readability/etc

Automated & Manual Penalties are a mystery, but it seems they lower a page's ranking by 30 positions or more.

The factors mentioned are generally known in the experts’ forums, but the relative weight that Rand gives them is useful. Rand’s conclusion is that there is little we can do to apply this algorithm, other than improve content quality.

Some factors are too basic for Rand to mention: selecting a good domain, writing with a reasonable keyword density, planning links intelligently, producing clean code, writing sensibly, etc.

Surprisingly, very few companies publish results on the Google algorithm. However, competing search engines do their research very well, because they have managed to copy almost the same ranking features as Google. Most of the time, when I get a good ranking result in Google, Yahoo follows. A clear difference between the two algos lies in the penalties, with Yahoo being more lenient.

Most algo crackers show only a small sample of their knowledge, to keep their competitors from taking advantage of their findings, and to avoid identification and possible penalties. However, some of us are a bit more open, trying to use distributed thinking to achieve our algo cracking goals.