17 Ways Your Search Engine Judges the Value of a Link
How does Google decide how much a particular link helps your rankings? That one question has plagued link builders since the dawn of time (okay, 2002).
Before we get started on the list, let’s talk turkey. You may have noticed search engines have become more and more dependent on metrics about an entire domain, rather than just an individual page. It’s why you’ll see new pages or those with very few links ranking highly, simply because they’re on an important, trusted, well-linked-to domain. The Internet is changing, and in a way becoming more homogenized. But don’t worry, you can still make money with your site by understanding the nature of search engines and how they judge the value of a link. Here are 17 examples:
#1 – Internal vs. External
Search engines value external opinions more than internal. This is a simple fact, and it makes sense. If you are in a band and you go around telling everyone how great you are, not many ears perk up, except in annoyance. However, if Spin Magazine begins telling people your band is great, that changes things quite a bit. Search engines work in the same way. Internal links (links that point from one page on your site to another) do carry some weight; links from external sites matter far more.
This doesn’t mean it’s unimportant to have a good internal link structure, or to do all you can with your internal links (good anchor text, no unnecessary links, etc.). It just means that a site’s or page’s performance is highly dependent on how other sites on the web have cited it.
#2 – Anchor Text
An obvious one for those in the SEO business, anchor text is one of the biggest factors in the rankings equation overall.
This, of course, comes with a question. Is “exact match” anchor text more beneficial than anchor text that merely includes the target keywords somewhere in the mix? In a word, yes. We’ve conducted many experiments, much to the dismay of our interns, and we’ve conclusively decided that exact-match anchor text is the winner, no contest. The engines won’t always reward it this heavily, however, and it seems to us that, particularly for generic (non-branded) keyword phrases, this bias is the cause of a lot of manipulation and abuse in the SERPs.
#3 – PageRank
Whether they call it StaticRank (Microsoft’s metric), WebRank (Yahoo!’s), PageRank (Google’s) or mozRank (Linkscape’s), some form of an iterative, Markov-chain-based link analysis algorithm is part of all the engines’ ranking systems. So, it’s important. All of the services use the analogy that links are votes and that pages with more votes have more influence. Pretty simple, right? Well, sort of. Here is a quick PageRank primer (with a toy implementation after the list):
1. Every single URL on the Internet is assigned a tiny, innate quantity of PageRank.
2. If there are “n” links on a page, each link passes that page’s PageRank divided by “n” (thus, the more links, the less PageRank each one flows).
3. An iterative calculation that flows through the web’s entire link graph dozens of times produces each URL’s final ranking score. The calculation is too complicated to replicate here. Either that or it’s magic. It might be magic.
4. Representations like Google’s toolbar PageRank or SEOmoz’s mozRank on a 0-10 scale are logarithmic (thus, a PageRank/mozRank 4 has roughly 8-10X the link importance of a PR/mR 3. Get it?)
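Since the engines keep the real math locked away (or in a magician’s hat), here’s a toy sketch in Python of the iterative idea above. This is a bare-bones illustration, not any engine’s actual algorithm; the three-page graph is made up, and the 0.85 damping factor is the value from the original PageRank paper:

```python
# A toy PageRank calculation, for illustration only. Real engines operate
# on billions of URLs with many refinements beyond this.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # every URL starts with a tiny, equal share

    for _ in range(iterations):  # iterate until the scores settle
        new_rank = {page: (1 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                share = rank[page] / len(targets)  # a page's rank is split across its links
                for target in targets:
                    new_rank[target] += damping * share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(graph))
```

Run it on a bigger graph and the “links as votes” analogy shows up quickly: pages that collect links from important pages accumulate disproportionately more rank.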
#4 – TrustRank
The basics of TrustRank are described in this paper from Stanford – Combating Web Spam with TrustRank. There will be a quiz later.
If you are long since college age and got tired just from reading the word “Stanford,” then here is a quick primer. TrustRank, basically, says that “good” and “trustworthy” pages tend to be closely linked together. Think PayPal and eBay or something. It follows, then, that the spammy and dangerous stuff lives outside of this safe “center.” By calculating an iterative, PageRank-like metric that only flows juice from trusted seed sources, a metric like TrustRank can predictively state whether a site/page is likely to be high quality vs. spam. So, the lesson? Don’t take candy from strangers. The candy might be laced with spam and phish.
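To make the “trusted seeds” idea concrete, here’s a minimal sketch in the same toy style as the PageRank example above. It’s the same iterative calculation, except the baseline “teleport” portion of the score flows only to hand-picked trusted seed pages; the seed set and the graph are invented for illustration:

```python
# A TrustRank-style sketch: trust originates at trusted seeds and decays
# with every hop away from them, so distant pages score near zero.

def trustrank(links, seeds, damping=0.85, iterations=50):
    pages = set(links) | {t for targets in links.values() for t in targets}
    seed_share = 1.0 / len(seeds)
    trust = {page: (seed_share if page in seeds else 0.0) for page in pages}

    for _ in range(iterations):
        # only seeds receive the baseline "teleport" trust
        new_trust = {page: ((1 - damping) * seed_share if page in seeds else 0.0)
                     for page in pages}
        for page, targets in links.items():
            if targets:
                share = trust[page] / len(targets)
                for target in targets:
                    new_trust[target] += damping * share
        trust = new_trust
    return trust

graph = {"paypal": ["ebay"], "ebay": ["paypal", "blog"], "blog": ["spam"], "spam": []}
print(trustrank(graph, seeds={"paypal", "ebay"}))  # trust shrinks with each hop from the seeds
```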
#5 – Domain Authority
The phrase “domain authority” is thrown around all over the SEO world, but a concrete definition remains elusive. Most people use it to describe that wondrous combination of popularity, importance and trustworthiness that search engines calculate, based mostly on link data.
Search engines likely use scores about the “authority” of a domain when counting links, and thus, despite the fuzzy language, it’s worth mentioning as a data point. The domains you earn links from are potentially just as important as (or more important than) the individual metrics of the page passing the link. Our advice? Practice the term “domain authority” for your next SEO gathering, but be prepared to hide behind the punch bowl if people get too curious.
#6 – Diversity of Sources
No single metric correlates more positively with high rankings than the number of linking root domains. It is also, incidentally, a very hard metric to manipulate for spam, so it tends to indicate true, broad popularity and importance. How to rack up those linking root domains? Diversity. Empirical data suggests that a diversity of domains linking to your site/page has a strong positive effect on rankings: a link from an entirely new domain is more valuable than another link from a domain that already links to you. So get your name out there and start making those contacts. A few of them might turn into new links. Thank us later.
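To see the difference between raw link counts and linking root domains, here’s a small Python sketch using only the standard library. The URLs are made up, and the two-label shortcut for extracting a root domain is a simplification (real systems use a public-suffix list; this version mishandles ccTLDs like .co.uk):

```python
from urllib.parse import urlparse

backlinks = [
    "http://example.com/page-1",
    "http://example.com/page-2",          # same domain: another link, no new diversity
    "http://blog.another-site.org/review",
    "http://news.example.net/story",
]

def root_domain(url):
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    return ".".join(parts[-2:])  # naive two-label shortcut

total_links = len(backlinks)
linking_domains = {root_domain(url) for url in backlinks}

print(total_links, len(linking_domains))  # 4 links, but only 3 unique root domains
```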
#7 – Uniqueness of Source + Target
Those crafty search engines have a number of ways to judge and predict ownership and relationships between websites. These can include (but are not limited to):
* A whole lot of shared, reciprocated links
* Domain registration data
* Shared hosting IP address or IP address C-blocks
* Public acquisition/relationship information
* Publicized marketing agreements that can be machine-read and interpreted
Anecdotal evidence suggests that links shared between “networks” of websites obtain very little value from search engines. This applies particularly to the classic SEO strategy of “sitewide” links. So, again, diversify those links.
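As one concrete example of these ownership signals, here’s a sketch of grouping linking IP addresses by C-block (the first three octets). The addresses are invented; the point is that many “different” links resolving to one block look like one network rather than many independent votes:

```python
from collections import Counter

linking_ips = ["208.97.132.1", "208.97.132.44", "208.97.132.90", "66.249.64.7"]

def c_block(ip):
    return ".".join(ip.split(".")[:3])  # e.g. "208.97.132"

blocks = Counter(c_block(ip) for ip in linking_ips)
print(blocks)  # Counter({'208.97.132': 3, '66.249.64': 1}) -- three links share one network
```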
#8 – Location on the Page
Microsoft was the first engine to reveal public data about its plans to do “block-level” analysis (in an MS Research piece on VIPS – VIsion-based Page Segmentation). If you lack the patience to read that long-form piece, or are simply out of ADD medication, then read on.
Simply put, internal links in the footer of web pages may not provide the same benefit that those same links do when placed in the top or header positions. This is based on our own experimentation (gotta keep that intern working) and much empirical data brought to us via Google and Yahoo!. It appears to stem from an algorithm that seeks to dismiss pervasive link advertising by diminishing the value that external links carry from the sidebar or footer of web pages. Links from within the actual content remain, as always, the most sought-after.
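Nobody outside the engines knows the actual discounts applied, so treat this as a purely illustrative sketch of the block-level idea: the same link is worth less in a footer or sidebar than in the main content. The weight values here are invented:

```python
# Illustrative only: hypothetical per-block multipliers for link value.
BLOCK_WEIGHTS = {"content": 1.0, "header": 0.7, "sidebar": 0.4, "footer": 0.2}

def weighted_link_value(base_value, page_block):
    """Discount a link's base value by the page region it appears in."""
    return base_value * BLOCK_WEIGHTS.get(page_block, 0.5)

print(weighted_link_value(10.0, "content"))  # 10.0
print(weighted_link_value(10.0, "footer"))   # 2.0
```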
#9 – Topical Relevance
The search engines have a myriad of tools at their disposal to determine whether two pages or sites cover similar subject matter. Years ago, Google Labs unveiled an automatic classification tool that could predict, based on a URL, the category and sub-category for nearly any type of content. This worked for content across a wide array of subjects, from medical news to real estate and back again. Engines may use these automated topical-classification systems to identify “neighborhoods” around particular topics.
However, there are arguments to be had on both sides here. We are of the opinion that a link from a topic-neutral site such as NYTimes.com, or from a specific blog on an unrelated subject, will still pass positive value. Perhaps the engines use these classification tools more to predict spam than to pass judgment. After all, it does look fishy (phishy?) if a site that has never previously linked to anything in the pharmaceutical field suddenly does so.
#10 – Content & Context Assessment
Sure, topical relevance can provide some useful information for engines about linking relationships. But isn’t it possible that the content and context of a link may be even more useful to said engines? Of course it is! Content is king, after all. In content/context analysis, the engines attempt to discern, in a machine-parseable way, why a link exists on a page.
For instance, links placed for editorial reasons create certain patterns. They tend to be embedded in the content, link to relevant sources, and use accepted norms for HTML structure, word usage, phrasing, language, etc. Through a series of pattern-matching algorithms, it’s possible for search engines to analyze editorial links and determine their value and the likelihood they were added authentically.
#11 – Geographic Location
The geography of a link is, obviously, highly dependent on the purported location of the host. However, the engines (specifically Google) have been amping up the sophistication in their quest to pinpoint the location-relevance of a root domain, subdomain or subfolder. Here are some of the things they look for:
* The host IP address location
* The country-code TLD extension (.de, .co.uk, etc)
* The language of the content
* Registration with local search systems and/or regional directories
* Association with a physical address
* The geographic location of links to that site/section
If you earn links to a page or site that is targeted to a particular region, that should help you perform better in that region’s searches. However, if your link profile is tied too heavily to one particular region, it may be harder to perform in other regions. Keep that in mind as you set out to build links.
#12 – Use of Rel=”Nofollow”
Although in the SEO world it feels like a lifetime since nofollow appeared, it has actually only been around since January of 2005, when Google announced it was adopting support for the new HTML attribute.
To put it simply, rel=”nofollow” tells the engines not to ascribe any editorial endorsement or “vote” that would boost a page or site’s ranking metrics. It is an attempt to filter out some noise. Linkscape’s index notes that approximately 3% of all links on the web carry the “nofollow” attribute.
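Here’s a sketch of how a crawler might separate followed from nofollowed links, using only Python’s standard-library HTML parser; the snippet of HTML is invented:

```python
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.followed, self.nofollowed = [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        rels = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rels:
            self.nofollowed.append(href)  # may still be crawled, but passes no endorsement
        else:
            self.followed.append(href)

html = '<a href="/about">About</a> <a rel="nofollow" href="http://example.com">Ad</a>'
parser = LinkParser()
parser.feed(html)
print(parser.followed)    # ['/about']
print(parser.nofollowed)  # ['http://example.com']
```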
#13 – Link Type
Not every link arrives in the same format. A link can be a plain HTML text link, an image link (where the alt attribute serves as the anchor text), a JavaScript-based link or a redirect. Plain HTML text links generally pass the most value, image links pass their alt text as anchor text, and script-based links may carry little or no weight at all.
#14 – Other Link Targets on the Source Page
When a page links out externally, both the quantity and targets of the other links that exist on that page may be taken into account by the engines when determining how much link value will be passed on.
As mentioned way back in topic number 3, PageRank, the algorithms from all of the engines divide the amount of value passed by any given page by the number of links on that page. Additionally, the engines may also consider the quantity of external domains a page points to, as a way to judge the quality and value of those endorsements. For example, a page that links to merely a few external resources on a particular topic, spread throughout the content, will be perceived differently than a long list of links pointing to external sites. One is not necessarily better than the other, but the engines may pass greater value through one or the other, depending on the rest of your page/site and the links contained therein.
Also, the engines are going to look at who else your linking pages endorse. If they link to anything shady or spam-filled, the value of your link goes down. It’s kind of like being seen with that one friend who always clears the room at parties. Nice guy, but…
#15 – Domain, Page & Link-Specific Penalties
Nearly everyone in the SEO world can agree on one thing: search engines apply penalties to sites and pages. These range from the loss of the ability to pass value and endorsement all the way up to a full-on ban from the main index. If a page or site has received the former punishment, then links from it provide no value for search rankings. Beware, though: engines occasionally make penalties publicly visible, but that is not always the case.
#16 – Content/Embed Patterns
As content licensing & distribution, widgets, badges and distributed, embeddable links-in-content become more prevalent across the web, the engines have begun looking for ways to downplay these tactics. It’s not that Google et al. don’t want to give proper value to the pages or sites that employ these tactics, it’s just that they are a bit wary about over-counting or over-representing sites that simply do a good job of distributing their licensing deals.
It’s likely that content pattern detection and link pattern detection play a role in how the engines evaluate link diversity and quality. If the engines see the same link with the same surrounding content on thousands and thousands of sites, that’s sure to signal a decrease in endorsement. To say it yet again, diversity is key. The engines place more stock in a variety of links from a variety of sources featuring a variety of content. It makes sense, after all.
#17 – Temporal / Historical Data
Temporal data (when links appear and how they accumulate) is the final point on this rather long list. As the trusty engines crawl the web, they see patterns in how sites earn links. They use this data to fight spam, identify authoritative links and pass endorsement to rising Internet stars.
Of course, what the engines do with these patterns of link attraction is the subject of much debate. One thing isn’t, however: the data IS being consumed. It is being analyzed, and it is being used to help the algorithms do a better job of showing the best results and reducing the effectiveness of spam.
This list had a lot of information, but it certainly was not a be-all end-all list. Please feel free to suggest your own additions in the comment box.