How to Avoid Duplicate Content Penalties While Link Building
What is duplicate content? Never heard of the term? Don’t think it may affect you? Think again. Even if you don’t dabble in the dark arts, you can accidentally victimize yourself by not understanding the ins and outs of duplicate content.
Duplicate content is what it sounds like it is: content that is an exact or near-exact duplicate of other content. You may think that only spammers and thieves are using duplicate content, but the issue of duplicate content can arise through a number of innocent actions as well. Whether you are using non-standard links or syndicating your content, you need to be well aware of the implications of duplicate content and how to address each. This is especially true for those of you who are earnestly working at link building and creating valuable content in order to attract greater levels of targeted traffic and increase your webpage’s search placement.
Why Should You Care? Because You Can Be “Penalized” for Duplicate Content
Yes, that’s right. There are penalties for using duplicate content. In fact, Google is so concerned with duplicate content that it devotes considerable time and resources to rooting it out with something that is referred to as the Google Duplicate Content Filter. Other search engines do the same thing.
Why are search engines so concerned? Because their existence requires them to return a collection of valuable, accurate and diverse search results when someone performs a search. Consider for a moment: how valuable would a search engine be if you searched for “used cars” and the first 1,000 results were the same article, only published on 1,000 different sites? No diversity.
To ensure their search results are diverse–and to penalize those who are attempting to manipulate the search engines–Google, Yahoo and the others look for duplicate content and “penalize” it. Another important reason why search engines look for duplicate content is that identifying it helps them conserve their valuable resources, so they can index and re-index more content in less time.
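To make “near-exact duplicate” a little more concrete, here is a minimal sketch of one common way to compare two pieces of text: break each into overlapping three-word “shingles” and measure how many shingles they share (Jaccard similarity). This is only an illustration. The search engines do not publish their filters, and the function names and the shingle size of three are my own assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Share of shingles two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


original = "Buy quality used cars from a dealer you can trust today"
near_copy = "Buy quality used cars from a dealer you can trust now"
unrelated = "Our bakery sells fresh bread and pastries every single morning"

print(jaccard_similarity(original, original))   # identical text
print(jaccard_similarity(original, near_copy))  # one word changed
print(jaccard_similarity(original, unrelated))  # nothing in common
```

A score near 1.0 suggests the two texts are effectively the same article, which is the kind of overlap you want to avoid between your own pages and your syndicated copies.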
What Are The Penalties?
There are a few different ways in which you can be penalized for duplicate content, and the majority of the penalizing is your own doing. First off, Google does its best to provide the most relevant search result first. So if you have placed lots of duplicate content on your website and on other websites, and Google figures out it’s all duplicate, it is going to do its best to select the one that is most relevant. All the others will be severely downgraded, moved to the “omitted results” section of the search, or, if Google determines that the pages are an attempt to spam the search engine, banned outright.
Unfortunately for you, this may result in the wrong webpage gaining top ranking over the right webpage–creating a very frustrating “how did that happen” moment.
However, careful preparation and forethought can help prevent many of these accidents from happening while allowing you to continue forward with your content creation and link building (see Google’s Duplicate Content Policy and Guidelines for SEO and link building).
Types of Duplicate Content
There are a variety of ways to create duplicate content:
Content Syndication: Publishing your content for others to re-publish, article marketing and other methods of syndication can create hundreds and even thousands of duplications on the internet.
Mirror Site: Creating site mirrors means that there are exact matches of your website’s content on the Internet.
Ease of Use: Creating HTML and printer-friendly duplicates of content.
Site Architecture: Your site’s navigation may allow an individual to access the same content through different navigational paths; however, a search engine will see the two URLs as two different documents with the same content. This can occur with blogs that categorize the same content by category, month, year, favorites, etc.
Non-Standard Linking Practices: Search engines perceive the URLs “mysite.com/”, “www.mysite.com/” and “www.mysite.com/index.html” as three distinct documents–all of which have the exact same content.
Dynamic Navigation: Some navigation systems add several variables to the end of a URL that cause search engines to perceive the URL as unique each time it is generated. For example, a session variable or internal tracking code of some kind can make a URL look different to a search engine each time it is accessed.
Affiliate Template: Building quick affiliate websites from an affiliate template, the same template that appears on hundreds or thousands of other affiliate websites.
Standardized Product Descriptions: Similar to affiliate templates, using a standardized product description from a manufacturer guarantees that there are hundreds or thousands of duplicate descriptions out on the Internet.
Stolen Content: A less-scrupulous website owner steals your content and publishes it as his own.
Spamming Content: Purposely re-publishing the same content hundreds of times on your own site or syndicating it to hundreds or thousands of websites for the sole purpose of spamming search engines.
And I am sure there are even more. As you can see, some of these infractions happen regularly on almost every website on the Internet. I know I have made several of these mistakes in the past.
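The non-standard linking and dynamic navigation problems above both come down to several URLs pointing at one document. Here is a minimal sketch of how you might collapse those variants into a single form before comparing or publishing links. The host alias, the list of “noise” parameters and the index file names are assumptions chosen for illustration, so adjust them for your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for this sketch: the preferred host, the query parameters
# that only carry session/tracking data, and the index page file names.
HOST_ALIASES = {"mysite.com": "www.mysite.com"}
NOISE_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}
INDEX_FILES = {"index.html", "index.htm", "index.php"}


def normalize_url(url):
    """Return one standard form for URLs that point at the same document."""
    if "://" not in url:
        url = "http://" + url  # treat bare "mysite.com/" as an http URL
    parts = urlsplit(url)

    # Use a single host name for the site.
    host = parts.netloc.lower()
    host = HOST_ALIASES.get(host, host)

    # Treat "/index.html" and friends as the directory itself.
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_FILES:
        path = head + "/"

    # Drop session/tracking parameters and sort the rest for stability.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in NOISE_PARAMS]
    query = urlencode(sorted(kept))

    return urlunsplit((parts.scheme.lower(), host, path, query, ""))


for variant in ("mysite.com/", "www.mysite.com/", "www.mysite.com/index.html"):
    print(normalize_url(variant))
```

All three variants from the list above come out as the same address, which is the form you would then use consistently in your own links.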
What Can You Do?
The following is a list of steps that you can take to avoid any duplicate content penalties:
1. Don’t duplicate your content. If content is king on the Internet, it may be a better idea to not duplicate it on your site or syndicate through other distribution methods.
2. Standardize your links. Make sure that all of your internal and–to the greatest extent possible–your external inbound links use the same URL structure. You can also use the preferred (canonical) domain and dynamic parameter settings in Google’s webmaster tools to standardize your links and avoid duplicate content issues.
3. Emphasize links to original documents. If you have duplicate content as a result of HTML and printer-friendly versions, emphasize links to the original HTML document and de-emphasize links to any additional versions of the content.
4. Use the noindex or canonical tags. If you have duplicate content on your website for formatting purposes, use the noindex meta tag or the newer canonical tag (strictly speaking a link element with rel="canonical", though it is often called the canonical meta tag) to let the search engines know which document you want considered to be the original. The noindex tag simply tells the search engine not to index the page, which is effective but not as useful as the canonical tag.
The canonical tag allows you to point to a matching document and tell the search engine that it is the original document and that any other duplicates should be ignored. The format of the tag is as follows (the URL here is only a placeholder for the address of your original document), and it can be placed within the head section of the original document as well as any duplicate content pages:
<link rel="canonical" href="http://www.example.com/original-page.html" />
I would guess that this tag might hold some value if your document contains it and other, offsite duplicates don’t.
5. Get new content indexed first. When you publish new content, and especially if you are going to syndicate your content, make sure to have the new webpage crawled as quickly as possible. Do not syndicate the content before you know the page has been indexed. Having a prior date of indexation can help show the search engines that your content is the original.
6. Link back to the original. If you syndicate your content, make sure that all documents have a link pointing back to the original, boosting the original document’s PageRank.
7. Don’t over syndicate. Sharing content is a cornerstone of the Internet and to some extent it is expected. Where the line is between good syndication and spamming is unclear. But it may have to do with time as well as numbers. A good article that is organically syndicated throughout the Internet will take time (days, weeks and even months), where submission software can blast the content to thousands of websites in an afternoon.
8. Manually submit your content. Manually submitting your content to fewer distribution points allows you the opportunity to change a headline, paragraph, link or anchor text.
9. Re-write templated or pre-generated content. To the extent possible, don’t use templated or pre-generated web content. Taking some time to make the listing unique and original can give you a huge advantage over every other site that duplicated the content.
10. Use 301 redirects. If you have old documents with high PageRank (PR) value that are duplicates of newer documents, use 301 redirects to send users and search engines to the new documents, PR intact.
11. Use the webmaster tools to set the proper domain / language / locale settings. Two websites with mirrored content can avoid penalization and confusion by using the Google webmaster tools to decide which website serves which language and/or locale.
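Two of the steps above, the canonical and noindex tags (step 4) and 301 redirects (step 10), are easy to picture in code. The following is a minimal sketch only: the paths in the redirect table are made up, and the function names are mine rather than part of any particular web framework.

```python
from html import escape

# Hypothetical table of retired duplicate pages and the pages replacing them.
REDIRECTS = {
    "/old-article.html": "/new-article.html",
}


def canonical_link_tag(original_url):
    """Build the rel="canonical" element for a page's <head> section."""
    return '<link rel="canonical" href="{}" />'.format(
        escape(original_url, quote=True))


def noindex_meta_tag():
    """Build the robots meta tag that asks engines not to index a page."""
    return '<meta name="robots" content="noindex" />'


def resolve(path):
    """Return (status code, headers) for a requested path.

    Retired duplicates answer with a permanent (301) redirect to the
    current document; every other path is served normally (200).
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, {"Location": target}
    return 200, {}


print(canonical_link_tag("http://www.mysite.com/page?a=1&b=2"))
print(noindex_meta_tag())
print(resolve("/old-article.html"))
print(resolve("/new-article.html"))
```

In practice you would most likely configure redirects in your web server or content management system rather than write them by hand, but the logic is the same: the old address answers with a 301 status and a Location header naming the new one.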
Duplicate content is just one of the ways that well-intentioned link building efforts can go bad. An over-syndicated article can create huge exposure and direct web traffic, but absolutely kill your website’s standing within a search engine.
Take the time to look over your website and ask yourself if you are making any of the mistakes listed in this article. Using some of the suggested solutions may just push your webpages higher in the search results.
October 29th, 2009 at 9:08 am
Wow, great post. I’ve been an SEO intern for the past 2 months and article marketing has been one of my major focuses. I distribute to 5 sites with EZine and Articlebase.com leading the way. Most of the articles I submit to these sites get picked up while I haven’t had much success through other article submission sites. Is 5 sites too much and should I just limit it to EZine and Articlebase.com?
October 29th, 2009 at 9:37 am
Is 5 sites too much? It all depends on what you are trying to achieve with your article marketing. If you are trying to create a more organic, viral opportunity to build authority, links and syndication, then you are just fine.
If your articles aren’t getting picked up as much as you would like, then maybe you need to ask yourself if what you’re writing about is interesting (how many people really care about it), or if your content is interesting, or if your writing style is interesting.
Also, I think it is important to consider what you do once you have submitted those articles. Are you trying to hype them, or are you waiting for them to be discovered? That’s where a lot of social networking technologies come into play to help you buzz your articles. But I am sure you are already doing that.
Thanks for the comment.
October 29th, 2009 at 2:01 pm
Google Guidelines specifically state not to create multiple pages, domains or subdomains with duplicate content.
November 9th, 2009 at 10:58 am
Yes and no. Yes, you are not supposed to duplicate content on your own site or across multiple websites you own. This simply creates a lot of noise and is perceived by SEs as an attempt to manipulate rankings.
But sometimes it is unavoidable, as in the case of a blog that will navigationally refer to the same document in different ways. That is when webmasters should be using the noindex and canonical tags to prevent SEs from penalizing their sites.
As far as syndicating content across multiple pages and domains that you don’t own, that is a different creature entirely. Google, as with the Internet in general, wants you to share good content.
But there is a difference between syndicating content with a few solid websites, and posting an article on hundreds or thousands of websites.
The SEs are getting better at sniffing out spam versus organically viral content that is syndicated throughout the Internet.