Duplicate Content and SEO

Duplicate content can hurt your website, but the good news is that it is avoidable. Keep in mind that the duplicate content you can actually act on is your own: what other websites do with your content is out of your control, but you have full control over what you publish yourself.

Negative effects of duplicate content include dilution of anchor text and fragmentation of your PageRank, among others. The most reliable test for duplication is the 'value' question: ask yourself whether a page adds any extra value for the reader. Do not reproduce content without a reason.

Is a page fresh content or a slight rewrite of a previous page? Make sure you publish unique, valuable content. Your main concern should be not to send search engines any bad signals, because they can identify duplicate content from a great many signals. Just as with ranking, notorious duplicate content will be pinpointed and flagged.

Note that every website has some degree of duplicate content, and that is perfectly fine. The concern is how you manage it. There are genuine reasons for duplication, such as:

1) Varying document formats, where the same content is offered as Word, HTML, PDF, etc.
2) Genuine/legitimate content syndication, such as RSS feeds.
3) Use of common code such as JavaScript, CSS, or other boilerplate elements.

For point one: when your content is delivered in alternative formats, select a default format and keep search engines from indexing the others, while still allowing users to access them. You can do this by adding the right rules to your robots.txt file, and by excluding the URLs of the other versions from your sitemap as well. On the same point, use the nofollow attribute on links to the duplicate versions, since other people are able to link to them.
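
As a minimal sketch, assuming the PDF and Word versions live under /pdf/ and /doc/ directories (the paths here are illustrative, so adjust them to your own site layout), the robots.txt rules might look like this:

    User-agent: *
    Disallow: /pdf/
    Disallow: /doc/

And a link to an alternate version can carry the nofollow attribute so it passes no link signals to the duplicate page:

    <a href="/pdf/guide.pdf" rel="nofollow">Download as PDF</a>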

For point two: if one of your pages renders an RSS feed from another website, and ten other websites run pages based on the same feed, search engines may treat it as a milder form of duplication. You will not risk a duplicate-content problem unless large parts of your website are built on such feeds.
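
A common safeguard when republishing syndicated content is a cross-domain canonical link in the page head that points back to the original source, so search engines know which version to credit. The URL below is a placeholder:

    <link rel="canonical" href="https://example.com/original-article" />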

For point three: keep common code from being indexed. If you use CSS, put it in its own folder and make sure that folder is blocked from crawling in robots.txt; do the same for JavaScript and any other shared external code. One more point to note on duplicate content: search engines are likely to count every URL separately. Unless you tactfully manage two URLs that serve similar content, they will be treated as duplicates.
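
As a sketch, assuming your stylesheets and scripts live in /css/ and /js/ folders (again, adjust the paths to your own layout), the robots.txt entries would be:

    User-agent: *
    Disallow: /css/
    Disallow: /js/

For the two-URLs case, the same canonical technique shown earlier applies: pick one URL as the preferred version and have the other point to it with a rel="canonical" link, so the duplicate signals consolidate on a single page.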