Duplicate content is content that appears at more than one location on the internet. A URL defines the location of a page, so when the same piece of content is reachable at more than one web address, you have duplicate content. Most cases of duplicate content arise from a technical oversight: developers build a site to deliver the best product to visitors, not strictly for spiders to crawl.

Example of Duplicate Content:

If Blogger A links to your post at “http://www.examplesite.com/keyword-1/” and Blogger B links to the same post at “http://www.examplesite.com/category-1/keyword-1/”, you have two links leading to the same landing page. Both URLs serve the same content, but now there are two distinct pathways for spiders to crawl and for traffic to follow.
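
One way to spot this situation on your own site is to fingerprint each page body and look for URLs that share a fingerprint. The following is a minimal sketch, not a full crawler: the function name, the sample page text, and the "/about/" URL are hypothetical, and it assumes you have already fetched each page's text.

```python
import hashlib
from collections import defaultdict


def group_duplicate_urls(pages):
    """Group URLs whose page bodies are identical after whitespace normalization.

    `pages` maps URL -> page text; fetching the text is out of scope here.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        # Collapse whitespace so trivial formatting differences don't hide a match.
        normalized = " ".join(body.split())
        fingerprint = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[fingerprint].append(url)
    # Only fingerprints shared by more than one URL indicate duplicate content.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical page bodies for the two URLs from the example above.
pages = {
    "http://www.examplesite.com/keyword-1/": "Ten tips for keyword research.",
    "http://www.examplesite.com/category-1/keyword-1/": "Ten tips for  keyword research.\n",
    "http://www.examplesite.com/about/": "About this site.",
}
duplicates = group_duplicate_urls(pages)
print(duplicates)
```

Exact hashing only catches pages that are identical apart from whitespace. Pages that differ in navigation or sidebar markup would need a fuzzier comparison.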

How does duplicate content happen?

Most duplicate content is created by accident. In a Content Management System, the same page can often be reached through several URLs, and different users will link to different ones. Because search engines do a good job of sifting through pages to identify the best version, there is no harsh penalty for having duplicate content. However, even though search engines can usually tell which version is the original, here are some important things to remember when combating duplicate content:

  • Utilize the Canonical Link Element to Mark Duplicate Content
  • Use 301 Redirects to send users and spiders to the canonical URL
  • Do NOT Block Googlebot in robots.txt from crawling URLs
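
The redirect and robots.txt points can be sketched in a few lines. This is an illustration under stated assumptions rather than a production setup: the path map and both function names are hypothetical, and a real site would normally configure redirects in its web server or CMS. The robots.txt check uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical map of duplicate paths to their canonical path.
CANONICAL_PATHS = {"/category-1/keyword-1/": "/keyword-1/"}


def resolve(path):
    """Return (status, headers): a 301 to the canonical path for known
    duplicates, otherwise a plain 200."""
    target = CANONICAL_PATHS.get(path)
    if target is not None:
        return 301, {"Location": target}
    return 200, {}


def googlebot_can_crawl(robots_txt, url):
    """Check whether the given robots.txt text lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


duplicate_url = "http://www.examplesite.com/category-1/keyword-1/"
print(resolve("/category-1/keyword-1/"))

# Example of rules that would hide the duplicate URL from crawlers.
blocking_rules = "User-agent: *\nDisallow: /category-1/"
print(googlebot_can_crawl(blocking_rules, duplicate_url))
```

The second check shows why the third point matters: if robots.txt blocks the duplicate URL, the crawler never requests it, so it never sees the 301 redirect or the canonical tag on that page.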

Why does combating duplicate content matter?

Adding the rel="canonical" tag tells spiders which URL is the preferred version of the content. The canonical link is essentially a tag that claims that URL as the original or official version.

An example of the tag, which belongs in the page's <head>, looks like this:

<link rel="canonical" href="http://www.examplesite.com/keyword-1.html" />

Add a canonical tag to each duplicate page so search engines know where the original piece of content lives and can send traffic to that URL.
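
To confirm that a page really declares the canonical URL you expect, you can parse its HTML and read the tag back. The sketch below uses Python's standard `html.parser`; the class and function names are hypothetical, and the sample page is a stripped-down stand-in for a real one.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.examplesite.com/keyword-1.html" />'
    "</head><body></body></html>"
)
print(find_canonicals(page))
```

A page should yield exactly one canonical URL. An empty list or several conflicting values is worth investigating.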

Because duplicate content is everywhere on the internet, it is important to note the difference between duplicate content and copied content. Duplicate content is when multiple URLs lead to the same piece of content. Copied content is when a second party copies the text and republishes it as their own, also known as plagiarism.

How can this improve your SEO?

Aside from helping search engines identify canonical URLs and sift through duplicate content across the internet, cleaning up duplicate session IDs, meta tags, and meta titles can help increase traffic. You not only make your site easier for spiders to crawl, you can also rank better for having a healthier variety of original content, which leads to more traffic. The more original content and the less duplicate content you have on your site, the stronger you will rank!
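
Session IDs are a common source of duplicate URLs, because every visitor can end up with a different address for the same page. One possible cleanup is to strip those parameters before a URL is linked or logged. This is a minimal sketch: the parameter names in `SESSION_PARAMS` are assumptions and should be replaced with whatever your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; real sites may use different ones.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def strip_session_params(url):
    """Return the URL with session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL with only the parameters that affect the content.
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = strip_session_params(
    "http://www.examplesite.com/keyword-1/?sessionid=abc123&page=2"
)
print(cleaned)
```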