Checking for & Dealing with Duplicate Content on your Website [VIDEO]
Duplicate content on your site can cause problems with search engines indexing your content and in extreme cases can result in a penalty. In this video we explain what duplicate content means and how you find it on your site.
Done with this one? View the next video: Making Sure Your Site Loads Quickly
Looking for the first video in the series? Find it here: Making Sure Your Sites Content Can Be Indexed
Hi, this is Chris from Dejan SEO. In this tip, I’m going to explain duplicate content, how it’s spotted, and what to do about it. In this video, we’ll be using Google Webmaster Tools and I’ll quickly explain a cool function in the web crawling tool, Screaming Frog.
Duplicate content is where two or more URLs contain the same content. This can cause problems for your SEO, because it confuses search engines. First, they don’t know which version of the page to show in results. They also don’t know where the link value should be assigned or even if it should be divided between multiple versions. Finally, it can be unclear which version to rank for specific queries. The overall result is that websites can lose rankings and traffic. Duplicate content is most often caused by content management systems that create multiple URLs for the same content, or by print and PDF versions of a page.
There are a couple of ways to spot duplicate content. First, you can look for duplicate titles and descriptions in Webmaster Tools. Just open your account, go to Search Appearance, then HTML Improvements. Here, you can see where there are duplicate descriptions and titles. This gives you a strong indication of where there might be problems. If we look at an example in the Duplicate Titles, we can see a page which is showing a large number of Facebook tracking tags in the URL. This indicates that these pages aren’t using a proper canonical tag. I’ll explain this more in a moment.
If you’d like to try a more advanced method, you can use the web crawling tool called Screaming Frog. If you have a site under 500 pages, the free version should be enough to get you started. Once you’ve run a crawl of your site, you can look at the internal pages and scroll to the far right. Here you’ll see a hash for each of the pages. This is a way of summarizing all the content on the page into a single value, so two pages with identical content get the same hash. If you sort by the hash, you’ll be able to spot the values that match. You could also export to Excel and use conditional formatting to highlight any duplicate values. If you’re not comfortable using this more advanced method, don’t worry, using Webmaster Tools will be all that’s needed in most cases.
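To make the hash idea concrete, here is a minimal Python sketch of the same approach: hash each page's content and group the URLs that share a hash. The video doesn't say how Screaming Frog computes its hash, so the use of SHA-256, the whitespace normalization, and the names `content_hash` and `find_duplicates` are all assumptions for illustration. The example URLs are made up.

```python
import hashlib
from collections import defaultdict


def content_hash(html: str) -> str:
    """Summarize a page's content as a single hex value."""
    # Collapse runs of whitespace so trivial formatting differences
    # do not produce different hashes (an assumption of this sketch).
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages):
    """Return groups of URLs whose content hashes match.

    `pages` maps each URL to the HTML fetched from it.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[content_hash(html)].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl results: two URLs serve the same content.
pages = {
    "https://example.com/page": "<h1>Widgets</h1>\n<p>All about widgets.</p>",
    "https://example.com/page?print=1": "<h1>Widgets</h1> <p>All about widgets.</p>",
    "https://example.com/other": "<h1>Gadgets</h1><p>All about gadgets.</p>",
}

for group in find_duplicates(pages):
    print("Duplicate content:", ", ".join(group))
```

This is the scripted equivalent of sorting by the hash column: any group it prints is a set of URLs worth checking.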
So what do you do once you’ve found duplicate content? There are several ways of solving the problem. The first and generally most effective is to make sure that your site is using canonical tags to point to the preferred version of the URL. This removes a lot of the duplicate content caused by link tracking or campaign tags. This is the best approach where you might still want a particular version of a page to be accessible, like a landing page for your paid advertising.
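As a rough illustration of what a canonical tag does for tracking URLs, the Python sketch below strips tracking parameters from a URL and builds the `<link rel="canonical">` element that would go in the page's head. The list of tracking parameters (`utm_*`, `fbclid`, `gclid`) and the function names are assumptions for this example. Your own site may use different tags, and in practice your content management system would normally generate the tag for you.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; adjust for the tags your campaigns use.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid"}


def canonical_url(url: str) -> str:
    """Return the URL with tracking parameters and fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the canonical link element for a page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_tag(
    "https://example.com/landing?utm_source=facebook&fbclid=abc123&id=7"
))
```

Every tracked variant of the landing page then points search engines at the same preferred URL, while the tracked URLs themselves keep working for visitors.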
Another method of removing duplicate content is a 301 redirect. This automatically sends anyone or anything trying to access the page to another page. This is best used when you have no need of a particular URL and you want to make sure another version is indexed. It’s important that where you’re using the 301, you try and make sure all your links on your site are pointing to the newer preferred version.
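Most sites set up 301 redirects in their web server or content management system, but the mechanics are easy to show in code. The following is a minimal sketch written as a Python WSGI application. The `REDIRECTS` mapping and its paths are hypothetical, and a real site would usually do this in server configuration instead.

```python
# Hypothetical mapping of retired URLs to their preferred versions.
REDIRECTS = {
    "/old-page": "/new-page",
    "/products.php": "/products",
}


def app(environ, start_response):
    """Send a 301 for retired paths, otherwise serve a placeholder page."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"OK"]
```

To try it locally, you could serve `app` with `wsgiref.simple_server.make_server` from the standard library and request one of the old paths.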
Finally, if you have some duplicate content that doesn’t have another version on your site and you still need people to access it, you can use the robots noindex meta tag. This tells search engines not to index the page. You might use this approach if you have content on your site which has been duplicated on other domains, but which you still want users to be able to access, such as privacy pages or terms and conditions.
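If you want to confirm that a page really carries the noindex tag, a small check like the Python sketch below can help. It looks for a `<meta name="robots">` element whose content includes `noindex`. The class and function names are made up for this example, and it only inspects the HTML, so it won't catch a noindex sent through an HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record whether a robots meta tag with 'noindex' is present."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content value is a comma-separated list of directives.
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```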
Generally speaking, duplicate content is one of those things that pops up when your site is first being audited. Once you have best practices in place, like canonical links, it tends to disappear and requires little active management.
That was our quick video tip on finding and removing duplicate content. If you want to use the tips in this video, you can get started with the action shown here. You can also find supporting articles on HubSpot and the Dejan SEO website.