Search engine optimization can be approached from many angles. One important factor in website optimization is cleaning up your content regularly to keep it up to date. You can compare a website’s content to food. When it is fresh, it is attractive and draws lots of visitors. When it becomes stale, it becomes unwanted, which results in a drop in rankings.
Why Should You Clean Up Old Content?
Content quality now matters as much as finding the right keywords. For a website to maintain consistent rankings, it is also important to carry out a regular cleanup of old content. But how does that help SEO in terms of conversions, backlinks, rankings, clicks, impressions, and traffic?
Before you embark on cleaning up your content, you should know why you are doing so in the first place. A cleanup helps you to:
- Make optimum use of the time the Google crawl bot spends on your site.
- Avoid having two pages on the same topic, which could result in Google ranking the page you do not want instead of the one you want.
- Reduce index bloat caused by too many thin pages in Google search.
- Prevent users from landing on thin or less useful pages.
- Protect the value of your backlinks, since thin pages can reduce the visibility of your more in-depth pages.
Example of Index Bloat
Index bloat is the SEO term for having far more URLs indexed by Google than there are actual content pages on the site. This can happen due to the use of taxonomies like categories, tags and archives. Although archive pages such as the blog feed help readers follow your content, they also create many extra URLs on your site. Let us take the WordPress content management system as an example:
- Page > page1
- Page1 Category > category1 and category2
- Page1 Tags > tag1, tag2, tag3 and tag4
- Author > author archive
- Date > date archive
- Blog > blog archive
- RSS > blog feed, category feed and 4 tags feed
As you can see, a single page can create entries in as many as 16 different places. The problem comes when you are not using categories, tags or feeds properly. You need to group similar content so that users can easily read based on their interests. However, using too many tags leads to individual tag pages with only one or two posts listed. When these taxonomy pages are static, Google may start showing them in search results instead of your original page.
This is an unintentional example caused by the misuse of taxonomies and archives. However, there are also cases where thin content is created intentionally to increase the number of pages on a site. These thin pages hurt your rankings by outranking your valuable content.
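The count of 16 in the WordPress example above can be checked with a short sketch. The URL paths below are hypothetical placeholders chosen for illustration, not the exact permalinks your site will use, and the single category feed follows the article's own count.

```python
# Hypothetical post with the taxonomies from the example above.
post = {
    "slug": "page1",
    "categories": ["category1", "category2"],
    "tags": ["tag1", "tag2", "tag3", "tag4"],
}

def listing_urls(post):
    """Return every URL on which the post's content can appear."""
    urls = [f"/{post['slug']}/"]                                # the page itself
    urls += [f"/category/{c}/" for c in post["categories"]]     # category archives
    urls += [f"/tag/{t}/" for t in post["tags"]]                # tag archives
    # Author, date and blog archives (placeholder paths).
    urls += ["/author/author-archive/", "/date-archive/", "/blog/"]
    # Blog feed plus one category feed, as counted in the example.
    urls += ["/feed/", "/category/category1/feed/"]
    urls += [f"/tag/{t}/feed/" for t in post["tags"]]           # one feed per tag
    return urls

urls = listing_urls(post)
print(len(urls))  # 16
```

Every extra tag adds two more URLs in this sketch (the tag archive and its feed), which is why heavy tagging inflates the index so quickly.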
How to Find Low Value Content?
The effort depends on the size of your website, and a detailed analysis of your content is necessary to determine which pages should go. The focus is on content that damages SEO or has no impact on it. Generally, your content should hold credible and beneficial information, which helps search engines interpret it as quality content.
However, finding low value content on your site is not an easy task. You can use the tools below in combination to evaluate the pages on your site:
- Use the data from Google Analytics to find the pages with few visitors. However, you may not be able to find pages that have no visitors at all, since the reports only list pages that were actually viewed.
- When you are using WordPress or another content management system, find a plugin that shows the page views of each page or post. For example, the popular Newspaper WordPress theme records page views and shows them in the WordPress admin dashboard against each post. You can then filter pages with zero or low views.
- Count the words on each page to filter out thin content with fewer than 300 words.
- Analyze the taxonomies and archive pages to find the duplicate pages.
When you find low value content, search Google for the relevant keywords and check the results. If the low value content is shown in the search results instead of the intended pages, you can assume it has started affecting your site’s ranking.
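The word-count and page-view checks from the list above could be combined in a small script. This is only a sketch: the `Page` class, the sample pages and the view numbers are made up for illustration, and in practice you would load the data from your analytics export or CMS.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    text: str
    views: int  # page views over the period you are reviewing

def word_count(text):
    """Count whitespace-separated words in the page text."""
    return len(text.split())

def low_value_pages(pages, min_words=300, max_views=0):
    """Flag pages with fewer than min_words words or at most max_views views."""
    flagged = []
    for page in pages:
        reasons = []
        if word_count(page.text) < min_words:
            reasons.append("thin")
        if page.views <= max_views:
            reasons.append("low views")
        if reasons:
            flagged.append((page.url, reasons))
    return flagged

# Made-up sample data.
pages = [
    Page("/guide/", "word " * 1200, views=540),
    Page("/tag/misc/", "word " * 40, views=0),
    Page("/old-news/", "word " * 800, views=0),
]

for url, reasons in low_value_pages(pages):
    print(url, reasons)
```

A flagged page is only a candidate for review. Whether to keep, update, merge or delete it is still a manual decision.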
Planning for Content Deletion
After evaluating the content, below are some of the actions you can pursue.
1. Do Nothing
As mentioned, it is tricky to judge the value of a page that gets little traffic but has relevant content. With countless blogs available on the internet, you should not expect every page on your site to rank at the top of Google search. Unless you are confident the page is outdated, thin or duplicate, you do not need to take any action if your old content holds small value but does no damage.
2. Updating Old Content
Not all old pages deserve to be deleted; some only need an update. These are pages that still hold relevant information and value. They may have backlinks from high authority sites and still deliver good statistics.
For instance, if you have an article on your blog about “google algorithm update”, that kind of article has to be updated frequently because it becomes outdated over time. Updating it keeps it relevant to the audience reading it.
Transforming old articles into new posts takes serious work, but it is rewarding as it makes them more valuable and up to date. If the updated post is published under a new URL, remember to use a 301 redirect from the old URL to direct traffic to the newly updated post.
3. Remove Low Value Content
The other option when cleaning up your old content is removing it entirely. This should be done once an article no longer holds any value or, worse, is damaging your site’s SEO. For example, you may have written an article before the Google Panda update, when keyword stuffing was a top SEO strategy irrespective of content quality. You might want to remove it to avoid the Panda update effect, as the article has likely lost its value, especially if it was low quality or thin.
4. Combine Pages
- Combine similar content on the same topic into one new content page.
- Delete or merge taxonomy and archive pages. For example, a single-author site should not have a separate author archive or date archive, as it will be the same as the blog index page.
Informing Search Engines About Your Action
Deleting the content alone will not cut it; you also have to make sure search engines like Google know these pages no longer exist by sending a signal that the search bot can read. This will prevent the deleted pages from being indexed further. How do you go about this?
1. Use 301 Redirect
After you update old content or combine smaller old posts into a new one, it is necessary to inform search engines about the change in URL. A 301 redirect guides search engines and visitors from the old posts to the new content. This is essential so that you do not lose the ranking and backlink value of the removed pages.
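When several rounds of cleanup pile up, old redirects can end up pointing at URLs that were later redirected again. The sketch below shows one way to keep a redirect map tidy by pointing every old URL straight at its final destination. The URLs and the `final_target` helper are hypothetical, and how you load the resulting map into your server or CMS depends on your setup.

```python
# Hypothetical map of retired URLs to their replacements.
REDIRECTS = {
    "/seo-tips-old/": "/seo-tips/",
    "/seo-tips-older/": "/seo-tips-old/",   # earlier redirect that now forms a chain
    "/short-post-a/": "/combined-guide/",
    "/short-post-b/": "/combined-guide/",
}

def final_target(path, redirects):
    """Follow redirects until a URL with no further redirect, guarding against loops."""
    seen = set()
    while path in redirects:
        if path in seen:
            raise ValueError(f"redirect loop at {path}")
        seen.add(path)
        path = redirects[path]
    return path

# Point every old URL directly at its final destination.
flattened = {old: final_target(old, REDIRECTS) for old in REDIRECTS}

for old, new in flattened.items():
    print(f"301 {old} -> {new}")
```

With the flattened map, a request for the oldest URL reaches the current page in a single hop instead of passing through each intermediate redirect.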
2. Showing 404 or 410
Old content with no impact that has been removed should return the 404 Not Found status code, which tells search engines that the content is no longer available. Once the page is deleted, the server returns the 404 Not Found message to both visitors and search engines.
Initially, the page may still show up in search results, but eventually it will drop out. Another code you can use is 410 Gone, which tells search engines that the content was removed on purpose and is gone forever.
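The choice between these status codes can be summarized in a few lines. This is a simplified sketch of the decision only, not a web server. The `REDIRECTS`, `REMOVED` and `LIVE` collections and the `respond` function are hypothetical names used for illustration.

```python
from http import HTTPStatus

REDIRECTS = {"/short-post-a/": "/combined-guide/"}   # merged or moved pages
REMOVED = {"/keyword-stuffed-post/"}                 # pages deleted on purpose
LIVE = {"/combined-guide/"}                          # pages that still exist

def respond(path):
    """Return (status, redirect target or None) for a requested path."""
    if path in REDIRECTS:
        return HTTPStatus.MOVED_PERMANENTLY, REDIRECTS[path]   # 301
    if path in REMOVED:
        return HTTPStatus.GONE, None                           # 410
    if path in LIVE:
        return HTTPStatus.OK, None                             # 200
    return HTTPStatus.NOT_FOUND, None                          # 404

for path in ["/short-post-a/", "/keyword-stuffed-post/", "/combined-guide/", "/typo/"]:
    status, target = respond(path)
    print(path, int(status), status.phrase, target)
```

Checking redirects first means a page that was merged elsewhere never answers with 404 or 410, so its backlinks keep pointing somewhere useful.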
3. No Index Meta Tag
Another means of keeping your old content out of search engines is the noindex directive, which tells search engines not to index the page. You can add the HTML meta robots tag with the value noindex within the head section of a page to inform Google not to show it in search results.
You can view all noindexed pages on your site in the “Coverage” section of Google Search Console and analyze whether those pages are still getting traffic. This will help you take action, if necessary, to bring those pages back into search results.
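To audit which pages already carry the directive, you could scan their HTML for a robots meta tag of the form `<meta name="robots" content="noindex">`. The sketch below uses Python's standard `html.parser` module. The `has_noindex` helper and the sample markup are illustrative assumptions, and fetching the pages is left out.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Records whether a page contains a robots meta tag with noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = [d.strip().lower() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True

def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex

# Sample markup for a page that should stay out of search results.
old_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(old_page))  # True
```

The check only covers the meta tag in the page source. A noindex directive sent another way would not be detected by this sketch.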
4. Use Canonical URL
The other option is to set up a canonical URL when you really want to keep two pages with similar content on your site. Though you have two pages with two different URLs, you can inform search engines that both pages have a single source. You can do this by adding a canonical link tag (rel="canonical") that points to the preferred URL in the head section of the duplicate pages.
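The canonical hint is written as a `<link rel="canonical" href="...">` element in the page's head. As a rough sketch, the snippet below reads that element from a page so you can confirm a duplicate points at the page you intend. The `canonical_url` helper and the example.com address are placeholders for illustration.

```python
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    """Captures the href of the first rel="canonical" link element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attrs.get("href")

def canonical_url(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical

# Sample markup for the duplicate page.
duplicate_page = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/original-page/">'
    '</head></html>'
)
print(canonical_url(duplicate_page))  # https://example.com/original-page/
```

Running this over both pages lets you verify that each one declares the same preferred URL.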
Your site maintenance routine should include the cleanup of old content on your website. Failing to comb through your archive periodically could bring unnecessary problems. Outdated content could give your audience inadequate or misleading information, and it could end up affecting your rankings because search engines deem it not valuable or useful. This is unhealthy for your site’s SEO, which is why you need to assess and evaluate your old content and determine what to do with it. The result of your evaluation will influence your decision on whether to remove or update your content. In most cases, you need to merge or delete the pages and set up appropriate redirection to inform search engines.