Dropping URL Parameters
Google Tests Canonical Page Versions by Dropping URL Parameters
SEO firms and webmasters will be interested in remarks that Google’s Matt Cutts recently posted online. Asked whether Googlebot uses inference when spidering the web, Cutts confirmed that it does, to some extent, and that it does so by looking at redundant URL parameters. What does this mean for SEO professionals, webmasters and other interested readers? It may matter a great deal for anyone using the rel=canonical tag, which came into vogue in early 2009.
How Does Googlebot Handle Unnecessary Parameters?
Cutts explains that Googlebot applies inference while spidering by looking for duplicate parameters, or for parameters that appear again and again across a site’s URLs. When it recognizes such a parameter, it tries dropping it to see whether the page still returns the same content. For example, www.example.com/index.php?page=pagename&parameter1=green&parameter2=widget might be retried as www.example.com/index.php?page=pagename&parameter1=green to see whether the content changes significantly. If it does not, Googlebot drops the unnecessary parameter (in this case, “parameter2=widget”) to create cleaner indexed URLs and to reduce the number of redundant results in Google’s index.
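
The drop-and-compare idea can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Google’s actual implementation: the function names are invented for this example, the fetch argument is a hypothetical callable that returns a page’s content, and a real crawler would compare pages far more loosely than with strict equality.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_parameter(url, name):
    """Return url with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def redundant_parameters(url, fetch):
    """Return the parameters whose removal leaves the fetched content unchanged.

    `fetch` is a hypothetical callable (url -> content) supplied by the caller.
    """
    baseline = fetch(url)
    names = []
    for key, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key not in names:
            names.append(key)
    return [n for n in names if fetch(drop_parameter(url, n)) == baseline]


# Stand-in for a real HTTP request: this fake site ignores parameter2 entirely.
def fake_fetch(url):
    params = dict(parse_qsl(urlsplit(url).query))
    return (params.get("page"), params.get("parameter1"))


url = "http://www.example.com/index.php?page=pagename&parameter1=green&parameter2=widget"
print(redundant_parameters(url, fake_fetch))
print(drop_parameter(url, "parameter2"))
```

With the fake site above, only parameter2 is reported as redundant, which mirrors the example URL in the text.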
How Does Googlebot’s Handling Of Parameters Affect The rel=canonical Tag?
SEO professionals and other web developers were understandably excited by the arrival of the rel=canonical tag, which lets site owners specify which version of a page Google should prioritize. In the past, that choice was made entirely by complex Google algorithms, and the results were decidedly imperfect. Those algorithms are still at play, but developers can take more control by using the rel=canonical tag to tell Googlebot which version of a page should take precedence.
However, the news that Googlebot uses inference when spidering, and that it drops unnecessary parameters when deciding which content is relevant, means that the rel=canonical tag might in some cases be largely worthless. After all, if Googlebot drops a parameter, and with it a large number of URLs, from consideration while your rel=canonical tag points to one of those URLs, the tag is not going to do you much good in the long run.
This news highlights the fact that SEO is, indeed, a true science. When crafting an SEO-friendly site, experts will have to take both Googlebot’s inference practices and their interaction with the rel=canonical tag into consideration. Whether the rel=canonical tag ultimately proves worthwhile remains to be seen, and will have to be determined on a case-by-case basis.
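
One way to spot the conflict described above is to compare a page’s declared canonical URL with the URL that remains once a redundant parameter is dropped. The sketch below uses Python’s standard html.parser module. The class and function names are invented for this example, and the "inferred" URL is simply supplied by hand, since how Google actually reconciles the two signals is not something the source spells out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def canonical_conflicts(html, inferred_url):
    """True when the page declares a canonical URL that differs from inferred_url."""
    canonical = find_canonical(html)
    return canonical is not None and canonical != inferred_url


# Hypothetical page whose canonical tag still carries the redundant parameter.
page = """<html><head>
<link rel="canonical"
      href="http://www.example.com/index.php?page=pagename&amp;parameter1=green&amp;parameter2=widget" />
</head><body>Green widgets</body></html>"""

inferred = "http://www.example.com/index.php?page=pagename&parameter1=green"
print(find_canonical(page))
print(canonical_conflicts(page, inferred))
```

A check like this only flags a disagreement. Deciding which URL should win is still a judgment call to be made case by case.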