Google’s Dan Crow on Supplemental Results and XML Site Maps

This week Dan Crow, Google’s Product Manager for Crawl Systems, spoke at an SEMNE event in Providence, Rhode Island. He covered many topics, including Google’s Supplemental Results index (SI) and XML Site Maps.

Google’s Supplemental Results: Confirmation that files in SI rarely rank well

Dan gave us an update on Google’s Supplemental Results Index. 

First, a little background. For general searches, Google has a main index and a secondary Supplemental Results Index (SI). Since it appeared, we’ve seen pages moved into the supplemental results for a number of reasons: pages that Google is having trouble reaching to reindex, pages that have been removed from a site (they eventually get removed from SI), old pages left up on a site without links to them from the main site, penalized pages, pages that largely duplicate other indexed pages, and so on. Recently, Google has been using the Supplemental Results index more as an auxiliary index because of the limited capacity of the main index.

Dan said that currently most files end up in the Supplemental Results index because they have a low PageRank and/or have not been updated in a long time.

Although the situation has improved recently, it’s been our experience that once pages are moved into the Supplemental Index they very rarely reach high positions in the search results for competitive keyword phrases and they are not re-indexed very often.

Dan confirmed that files in the Supplemental Results index do indeed get indexed less often than files in the main index. He said that while files in the main index are reindexed weekly on average, those in the Supplemental Results index get reindexed every month or two on average.

He also confirmed what we have seen: files in the Supplemental Results index don’t do as well in search results as files from the main index. Dan said that for most keyword searches, a certain number of results from the main index will be displayed before any results from the Supplemental Results index are displayed.

Dan went on to say that, as we have seen, files in the Supplemental Results index are getting indexed more often than in the past. He said improvements in the crawl rate should continue to the point where there may eventually be little difference between the results from either the main or supplemental indexes and the day may come when there would be no more need for a Supplemental Results index.

So for now, if you find you have a significant percentage of files in Google’s Supplemental Results index, you need to try to determine why they are there and address the cause in order to improve your search results for these files.

Google SiteMaps: Files in “Walled Gardens” should do better in search results in the future
Dan also talked about Google’s Webmaster Tools and Sitemaps in particular.

Definition from; Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL so that search engines can more intelligently crawl the site.

The sitemap protocol is being adopted by many search engines including Google, Yahoo, and MSN.
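
To make the definition above more concrete, here is a minimal sketch in Python of how such a file might be generated. It assumes the element names and namespace of the public sitemap protocol (urlset, url, loc, lastmod, changefreq, priority). The example.com URLs, the dates, the priority value, and the build_sitemap helper are hypothetical and are not something Dan presented.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of page dicts.

    Each dict needs a "loc" key (the page URL) and may carry the
    optional metadata keys "lastmod", "changefreq" and "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Only emit the optional metadata that was actually supplied.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages; the second one stands in for a page that has
    # no crawlable links pointing to it.
    pages = [
        {"loc": "http://www.example.com/", "lastmod": "2007-01-15",
         "changefreq": "weekly", "priority": 0.8},
        {"loc": "http://www.example.com/members/report.html",
         "lastmod": "2006-11-02"},
    ]
    print(build_sitemap(pages))
```

The output is a single urlset element containing one url entry per page. Only loc is required for each entry; the other fields are hints that a search engine may or may not use.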

Dan talked about using a sitemap to let Google know about pages that it wouldn’t know about otherwise, such as files in a “Walled Garden” – files that have no links Google can follow from the public area of the site and that have no incoming links pointing to them from other sites.

However, since there are no links to these pages from other sites and no links that Google can follow from the main site, it follows that the PageRank of these files will be low. Since PageRank is such an important variable in Google rankings, how well could these files do in the search results?

Dan said that it was indeed an issue that Google is aware of and that we can expect it to be addressed, probably later this year.