Discover the indexing of the future: Google SiteMap

Google proposes what may become the new way of indexing web pages.
Search engines such as Google and Yahoo use spiders to collect information from web pages published on the Internet. Once they have collected this information, they process it with a ranking algorithm so that, when a user visits their search pages and looks for a term or phrase, results can be ordered quickly.

Search engine spiders periodically visit web pages that are published on the Internet and automatically update information about their content.

Until now, spiders would go into the root directory of a domain, look for the robots.txt file to make sure the site wanted to be indexed, and then proceed to visit all the links they found on the web page, thus recording all the content on the page.

Google SiteMaps is going to revolutionize the way we index web pages.

It's not just that Google now reads more carefully the sitemaps that people already include on their web pages. It's a radically new way of indexing page content: Google proposes the creation of an XML sitemap following certain specifications, one that gives its spiders all the information they need and lets them reach URLs that until now may have remained hidden for reasons beyond the webmasters' control.

Google wants to access all the content of web pages in the easiest and most efficient way possible. Automated crawling is already far more efficient than the human-curated indexes of the past (who doesn't remember visiting a search engine and manually entering our site's description, the keywords we wanted to be found by, and the site's URL? But that is Internet prehistory by now). What Google offers us now is much better still.

It's all about making a special sitemap available to spiders.

To create this sitemap, you just need an application installed on your server (there are versions for all operating systems) that produces a site map in a specific format. The application Google offers can generate the map from a list of the website's URLs, from the website's directories, or from the server logs (ideal for dynamic pages).
Once the sitemap is built according to Google's specifications, we can register it with Google Sitemaps, and Google will index it automatically in less than 4 hours.
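To give an idea of the format, here is a minimal sketch that builds a sitemap of this kind from a plain list of URLs. The XML namespace follows Google's original protocol documentation; the URLs, dates, and function name are illustrative examples of our own, not part of Google's tool.

```python
# Minimal sketch: build a Google-format XML sitemap from a list of URLs.
# The namespace matches Google's original protocol docs; all URLs and
# dates below are example values, not real site data.
from xml.sax.saxutils import escape

def build_sitemap(urls):
    """urls: list of (location, last-modified date) pairs."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">']
    for loc, lastmod in urls:
        lines.append('  <url>')
        lines.append(f'    <loc>{escape(loc)}</loc>')    # escape &, <, >
        lines.append(f'    <lastmod>{lastmod}</lastmod>')
        lines.append('  </url>')
    lines.append('</urlset>')
    return '\n'.join(lines)

sitemap = build_sitemap([
    ("https://www.example.com/", "2005-06-03"),
    ("https://www.example.com/articles/google-sitemap", "2005-06-03"),
])
print(sitemap)
```

In practice, Google's own generator produces this file for you; the sketch only shows what the spiders expect to find.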

Google allows webmasters to set up a cron job that regenerates the sitemap as often as every hour (useful for sites whose content changes frequently) and automatically resubmits it to Google Sitemaps. This way, the spiders learn about newly created pages immediately and can incorporate them into the index.
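The resubmission step can be sketched as follows. Google's documentation describes an HTTP "ping" endpoint for notifying it of an updated sitemap; the endpoint pattern below follows those docs, but treat the exact URL, and the example site, as assumptions to verify against the specification links at the end of this article.

```python
# Sketch: build the "ping" URL a cron job would request after each
# sitemap regeneration. SITEMAP_URL is an example value for our site.
from urllib.parse import quote, urlparse, parse_qs

SITEMAP_URL = "https://www.example.com/sitemap.xml"

# The sitemap URL must be percent-encoded when passed as a parameter.
ping = ("http://www.google.com/webmasters/sitemaps/ping?sitemap="
        + quote(SITEMAP_URL, safe=""))
print(ping)

# Round-trip check: decoding the query gives back the original URL.
decoded = parse_qs(urlparse(ping).query)["sitemap"][0]
```

From cron, an hourly entry running something like `wget -q -O /dev/null "<ping URL>"` right after the generator finishes would keep Google's copy of the sitemap fresh.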

Advantages of this application:

No matter how hard your website is for spiders to navigate, with a site map created by the Sitemap Generator, Google's spiders will always find the URLs of all your pages.

Another great advantage is the rapid indexing of the entire site's content: in less than 4 hours, the spiders visited up to 50,000 links on our website. For websites with more URLs than that, Google recommends splitting the map into several sitemaps and publishing a sitemap index.
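The sitemap index Google recommends for larger sites is itself a small XML file that lists the individual sitemaps. A sketch, again using the namespace from Google's original protocol docs, with file names that are purely illustrative:

```python
# Sketch of a sitemap index for sites above the 50,000-URL-per-file limit.
# The part file names and site URL are example values of our own.
parts = ["sitemap1.xml.gz", "sitemap2.xml.gz"]

index = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<sitemapindex xmlns="http://www.google.com/schemas/sitemap/0.84">']
for name in parts:
    # Each entry points at one of the individual sitemap files.
    index.append(f'  <sitemap><loc>https://www.example.com/{name}</loc></sitemap>')
index.append('</sitemapindex>')

sitemap_index = "\n".join(index)
print(sitemap_index)
```

Only the index file needs to be registered with Google Sitemaps; the spiders then fetch each listed sitemap in turn.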

Disadvantages of this application:

It requires some programming knowledge, so either ISPs will offer this service as added value for their customers, or many websites will go without it and continue to be indexed by the regular spiders.

The sitemaps already available on most websites are not compatible with Google's format. Google requires an XML document with certain specifications.

With this project, Google is undoubtedly looking for a way to improve the indexing of web pages and to be able to include in its indexes pages that until now were lost in a sea of links within our sites.

Google has created the Sitemap Generator and the express indexing service and offers them completely free of charge. It will be interesting to see Yahoo's reaction, since Yahoo charges $49, $20, or $10 for its express indexing service, depending on the number of URLs we want indexed on an accelerated basis.

At the moment we do not have any first-hand results regarding the effectiveness of indexing through Google SiteMaps. As soon as we have installed the new sitemap on several websites and can compare the increase in the number of indexed pages and the frequency of spider visits, we will write a new article reporting the results. See you then.

Post note: It's been a few months since we wrote this article, and the results have been very good. An entire website is re-indexed in less than 24 hours. This is ideal when a new website comes online: you can have it indexed in no time, without waiting months and months for Google's spiders to read all of its content.

Additional information:

URL with information about the Google sitemap:
https://www.google.com/webmasters/sitemaps/docs/en/about.html

URL with specifications about the Google sitemap:
https://www.google.com/webmasters/sitemaps/docs/en/protocol.html
