Webmonkey is a property of Wired Digital.

Sitemap Files Ensure Google Finds All Your Webpages

Developer Jeff Atwood offers an inside look at the importance of using a sitemap.xml file to ensure that search engines find all of your website’s content. He shows how, before he set up a sitemap.xml file, Google’s search bots were ignoring large portions of his recently launched StackOverflow site.

The problem was that StackOverflow paginates content using the dynamic ?page=n URL format. You might assume, as Atwood did, that the Google bots could figure this out (after all, probably at least half the web uses the same format), but for whatever reason they weren’t following these URLs.

As a result, less than half of StackOverflow was being indexed.

The solution, should your site be in a similar situation, is to create a sitemap.xml file and use it to explicitly tell Google and other search engine spiders what and where your pages are.
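
As a rough illustration of what such a file contains, here is a minimal Python sketch that builds a sitemap listing paginated ?page=n URLs. It follows the sitemaps.org 0.9 format (a urlset element containing one url/loc pair per page). The function name build_sitemap, the example.com address and the page count are assumptions made for this example; they are not taken from Atwood's setup.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, page_count):
    """Return sitemap XML as a string, listing base_url?page=1..page_count.

    build_sitemap is a hypothetical helper written for this example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for n in range(1, page_count + 1):
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # Each paginated URL is listed explicitly so a crawler
        # does not have to discover it by following links.
        loc.text = f"{base_url}?page={n}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder address and page count, for illustration only.
sitemap_xml = build_sitemap("https://example.com/questions", 3)
print(sitemap_xml)
```

Saving the returned string as sitemap.xml at the root of the site would give search engines an explicit list of the paginated pages. A real site would more likely generate the URL list from its database than from a fixed page count.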

As the Google Webmaster Q&A says, sitemaps are particularly helpful if:

  • Your site has dynamic content.
  • Your site has pages that aren’t easily discovered by Googlebot during the crawl process — for example, pages featuring rich AJAX or Flash.
  • Your site is new and has few links to it. (Googlebot crawls the web by following links from one page to another, so if your site isn’t well linked, it may be hard for us to discover it.)
  • Your site has a large archive of content pages that are not well linked to each other, or are not linked at all.

So, if you’re having trouble getting your website indexed by Google and other search engines, now is a good time to create and use a sitemap file. For more tips on optimizing your websites, be sure to check out our tutorial on Site Optimization.
