Validate Your HTML

If you’re anything like us, you’re always jumping on your cross-platform soapbox at the office. On mailing lists, over lunch, at parties, and on the bus home from the party (just in case the bus driver has been duped into supporting proprietary tags in his home page) you insist that everyone should follow W3C standards for HTML, CSS, and HTTP.

“Standards, standards, standards!” you cry, as you pound on your podium. But do you actually test your own Web site for rigid W3C compliance? Be honest: Of course you don’t. You look at it with a couple of browsers, maybe a second computer, and just fix anything that looks wrong to your naked eye.

But if you really want to create cross-platform HTML and remove proprietary browser tricks from your site, you should be validating your HTML with the W3C’s HTML validator at validator.w3.org. The W3C validator is free to use and always up to date. And a more official source for standards specifications doesn’t exist.

Even better, you can augment the W3C specifications to include your own rules, such as forbidding nested tables. A “house rules” validator is an easy way to keep a team of developers from messing up each other’s work.
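To give a feel for what a house rule might look like, here is a minimal sketch of the “no nested tables” check written in Python with the standard library’s html.parser module. It is a standalone script, not the W3C validator’s own extension mechanism, and the names NestedTableChecker and find_nested_tables are ours, invented for illustration.

```python
from html.parser import HTMLParser


class NestedTableChecker(HTMLParser):
    """Records the position of every <table> opened inside another <table>."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # number of <table> elements currently open
        self.violations = []    # (line, column) of each nested <table>

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        if tag == "table":
            if self.depth > 0:
                self.violations.append(self.getpos())
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def find_nested_tables(html_text):
    """Return a list of (line, column) positions where a nested table starts."""
    checker = NestedTableChecker()
    checker.feed(html_text)
    checker.close()
    return checker.violations


if __name__ == "__main__":
    sample = "<table><tr><td><table><tr><td>x</td></tr></table></td></tr></table>"
    for line, column in find_nested_tables(sample):
        print("House rule broken: nested table at line %d, column %d" % (line, column))
```

A check like this only enforces the one rule. It makes no attempt to verify that the page is valid HTML, so treat it as a companion to the validator rather than a replacement.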

How to Use the Validator

Using the validator requires one modification to your pages: It will complain if you don’t embed a doctype declaration at the very top of every page you want to validate, stating which “document type definition” (DTD) your document conforms to. Without that declaration, the validator will try to guess which HTML standard you are trying to comply with. But to be fully W3C compliant you must put this line at the top of your page:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"  "http://www.w3.org/TR/html4/strict.dtd">

or for documents with frames:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"  "http://www.w3.org/TR/html4/frameset.dtd">

You can also use the XHTML transitional DTD:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Or, if you want to be truly fancy and future-proof, use the HTML5 declaration:

<!DOCTYPE HTML>

Just be aware that it won’t properly validate most of the time.

For a full list of valid DTDs, take a trip to the W3C’s page listing recommended doctypes.
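If you have a pile of old pages, it can help to find the ones missing a doctype before you ever visit the validator. The following is a rough Python sketch under a simple assumption: the declaration must be the first thing in the file, apart from whitespace. The function name has_leading_doctype is our own, and the check does not judge whether the DTD named is a recommended one.

```python
import re

# Matches optional leading whitespace followed by <!DOCTYPE html ...>.
# The keyword and root element name are matched case-insensitively.
DOCTYPE_RE = re.compile(r"\s*<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def has_leading_doctype(html_text):
    """Return True if the page starts with a doctype declaration."""
    return DOCTYPE_RE.match(html_text) is not None


if __name__ == "__main__":
    page = (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">\n<html></html>'
    )
    print(has_leading_doctype(page))
    print(has_leading_doctype("<html></html>"))
```

Note that an XHTML page that opens with an XML declaration would fail this simple test, so adjust the pattern if your pages use one.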

Then open validator.w3.org. Enter the URL of your page into the box labelled “URI” (don’t ask why it says “URI” instead of “URL” or this column will get a lot longer), and hit Return. Almost instantly, you’ll get a list of all the HTML errors on your page, with a link to a standard explanation for each, like this:

Error at line 25:</head>

end tag for element "HEAD" which is not open
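If you check the same pages over and over, typing URLs into the form gets old. One option is to generate a link that hands the page address straight to the validator. The sketch below assumes the validator accepts the address in a “uri” query parameter on a /check path; that is our assumption about the service, so confirm it against the validator’s own documentation before relying on it. The function name validator_link is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter name; verify against the validator's docs.
VALIDATOR_ENDPOINT = "https://validator.w3.org/check"


def validator_link(page_url):
    """Build a link that asks the validator to check page_url."""
    return VALIDATOR_ENDPOINT + "?" + urlencode({"uri": page_url})


if __name__ == "__main__":
    print(validator_link("http://example.com/index.html"))
```

The page address is percent-encoded by urlencode, so any query string in your own URL survives the trip intact.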

Figuring out how to fix the errors is up to you and can be time-consuming. The explanations aren’t always helpful for struggling developers, and we haven’t figured out a way to get the validator to stop complaining for pages and pages about questionable HTML in some of the ad banners we don’t create ourselves. As a workaround, we created fake, validated ad HTML fragments, and replace the ad server calls with these in pages we want to validate while developing them.
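The ad-swapping workaround can be sketched in a few lines of Python. This assumes your templates wrap each ad server call in a pair of marker comments, here “ad:start” and “ad:end”. Those markers, the placeholder markup, and the function name stub_out_ads are all made up for illustration, not something the validator or any ad server provides.

```python
import re

# Hypothetical marker comments that bracket each ad server call in a template.
AD_BLOCK_RE = re.compile(
    r"<!--\s*ad:start\s*-->.*?<!--\s*ad:end\s*-->",
    re.DOTALL,
)

# A hand-written fragment we have already confirmed is valid on its own.
PLACEHOLDER_AD = (
    '<div class="ad"><a href="http://example.com/">'
    '<img src="/img/placeholder-ad.gif" alt="Advertisement" '
    'width="468" height="60"></a></div>'
)


def stub_out_ads(html_text, placeholder=PLACEHOLDER_AD):
    """Replace every marked ad block with a known-valid placeholder."""
    # A function is used as the replacement so backslashes in the
    # placeholder are never interpreted as regex escapes.
    return AD_BLOCK_RE.sub(lambda match: placeholder, html_text)


if __name__ == "__main__":
    page = (
        "<p>Story text</p>"
        '<!-- ad:start --><script src="http://ads.example.com/serve?a=1&b=2">'
        "</script><!-- ad:end -->"
        "<p>More story</p>"
    )
    print(stub_out_ads(page))
```

Run your development pages through something like this before validating, and keep the real ad calls in the version you publish.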

Congratulations – You Get a Button!

Once you’ve fixed all your errors, running your page through the validator will reward you with this button:

Congratulations, your document validates as HTML 4.01 Transitional!


To show your readers that you’ve taken the care to create an interoperable web page, you may display this icon on any page that validates.

A Couple of Catches

One small problem with the W3C validator is that it won’t work on pages hiding behind a firewall. If you aren’t comfortable copying your pages under development to somewhere outside your firewall, there are local validation tools you can use, such as the ones that come with whatever developer environment you’re using (Dreamweaver, etc.). Once your pages are live to the world, you can run them through the W3C site.

Another problem you’ll encounter is pages that call in ad banners from an ad server. Your page may be standards-compliant, but there’s no way to guarantee that the ads scheduled for it will be. Fixing them all yourself, or getting all your advertisers to comply, just isn’t going to happen. To make it worse, many ad servers use a bare “&” to separate CGI parameters. Unescaped ampersands in URLs have been verboten in every HTML standard since HTML 4.0; each one has to be written as “&amp;” in your markup. Good luck getting your US$50,000 ad server software changed.
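For the URLs you do control, escaping the ampersands is easy to automate. Here is a small Python sketch that rewrites bare ampersands as “&amp;” while leaving anything that already looks like a character reference alone. The function name escape_ampersands and the sample ad URL are hypothetical.

```python
import re

# An "&" that does not already begin a character reference
# such as &amp; or &#38; or &#x26;
RAW_AMP_RE = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_ampersands(url):
    """Return url with every bare ampersand written as &amp;."""
    return RAW_AMP_RE.sub("&amp;", url)


if __name__ == "__main__":
    raw = "http://ads.example.com/serve?site=1&zone=2&size=468x60"
    print('<a href="%s">ad</a>' % escape_ampersands(raw))
```

Because already-escaped references are skipped, running the function twice on the same URL does no harm. It cannot rescue a parameter whose name happens to spell an entity followed by a semicolon, but that case is rare in practice.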

But the most aggravating problem you’ll discover with validating your HTML is this: Meeting the W3C specifications doesn’t mean that all browsers will agree on how to render your page. You can end up with pages that are standards-compliant but that lay out differently in Firefox for Linux or Safari for Macintosh. This can be unacceptable to a designer who wants exact control over each pixel. You may have to do some evangelizing (or just old-fashioned complaining) in order to avoid using browser-specific HTML in the long run.

Writing valid HTML can be a lot of work up front, especially if you’re retrofitting old content. To be honest, this column itself probably fails the 4.01 Transitional test. But if you can put the work in, and win a few fights with your co-workers over standards, a trip to validator.w3.org now will save you a lot of catch-up work down the road.