XHTML 2 Dies a Lonely Death, Makes Room For HTML 5

The web’s governing body has taken XHTML 2.0 off life support. The World Wide Web Consortium, the group charged with overseeing the languages that power the web, has decided not to renew the charter of the XHTML2 working group, which is set to expire at the end of 2009.

Quick, web developers, the sky is falling — panic!

Actually, the sky is just fine. In fact, it’s considerably clearer than it used to be, and there’s certainly no reason to panic over the death of XHTML 2.0. The supposed successor to XHTML 1.x and the markup language that was once hailed as the next evolutionary step for the web has for all intents and purposes been dead for years. All the W3C has done is give it a proper headstone. And with the burial complete, the W3C can put all its efforts into the real future of the web — HTML 5.

But what about the much-touted advantages of XHTML that the web-standards purists (including us) were fussing about just a few years ago?

To understand how and why XHTML 1.x gained so much favor and why its successor fell out of favor, we need to go back in time a bit and take a closer look at HTML 4.

HTML 4 is a very loose language full of options. So many options, it fostered a bunch of experimental new ideas for how to build web pages — some good, some bad. However, to say that HTML 4 encourages bad code (which was the XHTML rallying cry) is like blaming the English language for producing bad novels. HTML 4 code can in fact be well-formed and semantically valid, so long as the authors know what they’re doing and adhere to the spec.

However, XHTML 1.0 was much stricter, and the validation tools were much better at pointing out bad code, which is at least partially responsible for its popularity. If you were lazy and wanted to make sure that your code was well-formed, XHTML 1.x made it much easier to check.

But that was never the real purpose of XHTML. The X isn’t there because it’s cool; it’s there because XHTML is really XML.

As Henri Sivonen, who is working on the HTML 5 spec, points out, there are really “two meanings to XHTML: technical and marketing.”

The technical aspect covers authors who genuinely want to serve XML documents and use the application/xhtml+xml MIME type. When it comes to serving web pages, these folks are in the minority. Which isn’t to say the technical aspects of serving XML aren’t important; they are and they will be covered by the new XHTML 5, an XML serialization of HTML 5.

The far more prevalent use of XHTML is the marketing variety: pages written in XHTML 1.x but served using the text/html MIME type. So while these documents might be valid XML, they aren’t served as XML documents anyway. In such a scenario, the browser ignores the fact that the document is XML and simply renders the page as it would any other HTML page.
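
To make the difference concrete, here is a minimal Python sketch of how a server might pick between the two MIME types based on a browser’s Accept header. The function name choose_content_type is hypothetical, and the sketch deliberately ignores q-values, so treat it as an illustration rather than a complete content-negotiation routine.

```python
XHTML_MIME = "application/xhtml+xml"
HTML_MIME = "text/html"


def choose_content_type(accept_header: str) -> str:
    """Return the XML MIME type only if the browser explicitly lists it.

    Simplification (an assumption of this sketch): q-values such as
    ";q=0.9" are stripped and ignored, and wildcards like "*/*" do not
    count as support for application/xhtml+xml.
    """
    # Split "type/subtype;q=0.9, ..." into a set of bare media types.
    offered = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    if XHTML_MIME in offered:
        return XHTML_MIME
    # Everything else gets the page as plain old HTML.
    return HTML_MIME
```

A browser that does not list application/xhtml+xml gets text/html, at which point the XML-ness of the page is irrelevant to how it is rendered.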

So why did everyone love XHTML so much if it was really pretty much the same thing as HTML 4?

The answer is that XHTML encouraged much better coding practices. Tags needed to be fully closed (think <br /> rather than <br>) and well-formed. XHTML made for cleaner, much more manageable code than HTML 4.
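
The “fully closed” requirement is easy to see with any XML parser. The sketch below uses Python’s standard xml.etree.ElementTree module; the helper name is_well_formed is made up for illustration, and it checks XML well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        # An unclosed <br> leaves the parser with a mismatched </p>.
        return False
    return True


# XHTML-style self-closing tag: acceptable to an XML parser.
closed = "<p>line one<br />line two</p>"

# HTML 4-style open tag: fine in HTML, but not well-formed XML.
unclosed = "<p>line one<br>line two</p>"
```

Here is_well_formed(closed) returns True and is_well_formed(unclosed) returns False, which is the whole difference between the two syntaxes as far as an XML parser is concerned.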

However, HTML 5 already addresses most of those issues, and it allows you to use either the closed syntax of XHTML 1.x or the open syntax of HTML 4. That means that your well-formed XHTML 1.x code can (in most cases) be converted to HTML 5 by simply changing the doctype.
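
As a rough illustration of how small that change is, the Python sketch below swaps an XHTML 1.0 Strict doctype for the HTML 5 one. The function name to_html5_doctype is hypothetical, and the regular expression assumes a simple doctype with no internal subset. It leaves the rest of the page untouched, including any xmlns attribute or XML declaration.

```python
import re

# The XHTML 1.0 Strict doctype, shown here as the "before" example.
XHTML_1_STRICT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)

# The HTML 5 doctype.
HTML5_DOCTYPE = "<!DOCTYPE html>"

# Matches a doctype up to its first ">" (assumes no internal subset).
_DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>]*>", re.IGNORECASE)


def to_html5_doctype(page: str) -> str:
    """Replace the first doctype declaration with the HTML 5 doctype."""
    return _DOCTYPE_RE.sub(HTML5_DOCTYPE, page, count=1)


sample_page = (
    XHTML_1_STRICT
    + '\n<html xmlns="http://www.w3.org/1999/xhtml">'
    + "<head><title>Example</title></head>"
    + "<body><p>Hello<br />world</p></body></html>"
)
converted = to_html5_doctype(sample_page)
```

The closed <br /> syntax in the body survives the conversion unchanged, which is the point: HTML 5 accepts it as written.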

So what was wrong with XHTML 2.0? Although largely well-intentioned, XHTML 2.0 was an entirely different beast than its predecessor and did two things that essentially doomed it from the get-go.

First, it was backwards-incompatible with XHTML 1.x. All that XHTML 1.x code would need to be updated to work with the new spec. HTML 5, on the other hand, is backward-compatible with both XHTML 1.x and HTML 4.

The second problem stems from the fact that XHTML 2.0 wasn’t simply an XML formulation of an HTML spec; it was a completely new spec that ignored the realities of web development in favor of semantic precision. Because of this, it failed to offer any compelling, practical new features.

Where HTML 5 has loads of new stuff for developers to use — native audio and video embeds, multi-column layout tools, offline data storage, native vector graphics — XHTML 2.0 didn’t offer anything of the sort.

Another big reason we’ve been tagging XHTML 2.0 as dead for some time is that not one browser maker has implemented anything in the XHTML 2.0 specification. While all the latest releases of Firefox, Safari, Chrome, Opera and yes, even Internet Explorer, have implemented at least some aspects of HTML 5, none of them has touched XHTML 2.0.

Big names like Google and Apple are already rolling out web services that use HTML 5’s new features to embed videos and store data for offline access, but again, no one is building XHTML 2.0 apps.

But wait, what about that minority of web pages actually served using the application/xhtml+xml MIME type? Aren’t those pages screwed without XHTML 2.0?

If they were conforming to the XHTML 1.x spec, they were screwed anyway since XHTML 2.0 isn’t backwards-compatible. But no, they aren’t screwed by the dropping of the XHTML 2.0 spec. In fact they’re in much better shape since there will be XHTML 5, designed exclusively for documents that actually need to be served as XML.

So what spec should you be using for your web pages? In the future, HTML 5 will be the best choice. Until the future is evenly distributed (that’s a nice way of saying “until IE catches up”) we’ll be using HTML 4.01, though XHTML 1.x will work as well.
