Building with the Document Object Model

How would you describe your web page without mentioning its content?

One way would be to describe the page’s structure. What tags are on the page? How many are there? What order are they in? What are the properties of these tags? And finally, what is the presentational nature of each element? This is what the Document Object Model does. It expresses the structure of an HTML document in a universal, content-neutral way.

One of my first Webmonkey articles was about displaying random images. I twiddled with image tags on a page so they pointed to different image files over time. It was a simple concept: You had an arbitrary number of images on a page, a few of which the computer would randomly change about five times a second. The effect was a flashing, mutating space that I liked a lot.

I didn’t know it back then, but what I was doing was manipulating the Document Object Model of that page. I had a number of objects on the page. My script would query how many images there were, and then modify an attribute of one of those objects (i.e., switch out the source of the image).
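Here is a minimal sketch of that trick, not the original script. The helper name swapRandomImage, its parameters, and the file names are all made up for illustration; the only browser pieces assumed are the document.images collection and each image's src property.

```javascript
// Hypothetical helper: pick one image at random and point it at a random source.
// `images` is any array-like collection of objects with a `src` property
// (in a browser, document.images); `sources` is an array of image URLs.
// `random` defaults to Math.random and can be swapped out for testing.
function swapRandomImage(images, sources, random) {
  random = random || Math.random;
  if (images.length === 0 || sources.length === 0) {
    return -1; // nothing to change
  }
  var imageIndex = Math.floor(random() * images.length);
  var sourceIndex = Math.floor(random() * sources.length);
  images[imageIndex].src = sources[sourceIndex];
  return imageIndex; // which image was changed
}

// In a browser, roughly five swaps a second (file names are invented):
// setInterval(function () {
//   swapRandomImage(document.images, ['a.gif', 'b.gif', 'c.gif']);
// }, 200);
```

The script never needs to know what the images show. It only asks how many there are and rewrites one attribute, which is the whole idea of working through the object model.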

This was about the limit of what you could do with the Document Object Model in Netscape 3. You could read and write the attributes of image and anchor tags, and you could query some information about the browser itself – what MIME-types it accepted, what plug-ins were installed, its location, and a few other things. Simple, basic, down-to-earth, level-zero items.

Modern browsers have introduced several new items that communicate what’s in a document. Now there are layer tags that create a tree of hierarchies, plus new and more powerful ways to query the width and height of containers (that is, layers and div tags) as well as the window itself. Together, these give you a set of attributes you can use to manipulate your page and make it do what you want – either as the page loads or over a period of time.

At first, this object model was contained in JavaScript, and if you wanted other elements (Java applets, plug-ins) to manipulate any of this model, they’d have to negotiate with the browser’s JavaScript scripting engine.

Thus, early on, it was more useful to think about the object model as a function of JavaScript. The syntax was JavaScript and the collections of objects looked and acted like array objects. For most people, there wasn’t really a distinction between a page’s object model and JavaScript.

With Internet Explorer 4.0, browsers began to take the object model out of JavaScript and put it into the page layout code. Instead of having a language that has a conception of various objects on a page, you have a browser that stores the structure and presentation of a page, and opens up that information to a scripting language or compiled component for reading and manipulation. You don’t have to figure out how the div tag’s position is stored in JavaScript only to discover it’s stored differently in VBScript, because that information is all in one consistent format.

You can manipulate HTML using JavaScript as easily as you could using VBScript, as easily as you could using a Java applet, as easily as you could with a compiled ActiveX control, as easily as you could with COBOL.

And this object model doesn’t just deal with anchors, images, and embed tags. The entire structure is dealt with. So if you want to get a count of how many <div>s are on a page, or change the fifth paragraph to blue, or modify the CSS values of your list elements so they appear to do one of those oh-so-cool “wipes,” the objects you manipulate are the same, and the languages you use become a means to that end.
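The first two of those tasks might be sketched as follows. This assumes the getElementsByTagName method that the W3C DOM went on to standardize, which the article itself does not name; the function names countDivs and colorFifthParagraph are hypothetical.

```javascript
// `doc` is anything with a getElementsByTagName method
// (in a browser, the global `document`).

// Count how many <div> elements the document contains.
function countDivs(doc) {
  return doc.getElementsByTagName('div').length;
}

// Turn the fifth paragraph the given color.
// Returns false if the page has fewer than five paragraphs.
function colorFifthParagraph(doc, color) {
  var paragraphs = doc.getElementsByTagName('p');
  if (paragraphs.length < 5) {
    return false;
  }
  paragraphs[4].style.color = color; // collections count from zero
  return true;
}

// In a browser:
// countDivs(document);
// colorFifthParagraph(document, 'blue');
```

Both functions ask the same question of the document ("give me every element with this tag") and differ only in what they do with the answer.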

But what can you do with a DOM?

I’ve discovered that one of the hardest things to grasp about the DOM is what its use is. If you view it as merely a broker between scripts and a document, it’s easy to assume the DOM is simply a common syntax. But it’s so much more than that.

If stylesheets allow you to perform layout independent of content and structure, then the DOM allows you to make interaction independent of content and structure. It works on the same principle, but from a different axis, one of code instead of design. This is good because it allows you to modify your existing pages and add all of the wizzy flying titles that your boss wants to see.

The Guts of DOM

The DOM works by creating objects. These objects have child objects and properties, and the child objects have child objects and properties, and so on. Objects are referenced either by moving down the object hierarchy, or by explicitly giving an HTML element an ID attribute (<img src="" ID="blueMarble">).

Here is a brief listing of top-level objects:

  • window
    • location
    • frames
    • history
    • navigator
    • event
    • screen
    • document
      • links
      • anchors
      • images
      • filters
      • forms
      • applets
      • embeds
      • plugins
      • frames
      • scripts
      • all
      • selection
      • styleSheets
      • body

I’m serious: That was brief. It’s vast. For the most part, though, when doing operations like flying in a news headline title, you would reference the item in one of two ways: either by starting at the document object and traversing down by ordinal position, or by using the id or name of the element. If you wanted to change the 23rd element on the page to a lovely blue, you could do it the first way, counting down to it by position.
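A minimal sketch of the by-position approach, assuming document.all can be indexed by number as well as by name. Collections count from zero, so the 23rd element sits at index 22. The helper name colorNthElement is hypothetical.

```javascript
// Hypothetical helper: color the nth element (counting from 1) of a collection.
// `all` is any array-like collection of elements with a `style` object
// (in Internet Explorer 4, document.all).
function colorNthElement(all, n, color) {
  var element = all[n - 1]; // convert a 1-based position to a 0-based index
  if (!element) {
    return false; // the page has fewer than n elements
  }
  element.style.color = color;
  return true;
}

// In the browser this amounts to:
// document.all[22].style.color = 'blue';
```

Referencing by position is brittle: add one tag near the top of the page and the 23rd element is a different element.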

Or, if you knew that the name of the element in question was “Freddie,” you could skip the counting and call it by name:

document.all('Freddie').style.color = 'blue'; 

The astute reader will notice that whatever HTML object I was referring to had a child object “style.” This object occurs in any element that can be styled with CSS, so a <div> would have one, but a <title> would not. Below the style object are the representations of all the CSS properties, including the CSS positioning properties. So, much of the time I’ve spent making a headline fly in has been spent manipulating an object’s style object.
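To make the flying headline concrete, here is a rough sketch that nudges an element's style.left a little on each call. It assumes the element is positioned with CSS (position: absolute or relative) and that left is held in pixels; the function name flyInStep, the element id, and the numbers are made up for illustration.

```javascript
// Hypothetical helper: move `element` one step toward `targetLeft` (in pixels).
// Returns true once the element has arrived, false while it is still in flight.
function flyInStep(element, targetLeft, stepSize) {
  // style.left is a string such as '-300px'; treat an empty value as 0.
  var current = parseInt(element.style.left, 10) || 0;
  var distance = targetLeft - current;
  if (Math.abs(distance) <= stepSize) {
    element.style.left = targetLeft + 'px'; // close enough: snap into place
    return true;
  }
  var direction = distance > 0 ? 1 : -1;
  element.style.left = (current + direction * stepSize) + 'px';
  return false;
}

// In a browser, with a headline that starts off-screen to the left:
// var headline = document.all('Freddie');
// var timer = setInterval(function () {
//   if (flyInStep(headline, 20, 10)) clearInterval(timer);
// }, 50);
```

All of the motion comes from rewriting one property on the style object over and over; the headline's text and markup never change.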

This article was meant as an overview (a brief, brief one). The Document Object Model is vast, and there are lots of issues about how things work, and also lots of interesting things you can do with it. I’ll be covering those in a later article. To read the preliminary specs the W3C is drafting, check out the working group’s page on the subject. And then of course there is Microsoft’s Software Development Kit (SDK) for Internet Explorer, which contains a full listing of the current object model.