Detect Which Browsers Are Visiting Your Site

In a web designer’s dream world, all browsers would behave in the same way. They would all render HTML predictably and offer identical support for web standards.

But we don’t live in a web designer’s dream world, and we can’t expect to, of course. It would be silly, for example, to expect a text-based browser like Lynx to be able to handle all the same features as a whizzy graphical browser like Opera.

As a result, it has become common practice for web sites to tailor the HTML they send to different browsers. This can be done in a variety of ways — basically any way that you can dynamically generate HTML — but whether you do it with an ASP page, a Java servlet, or even by configuring your server itself, there must be some way for a browser to identify itself to your server.

That’s where the User-Agent string comes in.

Contents

  1. The User-Agent String
  2. Picking Out the Pieces
  3. The Detection Checklist
  4. Or You Could Just Steal Some Code
  5. Other Handset and Device Detection Services

The User-Agent String

Fortunately, somebody was thinking ahead when they wrote the HTTP specification.

Section 14.43 of the HTTP/1.1 spec (RFC 2616) designates a special header, called User-Agent, to be used for identifying the Web client to the server. Don’t ask me why they didn’t call it the Web-Client header; perhaps they just got back from a 007 Film Festival.

The point is that your Web server can use this header to try to figure out what kind of browser is making a request, and act appropriately. For example, Internet Explorer 6 running on a Windows XP box sends a User-Agent header string that looks like this:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

Firefox 2 running on a Mac might send one like this:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1) Gecko/20061024 Firefox/2.0

And Safari 3 on a PowerPC Mac:

Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/521.25 (KHTML, like Gecko) Safari/521.24

Wikipedia has a fairly extensive list of User-Agent strings, but it’s easy enough to make your own if you know a little Perl and your server is set up to record them in its logs. If you actually perform this little surveillance experiment, you may be surprised to see how many different variations there are.

And therein lies the difficulty of getting the info that you want out of the User-Agent.
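If Perl isn’t your thing, the log-tallying experiment mentioned above can be sketched in JavaScript as well. This is only a rough sketch: it assumes your server writes Apache’s "combined" log format, where the User-Agent is the last double-quoted field on each line, and the function name countUserAgents and the sample log lines are made up for illustration.

```javascript
// Tally User-Agent strings from access-log lines.
// Assumption: each line uses the Apache "combined" log format, in which
// the User-Agent is the last double-quoted field on the line.
function countUserAgents(lines) {
  const counts = new Map();
  for (const line of lines) {
    const match = line.match(/"([^"]*)"\s*$/);
    if (!match) continue; // skip lines that do not fit the assumed format
    const agent = match[1];
    counts.set(agent, (counts.get(agent) || 0) + 1);
  }
  // Return [agent, count] pairs, most frequent first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical sample lines, for illustration only.
const sampleLog = [
  '10.0.0.1 - - [24/Oct/2006:10:00:00 -0700] "GET / HTTP/1.1" 200 512 "-" "Lynx/2.7 libwww-FM/2.14"',
  '10.0.0.2 - - [24/Oct/2006:10:00:01 -0700] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"',
  '10.0.0.3 - - [24/Oct/2006:10:00:02 -0700] "GET /a HTTP/1.1" 200 128 "-" "Lynx/2.7 libwww-FM/2.14"',
];

for (const [agent, count] of countUserAgents(sampleLog)) {
  console.log(count, agent);
}
```

If your logs use a different format, the regular expression is the only part that should need to change.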


Picking Out the Pieces

As you can see from the examples above, User-Agent strings typically start off with the name of the product (the browser, that is) and a version number, sometimes followed by a list of “comments” in parentheses, one of which usually indicates the operating system.
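To make that shape concrete, here is a minimal JavaScript sketch that pulls a User-Agent string apart into the leading product name and version, the parenthesized comments, and whatever follows them. The helper name splitUserAgent is hypothetical, and the sketch assumes the conventions just described actually hold, which, as we’ll see, is not guaranteed.

```javascript
// Split a User-Agent string into its leading product name and version,
// the first group of parenthesized comments, and the text that follows it.
// splitUserAgent is a hypothetical helper name, not a standard API.
function splitUserAgent(ua) {
  const open = ua.indexOf("(");
  const close = open === -1 ? -1 : ua.indexOf(")", open);

  // Everything before the left parenthesis (or the whole string).
  const head = (open === -1 ? ua : ua.slice(0, open)).trim();

  // The product usually looks like Name/Version; the version may be absent.
  const product = head.match(/^([^\/]+)\/(\S+)/);
  const name = product ? product[1] : head;
  const version = product ? product[2] : "";

  // Comments are the semicolon-separated items inside the parentheses.
  const comments =
    close === -1
      ? []
      : ua
          .slice(open + 1, close)
          .split(";")
          .map((s) => s.trim())
          .filter(Boolean);

  // Some browsers put more product tokens after the comments.
  const rest = close === -1 ? "" : ua.slice(close + 1).trim();

  return { name, version, comments, rest };
}

console.log(
  splitUserAgent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)")
);
```

Note that for the IE string the product name comes back as Mozilla; the real browser name is hiding in the comments.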

Here are a few more examples from back in the day.

User-Agent: NCSA Mosaic/2.6b1 (X11;UNIX_SV 4.2MP R4000) libwww/2.12 modified

User-Agent: Mozilla/3.0 (compatible; Opera/3.0; Windows 95/NT4)

User-Agent: Lynx/2.7 libwww-FM/2.14

You can also see from the irregularity of these examples that it’s not a trivial task to come up with a parsing algorithm that will identify browsers (and platforms) correctly. In fact, there’s nothing stopping someone from building a browser that completely defies the vague conventions that we noticed above. To make things worse, there’s always a certain element that enjoys “spoofing” User-Agent strings of other browsers, or making up their own. One of our favorites from Wired’s old Hotbot logs was:

User-Agent:Nintendo64/1.0 (SuperMarioOS with Cray-II Y-MP Emulation)

Are you still bold enough to try writing your own browser-detection routine? Keep reading for a few rules of thumb.


The Detection Checklist

  1. If you just need to distinguish between text-only and graphical browsers, you’re in good shape. Let’s face it, there’s just one text-only browser with enough users to bother going out of your way for: Lynx. And its User-Agent string always starts with Lynx, so it’s easy enough to identify.
  2. If you just need to distinguish between the big two — Firefox and IE — then notice that Firefox’s “product name” is Mozilla. Unfortunately, IE also starts its User-Agent string with Mozilla. Lots of browsers pretend to be Mozilla-compatible, presumably because sneaky web programmers like us used to just look for the string Mozilla when checking to see if certain features were available. To find IE, you should look for the string MSIE in one of the parenthesized comments. So, you could, for example, assume that any browser whose User-Agent string starts with Mozilla really is Firefox or Firefox-based, unless you find MSIE in one of the comments, in which case you’ve got an IE user on your hands. You could check for Opera this way as well, since Opera always identifies itself somewhere inside the parentheses.
  3. If you want the version information, it’s almost always separated from the name of the browser by a slash or by whitespace (usually a single space).
  4. To find platform info, as with the browser info, you pretty much have to know in advance what you’re looking for. There’s no guarantee about which “comment” within the parentheses contains the platform info, so it’s a good idea to check each of them against a predictable list of strings, like X11, SunOS, Linux, Mac, Intel, PPC, Win, and anything else you want to check.


So to recap, here’s a high-level pseudo-algorithm for parsing the User-Agent string:

  • Parse the stuff before the left parenthesis (if there is one) and look for browser and version info (if it’s Mozilla, you might change your mind later).
  • Loop through each of the semicolon-separated tokens within the parentheses, and for each one,
    • Check against a fixed list of strings to see if the token contains the “real” browser and version info
    • Check against a fixed list of strings to find possible platform info.
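
The steps above might look something like this in JavaScript. Treat it as a sketch, not a finished detector: the function name parseUserAgent and the two lookup lists are assumptions made for illustration, and the lists only cover the strings mentioned in this article.

```javascript
// A sketch of the pseudo-algorithm above. The function name and both
// lookup lists are illustrative assumptions, not a complete catalog.
const BROWSER_HINTS = ["MSIE", "Opera"];
const PLATFORM_HINTS = ["X11", "SunOS", "Linux", "Mac", "Intel", "PPC", "Win"];

function parseUserAgent(ua) {
  const result = { browser: "", version: "", platform: "" };

  // Step 1: parse the stuff before the left parenthesis, if there is one.
  const open = ua.indexOf("(");
  const head = (open === -1 ? ua : ua.slice(0, open)).trim();
  const product = head.match(/^([^\/]+)\/(\S+)/);
  result.browser = product ? product[1] : head;
  result.version = product ? product[2] : "";

  // Step 2: loop through the semicolon-separated tokens in the parentheses.
  const close = open === -1 ? -1 : ua.indexOf(")", open);
  const tokens = close === -1 ? [] : ua.slice(open + 1, close).split(";");
  for (const raw of tokens) {
    const token = raw.trim();

    // 2a: a token may name the "real" browser, overriding Mozilla.
    for (const hint of BROWSER_HINTS) {
      if (token.startsWith(hint)) {
        result.browser = hint;
        // The version follows the name after a slash or whitespace.
        result.version = token.slice(hint.length).replace(/^[\/\s]+/, "");
      }
    }

    // 2b: the first token that mentions a known platform string wins.
    if (!result.platform && PLATFORM_HINTS.some((p) => token.includes(p))) {
      result.platform = token;
    }
  }
  return result;
}

console.log(
  parseUserAgent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)")
);
// → { browser: 'MSIE', version: '6.0', platform: 'Windows NT 5.1' }
```

This sketch never looks past the closing parenthesis, so a string that starts with Mozilla and has no MSIE or Opera comment is simply reported as Mozilla, in line with rule 2 of the checklist. A spoofed or unconventional string will still fool it.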


Keep in mind that running this kind of detection on the client is slightly contrived, because the client, of course, already knows what it is; it doesn’t have to go looking into the User-Agent string that it sends to a server. In JavaScript you can also use the navigator.appName and navigator.appVersion properties, which theoretically might not coincide with the information that’s in the User-Agent string. But parsing the User-Agent string is more portable, since the same logic works on a server, and it illustrates the general principles we’ve already discussed.


Or You Could Just Steal Some Code

If you’ve read this far and you’re still not up to parsing the User-Agent string, there may be another option. Depending on your needs and on your web server setup, someone may already have done the work for you. Here’s a list of some related resources:

If your web server supports ASP, you can use the built-in Browser Capabilities component (BrowsCap), which looks up the User-Agent string in a browscap.ini file. You might also want to check out cyScape’s BrowserHawk application, which provides more accurate, full-featured browser detection.

If you’re using Apache, check out the section of Arachna that shows how to do your UA parsing using SSIs.

Mozilla has several code snippets you can use for the different methods of detecting browsers.

Finally, Scott Isaacs offers a couple of articles on browser detection using both client side and server side JavaScript.

Other Handset and Device Detection Services

You could also consider a service such as handsetdetection.com, which interfaces with a website to return live device information based on make, model, and mobile screen size, along with location-based information. You can then use this data to customize the view and optimize the display, design, and UI for each handset.