There is an eloquent blog post at glyph.twistedmatrix.com this week, prompted by the ways the Blogger platform mangles posts:
* Properly-quoted “<” and “>” (i.e. “&lt;” and “&gt;”) are quoted again.
* Additional line-breaks are added.
* “&nbsp;” is converted to whitespace, and then
* whitespace is collapsed.
This is a symptom, Glyph Lefkowitz argues, of treating HTML as just a string of text, rather than as the smartly manipulable structured data that it in fact is.
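To see how string-only handling can produce these symptoms, here is a minimal Python sketch. The function `naive_republish` and its three steps are my own assumptions for illustration; this is not Blogger's actual code, and it leaves out the added line-breaks.

```python
import html
import re


def naive_republish(stored_markup: str) -> str:
    """Hypothetical string-only pipeline (an assumption, not Blogger's real code)."""
    # Turn non-breaking-space entities into ordinary spaces...
    text = stored_markup.replace("&nbsp;", " ")
    # ...then collapse runs of spaces and tabs, losing the author's spacing.
    text = re.sub(r"[ \t]+", " ", text)
    # Escape everything again, unaware that the entities were already escaped.
    return html.escape(text, quote=False)


post = "x&nbsp;&nbsp;=&nbsp;1 &lt;em&gt;"
print(naive_republish(post))  # x = 1 &amp;lt;em&amp;gt;
```

Each further pass adds another layer of `&amp;`, because the function has no way of knowing which ampersands are already part of an entity.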
String manipulation is easier than DOM wrangling (indeed, string manipulation is one of the first topics covered in How To Program books), but that doesn’t mean it’s the best tool for every job. I daresay that treating structured data as a string is one reason that good calendar applications, something computers ought to do well, are so hard to come by.
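To make the DOM-wrangling alternative concrete, here is a rough Python sketch using the standard library's `xml.etree.ElementTree`. The helper `append_note` is a made-up name, and I am assuming a well-formed XHTML fragment. Real-world HTML would need a proper HTML parser instead.

```python
import xml.etree.ElementTree as ET


def append_note(fragment: str, note: str) -> str:
    """Parse a well-formed fragment, add an <em> child, and serialize it again.

    Hypothetical helper for illustration; assumes the input is valid XML.
    """
    root = ET.fromstring(fragment)   # entities are decoded into plain text
    em = ET.SubElement(root, "em")   # edit the tree, not the string
    em.text = note                   # raw text; the serializer escapes it
    return ET.tostring(root, encoding="unicode")


print(append_note("<p>Use &lt;b&gt; for bold.</p>", "1 < 2"))
# <p>Use &lt;b&gt; for bold.<em>1 &lt; 2</em></p>
```

Because escaping happens exactly once, at serialization time, parsing and re-serializing the result leaves it unchanged rather than quoting it again.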