goer.org


Make Money Fast with Clean Markup

Rakesh Pai: "The Economics of XHTML" (via Anne van Kesteren). Anne advises that when you read Rakesh's article, you sub in "semantic HTML" for "XHTML". That's a good substitution, although I actually prefer "clean markup." Making your markup more semantic is a good thing, up to a point. Once you cross a certain line, your mind begins an inevitable slide into Semantic Extremism, until eventually you've convinced yourself that everything should be a list item or some such nonsense.[1] But I digress.

There have been countless articles like Rakesh's about how XHTML (sorry, clean markup) will save you big bucks. Honestly, I don't fundamentally doubt the overall theory, but it disturbs me that none of these fine articles offers hard numbers on how much money you'll actually save in practice. The most concrete examples in the genre so far are the "redesign articles", wherein the author picks a large site with crufty markup, redesigns the home page with clean markup, and performs a highly naive calculation of the bandwidth saved. The best article that I know of is Mike Davidson's interview with DevEdge, and even that piece only provides a theoretical estimate.

So let's all put on our Business Analyst hats and ask a few questions that might be pertinent for designing an actual case study. To be sure, thinking in BizDev does not come naturally to most folks, certainly not me.[2] So first, a short cleansing ritual, to prepare the mind for these alien thoughts:

Ph'nglui mglw'nafh Forbes R'lyeh wgah'nagl fhtagn! ROI! ROI! ROI! Aiiiiieeee!

Ah, there we go. Now, consider a largish commercial site:

  • What are the actual bandwidth savings over a one-month period, factoring in caching, real-world resource request patterns, etc.?

  • How much does a TB of bandwidth go for these days? How much will that same TB cost three years from now?

  • How much developer time does it take to refactor a complicated CMS to produce clean markup?

  • How much developer time does it take to clean up legacy content? Is this archived material accessed often enough to be worth cleaning?

  • Are developers who have modern skills more expensive than old-skool <font>-happy developers? (I would think so.)

  • What percentage of visitors use NN 4 or IE 4? Does the revenue lost from these visitors outweigh the overall bandwidth savings?

  • How much does it cost to employ other techniques to speed up your site, such as enabling conditional gzip compression? Comparing these techniques with a total redesign, which ones are the cheapest?

I don't have the answers to these questions. But I do suspect that any web design shops that can answer these questions (with non-foofy numbers) basically have a license to print money.
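
To make the questions above a little more concrete, here is a minimal back-of-the-envelope sketch in Python. Every number in it is a made-up placeholder, not data from any real site, and the names (Scenario, monthly_savings, breakeven_months) are hypothetical. It assumes that bandwidth is the only benefit of a redesign and that cached views cost nothing, which is exactly the sort of naive simplification a real case study would have to improve on.

```python
from dataclasses import dataclass

TB = 10**12  # bytes per terabyte (decimal convention)


@dataclass
class Scenario:
    """Hypothetical inputs; every value must come from real measurements."""
    page_views_per_month: int
    bytes_saved_per_view: int   # markup bytes removed by the redesign
    cache_hit_ratio: float      # fraction of views that never reach the origin
    cost_per_tb: float          # what you pay for one TB of transfer
    refactor_hours: float       # developer time for CMS plus legacy cleanup
    hourly_rate: float          # cost of one developer hour


def monthly_savings(s: Scenario) -> float:
    """Bandwidth money saved per month, counting only uncached views."""
    uncached_views = s.page_views_per_month * (1 - s.cache_hit_ratio)
    return uncached_views * s.bytes_saved_per_view / TB * s.cost_per_tb


def breakeven_months(s: Scenario) -> float:
    """Months until the bandwidth savings pay back the refactoring cost."""
    savings = monthly_savings(s)
    if savings <= 0:
        return float("inf")
    return s.refactor_hours * s.hourly_rate / savings


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    guess = Scenario(
        page_views_per_month=100_000_000,
        bytes_saved_per_view=20_000,
        cache_hit_ratio=0.5,
        cost_per_tb=100.0,
        refactor_hours=400,
        hourly_rate=75.0,
    )
    print(f"monthly savings: {monthly_savings(guess):.2f}")
    print(f"break-even after {breakeven_months(guess):.1f} months")
```

The model deliberately leaves out lost revenue from old browsers and the cost of alternatives such as gzip; those would be extra terms on one side or the other once someone has numbers for them.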

1. If we all lived in Web Designer City, a metropolis bustling with architects and bricklayers, professors and artists, hustlers and street vendors, you would be the guy staggering down the street, muttering to himself.

2. Business persons tend to ask questions that either A) make no sense or B) are so hard that any response you get back is almost certainly a lie. Or if we're feeling charitable, a "Wild Ass Guess."

Posted by Evan Goer on Sep. 16, 2004 at 6:53 PM | Comments (12)

Comments

  1. How much does it save in bandwidth?

    That question got put to Tantek from Technorati at Hypertext 04 last month.

    He said he couldn't quote a number, but it was significant.

    Also, Technorati's already consuming well-formed content (or content that can be repaired quickly) so they didn't have a huge base of legacy static HTML to fix.

    Posted by Bill Humphries on Sep. 16, 2004 at 11:19 PM

  2. I think I hurt myself trying to pronounce the words for the cleansing ritual.

    Posted by Mike P. on Sep. 17, 2004 at 3:03 AM

  3. Bill - That's interesting, thanks. I am sure the savings are significant over time. But again, we really need some concrete numbers, because the cost of refactoring a large site is also significant, and there may be other, cheaper ways to achieve those savings. (Of course, I understand why these sites are reluctant to provide these numbers; I'm not blaming Tantek.)

    Also, a site with a large base of legacy content is exactly what I'm thinking of. We don't have to take this to mean Amazon or eBay... I'm thinking NewEgg.com or someone like that.

    Mike - That's okay if you can't perform the ritual. It's pretty dangerous, you could end up permanently in BizDev mode...

    Posted by Evan on Sep. 17, 2004 at 7:01 AM

  4. The purported bandwidth savings of clean markup is a lot of malarkey. If you are using mod_gzip/mod_deflate, you'll find the difference to be negligible. Text compresses really well (by a factor of 4 or 5) and "bloated" old-skool markup compresses (slightly) better than "clean" markup.

    If you're not using mod_gzip/mod_deflate, then you are not actually serious about saving bandwidth. (Hint: if you want to know why Tantek doesn't quote any numbers, examine the HTTP headers from technorati, http://technorati.com/.)

    Posted by Jacques Distler on Sep. 17, 2004 at 8:21 AM

  5. "Text compresses really well (by a factor of 4 or 5)..."

    When I read Rakesh's article, that was exactly the first thing that popped into my mind. You have managed to uncover the real motivation for this piece. :)

    If you're creating a new site from scratch, clean markup is a no-brainer, because the cost is pretty close to zero. But if you have a large, well-established site with crufty markup, it's not a no-brainer. Maybe the most efficient way to improve the user experience is to enable gzip compression. (For certain sites, gzip compression can lead to a serious tradeoff in CPU usage -- something my company is testing right now in our product.) Or... maybe the "best" approach for improving user performance is to just cut Akamai a big check.

    In short, Your Mileage May Vary. That's why I'd like to see a few real case studies with real numbers. Clean markup is a good solid hammer in your toolbox, but not all problems are nails.

    Posted by Evan on Sep. 17, 2004 at 9:32 AM

  6. ingoBay!

    The reason why I haven't gone HTML-minimal like Anne van Kesteren has on some sites is that gzipping really makes dropping the head, closing tags and angles negligible.

    We've monitored client sites and get close to 98% coming down gzipped...

    Posted by Mike P. on Sep. 17, 2004 at 10:42 AM

  7. Mike, you actually thought I was doing that for saving bandwidth? That was just a little joke actually and it also showed that all "XHTML saves bandwidth" people were wrong.

    I'm using it on sites because it makes the source look clean, nothing more.

    Posted by Anne on Sep. 19, 2004 at 12:33 PM

  8. Anne's web pages are not so much designed around bandwidth savings, they are more like minimalist works of art.

    The first time I saw the source of one of his minimalist pages, I had to go running back to the spec. "Ummm, can he do that?" Yup, I think he can...

    Posted by Evan on Sep. 19, 2004 at 1:49 PM

  9. "Anne's web pages are not so much designed around bandwidth savings, they are more like minimalist works of art."

    The effect was kind of ruined for me because Markdown (which he uses for the content of his latest minimalist creation, http://gameslog.net/) generates closing </p> and </li> tags, even in its HTML4 generation mode. The spare neo-Bauhaus lines are marred by this bit of code bloat.

    The site would be much more beautiful were he to tinker with the Markdown code to fix that.

    Posted by Jacques Distler on Sep. 19, 2004 at 11:21 PM

  10. Oh, what a quandary. Do we prefer Old Anne, whose perfectly strict minimalism lent an almost Zen-like quality to his markup, or do we prefer New Anne, who uses his closing tags to enlighten us to the importance of symmetry in every aspect of the cosmos?

    So many different aesthetic points of view to consider!

    Posted by Evan on Sep. 20, 2004 at 5:44 PM

  11. Heh, I'm going to fix that part of Markdown some day. I entirely agree that it is suboptimal :-)

    Posted by Anne on Sep. 21, 2004 at 4:20 AM

  12. I managed to do some (dirty) clean-up on that site. The source code now looks lovely, imho.

    Posted by Anne on Sep. 21, 2004 at 4:58 AM
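
The compression claims in comments 4 through 6 are easy to check for yourself. The sketch below, assuming Python's standard gzip module and two synthetic pages invented purely for illustration (bloated_page, clean_page and sizes are hypothetical helper names), compares the raw and gzipped sizes of font-tag markup and list markup carrying the same content. Synthetic, highly repetitive pages compress better than real ones, so treat this as a way to run the experiment on your own pages, not as a result.

```python
import gzip


def bloated_page(n_items: int) -> str:
    """Old-skool table-and-font markup (synthetic example)."""
    rows = [
        '<tr><td bgcolor="#ffffff"><font face="Arial, Helvetica" size="2" '
        f'color="#333333"><b>Item {i}: widget</b></font></td></tr>'
        for i in range(n_items)
    ]
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"


def clean_page(n_items: int) -> str:
    """The same content as a plain list (synthetic example)."""
    items = [f"<li>Item {i}: widget</li>" for i in range(n_items)]
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"


def sizes(markup: str) -> tuple:
    """Return (raw_bytes, gzipped_bytes) for a piece of markup."""
    raw = markup.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


if __name__ == "__main__":
    for name, page in (("bloated", bloated_page(200)), ("clean", clean_page(200))):
        raw, zipped = sizes(page)
        print(f"{name:8s} raw={raw:6d} bytes  gzipped={zipped:6d} bytes")
```

To try it on real pages, replace the two generator functions with the saved source of an actual page before and after a redesign.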


About

This entry was posted on September 16, 2004 by Evan Goer.


Copyright

Creative Commons License Text released under Creative Commons.

To use this license, you must attribute this work properly. This license does not extend to comments unless the original poster of that comment states otherwise.


© Copyright 2001-2007, Evan Goer. Some Rights Reserved. Last Updated July 2, 2008.