One of the problems with the XHTML 100 Laziness Test was that it was... well, lazy. Rather than simply validating three random secondary pages, a real Laziness Test would spider through all pages. I didn't bother with this approach for two reasons. First, my version of the Laziness Test was reasonably effective at weeding people out (although not perfect). Second, writing a well-behaved spider is well beyond my rather limited programming skills.
Fortunately, all is not lost. We can build such a spider. We have the technology. Or rather, the Norwegians do. (And isn't that scary? What else are those Norwegians hiding from us in those fjords? Does the Defense Department know about this?) Anyway. Via Zeldman, via Ben Meadowcroft, via... no, wait, that's all the vias. Err... via all those people, researcher Dagfinn Parnas reports that 99.29% of the 2.4 million web pages in his sample fail validation.
In truth, Parnas did a much better job measuring general standards compliance than the XHTML 100, which was just a quick-and-dirty survey. Moreover, you can't compare Parnas's data directly with the XHTML 100 results. First of all, I was looking at the Alpha Geeks And Their Friends, while Parnas is looking at a much larger and much less geeky population. Second, Parnas is looking at HTML, while I was looking at XHTML only. Finally, Parnas's analysis is fundamentally different in that he aggregates his data into one pool. For example, consider a site in the XHTML 100 where the first page validates, the first two secondary pages validate, but the last secondary page fails. Parnas would count that as three successes and one failure. I would count that as total success for Test #1 and total failure for Test #2.
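To make that last difference concrete, here is a minimal Python sketch of the two ways of counting, applied to the hypothetical site just described. The names `SiteResult`, `pooled_counts`, and `per_site_tests` are made up for illustration; they are not from Parnas's paper or from the XHTML 100 itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SiteResult:
    """Validation outcome for one site (hypothetical structure)."""
    home_valid: bool             # did the first page validate?
    secondary_valid: List[bool]  # one entry per secondary page checked


def pooled_counts(sites: List[SiteResult]) -> Tuple[int, int]:
    """Pooled counting: every page is one data point in a single pool."""
    results = [ok for s in sites for ok in [s.home_valid, *s.secondary_valid]]
    passed = sum(results)
    return passed, len(results) - passed


def per_site_tests(site: SiteResult) -> Tuple[bool, bool]:
    """Per-site counting: Test #1 is the first page, Test #2 is all secondary pages."""
    test1 = site.home_valid
    test2 = all(site.secondary_valid)
    return test1, test2


# The example from the text: first page and two secondary pages pass, the last fails.
example = SiteResult(home_valid=True, secondary_valid=[True, True, False])

print(pooled_counts([example]))  # (3, 1): three successes, one failure
print(per_site_tests(example))   # (True, False): passes Test #1, fails Test #2
```

One bad page barely moves a pooled percentage, but it sinks the whole site under Test #2, which is part of why the two sets of numbers don't line up.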
But hey -- let's ignore all those distinctions. If you can't compare apples and oranges on the Internet, where can you compare them? Parnas reports a 99% failure rate. In contrast, I report a Markup Geek failure rate (as measured by Test #2) of a staggeringly low 90%. I think we can all be proud.
If you do bother to read Parnas's nearly 6MB PDF paper, and I certainly recommend that you do, be sure to look at the breakdown of the various types of errors. The bulk of the errors (after "no DTD declared") consist of "non-valid attribute specified" or "required attribute not specified". Not surprising at all. From my own experience, very few people seem to know that the alt attribute is required or that <img border="0"> is illegal in XHTML 1.0 Strict. The only real puzzler in Parnas's data was the relatively low fraction of pages with invalid entities. In the XHTML 100, invalid entities were a major killer. I don't have a good explanation for this discrepancy, but hey.
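For the curious, those two img mistakes are easy to spot-check by hand. The following is a rough Python sketch using the standard library's `html.parser`; it is not a validator and knows nothing about DTDs. The class and function names are invented for this example, and the message strings simply borrow the error categories quoted above.

```python
from html.parser import HTMLParser


class ImgAttributeChecker(HTMLParser):
    """Flags img elements with a missing alt or a presentational border attribute."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (line number, message) tuples

    def handle_starttag(self, tag, attrs):
        self._check(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form, e.g. <img ... />
        self._check(tag, attrs)

    def _check(self, tag, attrs):
        if tag != "img":
            return
        names = {name for name, _ in attrs}
        line, _ = self.getpos()
        if "alt" not in names:
            self.problems.append((line, "required attribute not specified: alt"))
        if "border" in names:
            self.problems.append((line, "non-valid attribute specified: border"))


def find_img_problems(markup):
    """Return the (line, message) problems found in a string of markup."""
    checker = ImgAttributeChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


sample = '<p><img src="a.png" border="0"></p>\n<p><img src="b.png" alt="B" /></p>'
for line, message in find_img_problems(sample):
    print(line, message)
```

The first image in the sample trips both checks; the second is clean. A real validator catches far more than this, so treat the sketch as a quick sanity check at best.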
Parnas concludes:
As we have seen there is little correlation between the official HTML standard and the de-facto standard of the WWW.
The validation done here raises the question if the HTML standard is of any use on the WWW. It seems very odd to have a standard that only 0.7% of the HTML documents adhere to...
A good question. For me, the reason to validate is not ideological. Simply put, validation saves me time. For any page design, there are a huge number of possible glitches across the various browsers. Validation doesn't reduce the set to zero, but it does make the set a lot smaller. Hey, I don't know about you, but I need all the help I can get.
Posted by Evan Goer on Jun. 12, 2003 at 12:40 PM | Comments (9)
Text released under Creative Commons.
To use this license, you must attribute this work properly. This license does not extend to comments unless the original poster of that comment states otherwise.
© Copyright 2001-2007, Evan Goer. Some Rights Reserved. Last Updated July 2, 2008.