A Slashdot story about browser makers whose own sites do not conform to XHTML 1.1 or any other HTML standard caught my interest. After checking a couple of the Braveworld pages with the W3C markup validation service, I faced the sad fact that there were quite a few errors in my HTML.
The next step was a story from Ben Hammersley’s blog, which reflects my own experience of using the validation service and also reminded me that you don’t always have to use your browser (this is something I keep forgetting). Apart from giving me another example of non-standard RSS usage, it also gave me the idea of adding automatic validation for the Braveworld site.
So I put together the W3CHTMLValidator module that essentially wraps the HTML markup validation ‘web service’. Just a couple of helper classes and two methods:
W3CHTMLValidator::request_validation makes an HTTP GET request to the W3C validation service.
W3CHTMLValidator::validate_page sets up a request_validation call with XML results and returns a ValidationResult object that contains any messages the validator produced.
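To make the GET request concrete, here is a minimal sketch of what such a call could look like with plain net/http. It is not the module's actual code: the endpoint URL, the `uri` and `output` query parameters, and the helper names `validation_uri` and `fetch_validation` are all assumptions made for illustration.

```ruby
require 'net/http'
require 'uri'

# Assumed endpoint; the module's real default is not shown in this post.
VALIDATOR_ENDPOINT = 'http://validator.w3.org/check'

# Build the GET URI that asks the validator to check +page_url+.
# The parameter names (uri, output) are assumptions.
def validation_uri(page_url, output: 'xml')
  uri = URI.parse(VALIDATOR_ENDPOINT)
  uri.query = URI.encode_www_form(uri: page_url, output: output)
  uri
end

# Perform the GET request and return the raw response body.
# Raises if the service does not answer with a 2xx status.
def fetch_validation(page_url)
  response = Net::HTTP.get_response(validation_uri(page_url))
  raise "validator returned #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end
```

Keeping the URI construction separate from the request makes it possible to check the generated query string without touching the network.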
Using this module in a Rake task for this site goes something like this:
- Collect all the pages for the site from the generation directory: FileList['/generated/*/.html'] does this beautifully
- For each HTML file, construct the URI (e.g. http://www.braveworld.net/riva/index.html) and call W3CHTMLValidator::validate_page
- Log all ValidationResult messages to a file
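The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Rake task: it uses Dir.glob from the standard library instead of Rake's FileList so it runs on its own, the recursive `**/*.html` pattern is a guess at the intended glob, and the `page_url` and `validate_site` helpers are hypothetical. The validator is passed in as a callable that takes a URL and returns an array of message strings, because the ValidationResult interface is not described here.

```ruby
# Map a generated file to its public URL (hypothetical helper).
def page_url(path, root, base_url)
  relative = path.delete_prefix(root).sub(%r{\A/}, '')
  "#{base_url.chomp('/')}/#{relative}"
end

# Validate every HTML file under +root+ and write the messages to +log_path+.
# +validator+ is any callable that takes a URL and returns an array of
# message strings; a real task would wrap W3CHTMLValidator::validate_page.
# Returns the number of pages that were checked.
def validate_site(root, base_url, log_path, validator)
  # Assumed pattern: every .html file at any depth below the root.
  files = Dir.glob(File.join(root, '**', '*.html')).sort
  File.open(log_path, 'w') do |log|
    files.each do |file|
      url = page_url(file, root, base_url)
      validator.call(url).each { |message| log.puts("#{url}: #{message}") }
    end
  end
  files.size
end
```

Inside a Rakefile the body of the task would then be a single call to `validate_site`, with FileList taking the place of Dir.glob if you prefer.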
Proxy support is also included (Ruby’s net/http library is extremely useful in this regard).
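For readers who have not used it, this is roughly how net/http handles a proxy. The sketch shows the standard library's own mechanism, not necessarily how W3CHTMLValidator wires it up; the `http_client` helper and the proxy address in the usage note are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Build a Net::HTTP client for +target+, optionally routed through a proxy.
# +proxy_url+ looks like "http://user:pass@proxy.example.com:3128".
# No connection is opened until a request is actually made.
def http_client(target, proxy_url = nil)
  uri = URI.parse(target)
  # Passing nil as the proxy address disables any proxy, including
  # one picked up from the http_proxy environment variable.
  return Net::HTTP.new(uri.host, uri.port, nil) unless proxy_url

  proxy = URI.parse(proxy_url)
  Net::HTTP.new(uri.host, uri.port, proxy.host, proxy.port, proxy.user, proxy.password)
end
```

Calling `http_client('http://validator.w3.org/check', 'http://proxy.example.com:3128')` returns a client whose requests go through the given proxy, while omitting the second argument gives a direct connection.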
You can have the gem, read the API docs and download the source for W3CHTMLValidator at your leisure.
I have even created a mini site for this mini project.