The Battlefield of HTML5

It was a cold and quiet night; snowflakes flitted past the window and burning wood crackled soothingly in the fireplace. A cup of coffee and a chocolate bonbon rested beside me when I read something that put this idyllic real-world scene in stark contrast with my online world: HTML5, the specification already plagued with an identity crisis, had now turned into an Orwellian nightmare. What was once a Web technologies specification rife with issues, expressed with great concern by the community for whom it is intended, had now become an idiotic accumulation of spec-shrapnel.

“What happened?”, you may wonder in great distress. Well, HTML5 got torn to pieces.

A little history: HTML5 is the specification that started out as Web Applications 1.0. It was created in 2004, primarily by employees of Apple, the Mozilla Foundation and Opera, who had formed the WHATWG, or Web Hypertext Application Technology Working Group. It arose out of concerns over the W3C’s direction with XHTML and its lack of interest in HTML.

Since then, a lot has changed: the XHTML2 specification was terminated in favor of the much faster and more sensible progress taking place in the WHATWG on Web Applications 1.0, which got renamed HTML5 somewhere along the way. It is still being developed by the WHATWG, but since then it has existed as a specification, somewhat inexplicably, on both the WHATWG and W3C websites. Meanwhile, the Web Forms 2.0 specification, also an original WHATWG effort, got rolled into HTML5, and the Web Controls 1.0 specification got abandoned due to XBL 2.0.

Now, if you’re a web designer or web developer wanting to use the latest and greatest technologies available to you today, the above bit of narrative will probably have made you ask: what does any of that matter to me?

The Good:

Mostly, learning about the origins of HTML5 and its (ongoing) history will do you little to no good. You’re not going to learn a thing about how to build semantic, accessible, rich media-enhanced websites using the technologies defined in HTML5 from knowing this history. But…

The Bad:

Since the specification is still under heavy development, there are very few tutorials so far that accurately explain how to use the new features. As such, you are almost guaranteed an occasional but necessary trip to the WHATWG website for a look at the specification itself. How each feature should be implemented is something browser vendors are still working out, let alone documenting, and yet website developers are already eagerly trying to use these features. In other words, the HTML5 spec (insofar as it can be called a “spec”, but more on that below) is one of the few resources at your disposal that can help you figure any of this out.

The Ugly:

This is where things get really messy: in the past few days, HTML5 became even more disorganized as a specification, with Microdata (er, Microdata?) and 2D Context (er, 2D Context?) being isolated into separate sections. Furthermore, it now exists under several confusing identifiers: there is the spec formerly titled “WHATWG HTML (Including HTML5)” and now titled “HTML5 (including next generation additions still in development)”, the perceived-as-official HTML5 spec, and the subsets or sub-specifications: Web Workers, Web Storage, Web Sockets API, The Web Socket Protocol and Server-sent Events.

Why half of all this lives and is developed on the WHATWG website but some of it on the W3C site, or why the spec was dissected even more, or why some of it got renamed (from “HTML5” to “WHATWG HTML (Including HTML5)” to, two days later, “HTML5 (including next generation additions still in development)”[1]), is all completely beyond me. It feels, as Jeremy Keith aptly pointed out in the WHATWG IRC channel, very “bureaucratic and Kafkaesque”.

It seems the WHATWG and W3C are in a power struggle over who gets to dictate the future of web technologies, but it has gotten to a point where the web designers and developers, i.e. the very people this entire thing is ultimately in service to, are so far out of the picture that their best interests are no longer being served; worse, they have become the innocent bystanders getting injured.

Sure, I know, the HTML specification is meant for the browser makers, not the people making actual websites, but that oft-repeated argument is flawed. Any web developer worth his or her salt will rely on the specification, be it directly by wanting to know the truth of the matter (like wondering which browser is doing something right), or indirectly by learning from tutorials written by people like myself who read the specification and interpret it into words more understandable by designers and developers.

Evangelizing HTML5 was already difficult; between keeping up to date with the ongoing changes, browsers implementing parts of it and fixing bugs between versions, and trying to contribute with suggestions on why certain implementations work or don’t work, there is very little time left to actually explain to people just how it’s all supposed to work. Jeremy’s frustration is therefore very understandable, as is my own (I evangelize much less these days, but Modernizr requires me to keep on top of all this just the same).

The reasoning given in that same IRC chat was that “HTML is an ongoing development”, but that doesn’t work for the Web. Yes, these technologies are constantly changing and yes, it’s important to stay flexible to adapt to new technologies that emerge from creative people, but no, this current, highly disorganized process does not work.

It does not work for web designers and web developers who want to learn about the next wave of web technology, thus far popularized as HTML5.

It does not work for browser makers that are less involved in the process than others, forcing their hand (which may end up going in a direction the other browser makers may not like).

It does not work for the Web as a platform, as that “ongoing development” gets stunted by the confusion this process is creating, and adoption of new technologies will suffer.

It does not work for the consumers, the people browsing websites by the billions every single day, as their sites are not being updated to use these more efficient and much more powerful techniques as rapidly as they could have.

It only seems to work for the handful of people working in the WHATWG and W3C, but their goal with this process has started to elude me.

HTML5 has become a battlefield, and the people most excited to use it are the ones stuck in the crossfire. So what can we do?

There are several things we can do; first of all, the WHATWG wiki has a What you can do page, but it’s written for people wanting to contribute to HTML5 in its current direction, not so much for people wanting clarity in the process. What then? Peter-Paul Koch suggests that HTML5 is whatever you want it to be; similar to that is Andy Clarke’s suggestion: Keep calm and carry on (with HTML5). The latter two ideas have one important thing in common: they both recommend that you just ignore the politics of the WHATWG and W3C and simply use whatever parts of HTML5 you want and are available in browsers. This is sound advice because as a web developer, what really matters to you is what features are useful to serve your needs and are also implemented in today’s browsers.
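In practice, "use whatever parts are available in browsers" comes down to feature detection: check whether a feature exists before relying on it. Below is a minimal sketch of that idea, not how Modernizr or any other library actually does it. The `detectFeatures` helper, the `FeatureReport` type and the particular list of features are hypothetical names chosen for illustration; the only assumption is that each feature exposes its usual global (`localStorage`, `Worker`, `WebSocket`, `EventSource`, `navigator.geolocation`).

```typescript
// Hypothetical helper for illustration only; not taken from any spec or library.
type FeatureReport = Record<string, boolean>;

// Takes the global object as a parameter so the checks can be exercised
// against a fake environment as well as a real browser.
function detectFeatures(env: Record<string, unknown>): FeatureReport {
  const nav = env["navigator"];
  const hasNavigator = typeof nav === "object" && nav !== null;

  return {
    // Web Storage: the property exists on the global object.
    webStorage: "localStorage" in env,
    // Web Workers, Web Sockets and Server-sent Events expose constructors.
    webWorkers: typeof env["Worker"] === "function",
    webSockets: typeof env["WebSocket"] === "function",
    serverSentEvents: typeof env["EventSource"] === "function",
    // Geolocation hangs off navigator rather than the global object.
    geolocation: hasNavigator && "geolocation" in (nav as object),
  };
}

// Usage: inspect whatever environment the script happens to run in.
const report = detectFeatures(globalThis as unknown as Record<string, unknown>);
console.log(report);
```

A presence check like this only tells you that a browser exposes a feature, not that its implementation is complete or bug-free, so treat the result as a starting point rather than a guarantee.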

I’d like to propose another option: full modularization of HTML5. Real, official, proper modularization.

Modularized HTML5

The obvious question that arises from this suggestion is: “Why?”

The reason this current fragmentation of HTML5 is such a problem right now is that different parties are trying to appropriate different pieces of the spec, and it keeps being adjusted as a whole for the sake of “ongoing changes”. This is a real problem that won’t go away, but one that modularization could actually fix. Look at CSS3: it is a massive, massive specification, but because it was broken down into individual modules long ago it has undergone very clear and steady development (compared to HTML5, anyway). It also proved flexible enough to go along with new developments in the industry, such as CSS Transforms and CSS Transitions that were proposed by a non-W3C entity (Apple is a W3C Member, but they proposed these specifications independently).

The benefit for CSS3 was that browser vendors could easily choose which modules to implement and which to wait out on, which was not really clear-cut with CSS2 and only slightly so with HTML5. A proper modularization could clear up the mess from a vendor’s point of view[2].

It would offer a solution for the political situation, too: rather than trying to appropriate the entire specification, each party can simply focus on individual modules they care deeply about. True, this breaks the battle down into smaller chunks, but it’s better for everyone when an individual module gets pulled at by opposing parties, rather than the entire HTML5 specification.

How does modularization make this work for the Web?

The trick here is in perception: HTML5 as one gigantic specification, with bits and pieces being rolled in and taken out, is much more difficult to make any kind of sense of (for your average yet cutting-edge web developer) than a modularized specification that clearly identifies itself as consisting of many individually developed parts. CSS3 has reaped the benefits of that, and HTML5 can do so, too.

For instance, the Geolocation API is not part of the HTML5 specification, but you’ll be forgiven if you thought it was. Conversely, things like Web Storage are part of HTML5 but—especially with these recent changes—seem only somewhat related, if at all.

With a modularized HTML5, new developments can be rolled into the spec as new modules (like Geolocation), and the ongoing development of the Web won’t suffer. We can collectively consider “HTML5” the spec-brother (or sister) to CSS3, which continues to be used and re-used as a catch-all for new proposals and technologies.

And Web designers & developers?

They’re already used to modularized next-generation specs. This improved clarity would make things much more understandable for people who make websites, and thus encourage them much more effectively to use HTML5.

Modularization is but one option for the HTML5 specification, of course, but while it won’t please all parties equally, it would at least put an end to this battle and restore some sense to the specification.

One can hope, anyway.

  1. I apologize if the specification has suffered another name change by the time this article is published.
  2. Though with 3 out of 4 major browsers behind the WHATWG, it is unclear how much they actually care that it’s getting messy.
