2009

First day: war plans

Any developer with a background in (client-side) web development knows (in his flesh) that writing portable JavaScript code is a permanent dance around implementation discrepancies - feature-wise and behavior-wise. Still, given that we are speaking here about a nine-year-old specification that most vendors claim to implement (DOM2Core), the brave explorer soul would expect at least complete implementations, and to be left only with the usual unspecified suspects (getElementsByTagName in namespaced documents) and the dark corners (DocumentType) to deal with.

Such a savvy developer would certainly start things off with the following well-ordered war plan in mind:

  • read the specifications first (without checking browser behavior, so as not to get tainted)
  • write extremely bitchy test suites according to that preliminary reading
  • check best-of-breed browser behavior against that, read the specs again, amend the tests
  • check out the official W3C test suites, rip them out and amend again

Then, as discrepancies appear, order the stuff by priority/severity, and begin "fixing" - that is: override "buggy" native functions so they behave consistently across platforms. Pretty much the usual closure trick pattern:

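XMLDocument.prototype.whatever = (function() {
  // Keep a reference to the native (possibly buggy) implementation
  var buggy = XMLDocument.prototype.whatever;
  return function(tagName) {
    // Whatever works better, possibly reusing "buggy" in some cases,
    // e.g. return buggy.call(this, tagName);
  };
})();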

It's even feasible to add support for some fancy (and quite useful) stuff not in the DOM level 2 specification (thinking specifically about xml:id).

And of course, add to all that a nifty bootstrap tester mechanism, so that patches are not applied on (blind and dumb) User-Agent sniffing, but rather on live feature and bug testing.
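A minimal sketch of what such a bootstrap could look like. The `applyPatches` helper, the `{ name, isBroken, apply }` patch shape and the example patch are all hypothetical names made up for illustration, not part of any real library: each patch carries its own live test and is applied only when that test reports a problem.

```javascript
// Hypothetical bootstrap: every patch ships with a live test, and is applied
// only when that test shows the native behavior is missing or broken.
function applyPatches(patches, env) {
  var applied = [];
  for (var i = 0; i < patches.length; i++) {
    var patch = patches[i];
    var broken;
    try {
      broken = patch.isBroken(env);
    } catch (e) {
      // A test that blows up is treated as a broken feature.
      broken = true;
    }
    if (broken) {
      patch.apply(env);
      applied.push(patch.name);
    }
  }
  return applied;
}

// Illustrative patch only: install a placeholder getElementById on a
// document-like object that lacks one. A real patch would do real work.
var examplePatches = [{
  name: 'getElementById',
  isBroken: function (env) {
    return typeof env.document.getElementById !== 'function';
  },
  apply: function (env) {
    env.document.getElementById = function () { return null; };
  }
}];
```

Passing the environment in (rather than reaching for globals) keeps the tests themselves testable outside a browser.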

Second day: first round tests results - the winner is...

Assuming you get a "starter" test suite covering the useful stuff, results don't really come as a surprise.

Firefox and Opera do a good job - Safari as well (but see below for why it doesn't, actually). And... you knew Internet Explorer would be different.

There is no native XML parser/DOM support in Internet Explorer. Instead, you may use an ActiveX object (a.k.a. "MSXML2.DOMDocument"), which returns a DOMDocument (taking some liberties with the spec, including some additional stuff - see http://msdn.microsoft.com/en-us/library/aa923288.aspx).

I'm not being sarcastic. The IE approach is an implementation choice (whether bad or good is not up to me to judge). Now, you need to be very careful here: the ActiveX component exists in several different versions that bear no relation to IE versions - and yes, if you wondered, you are bound to whatever is installed on the user's box, and yes, if you wondered, there may be different versions of the component available simultaneously. So, you need to crystal-ball guess whichever version is the most up-to-date. Now, it's not all that simple: versions 4 and 5 were "special builds" meant for the Office suite, not for the web (even if accessible...). So, pick version 6 (or anything above), and if this doesn't pan out, try 3 instead (stay away from 4 and 5). Details are here: http://blogs.msdn.com/xmlteam/archive/2006/10/23/using-the-right-version-of-msxml-in-internet-explorer.aspx - and ha, I think this is obvious but... you will have to work around specific bugs for every MSXML2 version you plan on supporting...
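A rough sketch of that "6 first, then 3" lookup. The `createMsxmlDocument` function and its `create` parameter are hypothetical (the parameter is only there so the logic can be exercised outside IE); the two ProgID strings are the versioned MSXML identifiers for 6.0 and 3.0.

```javascript
// ProgIDs in order of preference: 6.0 first, then 3.0.
// 4.0 and 5.0 are left out on purpose.
var MSXML_PROGIDS = ['Msxml2.DOMDocument.6.0', 'Msxml2.DOMDocument.3.0'];

// `create` is a hypothetical injection point; in IE it defaults to
// instantiating the ActiveX object for the given ProgID.
function createMsxmlDocument(create) {
  create = create || function (progId) { return new ActiveXObject(progId); };
  for (var i = 0; i < MSXML_PROGIDS.length; i++) {
    try {
      // Return the ProgID along with the document, so callers know which
      // version's bugs they will have to work around.
      return { progId: MSXML_PROGIDS[i], doc: create(MSXML_PROGIDS[i]) };
    } catch (e) {
      // This version is not installed (or not allowed): try the next one.
    }
  }
  return null; // No usable MSXML version found.
}
```

Returning `null` instead of throwing leaves the caller free to fall back on something else entirely.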

Ahem.

So, now you have that component wrapped in shiny new clothes (to mimic DOMParser/LSParser), and you get back from it an "XMLDocument" equivalent, with its flaws and qualities. The first really bad surprise will hit you now. ActiveX objects are not "really" JavaScript objects - you can't extend them with new methods/accessors, and you can't override existing methods. Yet again, I think this is an implementation choice - if a painful one. So, you are up for your first dramatic decision on your road to consistent XML-DOM support: if you can't extend, you will delegate - write a DOM implementation that internally leans on the ActiveX component. Well, you could instead make the right choice right away, and decide to write a DOM implementation from scratch (parser included), but at the time you don't know yet that you'll get more problems than benefits from delegation, and you think that given the time constraints of the project, it would be better not to reinvent each and every rod of the darn wheel.
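To make the delegation idea concrete, here is a bare-bones sketch. `GhostElement` and its `_native` field are hypothetical names, and the sketch assumes the wrapped object exposes `getAttribute` and `getAttributeNode` but possibly no `hasAttribute`:

```javascript
// Hypothetical wrapper: a plain JavaScript object that leans on a native
// node, so that methods can be added or overridden freely.
function GhostElement(nativeNode) {
  this._native = nativeNode;
}

// Plain delegation: hand the call over to the native object.
GhostElement.prototype.getAttribute = function (name) {
  return this._native.getAttribute(name);
};

// Added method, built on top of what the native object does provide
// (assumption: getAttributeNode returns null/undefined when the attribute
// is missing).
GhostElement.prototype.hasAttribute = function (name) {
  return this._native.getAttributeNode(name) != null;
};
```

The price is visible right away: every native node handed back to user code has to be wrapped, and every wrapper holds a reference to its native counterpart.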

Third day: you thought you wouldn't get bad surprises from the Cult

So, you patched the few inconsistencies you spotted between Opera and Firefox (leave the doctype problems out, for now) - that's, as I said earlier, mainly stuff related to the unspecified behavior of DOM 1 methods in a namespaced context and other kinkiness that might leave you with a horked non-XML live document. You also wrote a huge chunk of wrapping glue to get IE back into the fray - and you start to feel confident.

Well... all of a sudden, you segfault WebKit on a trivial manipulation (https://bugs.webkit.org/show_bug.cgi?id=26402). Annoying, but it takes more than that to stop you. Well, you get more: https://bugs.webkit.org/show_bug.cgi?id=26147

That's right, you read that right. No events on a document that is not rendered.

Who freakin' cares about that? - you might ask (echoing some of the commentators who crawled out of the dark here: http://ljouanneau.com/blog/post/2009/06/02/problemes-XML-dans-webkit).

Well... maybe if WebKit is to claim a DOM implementation, they should care.

Either way, you're on your own. So, you take that clumsy patch of yours that brought an event stack to IE, and apply it to WebKit as well - as painful as it looks and sounds...

Fourth day: trouble everIEday

The ground work being laid, you can now get where it really hurts.

First, your IE patch sucks and leaks memory insanely. Oh, nothing noticeable on a reasonably-sized document, but whenever you hit 500+ elements (with, say, 5 attributes per element - and that would make a few thousand DOM nodes), you can easily get into the hundreds-of-megabytes red zone. I haven't yet figured out why we do so poorly on the memory front (obviously in a non-linear way). Leaking is almost certainly due to circular references and/or closures not being properly GC-ed (I guess I'll have to chase these and destroy them "manually"). As for the initial consumption itself, I guess something silly is lurking in there - possibly linked to the lean/bind use we are making of the ActiveX component.

Then come endless parsing problems.

PIs (processing instructions) are not supported at all. Too bad - strip them out before parsing, and thanks for the fish.
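A crude sketch of that stripping step, working on the raw source string. `stripProcessingInstructions` is a made-up name, and the regular expression is a simplification: it keeps the XML declaration, but it will also eat anything that merely looks like a PI inside a comment or a CDATA section.

```javascript
// Remove <?target ...?> processing instructions from an XML source string,
// keeping the XML declaration (<?xml version="1.0"?>) in place.
// Assumption: no PI look-alikes hidden in comments or CDATA sections.
function stripProcessingInstructions(xml) {
  return xml.replace(/<\?(?!xml\s)[\s\S]*?\?>/g, '');
}
```

Note that `<?xml-stylesheet ...?>` is a regular PI, not the declaration, so it gets stripped like the others.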

DTD declarations are a real problem as well. No matter what I do with validateOnParse or resolveExternals, I always get denied. So much for DTDs...

Well, http://blogs.msdn.com/ie/archive/2009/01/13/responding-to-change-updated-getter-setter-syntax-in-ie8-rc-1.aspx makes you think you could at least get real setters/getters on your ghost-DOM implementation - too bad for you if you read too fast. This doesn't concern random JavaScript objects, but only DOMNodes... You'll have to stick with "read-only" properties that can be overwritten...

Fifth day: ad nauseam

All that was barely for the "Core" and basic events support... Although crucial, they are far from getting you to full compliance - you can now spend the rest of the year on StyleSheets, Mutation Events, Views, and the countless problems your funky patch will bring to light...

A rather arid experience...

2010

A year after: things I should not have done (and things I plan on changing)

So, here you are, sitting on your pile of "somewhat consistently working" cross-browser GhostDOM layer - quite likely a terse 1500+ lines of TV-MA-LSV javascript.

Retrospectively, there are a couple of things I shouldn't have done, and that anyone looking into something similar should be aware of:

  • don't rely on the MSXML2 ActiveX component on the sole assumption that it can't be worse than having to parse yourself. Because it is, by far - and you will end up rewriting everything either way. I have been looking recently into (the rather impressive so far) XML for <script> SAX parser - too bad it doesn't perform so well when the document size grows. I also looked into something more lightweight and more efficient (along the lines of the Rex Shallow Parser), and will probably end up with something like that.
  • never, ever, lose sight of memory consumption in Internet Explorer - this will bite you very hard in the back, and by then it will be too late - right now I'll have to gut-rip the whole thing out...
  • always test with huge documents, if you expect your thing to scale up decently - as above, if you neglect that, you will end up with a damn slow hog, and performance is likely critical here
  • take a look now at what the XML for <script> guys did (and others), before rewriting half of their stuff in a curiously look-alike way...

Pushing the envelope

Still, it works. We now have something that is IMHO better than any UA's own implementation - not (only) because it corrects implementation-specific problems, but more importantly because it addresses what I feel is the #1 flaw of the "web platform" - inconsistency - by providing a consistent layer, whatever the browser.

And yet, I'm extremely frustrated - because I know we barely scratched the surface on a very limited area, and that this is far from enough to qualify as reliable and consistent.

Leaving aside what seems to me entirely broken DTD support in most UAs (last I checked), here is what happens if you keep digging on simple things. Take rather crude interfaces: NodeList and NamedNodeMap.

Now, torture them by feeding them invalid values. This is twilight zone. Some User-Agents will throw (possibly inadequate) exceptions. Others will return null. Others will return what they would return if the value was "0".

Concretely:

doc.childNodes.item("whatever")

will return doc.childNodes.item(0) in Firefox and Chrome, but null in Opera.

On the other hand:

doc.childNodes.item(-10)

will return null in Firefox and Opera, but throw INDEX_SIZE_ERR in Chrome.

Conversely:

doc.childNodes[-10]

will throw "Index or size is negative or greater than the allowed amount" in Firefox, but yield undefined in Chrome and Opera. Note btw that Firefox will happily return undefined on doc.childNodes[Infinity], making the message of the previous exception rather confusing (what could possibly be greater than Infinity, guys? :-)).

Interestingly, Chrome is not consistent in its behavior when it comes to NamedNodeMap (admittedly the two interfaces are unrelated, though they look alike - the item method and [] accessor), where it never throws on invalid indexes.
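For what it's worth, here is the kind of shim this pushes you to write. `safeItem` is a hypothetical helper, and the policy it enforces (anything that is not a valid in-range index gives null, and nothing ever throws) is just one possible choice, picked here because DOM2 Core says item returns null for an index that is not valid:

```javascript
// Hypothetical helper: one single behavior for NodeList/NamedNodeMap item(),
// whatever the browser does natively with garbage input.
function safeItem(list, index) {
  // Only accept finite, non-negative integers passed as numbers.
  if (typeof index !== 'number' || !isFinite(index) ||
      index < 0 || index !== Math.floor(index)) {
    return null;
  }
  if (index >= list.length) {
    return null;
  }
  try {
    var node = list.item(index);
    return node == null ? null : node; // fold undefined into null
  } catch (e) {
    return null; // never let a native exception escape
  }
}
```

With this in place, `safeItem(doc.childNodes, "whatever")` and `safeItem(doc.childNodes, -10)` both give null regardless of the UA.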

The situation gets even more interesting when feeding setNamedItem with invalid values: Firefox throws a host of different and fine-grained exceptions, while Opera only throws WRONG_ARGUMENTS_ERR - both approaches being respectable. Now, Chrome throws a NOT_FOUND_ERR DOM Exception that is definitely not supposed to be used in such a case (unless I missed something in the spec, in which case Opera and Firefox would be to blame, which I kind of doubt here).

You may very well argue that these are not interesting - that it's not the role of the specification to cover each and every possible case, and that it's perfectly sane to let implementations differ in the way they handle unspecified situations.

This is possibly a sane answer - and seems to be the browser makers' default answer. After all, it's already complicated enough (and kind of an engineering miracle :-)) to get these guys to implement the same thing, and to have their implementations behave in a reasonably similar manner on "expected" values ("expected" == whatever the specification is not mute about).

But pardon me, I do disagree - at least if the DOM API is to be the basis of the "web as a platform", and if the web as a software platform is to be taken seriously. Now, setting aside what web developers have grown accustomed to (the daily wtf :-)), let's ask questions out of the blue:

  • what would you think of a "platform" that randomly, on the same methods, in the same context, either throws or returns null?
  • how would you manage to write reliable code on top of that?
  • how would you properly handle errors in your application (yeah, shit does happen), in such conditions?

I know that exception handling is not exactly web-developers' strong suit (no disrespect! you savvy readers are, I'm sure, masters at that!), but bashing them for not being "real" developers is not helpful either, while the platform in itself is in such a shape.

Now, not all client applications in the world are mundane web pages that can afford to randomly stop working. Reliability is definitely one of the criteria a developer should consider when picking a technology for a given project, and on that front the platform doesn't sound convincing - especially since we are not speaking here about some new fancy-kinky experimental support for a bleeding-edge technology (or even about corner-case rendering oddities), but about what (technically) is a decade-old fundamental brick of the platform. I am certainly not faint of heart when it comes to getting things done (no matter how dirty) - nor, I guess, do I belong to the "web" newbie category - but a platform you need to wrap into layers of extra code before doing anything, just to make sure it will behave, doesn't sound that appealing for anything more serious than gadget development and demo showcases...

[To be continued :-)]