Does it really matter? The data representation could be Markdown, YAML, JSON, SVG, or just about anything else and be equally effective. The browser already supports XML and HTML5 as wildly differing grammars that still compile to the same target.
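To make the "different grammars, same target" point concrete, here is a rough sketch in Python. It is not how a browser does it: the dict-based tree, the `from_html` / `from_xml` helpers, and the short `VOID` set are all made up for illustration, and it only copes with small well-formed snippets. It parses the same fragment written in HTML5 syntax and in XML syntax and checks that both produce an identical tree.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Illustrative subset of HTML void elements (elements that take no end tag).
VOID = {"br", "hr", "img", "input", "meta", "link"}


def node(tag, attrs):
    """Build one tree node; this dict layout is an assumption of the sketch."""
    return {"tag": tag, "attrs": dict(attrs), "children": []}


class TreeBuilder(HTMLParser):
    """Builds a tiny tree from HTML syntax. Handles well-formed input only."""

    def __init__(self):
        super().__init__()
        self.root = node("#root", {})
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        n = node(tag, attrs)
        self.stack[-1]["children"].append(n)
        if tag not in VOID:
            self.stack.append(n)  # only non-void elements can hold children

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1]["children"].append(data)


def from_html(source):
    parser = TreeBuilder()
    parser.feed(source)
    parser.close()
    return parser.root["children"][0]


def from_xml(source):
    def convert(el):
        n = node(el.tag, el.attrib)
        if el.text:
            n["children"].append(el.text)
        for child in el:
            n["children"].append(convert(child))
            if child.tail:
                n["children"].append(child.tail)  # text following the child
        return n

    return convert(ET.fromstring(source))


html_tree = from_html("<p class=note>Hello<br>world</p>")     # HTML5 syntax
xml_tree = from_xml('<p class="note">Hello<br/>world</p>')    # XML syntax
print(html_tree == xml_tree)
```

The two inputs differ in quoting and in how the empty `br` element is written, yet the resulting trees compare equal, which is all the sketch is meant to show.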
HTML is not a DATA representation; it wasn't designed that way, and it would actually be way noisier if it attempted to represent all the hidden state of the DOM.