Not so fast. HTML5 is a language. But it's a language in sort of the same way design languages are languages (...what a terribly apt analogy, no?) It's not exactly a metaphor... it's better to think of it as somewhat of an abstraction: it's really a conceptualization of objects and abilities or things you can have and things you can do with them. It's decoupled from the implementation, that is to say how it is written. The W3C draws this distinction by saying HTML5 is a vocabulary that can be implemented, in one of two syntaxes: the HTML Syntax or the XML Syntax. The most important distinction for you to understand as a web developer is that the XML Syntax is valid XML and valid XML is well-formed. Or it's not XML. (It's crap. (And it's going in the bin.))
You may be an outlayer but I am willing to bet Euros to Yen that my experience here is virtually universal: I have, to this day, never encountered HTML5 implemented in XML Syntax in the wild. It's a simple matter of using the right tools for a given job: XML is generally used for exchanging information between machines (think AJAX, .docx documents, XMPP (Jabber, Skype), SVG graphics...). And it makes sense: there shouldn't be any blemishes in these formats - if there's an error something is very wrong and it is correct that an entire document or message or packet should be discarded. It is not meant to be pretty, nor made so - it is meant to accurately and precisely describe information itself. Those of us who tried to switch from HTML to strict XHTML in the late 90s and early 00s out of a idealistic and good-intentioned but naive belief that well-formed, machine-readable and human-friendly regime would dominate the future understand only too well today the very good reasons why a forgiving, error-tolerant human-oriented markup language makes sense for creating and delivering documents intended for consumption by humans.
All of that is to say nothing of the main ideological argument that pervaded until only rather recently. Virtually from the beginning, HTML was criticized for having its feet in two ponds: it looks like a markup language and markup languages are meant for describing data. But here we're talking about data that is intended, especially in the early days and still very certainly nominally, to be rendered in a web browser and presented to humans. If it was meant for machines to process and then do god only knows what, possibly including presenting it to humans also, it would probably be better accomplished in a different format. Like XML - which is exactly what XML is for. But we started off with tags clearly only relevant to displaying documents to humans, like >center>. Which, come to think of it, is pretty limited. Would the next iteration include tags for right-aligning content? What about justified text? With word-wrapping? Without?
Wouldn't it make so much more sense if a so-called Printer Friendly version was already a part of the document, after all we're just changing a couple of visual features - often the same ones thousands of times per website. And then there's the problem of just how visual those few formatting controls available were. People with disabilities use the web too - in fact if we put their needs and use-cases at the centre of our focus for a moment it rapidly becomes intoxicating to contemplate how the web could be a revolutionary tool to reach them - for them to reach you - and connect us all together in ways never even possible to conceive of before. Finally, the stresses gave way and continents shifted, into the ever-transitional, never-quite-done ecosystem Cascading StyleSheets' first wails into the dark night were heard and we could finally cleanly separate form from function, appearance from structure and get to the Good Work™ of making a markup language that describes how data should be organized, inter- and intra-related for presentation to a human being, complimented by a styling language that defines how that data should look, feel, sound, print... (..taste... ..smell... ..?)
...OR the W3C could dick around for a decade stubbornly buggering with XHTML 2.0 instead, which it would later abandon anyway. In so doing they issued a summary slap in the c*ck to every man woman and child whose life or livelihood is impacted by the unhindered evolution of a vibrant and economically crucial world wide web (YUP THAT'S YOU! :D)...). As they slept peacefully atop their wheel (Netscape Navigator?) the real world was passing them by, ever-more desperate for the self-declared arbiters of the most fundamental technology underpinning the web: namely the standardized language its participants must speak to be heard (and speak well to not waste everybody's goddamned time and money). Verily, by virtue of all that makes the Internet good and holy Web Hypertext Application Technology Working Group (WHATWG) self-assembled to meet the needs of the time:
The Web Hypertext Application Technology Working Group (WHATWG) is a community of people interested in evolving the web through standards and tests.
The WHATWG was founded by individuals of Apple, the Mozilla Foundation, and Opera Software in 2004, after a W3C workshop. Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, lack of interest in HTML, and apparent disregard for the needs of real-world web developers. So, in response, these organisations set out with a mission to address these concerns and the Web Hypertext Application Technology Working Group was born.
In 2017, Apple, Google, Microsoft, and Mozilla helped develop an IPR policy and governance structure for the WHATWG, together forming a Steering Group to oversee relevant policies.
Hell yeah! That's how it's done Internet Style! Imagine: the people behind the development of the major competing browser engines - the engineers that actually have to design the components that communicate, transact and render at the most fundamental level came together in opposition of indifference, stagnation and wanton disregard for the needs of the public writ large! Even today they still coordinate the implementation and future of the language of the Web itself. This was utterly unimaginable in the era of the Microsoft Antitrust Case. The benefits extend far beyond the introduction of HTML5 and every hobbyist and professional web developer active in both eras feels them whenever their work loads consistently in a second browser. That's not to say W3C's work on XHTML 2.0 and its later pivot to "modularization" was entirely for naught: fruits included SVG, MathML and Xforms which would come to influence the inception of HTML5.
By 2007 The W3C saw the writing on the wall: get with the program or resign to irrelevance and eventual disbandment. A new working group was created to onboard the WHATWG specification and a tentative cooperation was established. It didn't take long for the W3C to get ahead of its britches. In 2011, deciding that HTML5 needed a finalization in the same stuffy tradition as the broken, inflexible and inorganic versions prior. In lock-step with this action a new working group was chartered to begin work on a so-called HTML 6 specification, leaving some wondering if the whole affair weren't a cynical, surprise ploy by the W3C to gain the upper hand and secure the future against WHATWG by simply painting a number on it that nobody asked for or needed. The WHATWG disagreed, being headed by pragmatic industry players with skin in the game and pieces on the board a Living Standard that could respond to new challenges and ideas was preferred. An understandable decision given the active cooperation and participation of the major browser vendors fostered a venue that was uniquely capable of ensuring interoperability, consistency and the ability to both deploy and actually realize changes to the specification in time frames that would otherwise be impossible. Adhering to the old convention of monolithic versioning was not only outdated, unnecessary and - looking back it could be argued - demonstrably broken, it would effectively handicap an industry that fancies itself, not without merit, to be a bedfellow of innovation.
Finally, in May 2019, the two camps were sleeping together once more and all was well in the world. WHATWG, obviously, being the boy of the relationship and the W3C being the girl. Today both groups perform important functions in maintaining the HTML5 specification and its siblings and the W3C has come to accept the sensibility of the Living Standard. Indeed, one could say the Internet has been graced by a new era aligned under a Live and let live Standard.