
Unattended Windows Installation Error 0x80042565 DiskConfiguration

When performing an "Unattended Windows installation" with an Answer File, as outlined in Go Fast: Windows 10 and 11 Unattended Installation Answer File Template, you may encounter the following error:

Windows Setup

Windows could not create a partition on disk 0. The error occurred while applying the unattended answer file's <DiskConfiguration> setting. Error code: 0x80042565

This usually means you are using an installation medium with an answer file configured to create a GPT partition table, compatible with a UEFI boot environment, but the machine - while it may be UEFI capable - has booted in Legacy Mode (in which only an MBR partitioning scheme may be installed).

To fix this, reboot and reconfigure your boot settings so that the machine boots in UEFI or "UEFI first, then fall back to Legacy" mode. Alternatively, if your firmware supports both modes simultaneously, your installation medium may show up as two boot device entries: one for UEFI mode and one for Legacy. Either ensure the UEFI entry is attempted first or use your system's one-off Boot Device Menu (where available) to select the appropriate entry.

ModSecurity Rule 932105 Execution error - PCRE limits exceeded

mod_security / ModSecurity / ModSec / whatever the kids are calling it today is a battle-tested Web Application Firewall that plugs into the Apache HTTP daemon's modular framework and has been the main mechanism for implementing Intrusion Prevention and DoS mitigation to the LAMP stack for even slightly longer than I've been doing this - way back in the Apache 1.3.x era 19 years ago. If you've never used mod_security but have implemented any sort of IPS/IDS or DoS mitigation technology before your mind's gears have already sputtered "gee I bet that takes a hella lotta resources," buddy - you better believe it. While I would never leave my house and venture into The Wild without smothering myself in a thick lather of mod_security it is prudent to apply it intelligently in situations where available resources, cluster nodes or funding in general is not unlimited. That's why on high-traffic installations you might be enabling and disabling it or at least running radically different configurations with content-appropriate rulesets on a per-vhost basis, separating things like static content away from interpreted scripts and other attack surfaces more susceptible to, say, getting tricked into running Bob's shellcode zero-du-jour.

These days mod_security comes with a lot of rules and that's great (other than the false positives). But as anyone who admins a Snort or Suricata instance knows, the more rules, the more resources are demanded. PCREs in particular are powerful tools for, well, pattern matching - and that's most of what we're doing in this type of situation. mod_security essentially acts like a proxy, sitting between the end user making a request and the server-side end point that will process it, keeping track of vital statistics so it can judge the intent behind and risk of each request and running pattern match query after query after query on the data going in and coming out. Unfortunately powerful tools are also power hogs, so to prevent mod_security from itself instigating a resource starvation Denial of Service condition, default limits on the total number of PCRE operations that can be run on any given transaction are imposed. The default behavior when that limit is reached is to err on the side of caution: instead of processing the transaction, mod_security will call the destination up yet provide no input, meanwhile throwing a 500 Internal Server Error to the client. Depending on your configuration it may be very difficult to detect when this particular condition is at fault for your seemingly aborted transaction; often you will have to resort to prying open your Apache error logs, either those configured globally or the one configured for the vhost at hand.

ModSecurity: Rule 6fa404524850 [id "932105"][file "/var/apache2/template/etc/mod_sec3_CRS/REQUEST-932-APPLICATION-ATTACK-RCE.conf"][line "158"] - Execution error - PCRE limits exceeded (-8): (null). [hostname "ychan.net"] [uri "/post.php"], referer: https://ychan.net/r/

As you can see I recently ran into such a condition with the upload processing script for our imageboard, Ychan (NSFW). I'm already doing a lot of my own security testing inside that script (including several PCREs, in fact), for obvious reasons, so I should be free to do any number of things:

  1. I can change what mod_security does when it encounters a match by changing the SecRuleEngine Apache configuration directive's value to DetectionOnly:
    SecRuleEngine DetectionOnly
    This will take the teeth out of mod_security's response and only log the match instead of zapping the transaction. Obviously I would want to think long and hard about using this on a production system and then only apply it in a very limited perimeter - for example a specific <Directory>, <Files> or <Location>:
    <VirtualHost *:80>
        ServerName ychan.net
        ServerAlias www.ychan.net
        <Location /post.php>
            SecRuleEngine DetectionOnly
        </Location>
        ...
    </VirtualHost>
  2. I can raise the limits themselves with the SecPcreMatchLimit and SecPcreMatchLimitRecursion directives:
    SecPcreMatchLimit 150000
    SecPcreMatchLimitRecursion 150000
  3. Going by Google, it seems I could add these lines to my php.ini file:
    pcre.backtrack_limit = 10000000
    pcre.recursion_limit = 10000000
    or, where supported, to the relevant .htaccess file:
    php_value pcre.backtrack_limit 10000000
    php_value pcre.recursion_limit 10000000

But there's a problem... as with SecRuleEngine above, I'd like to be able to make these changes to SecPcreMatchLimit and SecPcreMatchLimitRecursion with some specificity, i.e.:

  • In my given vhost's configuration block (i.e. /etc/apache2/vhosts.d/mysub_domain_com.conf)
  • In the .htaccess file of the relevant directory, where AllowOverride All is in effect

However, according to the documentation, these directives can only be configured globally! That is:

  • In (/etc/[apache2|httpd]/conf.d/)httpd.conf
  • In (/etc/[apache2|httpd]/modules.d/)(20_)mod_rewrite.conf

This effectively renders about 90% of the articles and answers I have encountered in writing this essentially useless, or misleading at best. What's more, these directives don't even apply in versions 3 and over, but it's not likely that you found this article if that's your situation. Good thing we read the manual around here, right? :D

Back to the solution, this raises the question: what do these values do exactly?

From https://serverfault.com/questions/408265/what-are-pcre-limits:

These appear to be settings internal to the PCRE engine in order to limit the maximum amount of memory/time spent on trying to match some text to a pattern. The pcreapi manpage does little to explain it in layman's terms:

The match_limit field provides a means of preventing PCRE from using up a vast amount of resources when running patterns that are not going to match, but which have a very large number of possibilities in their search trees. The classic example is the use of nested unlimited repeats.

Internally, PCRE uses a function called match() which it calls repeatedly (sometimes recursively). The limit set by match_limit is imposed on the number of times this function is called during a match, which has the effect of limiting the amount of backtracking that can take place. For patterns that are not anchored, the count restarts from zero for each position in the subject string.

The default value for the limit can be set when PCRE is built; the default default is 10 million, which handles all but the most extreme cases. You can override the default by suppling pcre_exec() with a pcre_extra block in which match_limit is set, and PCRE_EXTRA_MATCH_LIMIT is set in the flags field. If the limit is exceeded, pcre_exec() returns PCRE_ERROR_MATCHLIMIT.

The match_limit_recursion field is similar to match_limit, but instead of limiting the total number of times that match() is called, it limits the depth of recursion. The recursion depth is a smaller number than the total number of calls, because not all calls to match() are recursive. This limit is of use only if it is set smaller than match_limit.

Since the PCRE library built-in default is 10000000, my guess is that the lower setting is suggested for mod_security in order to prevent requests from being held up for a long time.

In fact, according to the documentation the default value for mod_security is 1500 - very low indeed, until we consider that it is expected to process up to several thousands of transactions per second.
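
PHP's own PCRE binding has the same kind of ceiling (the pcre.backtrack_limit setting from option 3 above), which makes it a handy place to watch a match limit trip without touching Apache. What follows is only a rough sketch: the helper name hitsBacktrackLimit() is made up for illustration, switching pcre.jit off for the duration is my assumption about how to keep the interpreter's counter in play, and PHP's counter is not guaranteed to tick in step with whatever mod_security counts.

```php
<?php
// Hypothetical helper: run one match under a deliberately small backtrack
// limit and report whether PCRE gave up instead of finishing the search.
function hitsBacktrackLimit(string $pattern, string $subject, int $limit): bool
{
    $previousLimit = ini_get('pcre.backtrack_limit');
    $previousJit   = ini_get('pcre.jit'); // false when PHP was built without JIT

    // Assumption: with the JIT off the interpreter's match counter applies.
    if ($previousJit !== false) {
        ini_set('pcre.jit', '0');
    }
    ini_set('pcre.backtrack_limit', (string) $limit);

    $result = preg_match($pattern, $subject);
    $gaveUp = ($result === false
        && preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR);

    // Put the settings back the way we found them.
    ini_set('pcre.backtrack_limit', (string) $previousLimit);
    if ($previousJit !== false) {
        ini_set('pcre.jit', (string) $previousJit);
    }

    return $gaveUp;
}
```

Feeding it the "nested unlimited repeats" case from the manpage, for instance hitsBacktrackLimit('/(a+)+$/', str_repeat('a', 30) . 'b', 1500), is the sort of input that should exhaust a 1500-step budget almost immediately, while a tidy pattern such as /^a+b$/ against the same subject should not come anywhere near it.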

A Brief (and Delightfully Snarky) History of HTML5

The Official HTML5 Logo

You've probably been told - or perhaps you have arrived at the conclusion yourself - that HTML5 is this great new (new? OK I'm an old man everything is new to me these days) revision of the immortal HyperText Markup Language that has come in like a wrecking ball to elevate mobile design, gloriously unite browser rendering engines, modernize the ECMAScript (that's JavaScript to you, kiddo - but don't you ever confuse it with JScript!) API and above all else: clear out the mind-boggling clutter of mix-and-match legacy DTDs including some arcane, experimental footnote that uber-dweebs pushed to feel smarter than everyone else called 'XHTML' that hated puppies and had self-closing tags. Well, they called them that but if you didn't close them yourself they'd snap at your fingers or something. Anyway HTML5 crammed all that rambunctious, frothing mess into the basement and slammed the door shut on it forever. Click! Phew. Now we have one big, forgiving standard with all the goodies and a thoughtfully selected menu of pre-defined tags to pick and choose from that make sense for describing information and designing visual layouts and tactile layouts and auditory layouts... all sorts of layouts! And best of all you can even make your own tags named more-or-less whatever you want that start off as simple replicas of <div> so you can easily extend the language itself in ways that are simpler and sensible-er and smart-er-er than even the W3C could have ever envisioned. The end! Right?

Not so fast. HTML5 is a language. But it's a language in sort of the same way design languages are languages (...what a terribly apt analogy, no?) It's not exactly a metaphor... it's better to think of it as somewhat of an abstraction: it's really a conceptualization of objects and abilities or things you can have and things you can do with them. It's decoupled from the implementation, that is to say how it is written. The W3C draws this distinction by saying HTML5 is a vocabulary that can be implemented, in one of two syntaxes: the HTML Syntax or the XML Syntax. The most important distinction for you to understand as a web developer is that the XML Syntax is valid XML and valid XML is well-formed. Or it's not XML. (It's crap. (And it's going in the bin.))

You may be an outlier but I am willing to bet Euros to Yen that my experience here is virtually universal: I have, to this day, never encountered HTML5 implemented in XML Syntax in the wild. It's a simple matter of using the right tools for a given job: XML is generally used for exchanging information between machines (think AJAX, .docx documents, XMPP (Jabber), SVG graphics...). And it makes sense: there shouldn't be any blemishes in these formats - if there's an error something is very wrong and it is correct that an entire document or message or packet should be discarded. It is not meant to be pretty, nor made so - it is meant to accurately and precisely describe information itself. Those of us who tried to switch from HTML to strict XHTML in the late 90s and early 00s out of an idealistic and good-intentioned but naive belief that a well-formed, machine-readable and human-friendly regime would dominate the future understand only too well today the very good reasons why a forgiving, error-tolerant human-oriented markup language makes sense for creating and delivering documents intended for consumption by humans.

You can't blame us for trying; the problems with HTML 2.0 through 4.01 (don't ask where 1.0 went, we don't like to talk about it) were more glaring than obvious. Less the fault of iterations of the language itself than browser support, dozens of browsers and a multitude of rendering engines upon which those browsers had been created were developed in this time and as one might imagine not only different concepts, tags and attributes were supported, ignored or different from browser to browser but the way they were displayed to the user and functioned was increasingly divergent. More daunting still was the art of cross-platform JavaScript engineering the nascent paradigm of webapps demanded serious programmers pull off, birthing a slew of bulky and inefficient compatibility libraries that could themselves just as easily break as raw code while each distinct dialect evolved and shifted beneath them. Many developers made the logical but limiting decision to design for a certain browser to the exclusion of others, meanwhile an elite culture of deadly warrior-poets emerged (yours truly among them) that dared to wrangle the quirks of as many market-share claiming beasts as possible in works of ugly to awe-inspiring hackery that can strike terror into the hearts of polite programmers even today. If this is what it was like for people to converse in the language of the web I'm sure you can imagine what that meant for machines; spiders, indexers, gateways, proxies, aggregators, caches, load balancers, analysis, analytics, application-level firewalls, content delivery, content management, intrusion prevention, intrusion detection, systems, SYSTEMS, SYSTEMS, SYSTEMS! All these systems could do so much more if only half the stuff passing through them didn't taste like junk.

All of that is to say nothing of the main ideological argument that pervaded until only rather recently. Virtually from the beginning, HTML was criticized for having its feet in two ponds: it looks like a markup language and markup languages are meant for describing data. But here we're talking about data that is intended, especially in the early days and still very certainly nominally, to be rendered in a web browser and presented to humans. If it was meant for machines to process and then do god only knows what, possibly including presenting it to humans also, it would probably be better accomplished in a different format. Like XML - which is exactly what XML is for. But we started off with tags clearly only relevant to displaying documents to humans, like <center>. Which, come to think of it, is pretty limited. Would the next iteration include tags for right-aligning content? What about justified text? With word-wrapping? Without?

Wouldn't it make so much more sense if a so-called Printer Friendly version was already a part of the document? After all, we're just changing a couple of visual features - often the same ones thousands of times per website. And then there's the problem of just how visual those few formatting controls available were. People with disabilities use the web too - in fact if we put their needs and use-cases at the centre of our focus for a moment it rapidly becomes intoxicating to contemplate how the web could be a revolutionary tool to reach them - for them to reach you - and connect us all together in ways never even possible to conceive of before. Finally, the stresses gave way and continents shifted: in the ever-transitional, never-quite-done ecosystem Cascading Style Sheets' first wails into the dark night were heard and we could finally cleanly separate form from function, appearance from structure and get to the Good Work™ of making a markup language that describes how data should be organized, inter- and intra-related for presentation to a human being, complemented by a styling language that defines how that data should look, feel, sound, print... (..taste... ..smell... ..?)

...OR the W3C could dick around for a decade stubbornly buggering with XHTML 2.0 instead, which it would later abandon anyway. In so doing they issued a summary slap in the c*ck to every man woman and child whose life or livelihood is impacted by the unhindered evolution of a vibrant and economically crucial world wide web (YUP THAT'S YOU! :D). As they slept peacefully atop their wheel (Netscape Navigator?) the real world was passing them by, ever-more desperate for leadership from the self-declared arbiters of the most fundamental technology underpinning the web: namely the standardized language its participants must speak to be heard (and speak well to not waste everybody's goddamned time and money). Verily, by virtue of all that makes the Internet good and holy, the Web Hypertext Application Technology Working Group (WHATWG) self-assembled to meet the needs of the time:

From What is the WHATWG? (https://whatwg.org/faq):

The Web Hypertext Application Technology Working Group (WHATWG) is a community of people interested in evolving the web through standards and tests.

The WHATWG was founded by individuals of Apple, the Mozilla Foundation, and Opera Software in 2004, after a W3C workshop. Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, lack of interest in HTML, and apparent disregard for the needs of real-world web developers. So, in response, these organisations set out with a mission to address these concerns and the Web Hypertext Application Technology Working Group was born.

In 2017, Apple, Google, Microsoft, and Mozilla helped develop an IPR policy and governance structure for the WHATWG, together forming a Steering Group to oversee relevant policies.

W3C HTML5 WHATWG
Clicking navigates to a visual History of HTML 5 Timeline by Darryll Alcorn

Hell yeah! That's how it's done Internet Style! Imagine: the people behind the development of the major competing browser engines - the engineers that actually have to design the components that communicate, transact and render at the most fundamental level came together in opposition to indifference, stagnation and wanton disregard for the needs of the public writ large! Even today they still coordinate the implementation and future of the language of the Web itself. This was utterly unimaginable in the era of the Microsoft Antitrust Case. The benefits extend far beyond the introduction of HTML5 and every hobbyist and professional web developer active in both eras feels them whenever their work loads consistently in a second browser. That's not to say W3C's work on XHTML 2.0 and its later pivot to "modularization" was entirely for naught: fruits included SVG, MathML and XForms which would come to influence the inception of HTML5.

By 2007 the W3C saw the writing on the wall: get with the program or resign itself to irrelevance and eventual disbandment. A new working group was created to onboard the WHATWG specification and a tentative cooperation was established. It didn't take long for the W3C to get too big for its britches: in 2011 it decided that HTML5 needed a finalization in the same stuffy tradition as the broken, inflexible and inorganic versions prior. In lock-step with this action a new working group was chartered to begin work on a so-called HTML 6 specification, leaving some wondering if the whole affair weren't a cynical, surprise ploy by the W3C to gain the upper hand and secure the future against WHATWG by simply painting a number on it that nobody asked for or needed. The WHATWG disagreed; being headed by pragmatic industry players with skin in the game and pieces on the board, it preferred a Living Standard that could respond to new challenges and ideas. An understandable decision, given that the active cooperation and participation of the major browser vendors fostered a venue that was uniquely capable of ensuring interoperability, consistency and the ability to both deploy and actually realize changes to the specification in time frames that would otherwise be impossible. Adhering to the old convention of monolithic versioning was not only outdated, unnecessary and - looking back it could be argued - demonstrably broken, it would effectively handicap an industry that fancies itself, not without merit, to be a bedfellow of innovation.

Finally, in May 2019, the two camps were sleeping together once more and all was well in the world. WHATWG, obviously, being the boy of the relationship and the W3C being the girl. Today both groups perform important functions in maintaining the HTML5 specification and its siblings and the W3C has come to accept the sensibility of the Living Standard. Indeed, one could say the Internet has been graced by a new era aligned under a Live and let live Standard.

DOMDocument::loadHTML(): Tags Invalid Parsing HTML5 and More!

You may have noticed your log files (or - god forbid - your browser window) filling up with complaints from PHP's built-in DOM parser DOMDocument that the loadHTML() or loadHTMLFile() method is being fed all manner of invalid tags. In the case of poorly formed (or both technically and awkwardly: not well-formed) markup or inadequately couched snippets of otherwise-valid HTML this might make some sense. But why is it whining about all-American, good-ol-boys like <article>, <nav>, <picture>... wait, there's a pattern here. It seems to have developed a food allergy for HTML5! Make haste, we must get to the bottom of this! (and somebody get an EpiPen...)

I had been detailing my debugging efforts and defending my use of a jquery-for-php-like library some decade or so old (phpquery @ http://code.google.com/p/phpquery/) in the production of this very site (foxpa.ws) which was generating most of these DOMDocument errors in my logs and mentioned that I might set out to write this, a "jump-the-shark article" - or a jump-the-sharticle, if you will. I had always believed that the concept of jumping the shark carried with it an implicit self-referential element. It turns out this is not so. But as I have already uploaded The Fonz and set up the CSS float it is no longer possible to avoid this shame and conceal my folly... well, I was forced to learn something this morning. So you should have to as well...

From https://en.wikipedia.org/wiki/Jumping_the_shark:

The idiom "jumping the shark" was coined in 1985 by Jon Hein in response to a 1977 episode from the fifth season of the American sitcom Happy Days, in which Fonzie (Henry Winkler) jumps over a shark while on water-skis. The phrase is pejorative and is used to argue that a creative outlet or work appears to be making a misguided attempt at generating new attention or publicity for something that is perceived once to have been widely popular, but is no longer.

The problem, it seems, is that PHP employs libxml2 to do the heavy lifting behind the DOMDocument API. Since it always has, perhaps "developed" wasn't the right way to put it - it's more likely that we are only now noticing this issue, where it once went unseen, because the version of PHP we are using has been updated and the error level assigned to this exception has entered the reporting window at hand. But why does it take offense to common HTML5? It's a simple matter of sloppy and outdated naming; libxml2 is an XML parser. HTML5 is not XML. Except it can be. But it's not. Huh?

NOTE: When I started to try explaining that nuance it soon became apparent that the topic was better suited to the confines of its own article. As such, please continue reading A Brief (and Delightfully Snarky) History of HTML5 if the details interest you... and you happen to be blessed with more time than good sense...

HEY GUYS! GUYS! WHAT DO YOU CALL A BUNDLE OF LOGS? :D
(I'm allowed, it's our word :/)

...Right. So, the HTML5 you're trying to feed the XML parser isn't XML and the XML parser is just doing what it's supposed to by barfing in its mouth a little and we only have this problem because of the prevailing trends (see: XHTML, as above) at the time the API was drafted in PHP and the methods were named and that's all fine and dandy mister but what about my damn code?

  1. We can swap out HTML5's descriptive layout tags for plain-jane <div>s, like this enterprising fellow on Stack Overflow...
    $html = file_get_contents($url);
    $search = array(
        "<header", "</header>",
        "<nav", "</nav>",
        "<section", "</section>",
        "<article", "</article>",
        "<footer", "</footer>",
        "<aside", "</aside>",
        "<noindex", "</noindex>",
    );
    $replace = array(
        "<div", "</div>",
        "<div", "</div>",
        "<div", "</div>",
        "<div", "</div>",
        "<div", "</div>",
        "<div", "</div>",
        "<div", "</div>",
    );
    $html = str_replace($search, $replace, $html);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    Notice how the potentiality of attributes has been thoughtfully taken into account. Unfortunately what hasn't been taken into account is about five times as many more defined tags and the whole make your own damn tags and name them whatever the hell you want thing that tickles the free spirits among us absolutely pink to this very day. Perhaps that's why this suggestion is, at present time of writing, ranked negative six (-6).

    Well, that and the fact that if we had been relying on being able to identify elements in the resulting object by those tags' proper names we have just accomplished an excellent unravelling of the entire undertaking. Bravo.

  2. We could suppress the errors with the STFU Operator (@)...
    $dom = new DOMDocument();
    @$dom->loadHTML($html_data);
    Sure! And in the immortal words of my sensei: "I could make you S the F up - permanently - if you ever suggest that again within earshot!" :@

    Repeat after me: The STFU operator is not how real programmers program real programs. PHP is finally awesome, but all the programmers of all the other programming languages are watching you, and me, and frankly: I'm sick of being beat up and laughed at. q.q

    Also, it should be noted:

    From https://php.watch/versions/8.0/fatal-error-suppression:

    In PHP 8.0, the @ operator does not suppress certain types of errors that were silenced prior to PHP 8.0. This includes the following types of errors:

    • E_ERROR
    • E_CORE_ERROR
    • E_COMPILE_ERROR
    • E_USER_ERROR
    • E_RECOVERABLE_ERROR
    • E_PARSE

    All of these errors, if raised, halts [sic] the rest of the application from being run. The difference in PHP 8.0 is that the error message is not silenced, which would have otherwise resulted in a silent error.

  3. ...a-Alright, suppress the errors with a flag? *wince*:
    $dom = new DOMDocument();
    $dom->loadHTML($html_data, LIBXML_NOERROR);
    Now that has an air of slight sophistication. Or at least it doesn't smell like outright crap. I'd accept it if we're on a deadline but this precludes our ability to access the errors in the event we have legitimate use for them in debugging. Particularly say, where we are parsing user-submitted or directed information or interfacing with a source that we do not control. Frankly that sounds like virtually every case in which I've employed this capability...
  4. What if we intercepted the errors with a non-libxml2, PHP-based set_error_handler()... error handler?
    From https://stackoverflow.com/questions/1148928/disable-warnings-when-loading-non-well-formed-html-by-domdocument-php:
    class ErrorTrap {
        protected $callback;
        protected $errors = array();

        function __construct($callback) {
            $this->callback = $callback;
        }

        function call() {
            $result = null;
            set_error_handler(array($this, 'onError'));
            try {
                $result = call_user_func_array($this->callback, func_get_args());
            } catch (Exception $ex) {
                restore_error_handler();
                throw $ex;
            }
            restore_error_handler();
            return $result;
        }

        function onError($errno, $errstr, $errfile, $errline) {
            $this->errors[] = array($errno, $errstr, $errfile, $errline);
        }

        function ok() {
            return count($this->errors) === 0;
        }

        function errors() {
            return $this->errors;
        }
    }

    // create a DOM document and load the HTML data
    $xmlDoc = new DOMDocument();
    $caller = new ErrorTrap(array($xmlDoc, 'loadHTML'));
    // this doesn't dump out any warnings
    $caller->call($fetchResult);
    if (!$caller->ok()) {
        var_dump($caller->errors());
    }
    While I will give full points for isolation, doesn't your app already have a thoughtful error and exception handling subsystem? Yes, this will keep PHP/the HTTP or fCGI daemon logs and our own custom error handling system free of all these extraneous parsing errors and that is what we set out to do... but now anything that isn't a parsing error that crops up is going to be hidden from our error handling. It just doesn't feel ideologically tidy and speaking of tidy, since the next option is available - and so much more elegant - all this fancy (or convoluted, you be the judge) footwork is rather moot....
  5. OK, let's leverage libxml2's built-in error management, as exposed to PHP:
    // create DOM
    $dom = new DOMDocument();
    // modify state
    $libxml_previous_state = libxml_use_internal_errors(true);
    // parse
    $dom->loadHTML($html);
    // fetch the errors
    $errors = libxml_get_errors();
    foreach ($errors as $error) {
        // handle the errors as you wish
    }
    // flush the libxml error queue
    libxml_clear_errors();
    // restore state
    libxml_use_internal_errors($libxml_previous_state);
    Nicely done, in a one-off situation you could get away with ignoring the state of libxml error handling - but this is a classy way of making whatever code widget you're building here maximally portable and respectful of its surroundings so you can feel confident dropping it in wherever down the line.
  6. Sadly, none of these approaches solves the real issue: libxml2 is an XML parser. HTML5, in the common serialization - likely the only one you're even aware of unless you're a documentation addict (or read my bit about the history of HTML5) - just isn't XML.

    And the right answer is always to use the right tool for the job at hand: there are a number of third-party HTML5 parsing libraries out there.

    • HTML5DOMDocument bills itself as a drop-in extension and correction to the DOMDocument API which should make it ideal for those just looking for a quick fix - but only as quick as correct permits. It also sports CSS selector functionality - which is what I was using that rusty old phpquery library for. Two birds with one stone? That's my kinda library!
      $dom = new IvoPetkov\HTML5DOMDocument();
      $dom->loadHTML('<!DOCTYPE html><html><body><h1>Hello</h1><div class="content">This is some text</div></body></html>');
      echo $dom->querySelector('h1')->innerHTML; // Hello
      echo $dom->querySelector('.content')->outerHTML; // <div class="content">This is some text</div>
      Hot dog!
    • Masterminds HTML5-PHP is easily the most referenced PHP-based server-side HTML5 parsing library and with good reason: it boasts over 5 million downloads and a pedigree that extends well beyond the last decade. Surely if you are in the market for a complete, tried, tested and true solution it must not be ignored. GitHub activity shows that despite its age it is still being maintained, while the majority of its codebase has survived the test of time and settled into a rock-hard foundation.
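
If you are stuck with plain DOMDocument for the time being, option 5 can be taken one small step further: keep libxml2's error queue, but throw away only the complaints about tags it doesn't recognise. The sketch below is just that, a sketch. The function name loadHtmlIgnoringUnknownTags() and the constant HTML_UNKNOWN_TAG are my own inventions for illustration; 801 is the value libxml2's xmlerror.h gives XML_HTML_UNKNOWN_TAG, and I am assuming that is the code behind every "Tag ... invalid" line in your logs, so verify against your own before leaning on it.

```php
<?php
// libxml2's error code for "Tag <name> invalid" (XML_HTML_UNKNOWN_TAG).
// The constant name here is hypothetical; PHP does not define one for it.
const HTML_UNKNOWN_TAG = 801;

/**
 * Hypothetical helper: parse markup with DOMDocument and hand back the
 * document plus every libxml error EXCEPT the unknown-tag complaints.
 *
 * @return array{0: DOMDocument, 1: LibXMLError[]}
 */
function loadHtmlIgnoringUnknownTags(string $html): array
{
    $dom = new DOMDocument();
    // Route libxml errors into its internal queue, remembering the old state.
    $previousState = libxml_use_internal_errors(true);

    try {
        $dom->loadHTML($html);
        // Keep whatever is left once the unknown-tag noise is filtered out.
        $errors = array_values(array_filter(
            libxml_get_errors(),
            static fn (LibXMLError $e): bool => $e->code !== HTML_UNKNOWN_TAG
        ));
    } finally {
        // Flush the queue and restore the caller's error handling state.
        libxml_clear_errors();
        libxml_use_internal_errors($previousState);
    }

    return [$dom, $errors];
}
```

Call it as [$dom, $errors] = loadHtmlIgnoringUnknownTags($html); and pass whatever is left in $errors to your usual error handling. Note that it leaves the underlying mismatch exactly where it was: the parser still knows nothing about HTML5, it has merely been asked to keep quiet about one symptom.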

Go Fast: Windows 10 and 11 Unattended Installation Answer File Template

Windows 10 and 11 AutoUnattend.xml Unattended Installation Answer File Template for UEFI and Legacy Boot (GPT/MBR): foxpa.ws Edition

A stitch in time saves nine, as the saying goes. Well, I can't tell how much time this stitch has saved or will save me but it might be the single best temporal ROI I've got out of sitting down and getting all up inside a Windows config file. Simply drop this into the root directory of your Windows 10 or 11 installation medium, plug it in, select which edition of Windows you would like to install (or don't, details below...) and go make a cup of tea while the installation completes in under half the time it usually takes, without any of those annoying questions about junk you never asked for in the first place.

Please read the documentation thoroughly before deploying. It is provided in the form of comments embedded in the file itself so it's always right where you need it and not all the annoying way over here - a working machine, open browser and google away.

You may find these other articles useful:

NOTE: If you choose to read nothing else, Americans be advised: search-and-replace all instances of en-CA with en-US.
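For anyone who would rather eyeball the change than trust a blind search-and-replace, here is a sketch of one locale component after the swap. In this template only SystemLocale and UserLocale carry en-CA; the same pair shows up in the windowsPE, specialize and oobeSystem passes, so all three want the same treatment. This is an illustration of the edit, not a separately tested file:

```xml
<component name="Microsoft-Windows-International-Core" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <InputLocale>en-US</InputLocale>
  <!-- was en-CA in the template -->
  <SystemLocale>en-US</SystemLocale>
  <UILanguage>en-US</UILanguage>
  <UILanguageFallback>en-US</UILanguageFallback>
  <!-- was en-CA in the template -->
  <UserLocale>en-US</UserLocale>
</component>
```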

<?xml version="1.0" encoding="utf-8"?>
<!--

..:: Windows 10/11 Answer File :: AutoUnattend.xml :: foxpa.ws Edition ::..

Companion article: https://foxpa.ws/win-10-11-unattended
Substrate Created using Windows AFG: http://www.windowsafg.com

######################
# Installation Notes #
######################

Drop this file (named AutoUnattend.xml) in the root of the directory tree of the file system containing your Windows 10 or 11 installation medium. It should simply work on your next boot from that medium; be sure to read the following and make any changes BEFORE booting!

Details and changes from Answer File Generator guidance
=======================================================

Pre-boot configuration - any changes you would like to make should or must be in this document as they have immutable consequences or are difficult to change later:

- Corrected XML syntax (namely no tag may appear before the XML declaration, including comments)
- ***AUTOMATICALLY WIPES THE PRIMARY ENUMERATED LOCAL (SATA/PATA) DISK (Disk0)***
- Default GPT (UEFI) BIOS partitioning scheme
- Alternate MBR ("Legacy" BIOS) partitioning scheme included, commented in following block
- ProtectYourPC is set to 3, disabling: "Personalize speech, typing, and inking input by sending contacts and calendar details, along with other associated input data to Microsoft, Let Windows and apps request the user's localization, including location history, and use the advertising ID to personalize experiences on the device, turn on protection from malicious web content and use page prediction to pre-load sites in Windows browsers, which sends the browsing history to Microsoft, automatically connect to suggested open and shared networks and Send problem reports to Microsoft."

Post-boot configuration - you can update these here but you may find it smarter to supply one or a scheme of several of your own answer files to be used later with SysPrep or DISM that will overwrite the defaults set out in this initial Unattended Windows Setup answer file, or set of files based hereon, as you may see fit:

- en-CA locales in all the right places (Yankees heed: switch these to en-US, and everyone else whatever your godless foreign culture demands... ;p)
- Default user account is 'user' with no password, auto-login, never expires
- Default machine name is DESKTOP-WIN10. Personal preference is to use a SERVER- LAPTOP- MOBILE- etc. scheme to make hosts more immediately identifiable; YMMV. Change to blank or an asterisk to invoke default randomized naming behaviour. Otherwise, don't forget to change at least the 'WIN10' half after setup is complete or you may soon notice conflict... (you may find it more expeditious to go random unless you do a lot of clients' machines/resale).
- Control Panel is configured to traditional, Large Icon view.

Some directives still active in the AFG webapp have been disabled, take no effect in Windows 10 or 11, or have been deprecated - in some or all builds thereof. Additionally, a number of new directives have not (at present writing) been made available, nor those which have been modified updated. Changes to the model reflected in this template include:

- Setting OOBE|NetworkLocation has no effect in Windows 10 and above; it has been removed.
- OOBE|SkipUserOOBE and OOBE|SkipMachineOOBE are deprecated and have been removed as their original function is implied in 10+ with the inclusion of an AutoLogon section.

It's a good idea to check with https://docs.microsoft.com/en-us/windows-hardware/customize/desktop/unattend/changed-answer-file-settings-for-previous-windows10-builds or load your answer files against recent System Images from time to time to ensure your answer files stay current with the specification.

# NOTE: If you attempt to use an answer file that is configured to set up a disk with a GPT partition table and scheme while booting a machine in "BIOS" or "Legacy" (as opposed to "EFI" or "UEFI") mode OR VICE VERSA the graphical installer will complain and demand a restart early after loading. Avoid this time consuming error by either:

  a) ensuring the correct disk format will be configured on boot, or...
  b) by entering the UEFI/BIOS settings and ensuring the correct boot mode is selected, or...
  c) that you choose the correct entry for your installation medium on your BIOS' dual-mode capable manual/one-time boot device selection menu - WHERE such a dual-mode capability avails itself; many chips only provide a boot menu for options available in either mode it has been configured to automatically cycle, or a single mode even when "UEFI/Legacy" or "Legacy/UEFI" dual-mode has been enabled for regular operation.

# NOTE: unlike most answer files circulating, by default this one WILL REQUIRE YOUR INPUT ONCE - at the very beginning - to select your edition. While my personal use-case is more on the server-and-workstation end, this configuration better suits the sheer numeric bulk of my installations; being sporadic refurbishments/servicings for friends, clients and resale where there is often a random license already accompanying the machine.

To automatically select the edition and realize a fully automated install simply add a ProductKey directive to the specialize|Microsoft-Windows-Shell-Setup component and populate it with one of the Generic Volume License Keys (GVLK) Microsoft provides: https://docs.microsoft.com/en-us/windows-server/get-started/kms-client-activation-keys

  Windows 10/11 Pro                    W269N-WFGWX-YVC9B-4J6C9-T83GX
  Windows 10/11 Pro for Workstations   NRG8B-VKK3Q-CXVCJ-9G2XF-6Q84J
  Windows 10/11 Pro Education          6TP4R-GNPTD-KYYHQ-7B7DP-J447Y
  Windows 10/11 Education              NW6C2-QMPVW-D7KKK-3GKT6-VCFB2
  Windows 10/11 Enterprise             NPPR9-FWDCX-D2C8J-H872K-2YT43

Food for thought: if I ran a busy shop or sprawling corporate IT department I'd use this in conjunction with a PXE boot environment to deploy a whole "plug-in, install, and go!" subnet! Big switch for the technicians in back and a vlan to reach the upper floors... Mmmmm, disposable Windows... now THAT'S turning it off and on again! :D

-->
<unattend xmlns="urn:schemas-microsoft-com:unattend">
  <settings pass="windowsPE">
    <component name="Microsoft-Windows-International-Core-WinPE" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <SetupUILanguage>
        <UILanguage>en-US</UILanguage>
      </SetupUILanguage>
      <InputLocale>en-US</InputLocale>
      <SystemLocale>en-CA</SystemLocale>
      <UILanguage>en-US</UILanguage>
      <UILanguageFallback>en-US</UILanguageFallback>
      <UserLocale>en-CA</UserLocale>
    </component>
    <component name="Microsoft-Windows-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <DiskConfiguration>
        <Disk wcm:action="add">
          <DiskID>0</DiskID>
          <WillWipeDisk>true</WillWipeDisk>
          <CreatePartitions>
            <!-- Windows RE Tools partition -->
            <CreatePartition wcm:action="add">
              <Order>1</Order>
              <Type>Primary</Type>
              <Size>300</Size>
            </CreatePartition>
            <!-- System partition (ESP) -->
            <CreatePartition wcm:action="add">
              <Order>2</Order>
              <Type>EFI</Type>
              <Size>100</Size>
            </CreatePartition>
            <!-- Microsoft reserved partition (MSR) -->
            <CreatePartition wcm:action="add">
              <Order>3</Order>
              <Type>MSR</Type>
              <Size>128</Size>
            </CreatePartition>
            <!-- Windows partition -->
            <CreatePartition wcm:action="add">
              <Order>4</Order>
              <Type>Primary</Type>
              <Extend>true</Extend>
            </CreatePartition>
          </CreatePartitions>
          <ModifyPartitions>
            <!-- Windows RE Tools partition -->
            <ModifyPartition wcm:action="add">
              <Order>1</Order>
              <PartitionID>1</PartitionID>
              <Label>WINRE</Label>
              <Format>NTFS</Format>
              <TypeID>DE94BBA4-06D1-4D40-A16A-BFD50179D6AC</TypeID>
            </ModifyPartition>
            <!-- System partition (ESP) -->
            <ModifyPartition wcm:action="add">
              <Order>2</Order>
              <PartitionID>2</PartitionID>
              <Label>System</Label>
              <Format>FAT32</Format>
            </ModifyPartition>
            <!-- MSR partition does not need to be modified -->
            <ModifyPartition wcm:action="add">
              <Order>3</Order>
              <PartitionID>3</PartitionID>
            </ModifyPartition>
            <!-- Windows partition -->
            <ModifyPartition wcm:action="add">
              <Order>4</Order>
              <PartitionID>4</PartitionID>
              <Label>WIN10</Label>
              <Letter>C</Letter>
              <Format>NTFS</Format>
            </ModifyPartition>
          </ModifyPartitions>
        </Disk>
      </DiskConfiguration>
      <ImageInstall>
        <OSImage>
          <InstallTo>
            <DiskID>0</DiskID>
            <PartitionID>4</PartitionID>
          </InstallTo>
          <InstallToAvailablePartition>false</InstallToAvailablePartition>
        </OSImage>
      </ImageInstall><!--

##########################
# MBR Disk Configuration #
##########################

Swap this section for the GPT (UEFI) configuration above by cutting and pasting BOTH the opening (above) and closing (below) comment and CDATA tags (and these comments too unless you'll remember from here on out and would rather just delete them. I won't tell if you won't tell.) around that section if you plan to perform this installation on a machine configured to boot in "Legacy" BIOS mode, which requires a traditional (deprecated) MBR-formatted on-disk partition table to boot. You *should* try to use GPT if you can but there are cases where this is buggy or otherwise undesirable and, frankly, you're the boss - if you insist I'll do whatever; I don't want no trouble... p-please don't shoot me mister!

Alternatively, you might find it saves time if:

  a) It's worth a second USB stick or DVD to simply have one installation medium with each disk configuration ready to go, quick-draw! pew pew!
  b) You keep a second version of this answer file in the root of your USB stick with a different file name; each version having a different config

  * - You earned this gold star if you didn't have to read this to know THREE files is better: say one named AutoUnattend.xml and one source file each; AutoUnattendMBR.xml and AutoUnattendGPT.xml. Then you can copy either config over the active file, AutoUnattend.xml in a /single operation/ instead of playing Kansas City shuffle every time you go to rotate the filenames.

A little too minutiae to make this comment block so long? Perhaps. But if you hadn't thought of that how much agony did I just spare you? Well, that's why you rest your truck at my stop, darlin'. I suss out those wee deets so you can keep on haulin' that sweet digital crude...

      <DiskConfiguration>
        <Disk wcm:action="add">
          <CreatePartitions>
            <CreatePartition wcm:action="add">
              <Order>1</Order>
              <Type>Primary</Type>
              <Size>100</Size>
            </CreatePartition>
            <CreatePartition wcm:action="add">
              <Extend>true</Extend>
              <Order>2</Order>
              <Type>Primary</Type>
            </CreatePartition>
          </CreatePartitions>
          <ModifyPartitions>
            <ModifyPartition wcm:action="add">
              <Active>true</Active>
              <Format>NTFS</Format>
              <Label>System Reserved</Label>
              <Order>1</Order>
              <PartitionID>1</PartitionID>
              <TypeID>0x27</TypeID>
            </ModifyPartition>
            <ModifyPartition wcm:action="add">
              <Active>true</Active>
              <Format>NTFS</Format>
              <Label>OS</Label>
              <Letter>C</Letter>
              <Order>2</Order>
              <PartitionID>2</PartitionID>
            </ModifyPartition>
          </ModifyPartitions>
          <DiskID>0</DiskID>
          <WillWipeDisk>true</WillWipeDisk>
        </Disk>
      </DiskConfiguration>
      <ImageInstall>
        <OSImage>
          <InstallTo>
            <DiskID>0</DiskID>
            <PartitionID>2</PartitionID>
          </InstallTo>
          <InstallToAvailablePartition>false</InstallToAvailablePartition>
        </OSImage>
      </ImageInstall>

-->
      <UserData>
        <ProductKey>
          <!-- Do not uncomment the Key element if you are using trial ISOs -->
          <!-- You must uncomment the Key element (and optionally insert your own key) if you are using retail or volume license ISOs -->
          <Key></Key>
          <WillShowUI>Never</WillShowUI>
        </ProductKey>
        <AcceptEula>true</AcceptEula>
        <FullName>user</FullName>
        <Organization></Organization>
      </UserData>
    </component>
  </settings>
  <settings pass="offlineServicing">
    <!-- UAC -->
    <component name="Microsoft-Windows-LUA-Settings" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <EnableLUA>true</EnableLUA>
    </component>
  </settings>
  <settings pass="generalize">
    <component name="Microsoft-Windows-Security-SPP" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <SkipRearm>1</SkipRearm>
    </component>
  </settings>
  <settings pass="specialize">
    <component name="Microsoft-Windows-International-Core" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <InputLocale>en-US</InputLocale>
      <SystemLocale>en-CA</SystemLocale>
      <UILanguage>en-US</UILanguage>
      <UILanguageFallback>en-US</UILanguageFallback>
      <UserLocale>en-CA</UserLocale>
    </component>
    <component name="Microsoft-Windows-Security-SPP-UX" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <SkipAutoActivation>true</SkipAutoActivation>
    </component>
    <component name="Microsoft-Windows-SQMApi" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <CEIPEnabled>0</CEIPEnabled>
    </component>
    <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <ComputerName>DESKTOP-WIN10</ComputerName>
    </component>
  </settings>
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-International-Core" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <InputLocale>en-US</InputLocale>
      <SystemLocale>en-CA</SystemLocale>
      <UILanguage>en-US</UILanguage>
      <UILanguageFallback>en-US</UILanguageFallback>
      <UserLocale>en-CA</UserLocale>
    </component>
    <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <AutoLogon>
        <Password>
          <Value></Value>
          <PlainText>true</PlainText>
        </Password>
        <Enabled>true</Enabled>
        <Username>user</Username>
      </AutoLogon>
      <OOBE>
        <HideEULAPage>true</HideEULAPage>
        <HideOEMRegistrationScreen>true</HideOEMRegistrationScreen>
        <HideOnlineAccountScreens>true</HideOnlineAccountScreens>
        <HideWirelessSetupInOOBE>true</HideWirelessSetupInOOBE>
        <ProtectYourPC>3</ProtectYourPC>
      </OOBE>
      <UserAccounts>
        <LocalAccounts>
          <LocalAccount wcm:action="add">
            <Password>
              <Value></Value>
              <PlainText>true</PlainText>
            </Password>
            <Description></Description>
            <DisplayName>user</DisplayName>
            <Group>Administrators</Group>
            <Name>user</Name>
          </LocalAccount>
        </LocalAccounts>
      </UserAccounts>
      <RegisteredOrganization></RegisteredOrganization>
      <RegisteredOwner>user</RegisteredOwner>
      <DisableAutoDaylightTimeSet>false</DisableAutoDaylightTimeSet>
      <FirstLogonCommands>
        <SynchronousCommand wcm:action="add">
          <Description>Control Panel View</Description>
          <Order>1</Order>
          <CommandLine>reg add "HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\ControlPanel" /v StartupPage /t REG_DWORD /d 1 /f</CommandLine>
          <RequiresUserInput>true</RequiresUserInput>
        </SynchronousCommand>
        <SynchronousCommand wcm:action="add">
          <Order>2</Order>
          <Description>Control Panel Icon Size</Description>
          <RequiresUserInput>false</RequiresUserInput>
          <CommandLine>reg add "HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\ControlPanel" /v AllItemsIconView /t REG_DWORD /d 0 /f</CommandLine>
        </SynchronousCommand>
        <SynchronousCommand wcm:action="add">
          <Order>3</Order>
          <RequiresUserInput>false</RequiresUserInput>
          <CommandLine>cmd /C wmic useraccount where name="user" set PasswordExpires=false</CommandLine>
          <Description>Password Never Expires</Description>
        </SynchronousCommand>
      </FirstLogonCommands>
      <TimeZone>Eastern Standard Time</TimeZone>
    </component>
  </settings>
</unattend>
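The comment block in the template mentions two optional tweaks to the specialize|Microsoft-Windows-Shell-Setup component: randomized machine naming and a ProductKey directive. Here is a hedged sketch of how that component might look with both applied, using the Windows 10/11 Pro GVLK from the list in the comments (swap in the key matching your edition). Treat it as an illustration of the idea rather than a tested configuration:

```xml
<component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- An asterisk asks Setup to generate a random computer name -->
  <ComputerName>*</ComputerName>
  <!-- Generic Volume License Key for Windows 10/11 Pro, taken from the list in the template comments -->
  <ProductKey>W269N-WFGWX-YVC9B-4J6C9-T83GX</ProductKey>
</component>
```

If Setup still stops to ask for an edition with this in place, the other spot worth trying is the empty Key element under windowsPE|Microsoft-Windows-Setup|UserData|ProductKey. Microsoft's unattend reference describes that key as the one Setup consults when choosing which image to install, though that path has not been verified against this template.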