The uppercased hell

Recently, Microsoft warned users about compatibility issues with applications using some non-ASCII characters in names of their registry keys. According to Microsoft:

Compatibility issues have been found between apps using some non-ASCII characters in their registry keys or subkeys and Windows 11. Affected apps might be unable to open and might cause other issues or errors in Windows, including the possibility of receiving an error with a blue screen. Important: Affected registry keys with non-ASCII characters might not be able to be repaired.

Sounds interesting!

Before we start, here are some useful links:

  1. Windows registry file format specification
  2. Measured Boot and Malware Signatures: exploring two vulnerabilities found in the Windows loader
  3. Playing with case-insensitive file names (a registry hive is similar to a file system)

Now, let’s begin!

During boot, the NT kernel requires some registry hives to be preloaded. In particular, the SYSTEM hive (HKEY_LOCAL_MACHINE\SYSTEM) and the ELAM hive must be loaded into memory before the NT kernel is executed. This is done by the Windows loader (winload.efi or winload.exe).

However, there are two constraints:

  • In order to check a registry hive file for possible format violations, one needs to load the Unicode uppercase table first: the order of subkeys in the corresponding lists must be checked, and that order is case-insensitive, based on the uppercase versions of key names (the lexicographical order).
  • In order to load the Unicode uppercase table, one needs to load the SYSTEM hive first: the Unicode uppercase table is stored as a file and its location is specified in the registry (for legacy reasons).

So, you can’t load the SYSTEM hive before the Unicode uppercase table and you can’t load the Unicode uppercase table before the SYSTEM hive.

To solve this problem, Microsoft relaxed the checks performed by the Windows loader against registry hives being loaded, excluding the lexicographical order check. Later, this check is performed by the NT kernel, covering preloaded registry hives.

This was fine until I discovered a vulnerability, CVE-2021-27094. It allows the Windows loader to see some expected registry values and measure them into the TPM (resulting in expected PCR values), while the NT kernel never sees those registry values at all (because the corresponding registry keys are deleted by the lexicographical order check performed by the NT kernel). (See the second link in the list above for more details.)

Microsoft fixed that vulnerability by adding the lexicographical order check to the Windows loader. However, this check works properly with ASCII names only (because the Unicode uppercase table isn’t available at that moment). For registry keys containing non-ASCII characters in their names, data loss can occur. This was reported to Microsoft and confirmed by them.

The data loss problem can be described with the following example:

  • For the “z” character, the uppercase version is “Z”. This character belongs to ASCII and there is no problem with the Windows loader and with the NT kernel.
  • For the “я” character, the correct uppercase version is “Я”. This character doesn’t belong to ASCII and there is no problem with the NT kernel (because the Unicode uppercase table is available). But the Windows loader isn’t aware of the proper uppercase version and treats the “я” character as its own uppercase version (instead of “Я”): for a non-ASCII character, the character code is left unchanged.
  • Let’s assume a subkeys list containing references to registry keys with these names, in that order: z1, Z2, z3. According to the Windows loader and the NT kernel, this list is sorted correctly (remember, each character is converted to its uppercase version, so Z1 < Z2 < Z3).
  • Let’s assume a subkeys list containing references to registry keys with these names, in that order: я1, Я2. According to the NT kernel, this list is sorted correctly (Я1 < Я2). But according to the Windows loader, the same list is sorted incorrectly (because Я2 < я1). This will force the Windows loader to delete an “offending” registry key (Я2) to keep things sane. Thus, data loss can happen in preloaded hives (and the problem doesn’t affect hives being loaded by the NT kernel).
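The example above can be sketched in a few lines of Python. This is only a model, not the actual loader or kernel code: the function names are mine, and Python's str.upper() stands in for the Unicode uppercase table, which may differ from the table Windows uses.

```python
def loader_upper(ch: str) -> str:
    """Model of the Windows loader: only ASCII letters are uppercased."""
    return ch.upper() if "a" <= ch <= "z" else ch


def kernel_upper(ch: str) -> str:
    """Model of the NT kernel: str.upper() stands in for the uppercase table."""
    up = ch.upper()
    # Keep the character as-is if its uppercase form is not a single character.
    return up if len(up) == 1 else ch


def is_sorted(names, upper) -> bool:
    """Check that names are in strictly ascending order after uppercasing."""
    keys = [[ord(upper(ch)) for ch in name] for name in names]
    return all(a < b for a, b in zip(keys, keys[1:]))


ascii_names = ["z1", "Z2", "z3"]
cyrillic_names = ["я1", "Я2"]

print(is_sorted(ascii_names, loader_upper), is_sorted(ascii_names, kernel_upper))
print(is_sorted(cyrillic_names, kernel_upper))  # the kernel model accepts the list
print(is_sorted(cyrillic_names, loader_upper))  # the loader model rejects it
```

With the ASCII-only conversion, “я” keeps its code (0x044F), which is greater than the code of “Я” (0x042F), so the second list looks unsorted to the loader model.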

Microsoft tried to fix the problem in Windows 11… And failed!

This new problem isn’t the same as described above. Moreover, it has serious security implications. And data loss is more likely!

Let’s consider the following layout:

Here, there is a key, its path is HKEY_LOCAL_MACHINE\SYSTEM\test_eop\restricted_area\delete_me.

When running with administrator privileges, we can list this key and delete it (but we won’t do the latter).

When running with regular user privileges, we can list that key, but we can’t delete it, its parent or even the HKEY_LOCAL_MACHINE\SYSTEM\test_eop key (the reg command prints the Access denied error).

However, we can create a new subkey under the HKEY_LOCAL_MACHINE\SYSTEM\test_eop key (this new subkey is “new_key”). This is a specific access-control list (ACL) setup I created to demonstrate the problem.

So, we can’t delete an existing subkey, but we can create new ones.

Now, let’s create a new subkey with its name set to a lowercase non-ASCII character (“я”).

We can successfully create this key, open it and list its values (although there are none).

Now, let’s export the SYSTEM hive (using the NtSaveKeyEx function with the REG_NO_COMPRESSION argument) and open it in a hex editor.

A key node describing the “test_eop” key: red — the number of subkeys (3), green — the relative offset of a list of subkeys (0x010785B0)
A list of subkeys (“hash leaf”): green — the number of subkeys, red — the subkey name hash (for “я”, it’s 0x042F)

Everything looks valid — there are 3 subkeys (“new_key”, “restricted_area”, “я”) and the last name hash (for “я”) is valid.
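If you prefer a script to a hex editor, a hash leaf can be decoded with a few lines of Python. This is a minimal sketch based on my reading of the format specification linked above: I assume the cell data starts with the “lh” signature, followed by a 16-bit element count and then pairs of 32-bit values (key node offset, name hash), all little-endian. The function name and the sample offset are made up for illustration.

```python
import struct


def parse_hash_leaf(cell_data: bytes):
    """Decode a hash leaf ("lh") into a list of (key_node_offset, name_hash).

    cell_data is assumed to start right after the 4-byte cell size field.
    """
    signature, count = struct.unpack_from("<2sH", cell_data, 0)
    if signature != b"lh":
        raise ValueError("not a hash leaf: %r" % signature)
    entries = []
    for i in range(count):
        # Each element is 8 bytes: key node offset, then name hash.
        offset, name_hash = struct.unpack_from("<II", cell_data, 4 + 8 * i)
        entries.append((offset, name_hash))
    return entries


# A hand-built leaf with one element; the offset 0x1000 is hypothetical.
sample = struct.pack("<2sH", b"lh", 1) + struct.pack("<II", 0x1000, 0x042F)
for offset, name_hash in parse_hash_leaf(sample):
    print(hex(offset), hex(name_hash))
```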

Now, let’s reboot the computer.

The following can be observed after the boot:

The HKEY_LOCAL_MACHINE\SYSTEM\test_eop\new_key key can be listed (when querying its parent key) and opened (it’s okay and expected).

And the HKEY_LOCAL_MACHINE\SYSTEM\test_eop\я key can be listed, but can’t be opened (the reg query command prints the Cannot find error).

Let’s export the SYSTEM hive (using the same approach) and open it in a hex editor.

The key node (describing the “test_eop” key) is exactly the same as shown above (so, it won’t be shown below). But the list of subkeys is slightly different!

A list of subkeys (“hash leaf”): red — the subkey name hash (for “я”, should be 0x042F, but now it’s 0x044F)

The hash value for the name “я” is different! Now, it doesn’t correspond to the uppercase version of “я” (which is “Я”, the character code is 0x042F), it corresponds to the “я” character (its code is 0x044F). For key names consisting of one character, the hash is the same as the character code, the algorithm is described here.
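To make the two values concrete, here is a small sketch of the name hash as I understand it from the format specification: start from zero and, for each character of the uppercased name, multiply the hash by 37 and add the character code (truncated to 32 bits). The function names are mine, Python's str.upper() is only a stand-in for the Unicode uppercase table, and characters outside the BMP are not handled.

```python
def full_upper(ch: str) -> str:
    """Uppercase with Python's tables (a stand-in for the Unicode uppercase table)."""
    up = ch.upper()
    return up if len(up) == 1 else ch


def ascii_upper(ch: str) -> str:
    """Uppercase ASCII letters only, leaving every other character unchanged."""
    return ch.upper() if "a" <= ch <= "z" else ch


def name_hash(name: str, upper) -> int:
    """Name hash of a key: h = h * 37 + code of the uppercased character."""
    h = 0
    for ch in name:
        h = (h * 37 + ord(upper(ch))) & 0xFFFFFFFF
    return h


print(hex(name_hash("я", full_upper)))   # hash based on "Я"
print(hex(name_hash("я", ascii_upper)))  # hash based on "я" itself
```

For a one-character name, the loop runs once, so the hash is just the code of the (possibly unconverted) character. That is why the two conversions give 0x042F and 0x044F.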

What happened? The Windows loader validated the subkeys list, found one entry with a “wrong” hash value, and corrected it (by writing the calculated hash value instead of the stored one). Thus, we see a different hash value, which is based on the lowercase character, while the proper hash would be based on the uppercase character (but proper uppercase conversions are impossible in the Windows loader, because there is no Unicode uppercase table loaded).

Similar behavior has been observed in Windows 10 too. But there, lists of subkeys are validated twice: first in the Windows loader and then in the NT kernel. So, this “error” is first “corrected” by the Windows loader and then the correct hash value is written back by the NT kernel (the stored hash is first changed to the incorrect one, then back to the correct one, which equals the value originally stored in the list). For some reason, the NT kernel in Windows 11 doesn’t perform that second pass, so the wrong hash value persists.

Since subkeys are enumerated by simply iterating over a list of subkeys, a subkey with a wrong hash value can be listed. But since subkeys are opened using their hash values (these name hashes are used to make things faster while searching for a given name in a list of subkeys), the same key can’t be opened — the subkey isn’t found by its name hash!
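The difference between listing and opening can be modeled with a toy subkeys list. This is not how the kernel is implemented; the class and method names are hypothetical, and the hash function follows the same assumptions as the format specification (multiply by 37, add the uppercased character code), with Python's str.upper() standing in for the uppercase table.

```python
def name_hash(name: str) -> int:
    """Name hash over the uppercased name (str.upper() as a stand-in table)."""
    h = 0
    for ch in name:
        up = ch.upper()
        code = ord(up) if len(up) == 1 else ord(ch)
        h = (h * 37 + code) & 0xFFFFFFFF
    return h


class SubkeyList:
    """Toy model of a hash leaf: a list of (stored_hash, key_name) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, name, stored_hash=None):
        # stored_hash lets us simulate a hash "corrected" by the loader.
        h = name_hash(name) if stored_hash is None else stored_hash
        self.entries.append((h, name))

    def list_names(self):
        # Enumeration just walks the list and ignores the hashes.
        return [name for _, name in self.entries]

    def open(self, name):
        # Lookup compares hashes first, so a wrong stored hash hides the key.
        wanted = name_hash(name)
        for stored_hash, key_name in self.entries:
            if stored_hash == wanted and key_name.upper() == name.upper():
                return key_name
        return None


subkeys = SubkeyList()
subkeys.add("new_key")
subkeys.add("restricted_area")
subkeys.add("я", stored_hash=0x044F)  # the value left behind by the loader

print(subkeys.list_names())  # "я" is listed
print(subkeys.open("я"))     # but the lookup by hash finds nothing
```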

So, such an existing subkey can be listed, but not opened.

But what happens if we try to create the HKEY_LOCAL_MACHINE\SYSTEM\test_eop\я key again?

This key can be created successfully.

But, actually, nothing has changed. There are only 3 subkeys as before.

Now, let’s export the SYSTEM hive again. If we try to do this using the NtSaveKeyEx function (as we did before), the operating system will crash.

Instead, let’s leave the system unattended for more than 10 minutes (here is why), then use a tool to copy a corresponding registry file (C:\WINDOWS\system32\config\SYSTEM) directly from the volume (I will use FTK Imager).

The list of subkeys is the same as shown previously (containing the wrong hash value; it won’t be shown below).

But the key node is different:

A key node describing the “test_eop” key: red — the number of subkeys (4)

Now, the number of subkeys is 4! This is wrong, because the real number of subkeys is still 3 (according to the list of subkeys).

Now, let’s reboot… And…

A key node describing the “test_eop” key: red — the number of subkeys (0)

There are no subkeys at all! We managed to delete them without acquiring administrator privileges, despite the ACL.

This happened because an obvious error in the number of subkeys has been corrected by simply deleting every subkey.

Update (2021-10-17): the problem described in this post seems to be fixed (or mitigated) in build 10.0.22000.282. As shown in the screenshots, the tests were conducted using build 10.0.22000.258.

Final thoughts

As you can see, uppercase tables are important to security. What do you think about the UEFI specification?

It “froze” the existing FAT specification:

EFI encompasses the use of FAT32 for a system partition, and FAT12 or FAT16 for removable media. The FAT32 system partition is identified by an OSType value other than that used to identify previous versions of FAT. This unique partition type distinguishes an EFI defined file system from a normal FAT file system. The file system supported by EFI includes support for long file names.

The definition of the EFI file system will be maintained by specification and will not evolve over time to deal with errata or variant interpretations in OS file system drivers or file system utilities. Future enhancements and compatibility enhancements to FAT will not be automatically included in EFI file systems. The EFI file system is a target that is fixed by the EFI specification, and other specifications explicitly referenced by the EFI specification.

UEFI Specification, version 2.9, section 13.3

And enabled long file names:

FAT defines that all files in a directory must have a unique name, and unique is defined as a case insensitive match.


Note: Although the FAT32 specification allows file names to be encoded using UTF-16, this specification only recognizes the UCS-2 subset for the purposes of sorting or collation.

UEFI Specification, version 2.9, section

But nothing is said about the UCS-2 uppercase table to be used!
