A registry hive is very similar to a file system. In fact, there isn’t much difference between a file system and a registry hive except that the registry doesn’t follow usual file system naming rules.
Like a file system, a registry hive can contain deleted data, which is often recovered and used in digital forensics, incident response, and similar activities. But tools that recover such deleted data don’t all produce the same results. Here is why.
Each registry hive consists of a base block (a file header) followed by one or more hive bins. A hive bin is an allocation unit that stores one or more cells. Each cell stores one structure: a key node (describing a registry key), a key value (describing a registry value), value data (when it’s too large to be stored inside the key value itself), or something else (security descriptors, subkeys lists, key values lists, and more). These structures are linked together: one structure references another by the offset of its cell. For more information, see the Windows registry file format specification.
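The layout described above can be walked with a few lines of code. The following is a minimal sketch, not a complete parser. The name `iter_cells` is hypothetical, and the field positions it relies on are assumptions taken from the format specification rather than from this article: a 4096-byte base block with the hive bins data size at offset 40, a 32-byte hive bin header with the bin size at offset 8, and a signed 4-byte cell size that is negative for allocated cells.

```python
import struct

BASE_BLOCK_SIZE = 4096   # assumption: base block size per the format specification
HBIN_HEADER_SIZE = 32    # assumption: hive bin header size per the specification


def iter_cells(hive):
    """Yield (file_offset, size, is_allocated) for every cell in a hive.

    `hive` is the raw content of a registry file (bytes).
    """
    if hive[:4] != b"regf":
        raise ValueError("not a registry hive")
    # Total size of active hive bins, as recorded in the base block.
    (bins_size,) = struct.unpack_from("<I", hive, 40)
    end = min(BASE_BLOCK_SIZE + bins_size, len(hive))

    bin_pos = BASE_BLOCK_SIZE
    while bin_pos + HBIN_HEADER_SIZE <= end:
        if hive[bin_pos:bin_pos + 4] != b"hbin":
            break  # not a hive bin, stop walking
        (bin_size,) = struct.unpack_from("<I", hive, bin_pos + 8)
        if bin_size < HBIN_HEADER_SIZE or bin_size % 4096:
            break  # damaged hive bin header
        bin_end = min(bin_pos + bin_size, end)

        pos = bin_pos + HBIN_HEADER_SIZE
        while pos + 4 <= bin_end:
            (raw_size,) = struct.unpack_from("<i", hive, pos)
            size = abs(raw_size)
            if size < 8 or pos + size > bin_end:
                break  # damaged cell, give up on this hive bin
            # A negative size marks an allocated cell, a positive one a free cell.
            yield pos, size, raw_size < 0
            pos += size
        bin_pos += bin_size
```

Cells for which the third element is false are the unallocated ones that recovery tools examine first.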
1. When a key or a value is deleted, cells responsible for holding this key or this value are marked as unallocated (free). The offset of an item being deleted is also removed from a subkeys list of a parent key (when a key is being deleted) or from a key values list (when a value is being deleted). Since such lists can’t contain gaps, all current offsets after a deleted one are moved backward (thus, a removed offset is overwritten). Adjacent unallocated cells are coalesced (merged into a single unallocated cell) in order to simplify future allocations of cells.
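The merging of adjacent unallocated cells can be modeled on a simplified list of cells. This is a toy sketch only: the tuple layout `(offset, size, allocated)` and the name `coalesce` are made up for illustration, and the code doesn’t reproduce how Windows actually tracks free cells.

```python
def coalesce(cells):
    """Merge runs of adjacent unallocated cells.

    `cells` is a list of (offset, size, allocated) tuples sorted by offset.
    """
    merged = []
    for offset, size, allocated in cells:
        if merged and not allocated:
            prev_offset, prev_size, prev_allocated = merged[-1]
            # Merge only if the previous cell is free and directly precedes this one.
            if not prev_allocated and prev_offset + prev_size == offset:
                merged[-1] = (prev_offset, prev_size + size, False)
                continue
        merged.append((offset, size, allocated))
    return merged
```

After merging, the bytes of the second deleted structure are still present, but they now sit in the middle of a larger unallocated cell instead of at the start of their own cell.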
Tools that recover deleted registry data scan registry files for unallocated cells and then attempt to interpret their data as a key node or a key value (this is easy, because a key node starts with the “nk” signature and a key value starts with the “vk” signature; see the format specification mentioned above for more details). If successful, a recovered structure is reported to a user.
The first problem is that some tools scan only the beginning of each unallocated cell while searching for deleted structures. Such tools ignore deleted structures that were merged into another unallocated cell (in this situation, a single unallocated cell contains two or more deleted structures, but the tool attempts to recover only the first one).
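One way to avoid this problem is to test every possible cell boundary inside an unallocated cell, not just the first one. The sketch below assumes (from the format specification) that cells are aligned to 8 bytes and that a structure signature directly follows the 4-byte cell size. The name `find_candidates` is hypothetical, and a real tool would still have to validate each hit before reporting it.

```python
SIGNATURES = (b"nk", b"vk")  # key node and key value signatures


def find_candidates(cell):
    """Return (position, signature) pairs for possible deleted structures.

    `cell` holds the raw bytes of one unallocated cell, including its
    4-byte size field. Every 8-byte boundary is treated as the possible
    start of a former cell that was merged into this one.
    """
    hits = []
    for pos in range(0, len(cell) - 5, 8):
        signature = bytes(cell[pos + 4:pos + 6])
        if signature in SIGNATURES:
            hits.append((pos, signature.decode("ascii")))
    return hits
```

A tool that inspects only position 0 would return at most the first element of this list.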
To check if your tool operates properly, download the NIST Hacking Case, extract the “\WINDOWS\system32\config\system” file from the image, then load this file into your tool. If a tool being tested recovers two deleted “DhcpSubnetMaskOpt” values, then it deals with unallocated cells properly.
2. Sometimes hive bins become unused and are discarded, and the total size of active (used) hive bins, which is recorded in the base block, is reduced. In this case, the file size may or may not be reduced to account for this kind of deallocation (if it isn’t reduced, new allocations within the current file size won’t increase the fragmentation of the file).
Thus, there could be deleted data between the end of the last hive bin and the end of the file! And the second problem is that some tools ignore such remnant data while searching for deleted structures.
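Locating this remnant data takes one comparison. The following sketch assumes, based on the format specification, a 4096-byte base block that stores the hive bins data size as a 4-byte little-endian integer at offset 40; `remnant_range` is a hypothetical name.

```python
import struct

BASE_BLOCK_SIZE = 4096  # assumption: base block size per the format specification


def remnant_range(hive):
    """Return (start, end) file offsets of data located after the last
    active hive bin, or None if the file ends where the hive bins end."""
    if hive[:4] != b"regf":
        raise ValueError("not a registry hive")
    # Total size of active hive bins, as recorded in the base block.
    (bins_size,) = struct.unpack_from("<I", hive, 40)
    start = BASE_BLOCK_SIZE + bins_size
    if start >= len(hive):
        return None
    return start, len(hive)
```

The returned range can then be searched for discarded hive bins and for key node and key value signatures, like any other unallocated area.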
To check if your tool operates properly, go to the M57-Patents Scenario, download the “terry-2009-12-11-001.E01” image, extract the “\Windows\System32\config\SYSTEM” file from the image, then load this file into your tool. If a tool being tested recovers two deleted “AcceptOfficeAndTahoeServers” values, then it deals with remnant data properly (if there is one value only and the same tool passed the previous test, then the tool either didn’t process remnant data or there is an issue with data reported to a user).
3. An unallocated cell may be used to serve an allocation request. In this situation, some parts of a newly allocated cell may not be overwritten with new data. In most cases, such remnant data is rather small (less than 8 bytes), so it won’t contain deleted key nodes and key values. But there are special allocation scenarios:
- growing a subkeys list;
- growing an index root;
- growing a key values list.
A subkeys list is an array of offsets pointing to child key nodes (sometimes these offsets are stored along with metadata, see the format specification mentioned above). An index root is a list of subkeys lists (which is used to subdivide a subkeys list when it becomes too long). A key values list is an array of offsets pointing to key values (similar to a subkeys list but without metadata).
When a subkey is added to a parent key or when a value is assigned to a key, a new item (an offset to a key node or a key value) has to be added to a subkeys list or a key values list respectively. If there is no space left in a cell containing that list, the list should be moved (reallocated) to a larger cell. In order to reduce the number of reallocation operations when a program slowly grows the number of subkeys or values of a given key, a new list is often allocated with some space reserved for future items (in other words, an existing list can be moved to a significantly larger cell).
Thus, a cell holding a subkeys list, an index root, or a key values list may have slack space between the last item on the list and the end of the cell, and this slack space may contain deleted key nodes and key values.
The third problem is that some tools don’t process the slack space of allocated cells while searching for deleted structures.
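Extracting the slack space of a list is mostly a matter of knowing where the list ends. The sketch below is hedged accordingly: the element sizes and header layout are assumptions taken from the format specification (a 2-byte signature and a 2-byte item count for subkeys lists and index roots, no header at all for key values lists, whose item count is stored in the key node), and both function names are hypothetical.

```python
import struct

# Assumed item sizes in bytes: "li" and "ri" hold bare offsets,
# "lf" and "lh" hold an offset plus 4 bytes of metadata.
ITEM_SIZE = {b"li": 4, b"ri": 4, b"lf": 8, b"lh": 8}


def list_slack(cell):
    """Return the slack bytes of a cell holding a subkeys list or an index root.

    `cell` holds the raw bytes of the cell, including its 4-byte size field.
    """
    signature = bytes(cell[4:6])
    if signature not in ITEM_SIZE:
        raise ValueError("not a subkeys list or an index root")
    (count,) = struct.unpack_from("<H", cell, 6)
    used = 8 + count * ITEM_SIZE[signature]
    return bytes(cell[used:])


def values_list_slack(cell, value_count):
    """Return the slack bytes of a cell holding a key values list.

    `value_count` has to come from the key node that owns the list.
    """
    used = 4 + 4 * value_count
    return bytes(cell[used:])
```

The returned bytes can be scanned for the “nk” and “vk” signatures in the same way as the data of an unallocated cell.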
To check if your tool operates properly, go back to the NIST Hacking Case, extract the “\WINDOWS\system32\config\software” file from the image, then load this file into your tool. If a tool being tested recovers two deleted keys and four deleted values (all of them reside in the slack space of allocated cells), it did its job well.
4. Finally, there is another location of deleted registry data: allocated but unreferenced cells.
Some releases of Windows 10 had a bug: when a key was renamed, a cell containing the corresponding key node wasn’t marked as unallocated. This key node, however, was unlinked from the subkeys list of the parent key node (thus, it can’t be reached by parsing a registry tree).
So, the fourth problem is that some tools don’t detect registry structures stored in allocated but unreferenced cells.
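Detecting such cells comes down to comparing two sets: the cells marked as allocated and the cells actually reachable from the root key. The following is a simplified model rather than a hive parser. The `links` mapping (cell offset to the offsets it references) and both function names are hypothetical, and building that mapping from real key nodes, lists, and values is left out.

```python
def reachable(root, links):
    """Return the set of cell offsets reachable from `root`.

    `links` maps a cell offset to the offsets of the cells it references.
    """
    seen = set()
    stack = [root]
    while stack:
        offset = stack.pop()
        if offset in seen:
            continue
        seen.add(offset)
        stack.extend(links.get(offset, ()))
    return seen


def unreferenced(allocated, root, links):
    """Return allocated cell offsets that can't be reached from the root."""
    return sorted(set(allocated) - reachable(root, links))
```

Every offset returned by `unreferenced` points to a cell worth checking for a key node or a key value.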
5. Every key node has the “Parent” field, which contains an offset of a parent key node. This offset is used to link recovered deleted key nodes back to their parents (if a parent key node doesn’t exist anymore, a recovered key node should be reported as unassociated, with no valid parent).
However, key values have no similar field. Therefore, it’s almost impossible to link recovered deleted values back to their keys, but there are two exceptions:
- a deleted key node may point to an intact deleted key values list, which can be used to read associated deleted key values;
- when deleting a key value which is the last one on the corresponding key values list, there are no offsets to be moved backward, so an offset of that key value isn’t overwritten and can be used to link that key value back to its key.
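The second exception is easier to see on a toy model of a key values list. In this sketch, `slots` stands for the offsets physically stored in the cell and `count` for the number of live items. Both names and the function `delete_offset` are made up for illustration, and the model ignores the possibility that the list is moved to another cell.

```python
def delete_offset(slots, count, index):
    """Remove item `index` from a list that has `count` live items.

    Offsets after the removed one are moved backward by one slot.
    The slot that falls out of the live range keeps its old content.
    Returns the new slots and the new live item count.
    """
    slots = list(slots)
    slots[index:count - 1] = slots[index + 1:count]
    return slots, count - 1


values = [0x100, 0x200, 0x300]

# Deleting the first value: its offset (0x100) is overwritten by the shift.
after_first, _ = delete_offset(values, 3, 0)

# Deleting the last value: nothing is shifted, so its offset (0x300)
# survives right after the last live item.
after_last, _ = delete_offset(values, 3, 2)
```

In the second case, the stale slot still points to the deleted key value, which is what allows that value to be linked back to its key.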
The fifth problem is that some tools don’t show unassociated deleted values. Thus, most recovered deleted values (which are unassociated) aren’t visible to a user.
To check if your tool operates properly, refer to the first test. If your tool passed it, then it did its job well here too.
The sixth problem is that some tools don’t associate deleted values with active (not deleted) keys where possible.
To check if your tool operates properly, go to the NIST Hacking Case, extract the “\WINDOWS\system32\config\system” file from the image, then load this file into your tool. There should be a deleted value (“DhcpNameServer”) associated with the “ControlSet001\Services\Tcpip\Parameters” key (which isn’t deleted).