Exploring intermediate states of a registry hive using transaction log files

If you don’t know why transaction log files are important when dealing with registry hives from installations of Windows 8.1 & 10, please read this and this.

In this post, I will talk about an easy way to programmatically explore intermediate states of a registry hive using its transaction log files.

What is an intermediate state of a registry hive?

It’s a state of a registry hive after a log entry has been applied to it but before the recovery is finished.

Since the Windows kernel delays writes to the primary files of registry hives for up to an hour (not counting hibernation and sleep periods, so the real delay may be longer if a computer isn’t actively used), a registry flush appends dirty (modified) data to a transaction log file, while the primary file remains unmodified (see the links above for more information). By applying log entries from transaction log files one by one, we can explore every state of a hive recorded by recent flush operations.
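To make the idea concrete, here is a toy model of replaying log entries one by one. It is only a sketch: the dictionary layout and the names replay_states, "sequence" and "dirty" are made up for illustration and do not reflect the real transaction log format or the yarp API.

```python
def replay_states(primary_image, log_entries):
    """Yield (sequence, snapshot) after each log entry is applied."""
    image = bytearray(primary_image)
    # Entries are applied in the order of their sequence numbers.
    for entry in sorted(log_entries, key=lambda e: e["sequence"]):
        # Each entry overwrites the regions that were dirty at flush time.
        for offset, data in entry["dirty"]:
            image[offset:offset + len(data)] = data
        # The image at this point is one "intermediate state".
        yield entry["sequence"], bytes(image)


# Hypothetical 8-byte "hive" and two log entries (deliberately out of order).
primary_image = bytes(8)
log_entries = [
    {"sequence": 2, "dirty": [(4, b"\xbb\xbb")]},
    {"sequence": 1, "dirty": [(0, b"\xaa")]},
]

states = list(replay_states(primary_image, log_entries))
for sequence, snapshot in states:
    print(sequence, snapshot.hex())
```

Each snapshot corresponds to the hive as it looked right after one flush, which is exactly what we want to inspect.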

For example, we can collect more data for a timeline when multiple timestamps of a registry key are recorded in log entries (and this isn’t something unusual).

Take a look at the following timeline built by the yarp-timeline tool (the registry hive is from the 2018 Lone Wolf Scenario):

$ yarp-timeline ./Users/jcloudy/NTUSER.DAT | grep -Fa 'Software\Microsoft\Windows\CurrentVersion\Explorer\UserAssist\{CEBFF5CD-ACE2-4F4F-9178-9926F41749EA}\Count'
./Users/jcloudy/NTUSER.DAT	Software\Microsoft\Windows\CurrentVersion\Explorer\UserAssist\{CEBFF5CD-ACE2-4F4F-9178-9926F41749EA}\Count	False	True	2018-04-06 12:50:40.341634
./Users/jcloudy/NTUSER.DAT	Software\Microsoft\Windows\CurrentVersion\Explorer\UserAssist\{CEBFF5CD-ACE2-4F4F-9178-9926F41749EA}\Count	False	True	2018-04-06 12:47:52.767166
./Users/jcloudy/NTUSER.DAT	Software\Microsoft\Windows\CurrentVersion\Explorer\UserAssist\{CEBFF5CD-ACE2-4F4F-9178-9926F41749EA}\Count	False	True	2018-04-06 12:43:39.746196

As you can see, it’s easy to catch additional timestamps!

Extracting data from intermediate states of a registry hive

The yarp library implements a simple interface to access intermediate states of a registry hive when applying transaction log files – a log entry callback.

If a log entry callback function is assigned to the Registry.RegistryHive instance, this function is called after each log entry is applied (so it can be called many times). The Registry.RegistryHive instance can be used from the callback function to access everything in the “intermediate hive” just like in the “normal hive”.
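As a minimal sketch of this mechanism, the helper below collects every distinct last written timestamp of one key across all intermediate states. The function name collect_key_timestamps and the replay parameter are my own, not part of yarp; the hive object is only expected to provide find_key() and the log_entry_callback attribute described above, and I assume find_key() returns None when the key is missing in some state.

```python
def collect_key_timestamps(hive, key_path, replay):
    """Return the distinct last written timestamps of a key, in the order seen.

    hive     - an object behaving like Registry.RegistryHive
    key_path - path of the key to watch
    replay   - a callable that replays the transaction log files
    """
    seen = []

    def on_log_entry():
        key = hive.find_key(key_path)
        if key is None:  # Assumption: a missing key is reported as None.
            return
        timestamp = key.last_written_timestamp()
        if timestamp not in seen:
            seen.append(timestamp)

    hive.log_entry_callback = on_log_entry
    on_log_entry()  # Record the state of the primary file first.
    replay()        # The callback fires after each applied log entry.
    return seen
```

With real files, the replay argument could be something like lambda: hive.recover_auto(None, log1, log2), with hive, log1 and log2 opened as in the full example that follows.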

Let’s take a look at a value in the same registry hive. The code is here:

#!/usr/bin/env python3

from yarp import Registry, RegistryHelpers

primary = open('/mnt/tmp/Users/jcloudy/NTUSER.DAT', 'rb')
log1 = open('/mnt/tmp/Users/jcloudy/ntuser.dat.LOG1', 'rb')
log2 = open('/mnt/tmp/Users/jcloudy/ntuser.dat.LOG2', 'rb')

hive = Registry.RegistryHive(primary)

previous_data = None
def parse_key():
	"""This is a log entry callback."""

	global previous_data, hive

	key = hive.find_key('Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\UserAssist\\{CEBFF5CD-ACE2-4F4F-9178-9926F41749EA}\\Count')
	timestamp = key.last_written_timestamp()
	value = key.value('S:\\Cebtenzf\\Vzntre_Yvgr_3.1.1\\SGX Vzntre.rkr') # Backslashes are escaped, as in the key path above.
	data = value.data_raw()

	print(timestamp)
	if previous_data is None or data != previous_data:
		print('Data:')
		print(RegistryHelpers.HexDump(data))
		previous_data = data
	else:
		print('Same data')

	print('---')

hive.log_entry_callback = parse_key # Assign the log entry callback.

parse_key() # Run it before replaying the log files.
hive.recover_auto(None, log1, log2) # Replay the log files.

primary.close()
log1.close()
log2.close()

The output of that code is:

2018-04-06 12:47:52.767166
Data:
00000000  00 00 00 00 01 00 00 00-01 00 00 00 E1 41 02 00  .............A..
00000010  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000020  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000030  00 00 80 BF 00 00 80 BF-FF FF FF FF C0 2B 91 90  .............+..
00000040  A4 CD D3 01 00 00 00 00                          ........
---
2018-04-06 12:43:39.746196
Data:
00000000  00 00 00 00 01 00 00 00-01 00 00 00 4E 1B 02 00  ............N...
00000010  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000020  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000030  00 00 80 BF 00 00 80 BF-FF FF FF FF C0 2B 91 90  .............+..
00000040  A4 CD D3 01 00 00 00 00                          ........
---
2018-04-06 12:43:39.746196
Same data
---
2018-04-06 12:43:39.746196
Same data
---
2018-04-06 12:47:52.767166
Data:
00000000  00 00 00 00 01 00 00 00-01 00 00 00 E1 41 02 00  .............A..
00000010  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000020  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000030  00 00 80 BF 00 00 80 BF-FF FF FF FF C0 2B 91 90  .............+..
00000040  A4 CD D3 01 00 00 00 00                          ........
---
2018-04-06 12:50:40.341634
Data:
00000000  00 00 00 00 01 00 00 00-01 00 00 00 7B D0 04 00  ............{...
00000010  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000020  00 00 80 BF 00 00 80 BF-00 00 80 BF 00 00 80 BF  ................
00000030  00 00 80 BF 00 00 80 BF-FF FF FF FF C0 2B 91 90  .............+..
00000040  A4 CD D3 01 00 00 00 00                          ........
---

(Sorry if the output isn’t using a monospaced font, this is a WordPress issue. Here is a screenshot of the output.)

So we got 6 states of a single registry value, 3 of which have unique value data. The states differ only in the 4 bytes at offset 12.

Since we were parsing the UserAssist key, we can find the meaning of these bytes in existing documentation: these bytes represent the focus time.
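Here is a short sketch that decodes these bytes from the three unique states shown above. It assumes the commonly documented UserAssist layout for Windows 7 and later, where the focus time is a little-endian 32-bit integer in milliseconds at offset 12; the helper make_value is mine and just rebuilds the 72 bytes from the hex dumps. UserAssist value names are ROT13-encoded, so the name is decoded too.

```python
import codecs
import struct


def make_value(focus_hex):
    """Rebuild the 72-byte value data from the dumps above."""
    return (bytes.fromhex("00000000 01000000 01000000")
            + bytes.fromhex(focus_hex)
            + bytes.fromhex("000080BF") * 10
            + bytes.fromhex("FFFFFFFF C02B9190A4CDD301 00000000"))


# Unique states, keyed by the last written time of the key.
states = {
    "12:43:39": make_value("4E1B0200"),
    "12:47:52": make_value("E1410200"),
    "12:50:40": make_value("7BD00400"),
}

# Offsets at which any two states differ.
values = list(states.values())
changed = sorted({i for a in values for b in values
                  for i, (x, y) in enumerate(zip(a, b)) if x != y})
print(changed)  # [12, 13, 14] (the byte at offset 15 is 00 in every state)

# Assumption: focus time is a little-endian uint32 in milliseconds.
focus_ms = {t: struct.unpack_from("<L", v, 12)[0] for t, v in states.items()}
for t, ms in focus_ms.items():
    print(t, ms, "ms")

name = codecs.decode("S:\\Cebtenzf\\Vzntre_Yvgr_3.1.1\\SGX Vzntre.rkr", "rot_13")
print(name)  # F:\Programs\Imager_Lite_3.1.1\FTK Imager.exe
```

Under that assumption, the focus time grows from 138062 ms to 147937 ms and then to 315515 ms, which is consistent with the program being in use between the flushes.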

An attentive reader may notice that the last written timestamp taken from the dirty primary file (time: 12:47) is “in the future” relative to the last written timestamp taken from the first log entry (time: 12:43). This is explained here.

 
