Things change, and file systems are no exception, even when their version numbers stay the same.
This post will outline some interesting things found in the current NTFS implementation which are either poorly documented or not documented elsewhere.
An $MFT number within a file record segment
Each file record segment (FRS, other sources often call it a file record) has a reference number for itself as shown below (source):
However, the two bytes at offset 0x2A (42) are actually used. File reference numbers are 48-bit values, so 4 bytes are not enough to store one. The two bytes at offset 0x2A (42) therefore hold the upper 16 bits of the reference number (which are zero in practice, because it is almost impossible to reach the 32-bit limit). Note that the upper part is stored before the lower part!
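A minimal Python sketch of how the two parts could be combined into one 48-bit number. The offset 0x2A comes from the text above. The offset 0x2C for the lower 32 bits is an assumption based on the commonly documented layout, in which the 4-byte record number immediately follows those two bytes. The function name `frs_number` is hypothetical.

```python
import struct

HIGH_PART_OFFSET = 0x2A  # 2 bytes: upper 16 bits of the reference number
LOW_PART_OFFSET = 0x2C   # 4 bytes: lower 32 bits (assumed offset, see the note above)


def frs_number(frs: bytes) -> int:
    """Return the 48-bit number a file record segment stores for itself."""
    # "<HI" reads a little-endian 16-bit value followed directly by a
    # little-endian 32-bit value, i.e. the upper part comes first on disk.
    high, low = struct.unpack_from("<HI", frs, HIGH_PART_OFFSET)
    return (high << 32) | low


# Usage: a synthetic header with the upper part set to 1 and the lower part to 5.
record = bytearray(0x30)
struct.pack_into("<HI", record, HIGH_PART_OFFSET, 0x0001, 0x00000005)
print(hex(frs_number(bytes(record))))
```

Reading both fields with one format string only works because they are adjacent; if the assumed lower-part offset is wrong for a given layout, the two values should be read separately.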
Flags of a file record segment
Each file record segment has a Flags field. Currently, only two flags are well documented, as quoted below (same source):
The values of the other, previously unknown flags are described below:
- 0x4: the file quota is never charged; this file cannot be opened by its FRS reference number (unless a special flag is given);
- 0x8: is an index file.
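As an illustration, the Flags value could be decoded with a small lookup table like the one below. The descriptions for 0x4 and 0x8 follow the list above. The descriptions for 0x1 and 0x2 ("in use" and "directory") are an assumption taken from the commonly documented meaning of those bits. The names `FRS_FLAGS` and `describe_frs_flags` are hypothetical.

```python
# Bit value -> short description.
FRS_FLAGS = {
    0x1: "in use",      # assumed: commonly documented meaning
    0x2: "directory",   # assumed: commonly documented meaning
    0x4: "quota never charged, not openable by FRS reference number",
    0x8: "index file",
}


def describe_frs_flags(value: int) -> list[str]:
    """List the descriptions of all flags set in a 16-bit Flags value."""
    names = [text for bit, text in FRS_FLAGS.items() if value & bit]
    known_mask = 0
    for bit in FRS_FLAGS:
        known_mask |= bit
    leftover = value & ~known_mask
    if leftover:
        # Keep bits that the table does not cover visible to the caller.
        names.append(f"unknown (0x{leftover:x})")
    return names


print(describe_frs_flags(0x9))
```

Reporting leftover bits explicitly is a deliberate choice here, since new flags can appear without any version number changing.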
The $VOLUME_INFORMATION attribute contains information about the volume (version numbers and flags). Currently, only a few flags are documented (source):
Here is a full list of volume flags found in Windows 10:
- 0x1: the volume is corrupt (dirty);
- 0x2: need to resize the $LogFile journal;
- 0x4: need to upgrade the volume version;
- 0x8: the object IDs, the quotas, and the USN journal metadata can be corrupt (this flag is set by Windows NT 4.0);
- 0x10: need to delete the USN journal;
- 0x20: need to repair the object IDs;
- 0x40: the volume is corrupt, and this caused a bug check;
- 0x80: persistent volume state: no tunneling cache, short file name creation is disabled;
- 0x100: need to run the full Chkdsk scan;
- 0x200: need to run the proactive scan;
- 0x400: persistent volume state: the TxF feature is disabled;
- 0x800: persistent volume state: the volume scrub is disabled;
- 0x1000: do not create the corruption log file ($Verify and $Corrupt);
- 0x2000: persistent volume state: the heat gathering is disabled;
- 0x4000: this was a system volume during the Chkdsk scan;
- 0x8000: the volume was modified by the Chkdsk scan.
As you can see, Microsoft has run out of volume flags: the $VOLUME_INFORMATION flags are stored in two bytes, and all 16 bits are now taken.
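The flag list above can be written down as a Python `enum.IntFlag`, which also makes it easy to check that all 16 bits are in use. This is only a sketch: the member names are made up for illustration and are not Microsoft's identifiers; only the bit values and their meanings come from the list.

```python
import enum


class VolumeFlags(enum.IntFlag):
    # Hypothetical names; the values follow the list above.
    DIRTY = 0x1
    RESIZE_LOGFILE = 0x2
    UPGRADE_VERSION = 0x4
    NT4_METADATA_SUSPECT = 0x8
    DELETE_USN_JOURNAL = 0x10
    REPAIR_OBJECT_IDS = 0x20
    CORRUPT_BUGCHECK = 0x40
    NO_TUNNELING_NO_SHORT_NAMES = 0x80
    FULL_CHKDSK_NEEDED = 0x100
    PROACTIVE_SCAN_NEEDED = 0x200
    TXF_DISABLED = 0x400
    SCRUB_DISABLED = 0x800
    NO_CORRUPTION_LOG = 0x1000
    HEAT_GATHERING_DISABLED = 0x2000
    SYSTEM_VOLUME_DURING_CHKDSK = 0x4000
    MODIFIED_BY_CHKDSK = 0x8000


def decode_volume_flags(value: int) -> list[str]:
    """Return the names of all flags set in a 16-bit flags value."""
    return [flag.name for flag in VolumeFlags if value & flag]


# Every bit of the two-byte field is assigned.
ALL_FLAGS_MASK = 0
for flag in VolumeFlags:
    ALL_FLAGS_MASK |= int(flag)

print(decode_volume_flags(0x8001))
print(hex(ALL_FLAGS_MASK))
```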
The documented layout of the $STANDARD_INFORMATION attribute is as follows (source):
The Version number field seems to have been renamed to something else. In particular, this field is now used to store extra flags for a file. Currently, the only flag is:
- 0x1: is case sensitive (used by Windows Subsystem for Linux).
Update (2020-03-12): the second byte (counting from the least significant byte) in the same field is used to store a storage reserve ID.
Update (2020-05-19): actually, only the first four bits of the second byte are used to store a storage reserve ID. Also, recent Insider Preview versions of Windows 10 (Fast Ring, not “20H1”) use the first byte in the same field to store known folder information (it uses three bits after the “is case sensitive” bit).
Update (2021-12-10): the next two bits of the first byte (after the known folder information field) have now been allocated to store the trust level of a reparse point (Windows 11).
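Putting the flag and the three updates together, the field could be unpacked as sketched below. Several things here are assumptions rather than confirmed facts: the field is treated as a little-endian integer, "the first four bits of the second byte" is read as the low four bits, and the known folder and trust level values are returned as raw numbers because their meaning is not described here. The names `ExtraFlags` and `decode_extra_flags` are hypothetical.

```python
from typing import NamedTuple


class ExtraFlags(NamedTuple):
    case_sensitive: bool
    known_folder: int          # raw 3-bit value
    reparse_trust_level: int   # raw 2-bit value
    storage_reserve_id: int    # raw 4-bit value


def decode_extra_flags(value: int) -> ExtraFlags:
    """Split the former Version number field into its sub-fields."""
    first_byte = value & 0xFF          # least significant byte
    second_byte = (value >> 8) & 0xFF
    return ExtraFlags(
        case_sensitive=bool(first_byte & 0x1),         # bit 0
        known_folder=(first_byte >> 1) & 0x7,          # bits 1-3
        reparse_trust_level=(first_byte >> 4) & 0x3,   # bits 4-5
        storage_reserve_id=second_byte & 0xF,          # assumed: low 4 bits
    )


# Usage: a case-sensitive file with storage reserve ID 2.
print(decode_extra_flags(0x0201))
```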
Here is an example (the new field is red):
Extended attributes are actively used in Windows 10 for various purposes. An interesting example is the Linux file system metadata stored by Windows Subsystem for Linux:
In the screenshot above, there are two extended attributes: LXXATTR and LXATTRB (they are stored in the FILE_FULL_EA_INFORMATION structure).
The first one, LXXATTR, is used to store extended attributes for a file in the Ubuntu installation.
The second one, LXATTRB, is used to store file status information for the same file. This structure contains the following fields:
- mode (st_mode);
- device ID (st_rdev);
- three timestamps: last access, modified, inode changed (as Unix time with nanosecond precision).
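To get at the values of the two attributes named above, the extended attribute list first has to be walked entry by entry. The sketch below assumes the buffer follows the documented FILE_FULL_EA_INFORMATION layout: a 4-byte offset to the next entry, a 1-byte flags field, a 1-byte name length, a 2-byte value length, then the null-terminated name followed by the value. The function name `parse_ea_list` and the sample values are hypothetical.

```python
import struct

EA_HEADER = struct.Struct("<IBBH")  # NextEntryOffset, Flags, EaNameLength, EaValueLength


def parse_ea_list(buf: bytes) -> list[tuple[str, int, bytes]]:
    """Return (name, flags, value) for every entry in an EA list."""
    entries = []
    if not buf:
        return entries
    offset = 0
    while True:
        next_offset, flags, name_len, value_len = EA_HEADER.unpack_from(buf, offset)
        name_start = offset + EA_HEADER.size
        name = buf[name_start:name_start + name_len].decode("ascii")
        value_start = name_start + name_len + 1  # skip the name's null terminator
        value = buf[value_start:value_start + value_len]
        entries.append((name, flags, value))
        if next_offset == 0:  # a zero offset marks the last entry
            break
        offset += next_offset
    return entries


def build_entry(name: bytes, value: bytes, last: bool) -> bytes:
    """Build one entry padded to 4 bytes (only used to make sample data)."""
    body = struct.pack("<BBH", 0, len(name), len(value)) + name + b"\0" + value
    size = 4 + len(body)
    padded = (size + 3) & ~3
    return struct.pack("<I", 0 if last else padded) + body + b"\0" * (padded - size)


sample = build_entry(b"LXXATTR", b"\x01\x02", False) + build_entry(b"LXATTRB", b"\x00" * 8, True)
for ea_name, ea_flags, ea_value in parse_ea_list(sample):
    print(ea_name, ea_flags, len(ea_value))
```

The sample values are placeholders and do not represent real LXXATTR or LXATTRB contents.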
The layout of the LXATTRB blob is as follows (all offsets are relative to the start of the extended attribute value):