When I started researching FAT structures, I thought that the FAT12/16/32 file systems were well documented and that nothing new could be discovered.
There are specifications from Microsoft (DOC), ECMA (PDF), and an extremely informative article on Wikipedia.
But there are two important things worth mentioning…
First, starting from Windows 10 “Redstone 1”, EFS-based encryption is supported for FAT volumes. This feature is thoroughly described in US10726147B2.
Encrypted files have the “.PFILE” extension and their 8.3 directory entries store additional metadata. In the current implementation, this metadata fits in 6 bits: two bits are used as flags and four bits are used to store the padding size.
The additional metadata is stored in the NTByte field, which is located at the offset of 12 bytes within the 8.3 directory entry. Previously, this field was only used to store two flags marking the short base name or extension as lowercase (bits #3 and #4 respectively).
Now, the remaining bits are used too. Bit #0 is set when the file is encrypted (it’s also set for a directory when its newly created files should be encrypted by default), and bit #1 is set when the file starts with a large EFS header (otherwise, it’s a standard EFS header). The other bits (bits #2, #5, #6, and #7) store the padding size (which is at most 15 bytes, so 4 bits are enough): its bit #0 (LSB) goes to bit #2 of the NTByte field, bit #1 to bit #5, bit #2 to bit #6, and bit #3 to bit #7.
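To make the bit layout easier to follow, here is a minimal Python sketch that unpacks the NTByte field according to the description above. The names (`NTByteInfo`, `decode_nt_byte`, and the `NT_*` constants) are my own and hypothetical; they don’t come from any specification or driver source.

```python
from dataclasses import dataclass

# Bit masks within the NTByte field (offset 12 of an 8.3 directory entry).
# The constant names are hypothetical; only the bit positions come from the text.
NT_ENCRYPTED = 0x01         # bit #0: file is encrypted
NT_LARGE_EFS_HEADER = 0x02  # bit #1: file starts with a large EFS header
NT_LOWERCASE_BASE = 0x08    # bit #3: short base name is lowercase
NT_LOWERCASE_EXT = 0x10     # bit #4: short extension is lowercase


@dataclass
class NTByteInfo:
    encrypted: bool
    large_efs_header: bool
    lowercase_base: bool
    lowercase_ext: bool
    padding_size: int  # 0..15 bytes


def decode_nt_byte(nt_byte: int) -> NTByteInfo:
    """Split the NTByte field into flags and the scattered padding size."""
    if not 0 <= nt_byte <= 0xFF:
        raise ValueError("NTByte must fit in one byte")
    # Padding bits #0..#3 live in NTByte bits #2, #5, #6, and #7.
    padding_size = (
        ((nt_byte >> 2) & 1)
        | (((nt_byte >> 5) & 1) << 1)
        | (((nt_byte >> 6) & 1) << 2)
        | (((nt_byte >> 7) & 1) << 3)
    )
    return NTByteInfo(
        encrypted=bool(nt_byte & NT_ENCRYPTED),
        large_efs_header=bool(nt_byte & NT_LARGE_EFS_HEADER),
        lowercase_base=bool(nt_byte & NT_LOWERCASE_BASE),
        lowercase_ext=bool(nt_byte & NT_LOWERCASE_EXT),
        padding_size=padding_size,
    )
```

For instance, `decode_nt_byte(0x21)` describes an encrypted file with a standard EFS header and a padding size of 2 bytes, because bit #5 of the field carries bit #1 of the padding size.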
Here is an example of such a directory entry:
In this entry, the file is marked as encrypted (the 0x01 bit mask is set) and its padding size is 2 bytes. The EFS header has a standard size – 4096 bytes (since the 0x02 bit mask isn’t set), so the FAT driver can display the decrypted file size without reading the file data (here, the encrypted file size is 0x1010, so the decrypted file size is 0x1010-4096-2=14 bytes).
A large EFS header requires reading the file’s first bytes to determine the decrypted data size (because the header size isn’t known in advance), so the optimization described above allows the driver to quickly provide proper (decrypted) sizes when listing a directory.
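The size calculation can be sketched in Python as follows. This is an illustration of the arithmetic described above, not the driver’s actual code: the function name `decrypted_size` is hypothetical, and the NTByte value 0x21 used in the usage note is a value I picked to match the description (encrypted, standard header, 2 bytes of padding); the real byte in the entry may have other bits set.

```python
from typing import Optional

STANDARD_EFS_HEADER_SIZE = 4096  # size of a standard EFS header, per the text


def decrypted_size(encrypted_size: int, nt_byte: int) -> Optional[int]:
    """Return the decrypted size using only directory entry data.

    Returns None when the file has a large EFS header, since the header
    size must then be read from the file data itself.
    """
    if not nt_byte & 0x01:
        # Not encrypted: the size in the directory entry is the real size.
        return encrypted_size
    if nt_byte & 0x02:
        # Large EFS header: its size isn't known in advance.
        return None
    # Gather the padding size from NTByte bits #2, #5, #6, and #7.
    padding_size = ((nt_byte >> 2) & 1) | ((nt_byte >> 4) & 0x0E)
    return encrypted_size - STANDARD_EFS_HEADER_SIZE - padding_size
```

With the numbers from the example, `decrypted_size(0x1010, 0x21)` gives 4112 - 4096 - 2 = 14 bytes.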
Second, it seems that one implementation of FAT12/16/32 file systems misused three fields in 8.3 directory entries for decades!
Let’s try to do some file system activity on a FAT32-formatted USB Flash stick.
I will test three operating systems:
- macOS (Catalina, 10.15.7);
- Windows 11 (10.0.22000.348);
- Ubuntu (21.10).
Using each operating system, I will create two directories in the root of the volume, then place one file (containing some data) in the first directory and another file (containing data too) in the second directory. After that, I will properly eject the drive and attach it again (to the same system). Then, I will rename the file in the first directory and create a subdirectory in the second directory. After that, I will properly eject the drive and move to the next operating system (there will be six directories in the root of the volume after the test).
After the test, the following timestamps can be observed (all timestamps were extracted using The Sleuth Kit):
- The last access (A) timestamps point to the proper date.
- The modified (M) timestamps contain reasonable values. In Linux, the M timestamp of a directory is updated when its contents are changed.
- The created (C) timestamps contain reasonable values in all cases, except Linux. In Linux, the C timestamps of the two directories (in the root of the volume) match their M timestamps (which were updated when I renamed a file and created a subdirectory).
Let’s do another test against Ubuntu (21.10), but only create a file in the root (on a fixed drive).
After creating a file and writing some data to it (and unmounting the volume):
- The M timestamp: 2021-12-08 09:34:32.
- The C timestamp: 2021-12-08 09:34:33.
After mounting the volume again and appending data to the same file (and then unmounting the volume):
- The M timestamp: 2021-12-08 10:02:52.
- The C timestamp: 2021-12-08 10:02:53.
So, the created timestamp doesn’t follow the usual rules!
The stat command on the mounted file system displays the following (it’s the same file):
It seems that the driver treats the created timestamp as the inode changed timestamp. This could be a mislabeled timestamp, but we know that it also doesn’t follow the expected rules for the created timestamp.
According to the kernel source code, the C timestamp is updated, for example, when writing to a file or when renaming a file within a directory. So, this timestamp clearly follows the “inode changed” semantics…
I was able to reproduce this behavior even on Debian 3 (its kernel version is 2.2.20; using the “vfat” driver). And the origin of this implementation can be traced back to even earlier versions of the kernel!
Was this observed before? Yes! Here (2017) and here (2002).
So, in this case, the Linux implementation of FAT12/16/32 file systems doesn’t match existing specifications, because it’s treating “ctime” as “inode changed time”, not “creation time” (historically, “ctime” in Linux always referred to “inode changed time”).
Update (2021-12-09): three misused fields are called “DIR_CrtTimeTenth”, “DIR_CrtTime”, and “DIR_CrtDate”.
Update (2022-08-22): in Linux 5.19, the meaning of these fields has been changed.
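For readers who want to inspect these three fields in a raw directory entry, here is a minimal Python sketch. The field offsets (13, 14, and 16) and the bit layout follow the Microsoft FAT specification; the function name `decode_creation_time` is hypothetical. FAT timestamps carry no time zone, so the result is a naive local time.

```python
import struct
from datetime import datetime, timedelta


def decode_creation_time(entry: bytes) -> datetime:
    """Decode DIR_CrtTimeTenth, DIR_CrtTime, and DIR_CrtDate from a 32-byte entry."""
    if len(entry) != 32:
        raise ValueError("an 8.3 directory entry is 32 bytes long")
    # Offset 13: DIR_CrtTimeTenth (1 byte, units of 10 ms, 0..199)
    # Offset 14: DIR_CrtTime (2 bytes, little-endian)
    # Offset 16: DIR_CrtDate (2 bytes, little-endian)
    tenth, time_raw, date_raw = struct.unpack_from("<BHH", entry, 13)
    year = 1980 + (date_raw >> 9)       # bits 15..9: years since 1980
    month = (date_raw >> 5) & 0x0F      # bits 8..5: month (1..12)
    day = date_raw & 0x1F               # bits 4..0: day (1..31)
    hour = time_raw >> 11               # bits 15..11: hours
    minute = (time_raw >> 5) & 0x3F     # bits 10..5: minutes
    second = (time_raw & 0x1F) * 2      # bits 4..0: seconds in 2-second steps
    base = datetime(year, month, day, hour, minute, second)
    # DIR_CrtTimeTenth refines the 2-second granularity of DIR_CrtTime.
    return base + timedelta(milliseconds=tenth * 10)
```

Note that an entry with a zeroed date field makes `datetime` raise a `ValueError`, because month 0 and day 0 are not valid dates.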