Windows NTFS
  1. Interface Properties
    1. Case-sensitive (like Unix).
    2. But the Win32 wrapper on top is not.
    3. Names in Unicode.
    4. Files are collections of several attributes.
      1. Each attribute value is a stream of bytes.
      2. An unnamed attribute is the file contents.
      3. Any number of additional streams, usually small, but may be any size.
      4. Alternate streams allow recording of metadata and support information, such as image thumbnails.
      5. Many apps are careless with alternate streams, and lose them.
    5. Hard links and soft links.
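      1. Hard links were originally exposed only through the POSIX subsystem; Win32 has offered CreateHardLink since Windows 2000.
      2. Creating soft (symbolic) links requires a privilege that, by default, only administrators hold.
        I believe Windows shortcuts (.lnk files) are implemented above the FS, by the shell.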
    6. Note that Windows has a POSIX mode in which more Unix-like features of NTFS are exposed.
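
The multi-stream file model in item 4 above can be illustrated with a toy model. This is only a sketch: `ToyFile`, `careless_copy`, and `careful_copy` are hypothetical names invented here, and nothing in it touches a real NTFS volume. It shows why an application that copies only the unnamed stream silently drops the alternate streams.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    """Toy model of an NTFS file: a set of named byte streams (hypothetical)."""
    # The unnamed stream (key "") is what most programs call "the file contents".
    streams: dict[str, bytes] = field(default_factory=lambda: {"": b""})

    def read(self, stream: str = "") -> bytes:
        return self.streams[stream]

    def write(self, data: bytes, stream: str = "") -> None:
        self.streams[stream] = data


def careless_copy(src: ToyFile) -> ToyFile:
    """Copy only the unnamed stream, as a stream-unaware application would."""
    dst = ToyFile()
    dst.write(src.read())
    return dst


def careful_copy(src: ToyFile) -> ToyFile:
    """Copy every stream, named or not."""
    return ToyFile(streams=dict(src.streams))


photo = ToyFile()
photo.write(b"...image bytes...")
photo.write(b"...small preview...", stream="thumbnail")

print(sorted(careless_copy(photo).streams))  # only the unnamed stream survives
print(sorted(careful_copy(photo).streams))   # both streams survive
```
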
  2. Volume layout

    MFT = Master File Table.
    1. Each volume is a series of blocks (NTFS calls them clusters). The block size varies with the size of the partition, but 4K is common.
    2. Blocks are identified by their offset from the start of the volume.
    3. The boot block contains a pointer to the MFT, which may be anywhere on the volume.
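
As a rough sketch of how the boot block leads to the MFT, the code below builds a synthetic boot sector and reads the block size and MFT location back out of it. The field offsets (0x0B, 0x0D, 0x30) follow the commonly documented NTFS boot sector layout and are an assumption here, not something stated in these notes; the function names are hypothetical and no real disk is read.

```python
import struct


def build_toy_boot_sector(bytes_per_sector, sectors_per_cluster, mft_cluster):
    """Build a synthetic 512-byte boot sector holding only the fields used below."""
    sector = bytearray(512)
    sector[3:11] = b"NTFS    "                          # file-system identifier
    struct.pack_into("<H", sector, 0x0B, bytes_per_sector)
    struct.pack_into("<B", sector, 0x0D, sectors_per_cluster)
    struct.pack_into("<Q", sector, 0x30, mft_cluster)   # block number of the MFT
    return bytes(sector)


def mft_byte_offset(boot_sector):
    """Return (block size, byte offset of the MFT) read from a boot sector."""
    if boot_sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")
    (bytes_per_sector,) = struct.unpack_from("<H", boot_sector, 0x0B)
    (sectors_per_cluster,) = struct.unpack_from("<B", boot_sector, 0x0D)
    (mft_cluster,) = struct.unpack_from("<Q", boot_sector, 0x30)
    block_size = bytes_per_sector * sectors_per_cluster
    # Blocks are identified by offset from the start of the volume.
    return block_size, mft_cluster * block_size


boot = build_toy_boot_sector(512, 8, 786432)
block_size, offset = mft_byte_offset(boot)
print(block_size, offset)
```
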
  3. The Master File Table (MFT).
    1. MFT is a series of 1K records.
    2. Each record describes one file, analogous to a Unix i-node.
    3. The first 16 entries are special-use files.
      1. MFT itself.
      2. Mirror copy of MFT, for reliability.
      3. The log file (journal).
      4. Volume information: size, label, version, etc.
      5. Attribute definitions.
      6. Root directory.
      7. Bitmap of used blocks.
      8. Bootstrap loader (first code to run?).
      9. Bad block list. These should not be used in a file.
      10. Security descriptors.
      11. Case mapping (not trivial for some languages).
      12. Directory containing extensions.
      13-16. Reserved for future use.
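
Because every record has the same size, finding a record is simple arithmetic. The sketch below assumes records are numbered from 0 (so the MFT itself is record 0) and that the MFT is stored contiguously; a real MFT can be fragmented and is described by its own extents. `record_offset` and `describe` are hypothetical helper names.

```python
MFT_RECORD_SIZE = 1024  # 1K records, as in the notes

# Special-use records in the order listed above (list position - 1 = record number).
SPECIAL_RECORDS = [
    "MFT itself", "MFT mirror", "log file", "volume information",
    "attribute definitions", "root directory", "block bitmap",
    "bootstrap loader", "bad block list", "security descriptors",
    "case mapping", "extensions directory",
]


def record_offset(mft_start, record_number):
    """Byte offset of an MFT record, assuming the MFT is contiguous."""
    return mft_start + record_number * MFT_RECORD_SIZE


def describe(record_number):
    """Say what kind of file a given record number describes."""
    if record_number < len(SPECIAL_RECORDS):
        return SPECIAL_RECORDS[record_number]
    if record_number < 16:
        return "reserved"
    return "ordinary file or directory"


for n in (0, 5, 13, 16):
    print(n, record_offset(0, n), describe(n))
```
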
  4. MFT Entries (fixed 1K each)
    1. Starts with a header, followed by a series of attribute records.
      1. Each attribute has an identifying code and some content.
      2. Attributes appear in a fixed order; some may repeat. Not all need be present.
      3. Attribute records are of variable length.
    2. Depending on size, the attribute value may be recorded:
      1. Directly in the attribute record (a resident attribute).
      2. In the data area of the volume, with the location given in the attribute record (a nonresident attribute).
    3. Attribute types
      1. Security information used to be stored in an attribute of each file, but was moved to the single security file to save space.
      2. The Data attribute holds the file contents. More than one may appear; any additional data streams must have a name.
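
The record structure described above (variable-length attribute records, each with a type code and a resident/nonresident flag) can be walked with a short loop. This is a simplified sketch: the toy header keeps only a 4-byte type code, a 4-byte record length, and a 1-byte nonresident flag, and uses 0xFFFFFFFF as an end marker. The real on-disk header has more fields, so treat the layout and the type codes (0x10, 0x30, 0x80) as assumptions drawn from commonly published descriptions of NTFS, not from these notes.

```python
import struct

END_MARKER = 0xFFFFFFFF
HEADER = struct.Struct("<IIB")  # type code, record length, nonresident flag


def make_attribute(type_code, payload, nonresident=False):
    """Pack one toy attribute record, padded to a multiple of 8 bytes."""
    length = HEADER.size + len(payload)
    length += -length % 8
    record = HEADER.pack(type_code, length, int(nonresident)) + payload
    return record.ljust(length, b"\x00")


def walk_attributes(buf, offset=0):
    """Yield (type_code, nonresident, body) for each attribute record in buf."""
    while offset + 4 <= len(buf):
        (type_code,) = struct.unpack_from("<I", buf, offset)
        if type_code == END_MARKER:
            return
        type_code, length, nonresident = HEADER.unpack_from(buf, offset)
        if length < HEADER.size:
            raise ValueError("corrupt attribute length")
        # The body still includes any zero padding added by make_attribute.
        body = buf[offset + HEADER.size: offset + length]
        yield type_code, bool(nonresident), body
        offset += length  # records are variable length, so step by each one's size


record = (make_attribute(0x10, b"timestamps")
          + make_attribute(0x30, b"notes.txt")
          + make_attribute(0x80, b"hello")
          + struct.pack("<I", END_MARKER))

for type_code, nonresident, body in walk_attributes(record):
    print(hex(type_code), nonresident, body.rstrip(b"\x00"))
```
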
  5. Directories are files
    1. As in other systems, directories are essentially files, but there are special attributes used just for them.
    2. Small directories are simple lists, but large directories are B+ trees.
  6. Storage Allocation
    1. Basic block size typically 4K.
    2. When possible, blocks are allocated in contiguous groups called extents (NTFS calls them runs).
    3. Attribute values are stored in one of three ways:
      1. Inside the attribute record, if very small.
      2. If larger, the attribute contains a list of extents where the data resides. Each extent is given as a start location and size.
      3. If still larger, the attribute record refers to a different MFT record containing only the attribute, giving much more room for extent records. Any number of these may be used.
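
Reading a nonresident attribute means translating a block number within the file into a block number on the volume by walking the extent list. A minimal sketch, assuming each extent is a (start block, length in blocks) pair as described above; the function names are hypothetical.

```python
from bisect import bisect_right


def build_index(extents):
    """Return (first file block of each extent, total blocks in the file)."""
    firsts, total = [], 0
    for _start, length in extents:
        firsts.append(total)
        total += length
    return firsts, total


def file_block_to_volume_block(extents, file_block):
    """Translate a block number within the file to a block number on the volume."""
    firsts, total = build_index(extents)
    if not 0 <= file_block < total:
        raise IndexError("block is past the end of the file")
    i = bisect_right(firsts, file_block) - 1   # extent containing file_block
    start, _length = extents[i]
    return start + (file_block - firsts[i])


# Three extents: file blocks 0-3, 4-5, and 6-15 live in different places.
extents = [(1000, 4), (5000, 2), (200, 10)]
for fb in (0, 3, 4, 6, 15):
    print(fb, file_block_to_volume_block(extents, fb))
```

Rebuilding the index on every lookup keeps the sketch short; a real implementation would cache it.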
  7. Reparse points
    1. Associate a path with arbitrary code that executes when the path is accessed.
    2. Applications may use this to build FS extensions.
    3. Microsoft uses it to create several additional features, including
      1. Symbolic links.
      2. Volume mount points, like Unix mount points.
    4. Storage management is essentially indexed allocation in which the storage units (extents) can vary in size.
    5. If an MFT record cannot hold enough runs, MFT records can be chained. [source].
      1. The starting MFT record holds the usual metadata.
      2. It gives the location of a second MFT record dedicated to runs.
      3. This can be generalized either by adding more indirect MFT records to the first, or by adding double (or more?) indirect MFT indexes. This forms a tree similar to Unix's, but without a predetermined shape.
  8. Files can be marked for compression.
    1. NTFS attempts to compress each 16-block group.
    2. If it gets smaller, store it compressed.
    3. If it doesn't, store it uncompressed.
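
The per-group decision can be sketched as follows. Two assumptions are made here that are not in the notes: zlib stands in for whatever compressor NTFS actually uses, and "smaller" is measured in whole blocks, since saving less than a block frees no space on disk.

```python
import os
import zlib

BLOCK_SIZE = 4096
GROUP_BLOCKS = 16
GROUP_SIZE = BLOCK_SIZE * GROUP_BLOCKS


def blocks_needed(n_bytes):
    """Number of whole blocks needed to hold n_bytes (ceiling division)."""
    return -(-n_bytes // BLOCK_SIZE)


def store_group(group):
    """Return (is_compressed, stored_bytes) for one group of up to 16 blocks."""
    packed = zlib.compress(group)   # stand-in compressor, not the NTFS algorithm
    if blocks_needed(len(packed)) < blocks_needed(len(group)):
        return True, packed
    return False, group


def store_file(data):
    """Apply the decision independently to each 16-block group of the file."""
    return [store_group(data[i:i + GROUP_SIZE])
            for i in range(0, len(data), GROUP_SIZE)]


repetitive = b"A" * GROUP_SIZE       # compresses very well
noise = os.urandom(GROUP_SIZE)       # random bytes do not compress
for is_compressed, stored in store_file(repetitive + noise):
    print(is_compressed, blocks_needed(len(stored)))
```
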
  9. Directories can be marked for encryption.
    1. All files placed there are encrypted.
    2. NTFS doesn't perform the encryption, but uses call-backs.
    3. A separate driver (EFS, the Encrypting File System) registers the callbacks.
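
The callback arrangement can be illustrated with a toy file system that knows which directories are marked but leaves the actual transformation to whoever registered the callbacks. Everything here is hypothetical: the class, its method names, and the XOR "cipher", which is a placeholder with no security value. For brevity only a file's immediate parent directory is checked.

```python
class ToyFileSystem:
    """Toy file system that delegates encryption to registered callbacks."""

    def __init__(self):
        self._files = {}              # path -> bytes as stored
        self._encrypted_dirs = set()
        self._encrypt = None
        self._decrypt = None

    def register_callbacks(self, encrypt, decrypt):
        """Called by the encryption provider, not by the file system itself."""
        self._encrypt, self._decrypt = encrypt, decrypt

    def mark_encrypted(self, directory):
        self._encrypted_dirs.add(directory.rstrip("/"))

    def _is_encrypted(self, path):
        return path.rsplit("/", 1)[0] in self._encrypted_dirs

    def write(self, path, data):
        if self._is_encrypted(path):
            if self._encrypt is None:
                raise RuntimeError("no encryption callbacks registered")
            data = self._encrypt(data)
        self._files[path] = data

    def read(self, path):
        data = self._files[path]
        if self._is_encrypted(path):
            data = self._decrypt(data)
        return data

    def raw(self, path):
        """Return the bytes exactly as stored, bypassing the callbacks."""
        return self._files[path]


def xor_cipher(data, key=0x5A):
    """Placeholder cipher for illustration only; not secure."""
    return bytes(b ^ key for b in data)


fs = ToyFileSystem()
fs.register_callbacks(encrypt=xor_cipher, decrypt=xor_cipher)
fs.mark_encrypted("/secret")
fs.write("/secret/plan.txt", b"meet at noon")
fs.write("/public/readme.txt", b"hello")
print(fs.raw("/secret/plan.txt") != b"meet at noon", fs.read("/secret/plan.txt"))
```
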
Refs: Textbook, http://www.dewassoc.com/kbase/windows_nt/ntfs_directories_and_files.htm, http://kcall.co.uk/ntfs/index.html, https://support.microsoft.com/en-us/help/140365/default-cluster-size-for-ntfs-fat-and-exfat