Linux File System
Information from Sec. 10.6.
- The Virtual File System (VFS)
- Linux designed to support multiple file systems.
- The VFS is a layer in the OS that provides one common interface to all of them.
Similar to a Java interface or abstract class.
- Each actual file system fills in virtual function methods.
- Supported file systems include native EXT-3 (or 4), FAT (still used on
removable media), ISO-9660 (CD-ROMs), NTFS (that Windows partition
you might have lying around), and several other specialized ones.
- Abstract file operations structure.
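A minimal sketch of the interface idea, in Python rather than kernel C. All names here (FileSystem, MemFS, UpperFS, VFS) are made up for illustration and are not the kernel's actual structures; the real VFS uses tables of function pointers.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical VFS-style interface that each concrete file system fills in."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class MemFS(FileSystem):
    """Toy in-memory file system standing in for a real one such as EXT."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = bytes(data)


class UpperFS(MemFS):
    """Second toy implementation: ignores case in names, loosely FAT-like."""

    def read(self, path):
        return super().read(path.upper())

    def write(self, path, data):
        super().write(path.upper(), data)


class VFS:
    """Forwards each call to the file system mounted at the longest matching prefix."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix, fs):
        self._mounts[prefix] = fs

    def _resolve(self, path):
        best = max((p for p in self._mounts if path.startswith(p)), key=len)
        return self._mounts[best], path[len(best):]

    def read(self, path):
        fs, rest = self._resolve(path)
        return fs.read(rest)

    def write(self, path, data):
        fs, rest = self._resolve(path)
        fs.write(rest, data)
```

The caller only ever talks to VFS; which implementation runs depends on the mount point, which is the point of the abstraction.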
- EXT-2
- An EXT partition is divided up into block groups.
- EXT partition layout.
- Superblock describes the whole FS: Size, group number and location, etc.
Copies in each block group.
- Group descriptor describes the group: Size, locations of the other
blocks, etc.
- Two bit maps describing free space, one for data blocks and one for
i-nodes.
- I-node area, then the block area.
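A rough sketch of the arithmetic behind the layout above. It assumes each bitmap occupies exactly one block, so one group can track at most 8 * block_size blocks, and that bits are numbered from the least significant bit of each byte. The function names are hypothetical, and the boot block and other offsets are ignored.

```python
def group_layout(total_blocks, block_size):
    """Return (blocks per group, number of groups), assuming a one-block bitmap."""
    blocks_per_group = 8 * block_size  # one bit per block in a single bitmap block
    groups = -(-total_blocks // blocks_per_group)  # ceiling division
    return blocks_per_group, groups


def first_free(bitmap):
    """Return the index of the first clear bit in the bitmap, or None if it is full."""
    for byte_index, byte in enumerate(bitmap):
        if byte != 0xFF:
            for bit in range(8):
                if not byte & (1 << bit):
                    return byte_index * 8 + bit
    return None
```

Under these assumptions a 1 GiB partition with 4K blocks has 262144 blocks, 32768 blocks per group, and 8 groups.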
- I-Nodes
- Each file has one inode (index node) from which everything you need
to know can be found.
- Inode format.
- Mode describes file permissions and some other things.
- Link count is how many directory entries refer to this file. That way
the system knows when it's safe to delete it.
- Uid is a numerical code that maps to the owner.
- Gid is a numerical code that maps to a group in /etc/group.
- There are three times: last read, last write,
last inode change for some other cause (usually permission change).
- Locations of data blocks as indicated.
- Indirect blocks are taken from the data area, but contain only
the locations of other blocks (no user data).
- Arrangement favors small files, but allows very large ones.
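To see why the arrangement favors small files but still allows very large ones, here is a small sketch. It assumes the usual EXT-2 figures of 12 direct pointers, one single, one double, and one triple indirect pointer, and 4-byte block addresses; these numbers are not stated in the notes above. The result is only the bound implied by the pointer structure; other fields may impose lower limits in practice.

```python
DIRECT = 12   # assumed number of direct block pointers in the inode
PTR_SIZE = 4  # assumed size of one block address, in bytes


def max_file_blocks(block_size):
    """Blocks reachable through direct, single, double, and triple indirection."""
    per_block = block_size // PTR_SIZE  # addresses that fit in one indirect block
    return DIRECT + per_block + per_block ** 2 + per_block ** 3


def indirection_level(logical_block, block_size):
    """Return 0 for a direct block, or 1, 2, 3 for the indirect level needed."""
    per_block = block_size // PTR_SIZE
    if logical_block < DIRECT:
        return 0
    logical_block -= DIRECT
    for level in (1, 2, 3):
        span = per_block ** level
        if logical_block < span:
            return level
        logical_block -= span
    raise ValueError("block number beyond triple-indirect range")
```

With 1K blocks the first 12K of a file needs no indirect block at all, while the pointer structure still reaches about 16 GiB.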
- Directories.
- Directories are files with a special format.
- They contain a series of items mapping file name to inode number.
The group info and i-node number are enough to find the inode.
- Directory format.
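A sketch of how a path is resolved through directories, and how an inode number plus the group info locates the inode. It assumes inode numbers start at 1, the root directory is inode 2, and every group holds the same number of inodes. Directories are modeled as plain dicts; the real on-disk entries are variable-length records. The function names are hypothetical.

```python
def locate_inode(ino, inodes_per_group):
    """Return (block group, index within that group's inode table) for an inode number."""
    return divmod(ino - 1, inodes_per_group)


def lookup(path, directories, root_ino=2):
    """Walk the path one component at a time, mapping each name to an inode number.

    directories maps a directory's inode number to its {name: inode number} entries.
    """
    ino = root_ino
    for name in path.strip("/").split("/"):
        if name:
            ino = directories[ino][name]
    return ino
```

For example, with directories = {2: {"home": 11}, 11: {"notes.txt": 8200}}, looking up "/home/notes.txt" yields inode 8200, which with 8192 inodes per group is entry 7 of group 1.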
- Allocation Policy
- Block groups are designed to reduce fragmentation.
- If possible,
- When a file is created, allocate its i-node in the same group
as the directory block.
- Allocate the first data block in the same group as the i-node.
- Allocate additional data blocks in the same group as the last.
- To preserve some balance,
- Allocate subdirectory i-nodes in a different block group from the parent
directory block. (Choose one with relatively more free i-nodes.)
- When a file grows past a certain threshold, deliberately switch
to another group. (Choose one with relatively more free data blocks.)
- This tries to generally keep seeks inside the same block group, while
recognizing that this is not always possible.
- File system utilization is limited (some space is kept in reserve) so the algorithm can work well.
- Basic block size can vary from 1K to 8K when formatting. 4K seems most
typical.
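The allocation policy above can be sketched as three small choices. This is a simplification under stated assumptions: "relatively more free" is taken to mean "the most free", the threshold is an arbitrary parameter, and the Group class and function names are hypothetical. The real allocator is considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Group:
    free_inodes: int
    free_blocks: int


def group_for_file_inode(groups, parent_group):
    """New file: prefer the parent directory's group if it has a free i-node."""
    if groups[parent_group].free_inodes > 0:
        return parent_group
    return max(range(len(groups)), key=lambda g: groups[g].free_inodes)


def group_for_subdir_inode(groups, parent_group):
    """New subdirectory: pick a different group, the one with the most free i-nodes."""
    others = [g for g in range(len(groups)) if g != parent_group] or [parent_group]
    return max(others, key=lambda g: groups[g].free_inodes)


def group_for_next_block(groups, last_group, blocks_in_group, threshold):
    """Next data block: stay in the last group until the file passes the threshold there."""
    if blocks_in_group < threshold and groups[last_group].free_blocks > 0:
        return last_group
    others = [g for g in range(len(groups)) if g != last_group] or [last_group]
    return max(others, key=lambda g: groups[g].free_blocks)
```

Files stay near their directory, while subdirectories and very large files are spread out so that no single group fills up first.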
- EXT-3: Add journaling.
- Adds the journaling service to EXT-2. Really no other changes.
- Journal is a circular buffer of transactions.
- Journal may exist in a file, a disk, or a portion of a disk, either
on the same device as the file system or another disk.
- Can journal just metadata (file system structural changes), or all
changes to the fs (including data).
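A toy model of the journal as a circular buffer of transactions. It assumes fixed-size slots and treats a transaction as an opaque value; the Journal class and its method names are hypothetical, and real journal records, commit blocks, and the metadata-only versus full-data modes are not modeled.

```python
class Journal:
    """Fixed-capacity circular buffer of committed transactions."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0   # index of the oldest transaction not yet checkpointed
        self.count = 0  # number of transactions currently in the journal

    def commit(self, txn):
        """Append a transaction at the tail, wrapping around the end of the buffer."""
        if self.count == len(self.slots):
            raise RuntimeError("journal full: checkpoint first")
        tail = (self.head + self.count) % len(self.slots)
        self.slots[tail] = txn
        self.count += 1

    def checkpoint(self, apply):
        """Apply the oldest transaction to the main file system and free its slot."""
        if self.count == 0:
            return False
        apply(self.slots[self.head])
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return True

    def replay(self, apply):
        """After a crash: re-apply every journaled transaction, oldest first."""
        for i in range(self.count):
            apply(self.slots[(self.head + i) % len(self.slots)])
```

Once a transaction has been checkpointed its slot is reused, which is why the journal can stay small no matter how long the file system runs.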
- EXT-4: Further extensions. These include:
- Extents: Instead of recording each block, a portion of a file may be
recorded as one contiguous run of blocks (a starting block and a length).
- Reduce table sizes.
- Reduce fragmentation.
- Files can grow by allocating multiple blocks instead of one at a time,
and applications can request a pre-allocation of space.
Both reduce fragmentation.
- Journal is checksummed to detect corruption. (A checksum detects a damaged
journal; it does not repair it.)
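To illustrate why extents reduce table sizes, the sketch below collapses a per-block list into (start, length) pairs. It assumes the list is given in the file's logical order; to_extents is a hypothetical helper, and real EXT-4 extents also record the logical offset and have a maximum length.

```python
def to_extents(blocks):
    """Collapse a list of physical block numbers into (start, length) extents."""
    extents = []
    for b in blocks:
        if extents and extents[-1][0] + extents[-1][1] == b:
            # b directly follows the last extent, so extend it by one block
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # b is not contiguous with the last extent, so start a new one
            extents.append((b, 1))
    return extents
```

A file occupying blocks 100 to 103, 500 to 501, and 900 needs seven per-block entries but only three extents: (100, 4), (500, 2), (900, 1).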