12 July 2009

File Systems in Linux

I see that not many people really know the differences between the Linux file systems. There is a lot more to it than generalizing about how each file system is used: they serve different purposes and have different features that you might or might not want. The information below is freely available on the World Wide Web and is nothing special, meaning anyone could find it, but I thought a good short article on the subject would do no harm. This article covers EXT2, EXT3, XFS and ReiserFS, so there will be no talk of UNIX file systems like UFS or any other OS-specific file system; please don't bring discussions of other operating systems' file systems over here.
There will be no historical information here whatsoever. The main purpose of this article is to provide technical yet simple information about these file systems, to help you choose one over another or, better, mix them to maximize efficiency.


EXT2:
File allocation: bitmap (free space), table (metadata)
Bad blocks: table
Max file size: 16 GiB - 2 TiB (depending on block size)
Max number of files: 10^8
Max filename length: 255 bytes
Max volume size: 16-32 TiB
Allowed characters in filenames: Any byte except 'NUL' and 0x2F
Dates recorded: modification (mtime), attribute modification (ctime), access (atime)
Date range: December 14, 1901 - January 18, 2038
Date resolution: 1s
File system permission: POSIX
Transparent compression: NO (available through patches)
Transparent encryption: NO
Supported OS's: Linux, BSD, Windows (through an IFS), Mac OS X

EXT3:
Directory contents: table, H tree with dir_index* enabled
File allocation: bitmap (free space), table (metadata)
Bad blocks: table
Max file size: 16 GiB - 2 TiB
Max number of files: Variable*
Max filename length: 255 bytes
Max volume size: 2-32 TiB
Allowed characters in filenames: Any byte except 'NUL' and 0x2F
Dates recorded: modification (mtime), attribute modification (ctime), access (atime)
Date range: December 14, 1901 - January 18, 2038
Date resolution: 1s, Nanosecond (using undocumented big i-node)
Attributes: No-atime, append-only, synchronous-write, no-dump, h-tree (directory), immutable, journal, secure-delete, top (directory), allow-undelete
File system permission: Unix permissions, ACLs and arbitrary security attributes (Linux 2.6 and later)
Transparent compression: NO
Transparent encryption: NO (provided at the block device level)
Supported OS's: Linux, BSD, Windows (through an IFS)

ReiserFS:
Directory contents: B tree
File allocation: bitmap
Max file size: 8 TiB
Max number of files: 2^32
Max filename length: 4032 bytes, limited to 255 by Linux VFS (Virtual File System)
Max volume size: 16 TiB
Allowed characters in filenames: Any byte except 'NUL' and 0x2F
Dates recorded: modification (mtime), attribute modification (ctime), access (atime)
Date range: December 14, 1901 - January 18, 2038
Date resolution: 1s
Forks: Extended attributes
File system permission: Unix permissions, ACLs and arbitrary security attributes
Transparent compression: NO
Transparent encryption: NO
Supported OS's: Linux

XFS:
Directory contents: B tree
File allocation: B tree
Max file size: 8 EiB minus one byte (on 64-bit systems), 16 TiB (on 32-bit systems)
Max filename length: 255 bytes
Max volume size: 16 EiB
Allowed characters in filenames: Any byte except 'NUL' and 0x2F
Dates recorded: modification (mtime), attribute modification (ctime), access (atime)
Date resolution: 1ns
Attributes: YES
File system permission: YES
Transparent compression: NO
Transparent encryption: NO (provided at the block device level)
Supported OS's: IRIX, Linux, FreeBSD (experimental)
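
The 1901 - 2038 date range listed for EXT2, EXT3 and ReiserFS goes together with the 1s resolution: it is what a signed 32-bit count of seconds since 1 January 1970 can hold. The short Python sketch below only illustrates that arithmetic; the function name timestamp_range is made up for this example and is not part of any file system tool. In UTC the exact limits fall late on 13 December 1901 and early on 19 January 2038, so the dates in the lists are the first and last full days covered.

```python
from datetime import datetime, timedelta, timezone

# Unix timestamps count seconds from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_range(bits: int = 32):
    """Return the earliest and latest UTC datetimes that a signed
    counter of seconds with the given width can represent."""
    lowest = -(2 ** (bits - 1))      # most negative value
    highest = 2 ** (bits - 1) - 1    # most positive value
    return (EPOCH + timedelta(seconds=lowest),
            EPOCH + timedelta(seconds=highest))


earliest, latest = timestamp_range()
print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00
print(latest.isoformat())    # 2038-01-19T03:14:07+00:00
```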

As we see here, the file systems use different directory structures, namely B trees and H trees. The B tree data structure keeps data sorted and allows searches, insertions and deletions in logarithmic time; it is commonly used in databases and file systems. The H tree used by EXT3 (HTree) is a specialized variant of the B tree that indexes directory entries by a hash of the file name, which speeds up lookups in large directories. It should not be confused with the H tree fractal that is used in VLSI as a clock distribution network.
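
To make the idea of an indexed directory more concrete, here is a toy Python sketch of looking a name up through a sorted hash index instead of scanning every entry. It is an illustration only and not the on-disk format of any file system: the class HashedDirectory is hypothetical, zlib.crc32 is just a stand-in for the file system's own hash function, and a sorted Python list replaces the real tree nodes. Only the lookup is logarithmic here; inserting into a plain list is not.

```python
import bisect
import zlib


class HashedDirectory:
    """Toy directory index: entries are kept sorted by a hash of the name."""

    def __init__(self):
        self._entries = []  # sorted list of (hash, name, inode) tuples

    @staticmethod
    def _hash(name: bytes) -> int:
        # Stand-in hash; real file systems use their own hash functions.
        return zlib.crc32(name)

    def add(self, name: bytes, inode: int) -> None:
        # Keep the list sorted so lookups can use binary search.
        bisect.insort(self._entries, (self._hash(name), name, inode))

    def lookup(self, name: bytes):
        """Return the inode number for name, or None if it is absent."""
        h = self._hash(name)
        i = bisect.bisect_left(self._entries, (h, name))
        # Step over other names that happen to share the same hash.
        while i < len(self._entries) and self._entries[i][0] == h:
            if self._entries[i][1] == name:
                return self._entries[i][2]
            i += 1
        return None


directory = HashedDirectory()
for number in range(10000):
    directory.add(f"file{number}.txt".encode(), 1000 + number)

print(directory.lookup(b"file1234.txt"))
print(directory.lookup(b"missing.txt"))
```

With 10000 entries the binary search needs around 14 comparisons, where a linear scan of an unindexed directory would need up to 10000.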

* dir_index is an option that enables hashed indexing of directories. It is turned on with the command tune2fs -O dir_index /dev/hdXXX; directories that already exist are only indexed after running e2fsck -D on the unmounted file system.
* If V is the volume size in bytes, the default number of inodes is V/(2^13) or the number of blocks, whichever is less, and the minimum is V/(2^23). The maximum number of subdirectories in one directory is fixed at 32000.
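
As a worked example of the inode rule above, the following Python sketch simply evaluates the two formulas for a given volume. The function names and the 4096-byte default block size are assumptions made for this example; the values mkfs actually picks on a given system may differ with its options.

```python
def default_inode_count(volume_bytes: int, block_size: int = 4096) -> int:
    """Default inode count: V / 2^13 or the number of blocks, whichever is less."""
    blocks = volume_bytes // block_size
    return min(volume_bytes // 2**13, blocks)


def minimum_inode_count(volume_bytes: int) -> int:
    """Minimum inode count: V / 2^23."""
    return volume_bytes // 2**23


GIB = 2**30
volume = 100 * GIB  # a 100 GiB volume

print(default_inode_count(volume))  # 13107200
print(minimum_inode_count(volume))  # 12800
```

Since 2^13 is 8192, the "number of blocks" limit only takes over when the block size is larger than 8 KiB; with smaller blocks the V/(2^13) term is always the lesser one.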

I think the information above is self-explanatory, so I have no further comments on it, except to draw your attention to the difference between 'NUL' and 'NULL'. They are in fact different things: 'NUL' is the character with byte value 0 that terminates strings in C, and it is the one that is forbidden in file names, while 'NULL' is a null pointer, meaning nothing at all.
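
The file name rules from the lists above (at most 255 bytes, no 'NUL', no 0x2F, which is the '/' path separator) can be sketched as a small Python check. The helper is_valid_filename is hypothetical and only mirrors the limits quoted in this article; it ignores special names such as "." and "..". Note that the limit is in bytes, not characters, so names with multi-byte UTF-8 characters reach it sooner.

```python
MAX_NAME_BYTES = 255  # limit quoted above for EXT2, EXT3, ReiserFS (via VFS) and XFS


def is_valid_filename(name: bytes) -> bool:
    """Check a single path component against the limits listed above."""
    if not name or len(name) > MAX_NAME_BYTES:
        return False
    # NUL (0x00) terminates C strings; 0x2F ('/') separates path components.
    return b"\x00" not in name and b"/" not in name


print(is_valid_filename(b"report.txt"))            # True
print(is_valid_filename(b"bad\x00name"))           # False
print(is_valid_filename(b"dir/file"))              # False
print(is_valid_filename(("é" * 128).encode()))     # False: 256 bytes in UTF-8
```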