Session 5.4 – File Allocation Methods

Understanding contiguous, linked, and indexed allocation strategies with free-space management

2 Hours Module 5 I/O & File Management

Learning Objectives

By the end of this session, you will be able to:

  • Understand the need for file allocation methods in file systems
  • Compare contiguous, linked, and indexed allocation strategies
  • Analyze the trade-offs between different allocation methods
  • Explain various free-space management techniques
  • Evaluate performance characteristics of allocation methods
  • Understand advanced allocation techniques like extent-based allocation

Introduction to File Allocation

File allocation methods determine how files are stored on disk storage devices. The choice of allocation method affects system performance, storage efficiency, and file access patterns.

Why File Allocation Matters
  • Storage space utilization efficiency
  • File access performance (sequential/random)
  • Fragmentation management
  • Scalability for large files
  • System complexity and overhead
  • Recovery and reliability
Design Considerations
  • Access patterns (sequential vs. random)
  • File size distribution
  • Storage device characteristics
  • Memory overhead for metadata
  • Implementation complexity
  • Portability and standards
Storage Allocation Problem
File System View

Files: A.txt (3 blocks), B.doc (5 blocks), C.jpg (2 blocks)

Physical Disk
0
1
2
3
4
5
6
7
8
9

How should we allocate files to these blocks?

Contiguous Allocation

In contiguous allocation, each file occupies a set of consecutive blocks on the disk. This is the simplest allocation method and provides excellent performance for sequential access.

Contiguous Allocation Mechanism

Directory Entry Structure:
┌─────────────┬─────────────┬─────────────┐
│ File Name   │ Start Block │ Length      │
├─────────────┼─────────────┼─────────────┤
│ file1.txt   │     14      │      6      │
│ file2.doc   │     20      │      4      │
│ file3.jpg   │      8      │      3      │
└─────────────┴─────────────┴─────────────┘

Address Calculation:
For file at start block S, length L:
Block addresses: S, S+1, S+2, ..., S+L-1

Access Time:
Sequential: Optimal (no head movement)
Random: Good (minimal seeking)
Contiguous Allocation Example
Directory Structure
FileStartLength
A.txt03
B.doc35
C.jpg82
Disk Layout
A
0
A
1
A
2
B
3
B
4
B
5
B
6
B
7
C
8
C
9
Advantages
  • Simple Implementation: Easy directory management
  • Excellent Sequential Access: No seeking between blocks
  • Good Random Access: Direct calculation of block address
  • Minimal Storage Overhead: Only start block and length needed
  • High Performance: Optimal for sequential I/O
Disadvantages
  • External Fragmentation: Wasted space between files
  • File Size Limitation: Difficult to extend files
  • Compaction Needed: Expensive reorganization
  • Pre-allocation Required: Need to know maximum file size
  • Fragmentation Over Time: Performance degrades
External Fragmentation Problem
Before File Deletion
A
0
A
1
B
2
B
3
C
4
C
5
C
6
D
7
D
8
D
9
After Deleting Files B and C
A
0
A
1
Free
2
Free
3
Free
4
Free
5
Free
6
D
7
D
8
D
9

5 free blocks, but can't allocate a 6-block file!

Linked Allocation

In linked allocation, each file is a linked list of disk blocks. Blocks can be scattered anywhere on the disk, eliminating external fragmentation but complicating random access.

Linked Allocation Structure

Directory Entry:
┌─────────────┬─────────────────┐
│ File Name   │ First Block Ptr │
├─────────────┼─────────────────┤
│ file1.txt   │        5        │
│ file2.doc   │        2        │
│ file3.jpg   │        9        │
└─────────────┴─────────────────┘

Block Structure:
┌─────────────────┬─────────────────┐
│   Data Portion  │ Next Block Ptr  │
│    (e.g. 508B)  │    (e.g. 4B)    │
└─────────────────┴─────────────────┘

Access Pattern:
Sequential: Follow pointer chain
Random: Must traverse from beginning (O(n))
Linked Allocation Example
Directory Structure
FileFirst Block
A.txt1
B.doc4
C.jpg7
Block Chain Structure
File A.txt: 1 → 3 → 8 → NULL
File B.doc: 4 → 6 → 2 → 9 → 5 → NULL  
File C.jpg: 7 → 0 → NULL
Physical Disk Layout
C
→70
A
→31
B
→92
A
→83
B
→64
B
→∅5
B
→26
C
→07
A
→∅8
B
→59
Advantages
  • No External Fragmentation: Use any available block
  • File Size Flexibility: Grow files dynamically
  • Simple Allocation: Just need one free block
  • No Pre-declaration: Don't need to know file size
  • Efficient Space Use: No wasted space
Disadvantages
  • Poor Random Access: Must traverse links (O(n))
  • Storage Overhead: Pointer in each block
  • Reliability Issues: Lost pointer = lost file
  • Sequential Access Overhead: Extra disk reads
  • Block Size Mismatch: Pointers don't align well

File Allocation Table (FAT)

FAT improves linked allocation by storing all pointers in a separate table, reducing disk seeks and improving reliability.

BlockNext
0EOF
13
29
38
46
5EOF
62
70
8EOF
95
FAT Benefits:
  • Random access improvement
  • Better reliability (table can be cached)
  • No storage overhead in data blocks
  • Easier to implement
FAT Drawbacks:
  • Large table for big disks
  • Table must be in memory
  • Still poor for very large files

Indexed Allocation

Indexed allocation brings all pointers together into an index block. Each file has its own index block, which contains pointers to all the file's data blocks.

Indexed Allocation Structure

Directory Entry:
┌─────────────┬─────────────────┐
│ File Name   │ Index Block Ptr │
├─────────────┼─────────────────┤
│ file1.txt   │        7        │
│ file2.doc   │        3        │
│ file3.jpg   │       12        │
└─────────────┴─────────────────┘

Index Block Structure:
┌─────────────────────────────────┐
│ Pointer 0: Data Block 15        │
│ Pointer 1: Data Block 22        │
│ Pointer 2: Data Block 8         │
│ Pointer 3: Data Block 45        │
│ ...                             │
│ Pointer N: NULL (end of file)   │
└─────────────────────────────────┘

Access:
Direct access to any block via index
Indexed Allocation Example
Directory & Index Blocks
FileIndex Block
A.txt1
B.doc4
C.jpg7
Index Block 1 (A.txt):
[0]: 3
[1]: 8  
[2]: 5
[3]: NULL
Physical Disk Layout
Free
0
Idx
A1
Free
2
A
3
Idx
B4
A
5
Free
6
Idx
C7
A
8
Free
9
Advantages
  • Direct Access: Random access in constant time
  • No External Fragmentation: Blocks anywhere on disk
  • Dynamic File Growth: Add blocks as needed
  • Good for Large Files: Single index block lookup
  • Simple Implementation: Straightforward algorithm
Disadvantages
  • Index Block Overhead: Wasted space for small files
  • File Size Limitation: Limited by index block size
  • Two Disk Accesses: Index + data block
  • Index Block Bottleneck: Must read index first
  • Complex for Very Large Files: Need multi-level indexes

Multi-level Index (Unix Inode)

To support large files, systems use multi-level indexing with direct, indirect, double-indirect, and triple-indirect pointers.

Unix Inode Structure:
Direct pointers (0-11):     12 × 4KB = 48KB
Single indirect (12):       1024 × 4KB = 4MB  
Double indirect (13):       1024² × 4KB = 4GB
Triple indirect (14):       1024³ × 4KB = 4TB

Maximum file size: ~4TB
Small file efficiency: Direct pointers
Large file support: Multi-level indirection
Access Pattern Examples:
  • Block 5: Direct pointer[5] → 1 disk access
  • Block 100: Indirect → data → 2 disk accesses
  • Block 10000: Double indirect → indirect → data → 3 accesses
  • Block 1M: Triple indirect → double → single → data → 4 accesses

Allocation Method Comparison

Performance Characteristics

Method Sequential Access Random Access Space Overhead External Fragmentation File Size Limit
Contiguous Excellent Excellent Minimal High Pre-declare
Linked Good Poor Moderate None Dynamic
Indexed Good Excellent High None Index Limited

Access Time Analysis

Assumptions:
- Seek time: 10ms
- Block read time: 1ms
- Blocks per track: 50

Sequential Access (10 blocks):
Contiguous: 10ms + 10×1ms = 20ms
Linked: 10×(10ms + 1ms) = 110ms
Indexed: (10ms + 1ms) + 10×(10ms + 1ms) = 121ms

Random Access (1 block):
Contiguous: 10ms + 1ms = 11ms
Linked: N/2 × (10ms + 1ms) = 5.5N ms
Indexed: 2×(10ms + 1ms) = 22ms

Space Overhead Analysis

For 1KB blocks, 4-byte pointers:

Contiguous:
Overhead = 8 bytes (start + length)
Percentage = 8/(file_size) → ~0% for large files

Linked:  
Overhead = 4 bytes per block
Percentage = 4/1024 = 0.4% per block

Indexed:
Overhead = 1 index block (1KB)
Percentage = 1024/(file_size) → high for small files

Example (100KB file):
Contiguous: 8 bytes (0.008%)
Linked: 400 bytes (0.4%)  
Indexed: 1024 bytes (1.0%)

Free Space Management

Free space management tracks available disk blocks and efficiently allocates them to files. Different methods offer various trade-offs between space efficiency and access speed.

Free Space Management Methods

1. Bit Vector (Bitmap)
Structure:
Each bit represents one block
0 = free, 1 = allocated

Example (10 blocks):
Blocks: 0 1 2 3 4 5 6 7 8 9
Bitmap: 1 1 0 1 0 0 1 0 1 1
        A A F A F F A F A A

Operations:
Find free: scan for 0 bits
Allocate: set bit to 1
Deallocate: set bit to 0

Space: n/8 bytes for n blocks
Advantages:
  • Simple and efficient
  • Easy to find contiguous blocks
  • Compact representation
  • Fast allocation/deallocation
2. Linked List
Structure:
Link free blocks together
First free block pointer in superblock

Example:
Free list: 2 → 4 → 5 → 7 → NULL
         ↑
    Free pointer

Operations:
Allocate: Remove from head
Deallocate: Add to head

Space: 4 bytes per free block
Advantages:
  • No space wasted when disk full
  • Simple allocation
  • Dynamic structure
Disadvantages:
  • I/O to access list
  • Can't easily find contiguous blocks
3. Grouping
Structure:
Store addresses of free blocks in free blocks
First block contains addresses of n-1 free blocks
Plus address of next group

Example:
Block 2: [4, 5, 7, 10]  (next group at 10)
Block 10: [15, 18, 22, 25] (next group at 25)

Advantages:
- Large number of free blocks found quickly
- Addresses kept in free blocks (no extra space)
- Good for allocation of multiple blocks
4. Counting
Structure:
Store address and count of contiguous free blocks
Take advantage of contiguous allocation/deallocation

Example:
Free blocks: (2,3), (7,1), (15,5), (25,2)
Meaning: 3 blocks starting at 2
         1 block at 7  
         5 blocks starting at 15
         2 blocks starting at 25

Advantages:
- Efficient for contiguous allocation
- Compact when free space is contiguous
- Easy to find large contiguous areas
Free Space Method Comparison
Method Space Efficiency Allocation Speed Contiguous Blocks Best Use Case
Bitmap Excellent Fast Easy to find General purpose
Linked List Good when full Very fast Difficult Simple systems
Grouping Good Fast for multiple Moderate Batch allocation
Counting Excellent Fast for contiguous Excellent Contiguous allocation

Advanced Allocation Techniques

Extent-Based Allocation

Modern file systems use extents - contiguous ranges of blocks - to combine benefits of contiguous and indexed allocation.

Extent Structure:
┌─────────────┬─────────────┐
│ Start Block │   Length    │
├─────────────┼─────────────┤
│     100     │     50      │  (blocks 100-149)
│     200     │     25      │  (blocks 200-224)  
│     300     │    100      │  (blocks 300-399)
└─────────────┴─────────────┘

Advantages:
- Fewer metadata entries
- Good sequential performance  
- Reduced fragmentation
- Efficient for large files

Used in: ext4, XFS, NTFS, HFS+

Copy-on-Write (COW)

COW file systems delay writing data until necessary, enabling efficient snapshots and versioning.

COW Process:
1. Read: Access existing data normally
2. Write: Don't overwrite, allocate new block
3. Update: Metadata points to new block
4. Cleanup: Old block marked free (eventually)

Benefits:
- Efficient snapshots
- Better reliability (atomic updates)
- Reduced write amplification
- Easy rollback

Used in: Btrfs, ZFS, APFS

Log-Structured File Systems

Treat disk as circular log, always writing sequentially to avoid seek overhead.

Write Process:
1. Buffer all writes in memory
2. Write entire segment to disk sequentially  
3. Update inode map with new locations
4. Clean old segments when space needed

Advantages:
- Excellent write performance
- Good for write-intensive workloads
- Simplified crash recovery

Challenges:
- Garbage collection complexity
- Read performance can suffer
- Free space fragmentation

Allocation Policies

Smart allocation policies improve performance by considering access patterns and disk geometry.

Common Policies:

1. Block Group/Cylinder Groups:
- Keep related files close together
- Reduce seek times

2. Preallocation:
- Allocate extra blocks for growing files
- Reduce fragmentation

3. Delayed Allocation:
- Defer block allocation until write
- Better allocation decisions

4. Locality Heuristics:
- Place file near its directory
- Cluster small files together

Practical Examples

File Allocation Simulation

Scenario: Allocating three files on a 20-block disk
  • File A: 4 blocks
  • File B: 3 blocks
  • File C: 5 blocks
1. Contiguous Allocation
FileStartLength
A04
B43
C75
A
A
A
A
B
B
B
C
C
C
C
C
F
F
F
F
F
F
F
F

✓ Excellent sequential access

✗ Cannot extend files easily

2. Linked Allocation
FileFirst Block
A0
B5
C2
A: 0→7→15→19→NULL
B: 5→12→9→NULL  
C: 2→6→11→14→1→NULL

✓ No external fragmentation

✓ Dynamic file growth

✗ Poor random access

3. Indexed Allocation
FileIndex Block
A0
B5
C10
Index 0: [7,15,19,1]
Index 5: [12,9,14]  
Index 10: [2,6,11,16,18]

✓ Excellent random access

⚠ Higher overhead for small files

Performance Analysis
Operation Contiguous Linked Indexed
Read File A sequentially 1 seek + 4 reads 4 seeks + 4 reads 2 seeks + 5 reads
Read block 2 of File C 1 seek + 1 read 3 seeks + 3 reads 2 seeks + 2 reads
Extend File B by 2 blocks Difficult (may need move) Easy (allocate + link) Easy (update index)

Free Space Management Example

Scenario: 32-block disk with blocks 5, 6, 7, 12, 13, 17, 18, 19, 25, 26 free

Bitmap Representation
Blocks 0-31:
11110011 11001110 00111101 10011111

Interpretation:
0: allocated, 1: free
Free blocks: 5,6,7,12,13,17,18,19,25,26

Operations:
Allocate block: Find first 0, set to 1
Find 3 contiguous: Scan for "000" pattern
→ Blocks 5,6,7 available

Memory usage: 32 bits = 4 bytes
Counting Method
Free block groups:
(5, 3)   - 3 blocks starting at 5
(12, 2)  - 2 blocks starting at 12  
(17, 3)  - 3 blocks starting at 17
(25, 2)  - 2 blocks starting at 25

Operations:
Allocate 3 blocks: Use (5,3) group
Allocate 1 block: Split (12,2) → (12,1) + (13,1)

Memory usage: 4 entries × 8 bytes = 32 bytes
Better when free space is contiguous

Session Summary

Key Concepts
  • Contiguous Allocation: Sequential blocks, excellent performance, fragmentation issues
  • Linked Allocation: Scattered blocks with pointers, no fragmentation, poor random access
  • Indexed Allocation: Index blocks with pointers, good random access, overhead for small files
  • Free Space Management: Bitmaps, linked lists, grouping, counting methods
  • Advanced Techniques: Extents, COW, log-structured file systems
Design Trade-offs
  • Performance vs Fragmentation: Contiguous fast but fragments
  • Simplicity vs Flexibility: Linked simple but limited
  • Space vs Speed: Indexed uses more space for speed
  • Small vs Large Files: Different methods suit different sizes
  • Sequential vs Random: Access patterns affect choice
Module 5 Complete!

This concludes Module 5 on I/O and File Management. We've covered virtual memory fundamentals, page replacement policies, file system concepts, and file allocation methods. Next: Module 6 will explore Storage Management and Security, including secondary storage structure, I/O system management, and protection mechanisms.