I/O system objects, including driver and device objects. Internally, the Windows I/O system
operates asynchronously to achieve high performance and provides both synchronous and
asynchronous I/O capabilities to user-mode applications.
Device drivers include not only traditional hardware device drivers but also file system, network,
and layered filter drivers. All drivers have a common structure and communicate with one another
and the I/O manager by using common mechanisms. The I/O system interfaces allow drivers to be
written in a high-level language to lessen development time and to enhance their portability.
Because drivers present a common structure to the operating system, they can be layered one on
top of another to achieve modularity and reduce duplication between drivers. Also, all Windows
device drivers should be designed to work correctly on multiprocessor systems.
Finally, the role of the PnP manager is to work with device drivers to dynamically detect hardware
devices and to build an internal device tree that guides hardware device enumeration and driver
installation. The power manager works with device drivers to move devices into low-power states
when applicable to conserve energy and prolong battery life.
Four upcoming chapters cover additional topics related to the I/O system:
storage management, file systems (including details on the NTFS file system), the cache manager,
and networking.
8. Storage Management
Storage management defines the way that an operating system interfaces with nonvolatile storage
devices and media. The term storage encompasses many different devices, including tape drives,
optical media, USB flash drives, floppy disks, hard disks, network storage such as iSCSI, and
storage area networks (SANs). Windows provides specialized support for each of these classes of
storage media. Because our focus in this book is on the kernel components of Windows, in this
chapter we’ll concentrate on just the fundamentals of the hard-disk storage subsystem in Windows,
which includes support for external disks and flash drives. Significant portions of the support
Windows provides for removable media and remote storage (offline archiving) are implemented in
user mode.
In this chapter, we’ll examine how kernel-mode device drivers interface file system drivers to disk
it is involved with storage management because it includes support for accessing disk devices
before the Windows I/O system is operational. Winload resides on the boot volume; the
boot-sector code on the system volume executes Bootmgr. Bootmgr reads the BCD from the
system volume or EFI firmware and presents the computer’s boot choices to the user. Bootmgr
translates the name of the BCD boot entry that a user selects to the appropriate boot partition and
then runs Winload to load the Windows system files (starting with the registry, Ntoskrnl.exe and
its dependencies, and the boot drivers) into memory to continue the boot process. In all cases,
Winload uses the computer firmware to read the disk containing the system volume.
8.2.2 Disk Class, Port, and Miniport Drivers
During initialization, the Windows I/O manager starts the disk storage drivers. Storage drivers in
Windows follow a class/port/miniport architecture, in which Microsoft supplies a storage class
driver that implements functionality common to all storage devices and a storage port driver that
implements functionality common to a particular bus—such as a Small Computer System
Interface (SCSI) bus or an Integrated Device Electronics (IDE) system—and OEMs supply
miniport drivers that plug into the port driver to interface Windows to a particular controller
implementation.
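The layering described above can be pictured with a small model. This is only a sketch: the class and method names (DiskClassDriver, PortDriver, Miniport, start_io, submit) are hypothetical and are not the Windows driver interfaces. It shows only how device-type logic, bus logic, and controller logic are separated.

```python
class Miniport:
    """Controller-specific logic, as an OEM would supply it (hypothetical interface)."""

    def __init__(self, controller: str) -> None:
        self.controller = controller

    def start_io(self, command: str, lba: int, count: int) -> str:
        # A real miniport would program the controller hardware here.
        return f"{self.controller}: {command} lba={lba} count={count}"


class PortDriver:
    """Bus-specific logic shared by every miniport that plugs into it."""

    def __init__(self, bus: str, miniport: Miniport) -> None:
        self.bus = bus
        self.miniport = miniport

    def submit(self, command: str, lba: int, count: int) -> str:
        # The port driver is the only layer that calls into the miniport.
        return f"[{self.bus}] " + self.miniport.start_io(command, lba, count)


class DiskClassDriver:
    """Logic common to all disks, independent of bus and controller."""

    def __init__(self, port: PortDriver) -> None:
        self.port = port

    def read(self, lba: int, count: int) -> str:
        # Validation that applies to every disk lives in the class layer.
        if lba < 0 or count <= 0:
            raise ValueError("invalid read request")
        return self.port.submit("READ", lba, count)


disk = DiskClassDriver(PortDriver("SCSI", Miniport("VendorHBA")))
print(disk.read(lba=2048, count=8))
```

Swapping in a different Miniport changes only the controller-specific layer, which is the benefit the architecture is meant to provide.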
In the disk storage driver architecture, only class drivers conform to the standard Windows device
driver interfaces. Miniport drivers use a port driver interface instead of the device driver interface,
and the port driver simply implements a collection of device driver support routines that interface
miniport drivers to Windows. This approach simplifies the role of miniport driver developers and,
because Microsoft supplies operating system–specific port drivers, allows driver developers to
focus on hardware-specific driver logic. Windows includes Disk (\Windows\System32\Drivers
\Disk.sys), a class driver that implements functionality common to disks. Windows also provides a
handful of disk port drivers. For example, Scsiport.sys is the legacy port driver for disks on SCSI
buses, and Ataport.sys is a port driver for IDE-based systems. Most newer drivers use the
Storport.sys port driver as a replacement for Scsiport.sys. Storport.sys is designed to realize the
high performance capabilities of hardware RAID and Fibre Channel adapters. The Storport model
is similar to Scsiport, making it easy for vendors to migrate existing Scsiport miniport drivers to
Vista Ultimate, as well as on Windows Server 2008.
The Microsoft iSCSI Software Initiator includes several components:
■ Initiator This optional component, which consists of the Storport port driver and the iSCSI
miniport driver (\Windows\System32\Drivers\Msiscsi.sys), uses the TCP/IP driver to implement
software iSCSI over standard Ethernet adapters and TCP/IP offloaded network adapters.
■ Initiator service This service, implemented in \Windows\System32\Iscsiexe.exe, manages the
discovery and security of all iSCSI initiators as well as session initiation and termination. iSCSI
device discovery functionality is implemented in \Windows\System32\Iscsium.dll and conforms to
the Internet Storage Name Service (iSNS) protocol.
■ Management applications These include Iscsicli.exe, a command-line tool for managing iSCSI
device connections and security, and the corresponding Control Panel application.
Some vendors produce iSCSI adapters that offload the iSCSI protocol to hardware. The Initiator
service works with these adapters, which must support iSNS, so that all iSCSI devices, including
those discovered by the Initiator service and those discovered by iSCSI hardware, are recognized
and managed through standard Windows interfaces.
Multipath I/O (MPIO) Drivers
Most disk devices have one path—or series of adapters, cables, and switches—between them and
a computer. Servers requiring high levels of availability use multipathing solutions, where more
than one set of connection hardware exists between the computer and a disk so that if a path fails
the system can still access the disk via an alternate path. Without support from the operating
system or disk drivers, however, a disk with two paths, for example, appears as two different disks.
Windows includes multipath I/O support to manage multipath disks as a single disk. This support
relies on built-in or third-party drivers called device-specific modules (DSMs) to manage details
of the path management—for example, load balancing policies that choose which path to use for
routing requests and error detection mechanisms to inform Windows when a path fails. MPIO
support is available for Windows Server 2008 in the form of the Microsoft MPIO Driver
Development Kit, which hardware and software vendors can license.
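As a rough illustration of the two DSM responsibilities named above (a load-balancing policy and reaction to path failure), the following sketch models one disk reachable over several paths. The MultipathDisk class, its method names, and the round-robin policy are assumptions chosen for the example. They do not represent the MPIO or DSM interfaces.

```python
class MultipathDisk:
    """One logical disk reachable over several paths (illustrative model only)."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("a disk needs at least one path")
        self.paths = list(paths)
        self.failed = set()
        self._next = 0  # round-robin counter

    def mark_failed(self, path):
        # Error detection would call this when a path stops responding.
        self.failed.add(path)

    def healthy_paths(self):
        return [p for p in self.paths if p not in self.failed]

    def choose_path(self):
        # Round-robin over the paths that are still healthy.
        healthy = self.healthy_paths()
        if not healthy:
            raise RuntimeError("no remaining path to the disk")
        path = healthy[self._next % len(healthy)]
        self._next += 1
        return path


disk = MultipathDisk(["HBA0", "HBA1"])
first, second = disk.choose_path(), disk.choose_path()
disk.mark_failed("HBA0")
after_failure = disk.choose_path()
print(first, second, after_failure)
```

The caller always sees a single disk object. Only the path chosen for each request changes, which mirrors how multipath support presents a two-path disk as one disk.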
In a Windows MPIO storage stack, shown in Figure 8-2, the disk driver includes functionality for
the link \Device\Harddisk0\Partition0 to refer to \Device\Harddisk0\DR0, and \Device\Harddisk0
\Partition1 to refer to the first partition device object of the first disk. For backward compatibility
with applications that expect legacy names, the disk class driver also creates the same symbolic
links in Windows that represent physical drives that it would have created on Windows NT 4
systems. Thus, for example, the link \GLOBAL??\PhysicalDrive0 references \Device\Harddisk0
\DR0. Figure 8-3 shows the WinObj utility from Sysinternals displaying the contents of a
Harddisk directory for a basic disk. You can see the physical disk and partition device objects in
the pane at the right.
As you saw in Chapter 3, the Windows API is unaware of the Windows object manager
namespace. Windows reserves two groups of namespace subdirectories to use, one of which is the
\Global?? subdirectory. (The other group is the collection of per-session \BaseNamedObjects
subdirectories, which are covered in Chapter 3.) In this subdirectory, Windows makes available
device objects that Windows applications interact with—including COM and parallel ports—as
well as disks. Because disk objects actually reside in other subdirectories, Windows uses symbolic
links to connect names under \Global?? with objects located elsewhere in the namespace. For each
physical disk on a system, the I/O manager creates a \Global??\PhysicalDriveX link that points to
\Device\HarddiskX\DRX. (Numbers, starting from 0, replace X.) Windows applications that
directly interact with the sectors on a disk open the disk by calling the Windows CreateFile
function and specifying the name \\.\PhysicalDriveX (in which X is the disk number) as a
parameter. The Windows application layer converts the name to \Global??\PhysicalDriveX before
handing the name to the Windows object manager.
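The two-step name translation just described can be traced with plain string handling. The sketch below follows the names given in the text. The function names are hypothetical helpers and do not open any device. They only show how \\.\PhysicalDriveX maps to \Global??\PhysicalDriveX and then to \Device\HarddiskX\DRX.

```python
import re


def win32_to_object_manager_name(win32_name: str) -> str:
    r"""Map a Win32 device name such as \\.\PhysicalDrive0 to \Global??\PhysicalDrive0."""
    prefix = "\\\\.\\"  # the four characters \\.\
    if not win32_name.startswith(prefix):
        raise ValueError("not a Win32 device name: " + win32_name)
    return "\\Global??\\" + win32_name[len(prefix):]


def physical_drive_link_target(link_name: str) -> str:
    r"""Return the device object that a \Global??\PhysicalDriveX link points to."""
    match = re.fullmatch(r"\\Global\?\?\\PhysicalDrive(\d+)", link_name,
                         flags=re.IGNORECASE)
    if match is None:
        raise ValueError("not a PhysicalDrive link: " + link_name)
    number = int(match.group(1))
    return f"\\Device\\Harddisk{number}\\DR{number}"


link = win32_to_object_manager_name("\\\\.\\PhysicalDrive0")
print(link)
print(physical_drive_link_target(link))
```

On a real system the resolution is performed by the object manager following symbolic links. The helper only reproduces the naming pattern.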
8.2.4 Partition Manager
The partition manager, \Windows\System32\Drivers\Partmgr.sys, is responsible for discovering,
creating, deleting, and managing partitions. To become aware of partitions, the partition manager
acts as the function driver for disk device objects created by disk class drivers. The partition
manager uses the I/O manager’s IoReadPartitionTableEx function to identify partitions and create
device objects that represent them. As miniport drivers present the disks that they identify early in
8.3 Volume Management
Windows has the concept of basic and dynamic disks. Windows calls disks that rely exclusively
on the MBR-style or GPT partitioning scheme basic disks. Dynamic disks implement a more
flexible partitioning scheme than that of basic disks. The fundamental difference between basic
and dynamic disks is that dynamic disks support the creation of new multipartition volumes.
Recall from the list of terms earlier in the chapter that multipartition volumes provide performance,
sizing, and reliability features not supported by simple volumes. Windows manages all disks as
basic disks unless you manually create dynamic disks or convert existing basic disks (with enough
free space) to dynamic disks. Microsoft recommends that you use basic disks unless you require
the multipartition functionality of dynamic disks.
Note Windows does not support multipartition volumes on basic disks. For a number of reasons,
including the fact that laptops usually have only one disk and laptop disks typically don’t move
easily between computers, Windows uses only basic disks on laptops. In addition, only fixed disks
can be dynamic, and disks located on IEEE 1394 or USB buses or on shared cluster server disks
are always basic disks (or fixed dynamic disks).
8.3.1 Basic Disks
This section describes the two types of partitioning, MBR-style and GPT, that Windows uses to
define volumes on basic disks, and the volume manager driver that presents the volumes to file
system drivers. Windows silently defaults to defining all disks as basic disks.
MBR-Style Partitioning
The standard BIOS implementations that BIOS-based (non-EFI) x86 hardware uses dictate one
requirement of the partitioning format in Windows—that the first sector of the primary disk
contains the Master Boot Record (MBR). When a BIOS-based x86 system boots, the computer’s
BIOS reads the MBR and treats part of the MBR’s contents as executable code. The BIOS invokes
the MBR code to initiate an operating system boot process after the BIOS performs preliminary
configuration of the computer’s hardware. In Microsoft operating systems such as Windows, the
MBR also contains a partition table. A partition table consists of four entries that define the
locations of as many as four primary partitions on a disk. The partition table also records a
partition’s type. Numerous predefined partition types exist, and a partition’s type specifies which
of a GPT disk is an MBR that serves to protect the GPT partitioning in case the disk is accessed
from a non-GPT aware operating system. However, the second and last sectors of the disk store
the GPT headers with the actual partition table following the second sector and preceding the last
sector. With its extensible list of partitions, GPT partitioning doesn’t require nested partitions, as
MBR partitions do.
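The sector placement described above can be written out as arithmetic. This is a sketch under two assumptions that the text does not state: 512-byte sectors, and a partition entry array of 128 entries of 128 bytes each (32 sectors), which is a common layout. The names GptLayout and gpt_layout are made up for the example.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512                # assumption: 512-byte logical sectors
ENTRY_ARRAY_BYTES = 128 * 128    # assumption: 128 entries of 128 bytes each


@dataclass(frozen=True)
class GptLayout:
    protective_mbr: int      # first sector of the disk
    primary_header: int      # second sector
    primary_entries: range   # partition table following the second sector
    backup_entries: range    # partition table preceding the last sector
    backup_header: int       # last sector
    first_usable: int
    last_usable: int


def gpt_layout(total_sectors: int) -> GptLayout:
    entry_sectors = ENTRY_ARRAY_BYTES // SECTOR_SIZE
    # Need room for MBR + header + entries at the front, entries + header at the
    # back, and at least one usable sector in between.
    if total_sectors < 2 * (entry_sectors + 2):
        raise ValueError("disk too small for this GPT layout")
    last = total_sectors - 1
    primary_entries = range(2, 2 + entry_sectors)
    backup_entries = range(last - entry_sectors, last)
    return GptLayout(
        protective_mbr=0,
        primary_header=1,
        primary_entries=primary_entries,
        backup_entries=backup_entries,
        backup_header=last,
        first_usable=primary_entries.stop,
        last_usable=backup_entries.start - 1,
    )


layout = gpt_layout(0x800000)  # a 4-GB disk with 512-byte sectors
print(layout.first_usable, layout.last_usable, layout.backup_header)
```

Because the partition list is just an array of entries between the headers, adding partitions means filling more entries rather than nesting partitions inside one another.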
Note Because Windows doesn’t support the creation of multipartition volumes on basic disks, a
new basic disk partition is the equivalent of a volume. For this reason, the Disk Management
MMC snap-in uses the term partition when you create a volume on a basic disk.
Basic Disk Volume Manager
The volume manager driver (\Windows\System32\Drivers\Volmgr.sys) creates disk device objects
that represent volumes on basic disks and plays an integral role in managing all basic disk
volumes, including simple volumes. For each volume, the volume manager creates a device object
of the form \Device\HarddiskVolumeX, in which X is a number (starting from 1) that identifies
the volume.
The volume manager is actually a bus driver because it’s responsible for enumerating basic disks
to detect the presence of basic volumes and report them to the Windows Plug and Play (PnP)
manager. To implement this enumeration, the volume manager leverages the PnP manager, with
the aid of the partition manager (Partmgr.sys) driver to determine what basic disk partitions exist.
The partition manager registers with the PnP manager so that Windows can inform the partition
manager whenever the disk class driver creates a partition device object.
The partition manager informs the volume manager about new partition objects through a private
interface and creates filter device objects that the partition manager then attaches to the partition
objects. The existence of the filter objects prompts Windows to inform the partition manager
whenever a partition device object is deleted so that the partition manager can update the volume
manager. The disk class driver deletes a partition device object when a partition in the Disk
Management MMC snap-in is deleted. As the volume manager becomes aware of partitions, it
uses the basic disk configuration information to determine the correspondence of partitions to
LDM requires three entries to describe a simple volume: a partition, component, and volume entry.
The following listing shows the contents of a simple LDM database that defines one 200-MB
volume that consists of one partition:
The partition entry describes the area on a disk that the system assigned to the volume, the
VBLK DATABASE:
0x000004: [000001]
  Name      : WIN-SL5V78KD01W-Dg0
  Object Id : 0x0001
  GUID      : b5f4a7fd-758d-11dd-b7f0-000c297f010
0x000006: [000003]
  Name      : Disk1
  Object Id : 0x0002
  Disk Id   : b5f4a7fe-758d-11dd-b7f0-000c297f010
LDM and GPT or MBR-Style Partitioning
When you install Windows on a computer, one of the first things it requires you to do is to create a
partition on the system’s primary physical disk. Windows defines the system volume on this
partition to store the files that it invokes early in the boot process. In addition, Windows Setup
requires you to create a partition that serves as the home for the boot volume, onto which the setup
program installs the Windows system files and creates the system directory (\Windows). The
system and boot volumes can be the same volume, in which case you don’t have to create a new
partition for the boot volume. The nomenclature that Microsoft defines for system and boot
volumes is somewhat confusing. The system volume is where Windows places boot files,
including the boot loader (Winload) and Boot Manager (Bootmgr), and the boot volume is where
Windows stores operating system files such as Ntoskrnl.exe, the core kernel file.
Although the partitioning data of a dynamic disk resides in the LDM database, LDM implements
convert it to a dynamic disk. The LDM database consists of four regions, which Figure 8-5 shows:
a header sector that LDM calls the Private Header, a table of contents area, a database records area,
and a transactional log area. (The fifth region shown in Figure 8-5 is simply a copy of the Private
Header.) The Private Header sector resides 1 MB before the end of a dynamic disk and anchors
the database. As you spend time with Windows, you’ll quickly notice that it uses GUIDs to
identify just about everything, and disks are no exception. A GUID (globally unique identifier) is
a 128-bit value that various components in Windows use to uniquely identify objects. LDM
assigns each dynamic disk a GUID, and the Private Header sector notes the GUID of the dynamic
disk on which it resides—hence the Private Header’s designation as information that is private to
the disk. The Private Header also stores the name of the disk group, which is the name of the
computer concatenated with Dg0 (for example, Daryl-Dg0 if the computer’s name is Daryl), and a
pointer to the beginning of the database table of contents. For reliability, LDM keeps a copy of the
Private Header in the disk’s last sector.
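The placement of the Private Header and its copy reduces to simple sector arithmetic. The sketch below assumes 512-byte sectors, so that 1 MB corresponds to 0x800 sectors. The function name and the dictionary keys are illustrative only and are not LDM terminology.

```python
SECTOR_SIZE = 512          # assumption: 512-byte sectors
MIB = 1024 * 1024


def ldm_database_location(total_sectors: int, sector_size: int = SECTOR_SIZE) -> dict:
    """Return sector numbers for the Private Header and its copy, per the text."""
    reserved = MIB // sector_size  # sectors in the final 1 MB of the disk
    if total_sectors <= reserved:
        raise ValueError("disk is smaller than the 1-MB database area")
    return {
        "private_header": total_sectors - reserved,  # 1 MB before the end of the disk
        "private_header_copy": total_sectors - 1,    # the disk's last sector
        "reserved_sectors": reserved,
    }


location = ldm_database_location(0x800000)  # a 4-GB disk
print({name: hex(value) for name, value in location.items()})
```

For a disk of 0x800000 sectors this places the start of the database area at sector 0x7FF800, with 0x800 sectors reserved.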
The database table of contents is 16 sectors in size and contains information regarding the
database’s layout. LDM begins the database record area immediately following the table of
contents with a sector that serves as the database record header. This sector stores information
about the database record area, including the number of records it contains, the name and GUID of
the disk group the database relates to, and a sequence number identifier that LDM uses for the
next entry it creates in the database. Sectors following the database record header contain 128-byte
fixed-size records that store entries that describe the disk group’s partitions and volumes.
A database entry can be one of four types: partition, disk, component, and volume. LDM uses the
database entry types to identify three levels that describe volumes. LDM connects entries with
internal object identifiers. At the lowest level, partition entries describe soft partitions, which are
contiguous regions on a disk; identifiers stored in a partition entry link the entry to a component
and disk entry. A disk entry represents a dynamic disk that is part of the disk group and includes
the disk’s GUID. A component entry serves as a connector between one or more partition entries
and the volume entry each partition is associated with. A volume entry stores the GUID of the
volume, the volume’s total size and state, and a drive-letter hint. Disk entries that are larger than a
database record span multiple records; partition, component, and volume entries rarely span
multiple records.
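The relationships among the four entry types can be modeled directly. The following is a minimal sketch, not the on-disk record format: the field names, object identifiers, and sizes are invented for illustration, and only the linking scheme (a partition refers to a component and a disk, and a component refers to a volume) comes from the description above.

```python
from dataclasses import dataclass
from typing import List

RECORD_SIZE = 128   # fixed record size given in the text
SECTOR_SIZE = 512   # assumption: 512-byte sectors


@dataclass
class DiskEntry:
    object_id: int
    name: str
    guid: str


@dataclass
class VolumeEntry:
    object_id: int
    name: str
    size_sectors: int
    state: str
    drive_hint: str


@dataclass
class ComponentEntry:
    object_id: int
    name: str
    volume_id: int      # links the component to its volume


@dataclass
class PartitionEntry:
    object_id: int
    name: str
    component_id: int   # links the soft partition to a component
    disk_id: int        # links the soft partition to the disk it lives on
    start: int
    size_sectors: int


def partitions_of_volume(volume: VolumeEntry,
                         components: List[ComponentEntry],
                         partitions: List[PartitionEntry]) -> List[PartitionEntry]:
    """Walk volume -> component -> partition using the object identifiers."""
    component_ids = {c.object_id for c in components if c.volume_id == volume.object_id}
    return [p for p in partitions if p.component_id in component_ids]


# A simple volume: one partition, one component, one volume (values are made up).
disk1 = DiskEntry(2, "Disk1", "hypothetical-guid")
volume1 = VolumeEntry(5, "Volume1", 409600, "ACTIVE", "E:")
component1 = ComponentEntry(6, "Volume1-01", volume_id=5)
partition1 = PartitionEntry(7, "Disk1-01", component_id=6, disk_id=2,
                            start=0x3F, size_sectors=409600)

found = partitions_of_volume(volume1, [component1], [partition1])
print([p.name for p in found], SECTOR_SIZE // RECORD_SIZE, "records per sector")
```

A multipartition volume would add more partition entries that point at the same component, or more components that point at the same volume. The lookup function does not change.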
Logical Disk Manager Configuration Dump v1.03
Copyright (C) 2000-2002 Mark Russinovich

PRIVATE HEAD:
  Signature          : PRIVHEAD
  Version            : 2.12
  Disk Id            : b5f4a801-758d-11dd-b7f0-000c297f0108
  Host Id            : 1b77da20-c717-11d0-a5be-00a0c91db73c
  Disk Group Id      : b5f4a7fd-758d-11dd-b7f0-000c297f0108
  Disk Group Name    : WIN-SL5V78KD01W-Dg0
  Logical disk start : 3F
  Logical disk size  : 7FF7C1 (4094 MB)
  Configuration start: 7FF800
  Configuration size : 800 (1 MB)
  Number of TOCs     : 2
  TOC size           : 7FD (1022 KB)
  Number of Configs  : 1
  Config size        : 5C9 (740 KB)
  Number of Logs     : 1
  Log size           : E0 (112 KB)

TOC 1:
  Signature          : TOCBLOCK
  Sequence           : 0x1
  Config bitmap start: 0x11
  Config bitmap size : 0x5C9
  Log bitmap start   : 0x5DA
  Log bitmap size    : 0xE0
...

VBLK DATABASE:
...
0x00000B: [00000B]
  Name        : Disk2-01
  Object Id   : 0x0008
  Parent Id   : 0x3157
  Disk Id     : 0x0000
  Start       : 0x7C100
  Size        : 0x0 (0 MB)
  Volume Off  : 0x7FE80003 (1047808 MB)
0x00000C: [00000C]
  Name        : Disk3-01
  Object Id   : 0x0009
  Parent Id   : 0x3157
  Disk Id     : 0x0000
  Start       : 0x7C100
  Size        : 0x0 (0 MB)
  Volume Off  : 0xFFD00003 (2095616 MB)
0x00000D: [00000F]
  Name        : Volume1
  Object Id   : 0x0005
  Volume state: ACTIVE
  Size        : 0x017FB800 (12279 MB)
  GUID        : b5f4a806-758d-11dd-b7f0-c297f0108
  Drive Hint  : E: