Disk buffer
In computer storage, the disk buffer (often ambiguously called disk cache or cache buffer) is the embedded memory in a hard drive that acts as a buffer between the rest of the computer and the physical hard disk platter used for storage. Modern hard disks come with 8 to 64 MiB of such memory.
Since the late 1980s, nearly all disks sold have embedded microcontrollers and either an ATA, Serial ATA, SCSI, or Fibre Channel interface. The drive circuitry usually has a small amount of memory, used to store the bits going to and coming from the disk platter.

The disk buffer is physically distinct from, and is used differently from, the page cache typically kept by the operating system in the computer's main memory. The disk buffer is controlled by the microcontroller in the hard disk drive, and the page cache is controlled by the computer to which that disk is attached. The disk buffer is usually quite small, from 8 to 64 MiB, whereas the page cache generally occupies all otherwise unused physical memory. While data in the page cache is reused multiple times, the data in the disk buffer is rarely reused. In this sense, the terms disk cache and cache buffer are misnomers; the embedded controller's memory is more appropriately called the disk buffer.

Note that disk array controllers, as opposed to disk controllers, usually have normal cache memory of around 0.5–8 GiB.

Read-ahead/read-behind

When executing a read from the disk, the disk arm moves the read/write head to (or near) the correct track, and after some settling time the read head begins to pick up bits. Usually, the first sectors to be read are not the ones that have been requested by the operating system. The disk's embedded computer typically saves these unrequested sectors in the disk buffer, in case the operating system requests them later.
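
The behaviour described above can be illustrated with a minimal sketch in Python. It is a toy model rather than real drive firmware: the class name ToyDriveBuffer, the track size, the buffer capacity and the policy of buffering the whole track on every miss are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class ToyDriveBuffer:
    """Toy model: a read that misses the buffer keeps the whole track."""

    def __init__(self, sectors_per_track=8, capacity=16):
        self.sectors_per_track = sectors_per_track  # assumed fixed track size
        self.capacity = capacity                    # max sectors held in the buffer
        self.buffer = OrderedDict()                 # sector -> data, oldest first
        self.platter_reads = 0                      # how often the head was used

    def _read_platter(self, sector):
        # Stand-in for the magnetic media: returns fake sector contents.
        return f"data-{sector}"

    def read(self, sector):
        if sector in self.buffer:
            return self.buffer[sector]              # served from the disk buffer
        self.platter_reads += 1
        start = sector - sector % self.sectors_per_track
        requested = None
        # Sectors before the target model read-behind, those after read-ahead.
        for s in range(start, start + self.sectors_per_track):
            data = self._read_platter(s)
            if s == sector:
                requested = data
            self.buffer[s] = data
            self.buffer.move_to_end(s)
            if len(self.buffer) > self.capacity:
                self.buffer.popitem(last=False)     # evict the oldest sector
        return requested


drive = ToyDriveBuffer()
drive.read(10)              # miss: goes to the platter, buffers sectors 8..15
drive.read(9)               # hit on a sector picked up before the target
drive.read(15)              # hit on a sector picked up after the target
print(drive.platter_reads)  # 1
```

In this sketch the later requests for neighbouring sectors never reach the platter, which is the benefit the drive hopes for when it keeps unrequested sectors.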

Speed matching

The speed of the disk's I/O interface to the computer almost never matches the speed at which the bits are transferred to and from the hard disk platter. The disk buffer is used so that both the I/O interface and the disk read/write head can operate at full speed.
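
As a rough worked example of speed matching, the following Python sketch estimates how much data accumulates in the buffer while a write burst arrives over a fast interface and drains to a slower platter. The function name peak_buffer_fill and the rates used are illustrative assumptions, not figures from any particular drive, and the model assumes both sides run at constant rates.

```python
def peak_buffer_fill(burst_bytes, interface_rate, media_rate):
    """Bytes left in the buffer at the moment a write burst finishes arriving.

    burst_bytes    -- size of the burst sent by the host
    interface_rate -- bytes per second over the I/O interface (assumed constant)
    media_rate     -- bytes per second written to the platter (assumed constant)
    """
    if interface_rate <= 0 or media_rate <= 0:
        raise ValueError("rates must be positive")
    if media_rate >= interface_rate:
        return 0.0  # the platter keeps up, so nothing accumulates
    burst_time = burst_bytes / interface_rate
    # Everything arrived during burst_time; only media_rate * burst_time drained.
    return burst_bytes - media_rate * burst_time


# Hypothetical figures: 8 MB burst, 600 MB/s interface, 150 MB/s platter.
print(peak_buffer_fill(8_000_000, 600_000_000, 150_000_000))
```

With these assumed figures about 6 MB must be held in the buffer, so the interface can finish the burst at full speed while the head continues writing at its own pace.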

Write acceleration

The disk's embedded microcontroller may signal the main computer that a disk write is complete immediately after receiving the write data, before the data is actually written to the platter. This early signal allows the main computer to continue working even though the data has not actually been written yet. This can be somewhat dangerous, because if power is lost before the data is permanently fixed in the magnetic media, the data will be lost from the disk buffer, and the file system on the disk may be left in an inconsistent state. On some disks, this vulnerable period between signaling the write complete and fixing the data can be arbitrarily long, as the write can be deferred indefinitely by newly arriving requests. For this reason, the use of write acceleration can be controversial. Consistency can be maintained, however, by using a battery-backed memory system for caching data, although this is typically only found in high-end RAID controllers. Alternatively, the caching can simply be turned off when the integrity of data is deemed more important than write performance. Another option is to send data to disk in a carefully managed order and to issue "cache flush" commands in the right places, as the ZFS file system does.
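
The risk and the role of a cache flush can be shown with a small Python toy model. The class ToyWriteBackDisk and its methods are hypothetical names used only for illustration; they do not correspond to any real drive command set or operating system API.

```python
class ToyWriteBackDisk:
    """Toy model of a drive that acknowledges writes before committing them."""

    def __init__(self):
        self.platter = {}  # non-volatile: survives power loss
        self.buffer = {}   # volatile disk buffer: lost on power loss

    def write(self, sector, data):
        # Write acceleration: report completion as soon as the data is buffered.
        self.buffer[sector] = data
        return "complete"

    def flush(self):
        # Model of a "cache flush" command: commit all buffered writes.
        self.platter.update(self.buffer)
        self.buffer.clear()

    def power_loss(self):
        # Anything still in the volatile buffer is gone.
        self.buffer.clear()


disk = ToyWriteBackDisk()
disk.write(1, "journal")
disk.flush()              # sector 1 is now safely on the platter
disk.write(2, "commit")   # acknowledged, but only buffered
disk.power_loss()
print(disk.platter)       # {1: 'journal'}
```

The second write was reported as complete yet never reached the platter, which is the inconsistency that flushing at the right points is meant to prevent.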

Command queuing

Newer SATA and most SCSI disks can accept multiple commands while any one command is in operation through "command queuing" (see NCQ and TCQ). These commands are stored by the disk's embedded controller until they are completed. Should a read reference the data at the destination of a queued write, the to-be-written data will be returned. Command queuing is different from write acceleration in that the main computer's operating system is notified when data is actually written onto the magnetic media. The OS can use this information to keep the file system consistent through rescheduled writes.
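
A minimal Python sketch of these ideas follows. It is not an implementation of NCQ or TCQ: the class ToyQueuedDisk, the tags, and the nearest-sector-first servicing rule are assumptions chosen to keep the example short.

```python
class ToyQueuedDisk:
    """Toy model of a drive that queues writes and completes them out of order."""

    def __init__(self, head=0):
        self.head = head   # current head position, as a sector number
        self.media = {}    # data actually written to the platter
        self.queue = []    # pending (tag, sector, data), in arrival order

    def submit_write(self, tag, sector, data):
        # The command is accepted and held; no completion is reported yet.
        self.queue.append((tag, sector, data))

    def read(self, sector):
        # A read of a sector with a queued write returns the to-be-written data.
        for _tag, queued_sector, data in reversed(self.queue):
            if queued_sector == sector:
                return data
        return self.media.get(sector)

    def service_next(self):
        """Write the queued command nearest the head; return its tag, or None."""
        if not self.queue:
            return None
        # min() keeps the earliest entry on ties, so two writes to the same
        # sector are still committed in arrival order.
        i = min(range(len(self.queue)),
                key=lambda k: abs(self.queue[k][1] - self.head))
        tag, sector, data = self.queue.pop(i)
        self.media[sector] = data
        self.head = sector
        return tag  # completion is signalled only after the media write


disk = ToyQueuedDisk(head=0)
disk.submit_write("a", 90, "A")
disk.submit_write("b", 10, "B")
disk.submit_write("c", 50, "C")
print(disk.read(10))                              # B
print([disk.service_next() for _ in range(3)])    # ['b', 'c', 'a']
```

The host learns the true completion order from the returned tags, which is what allows it to keep the file system consistent even though the drive rescheduled the writes.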

Performance

Disk buffer sizes over 8 MiB do not produce any performance gains.
Where a buffer is large and the throughput of the disk is low, data stays cached for too long, resulting in degraded performance relative to equivalent disks with smaller buffers. This degradation occurs because of longer latencies when flush commands are sent to a disk with a full buffer.
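
The flush latency effect can be approximated with simple arithmetic, sketched here in Python. The function name and the 50 MiB/s write rate are illustrative assumptions, and the model assumes the full buffer drains sequentially with no seek time, so real latencies would be longer.

```python
MIB = 1024 * 1024


def worst_case_flush_seconds(buffer_bytes, media_write_rate):
    """Time to drain a completely full buffer at a constant media write rate."""
    if media_write_rate <= 0:
        raise ValueError("media_write_rate must be positive")
    return buffer_bytes / media_write_rate


# Hypothetical slow disk writing 50 MiB/s, compared across two buffer sizes.
for size_mib in (8, 64):
    seconds = worst_case_flush_seconds(size_mib * MIB, 50 * MIB)
    print(f"{size_mib} MiB buffer: {seconds:.2f} s to flush")
```

Under these assumptions the flush of a full buffer takes eight times longer on the 64 MiB drive than on the 8 MiB drive, which is the source of the added latency.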
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 