In computer storage, the disk buffer (often ambiguously called disk cache or cache buffer) is the embedded memory in a hard drive that acts as a buffer between the computer and the physical hard disk platter used for storage. Modern hard disks come with 8 to 64 MiB of such memory.
Since the late 1980s, nearly all disks sold have had embedded microcontrollers and either an ATA, Serial ATA, SCSI, or Fibre Channel interface. The drive circuitry usually has a small amount of memory, used to store the bits going to and coming from the disk platter.
The disk buffer is physically distinct from, and is used differently than, the page cache typically kept by the operating system in the computer's main memory. The disk buffer is controlled by the microcontroller in the hard disk drive, and the page cache is controlled by the computer to which that disk is attached. The disk buffer is usually quite small, from 2 to 64 MiB, and the page cache is generally all unused physical memory, which as of 2007 may be as much as 4 GiB for desktop computers. While data in the page cache is reused multiple times, the data in the disk buffer is rarely reused. In this sense, the terms disk cache and cache buffer are misnomers; the embedded controller's memory is more appropriately called the disk buffer.
Note that disk array controllers, as opposed to disk controllers, usually have normal cache memory of around 0.5–8 GiB.
Read-ahead/read-behind
When executing a read from the disk, the disk arm moves the read/write head to (or near) the correct track, and after some settling time the read head begins to pick up bits. Usually, the first sectors to be read are not the ones that have been requested by the operating system. The disk's embedded computer typically saves these unrequested sectors in the disk buffer, in case the operating system requests them later.
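The behaviour described above can be illustrated with a toy model. This is only a sketch: the class name `ReadAheadBuffer`, the single-track layout, and the `head_lands_at` parameter are assumptions made for illustration, and real drive firmware is far more elaborate.

```python
class ReadAheadBuffer:
    """Toy model of a drive that keeps sectors it read but was not asked for."""

    def __init__(self, track):
        self.track = track        # sector payloads on one track (hypothetical layout)
        self.buffer = {}          # sector index -> payload held in the disk buffer
        self.platter_reads = 0    # how many times the platter had to be read

    def read(self, sector, head_lands_at):
        # Serve from the buffer if an earlier pass already captured this sector.
        if sector in self.buffer:
            return self.buffer[sector]
        # Otherwise read every sector passing under the head, starting where the
        # head settled, and keep each one until the requested sector arrives.
        self.platter_reads += 1
        pos = head_lands_at
        while True:
            self.buffer[pos] = self.track[pos]
            if pos == sector:
                return self.track[pos]
            pos = (pos + 1) % len(self.track)


track = [f"data{i}" for i in range(8)]
disk = ReadAheadBuffer(track)
first = disk.read(sector=5, head_lands_at=2)   # sectors 2, 3 and 4 are saved on the way
second = disk.read(sector=3, head_lands_at=0)  # answered from the buffer
```

In this model the second request never touches the platter, which is the benefit the drive hopes for when it saves the unrequested sectors.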
Speed matching
The speed of the disk's I/O interface to the computer almost never matches the speed at which the bits are transferred to and from the hard disk platter. The disk buffer is used so that both the I/O interface and the disk read/write head can operate at full speed.
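As a rough worked example of speed matching, the sketch below estimates how much data accumulates in the buffer while a write burst arrives over the interface faster than the platter can absorb it. The function name and the transfer rates are illustrative assumptions, not figures for any particular drive.

```python
def peak_buffer_fill(burst_mib, interface_mib_s, media_mib_s):
    """Peak buffer occupancy (MiB) while a write burst arrives faster than it drains."""
    if interface_mib_s <= media_mib_s:
        return 0.0  # the platter keeps up, so nothing accumulates
    transfer_time = burst_mib / interface_mib_s  # seconds needed to receive the burst
    drained = media_mib_s * transfer_time        # MiB written to the platter meanwhile
    return burst_mib - drained


# Assumed rates: a 16 MiB burst over a 300 MiB/s interface, 100 MiB/s to the platter.
peak = peak_buffer_fill(burst_mib=16, interface_mib_s=300, media_mib_s=100)
```

With these assumed numbers about two thirds of the burst sits in the buffer at the moment the interface transfer ends, so neither side has had to slow down for the other.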
Write acceleration
The disk's embedded microcontroller may signal the main computer that a disk write is complete immediately after receiving the write data, before the data are actually written to the platter. This early signal allows the main computer to continue working even though the data have not actually been written yet. This can be somewhat dangerous, because if power is lost before the data are permanently fixed in the magnetic media, the data will be lost from the disk buffer, and the file system on the disk may be left in an inconsistent state. On some disks, this vulnerable period between signaling that the write is complete and fixing the data can be arbitrarily long, as the write can be deferred indefinitely by newly arriving requests. For this reason, the use of write acceleration can be controversial. Consistency can be maintained, however, by using a battery-backed memory system for caching data, although this is typically found only in high-end RAID controllers. Alternatively, the caching can simply be turned off when the integrity of data is deemed more important than write performance. Another option is to send data to disk in a carefully managed order and to issue "cache flush" commands in the right places, as the ZFS file system does.
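The risk, and the effect of a well-placed flush, can be sketched with a small simulation. The `WriteBackDisk` class and its method names are hypothetical and stand in for drive behaviour only; they do not correspond to any real drive command set or to how ZFS is implemented.

```python
class WriteBackDisk:
    """Toy drive that acknowledges writes before they reach the platter."""

    def __init__(self):
        self.platter = {}  # contents that would survive a power loss
        self.buffer = {}   # volatile disk buffer contents

    def write(self, sector, data):
        self.buffer[sector] = data
        return "complete"  # reported early, before the platter is updated

    def flush(self):
        # Model of a "cache flush" command: everything buffered becomes durable.
        self.platter.update(self.buffer)
        self.buffer.clear()

    def power_loss(self):
        self.buffer.clear()  # acknowledged but unwritten data is gone


disk = WriteBackDisk()
disk.write(0, "journal entry")
disk.flush()                    # ordering point: the entry is durable from here on
ack = disk.write(1, "commit record")
disk.power_loss()               # the commit record was acknowledged but never persisted
```

After the simulated power loss only the flushed sector remains on the platter, even though both writes were reported as complete to the host.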
Command queuing
Newer SATA and most SCSI disks can accept multiple commands while any one command is in operation through "command queuing" (see NCQ and TCQ). These commands are stored by the disk's embedded controller until they are completed. Should a read reference the data at the destination of a queued write, the to-be-written data will be returned. Command queuing differs from write acceleration in that the main computer's operating system is notified when data are actually written onto the magnetic media. The OS can use this information to keep the file system consistent through rescheduled writes.
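A minimal sketch of these two properties, a read being answered from a queued write and the host being notified only on real completion, might look as follows. The `QueuedDisk` class is hypothetical, and it completes commands in arrival order for simplicity, whereas a real drive may reorder them.

```python
from collections import deque


class QueuedDisk:
    """Toy drive holding queued writes until they are committed to the platter."""

    def __init__(self):
        self.platter = {}
        self.queue = deque()  # pending (sector, data) writes, oldest first

    def submit_write(self, sector, data):
        self.queue.append((sector, data))  # no completion is reported yet

    def read(self, sector):
        # A read of a sector with a queued write returns the to-be-written data;
        # the newest queued write to that sector wins.
        for queued_sector, data in reversed(self.queue):
            if queued_sector == sector:
                return data
        return self.platter.get(sector)

    def complete_next(self):
        """Commit the oldest queued write and return its sector as the completion notice."""
        sector, data = self.queue.popleft()
        self.platter[sector] = data
        return sector


disk = QueuedDisk()
disk.platter[7] = "old"
disk.submit_write(7, "new")
seen_before = disk.read(7)     # answered from the queue, not the platter
on_platter_before = disk.platter[7]
done = disk.complete_next()    # only now would the host be told sector 7 is written
```

The completion notice is what distinguishes this from write acceleration in the model: the host learns that sector 7 is durable only after the platter has actually been updated.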
Performance
Where a buffer is large and the throughput of the disk is slow, data remain cached for too long, resulting in degraded performance compared with equivalent disks that have smaller buffers. This degradation occurs because of longer latencies when flush commands are sent to a disk with a full buffer.
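A back-of-the-envelope estimate makes the effect concrete. The sketch below assumes that flushing a full buffer takes roughly the buffered amount divided by the sustained write throughput; the function name and the 40 MiB/s figure are assumptions chosen for illustration, not measurements.

```python
def flush_latency_s(buffered_mib, write_throughput_mib_s):
    """Approximate time (seconds) to drain a full buffer to the platter."""
    return buffered_mib / write_throughput_mib_s


# Two otherwise equivalent disks with an assumed 40 MiB/s write throughput.
small = flush_latency_s(buffered_mib=8, write_throughput_mib_s=40)
large = flush_latency_s(buffered_mib=64, write_throughput_mib_s=40)
```

Under this simple model the disk with the 64 MiB buffer takes eight times as long to answer a flush as the one with the 8 MiB buffer.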