Page cache
In computing, a page cache, sometimes ambiguously called a disk cache, is a "transparent" buffer of disk-backed pages kept in main memory (RAM) by the operating system for quicker access. A page cache is typically implemented in kernels with paging memory management and is completely transparent to applications. Memory that is not directly allocated to applications is usually used for the page cache. Hard disk read speeds are low, and random accesses require expensive disk seeks compared with main memory; this is why RAM upgrades usually yield significant improvements in a computer's speed and responsiveness. Separate disk caching is provided on the hardware side by dedicated RAM or NVRAM chips located either in the disk controller (inside a hard disk drive, where it is properly called a disk buffer) or in a disk array controller. Such memory should not be confused with the page cache.

Memory conservation

Since non-dirty pages in the page cache have identical copies in secondary storage (e.g. hard disk, flash disk), discarding and re-using their space is much quicker than paging out application memory, and is often preferred. Executable binaries, such as applications and libraries, are also typically accessed through the page cache and mapped into individual process address spaces using virtual memory (on Unix-like operating systems, this is done through the mmap system call). This means not only that the binary files are shared between separate processes, but also that unused parts of binaries will eventually be pushed out of main memory, which conserves memory.
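The file-backed mapping described above can be sketched with Python's standard mmap module, which wraps the mmap system call on Unix-like systems. This is only an illustration under stated assumptions: the temporary file stands in for a binary, and the helper name read_via_mapping is hypothetical. The pages touched through the mapping are served from the page cache, and only the parts actually accessed need to be resident in memory.

```python
import mmap
import os
import tempfile


def read_via_mapping(path, offset, length):
    """Return `length` bytes at `offset`, read through a read-only file mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


# Hypothetical stand-in for an executable or library file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEADER" + b"\0" * 8192 + b"TAIL")
    demo_path = tmp.name

header = read_via_mapping(demo_path, 0, 6)       # touches only the first page
tail = read_via_mapping(demo_path, 6 + 8192, 4)  # touches only the last page
os.remove(demo_path)
```

Whether two processes mapping the same file really share physical pages is decided by the operating system; the sketch only shows the mapping interface.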

Since cached pages can be easily evicted and re-used, some operating systems, notably Windows NT, even report page cache usage as "free" memory, although the memory is actually allocated to disk pages. This has led to some confusion about the utilization of the page cache in Windows.

Page cache and disk writes

The page cache also aids in writing to a disk. Pages that have been modified in memory and are waiting to be written to disk are marked "dirty" and have to be flushed to disk before they can be freed. When a file write occurs, the page backing the particular block is looked up. If it is already in the cache, the write is done to that page in memory. Otherwise, if the write falls exactly on page boundaries, the page is not even read from disk, but is allocated and immediately marked dirty. In the remaining case, the page or pages are fetched from disk and the requested modifications are applied. A file that is created or opened in the page cache, but not written to, might result in a zero-byte file at a later read.
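The three write cases described above can be sketched as a toy model. This is not how any particular kernel implements its page cache; the class ToyPageCache, the 4096-byte PAGE_SIZE, and the bytearray standing in for the disk are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ToyPageCache:
    """Toy write path: cache hit, aligned whole-page write, or read-modify-write."""

    def __init__(self, backing):
        self.backing = backing  # bytearray standing in for the disk
        self.pages = {}         # page index -> bytearray of PAGE_SIZE bytes
        self.dirty = set()      # indices of pages modified since the last flush
        self.disk_reads = 0     # counts simulated page reads from disk

    def _read_page_from_disk(self, index):
        self.disk_reads += 1
        start = index * PAGE_SIZE
        page = bytearray(self.backing[start:start + PAGE_SIZE])
        page.extend(b"\0" * (PAGE_SIZE - len(page)))  # pad a short last page
        return page

    def write(self, offset, data):
        pos = 0
        while pos < len(data):
            index, in_page = divmod(offset + pos, PAGE_SIZE)
            chunk = data[pos:pos + PAGE_SIZE - in_page]
            page = self.pages.get(index)
            if page is None:
                if in_page == 0 and len(chunk) == PAGE_SIZE:
                    # Write covers the whole page: allocate without reading.
                    page = bytearray(PAGE_SIZE)
                else:
                    # Partial write: fetch the page first, then modify it.
                    page = self._read_page_from_disk(index)
                self.pages[index] = page
            page[in_page:in_page + len(chunk)] = chunk
            self.dirty.add(index)
            pos += len(chunk)

    def flush(self):
        """Write every dirty page back to the backing store."""
        for index in sorted(self.dirty):
            start, end = index * PAGE_SIZE, (index + 1) * PAGE_SIZE
            if len(self.backing) < end:
                self.backing.extend(b"\0" * (end - len(self.backing)))
            self.backing[start:end] = self.pages[index]
        self.dirty.clear()


disk = bytearray(b"a" * (3 * PAGE_SIZE))
cache = ToyPageCache(disk)
cache.write(0, b"x" * PAGE_SIZE)        # aligned whole page: no disk read
reads_after_aligned = cache.disk_reads
cache.write(PAGE_SIZE + 10, b"hello")   # partial write: page 1 is fetched
reads_after_partial = cache.disk_reads
cache.write(PAGE_SIZE + 20, b"again")   # page 1 already cached: no disk read
dirty_before_flush = sorted(cache.dirty)
disk_before_flush = bytes(disk[:5])     # the "disk" is unchanged until flush
cache.flush()
```

Until flush is called, the modifications exist only in the cached pages, which is why dirty pages cannot simply be discarded the way clean pages can.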

However, not all cached pages can be written to: program code is often mapped as read-only or copy-on-write; in the latter case, modifications to code will be visible only to the process itself and will not be written to disk.
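Copy-on-write behaviour can be observed from user space with a private mapping. The sketch below uses the ACCESS_COPY mode of Python's standard mmap module; the temporary file and its contents are assumptions chosen for illustration, and ordinary data is used in place of program code.

```python
import mmap
import os
import tempfile

# Hypothetical file standing in for mapped program code.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"original contents")
    cow_path = tmp.name

with open(cow_path, "r+b") as f:
    # ACCESS_COPY: writes affect this process's memory, not the underlying file.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
        m[0:8] = b"MODIFIED"
        seen_in_mapping = bytes(m[:])

# Re-read the file normally to see what is actually stored.
with open(cow_path, "rb") as f:
    seen_on_disk = f.read()
os.remove(cow_path)
```

The mapping shows the modified bytes while the file keeps its original contents, matching the description that copy-on-write changes stay private to the process.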

History

The first commercially available page cache (disk cache) for microcomputers was MicroCache from Microcosm Ltd. It appeared in 1982, initially for the CP/M operating system and later for MS-DOS.

Microsoft added a disk cache, called SmartDrive, to MS-DOS (version 4.01) in 1988.

See also

  • Cache
  • Demand paging
  • Five-minute rule
  • Page table
  • Paging
  • Virtual memory

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 