Memory hierarchy
The term memory hierarchy is used in the theory of computation when discussing performance issues in computer architectural design, algorithm predictions, and lower-level programming constructs involving locality of reference. A 'memory hierarchy' in computer storage distinguishes each level in the 'hierarchy' by response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by the controlling technology.

The many trade-offs in designing for high performance include the structure of the memory hierarchy, i.e. the size and technology of each component. The various components can be viewed as forming a hierarchy of memories (m1, m2, ..., mn) in which each member mi is in a sense subordinate to the next highest member mi-1 of the hierarchy. To limit waiting by higher levels, a lower level will respond by filling a buffer and then signaling to activate the transfer.
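
To make the subordinate relation concrete, here is a toy model in C (purely illustrative; the level names, cycle counts, and one-entry buffers are assumptions, not taken from any particular machine): each level answers a request from its own buffer on a hit, and otherwise defers to the next, slower member of the hierarchy, filling its buffer before answering.

  #include <stdio.h>

  #define LEVELS 3

  typedef struct {
      const char *name;
      int latency_cycles;              /* assumed cost of answering here */
      int buffered_addr;               /* one-entry "buffer", -1 = empty */
  } Level;

  static Level hierarchy[LEVELS] = {
      { "cache",       4,       -1 },
      { "main memory", 200,     -1 },
      { "disk",        1000000, -1 },
  };

  /* Total cycles spent satisfying a read of `addr`, starting at `level`. */
  static int read_addr(int level, int addr) {
      Level *m = &hierarchy[level];
      if (m->buffered_addr == addr)    /* hit: answer from this level */
          return m->latency_cycles;
      int cost = m->latency_cycles;
      if (level < LEVELS - 1)          /* miss: defer to the slower member */
          cost += read_addr(level + 1, addr);
      m->buffered_addr = addr;         /* fill the buffer before answering */
      return cost;
  }

  int main(void) {
      printf("first read : %7d cycles\n", read_addr(0, 42)); /* cold: misses every buffer */
      printf("second read: %7d cycles\n", read_addr(0, 42)); /* hits the fastest level */
      return 0;
  }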

There are four major storage levels.
  1. Internal – processor registers and CPU cache.
  2. Main – the system RAM and controller cards.
  3. On-line mass storage – secondary storage.
  4. Off-line bulk storage – tertiary and off-line storage.

This is a very general memory hierarchy structuring. Many other structures are useful. For example, a paging algorithm may be considered as a level for virtual memory when designing a computer architecture.
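
As a sketch of that paging idea (the article names no specific algorithm, so a simple FIFO replacement policy is assumed here), the following C program models a small main memory of three frames backing a larger virtual space and counts page faults:

  #include <stdio.h>

  #define FRAMES 3

  int main(void) {
      int frames[FRAMES] = { -1, -1, -1 };   /* tiny "RAM", initially empty */
      int next = 0, faults = 0;
      int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
      int n = sizeof refs / sizeof refs[0];

      for (int i = 0; i < n; i++) {
          int hit = 0;
          for (int f = 0; f < FRAMES; f++)   /* is the page resident? */
              if (frames[f] == refs[i]) hit = 1;
          if (!hit) {
              frames[next] = refs[i];        /* evict the oldest (FIFO order) */
              next = (next + 1) % FRAMES;
              faults++;                      /* a fault = a trip down one level */
          }
      }
      printf("page faults: %d / %d references\n", faults, n);
      return 0;
  }

Each fault stands for a trip to the next level down (disk), which is why replacement policy matters so much to overall latency.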

Example use of the term

Here are some quotes.
  • Adding complexity slows down the memory hierarchy.
  • CMOx memory technology stretches the Flash space in the memory hierarchy.
  • One of the main ways to increase system performance is minimising how far down the memory hierarchy one has to go to manipulate data.
  • Latency and bandwidth are two metrics associated with caches and memory. Neither of them is uniform; each is specific to a particular component of the memory hierarchy.
  • Predicting where in the memory hierarchy the data resides is difficult.
  • ...the location in the memory hierarchy dictates the time required for the prefetch to occur.

Application of the concept

The memory hierarchy in most computers is:
  • Processor registers – the fastest possible access (usually 1 CPU cycle), only hundreds of bytes in size
  • Level 1 (L1) cache – often accessed in just a few cycles, usually tens of kilobytes
  • Level 2 (L2) cache – higher latency than L1 by 2× to 10×, usually 512 KiB or more
  • Level 3 (L3) cache – higher latency than L2, usually 2048 KiB or more
  • Main memory – may take hundreds of cycles, but can be multiple gigabytes; access times may not be uniform in the case of a NUMA machine
  • Disk storage – millions of cycles of latency if not cached, but very large
  • Tertiary storage – several seconds of latency, can be huge
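
The step from one level to the next can be observed directly. The following C micro-benchmark (an illustrative sketch; the buffer sizes, the 64-byte line-sized stride, and the use of POSIX clock_gettime are assumptions) times accesses over growing working sets; the reported time per access rises as the working set spills out of each cache level:

  /* Build hint: cc -O1 stride.c  (volatile keeps the accesses from
     being optimized away). */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <time.h>

  static double seconds(void) {
      struct timespec ts;
      clock_gettime(CLOCK_MONOTONIC, &ts);
      return ts.tv_sec + ts.tv_nsec * 1e-9;
  }

  int main(void) {
      const size_t max_bytes = 64u << 20;      /* 64 MiB: beyond typical L3 */
      volatile char *buf = malloc(max_bytes);
      if (!buf) return 1;
      memset((void *)buf, 1, max_bytes);       /* pre-touch to avoid page-fault noise */

      /* For each working-set size, make a fixed number of line-sized
         strides through the buffer and report the average access time. */
      for (size_t size = 4u << 10; size <= max_bytes; size <<= 1) {
          const size_t stride  = 64;           /* one typical cache line */
          const size_t touches = 1u << 23;
          size_t i = 0;
          double t0 = seconds();
          for (size_t n = 0; n < touches; n++) {
              buf[i] += 1;                     /* one load + one store */
              i += stride;
              if (i >= size) i = 0;
          }
          printf("%8zu KiB: %6.2f ns/access\n",
                 size >> 10, (seconds() - t0) / touches * 1e9);
      }
      free((void *)buf);
      return 0;
  }

On typical desktop hardware the printed figures form plateaus roughly matching the L1, L2, and L3 capacities, with a final jump once the working set only fits in main memory.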

Note that a hobbyist who reads "L1 cache" in a computer's specification sheet is reading about the 'internal' memory hierarchy.
Most modern CPUs are so fast that for most program workloads, the bottleneck is the locality of reference of memory accesses and the efficiency of the caching and memory transfer between different levels of the hierarchy. As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the space cost, as a larger memory object is more likely to overflow a small/fast level and require use of a larger/slower level.
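
A minimal C illustration of how locality of reference dominates performance (the 4096×4096 matrix size is an assumption; exact timings depend on the machine): both loops sum the same array, but the row-major order walks memory sequentially while the column-major order jumps a full row per access, defeating the cache.

  #include <stdio.h>
  #include <time.h>

  #define N 4096                       /* 4096*4096 doubles = 128 MiB */
  static double a[N][N];               /* static, zero-initialized */

  int main(void) {
      double sum = 0.0;
      clock_t t0;

      t0 = clock();
      for (int i = 0; i < N; i++)      /* row-major: stride of 1 element */
          for (int j = 0; j < N; j++)
              sum += a[i][j];
      printf("row-major:    %.2f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

      t0 = clock();
      for (int j = 0; j < N; j++)      /* column-major: stride of N elements */
          for (int i = 0; i < N; i++)
              sum += a[i][j];
      printf("column-major: %.2f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

      return sum != 0.0;               /* keep 'sum' live for the optimizer */
  }

On many machines the column-major loop runs several times slower, even though both loops perform exactly the same arithmetic.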

Modern programming languages mainly assume two levels of memory, main memory and disk storage, though in assembly language and inline assemblers in languages such as C, registers can be directly accessed (a short example follows the list below). Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
  • Programmers are responsible for moving data between disk and memory through file I/O.
  • Hardware is responsible for moving data between memory and caches.
  • Optimizing compilers are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.
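
As mentioned above, registers can be reached from C through an inline assembler. A minimal sketch using the GCC/Clang extended-asm syntax on x86-64 (compiler- and architecture-specific; other toolchains use different syntax):

  #include <stdio.h>

  int main(void) {
      unsigned long rsp_value;
      /* Copy the stack-pointer register directly into a C variable. */
      __asm__("movq %%rsp, %0" : "=r"(rsp_value));
      printf("stack pointer: %#lx\n", rsp_value);
      return 0;
  }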


Many programmers assume one level of memory. This works fine until the application hits a performance wall; only then is the memory hierarchy assessed, during code refactoring.

What a memory hierarchy is not

Although a file server appears as a virtual device on the computer, it was not a design consideration of the computer architect; Network-Attached Storage (NAS) devices optimize their own memory hierarchy as part of their physical design. "On-line" (secondary storage) here refers to the "network" from the CPU's point of view, which is "on" the computer's "line": the hard drive. "Off-line" (tertiary storage) here refers to "infinite" latency (waiting for manual intervention). This marks the boundary of the realm of the 'memory hierarchy'.

Memory hierarchy refers to a CPU-centric latency (delay), the primary criterion for deciding where a storage device is placed in the hierarchy, and fits the storage device into the design considerations concerning the operators of the computer. It is used only incidentally in operations. Its primary use is in thinking about abstract machines.

Network architects use the term latency directly, rather than memory hierarchy, because they do not design what will become on-line storage. A NAS may hold this article as "computer data storage", but the memory hierarchies of the computers which read and edit it do not care how.

See also

  • Memory Wall
  • Use of spatial and temporal locality: hierarchical memory
  • The difference between buffer and cache
  • Cache hierarchy in a modern processor