Fragmentation (computer)
In computer storage, fragmentation is a phenomenon in which storage space is used inefficiently, reducing storage capacity and, in most cases, performance. The term is also used to denote the wasted space itself.

There are three different but related forms of fragmentation: external fragmentation, internal fragmentation, and data fragmentation. Various storage allocation schemes exhibit one or more of these weaknesses. Fragmentation is often accepted in return for improvements in speed or simplicity.

Basic Principle

"Fragmented memory" denotes all of a system's unusable free memory. Memory fragmentation usually occurs when memory is allocated dynamically (with calls such as malloc). Memory allocated at runtime generally does not follow a stack discipline, and the allocator wastes memory in three ways: overhead, internal fragmentation, and external fragmentation.
When blocks of memory are allocated and released during runtime, it is unlikely that the released blocks will again form large contiguous regions. Instead, free memory becomes interspersed with the blocks still in use, and the average size of a free block shrinks. This problem, coupled with incomplete use of the memory that is allocated, results in memory fragmentation.

For example, consider a system with 5 MB of total memory, of which 2 MB is free for programs to use at runtime. Because free and used blocks are interspersed, a program cannot allocate even 1 MB of contiguous space, since no single free block is larger than 500 KB. Furthermore, if only 2 MB of a 3 MB allocation is actually used, the remaining 1 MB goes to waste. Such wasted space is the essence of external and internal memory fragmentation, respectively.

Types of Memory Fragmentation


Overhead

The memory allocator must store information about all memory allocations: the location, size, and ownership of each free block, as well as other internal status details. Overhead comprises all the additional resources that the allocation algorithm requires. A dynamic memory allocator typically stores this bookkeeping information in the memory it manages, so the space it consumes is itself a form of memory fragmentation.

Internal Fragmentation

When the memory allocated is larger than the memory required, the rest is wasted; the term "internal" refers to the fact that the unusable storage is inside the allocated region. Excess allocation usually has one of two causes: the allocator's policy (often dictated by architectural constraints), or a client asking for more memory than it needs. While this may seem wasteful, it is often accepted in return for increased efficiency or simplicity.
Allocator policy covers rules that every allocation must follow. For example, most allocators require that each allocation start at an address divisible by 4, 8, or 16 (alignment), and an allocator using a "segregated free list" policy may hand out blocks of only certain predefined sizes, depending on the processor architecture. Extra bytes may also be attached to each allocation for alignment padding and metadata.

For example, when a client requests a block of 23 bytes, it may well get 24 or 28 bytes or even more.

For example, in many file systems, each file always starts at the beginning of a cluster, because this simplifies organization and makes it easier to grow files. Any space left over between the last byte of the file and the first byte of the next cluster is a form of internal fragmentation called file slack or slack space.
Slack space is an important source of evidence in computer forensic investigations.

Another common example: English text is often stored with one character in each 8-bit byte even though, in standard ASCII encoding, the most significant bit of each byte is always zero. The unused bits are a form of internal fragmentation.

Similar problems with leaving reserved resources unused appear in many other areas. For example, IP addresses can only be reserved in blocks of certain sizes, resulting in many addresses that are reserved but not actively used; this contributes to the IPv4 address shortage.

Unlike other forms of fragmentation, internal fragmentation is difficult to reclaim; usually the best way to remove it is a design change. For example, in dynamic memory allocation, memory pools drastically cut internal fragmentation by spreading the space overhead over a larger number of objects.

External Fragmentation

External fragmentation is the inability to use free memory because the free memory is divided into small blocks interspersed with allocated memory. It is a weakness of certain storage allocation algorithms: when an application allocates and deallocates ("frees") regions of storage of varying sizes, the allocation algorithm can respond by leaving the allocated and deallocated regions interspersed.
The result is that although free storage is available, it is effectively unusable because it is divided into pieces that are too small to satisfy the demands of the application. The term "external" refers to the fact that the unusable storage is outside the allocated regions.

For example, consider a situation in which a program allocates three contiguous blocks of memory and then frees the middle one. The memory allocator can reuse this free block for future allocations, but only for requests no larger than the block itself.

External fragmentation also occurs in file systems as many files of different sizes are created, change size, and are deleted. The effect is even worse if a file which is divided into many small pieces is deleted, because this leaves similarly small regions of free space.
Compared to external fragmentation, overhead and internal fragmentation account for little loss in terms of wasted memory and reduced performance; external fragmentation does the most damage to the system. It is defined as:



    external fragmentation = 1 - (largest free block) / (total free memory)

This factor lies between 0 and 1. A factor of 1 (100% fragmentation) means that free memory exists only in blocks too small to satisfy any request, so the system is effectively out of usable free memory, while a factor of 0 means that all of the free memory is in a single large block.

For example, fragmentation is 90% when 100 MB of free memory is present but the largest free block available for allocation is just 10 MB.

Data Fragmentation

Data fragmentation occurs when a piece of data in memory is broken up into many pieces that are not close together. It is typically the result of attempting to insert a large object into storage that has already suffered external fragmentation.

For example, files in a file system are usually managed in units called blocks or clusters. When a file system is created, there is free space to store file blocks together contiguously, allowing for rapid sequential file reads and writes. However, as files are added, removed, and changed in size, the free space becomes externally fragmented, leaving only small holes in which to place new data. When a new file is written, or when an existing file is extended, the operating system puts the new data in non-contiguous data blocks to fit into the available holes. The new data blocks are necessarily scattered, slowing access due to the seek time and rotational latency of the read/write head and incurring additional overhead to manage the additional locations. This is called file system fragmentation.

When writing a new file of a known size, if there are any empty holes larger than that file, the operating system can avoid data fragmentation by putting the file into one of those holes. There are a variety of algorithms for selecting the hole in which to place the file; each is a heuristic approximate solution to the bin packing problem. The "best fit" algorithm chooses the smallest hole that is big enough; "worst fit" chooses the largest hole; "first fit" chooses the first hole that is big enough; and "next fit" behaves like first fit but resumes each search from where the previous one left off.

As another example, if the nodes of a linked list are allocated consecutively in memory, this improves locality of reference and enhances data cache performance during traversal of the list. If the memory pool's free space is fragmented, new nodes will instead be spread throughout memory, increasing the number of cache misses.

Just as compaction can eliminate external fragmentation, data fragmentation can be eliminated by rearranging data storage so that related pieces are close together. For example, the primary job of a defragmentation tool is to rearrange blocks on disk so that the blocks of each file are contiguous; most defragmenting utilities also attempt to reduce or eliminate free-space fragmentation. Some moving garbage collectors will also move related objects close together (this is called compacting) to improve cache performance.

Performance Degradation Due to Fragmentation

Memory fragmentation is one of the most severe problems faced by system managers. Over time, it degrades system performance and can eventually exhaust the supply of usable free memory.
Memory fragmentation is a kernel-level programming problem that becomes critical when it reaches alarming levels. In real-time applications, for example, fragmentation can climb to 99% only seconds before the system crashes; such a crash is difficult to avert because the critical rise in fragmentation is hard to anticipate.
According to research conducted by International Data Corporation, the performance degradation is largely due to external fragmentation, which alone can shorten the lifetime of a server by 33% and thus directly increase the yearly budget for hardware upgrades by 33%. Memory fragmentation therefore has an undesirable effect not only on memory usage and processing speed, but also on hardware lifetime and project cost.

See also

  • Defragmentation
  • File system fragmentation
  • Memory management

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 