Memory leak
A memory leak, in computer science (or leakage, in this context), occurs when a computer program consumes memory but is unable to release it back to the operating system. In object-oriented programming, a memory leak happens when an object is stored in memory but cannot be accessed by the running code. A memory leak has symptoms similar to a number of other problems (see below) and generally can only be diagnosed by a programmer with access to the program source code; however, many people refer to any unwanted increase in memory usage as a memory leak, though this is not strictly accurate from a technical perspective.

Because they can exhaust available system memory as an application runs, memory leaks are often the cause or a contributing factor of software aging.

Consequences

A memory leak can diminish the performance of the computer by reducing the amount of available memory. Eventually, in the worst case, too much of the available memory may become allocated and all or part of the system or device stops working correctly, the application fails, or the system slows down unacceptably due to thrashing.

Memory leaks may not be serious or even detectable by normal means. In modern operating systems, normal memory used by an application is released when the application terminates. This means that a memory leak in a program that only runs for a short time may not be noticed and is rarely serious.

Leaks that are much more serious include:
  • Where the program runs for an extended time and consumes additional memory over time, such as background tasks on servers, but especially in embedded devices which may be left running for many years.
  • Where new memory is allocated frequently for one-time tasks, such as when rendering the frames of a computer game or animated video.
  • Where the program is able to request memory, such as shared memory, that is not released, even when the program terminates.
  • Where memory is very limited, such as in an embedded system or portable device.
  • Where the leak occurs within the operating system or memory manager.
  • Where the leak is the responsibility of a system device driver.
  • Where running on an operating system that does not automatically release memory on program termination. Often on such machines if memory is lost, it can only be reclaimed by a reboot, an example of such a system being AmigaOS.

An example of a memory leak

The following example, written in pseudocode, is intended to show how a memory leak can come about, and its effects, without needing any programming knowledge. The program in this case is part of some very simple software designed to control an elevator. This part of the program is run whenever anyone inside the elevator presses the button for a floor.

When a button is pressed:
  Get some memory, which will be used to remember the floor number
  Put the floor number into the memory
  Are we already on the target floor?
    If so, we have nothing to do: finished
    Otherwise:
      Wait until the lift is idle
      Go to the required floor
      Release the memory we used to remember the floor number

The memory leak would occur if the floor number requested is the same floor that the lift is on; the condition for releasing the memory would be skipped. Each time this case occurs, more memory is leaked.
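
As a rough illustration only, the pseudocode above might be sketched in C++ as follows. The function names, the move_lift_to stand-in, and the live_allocations counter are hypothetical; they exist only to make the leak observable and do not describe any real lift controller.

```cpp
#include <cassert>

// Hypothetical counter of outstanding allocations, so the leak can be seen.
int live_allocations = 0;

int* remember_floor(int floor_number) {
    ++live_allocations;
    return new int(floor_number);
}

void forget_floor(int* memory) {
    assert(memory != nullptr);
    delete memory;
    --live_allocations;
}

// Stand-in for "wait until the lift is idle, then go to the required floor".
void move_lift_to(int& current_floor, int target_floor) {
    current_floor = target_floor;
}

// Mirrors the pseudocode: the early return skips the release step.
void on_button_pressed_leaky(int& current_floor, int requested_floor) {
    int* target = remember_floor(requested_floor);
    if (*target == current_floor) {
        return;  // leak: 'target' is never released on this path
    }
    move_lift_to(current_floor, *target);
    forget_floor(target);
}

// One possible fix: release the memory on every path.
void on_button_pressed_fixed(int& current_floor, int requested_floor) {
    int* target = remember_floor(requested_floor);
    if (*target != current_floor) {
        move_lift_to(current_floor, *target);
    }
    forget_floor(target);
}
```

Calling on_button_pressed_leaky with the floor the lift is already on leaves live_allocations one higher each time, whereas the fixed variant returns it to zero on both paths.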

Cases like this wouldn't usually have any immediate effects. People do not often press the button for the floor they are already on, and in any case, the lift might have enough spare memory that this could happen hundreds or thousands of times. However, the lift will eventually run out of memory. This could take months or years, so it might not be discovered by thorough testing.

The consequences would be unpleasant; at the very least, the lift would stop responding to requests to move to another floor. If other parts of the program need memory (a part assigned to open and close the door, for example), then someone may be trapped inside, since the software cannot open the door.

The memory leak lasts until the system is reset. For example, if the lift's power were turned off, the program would stop running. When power was turned on again, the program would restart and all the memory would be available again, but the slow leak would begin again along with the program and would eventually compromise the correct running of the system.

Programming issues

Memory leaks are a common error in programming, especially when using languages that have no built-in automatic garbage collection, such as C and C++. Typically, a memory leak occurs because dynamically allocated memory has become unreachable. The prevalence of memory leak bugs has led to the development of a number of debugging tools to detect unreachable memory. IBM Rational Purify, BoundsChecker, Valgrind, Insure++ and memwatch are some of the more popular memory debuggers for C and C++ programs. "Conservative" garbage collection capabilities can be added to any programming language that lacks it as a built-in feature, and libraries for doing this are available for C and C++ programs. A conservative collector finds and reclaims most, but not all, unreachable memory.

Although the memory manager can recover unreachable memory, it cannot free memory that is still reachable and therefore potentially still useful. Modern memory managers therefore provide techniques for programmers to semantically mark memory with varying levels of usefulness, which correspond to varying levels of reachability. The memory manager does not free an object that is strongly reachable. An object is strongly reachable if it is reachable either directly by a strong reference or indirectly by a chain of strong references. (A strong reference is a reference that, unlike a weak reference, prevents an object from being garbage collected.) To prevent an object from staying strongly reachable after it is no longer needed, the developer is responsible for cleaning up references after use, typically by setting the reference to null once it is no longer needed and, if necessary, by deregistering any event listeners that maintain strong references to the object.
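
C++ has no garbage collector, but std::shared_ptr and std::weak_ptr behave roughly like the strong and weak references described above, so they can serve as a loose analogy. The sketch below assumes a hypothetical EventSource that keeps strong references to its listeners; the Listener type and its alive counter are also invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical listener; the counter makes object lifetime observable.
struct Listener {
    static int alive;
    Listener() { ++alive; }
    ~Listener() { --alive; }
    Listener(const Listener&) = delete;
    Listener& operator=(const Listener&) = delete;
};
int Listener::alive = 0;

// Hypothetical event source that holds strong references to its listeners.
class EventSource {
public:
    void add(std::shared_ptr<Listener> listener) {
        assert(listener != nullptr);
        listeners_.push_back(std::move(listener));
    }

    // Deregistering drops the strong reference held by the event source.
    void remove(const std::shared_ptr<Listener>& listener) {
        listeners_.erase(
            std::remove(listeners_.begin(), listeners_.end(), listener),
            listeners_.end());
    }

private:
    std::vector<std::shared_ptr<Listener>> listeners_;
};
```

In this sketch, resetting the caller's own shared_ptr after add does not destroy the listener, because the event source still holds a strong reference; the object goes away only once remove has been called as well. A std::weak_ptr to the same listener never keeps it alive and reports expired() once the last strong reference is gone.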

In general, automatic memory management is more robust and convenient for developers, as they don't need to implement freeing routines or worry about the sequence in which cleanup is performed or be concerned about whether or not an object is still referenced. It is easier for a programmer to know when a reference is no longer needed than to know when an object is no longer referenced. However, automatic memory management can impose a performance overhead, and it does not eliminate all of the programming errors that cause memory leaks.

RAII

RAII, short for Resource Acquisition Is Initialization, is an approach to the problem commonly taken in C++, D, and Ada. It involves associating scoped objects with the acquired resources, and automatically releasing the resources once the objects are out of scope. Unlike garbage collection, RAII has the advantage of knowing when objects exist and when they do not. Compare the following C and C++ examples:


/* C version */
#include <stdlib.h>

void do_some_work(int *array); /* assumed to be defined elsewhere */

void f(int n)
{
    int *array = calloc(n, sizeof(int));
    do_some_work(array);
    free(array);
}



// C++ version
#include <vector>

void do_some_work(std::vector<int>& array); // assumed to be defined elsewhere

void f(int n)
{
    std::vector<int> array(n);
    do_some_work(array);
}


The C version, as implemented in the example, requires explicit deallocation; the array is dynamically allocated (from the heap in most C implementations), and continues to exist until explicitly freed.

The C++ version requires no explicit deallocation; it will always occur automatically as soon as the object array goes out of scope, including if an exception is thrown. This avoids some of the overhead of garbage collection schemes. And because object destructors can free resources other than memory, RAII helps to prevent the leaking of input and output resources accessed through a handle, which mark-and-sweep garbage collection does not handle gracefully. These include open files, open windows, user notifications, objects in a graphics drawing library, thread synchronisation primitives such as critical sections, network connections, and connections to the Windows Registry or another database.
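
A minimal sketch of RAII applied to a non-memory resource might look like the following. The acquire_handle and release_handle functions and the open_handles counter are hypothetical stand-ins for a real handle API (files, sockets, registry keys, and so on), not calls from any actual library.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical handle API standing in for files, sockets, or registry keys.
int open_handles = 0;
int acquire_handle() { return ++open_handles; }
void release_handle(int /*id*/) { --open_handles; }

// RAII wrapper: acquires in the constructor, releases in the destructor.
class ScopedHandle {
public:
    ScopedHandle() : id_(acquire_handle()) {}
    ~ScopedHandle() { release_handle(id_); }
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
    int id() const { return id_; }

private:
    int id_;
};

// The handle is released on normal return and when an exception propagates.
void use_resource(bool fail) {
    ScopedHandle handle;
    assert(open_handles > 0);
    if (fail) {
        throw std::runtime_error("work failed");
    }
}
```

Whether use_resource returns normally or throws, the destructor of handle runs while the stack unwinds, so open_handles is back to its previous value once the caller has caught the exception.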

However, using RAII correctly is not always easy and has its own pitfalls. For instance, if one is not careful, it is possible to create dangling pointers (or references) by returning data by reference, only to have that data be deleted when its containing object goes out of scope.

D uses a combination of RAII and garbage collection, employing automatic destruction when it is clear that an object cannot be accessed outside its original scope, and garbage collection otherwise.

Reference counting and cyclic references

More modern garbage collection schemes are often based on a notion of reachability: if you don't have a usable reference to the memory in question, it can be collected. Other garbage collection schemes can be based on reference counting, where an object is responsible for keeping track of how many references are pointing to it. If the number goes down to zero, the object is expected to release itself and allow its memory to be reclaimed. The flaw with this model is that it doesn't cope with cyclic references, which is one reason the more costly mark-and-sweep type of system is often accepted instead.

The following Visual Basic code illustrates the canonical reference-counting memory leak:


Dim A, B
Set A = CreateObject("Some.Thing")
Set B = CreateObject("Some.Thing")
' At this point, the two objects each have one reference.

Set A.member = B
Set B.member = A
' Now they each have two references.

Set A = Nothing ' You could still get out of it...

Set B = Nothing ' And now you've got a memory leak!

End

In practice, this trivial example would be spotted straight away and fixed. In most real examples, the cycle of references spans more than two objects, and is more difficult to detect.
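
The same two-object cycle can be reproduced with reference-counted smart pointers in C++. The sketch below is illustrative only: the Node type, its alive counter, and both function names are hypothetical. The first function returns a weak observer purely so that the leaked pair can be inspected afterwards; the second shows one common way to avoid the problem, by making the back-reference a non-counted std::weak_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node; the counter makes object lifetime observable.
struct Node {
    static int alive;
    std::shared_ptr<Node> strong_peer;  // counted reference
    std::weak_ptr<Node> weak_peer;      // non-counted reference
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds A <-> B with counted references in both directions.
// When the locals go out of scope each count drops to 1, never to 0,
// so neither object is destroyed.
std::weak_ptr<Node> make_strong_cycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->strong_peer = b;
    b->strong_peer = a;
    assert(a.use_count() == 2 && b.use_count() == 2);
    return std::weak_ptr<Node>(a);  // observer only; adds no count
}

// Same shape, but the back-reference is weak, so no counted cycle exists.
void make_weak_cycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->strong_peer = b;
    b->weak_peer = a;
}   // a's count reaches 0 here, which in turn releases b
```

After make_weak_cycle returns, Node::alive is zero again. After make_strong_cycle returns, two nodes remain alive even though the program holds no counted reference to either of them; only breaking the cycle by hand (for example, resetting strong_peer through the returned observer) lets them be destroyed.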

A well-known example of this kind of leak came to prominence with the rise of AJAX programming techniques in web browsers. JavaScript code which associated a DOM element with an event handler, and failed to remove the reference before exiting, would leak memory (AJAX web pages keep a given DOM alive for a lot longer than traditional web pages, so this leak was much more apparent).

Effects

If a program has a memory leak and its memory usage is steadily increasing, there will not usually be an immediate symptom. Every physical system has a finite amount of memory, and if the memory leak is not contained (for example, by restarting the leaking program) it will sooner or later start to cause problems.

Most modern consumer desktop operating systems have both main memory, which is physically housed in RAM microchips, and secondary storage such as a hard drive. Memory allocation is dynamic: each process gets as much memory as it requests. Active pages are transferred into main memory for fast access; inactive pages are pushed out to secondary storage to make room, as needed. When a single process starts consuming a large amount of memory, it usually occupies more and more of main memory, pushing other programs out to secondary storage, which usually slows the system significantly. Even if the leaking program is terminated, it may take some time for other programs to swap back into main memory, and for performance to return to normal.

When all the memory on a system is exhausted (whether there is virtual memory or only main memory, such as on an embedded system) any attempt to allocate more memory will fail. This usually causes the program attempting to allocate the memory to terminate itself, or to generate a segmentation fault. Some programs are designed to recover from this situation (possibly by falling back on pre-reserved memory). The first program to experience the out-of-memory may or may not be the program that has the memory leak.
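
One way a program might fall back on pre-reserved memory is sketched below. This is an assumption-laden illustration, not a description of how any particular program does it: emergency_reserve, init_reserve and allocate_with_fallback are hypothetical names, and the only library calls used are the standard std::malloc and std::free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical emergency reserve, set aside while memory is still plentiful.
void* emergency_reserve = nullptr;

bool init_reserve(std::size_t bytes) {
    assert(emergency_reserve == nullptr);
    emergency_reserve = std::malloc(bytes);
    return emergency_reserve != nullptr;
}

// Tries a normal allocation first. If that fails, hands the reserve back to
// the allocator and retries once. Returns nullptr if both attempts fail.
void* allocate_with_fallback(std::size_t bytes) {
    void* block = std::malloc(bytes);
    if (block == nullptr && emergency_reserve != nullptr) {
        std::free(emergency_reserve);
        emergency_reserve = nullptr;
        block = std::malloc(bytes);
    }
    return block;
}
```

A caller that receives nullptr even after the reserve has been released still has to handle the failure, for example by saving its state and shutting down in an orderly way. On systems that delay real allocation until memory is written, std::malloc may also succeed when physical memory is already short, so a scheme like this is only a partial safeguard.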

Some multi-tasking operating systems have special mechanisms to deal with an out-of-memory condition, such as killing processes at random (which may affect "innocent" processes), or killing the largest process in memory (which presumably is the one causing the problem). Some operating systems have a per-process memory limit, to prevent any one program from hogging all of the memory on the system. The disadvantage to this arrangement is that the operating system sometimes must be re-configured to allow proper operation of programs that legitimately require large amounts of memory, such as those dealing with graphics, video, or scientific calculations. If the memory leak is in the kernel, the operating system itself will likely fail. Computers without sophisticated memory management, such as embedded systems, may also completely fail from a persistent memory leak.

Publicly accessible systems such as web servers or routers are prone to denial-of-service attacks if an attacker discovers a sequence of operations which can trigger a leak. Such a sequence is known as an exploit.

A "sawtooth" pattern of memory utilization may be an indicator of a memory leak if the vertical drops coincide with reboots or application restarts. Care should be taken though because garbage collection points could also cause such a pattern.

Other memory consumers

Note that constantly increasing memory usage is not necessarily evidence of a memory leak. Some applications will store ever increasing amounts of information in memory (e.g. as a cache). If the cache can grow so large as to cause problems, this may be a programming or design error, but is not a memory leak as the information remains nominally in use. In other cases, programs may require an unreasonably large amount of memory because the programmer has assumed memory is always sufficient for a particular task; for example, a graphics file processor might start by reading the entire contents of an image file and storing it all into memory, something that is not viable where a very large image exceeds available memory.

To put it another way, a memory leak arises from a particular kind of programming error, and without access to the program code, someone seeing symptoms can only guess that there might be a memory leak. It would be better to use terms such as "constantly increasing memory use" where no such inside knowledge exists.

A simple example in C

The following C program deliberately leaks memory by losing the pointer to the allocated memory. Since the program loops forever calling the memory allocation function, malloc, but without saving the address, it will eventually fail (returning NULL) when no more memory is available to the program. Because the address of each allocation is not stored, it is impossible to free any of the previously allocated blocks. Generally, the operating system delays real memory allocation until something is written into it, so the program ends when it runs out of virtual address space (because of per-process limits, or at 2 to 4 GiB on IA-32, or at a much larger size on x86-64 systems), and there may be no real impact on the rest of the system.

#include <stdlib.h>

int main(void)
{
    /* this is an infinite loop calling the malloc function which
     * allocates the memory but without saving the address of the
     * allocated place */
    while (malloc(50)); /* malloc will return NULL sooner or later, due to lack of memory */
    return 0; /* the operating system itself frees the allocated memory after the program exits */
}

See also

  • Buffer overflow
  • Memory management
  • Memory debugger
  • nmon (short for Nigel's Monitor) is a popular system monitor tool for the AIX and Linux operating systems.



The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 