Valgrind is a GPL-licensed programming tool for memory debugging, memory leak detection, and profiling. The name Valgrind comes from the main entrance to Valhalla in Norse mythology.
Valgrind was originally designed to be a free memory debugging tool for Linux on x86, but has since evolved into a generic framework for creating dynamic analysis tools such as checkers and profilers. It is used by a number of Linux-based projects. Since version 3.5, Valgrind also works on Mac OS X.
The original author of Valgrind is Julian Seward, who in 2006 won a Google-O'Reilly Open Source Award for his work on Valgrind. Several others have also made significant contributions, including Cerion Armour-Brown, Jeremy Fitzhardinge, Tom Hughes, Nicholas Nethercote, Paul Mackerras, Dirk Mueller, Bart Van Assche, Josef Weidendorfer and Robert Walsh.
Overview
Valgrind is in essence a virtual machine using just-in-time (JIT) compilation techniques, including dynamic recompilation. Nothing from the original program ever gets run directly on the host processor. Instead, Valgrind first translates the program into a temporary, simpler form called Intermediate Representation (IR), which is a processor-neutral, SSA-based form. After the conversion, a tool (see below) is free to do whatever transformations it would like on the IR, before Valgrind translates the IR back into machine code and lets the host processor run it. Although this design could support dynamic translation (where the host and target processors are of different architectures), Valgrind does not do so: it recompiles binary code to run on host and target (or simulated) CPUs of the same architecture.
A considerable amount of performance is lost in these transformations (and, usually, in the code the tool inserts). Code run with Valgrind and the "none" tool (which does nothing to the IR) typically runs at one quarter to one fifth of the speed of the normal program. However, the IR form is much more suitable for instrumentation than the original machine code, which makes it easier to write tools, and for most projects a slowdown of this order is not a big problem during debugging.
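The translate, instrument, and run cycle described above can be illustrated with a toy model. The sketch below is not Valgrind's real IR or API: the opcodes, the `Insn` type, and the `instrument` and `run` functions are all hypothetical names invented for this example. It only shows the general idea of a tool rewriting an intermediate form (here, inserting a counter before every memory access) before the code is executed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR for illustration only; these opcodes are hypothetical and
   far simpler than Valgrind's actual intermediate representation. */
typedef enum { OP_LOAD, OP_STORE, OP_ADD, OP_COUNT } Opcode;

typedef struct {
    Opcode op;
    int    arg;   /* memory slot for LOAD/STORE, immediate for ADD */
} Insn;

/* "Tool" pass: copy `in` to `out`, inserting an OP_COUNT before every
   memory access. Returns the number of instructions written. */
size_t instrument(const Insn *in, size_t n, Insn *out, size_t out_cap)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].op == OP_LOAD || in[i].op == OP_STORE) {
            assert(w < out_cap);
            out[w++] = (Insn){ OP_COUNT, 0 };
        }
        assert(w < out_cap);
        out[w++] = in[i];
    }
    return w;
}

/* Execute the toy IR on a one-register machine. Returns how many
   times an OP_COUNT instruction was executed. */
int run(const Insn *code, size_t n, int *mem, size_t mem_len)
{
    int acc = 0;        /* the single accumulator register */
    int accesses = 0;   /* updated only by inserted OP_COUNT */
    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_LOAD:
            assert(code[i].arg >= 0 && (size_t)code[i].arg < mem_len);
            acc = mem[code[i].arg];
            break;
        case OP_STORE:
            assert(code[i].arg >= 0 && (size_t)code[i].arg < mem_len);
            mem[code[i].arg] = acc;
            break;
        case OP_ADD:
            acc += code[i].arg;
            break;
        case OP_COUNT:
            accesses++;
            break;
        }
    }
    return accesses;
}
```

Running the uninstrumented program through `run` reports zero accesses, while the output of `instrument` computes the same result and also counts each load and store. The extra instructions are one reason instrumented code runs slower than the original.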
Tools
There are multiple tools included with Valgrind (and several external ones). The default (and most used) tool is Memcheck. Memcheck inserts extra instrumentation code around almost all instructions, which keeps track of the validity (all unallocated memory starts as invalid or "undefined", until it is initialized into a deterministic state, possibly from other memory) and addressability (whether the memory address in question points to an allocated, non-freed memory block) of memory, stored in the so-called V bits and A bits, respectively. As data is moved around or manipulated, the instrumentation code keeps track of the A and V bits so they are always correct on a single-bit level.
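The A and V bits can be pictured as shadow state kept alongside the program's real memory. The following is a simplified sketch under stated assumptions, not Memcheck's implementation: it tracks one A flag and one V flag per byte (Memcheck tracks validity per bit), it uses a hypothetical 64-byte arena, and it reports an undefined value as soon as it is loaded, whereas Memcheck defers the report until the value affects the program's behavior. All type and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 64   /* hypothetical tiny address space */

typedef struct {
    unsigned char data[ARENA_SIZE];
    bool a_bit[ARENA_SIZE];   /* addressable: allocated and not freed */
    bool v_bit[ARENA_SIZE];   /* valid: holds an initialized value */
} Shadowed;

typedef enum { ACCESS_OK, ERR_UNADDRESSABLE, ERR_UNDEFINED } Check;

/* Everything starts unaddressable and undefined. */
void shadow_init(Shadowed *m) { memset(m, 0, sizeof *m); }

/* Mark [off, off+len) as allocated but not yet initialized. */
void shadow_alloc(Shadowed *m, size_t off, size_t len)
{
    assert(off <= ARENA_SIZE && len <= ARENA_SIZE - off);
    for (size_t i = off; i < off + len; i++) {
        m->a_bit[i] = true;
        m->v_bit[i] = false;
    }
}

/* Mark [off, off+len) as freed: no longer addressable. */
void shadow_free(Shadowed *m, size_t off, size_t len)
{
    assert(off <= ARENA_SIZE && len <= ARENA_SIZE - off);
    for (size_t i = off; i < off + len; i++) {
        m->a_bit[i] = false;
        m->v_bit[i] = false;
    }
}

/* A store needs an addressable target and makes the byte valid. */
Check shadow_store(Shadowed *m, size_t off, unsigned char value)
{
    if (off >= ARENA_SIZE || !m->a_bit[off]) return ERR_UNADDRESSABLE;
    m->data[off] = value;
    m->v_bit[off] = true;
    return ACCESS_OK;
}

/* A load checks the A flag first, then the V flag. */
Check shadow_load(const Shadowed *m, size_t off, unsigned char *out)
{
    if (off >= ARENA_SIZE || !m->a_bit[off]) return ERR_UNADDRESSABLE;
    if (!m->v_bit[off]) return ERR_UNDEFINED;
    *out = m->data[off];
    return ACCESS_OK;
}

/* Copying moves the V flag along with the data, so "undefinedness"
   follows a value as it is moved around. */
Check shadow_copy(Shadowed *m, size_t dst, size_t src)
{
    if (dst >= ARENA_SIZE || src >= ARENA_SIZE) return ERR_UNADDRESSABLE;
    if (!m->a_bit[dst] || !m->a_bit[src]) return ERR_UNADDRESSABLE;
    m->data[dst] = m->data[src];
    m->v_bit[dst] = m->v_bit[src];
    return ACCESS_OK;
}
```

In this model, a load from a freshly allocated byte yields `ERR_UNDEFINED`, a load after `shadow_free` yields `ERR_UNADDRESSABLE`, and copying an uninitialized byte leaves the destination uninitialized as well.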
In addition, Memcheck replaces the standard C memory allocator with its own implementation, which also includes memory guards around all allocated blocks (with the A bits set to "invalid"). This feature enables Memcheck to detect off-by-one errors where a program reads or writes outside an allocated block by a small amount. (Other approaches to this problem include implementing bounded pointers in the compiler, which gives a lower chance of undetected errors, especially on memory that is allocated on the stack and not the heap, but requires recompiling all instrumented binary code.) The problems Memcheck can detect and warn about include the following:
- Use of uninitialized memory
- Reading/writing memory after it has been free'd
- Reading/writing off the end of malloc'd blocks
- Memory leaks
The price of this is lost performance. Programs running under Memcheck usually run five to twenty times slower than they do outside Valgrind, and use a lot more memory (there is a considerable memory penalty per allocation). Thus, few developers run their code under Memcheck (or any other Valgrind tool) all the time. They most commonly use such tools either to track down a specific bug, or to verify that there are no latent bugs (of the kind Memcheck can detect) in the code.
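The memory guards that Memcheck's replacement allocator places around each block can be sketched as follows. This is an illustrative model only, assuming a hypothetical fixed-size arena, a simple bump allocator, and a 16-byte guard size. None of these names or sizes are taken from Valgrind. Each payload is surrounded by bytes that remain marked unaddressable, so an access just past either end of a block is flagged instead of silently landing in a neighboring block. The padding also shows where part of the per-allocation memory penalty comes from.

```c
#include <stdbool.h>
#include <stddef.h>

#define ARENA_SIZE 256   /* hypothetical arena size in bytes */
#define REDZONE    16    /* hypothetical guard size in bytes */

typedef struct {
    bool   addressable[ARENA_SIZE];  /* per-byte A flag, false = invalid */
    size_t next_free;                /* bump-allocator cursor */
} Arena;

/* Reserve guard + payload + guard, but mark only the payload as
   addressable. Returns the payload offset, or -1 if out of space. */
long guarded_alloc(Arena *a, size_t len)
{
    size_t need = REDZONE + len + REDZONE;
    if (len > ARENA_SIZE || need > ARENA_SIZE - a->next_free)
        return -1;

    size_t payload = a->next_free + REDZONE;
    for (size_t i = 0; i < len; i++)
        a->addressable[payload + i] = true;

    a->next_free += need;
    return (long)payload;
}

/* The check that instrumentation would perform before each access. */
bool access_ok(const Arena *a, long off)
{
    return off >= 0 && off < ARENA_SIZE && a->addressable[off];
}
```

With a zero-initialized `Arena`, a 5-byte block returned at offset `p` is addressable at `p` through `p + 4`, while `p - 1` and `p + 5` fall inside the guards and fail `access_ok`.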
In addition to Memcheck, Valgrind has several other tools:
- Addrcheck, a lightweight cousin of Memcheck, running much faster and requiring less memory, but catching fewer types of bugs. This tool has been removed as of version 3.2.0.
- Massif, a heap profiler.
- Helgrind and DRD, tools capable of detecting race conditions in multithreaded code.
- Cachegrind, a cache profiler, and its GUI KCacheGrind.
- Callgrind, an extension to Cachegrind created by Josef Weidendorfer which produces more information about callgraphs. It was folded into the mainline version of Valgrind in version 3.2.0. KCacheGrind is capable of visualizing output from Callgrind as well as Cachegrind.
- exp-ptrcheck, an experimental tool that finds bugs similar to those Memcheck finds, but uses a different approach that can catch a few additional ones.
There are also several externally developed tools available. One such tool is ThreadSanitizer, a detector of race conditions.
Platforms supported
As of version 3.4.0, Valgrind supports Linux on x86, x86-64 and PowerPC. Support for Mac OS X was added in version 3.5.0. Support for Linux on ARMv7 (used for example in certain smartphones) was added in version 3.6.0. There are unofficial ports to other UNIX-like platforms (like FreeBSD and NetBSD).
Limitations of Memcheck
In addition to the performance penalty, an important limitation of Memcheck is its inability to detect bounds errors in the use of static or stack-allocated data. The following code will pass the Memcheck tool and the experimental Ptrcheck tool in Valgrind without incident, despite the indicated errors:
int Static[5];

int func(void)
{
    int Stack[5];

    Static[5] = 0;  /* Error - Static[0] to Static[4] exist, Static[5] is out of bounds */
    Stack[5] = 0;   /* Error - Stack[0] to Stack[4] exist, Stack[5] is out of bounds */

    return 0;
}
The inability to detect errors in access of stack-allocated data is especially noteworthy since certain types of stack errors make software vulnerable to the classic stack smashing exploit.
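For contrast, the bounded-pointer approach mentioned earlier does catch this class of error, because the bounds travel with the pointer regardless of where the array lives. The sketch below is a hand-written approximation, not compiler-generated instrumentation: the `BoundedInts` type, the `BOUNDED` macro, and `bounded_store` are hypothetical names used only for illustration, and the check must be called explicitly by the programmer.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer that carries the length of the array it points into. */
typedef struct {
    int   *base;
    size_t len;
} BoundedInts;

/* Build a bounded pointer from a true array (not from a plain pointer,
   for which sizeof would not give the element count). */
#define BOUNDED(arr) ((BoundedInts){ (arr), sizeof (arr) / sizeof (arr)[0] })

/* Store only if the index is inside the array. Returns false on an
   out-of-bounds attempt and leaves memory untouched. */
bool bounded_store(BoundedInts p, size_t index, int value)
{
    if (index >= p.len)
        return false;
    p.base[index] = value;
    return true;
}
```

Given `int Stack[5]`, a call such as `bounded_store(BOUNDED(Stack), 5, 0)` is rejected, whereas the equivalent raw write `Stack[5] = 0` goes unnoticed by Memcheck. The cost, as noted above, is that every access has to be compiled with the check in place.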
External links