Valgrind
Valgrind is a GPL-licensed programming tool for memory debugging, memory leak detection, and profiling. The name valgrind comes from the main entrance to Valhalla in Norse mythology.

Valgrind was originally designed to be a free memory debugging tool for Linux on x86, but has since evolved to become a generic framework for creating dynamic analysis tools such as checkers and profilers. It is used by a number of Linux-based projects. Since version 3.5, Valgrind also works on Mac OS X.

The original author of Valgrind is Julian Seward, who in 2006 won a Google-O'Reilly Open Source Award for his work on Valgrind. Several others have also made significant contributions, including Cerion Armour-Brown, Jeremy Fitzhardinge, Tom Hughes, Nicholas Nethercote, Paul Mackerras, Dirk Mueller, Bart Van Assche, Josef Weidendorfer and Robert Walsh.

Overview

Valgrind is in essence a virtual machine using just-in-time (JIT) compilation techniques, including dynamic recompilation. Nothing from the original program ever gets run directly on the host processor. Instead, Valgrind first translates the program into a temporary, simpler form called Intermediate Representation (IR), which is a processor-neutral form based on static single assignment (SSA). After the conversion, a tool (see below) is free to do whatever transformations it would like on the IR, before Valgrind translates the IR back into machine code and lets the host processor run it. Although this design could in principle support translation between architectures (that is, host and target processors of different architectures), Valgrind does not do so: it recompiles binary code to run on host and target (or simulated) CPUs of the same architecture.

A considerable amount of performance is lost in these transformations (and, usually, in the code the tool inserts); code run with Valgrind and the "none" tool (which does nothing to the IR) typically runs at a quarter to a fifth of the speed of the normal program. However, the IR form is much more suitable for instrumentation than the original machine code, which makes it easier to write tools, and for most projects a slowdown of this order is not a big problem during debugging.
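
The translate, instrument, re-translate pipeline can be illustrated with a toy model. The sketch below is not Valgrind's actual IR or tool interface; the `ToyOp` type and the `instrument_loads` function are hypothetical names invented for illustration. It shows a "tool" pass that walks a block of simplified IR statements and inserts a counter statement before every load. A do-nothing tool would simply copy the block unchanged.

```c
#include <stddef.h>

/* Toy statement kinds; a real IR is far richer than this. */
typedef enum { OP_LOAD, OP_STORE, OP_ADD, OP_COUNT_LOAD } ToyOpKind;

typedef struct {
    ToyOpKind kind;
    int operand;   /* meaning depends on kind; unused by OP_COUNT_LOAD */
} ToyOp;

/*
 * Hypothetical instrumentation pass: copy `in` to `out`, inserting an
 * OP_COUNT_LOAD statement before each OP_LOAD. Returns the number of
 * statements written, or 0 if `out` is too small (or `in` is empty).
 */
size_t instrument_loads(const ToyOp *in, size_t n_in,
                        ToyOp *out, size_t out_cap)
{
    size_t n_out = 0;
    for (size_t i = 0; i < n_in; i++) {
        size_t needed = (in[i].kind == OP_LOAD) ? 2 : 1;
        if (n_out + needed > out_cap)
            return 0;
        if (in[i].kind == OP_LOAD) {
            ToyOp counter = { OP_COUNT_LOAD, 0 };
            out[n_out++] = counter;   /* the tool's inserted statement */
        }
        out[n_out++] = in[i];         /* the original statement */
    }
    return n_out;
}
```

The instrumented block is longer than the original, which is one reason instrumented code runs slower even before the inserted statements do any real work.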

Tools

There are multiple tools included with Valgrind (and several external ones). The default (and most used) tool is Memcheck. Memcheck inserts extra instrumentation code around almost all instructions. This code keeps track of the validity of memory (all unallocated memory starts as invalid or "undefined" until it is initialized into a deterministic state, possibly from other memory) and its addressability (whether the memory address in question points to an allocated, non-freed memory block). These two properties are stored in the so-called V bits and A bits, respectively. As data is moved around or manipulated, the instrumentation code keeps track of the A and V bits so that they are always correct at the single-bit level.
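
The bookkeeping described above can be sketched as a toy shadow memory. This is an illustrative model only, not Memcheck's implementation: the arena size and the names `shadow_alloc`, `shadow_store`, `shadow_copy` and `shadow_load_ok` are assumptions made for the example. Each byte of a small arena has one A bit (addressable or not) and eight V bits, one per data bit. In this sketch a set V bit means "defined", which is a choice of the example rather than a statement about Memcheck's encoding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 64

static uint8_t data[ARENA_SIZE];   /* the program's real bytes          */
static uint8_t a_bit[ARENA_SIZE];  /* 1 = byte is addressable           */
static uint8_t v_bits[ARENA_SIZE]; /* bit i set = data bit i is defined */

/* Mark [off, off+len) as allocated: addressable but still undefined.
   Returns 0 if the range does not fit in the arena. */
int shadow_alloc(size_t off, size_t len)
{
    if (off > ARENA_SIZE || len > ARENA_SIZE - off)
        return 0;
    memset(a_bit + off, 1, len);
    memset(v_bits + off, 0x00, len);
    return 1;
}

/* Store a fully defined byte; returns 0 if the address is invalid. */
int shadow_store(size_t off, uint8_t value)
{
    if (off >= ARENA_SIZE || !a_bit[off])
        return 0;
    data[off] = value;
    v_bits[off] = 0xFF;
    return 1;
}

/* Copy one byte; the V bits travel with the data, defined or not. */
int shadow_copy(size_t dst, size_t src)
{
    if (dst >= ARENA_SIZE || src >= ARENA_SIZE || !a_bit[dst] || !a_bit[src])
        return 0;
    data[dst] = data[src];
    v_bits[dst] = v_bits[src];
    return 1;
}

/* A load is "clean" only if the byte is addressable and fully defined. */
int shadow_load_ok(size_t off)
{
    return off < ARENA_SIZE && a_bit[off] && v_bits[off] == 0xFF;
}
```

Copying an undefined byte succeeds here but leaves the destination undefined, which mirrors the idea that definedness follows the data as it moves.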

In addition, Memcheck replaces the standard C memory allocator with its own implementation, which also includes memory guards around all allocated blocks (with the A bits set to "invalid"). This feature enables Memcheck to detect off-by-one errors where a program reads or writes outside an allocated block by a small amount. (Another approach to this problem is to implement bounded pointers in the compiler, which lowers the chance of undetected errors, especially for memory allocated on the stack rather than the heap, but requires recompiling all instrumented binary code.) The problems Memcheck can detect and warn about include the following:
  • Use of uninitialized memory
  • Reading/writing memory after it has been free'd
  • Reading/writing off the end of malloc'd blocks
  • Memory leaks


The price of these checks is lost performance. Programs running under Memcheck usually run five to twenty times slower than they do outside Valgrind, and use a lot more memory (there is a considerable memory penalty per allocation). Thus, few developers run their code under Memcheck (or any other Valgrind tool) all the time. They most commonly use such tools either to track down a specific bug, or to verify that there are no latent bugs (of the kind Memcheck can detect) in the code.
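
The memory guards that Memcheck places around allocated blocks can be modeled in a few lines. The following sketch is a simplification under stated assumptions, not Memcheck's allocator: `toy_malloc`, `toy_free`, `toy_access_ok`, the bump-style pool and the 4-byte guard width are all invented for illustration. Every block is surrounded by unaddressable guard bytes, and freeing a block makes its bytes unaddressable again, so an off-by-one access and a use after free both fail the check.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 256
#define GUARD     4      /* guard bytes on each side of a block */

static uint8_t addressable[POOL_SIZE]; /* toy A bits, one per byte   */
static size_t  next_free = 0;          /* bump pointer; never reused */

/* Returns the payload offset, or (size_t)-1 if the pool is exhausted. */
size_t toy_malloc(size_t len)
{
    if (len > POOL_SIZE || next_free + 2 * GUARD > POOL_SIZE - len)
        return (size_t)-1;
    size_t payload = next_free + GUARD;
    for (size_t i = 0; i < len; i++)
        addressable[payload + i] = 1;   /* guard bytes stay at 0 */
    next_free = payload + len + GUARD;
    return payload;
}

/* Mark a block unaddressable again; later accesses count as errors. */
void toy_free(size_t payload, size_t len)
{
    for (size_t i = 0; i < len; i++)
        addressable[payload + i] = 0;
}

/* Would an instrumented access to this offset be valid? 1 = yes. */
int toy_access_ok(size_t off)
{
    return off < POOL_SIZE && addressable[off];
}
```

Because the toy pool never reuses freed memory, a stale offset keeps failing the check indefinitely. The extra guard bytes per block also hint at why a guarded allocator carries a per-allocation memory cost.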

In addition to Memcheck, Valgrind has several other tools:
  • Addrcheck, a lightweight cousin of Memcheck, running much faster and requiring less memory, but catching fewer types of bugs. This tool has been removed as of version 3.2.0.
  • Massif, a heap profiler.
  • Helgrind and DRD, tools capable of detecting race conditions in multithreaded code.
  • Cachegrind, a cache profiler, and its GUI KCacheGrind.
  • Callgrind, an extension to Cachegrind created by Josef Weidendorfer which produces more information about callgraphs. It was folded into the mainline version of Valgrind in version 3.2.0. KCacheGrind is capable of visualizing output from Callgrind as well as Cachegrind.
  • exp-ptrcheck, an experimental tool that finds bugs similar to those Memcheck detects, but uses a different approach that can find a few additional ones.


There are also several externally developed tools available. One such tool is ThreadSanitizer, a detector of race conditions.

Platforms supported

As of version 3.4.0, Valgrind supports Linux on x86, x86-64 and PowerPC. Support for Mac OS X was added in version 3.5.0. Support for Linux on ARMv7 (used for example in certain smartphones) was added in version 3.6.0. There are unofficial ports to other UNIX-like platforms (like FreeBSD and NetBSD).

Limitations of Memcheck

In addition to the performance penalty, an important limitation of Memcheck is its inability to detect bounds errors in the use of static or stack-allocated data. The following code will pass the Memcheck tool and the experimental Ptrcheck tool in Valgrind without incident, despite the indicated errors:


int Static[5];

int func(void)
{
    int Stack[5];

    Static[5] = 0; /* Error - Static[0] to Static[4] exist, Static[5] is out of bounds */
    Stack[5] = 0;  /* Error - Stack[0] to Stack[4] exist, Stack[5] is out of bounds */

    return 0;
}


The inability to detect errors in access to stack-allocated data is especially noteworthy, since certain types of stack errors make software vulnerable to the classic stack smashing exploit.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 