Call stack
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a call stack is a stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...

 data structure that stores information about the active subroutine
Subroutine
In computer science, a subroutine is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code....

s of a computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack". Although maintenance of the call stack is important for the proper functioning of most software, the details are normally hidden and automatic in high-level programming language
High-level programming language
A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be from the specification of the program, making the process of...

s.

A call stack is used for several related purposes, but the main reason for having one is to keep track of the point to which each active subroutine should return control when it finishes executing. An active subroutine is one that has been called but is yet to complete execution after which control should be handed back to the point of call. Such activations of subroutines may be nested to any level (recursive as a special case), hence the stack structure. If, for example, a subroutine DrawSquare calls a subroutine DrawLine from four different places, DrawLine must know where to return when its execution completes. To accomplish this, the address
Memory address
A digital computer's memory, more specifically main memory, consists of many memory locations, each having a memory address, a number, analogous to a street address, at which computer programs store and retrieve, machine code or data. Most application programs do not directly read and write to...

 following the call instruction, the return address, is pushed onto the call stack with each call.

Description

Since the call stack is organized as a stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...

, the caller pushes the return address onto the stack, and the called subroutine, when it finishes, pops the return address off the call stack and transfers control to that address. If a called subroutine calls on to yet another subroutine, it will push another return address onto the call stack, and so on, with the information stacking up and unstacking as the program dictates. If the pushing consumes all of the space allocated for the call stack, an error called a stack overflow
Stack overflow
In software, a stack overflow occurs when too much memory is used on the call stack. The call stack contains a limited amount of memory, often determined at the start of the program. The size of the call stack depends on many factors, including the programming language, machine architecture,...

 occurs, generally causing the program to crash
Crash (computing)
A crash in computing is a condition where a computer or a program, either an application or part of the operating system, ceases to function properly, often exiting after encountering errors. Often the offending program may appear to freeze or hang until a crash reporting service documents...

. Adding a subroutine's entry to the call stack is sometimes called winding; conversely, removing entries is unwinding.

There is usually exactly one call stack associated with a running program
(or more accurately, with each task
Task (computers)
A task is an execution path through address space. In other words, a set of program instructions that are loaded in memory. The address registers have been loaded with the initial address of the program. At the next clock cycle, the CPU will start execution, in accord with the program. The sense is...

 or thread
Thread (computer science)
In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process...

 of a process
Process (computing)
In computing, a process is an instance of a computer program that is being executed. It contains the program code and its current activity. Depending on the operating system , a process may be made up of multiple threads of execution that execute instructions concurrently.A computer program is a...

), although additional stacks may be created for signal
Signal (computing)
A signal is a limited form of inter-process communication used in Unix, Unix-like, and other POSIX-compliant operating systems. Essentially it is an asynchronous notification sent to a process in order to notify it of an event that occurred. When a signal is sent to a process, the operating system...

 handling or cooperative multitasking (as with setcontext
Setcontext
setcontext is one of a family of C library functions used for context control. The setcontext family allows the implementation in C of advanced control flow patterns such as iterators, fibers, and coroutines...

). Since there is only one in this important context, it can be referred to as the stack (implicitly, "of the task").

In high-level programming language
High-level programming language
A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be from the specification of the program, making the process of...

s, the specifics of the call stack are usually hidden from the programmer. They are given access only to a set of functions, and not the memory on the stack itself. Most assembly language
Assembly language
An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture...

s, on the other hand, require programmers to be involved with manipulating the stack. The actual details of the stack in a programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

 depend upon the compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

, operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

, and the available instruction set
Instruction set
An instruction set, or instruction set architecture , is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O...

.

Functions of the call stack

As noted above, the primary purpose of a call stack is:
Storing the return address : When a subroutine is called, the location (address) of the instruction at which it can later resume needs to be saved somewhere. Using a stack to save the return address has important advantages over alternatives. One is that each task has its own stack, and thus the subroutine can be reentrant, that is, can be active simultaneously for different tasks doing different things. Another benefit is that recursion
Recursion (computer science)
Recursion in computer science is a method where the solution to a problem depends on solutions to smaller instances of the same problem. The approach can be applied to many types of problems, and is one of the central ideas of computer science....

 is automatically supported. When a function calls itself recursively, a return address needs to be stored for each activation of the function so that it can later be used to return from the function activation. This capability is automatic with a stack.

A call stack may serve additional functions, depending on the language, operating system, and machine environment. Among them can be:
Local data storage : A subroutine frequently needs memory space for storing the values of local variable
Local variable
In computer science, a local variable is a variable that is given local scope. Such a variable is accessible only from the function or block in which it is declared. In programming languages with only two levels of visibility, local variables are contrasted with global variables...

s, the variables that are known only within the active subroutine and do not retain values after it returns. It is often convenient to allocate space for this use by simply moving the top of the stack by enough to provide the space. This is very fast compared to heap allocation. Note that each separate activation of a subroutine gets its own separate space in the stack for locals.

Parameter passing : Subroutines often require that values for parameters
Parameter (computer science)
In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments...

 be supplied to them by the code which calls them, and it is not uncommon that space for these parameters may be laid out in the call stack. Generally if there are only a few small parameters, processor register
Processor register
In computer architecture, a processor register is a small amount of storage available as part of a CPU or other digital processor. Such registers are addressed by mechanisms other than main memory and can be accessed more quickly...

s will be used to pass the values, but if there are more parameters than can be handled this way, memory space will be needed. The call stack works well as a place for these parameters, especially since each call to a subroutine, which will have differing values for parameters, will be given separate space on the call stack for those values.

Evaluation stack : Operands for arithmetic or logical operations are most often placed into registers and operated on there. However, in some situations the operands may be stacked up to an arbitrary depth, which means something more than registers must be used (this is the case of register spilling). The stack of such operands, rather like that in an RPN calculator, is called an evaluation stack, and may occupy space in the call stack.

Pointer to current instance : Some object-oriented languages (e.g., C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

), store the this pointer
This (computer science)
In many object-oriented programming languages, this is a keyword that is used in instance methods to refer to the object on which they are working. C++ and languages which derive in style from it generally use this...

 along with function arguments in the call stack when invoking methods. The this pointer points to the object
Object (computer science)
In computer science, an object is any entity that can be manipulated by the commands of a programming language, such as a value, variable, function, or data structure...

 instance
Instance (computer science)
In object-oriented programming an instance is an occurrence or a copy of an object, whether currently executing or not. Instances of a class share the same set of attributes, yet will typically differ in what those attributes contain....

 associated with the method to be invoked.

Enclosing subroutine context : Some programming languages (e.g., Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...

 and Ada
Ada (programming language)
Ada is a structured, statically typed, imperative, wide-spectrum, and object-oriented high-level computer programming language, extended from Pascal and other languages...

) support nested subroutines, allowing an inner routine to access the context of its outer enclosing routine, i.e., the parameters and local variables within the scope of the outer routine. Such static nesting can repeat - a function declared within a function declared within a function... The implementation must provide a means by which a called function at any given static nesting level can reference the enclosing frame at each enclosing nesting level. Commonly this reference is implemented by a pointer to the encompassing frame, called a "downstack link" or "static link", to distinguish it from the "dynamic link" that refers to the immediate caller (which need not be the static parent function). For example, languages often allow inner routines to call themselves recursively, resulting in multiple call frames for the inner routine's invocations, all of whose static links point to the same outer routine context. Instead of a static link, the references to the enclosing static frames may be collected into an array of pointers known as a display which is indexed to locate a desired frame. The Burroughs B6500 had such a display in hardware that supported up to 32 levels of static nesting.

Other return state : Besides the return address, in some environments there may be other machine or software states that need to be restored when a subroutine returns. This might include things like privilege level, exception handling information, arithmetic modes, and so on. If needed, this may be stored in the call stack just as the return address is.

The typical call stack is used for the return address, locals, and parameters (known as a call frame). In some environments there may be more or fewer functions assigned to the call stack. In the Forth programming language, for example, only the return address and local variables are stored on the call stack (which in that environment is named the return stack); parameters are stored on a separate data stack. Most Forths also have a third stack for floating point
Floating point
In computing, floating point describes a method of representing real numbers in a way that can support a wide range of values. Numbers are, in general, represented approximately to a fixed number of significant digits and scaled using an exponent. The base for the scaling is normally 2, 10 or 16...

 parameters.

Structure

A call stack is composed of stack frames (also called activation records or activation frames). These are machine dependent data structures containing subroutine state information. Each stack frame corresponds to a call to a subroutine which has not yet terminated with a return. For example, if a subroutine named DrawLine is currently running, having been called by a subroutine DrawSquare, the top part of the call stack might be laid out like this (where the stack is growing towards the top):

The stack frame at the top of the stack is for the currently executing routine. The stack frame usually includes at least the following items:
  • the arguments (parameter values) passed to the routine (if any)
  • the return address back to the routine's caller (e.g. in the DrawLine stack frame, an address into DrawSquare's code)
  • space for the local variables of the routine (if any)

The stack and frame pointers

The data stored in the stack frame may sometimes be accessed directly via the stack pointer register (which indicates the current top of the stack). However, as the stack pointer is variable during the activation of the routine, memory locations within the stack frame are more typically accessed via a separate register which makes relative addressing simpler and also enables dynamic allocation mechanisms (see below). This register is often termed the frame pointer and is set up at procedure entry to point to a fixed location in the frame structure (such as the return address).

Stack frame sizes

As different routines have different parameters and local data, stack frames have various sizes. Although they may often be fixed across all activations of a particular routine, many modern languages also support dynamic allocations on the stack, which means that the local data area will vary from activation to activation with a size that must be unspecified when the program is compiled
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

. In this case, access via a frame pointer, rather than via the stack pointer, is usually necessary since the offsets from the stack top to values such as the return address would not be known at compile time.

Storing the address to the caller's frame

In most systems a stack frame has a field to contain the previous value of the frame pointer register, the value it had while the caller was executing. For example, in the diagram above, the stack frame of DrawLine would have a memory location holding the frame pointer value that DrawSquare uses. The value is saved upon entry to the subroutine and restored for the return. Having such a field in a known location in the stack frame enables code to access each frame successively underneath the currently executing routine's frame, and also allows the routine to easily restore the frame pointer to the caller's frame, just before it returns.

Lexically nested routines

Programming languages that support nested subroutines
Nested function
In computer programming, a nested function is a function which is lexically encapsulated within another function. It can only be called by the enclosing function or by functions directly or indirectly nested within the same enclosing function. In other words, the scope of the nested function is...

 also have a field in the call frame that points to the stack frame of the latest activation of the procedure that most closely encapsulates the callee, i.e. the immediate scope of the callee. This is called an access link or static link (as it keeps track of static nesting during dynamic and recursive calls) and provides the routine (as well as any other routines it may invoke) access to the local data of its encapsulating routines at every nesting level. Some architectures, compilers, or optimization cases store one link for each enclosing level (not just the immediately enclosing), so that deeply nested routines that access shallow data do not have to traverse several links; this strategy is often called a display. Access link(s) can be optimized away in cases where an inner function does not access any (non constant) local data in the encapsulation—pure functions, i.e routines communicating via argument(s) and return value(s) only, would be an example of this. Some historical computers, such as the Burroughs large systems, had special "display registers" to support nested functions while compilers for most modern machines (such as the ubiquitous x86) simply reserve a few words on the stack for the pointers, as needed.

Overlap

For some purposes, the stack frame of a subroutine and that of its caller can be considered to overlap, the overlap consisting of the area where the parameters are passed from the caller to the callee. In some environments, the caller pushes each argument onto the stack, thus extending its stack frame, then invokes the callee. In other environments, the caller has a preallocated area at the top of its stack frame to hold the arguments it supplies to other subroutines it calls. This area is sometimes termed the outgoing arguments area or callout area. Under this approach, the size of the area is calculated by the compiler to be the largest needed by any called subroutine.

Call site processing

Usually the call stack manipulation needed at the site of a call to a subroutine is minimal (which is good since there can be many call sites for each subroutine to be called). The values for the actual arguments are evaluated at the call site, since they are specific to the particular call, and either pushed onto the stack or placed into registers, as determined by the calling convention
Calling convention
In computer science, a calling convention is a scheme for how subroutines receive parameters from their caller and how they return a result; calling conventions can differ in:...

 being used. The actual call instruction, such as "Branch and Link," is then typically executed to transfer control to the code of the target subroutine.

Caller processing

In the called subroutine, the first code executed is usually termed the subroutine prologue
Function prologue
In assembly language programming, the function prologue is a few lines of code at the beginning of a function, which prepare the stack and registers for use within the function...

, since it does the necessary housekeeping before the code for the statements of the routine is begun.

The prologue will commonly save the return address left in a register by the call instruction by pushing the value onto the call stack. Similarly, the current stack pointer and/or frame pointer values may be pushed. Alternatively, some instruction set architectures automatically provide comparable functionality as part of the action of the call instruction itself, and in such an environment the prologue need not do this.

If frame pointers are being used, the prologue will typically set the new value of the frame pointer register from the stack pointer. Space on the stack for local variables can then be allocated by incrementally changing the stack pointer.

The Forth programming language allows explicit winding of the call stack (called there the "return stack").

Return processing

When a subroutine is ready to return, it executes an epilogue that undoes the steps of the prologue. This will typically restore saved register values (such as the frame pointer value) from the stack frame, pop the entire stack frame off the stack by changing the stack pointer value, and finally branch to the instruction at the return address. Under many calling conventions the items popped off the stack by the epilogue include the original argument values, in which case there usually are no further stack manipulations that need to be done by the caller. With some calling conventions, however, it is the caller's responsibility to remove the arguments from the stack after the return.

Unwinding

Returning from the called function will pop the top frame off of the stack, perhaps leaving a return value.

Some languages (such as Pascal) allow a global goto
Goto
goto is a statement found in many computer programming languages. It is a combination of the English words go and to. It performs a one-way transfer of control to another line of code; in contrast a function call normally returns control...

 statement to transfer control out of a nested function and into a previously invoked outer function. This operation requires the stack to be unwound, removing as many stack frames as necessary to restore the proper context to transfer control to the target statement within the enclosing outer function. Such transfers of control are generally used only for error handling.

Other languages (such as Object Pascal
Object Pascal
Object Pascal refers to a branch of object-oriented derivatives of Pascal, mostly known as the primary programming language of Embarcadero Delphi.-Early history at Apple:...

) provide exception handling
Exception handling
Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of exceptions, special conditions that change the normal flow of program execution....

, which also requires unwinding of the stack. The stack frame of a function contains one or more entries specifying exception handlers. When an exception is thrown, the stack is unwound until an exception handler is found that is prepared to handle (catch) the exception. Common Lisp
Common Lisp
Common Lisp, commonly abbreviated CL, is a dialect of the Lisp programming language, published in ANSI standard document ANSI INCITS 226-1994 , . From the ANSI Common Lisp standard the Common Lisp HyperSpec has been derived for use with web browsers...

 allows control of what happens when the stack is unwound by using the unwind-protect special operator.

When applying a continuation
Continuation
In computer science and programming, a continuation is an abstract representation of the control state of a computer program. A continuation reifies the program control state, i.e...

, the stack is (logically) unwound and then rewound with the stack of the continuation. This is not the only way to implement continuations; for example, using multiple, explicit stacks, application of a continuation can simply activate its stack and wind a value to be passed. The Scheme programming language allows arbitrary thunk
Thunk
Thunk may refer to:* Thunk , a piece of code to perform a delayed computation * Thunk : a feature of some virtual function table implementations...

s to be executed in specified points on "unwinding" or "rewinding" of the control stack when a continuation is invoked.

Call stacks and software testing

A reported technique uses call stacks in a very different way than others discussed on this page. It uses call stacks for test suite reduction. Briefly, test suite reduction seeks to reduce the number of test cases in a test suite while retaining a high percentage of the original suite’s fault detection effectiveness. Two test cases are considered to be equivalent if they generate the same set of call stacks during execution. See for more details.

Performance analysis

Taking random-time samples of the call stack can be very useful in optimizing performance of programs. The reason is if a subroutine call instruction appears on the call stack for a certain fraction of execution time, its possible removal would save that much time. See Performance analysis
Performance analysis
In software engineering, profiling is a form of dynamic program analysis that measures, for example, the usage of memory, the usage of particular instructions, or frequency and duration of function calls...

 and Deep sampling
Deep sampling
Deep sampling is a variation of statistical sampling in which precision is sacrificed for insight. Small numbers of samples are taken, with each sample containing much information. The samples are taken approximately uniformly over the resource of interest, such as time or space...

.

Security

In a language with free pointers and/or non-checked array writes (such as C), the mixing of control flow data affecting the execution of code (return addresses, saved frame pointers) and simple program data (parameters, return values) in a call stack is a security risk, possibly exploit
Exploit (computer security)
An exploit is a piece of software, a chunk of data, or sequence of commands that takes advantage of a bug, glitch or vulnerability in order to cause unintended or unanticipated behavior to occur on computer software, hardware, or something electronic...

able through buffer overflow
Buffer overflow
In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....

s.

See also

  • Dynamic memory allocation
  • Automatic memory allocation
    Automatic memory allocation
    In computer programming, an automatic variable is a lexically-scoped variable which is allocated and de-allocated automatically when program flow enters and leaves the variable's scope...

  • Calling convention
    Calling convention
    In computer science, a calling convention is a scheme for how subroutines receive parameters from their caller and how they return a result; calling conventions can differ in:...

  • Stack buffer overflow
    Stack buffer overflow
    In software, a stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure; usually a fixed length buffer....

  • Segmented stack

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK