Subroutine - AbsoluteAstronomy.com

In computer science, a subroutine (also called procedure, function, routine, method, or subprogram) is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code.

As the name "subprogram" suggests, a subroutine behaves in much the same way as a computer program that is used as one step in a larger program or another subprogram. A subroutine is often coded so that it can be started ("called") several times and/or from several places during a single execution of the program, including from other subroutines, and then branch back (return) to the next instruction after the "call" once the subroutine's task is done.

Subroutines are a powerful programming tool, and the syntax of many programming languages includes support for writing and using them. Judicious use of subroutines (for example, through the structured programming approach) will often substantially reduce the cost of developing and maintaining a large program, while increasing its quality and reliability. Subroutines, often collected into libraries, are an important mechanism for sharing and trading software. The discipline of object-oriented programming is based on objects and methods (which are subroutines attached to these objects or object classes).

In the compilation technique called threaded code, the executable program is basically a sequence of subroutine calls.

Maurice Wilkes, David Wheeler, and Stanley Gill are credited with the invention of this concept, which they referred to as a closed subroutine.

Main concepts

The content of a subroutine is its body, the piece of program code that is executed when the subroutine is called or invoked.

A subroutine may be written so that it expects to obtain one or more data values from the calling program (its parameters or arguments). It may also return a computed value to its caller (its return value), or provide various result values or out(put) parameters. Indeed, a common use of subroutines is to implement mathematical functions, in which the purpose of the subroutine is purely to compute one or more results whose values are entirely determined by the parameters passed to the subroutine. (Examples might include computing the logarithm of a number or the determinant of a matrix.)
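As a minimal sketch of these ideas in Python (the names `log_base` and `determinant_2x2` are illustrative and do not come from the text), the two subroutines below receive their data through parameters and hand back a return value that depends only on those parameters:

```python
import math

def log_base(x, base):
    """Return the logarithm of x to the given base."""
    return math.log(x) / math.log(base)

def determinant_2x2(m):
    """Return the determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c

# The arguments 1000 and 10 are bound to the parameters x and base.
result = log_base(1000, 10)              # approximately 3, up to rounding
det = determinant_2x2([[1, 2], [3, 4]])  # 1*4 - 2*3
```

Because neither subroutine reads or changes anything outside its parameters, calling it again with the same arguments gives the same result.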

However, a subroutine call may also have side effects, such as modifying data structures in the computer's memory, reading from or writing to a peripheral device, creating a file, halting the program or the machine, or even delaying the program's execution for a specified time. A subprogram with side effects may return different results each time it is called, even if it is called with the same arguments. An example is a random number function, available in many languages, that returns a different random-looking number each time it is called. The widespread use of subroutines with side effects is a characteristic of imperative programming languages.
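The contrast can be sketched in Python as follows. The names `square`, `logged_square`, `draw`, and `call_log` are hypothetical, and the seed value is arbitrary; the point is only that the second and third subroutines touch state that lives outside them.

```python
import random

call_log = []            # a data structure outside the subroutines
rng = random.Random(42)  # a generator with hidden internal state

def square(x):
    """No side effects: the result depends only on the argument."""
    return x * x

def logged_square(x):
    """Same result as square, but each call also modifies call_log."""
    call_log.append(x)
    return x * x

def draw():
    """Takes no arguments, yet successive calls return different values,
    because each call advances the generator's internal state."""
    return rng.random()
```

Calling `logged_square(4)` twice returns 16 both times but leaves two entries in `call_log`, whereas two calls to `draw()` return two different numbers.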

A subroutine can be coded so that it may call itself recursively, at one or more places, in order to perform its task. This technique allows direct implementation of functions defined by mathematical induction and recursive divide and conquer algorithms.
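Both uses can be illustrated with a short Python sketch. The choice of factorial and merge sort as examples is an assumption made here for illustration; the text does not name specific algorithms.

```python
def factorial(n):
    """Inductive definition: 0! = 1 and n! = n * (n - 1)! for n > 0."""
    if n == 0:
        return 1
    return n * factorial(n - 1)   # the subroutine calls itself once

def merge_sort(items):
    """Divide and conquer: sort each half recursively, then merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])    # first recursive call
    right = merge_sort(items[mid:])   # second recursive call
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The factorial code mirrors its inductive definition line for line, while merge sort calls itself at two places, once for each half of the input.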

A subroutine whose purpose is to compute a single boolean-valued function (that is, to answer a yes/no question) is called a predicate. In logic programming languages, often all subroutines are called "predicates", since they primarily determine success or failure.
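A predicate in this sense might look like the following Python sketch; `is_leap_year` is a hypothetical example chosen here, using the Gregorian calendar rule.

```python
def is_leap_year(year):
    """Predicate: answers the yes/no question 'is this a leap year?'."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Predicates are commonly used to select items from a collection.
leap_years = [y for y in (1900, 2000, 2023, 2024) if is_leap_year(y)]
```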

Language support

High-level programming languages usually include specific constructs for:

delimiting the part of the program (body) that comprises the subroutine,
assigning a name to the subroutine,
specifying the names and/or types of its parameters and/or return values,
providing a private naming scope for its temporary variables,
identifying variables outside the subroutine that are accessible within it,
calling the subroutine,
providing values to its parameters,
specifying the return values from within its body,
returning to the calling program,
disposing of the values returned by a call,
handling any exceptional conditions encountered during the call,
packaging subroutines into a module, library, object, class, etc.

Some programming languages, such as Visual Basic .NET, Pascal, Fortran, and Ada, distinguish between "functions" or "function subprograms", which provide an explicit return value to the calling program, and "subroutines" or "procedures", which do not. In those languages, function calls are normally embedded in expressions (e.g., a sqrt function may be called as y = z + sqrt(x)), whereas procedure calls behave syntactically as statements (e.g., a print procedure may be called as if x > 0 then print(x)). Other languages, such as C and Lisp, do not make this distinction, and treat those terms as synonymous.
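Python belongs to the second group: it has a single `def` construct for both kinds of subroutine. The two calling styles can still be sketched in it, as below; `hypotenuse` and `report` are hypothetical names, and `report` appends to a list rather than printing so that its effect is easy to inspect.

```python
import math

def hypotenuse(a, b):
    """Function style: returns a value, so a call can sit inside an expression."""
    return math.sqrt(a * a + b * b)

def report(x, out):
    """Procedure style: called for its effect on out; it returns nothing useful."""
    out.append(f"value: {x}")

lines = []
y = 1 + hypotenuse(3, 4)   # call embedded in an expression
if y > 0:
    report(y, lines)       # call used as a statement
```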

In strictly functional programming languages such as Haskell, subprograms can have no side effects, and will always return the same result if repeatedly called with the same arguments. Such languages typically only support functions, since subroutines that do not return a value have no use unless they can cause a side effect.

In programming languages such as C, C++, and C#, subroutines may also simply be called "functions", not to be confused with mathematical functions or functional programming, which are different concepts.

A language's compiler will usually translate procedure calls and returns into machine instructions according to a well-defined calling convention, so that subroutines can be compiled separately from the programs that call them. The instruction sequences corresponding to call and return statements are called the procedure's prologue and epilogue.


Advantages

The advantages of breaking a program into subroutines include:

decomposing a complex programming task into simpler steps: this is one of the two main tools of structured programming, along with data structures,
reducing the duplication of code within a program,
enabling the reuse of code across multiple programs,
dividing a large programming task among various programmers, or various stages of a project,
hiding implementation details from users of the subroutine,
improving traceability: most languages offer ways to obtain the call trace, which includes the names of the subroutines involved and perhaps further information such as file names and line numbers. If the code were not decomposed into subroutines, debugging would be severely impaired.
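The traceability point can be sketched in Python with the standard `traceback` module. The subroutine names `load_config` and `parse_line` are hypothetical; only `traceback.extract_stack` is a real library call.

```python
import traceback

def current_call_chain():
    """Return the names of the currently active subroutines, outermost first."""
    stack = traceback.extract_stack()
    # The last entry describes this helper itself, so it is dropped.
    return [frame.name for frame in stack[:-1]]

def load_config():
    return parse_line()

def parse_line():
    return current_call_chain()

chain = load_config()
# The last two names in chain are "load_config" and "parse_line".
```

Each entry returned by `extract_stack` also carries a file name and line number (`frame.filename`, `frame.lineno`), which is the extra information the list item above refers to.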

Disadvantages

The invocation of a subroutine (rather than using in-line code) imposes some computational overhead in the call mechanism itself.

The subroutine typically requires standard housekeeping code, both at entry to and at exit from the function (the function prologue and epilogue, which usually save the general-purpose registers and the return address as a minimum).

Language support

In the (very) early assemblers, subroutine support was limited. Subroutines were not explicitly separated from each other or from the main program, and indeed the source code of a subroutine could be interspersed with that of other subprograms. Some assemblers would offer predefined macros to generate the call and return sequences. Later assemblers (1960s) had much more sophisticated support for both in-line and separately assembled subroutines that could be linked together.

Self-modifying code

The first use of subprograms was on early computers that were programmed in machine code or assembly language, and did not have a specific call instruction. On those computers, each subroutine call had to be implemented as a sequence of lower level machine instructions that relied on self-modifying code. By replacing the operand of a branch instruction at the end of the procedure's body, execution could then be returned to the proper location (designated by the return address) in the calling program (usually just after the instruction that jumped into the subroutine).
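The mechanism can be sketched with a toy interpreter in Python. The instruction set here (`SETRET`, `JMP`, `PRINT`, `HALT`) is entirely hypothetical and does not model any real early machine; it only shows the caller overwriting the operand of the subroutine's final jump before entering it.

```python
def run(memory, pc=0, max_steps=100):
    """Interpret a tiny made-up instruction set and return what it printed."""
    output = []
    for _ in range(max_steps):
        op, arg = memory[pc]
        if op == "HALT":
            return output
        if op == "PRINT":
            output.append(arg)
            pc += 1
        elif op == "JMP":
            pc = arg
        elif op == "SETRET":
            # Self-modification: rewrite the jump stored at address `target`
            # so that its operand becomes `return_address`.
            target, return_address = arg
            memory[target] = ("JMP", return_address)
            pc += 1
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step limit exceeded")

program = [
    ("SETRET", (8, 2)),              # 0: patch the final jump to return to 2
    ("JMP", 7),                      # 1: enter the subroutine
    ("PRINT", "after first call"),   # 2
    ("SETRET", (8, 5)),              # 3: patch it again, now returning to 5
    ("JMP", 7),                      # 4: enter the subroutine a second time
    ("PRINT", "after second call"),  # 5
    ("HALT", None),                  # 6
    ("PRINT", "in subroutine"),      # 7: subroutine body
    ("JMP", None),                   # 8: operand is filled in by the caller
]

trace = run(program)
```

The same subroutine body at address 7 is entered from two places, and each time it returns to a different location because the caller patched the jump at address 8 beforehand.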

Subroutine libraries

Even with this cumbersome approach, subroutines proved very useful. For one thing, they allowed the same code to be used in many different programs. Moreover, memory was a very scarce resource on early computers, and subroutines allowed significant savings in program size.

In many early computers, the program instructions were entered into memory from a punched paper tape. Each subroutine could then be provided by a separate piece of tape, loaded or spliced before or after the main program; and the same subroutine tape could then be used by many different programs. A similar approach was used in computers whose main input was through punched cards. The name "subroutine library" originally meant a library, in the literal sense, which kept indexed collections of such tapes or card decks for collective use.

Return by indirect jump

To remove the need for self-modifying code, computer designers eventually provided an "indirect jump" instruction, whose operand, instead of being the return address itself, was the location of a variable or processor register containing the return address.

On those computers, instead of modifying the subroutine's return jump, the calling program would store the return address in a variable so that when the subroutine completed, it would execute an indirect jump that would direct execution to the location given by the predefined variable.

Jump to subroutine

Another advance was the "jump to subroutine" instruction, which combined the saving of the return address with the calling jump, thereby reducing overhead significantly.

In the IBM System/360, for example, the branch instructions BAL or BALR, designed for procedure calling, would save the return address in a processor register specified in the instruction. To return, the subroutine had only to execute an indirect branch instruction (BR) through that register. If the subroutine needed that register for some other purpose (such as calling another subroutine), it would save the register's contents to a private memory location or a register stack.

In the HP 2100, the JSB instruction would perform a similar task, except that the return address was stored in the memory location that was the target of the branch. Execution of the procedure would actually begin at the next memory location. In the HP 2100 assembly language, one would write, for example

       ...
       JSB MYSUB    (Calls subroutine MYSUB.)
BB     ...          (Will return here after MYSUB is done.)

to call a subroutine called MYSUB from the main program. The subroutine would be coded as

MYSUB  NOP          (Storage for MYSUB's return address.)
AA     ...          (Start of MYSUB's body.)
       ...
       JMP MYSUB,I  (Returns to the calling program.)

The JSB instruction placed the address of the NEXT instruction (namely, BB) into the location specified as its operand (namely, MYSUB), and then branched to the NEXT location after that (namely, AA = MYSUB + 1). The subroutine could then return to the main program by executing the indirect jump JMP MYSUB,I which branched to the location stored at location MYSUB.
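The behaviour just described can be mimicked with a small Python sketch. It is a simplified model rather than an HP 2100 emulator: the addresses are arbitrary, `LOG` is a made-up pseudo-instruction standing in for real work, and `JMP_I` stands for the indirect form JMP MYSUB,I.

```python
MYSUB = 10  # address of the subroutine's first word (arbitrary for this sketch)

memory = {
    0: ("LOG", "main: before call"),
    1: ("JSB", MYSUB),               # call the subroutine
    2: ("LOG", "main: after call"),  # BB: execution resumes here
    3: ("HLT", None),
    10: ("NOP", None),               # MYSUB: storage for the return address
    11: ("LOG", "MYSUB body"),       # AA: first instruction of the body
    12: ("JMP_I", MYSUB),            # return through the stored address
}

def run(memory, pc=0, max_steps=100):
    """Execute the toy program and return the list of logged messages."""
    log = []
    for _ in range(max_steps):
        op, operand = memory[pc]
        if op == "HLT":
            return log
        if op == "LOG":
            log.append(operand)
            pc += 1
        elif op == "NOP":
            pc += 1
        elif op == "JSB":
            memory[operand] = pc + 1  # store the return address at the target
            pc = operand + 1          # execution begins one word later
        elif op == "JMP_I":
            pc = memory[operand]      # jump to the address held at the operand
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step limit exceeded")

log = run(memory)
```

After the run, location `MYSUB` holds the address 2 (the word labelled BB), which is what the indirect jump at the end of the subroutine reads.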

Compilers for Fortran and other languages could easily make use of these instructions when available. This approach supported multiple levels of calls; however, since the return address, parameters, and return values of a subroutine were assigned fixed memory locations, it did not allow for recursive calls.
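Why fixed locations rule out recursion can be seen in the following Python sketch, which emulates a fixed storage cell with a module-level variable. This is an analogy only: Python itself keeps return addresses on a stack, so just the parameter storage is emulated, and the names `FACT_N` and `factorial_static` are hypothetical.

```python
FACT_N = 0  # the single fixed location reserved for the subroutine's parameter

def factorial_static(n):
    """Factorial written as if its parameter lived at one fixed address."""
    global FACT_N
    FACT_N = n                           # store the parameter in its fixed cell
    if FACT_N == 0:
        return 1
    inner = factorial_static(FACT_N - 1)  # the inner call overwrites FACT_N
    return FACT_N * inner                # FACT_N no longer holds this call's n

wrong = factorial_static(3)   # yields 0 rather than 6
```

By the time the outer call resumes, the innermost call has left 0 in the shared cell, so every pending multiplication uses 0 instead of its own parameter.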

Incidentally, a similar technique was used by Lotus 1-2-3, in the early 1980s, to discover the recalculation dependencies in a spreadsheet. Namely, a location was reserved in each cell to store the "return" address. Since circular references are not allowed for natural recalculation order, this allowed a tree walk without reserving space for a stack in memory, which was very limited on small computers such as the IBM PC.


Call stack

Most modern implementations use a call stack, a special case of the stack data structure, to implement subroutine calls and returns. Each procedure call creates a new entry, called a stack frame, at the top of the stack; when the procedure returns, its stack frame is deleted from the stack, and its space may be used for other procedure calls. Each stack frame contains the private data of the corresponding call, which typically includes the procedure's parameters and internal variables, and the return address.
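The push and pop discipline can be sketched in Python by managing the frames by hand. This is an illustrative model only: each frame here holds just the parameter `n`, and the return address is left out because the two loops play that role. The name `factorial_with_explicit_stack` is hypothetical.

```python
def factorial_with_explicit_stack(n):
    """Compute n! with a hand-managed stack; also report the deepest stack seen."""
    stack = []       # the call stack; its top is the end of the list
    max_depth = 0

    # Call phase: every pending call pushes a frame holding its private data.
    while n > 0:
        stack.append({"n": n})
        max_depth = max(max_depth, len(stack))
        n -= 1

    result = 1       # value returned by the innermost call, factorial(0)

    # Return phase: every return pops its frame, freeing the space for reuse.
    while stack:
        frame = stack.pop()
        result = frame["n"] * result

    return result, max_depth

value, depth = factorial_with_explicit_stack(4)
```

Only the frames of calls that have started and not yet returned are on the stack at any moment, so the space needed follows the depth of the active calls.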

The call sequence can be implemented by a sequence of ordinary instructions (an approach still used in RISC and VLIW architectures), but many traditional machines designed since the late 1960s have included special instructions for that purpose.

The call stack is usually implemented as a contiguous area of memory. It is an arbitrary design choice whether the bottom of the stack is the lowest or highest address within this area, so that the stack may grow forwards or backwards in memory; however, many architectures chose the latter.

Some designs, notably some Forth implementations, used two separate stacks, one mainly for control information (like return addresses and loop counters) and the other for data. The former was, or worked like, a call stack and was only indirectly accessible to the programmer through other language constructs while the latter was more directly accessible.

When stack-based procedure calls were first introduced, an important motivation was to save precious memory. With this scheme, the compiler does not have to reserve separate space in memory for the private data (parameters, return address, and local variables) of each procedure. At any moment, the stack contains only the private data of the calls that are currently active (that is, those that have been called but have not yet returned). Because of the ways in which programs were usually assembled from libraries, it was (and still is) not uncommon to find programs that include thousands of subroutines, of which only a handful are active at any given moment. For such programs, the call stack mechanism could save significant amounts of memory. Indeed, the call stack mechanism can be viewed as the earliest and simplest method for automatic memory management.

Another advantage of the call stack method is that it allows recursive subroutine calls, since each nested call to the same procedure gets a separate instance of its private data.
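A minimal C sketch of this point follows; the function name sum_to is made up for the example. Each nested call has its own n and its own rest, so the value a suspended call is holding is not disturbed by the calls nested inside it.

```c
/* Sum the integers 1..n recursively. */
unsigned sum_to(unsigned n)
{
    unsigned rest; /* private to this particular call */

    if (n == 0)
        return 0;
    rest = sum_to(n - 1); /* the nested call gets its own n and rest */
    return n + rest;      /* this call's n is still intact here */
}
```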

Delayed stacking

One disadvantage of the call stack mechanism is the increased cost of a procedure call and its matching return. The extra cost includes incrementing and decrementing the stack pointer (and, in some architectures, checking for stack overflow), and accessing the local variables and parameters by frame-relative addresses, instead of absolute addresses. The cost may be realized in increased execution time, or increased processor complexity, or both.

This overhead is most obvious and objectionable in leaf procedures, which return without making any procedure calls themselves. To reduce that overhead, many modern compilers try to delay the use of a call stack until it is really needed. For example, the call of a procedure P may store the return address and parameters of the called procedure in certain processor registers, and transfer control to the procedure's body by a simple jump. If procedure P returns without making any other call, the call stack is not used at all. If P needs to call another procedure Q, it will then use the call stack to save the contents of any registers (such as the return address) that will be needed after Q returns.

C and C++ examples

In the C and C++ programming languages, subprograms are referred to as "functions" (or "member functions" when associated with a class). These languages use the special keyword void to indicate that a function takes no parameters (especially in C) and/or does not return any value. C and C++ functions can have side effects, including modifying any variables whose addresses are passed as parameters (loosely, "passed by reference"). Examples:

void function1(void) { /* some code */ }

The function does not return a value and has to be called as a stand-alone statement, e.g., function1();

int function2(void)
{
    return 5;
}

This function returns a result (the number 5), and the call can be part of an expression, e.g., x + function2()

char function3(int number)
{
    /* Index 0 is Sunday; the caller must pass a value from 0 to 6. */
    char selection[] = {'S','M','T','W','T','F','S'};
    return selection[number];
}

This function converts a number from 0 to 6 into the initial letter of the corresponding day of the week, namely 0 to 'S', 1 to 'M', ..., 6 to 'S'. The result of calling it might be assigned to a variable, e.g., num_day = function3(number);.

void function4(int *pointer_to_var)
{
    (*pointer_to_var)++;
}

This function does not return a value but modifies the variable whose address is passed as the parameter; it would be called with "function4(&variable_to_increment);".

Visual Basic 6 examples

In the Visual Basic 6 programming language, subprograms are referred to as "functions" or "subs" (or "methods" when associated with a class). Visual Basic 6 uses various terms called "types" to define what is being passed as a parameter. By default, an unspecified variable is registered as a Variant type and can be passed as "ByRef" (default) or "ByVal". Also, when a function or sub is declared, it is given a public, private, or friend designation, which determines whether it can be accessed outside the module and/or project that it was declared in.

By value [ByVal]

A way of passing the value of an argument to a procedure instead of passing the address. This allows the procedure to access a copy of the variable. As a result, the variable's actual value can't be changed by the procedure to which it is passed.

By reference [ByRef]

A way of passing the address of an argument to a procedure instead of passing the value. This allows the procedure to access the actual variable. As a result, the variable's actual value can be changed by the procedure to which it is passed. Unless otherwise specified, arguments are passed by reference.

Public (optional)

Indicates that the Function procedure is accessible to all other procedures in all modules. If used in a module that contains an Option Private, the procedure is not available outside the project.

Private (optional)

Indicates that the Function procedure is accessible only to other procedures in the module where it is declared.

Friend (optional)

Used only in a class module. Indicates that the Function procedure is visible throughout the project, but not visible to a controller of an instance of an object.

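Private Function Function1()
    ' Some Code Here
End Function

Because no return type is declared and no value is assigned, this function returns only an empty Variant; it is called as a stand-alone statement, e.g., Function1. (A procedure that returns nothing is more commonly declared as a Sub.)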

Private Function Function2() As Integer
    Function2 = 5
End Function

This function returns a result (the number 5), and the call can be part of an expression, e.g., x + Function2()

Private Function Function3(ByVal intValue As Integer) As String
    ' Array returns a Variant holding an array, so the variable is a Variant;
    ' a fixed-size String array cannot be assigned to.
    Dim strArray As Variant
    strArray = Array("M", "T", "W", "T", "F", "S", "S")
    Function3 = strArray(intValue)
End Function

This function converts a number between 0 and 6 into the initial letter of the corresponding day of the week, namely 0 to 'M', 1 to 'T', ..., 6 to 'S' (assuming the default Option Base 0). The result of calling it might be assigned to a variable, e.g., num_day = Function3(number).

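Private Function Function4(ByRef intValue As Integer)
    intValue = intValue + 1
End Function

This function does not return a meaningful value but modifies the variable whose address is passed as the parameter; it would be called with "Function4 variable_to_increment" or "Call Function4(variable_to_increment)". When the call is written as a statement with parentheses around the argument but without Call, Visual Basic 6 evaluates the argument as an expression and passes a temporary copy, so the variable is not changed.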

Local variables, recursion and re-entrancy

A subprogram may find it useful to make use of a certain amount of "scratch" space; that is, memory used during the execution of that subprogram to hold intermediate results. Variables stored in this scratch space are referred to as local variables, and the scratch space itself is referred to as an activation record. An activation record typically has a return address that tells it where to pass control back to when the subprogram finishes.

A subprogram may have any number and nature of call sites. If recursion is supported, a subprogram may even call itself, causing its execution to suspend while another nested execution of the same subprogram occurs. Recursion is a useful technique for simplifying some complex algorithms, and breaking down complex problems. Recursive languages generally provide a new copy of local variables on each call. If the programmer desires the value of local variables to stay the same between calls, they can be declared "static" in some languages, or global values or common areas can be used.
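In C, for instance, the difference between an ordinary local variable and a "static" one can be sketched as follows. The function names next_ticket and fresh_counter are invented for the example.

```c
/* The static local is created once and keeps its value between calls. */
unsigned next_ticket(void)
{
    static unsigned issued = 0;
    issued++;
    return issued;
}

/* The ordinary local is a new copy on every call, so it always starts at 0. */
unsigned fresh_counter(void)
{
    unsigned count = 0;
    count++;
    return count;
}
```

Successive calls to next_ticket return 1, 2, 3 and so on, while fresh_counter returns 1 every time. The price of the static variable is that all calls share it, which matters for recursion and for the re-entrancy issues discussed later in this section.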

Early languages like Fortran did not initially support recursion because both variables and the location for the return address were statically allocated. Most computers before the late 1960s, such as the PDP-8, did not have support for hardware stack registers.

Modern languages after ALGOL, such as PL/I and C, almost invariably use a stack, usually supported by most modern computer instruction sets, to provide a fresh activation record for every execution of a subprogram. That way, the nested execution is free to modify its local variables without concern for the effect on other suspended executions in progress. As nested calls accumulate, a call stack structure is formed, consisting of one activation record for each suspended subprogram. In fact, this stack structure is virtually ubiquitous, and so activation records are commonly referred to as stack frames.

Some languages such as Pascal and Ada also support nested subroutines, which are subroutines callable only within the scope of an outer (parent) subroutine. Inner subroutines have access to the local variables of the outer subroutine that called them. This is accomplished by storing extra context information, also known as a display, within the activation record.

If a subprogram can function properly even when called while another execution is already in progress, that subprogram is said to be re-entrant. A recursive subprogram must be re-entrant. Re-entrant subprograms are also useful in multi-threaded situations, since multiple threads can call the same subprogram without fear of interfering with each other. In the IBM CICS transaction processing system, "quasi-reentrant" was a slightly less restrictive, but similar, requirement for application programs that were shared by many threads.
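A common way re-entrancy is lost in C is through static storage shared by all calls. The sketch below contrasts the two cases; label_shared and label_reentrant are hypothetical names, and the "item-%d" format is arbitrary.

```c
#include <stdio.h>
#include <stddef.h>

/* Not re-entrant: every call writes into the same static buffer, so a
   second call (nested, or from another thread) overwrites the first result. */
const char *label_shared(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Re-entrant: the caller supplies the storage, so concurrent or nested
   calls work on separate data. */
char *label_reentrant(int id, char *out, size_t out_size)
{
    snprintf(out, out_size, "item-%d", id);
    return out;
}
```

After calling label_shared(1) and then label_shared(2), both returned pointers refer to the same buffer, which now holds "item-2"; the re-entrant version has no such interference as long as each caller passes its own buffer.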

In a multi-threaded environment, there is generally more than one stack. An environment that fully supports coroutines or lazy evaluation may use data structures other than stacks to store its activation records.

Overloading

In strongly typed languages, it is sometimes desirable to have a number of functions with the same name, but operating on different types of data, or with different parameter profiles. For example, a square root function might be defined to operate on reals, complex values or matrices. The algorithm to be used in each case is different, and the return result may be different. By writing three separate functions with the same name, the programmer has the convenience of not having to remember different names for each type of data. Further, if a subtype can be defined for the reals, to separate positive and negative reals, two functions can be written for the reals, one to return a real when the parameter is positive, and another to return a complex value when the parameter is negative.

In object-oriented programming, when a series of functions with the same name can accept different parameter profiles or parameters of different types, each of the functions is said to be overloaded.
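C itself has no function overloading, but C11's _Generic selection gives a rough analogue that illustrates the idea of one name dispatching on the argument type. The sketch below uses squaring rather than square roots to stay short; square, square_int and square_double are names chosen for the example.

```c
/* Two separate functions, one per argument type. */
static int    square_int(int x)       { return x * x; }
static double square_double(double x) { return x * x; }

/* One name for both: _Generic picks the function from the static type
   of the argument at compile time (requires C11 or later). */
#define square(x) _Generic((x), int: square_int, double: square_double)(x)
```

With these definitions, square(3) calls square_int and square(1.5) calls square_double. Languages with true overloading, such as C++ or Ada, perform this selection in the compiler without the macro.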

As another example, a subroutine might construct an object that will accept directions, and trace its path to these points on screen. There are a plethora of parameters that could be passed in to the constructor (color of the trace, starting x and y co-ordinates, trace speed). If the programmer wanted the constructor to be able to accept only the color parameter, then they could call another constructor that accepts only color, which in turn calls the constructor with all the parameters, passing in a set of "default values" for all the other parameters (x and y would generally be centered on screen or placed at the origin, and the speed would be set to another value of the coder's choosing).

Closures

A closure is a subprogram together with the values of some of its variables captured from the environment in which it was created. Closures were a notable feature of the Lisp programming language, introduced by John McCarthy. Depending on the implementation, closures can serve as a mechanism for side-effects.
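C has no built-in closures, but the definition above (a subprogram plus captured values) can be imitated by hand with a function pointer and a struct holding the captured data. This is only a sketch of the concept; struct adder, adder_call and make_adder are invented names.

```c
/* A hand-built "closure": a code pointer plus the captured environment. */
struct adder {
    int (*call)(const struct adder *self, int x);
    int captured; /* value captured when the closure was created */
};

/* The subprogram part: it reads its captured variable through self. */
static int adder_call(const struct adder *self, int x)
{
    return x + self->captured;
}

/* Create a closure that remembers amount. */
struct adder make_adder(int amount)
{
    struct adder a = { adder_call, amount };
    return a;
}
```

For example, after struct adder add5 = make_adder(5); the call add5.call(&add5, 10) yields 15. Languages with real closures create and pass the environment automatically.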

Conventions

A wide number of conventions for the coding of subroutines have been developed. Pertaining to their naming, many developers have adopted the approach that the name of a subroutine should be a verb when it does a certain task, an adjective when it makes some inquiry, and a noun when it is used to substitute variables.

Some programmers suggest that a subroutine should perform only one task, and if a subroutine does perform more than one task, it should be split up into more subroutines. They argue that subroutines are key components in code maintenance, and their roles in the program must remain distinct.

Proponents of code modularization advocate that each subroutine should have minimal dependency on other pieces of code. For example, the use of global variables is generally deemed unwise by advocates for this perspective, because it adds tight coupling between the subroutine and these global variables. If such coupling is not necessary, their advice is to refactor subroutines to accept passed parameters instead. However, increasing the number of parameters passed to subroutines can affect code readability.

Return codes

Besides its "main" or "normal" effect, a subroutine may need to inform the calling program about "exceptional" conditions that may have occurred during its execution. In some languages and/or programming standards, this is often done through a "return code", an integer value placed by the subroutine in some standard location, which encodes the normal and exceptional conditions.

In the IBM S/360, where a return code was expected from the subroutine, the return value was often designed to be a multiple of 4, so that it could be used as a direct index into a branch table, often located immediately after the call instruction. This avoided extra conditional tests, further improving efficiency. In the System/360 assembly language, one would write, for example:

         BAL   14,SUBRTN01    go to subroutine, using reg 14 as save register (sets reg 15 to 0, 4, 8 as return value)
         B     TABLE(15)      use returned value in reg 15 to index the branch table, branching to the appropriate branch instr.
TABLE    B     OK             return code =00   GOOD                 }
         B     BAD            return code =04   Invalid input        } Branch table
         B     ERROR          return code =08   Unexpected condition }
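The return codes step by 4 because each branch instruction in the table occupies four bytes, so the code in register 15 is already the byte offset of the right entry. A loose C analogue, offered only as a sketch, uses an array of function pointers indexed by the code divided by 4. The subroutine parse_digit, the handler functions and the RC_ constants are all hypothetical.

```c
#include <stddef.h>

/* Return codes in the S/360 style: multiples of 4. */
enum { RC_OK = 0, RC_BAD_INPUT = 4, RC_ERROR = 8 };

/* Hypothetical subroutine: converts a digit character to its value. */
int parse_digit(int ch, int *out)
{
    if (out == NULL)
        return RC_ERROR;      /* unexpected condition */
    if (ch < '0' || ch > '9')
        return RC_BAD_INPUT;  /* invalid input */
    *out = ch - '0';
    return RC_OK;
}

static const char *on_ok(void)    { return "ok"; }
static const char *on_bad(void)   { return "invalid input"; }
static const char *on_error(void) { return "unexpected condition"; }

/* The branch table: one entry per return code, in code order. */
static const char *(*const handlers[])(void) = { on_ok, on_bad, on_error };

/* Dispatch on the return code without any conditional tests.
   Assumes rc is one of 0, 4 or 8. */
const char *dispatch(int rc)
{
    return handlers[rc / 4]();
}
```

A call such as dispatch(parse_digit('7', &d)) runs the handler for RC_OK directly, just as the indexed branch jumps straight to the matching entry of TABLE.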

Optimization of subroutine calls

There is a significant runtime overhead in calling a subroutine, including passing the arguments, branching to the subprogram, and branching back to the caller. The overhead often includes saving and restoring certain processor registers, allocating and reclaiming call frame storage, etc. In some languages, each subroutine call also implies automatic testing of the subroutine's return code, or the handling of exceptions that it may raise. In object-oriented languages, a significant source of overhead is the intensively used dynamic dispatch for method calls.

There are some seemingly obvious optimizations of procedure calls that cannot be applied if the procedures may have side effects. For example, in the expression (f(x)-1)/(f(x)+1), the function f must be called twice, because the two calls may return different results. Moreover, the value of x must be fetched again before the second call, since the first call may have changed it. Determining whether a subprogram may have a side effect is very difficult (indeed, undecidable). So, while those optimizations are safe in purely functional programming languages, compilers for typical imperative programming languages usually have to assume the worst.
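The following C sketch shows why the two calls cannot simply be merged. The function f here is a made-up example whose result depends on a global counter that it also modifies; the two calls are written as separate statements because C does not fix the order in which the operands of a division are evaluated.

```c
int call_count = 0; /* global state that f reads and modifies */

/* A function with a side effect: each call returns a different result. */
int f(int x)
{
    call_count++;
    return x + call_count;
}

/* Faithful evaluation of (f(x)-1)/(f(x)+1): f is called twice. */
double with_two_calls(int x)
{
    double a = f(x);
    double b = f(x);
    return (a - 1) / (b + 1);
}

/* The tempting "optimization": call f once and reuse the result.
   This computes a different value and leaves call_count different. */
double with_one_call(int x)
{
    double a = f(x);
    return (a - 1) / (a + 1);
}
```

Starting from call_count equal to 0, with_two_calls(10) evaluates 10/13 while with_one_call(10) evaluates 10/12, so a compiler that cannot prove f free of side effects must keep both calls.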

Inlining

A technique used to eliminate this overhead is inline expansion, or inlining, of the subprogram's body at each call site (rather than branching to the subroutine and back). Not only does this avoid the call overhead, but it also allows the compiler to optimize the procedure's body more effectively by taking into account the context and arguments at that call. Inlining, however, will usually increase the code size, unless the program contains only a single call to the subroutine, or the subroutine body is less code than the call overhead.
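In C, a small function can be offered to the compiler as an inlining candidate with the inline keyword, as in the sketch below. The keyword is only a hint, and whether the body is actually expanded depends on the compiler and its settings; clamp and to_percent are names made up for the example.

```c
/* A small helper that is a natural candidate for inline expansion. */
static inline int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* If the compiler expands clamp here, it can see that lo and hi are the
   constants 0 and 100 and specialize the comparisons for this call site. */
int to_percent(int v)
{
    return clamp(v, 0, 100);
}
```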

Related terms and clarification

Different programming languages and methodologies possess notions and mechanisms related to subprograms. The name "subroutine" was prevalent in assembly languages and Fortran.

A subroutine is sometimes called a callable unit.

See also