Linker
In computer science, a linker or link editor is a program that takes one or more objects generated by a compiler and combines them into a single executable program.

In IBM mainframe environments such as OS/360, this program is known as a linkage editor.

On Unix variants, the term loader is often used as a synonym for linker. Other terminology has been used as well: on SINTRAN III, for example, the process performed by a linker (assembling object files into a program) was called loading (as in loading executable code onto a file). Because this usage blurs the distinction between the compile-time process and the run-time process, this article uses linking for the former and loading for the latter. In some operating systems, however, the same program handles both linking and loading; see dynamic linking.

Overview

Computer programs typically comprise several parts or modules. These need not all be contained within a single object file, and in that case they refer to each other by means of symbols. Typically, an object file can contain three kinds of symbols:
  • defined symbols, which allow it to be called by other modules,
  • undefined symbols, which refer to definitions provided by other modules, and
  • local symbols, used internally within the object file to facilitate relocation.


When a program comprises multiple object files, the linker combines these files into a unified executable program, resolving the symbols as it goes along.
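The symbol resolution described above can be illustrated with a minimal sketch. The ObjectFile class, its fields, and the resolve function are hypothetical simplifications for illustration; real object formats and linkers carry far more information (addresses, sections, symbol binding rules).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical, simplified stand-in for a real object file."""
    name: str
    defined: set[str] = field(default_factory=set)    # symbols offered to other modules
    undefined: set[str] = field(default_factory=set)  # symbols needed from other modules
    local: set[str] = field(default_factory=set)      # never visible outside this file


def resolve(objects: list[ObjectFile]) -> dict[str, str]:
    """Map each defined symbol to the name of the object file that defines it."""
    table: dict[str, str] = {}
    for obj in objects:
        for sym in sorted(obj.defined):
            if sym in table:
                raise ValueError(f"duplicate symbol {sym!r} in {table[sym]} and {obj.name}")
            table[sym] = obj.name
    # Every undefined symbol must be matched by a definition somewhere.
    for obj in objects:
        for sym in sorted(obj.undefined):
            if sym not in table:
                raise ValueError(f"undefined reference to {sym!r} in {obj.name}")
    return table


main_o = ObjectFile("main.o", defined={"main"}, undefined={"add"}, local={"tmp"})
math_o = ObjectFile("math.o", defined={"add"}, local={"tmp"})
symbols = resolve([main_o, math_o])
print(symbols)
```

Both files use a local symbol named tmp without conflict, because local symbols never enter the shared table. Linking main.o alone would fail with an undefined reference to add.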

Linkers can take objects from a collection called a library. Some linkers do not include the whole library in the output; they include only those of its symbols that are referenced from other object files or libraries. Libraries exist for diverse purposes, and one or more system libraries are usually linked in by default.
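Selective inclusion from a library might look roughly like the following sketch. The data layout (a mapping from member name to its defined and undefined symbols), the function select_members, and the member names are assumptions made for illustration, not the behavior of any particular linker.

```python
from __future__ import annotations


def select_members(
    needed: set[str],
    library: dict[str, tuple[set[str], set[str]]],
) -> list[str]:
    """Return the library members that must be linked in.

    `library` maps a member name to (defined symbols, undefined symbols).
    """
    unresolved = set(needed)
    defined: set[str] = set()
    chosen: list[str] = []
    changed = True
    while changed:
        changed = False
        for member, (defs, undefs) in library.items():
            if member in chosen or not (defs & unresolved):
                continue
            chosen.append(member)
            defined |= defs
            # A member that is pulled in may itself need further symbols.
            unresolved = (unresolved | undefs) - defined
            changed = True
    return chosen


libm = {
    "sqrt.o": ({"sqrt"}, {"fabs"}),
    "fabs.o": ({"fabs"}, set()),
    "sin.o": ({"sin"}, set()),
}
print(select_members({"sqrt"}, libm))
```

In this example a program that references only sqrt pulls in sqrt.o and, indirectly, fabs.o, while sin.o is left out of the output.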

The linker also takes care of arranging the objects in a program's address space. This may involve relocating code that assumes a specific base address to another base. Since a compiler seldom knows where an object will reside, it often assumes a fixed base location (for example, zero). Relocating machine code may involve re-targeting absolute jumps, loads, and stores.
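As a rough illustration of relocation, the sketch below treats machine code as a list of words and assumes the compiler recorded which words hold absolute addresses computed against base zero. The word encoding and the relocate function are hypothetical; real relocation entries also describe how each address is encoded.

```python
from __future__ import annotations


def relocate(code: list[int], relocations: list[int], base: int) -> list[int]:
    """Return a copy of `code` with every listed word rebased from 0 to `base`."""
    patched = list(code)
    for offset in relocations:
        patched[offset] += base
    return patched


# Word 1 holds the target of an absolute jump, word 3 the address of a load.
code = [0x10, 0x0004, 0x20, 0x0006, 0x00, 0x00, 0x2A]
relocations = [1, 3]
print([hex(word) for word in relocate(code, relocations, base=0x4000)])
```

Only the words named in the relocation list change; opcodes and data are copied as they are.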

The executable output by the linker may need another relocation pass when it is finally loaded into memory (just before execution). This pass is usually omitted on hardware offering virtual memory: every program is put into its own address space, so there is no conflict even if all programs load at the same base address. The pass may also be omitted if the executable is a position-independent executable.

Dynamic linking

Many operating system environments allow dynamic linking, that is, postponing the resolution of some undefined symbols until a program is run. The executable code then still contains undefined symbols, plus a list of objects or libraries that will provide definitions for them. Loading the program loads these objects and libraries as well, and performs a final linking step.

This approach offers two advantages:
  • Often-used libraries (for example the standard system libraries) need to be stored in only one location, not duplicated in every single binary.
  • If an error in a library function is corrected by replacing the library, all programs using it dynamically will benefit from the correction after restarting them. Programs that included this function by static linking would have to be re-linked first.


There are also disadvantages:
  • An incompatible updated library will break executables that depended on the behavior of the previous version. On the Windows platform this problem is known as "DLL Hell".
  • A program, together with the libraries it uses, might be certified (e.g. as to correctness, documentation requirements, or performance) as a package, but not if components can be replaced. (This also argues against automatic OS updates in critical systems; in both cases, the OS and libraries form part of a qualified environment.)
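
The load-time resolution described in this section can be sketched as a toy model. The load function, the installed mapping, and the library name libgreet are all hypothetical; a real dynamic loader works on binary files and memory mappings, not on Python dictionaries.

```python
from __future__ import annotations

from typing import Callable


def load(
    undefined: list[str],
    needed: list[str],
    installed: dict[str, dict[str, Callable]],
) -> dict[str, Callable]:
    """Bind each undefined symbol to the first needed library that defines it."""
    bindings: dict[str, Callable] = {}
    for sym in undefined:
        for lib_name in needed:
            lib = installed[lib_name]
            if sym in lib:
                bindings[sym] = lib[sym]
                break
        else:
            raise LookupError(f"symbol {sym!r} not found in {needed}")
    return bindings


# Libraries installed on the system, keyed by hypothetical names.
installed = {"libgreet": {"greet": lambda who: f"hello, {who}"}}

# The executable records its undefined symbols and the libraries it needs.
program = {"undefined": ["greet"], "needed": ["libgreet"]}
bound = load(program["undefined"], program["needed"], installed)
print(bound["greet"]("world"))

# Replacing the library changes behavior on the next load, without relinking.
installed["libgreet"] = {"greet": lambda who: f"Hello, {who}!"}
rebound = load(program["undefined"], program["needed"], installed)
print(rebound["greet"]("world"))
```

The same mechanism accounts for both the benefit and the risk listed above: the program picks up whatever definition the installed library provides at load time, whether or not it is compatible with the one it was built against.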

Relaxation

As the compiler has no information on the layout of objects in the final output, it cannot take advantage of shorter or more efficient instructions that place a requirement on the address of another object. For example, a jump instruction can reference an absolute address or an offset from the current location, and the offset could be expressed with different lengths depending on the distance to the target. By generating the most conservative instruction (usually the largest relative or absolute variant, depending on platform) and adding relaxation hints, it is possible to substitute shorter or more efficient instructions during the final link. This step can be performed only after all input objects have been read and assigned temporary addresses; the relaxation pass subsequently re-assigns addresses, which may in turn allow more relaxations to occur. In general, the substituted sequences are shorter, which allows this process to always converge on the best solution given a fixed order of objects; if this is not the case, relaxations can conflict, and the linker needs to weigh the advantages of either option.
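
The iterative nature of relaxation can be shown with a small sketch. The instruction sizes, the reach of the short jump form, and the relax function are assumptions chosen for illustration; real instruction sets and linkers differ in the details.

```python
from __future__ import annotations

SHORT, LONG, OTHER = 2, 5, 4  # hypothetical instruction sizes in bytes
SHORT_RANGE = 127             # hypothetical reach of the short jump form


def relax(program: list[int | None]) -> list[int]:
    """Return the size of each instruction after relaxation.

    `program[i]` is the index of the jump target, or None for a non-jump.
    """
    # Start with the most conservative encoding for every jump.
    sizes = [OTHER if target is None else LONG for target in program]
    changed = True
    while changed:
        changed = False
        # Assign temporary addresses based on the current sizes.
        addresses = [0]
        for size in sizes:
            addresses.append(addresses[-1] + size)
        for i, target in enumerate(program):
            if target is None or sizes[i] == SHORT:
                continue
            # Offset from the end of the jump, as it would be once shortened.
            shrink = LONG - SHORT
            target_addr = addresses[target] - (shrink if target > i else 0)
            offset = target_addr - (addresses[i] + SHORT)
            if abs(offset) <= SHORT_RANGE:
                sizes[i] = SHORT
                changed = True
    return sizes


# Jump 0 targets instruction 33; jump 1 targets the instruction right after it.
program = [33, 2] + [None] * 32
sizes = relax(program)
print(sizes[:2], sum(sizes))
```

In this example the first jump is out of short range on the first pass and comes into range only after the second jump has been shortened, which is why the pass is repeated until nothing changes. Because shortening an instruction can only reduce distances, the loop terminates.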

See also

  • Dynamic library
  • GNU linker
  • Library
  • Name decoration
  • Object file
  • Relocation
  • Relocation table
  • Prelinking
  • Static library

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.