Relocation
"Relocation is the process of assigning load addresses to various parts of [a] program and adjusting the code and data in the program to reflect the assigned addresses."
A linker usually performs relocation in conjunction with symbol resolution, the process of searching files and libraries to replace symbolic references or names of libraries with actual usable addresses in memory before running a program. Although relocation is typically done by the linker at link time, it can also be done at execution time by a relocating loader, or by the running program itself.

Relocation Process

Relocation is typically done in two steps:
  1. Each object file has various sections such as code, data, and .bss. To combine all the objects into a single executable, the linker merges all sections of the same type into a single section of that type. The linker then assigns run-time addresses to each section and each symbol. At this point, the code (functions) and data (global variables) have unique run-time addresses.
  2. Each section refers to one or more symbols. These references are modified so that they point to the correct run-time addresses, based on information stored in a relocation table in the object file.
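
The two steps above can be sketched in a few lines of Python. This is a toy model, not a real linker: the `ObjectFile` layout, the `link` function, the 4-byte little-endian address slots, and the base address `0x1000` are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class ObjectFile:            # hypothetical, simplified object file
    name: str
    sections: dict           # section name -> bytearray of contents
    symbols: dict            # symbol name -> (section name, offset in that section)
    relocations: list        # (section name, offset of 4-byte slot, symbol name)

def link(objects, base=0x1000, order=(".text", ".data")):
    # Step 1a: merge sections of the same type, remembering where each
    # object's contribution starts inside the merged section.
    merged = {sec: bytearray() for sec in order}
    piece_offset = {}
    for obj in objects:
        for sec in order:
            piece_offset[(obj.name, sec)] = len(merged[sec])
            merged[sec] += obj.sections.get(sec, b"")

    # Step 1b: assign run-time addresses to each section and each symbol.
    section_addr, addr = {}, base
    for sec in order:
        section_addr[sec] = addr
        addr += len(merged[sec])
    symbol_addr = {}
    for obj in objects:
        for sym, (sec, off) in obj.symbols.items():
            symbol_addr[sym] = section_addr[sec] + piece_offset[(obj.name, sec)] + off

    # Step 2: patch every recorded reference with the symbol's final address.
    for obj in objects:
        for sec, off, sym in obj.relocations:
            pos = piece_offset[(obj.name, sec)] + off
            merged[sec][pos:pos + 4] = symbol_addr[sym].to_bytes(4, "little")
    return merged, section_addr, symbol_addr

# a.o references "counter", which is defined in b.o's data section.
a = ObjectFile("a.o", {".text": bytearray(8), ".data": bytearray(4)},
               {"main": (".text", 0)}, [(".text", 4, "counter")])
b = ObjectFile("b.o", {".text": bytearray(4), ".data": bytearray(4)},
               {"helper": (".text", 0), "counter": (".data", 0)}, [])
merged, section_addr, symbol_addr = link([a, b])
print({name: hex(value) for name, value in symbol_addr.items()})
```

In this model the merged code section occupies 12 bytes starting at `0x1000`, so the merged data section starts at `0x100c` and `counter`, which sits after the 4 bytes of data from `a.o`, ends up at `0x1010`.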

Relocation Table

The relocation table is a list of pointers created by the compiler or assembler and stored in the object or executable file. Each entry in the table, or "fixup", is a pointer to an address in the object code that must be changed when the loader relocates the program. Fixups are designed to support relocation of the program as a complete unit. In some cases, each fixup in the table is itself relative to a base address of zero, so the fixups themselves must be changed as the loader moves through the table.
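
As a rough illustration of how a loader might walk such a table, the sketch below treats the program as a flat byte image and each fixup as the offset of a 32-bit little-endian address inside it. The function name `relocate_image`, the word size, and the byte order are assumptions chosen for the example, not properties of any particular file format.

```python
import struct

def relocate_image(image: bytearray, fixups, load_base: int, link_base: int = 0):
    """Add the load displacement to every address word named in the fixup table."""
    delta = load_base - link_base
    for offset in fixups:
        (value,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)
    return image

# Word 0 holds an address (it points at offset 8) and is listed in the table.
# Word 1 is ordinary data and must not be touched.
image = bytearray(struct.pack("<III", 0x00000008, 0x12345678, 0xCAFEBABE))
relocate_image(image, fixups=[0], load_base=0x400000)
words = struct.unpack("<III", image)
print([hex(w) for w in words])
```

Only the word listed in the table changes. The loader cannot tell addresses from data by inspection, which is why the table has to be stored in the file.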

In some architectures a fixup that crosses certain boundaries (such as a segment boundary) or that is not aligned on a word boundary is illegal and flagged as an error by the linker.

16-bit Windows

Far pointers (32-bit pointers in segment:offset form, used to address the 20-bit (1 MB) real-mode address space, of which 640 KB is conventionally available to DOS programs) that point to code or data within a DOS executable (EXE) do not have absolute segments, because the actual memory address of the code or data depends on where the program is loaded in memory, and this is not known until the program is loaded.

Instead, segments are stored as relative values in the DOS EXE file. These segments need to be corrected once the executable has been loaded into memory. The EXE loader uses a relocation table to find the segments that need to be adjusted.
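
The adjustment can be sketched as follows, assuming each relocation entry is an (offset, segment) pair that locates a 16-bit segment word relative to the start of the load image, and that the loader adds the segment at which the image was actually loaded. The function name `apply_mz_relocations` and the tiny synthetic image are made up for the example, and header parsing is left out.

```python
import struct

def apply_mz_relocations(image: bytearray, reloc_table, start_segment: int):
    """Add the load segment to every segment word named in the relocation table."""
    for offset, segment in reloc_table:
        pos = segment * 16 + offset            # byte position inside the load image
        (seg_value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (seg_value + start_segment) & 0xFFFF)
    return image

# A far pointer stored at byte 0x10 of the image: offset word 0x0004,
# followed by a *relative* segment word 0x0001 at byte 0x12.
image = bytearray(32)
struct.pack_into("<HH", image, 0x10, 0x0004, 0x0001)

# One relocation entry pointing at the segment word: 0x0001:0x0002 -> byte 0x12.
apply_mz_relocations(image, [(0x0002, 0x0001)], start_segment=0x2000)

off, seg = struct.unpack_from("<HH", image, 0x10)
linear = (seg * 16 + off) & 0xFFFFF            # 20-bit real-mode address
print(hex(seg), hex(off), hex(linear))
```

Only the segment half of the far pointer is patched. The offset half stays valid wherever the program is loaded, because it is relative to its segment.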

32-bit Windows

With 32-bit Windows operating systems it is not mandatory to provide relocation tables for EXE files, since they are the first image loaded into the virtual address space and thus will be loaded at their preferred base address.

For DLLs, and for EXEs that opt into Address Space Layout Randomisation (ASLR), an exploit mitigation technique introduced with Windows Vista, relocation tables once again become mandatory, because the binary may be moved dynamically before being executed, even though an EXE is still the first image loaded into the virtual address space.
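
A simplified sketch of how base relocations might be applied when an image does not load at its preferred base is shown below. It assumes the block layout used by the PE `.reloc` section (a page RVA and block size, followed by 16-bit entries whose top 4 bits are a type and low 12 bits an offset within the page) and handles only the 32-bit `HIGHLOW` type. The function name `apply_base_relocs` and the synthetic image are illustrative assumptions, and `image` is indexed directly by RVA.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped
IMAGE_REL_BASED_HIGHLOW = 3    # add the full 32-bit delta

def apply_base_relocs(image: bytearray, reloc_data: bytes,
                      preferred_base: int, actual_base: int):
    delta = actual_base - preferred_base
    if delta == 0:
        return image                        # loaded at the preferred base: nothing to do
    pos = 0
    while pos + 8 <= len(reloc_data):
        page_rva, block_size = struct.unpack_from("<II", reloc_data, pos)
        if block_size < 8:
            break                           # malformed or terminating block
        for entry_pos in range(pos + 8, pos + block_size, 2):
            (entry,) = struct.unpack_from("<H", reloc_data, entry_pos)
            kind, offset = entry >> 12, entry & 0x0FFF
            if kind == IMAGE_REL_BASED_HIGHLOW:
                target = page_rva + offset
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
            elif kind != IMAGE_REL_BASED_ABSOLUTE:
                raise ValueError(f"unsupported relocation type {kind}")
        pos += block_size
    return image

# An image linked for base 0x00400000 with one absolute pointer at RVA 0x1004.
image = bytearray(0x2000)
struct.pack_into("<I", image, 0x1004, 0x00401010)
# One block for page 0x1000: a HIGHLOW entry at offset 4 plus one padding entry.
reloc_data = struct.pack("<IIHH", 0x1000, 12, (3 << 12) | 0x004, 0)

apply_base_relocs(image, reloc_data, preferred_base=0x00400000, actual_base=0x10000000)
(patched,) = struct.unpack_from("<I", image, 0x1004)
print(hex(patched))
```

The early return for a zero delta corresponds to the case where an EXE loads at its preferred base and no fix-ups are needed.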

64-bit Windows

When running native 64-bit binaries on Windows Vista and above, ASLR is mandatory, and thus relocation sections cannot be omitted by the compiler.

Unix/Linux systems

The ELF format, used for both executables and shared libraries (.so files) on most Unix/Linux systems, defines several types of relocation.
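
To make the idea of relocation types concrete, the sketch below computes the value stored for two x86-64 ELF relocation types, using the conventional notation S (symbol address), A (addend), and P (address of the location being patched). The helper name `relocation_value` is an assumption, and only these two types are modelled.

```python
R_X86_64_64 = 1      # absolute 64-bit address: S + A
R_X86_64_PC32 = 2    # 32-bit PC-relative displacement: S + A - P

def relocation_value(r_type: int, S: int, A: int, P: int) -> int:
    """Return the value to store at the relocated location, truncated to field width."""
    if r_type == R_X86_64_64:
        return (S + A) & 0xFFFFFFFFFFFFFFFF
    if r_type == R_X86_64_PC32:
        return (S + A - P) & 0xFFFFFFFF
    raise NotImplementedError(f"relocation type {r_type} is not modelled here")

absolute = relocation_value(R_X86_64_64, S=0x401000, A=0, P=0x400010)
forward = relocation_value(R_X86_64_PC32, S=0x401000, A=-4, P=0x400010)
backward = relocation_value(R_X86_64_PC32, S=0x400000, A=-4, P=0x400010)
print(hex(absolute), hex(forward), hex(backward))
```

The absolute type depends on where the symbol finally lives, whereas the PC-relative type depends only on the distance between the reference and its target, so it keeps its value if both move together.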

See also

  • Linker (computing)
  • Library (computing)
  • Object file
  • Prebinding
  • Static library
  • Self-relocation
  • Position-independent code

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 