
GNU Compiler Collection

The GNU Compiler Collection (GCC) is a set of programming language compilers produced by the GNU Project. It is free software distributed by the Free Software Foundation under the GNU GPL and GNU LGPL, and is a key component of the GNU toolchain. It is the standard compiler for the free software Unix-like operating systems, and for several proprietary operating systems, notably Apple Mac OS X.

Originally named the GNU C Compiler, because it only handled the C programming language, GCC was later extended to compile C++, Objective-C, Java, Fortran, and Ada, among others.


Overview

GCC was originally written by Richard Stallman in 1987 as the compiler for the GNU Project, in order to have a compiler available that was free software. Its development was closely shepherded by the Free Software Foundation.

In 1997, a group of developers, dissatisfied with the slow pace and closed nature of official GCC development, formed a project called EGCS, which merged several experimental forks into a single project forked from GCC. EGCS development subsequently proved considerably more active than GCC development, and EGCS was eventually "blessed" as the official version of GCC in April 1999.

GCC is now maintained by a varied group of programmers from around the world. It has been ported to more kinds of processors and operating systems than any other compiler.

As well as being the official compiler of the GNU system, including Linux-based variants, GCC has been adopted as the main compiler used to build and develop other operating systems, including the BSDs, Mac OS X, NeXTSTEP, and BeOS.

GCC is often the compiler of choice for developing software that must run on a wide variety of hardware. Differences between native compilers make it difficult to write code that compiles correctly on all of them, and to write build scripts that run on every platform. With GCC, the same parser is used on all platforms, so if the code compiles on one, chances are high that it compiles on all.

Languages

As of version 4.1.1, the standard compiler release includes front ends for:
  • Ada
  • C
  • C++
  • Fortran
  • Java
  • Objective-C
  • Objective-C++


A front end for CHILL was previously included, but has been dropped owing to a lack of maintenance. Before version 4.0, the Fortran front end was G77, which supported only Fortran 77. In newer versions, G77 was dropped in favour of the new GFortran front end, which supports Fortran 95.

Pascal, D, Modula-2, Modula-3, Mercury, VHDL, and PL/I front ends also exist.

Architectures

GCC target processors include:
  • Alpha
  • ARM
  • Atmel AVR
  • Blackfin
  • H8/300
  • System/370, System/390
  • IA-32 and AMD64
  • IA-64, i.e. the "Itanium"
  • Motorola 68000
  • Motorola 88000
  • MIPS
  • PA-RISC
  • PDP-11
  • PowerPC
  • SuperH
  • SPARC
  • VAX
  • Renesas R8C/M16C/M32C families
  • MorphoSys family


Lesser-known target processors supported in the standard release have included A29K, ARC, C4x, CRIS, D30V, DSP16xx, FR-30, FR-V, Intel i960, IP2000, M32R, 68HC11, MCORE, MMIX, MN10200, MN10300, NS32K, ROMP, Stormy16, V850, and Xtensa. Additional processors, such as the D10V, PDP-10, MicroBlaze, MSP430 and Z8000, have been supported by GCC versions maintained separately from the FSF version.

When retargeting GCC to a new platform, bootstrapping is often used.

Structure


GCC's external interface is generally standard for a Unix compiler. Users invoke a driver program named gcc, which interprets command arguments, decides which language compilers to use for each input file, runs the assembler on their output, and then possibly runs the linker to produce a complete executable binary.
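The driver's decision process can be pictured with a small model. The sketch below is illustrative only and is not GCC's driver code: the `plan` function and the suffix table are hypothetical, and the table is deliberately incomplete. It assumes the conventional program names `cc1`, `cc1plus` and `cc1obj` for the C, C++ and Objective-C compilers, with `as` and `ld` standing in for the assembler and linker; the real driver also deals with preprocessing, temporary files and many more suffixes.

```python
import os

# Suffix -> "compiler proper" program. Deliberately incomplete; the real
# driver knows many more suffixes and languages.
COMPILER_FOR_SUFFIX = {
    ".c": "cc1",        # C
    ".cc": "cc1plus",   # C++
    ".cpp": "cc1plus",  # C++
    ".m": "cc1obj",     # Objective-C
}


def plan(files, link=True):
    """Return the (tool, input) steps a simplified driver would run."""
    steps, objects = [], []
    for path in files:
        stem, suffix = os.path.splitext(path)
        if suffix in COMPILER_FOR_SUFFIX:
            # Source file: compile to assembly, then assemble.
            steps.append((COMPILER_FOR_SUFFIX[suffix], path))
            steps.append(("as", stem + ".s"))
            objects.append(stem + ".o")
        elif suffix == ".s":
            # Assembly file: skip the compiler, assemble only.
            steps.append(("as", path))
            objects.append(stem + ".o")
        elif suffix == ".o":
            # Object file: passed straight to the linker.
            objects.append(path)
        else:
            raise ValueError(f"unrecognised input file: {path}")
    if link:
        steps.append(("ld", " ".join(objects)))
    return steps


for tool, argument in plan(["main.c", "util.cc", "start.s", "lib.o"]):
    print(tool, argument)
```

Passing `link=False` models the "possibly" in the description above: the plan then stops after the assembler, leaving object files behind.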

Each of the language compilers is a separate program that takes in source code and produces assembly language. All have a common internal structure: a per-language front end that parses the language and produces an abstract syntax tree, and a back end that converts the trees to GCC's Register Transfer Language (RTL), runs various compiler optimizations, then produces assembly language using architecture-specific pattern matching originally based on an algorithm by Jack Davidson and Chris Fraser.

Nearly all of GCC is written in C, although much of the Ada frontend is written in Ada.

Front ends

Front ends vary internally; what they have in common is that each must produce trees that the back end can handle. The parsers are hand-coded recursive descent parsers.
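As an illustration of the recursive descent technique, here is a minimal sketch for a toy arithmetic grammar. It is not taken from GCC: the grammar, the `Parser` class and the tuple-based tree shape are all assumptions chosen for brevity. Each grammar rule becomes one method, and each method returns a subtree.

```python
import re

# One token: a number, an identifier, or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Toy grammar:
         expr   := term (('+' | '-') term)*
         term   := factor (('*' | '/') factor)*
         factor := NUMBER | NAME | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def parse_expr(self):
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())  # left-associative
        return node

    def parse_term(self):
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    def parse_factor(self):
        token = self.advance()
        if token == "(":
            node = self.parse_expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return ("num", int(token))
        if token[0].isalpha() or token[0] == "_":
            return ("var", token)
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("a + 2 * (b - 3)"))
```

Operator precedence falls out of the call structure: `parse_expr` calls `parse_term`, which calls `parse_factor`, so multiplication binds more tightly than addition without any precedence table.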

Until recently, the tree representation of the program was not fully independent of the processor being targeted. Confusingly, the meaning of a tree was somewhat different for different language front-ends, and front-ends could provide their own tree codes.

In 2005, two new forms of language-independent trees were introduced. These new tree formats are called GENERIC and GIMPLE. Parsing is now done by creating temporary language-dependent trees, and converting them to GENERIC. The so-called "gimplifier" then lowers this more complex form into the simpler SSA-based GIMPLE form, which is the common language for a large number of new powerful language- and architecture-independent global optimizations.
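The idea of lowering a nested tree into a simpler form can be sketched as follows. This is a toy model, not GCC's gimplifier: the tuple-based tree, the `lower` function and the `t1`, `t2`, ... temporary names are assumptions made for illustration. Each nested operation is pulled out into its own statement with at most two operands, and every temporary is assigned exactly once, which is the property that SSA form relies on.

```python
import itertools


def lower(tree):
    """Flatten a nested expression tree into three-address statements.

    Returns (statements, result), where each statement is a tuple
    (destination, operator, left operand, right operand) and result
    names the value of the whole expression.
    """
    statements = []
    counter = itertools.count(1)

    def visit(node):
        kind = node[0]
        if kind == "num":
            return str(node[1])
        if kind == "var":
            return node[1]
        op, left, right = node
        lhs = visit(left)
        rhs = visit(right)
        temp = f"t{next(counter)}"  # fresh name, assigned only here
        statements.append((temp, op, lhs, rhs))
        return temp

    return statements, visit(tree)


# Tree for: a + 2 * (b - 3)
tree = ("+", ("var", "a"), ("*", ("num", 2), ("-", ("var", "b"), ("num", 3))))
statements, result = lower(tree)
for dest, op, lhs, rhs in statements:
    print(f"{dest} = {lhs} {op} {rhs}")
print("result:", result)
```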

Middle end

Optimization on trees does not generally fit into what most compiler developers would consider a front end task, as it is not language dependent and does not involve parsing. GCC developers have given this part of the compiler the somewhat contradictory name the "middle end." These optimizations include dead code elimination, partial redundancy elimination, global value numbering, sparse conditional constant propagation, and scalar replacement of aggregates. Array dependence based optimizations such as automatic vectorization are currently being developed.
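Two of these optimizations can be shown on a toy scale. The sketch below is not GCC code and does not implement the sparse conditional variant named above: it performs plain constant propagation and dead code elimination on straight-line code, assuming every name is assigned only once. The statement format and function names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propagate_constants(code):
    """Replace names with known constant values and fold constant operations."""
    known = {}
    out = []
    for dest, op, a, b in code:
        a = known.get(a, a)
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)
            out.append((dest, "=", known[dest], None))
        else:
            out.append((dest, op, a, b))
    return out


def eliminate_dead_code(code, outputs):
    """Drop statements whose results are never used, walking backwards."""
    live = set(outputs)
    kept = []
    for dest, op, a, b in reversed(code):
        if dest not in live:
            continue
        kept.append((dest, op, a, b))
        live.update(x for x in (a, b) if isinstance(x, str))
    kept.reverse()
    return kept


code = [
    ("t1", "*", 4, 8),        # constant: folds to 32
    ("t2", "+", "t1", "x"),   # becomes 32 + x
    ("t3", "-", "t1", 2),     # folds to 30, but is never used
    ("t4", "*", "t2", "t2"),
]
optimized = eliminate_dead_code(propagate_constants(code), outputs=["t4"])
for statement in optimized:
    print(statement)
```

The two passes reinforce each other: once the constant 32 has been substituted into its uses, the statement that computed `t1` has no remaining readers and is removed as dead.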

Back end

The behavior of the GCC back end is partly specified by preprocessor macros and functions specific to a target architecture, for instance to define the endianness, word size, and calling conventions. The front part of the back end uses these to help decide RTL generation, so although GCC's RTL is nominally processor-independent, the initial sequence of abstract instructions is already adapted to the target.

The exact set of GCC optimizations varies from release to release as it develops, but includes the standard algorithms, such as jump optimization, jump threading, common subexpression elimination, instruction scheduling, and so forth. The RTL optimizations are of less importance with the recent addition of global SSA-based optimizations on GIMPLE trees, as RTL optimizations have a much more limited scope, and have less high-level information.
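Common subexpression elimination, one of the algorithms listed above, can be sketched on straight-line code. The example is a simplified illustration rather than GCC's RTL pass: it assumes each name is assigned only once, works on hypothetical (destination, operator, operand, operand) tuples, and does not exploit commutativity.

```python
def eliminate_common_subexpressions(code):
    """Reuse earlier results for repeated (operator, operand, operand) triples.

    Returns the shortened code and a map from removed names to the
    names that now stand in for them.
    """
    available = {}  # (op, a, b) -> name already holding that value
    replaced = {}   # removed name -> surviving name
    out = []
    for dest, op, a, b in code:
        # Rewrite operands that refer to statements removed earlier.
        a = replaced.get(a, a)
        b = replaced.get(b, b)
        key = (op, a, b)
        if key in available:
            replaced[dest] = available[key]
            continue
        available[key] = dest
        out.append((dest, op, a, b))
    return out, replaced


code = [
    ("t1", "+", "a", "b"),
    ("t2", "*", "t1", "c"),
    ("t3", "+", "a", "b"),    # same as t1
    ("t4", "*", "t3", "c"),   # same as t2 once t3 is rewritten to t1
    ("t5", "-", "t2", "t4"),
]
shorter, replaced = eliminate_common_subexpressions(code)
for statement in shorter:
    print(statement)
```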

A "reloading" phase changes abstract registers into real machine registers, using data collected from the patterns describing the target's instruction set. This is a somewhat complicated phase, because it must account for the vagaries of all of GCC's targets.

The final phase is somewhat anticlimactic, since the patterns to match were generally chosen during reloading, and so the assembly code is simply built by running substitutions of registers and addresses into the strings specifying the instructions.
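The substitution step can be pictured with a few lines of code. This is a sketch under assumptions, not GCC's implementation: it mimics the `%0`, `%1` operand placeholders used in output templates, but the `emit` function, the instruction strings and the register names are made up for illustration, and real templates support more than single-digit operand numbers.

```python
import re


def emit(template, operands):
    """Substitute %0, %1, ... in an instruction template with operand strings."""
    return re.sub(r"%(\d)", lambda m: operands[int(m.group(1))], template)


# Instructions as (template, operands) pairs, with registers and addresses
# already chosen by earlier phases.
selected = [
    ("ldr %0, [%1, #%2]", ["r3", "sp", "8"]),
    ("add %0, %1, %2", ["r0", "r1", "r3"]),
]
assembly = [emit(template, operands) for template, operands in selected]
print("\n".join(assembly))
```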

Debugging GCC programs

The primary tool for debugging programs compiled with GCC is the GNU Debugger (gdb). More specialized tools include Valgrind, for finding memory leaks.

See also

GCC now includes Boehm GC, a conservative garbage collector for C/C++.

  • distcc - software for distributing compiles, designed to work with GCC
  • GCC Introspector
  • LLVM, Low Level Virtual Machine compiler infrastructure
  • MinGW, Minimalist GNU for Windows
  • Cygwin
  • GCC Summit
  • OpenWatcom, another free open-source C++/Fortran compiler
  • Code Sourcery - a company which contributes to GCC

External links

  • - hosted by archiving all gcc mailing lists into a searchable forum.
  • Overview and explanation of gcc's internal structure in Red Hat Magazine

Further reading

  • Arthur Griffith, GCC: The Complete Reference. McGraw-Hill/Osborne. ISBN 0-07-222405-3.