GNU Compiler Collection

Overview
The GNU Compiler Collection (GCC) is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain. As well as being the official compiler of the unfinished GNU operating system, GCC has been adopted as the standard compiler by most other modern Unix-like computer operating systems, including Linux, the BSD family and Mac OS X. A port to RISC OS has also been developed extensively in recent years. There is also an old (3.0) port of GCC to Plan 9, running under its ANSI/POSIX Environment (APE). GCC is also available for the widely used Microsoft Windows operating systems, and for the ARM processor used by many portable devices.

GCC has been ported to a wide variety of processor architectures, and is widely deployed as a tool in commercial, proprietary and closed source software development environments. GCC is also available for most embedded platforms, for example Symbian (called gcce), AMCC and Freescale Power Architecture-based chips. The compiler can target a wide variety of platforms, including video game consoles such as the PlayStation 2 and Dreamcast. Several companies make a business out of supplying and supporting GCC ports to various platforms, and chip manufacturers today consider a GCC port almost essential to the success of an architecture.

Originally named the GNU C Compiler, because it only handled the C programming language, GCC 1.0 was released in 1987, and the compiler was extended to compile C++ in December of that year. Front ends were later developed for Objective-C, Objective-C++, Fortran, Java, Ada, and Go, among others.

The Free Software Foundation (FSF) distributes GCC under the GNU General Public License (GNU GPL). GCC has played an important role in the growth of free software, as both a tool and an example.

History


Richard Stallman's initial plan was to rewrite an existing compiler from Lawrence Livermore Lab from Pastel to C with some help from Len Tower and others. Stallman wrote a new C front end for the Livermore compiler but then realized that it required megabytes of stack space, an impossibility on a 68000 Unix system with only 64K, and concluded he would have to write a new compiler from scratch. None of the Pastel compiler code ended up in GCC, though Stallman did use the C front end he had written.

GCC was first released March 22, 1987, available by FTP from MIT. Stallman was listed as the author but cited others for their contributions, including Jack Davidson and Christopher Fraser for the idea of using RTL as an intermediate language, Paul Rubin for writing most of the preprocessor and Leonard Tower for "parts of the parser, RTL generator, RTL definitions, and of the Vax machine description."

By 1991, GCC 1.x had reached a point of stability, but architectural limitations prevented many desired improvements, so the FSF started work on GCC 2.x.

As GCC was licensed under the GPL, programmers wanting to work in other directions, particularly those writing interfaces for languages other than C, were free to develop their own fork of the compiler (provided they met the GPL's terms, including its requirements to distribute source code). Multiple forks proved inefficient and unwieldy, however, and the difficulty in getting work accepted by the official GCC project was greatly frustrating for many. The FSF kept such close control on what was added to the official version of GCC 2.x that GCC was used as one example of the "cathedral" development model in Eric S. Raymond's essay The Cathedral and the Bazaar.

With the release of 4.4BSD in 1994, GCC became the default compiler for most BSD systems.

EGCS fork


In 1997, a group of developers formed EGCS (Experimental/Enhanced GNU Compiler System) to merge several experimental forks into a single project. The basis of the merger was a GCC development snapshot taken between the 2.7 and 2.81 releases. Projects merged included g77 (FORTRAN), PGCC (P5 Pentium-optimized GCC), many C++ improvements, and many new architectures and operating system variants.

EGCS development proved considerably more vigorous than GCC development, so much so that the FSF officially halted development on their GCC 2.x compiler, "blessed" EGCS as the official version of GCC and appointed the EGCS project as the GCC maintainers in April 1999. Furthermore, the project explicitly adopted the "bazaar" model over the "cathedral" model. With the release of GCC 2.95 in July 1999, the two projects were once again united.

GCC is now maintained by a varied group of programmers from around the world, under the direction of a steering committee. It has been ported to more kinds of processors and operating systems than any other compiler.

GCC stable release


The current stable version of GCC is 4.6.2, which was released on October 26, 2011.

GCC 4.6 supports many new Objective-C features, such as declared and synthesized properties, dot syntax, fast enumeration, optional protocol methods, method/protocol/class attributes, class extensions and a new GNU Objective-C runtime API. It also supports the Go programming language and includes the libquadmath library, which provides quadruple-precision mathematical functions on targets supporting the __float128 datatype. The library is used to provide the REAL(16) type in GNU Fortran on such targets.

GCC uses many standard tools in its build, including Perl, Flex, Bison, and other common tools. In addition, it currently requires three additional libraries to be present in order to build: GMP, MPC, and MPFR.

The previous major version, 4.5, was initially released on April 14, 2010 (the last minor version is 4.5.3, released on April 29, 2011). It included several minor new features (new targets, new language dialects) and a couple of major new features:
  • Link-time optimization optimizes across object file boundaries to directly improve the linked binary (http://gcc.gnu.org/wiki/LinkTimeOptimization). Link-time optimization relies on an intermediate file containing the serialization of some GIMPLE representation included in the object file. The file is generated alongside the object file during source compilation. Each source compilation generates a separate object file and link-time helper file. When the object files are linked, the compiler is executed again and uses the helper files to optimize code across the separately compiled object files.
  • Plugins can extend the GCC compiler directly (http://gcc.gnu.org/onlinedocs/gccint/Plugins.html). Plugins allow a stock compiler to be tailored to specific needs by external code loaded as plugins. For example, plugins can add, replace, or even remove middle-end passes operating on GIMPLE representations. Several GCC plugins have already been published, notably:
    • TreeHydra to help with Mozilla code development
    • DragonEgg to use the GCC front end with LLVM
    • MELT to enable coding GCC extensions in a Lisp-like domain-specific language providing powerful pattern matching
    • MILEPOST CTuning to use machine learning techniques to tune the compiler.

GCC trunk


The trunk concentrates the major part of the development efforts, where new features are implemented and tested. Eventually, the code from the trunk will become the next major release of GCC, with version 4.7.

Uses


GCC is often chosen for developing software that is required to execute on a wide variety of hardware and/or operating systems. System-specific compilers provided by hardware or OS vendors can differ substantially, complicating both the software's source code and the scripts that invoke the compiler to build it. With GCC, most of the compiler is the same on every platform, so only code that explicitly uses platform-specific features must be rewritten for each system.

Languages


The standard compiler release 4.6 includes front ends for C (gcc), C++ (g++), Objective-C (gobjc), Fortran (gfortran), Java (gcj), Ada (GNAT), and Go (gccgo). Also available, but not in the standard release, are Pascal (gpc), Mercury, Modula-2, Modula-3, PL/I, D (gdc), and VHDL (ghdl). A popular parallel language extension, OpenMP, is also supported.

The Fortran front end was g77 before version 4.0, which only supported FORTRAN 77. In newer versions, g77 was dropped in favor of the new gfortran front end, which supports Fortran 95 and parts of Fortran 2003 as well. As the later Fortran standards incorporate the F77 standard, standards-compliant F77 code is also standards-compliant F90/95 code, and so can be compiled without trouble in gfortran. A front end for CHILL was dropped due to a lack of maintenance.

A few experimental branches exist to support additional languages, such as the GCC UPC compiler for Unified Parallel C.

Architectures


GCC target processor families as of version 4.3 include:

  • Alpha
  • ARM
  • Atmel AVR
  • Blackfin
  • H8/300
  • HC12
  • IA-32 (x86)
  • IA-64
  • MIPS
  • Motorola 68000
  • PA-RISC
  • PDP-11
  • PowerPC
  • R8C/M16C/M32C
  • SPARC
  • SPU
  • SuperH
  • System/390/zSeries
  • VAX
  • x86-64


Lesser-known target processors supported in the standard release have included:

  • 68HC11
  • A29K
  • ARC
  • AVR32
  • D30V
  • DSP16xx
  • ETRAX CRIS
  • FR-30
  • FR-V
  • Intel i960
  • IP2000
  • M32R
  • MCORE
  • MIL-STD-1750A
  • MMIX
  • MN10200
  • MN10300
  • Motorola 88000
  • NS32K
  • ROMP
  • Stormy16
  • V850
  • Xtensa

Additional processors have been supported by GCC versions maintained separately from the FSF version:

  • Cortus APS3
  • D10V
  • EISC
  • eSi-RISC
  • Hexagon
  • LatticeMico32
  • LatticeMico8
  • MeP
  • MicroBlaze
  • Motorola 6809
  • MSP430
  • NEC SX architecture
  • Nios II and Nios
  • OpenRISC 1200
  • PDP-10
  • PIC24/dsPIC
  • System/370
  • TIGCC (m68k variant)
  • Z8000


The gcj Java compiler can target either a native machine language architecture or the Java Virtual Machine's Java bytecode. When retargeting GCC to a new platform, bootstrapping is often used.

Structure


GCC's external interface is generally standard for a UNIX compiler. Users invoke a driver program named gcc, which interprets command arguments, decides which language compilers to use for each input file, runs the assembler on their output, and then possibly runs the linker to produce a complete executable binary.
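The driver's stages can be observed on a small translation unit. The following is a minimal sketch: the file name stages.c and the function square are hypothetical, while -E, -S, -c and -v are standard gcc options. It assumes a 32-bit int.

```c
/* stages.c (hypothetical file name): a tiny translation unit for
 * watching the gcc driver work one stage at a time.
 *
 *   gcc -E stages.c -o stages.i   preprocess only
 *   gcc -S stages.c -o stages.s   stop after the language compiler (assembly output)
 *   gcc -c stages.c -o stages.o   also run the assembler (object file output)
 *   gcc -v -c stages.c            print the sub-programs the driver runs
 *
 * There is no main() here, so the final link step is deliberately left out;
 * -c stops the driver before the linker would be invoked.
 */
#include <assert.h>

int square(int x) {
    /* Assumes a 32-bit int: keeps x * x within range. */
    assert(x > -46341 && x < 46341);
    return x * x;
}
```

Running the -v variant should list the language compiler and the assembler as separate commands, which matches the division of labour described above.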

Each of the language compilers is a separate program that reads source code and outputs assembly code for the target machine, which the driver then passes to the assembler. All have a common internal structure. A per-language front end parses the source code in that language and produces an abstract syntax tree ("tree" for short).

These trees are, if necessary, converted to the middle-end's input representation, called GENERIC form; the middle-end then gradually transforms the program towards its final form. Compiler optimizations and static code analysis techniques (such as FORTIFY_SOURCE, a compiler directive that attempts to discover some buffer overflows) are applied to the code. These work on multiple representations, mostly the architecture-independent GIMPLE representation and the architecture-dependent RTL representation. Finally, machine code is produced using architecture-specific pattern matching originally based on an algorithm of Jack Davidson and Chris Fraser.

GCC is written primarily in C, except for parts of the Ada front end. The distribution includes the standard libraries for Ada, C++, and Java, whose code is mostly written in those languages. On some platforms, the distribution also includes a low-level runtime library, libgcc, written in a combination of machine-independent C and processor-specific machine code, designed primarily to handle arithmetic operations that the target processor cannot perform directly.
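As an illustration of the kind of arithmetic such a runtime library supplies, the sketch below divides two unsigned 64-bit integers using only shifts, comparisons and subtraction. The function name soft_udiv64 is hypothetical. libgcc provides a routine named __udivdi3 for this purpose on targets without a suitable divide instruction, but its actual implementation differs from this textbook shift-and-subtract loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned 64-bit division by restoring
 * (shift-and-subtract) long division, one quotient bit per iteration. */
uint64_t soft_udiv64(uint64_t n, uint64_t d) {
    assert(d != 0);                      /* division by zero is not handled */
    uint64_t q = 0;                      /* quotient being built */
    uint64_t r = 0;                      /* running remainder, always < d */
    for (int i = 63; i >= 0; --i) {
        uint64_t carry = r >> 63;        /* bit that the shift below pushes out */
        r = (r << 1) | ((n >> i) & 1);   /* bring down the next dividend bit */
        if (carry || r >= d) {           /* the true remainder is at least d */
            r -= d;                      /* wraps to the correct value when carry is set */
            q |= (uint64_t)1 << i;
        }
    }
    return q;
}
```

On a target with a native 64-bit divide instruction the compiler would normally emit that instruction instead of calling a library routine.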

In May 2010, the GCC steering committee decided to allow use of a C++ compiler to compile GCC. The compiler is to be written in C plus a subset of features from C++. In particular, this was decided so that GCC's developers could use the "destructors" and "generics" features of C++.

Front-ends


Front ends vary internally, but each must produce trees that the back end can handle. Currently, the parsers are all hand-coded recursive descent parsers, though there is no reason why a parser generator could not be used for new front ends in the future (version 2 of the C compiler used a bison-based grammar).
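The recursive descent technique mentioned above can be sketched with one procedure per grammar rule. The grammar, the function names and the global cursor below are hypothetical and far simpler than any GCC front end; in particular, this sketch evaluates the expression directly instead of building a tree.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical grammar, one function per rule:
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor ('*' factor)*
 *   factor := NUMBER | '(' expr ')'
 */
static const char *cursor;               /* next unread input character */

static void skip_spaces(void) {
    while (*cursor == ' ') cursor++;
}

static long parse_expr(void);            /* forward declaration: the rules are mutually recursive */

static long parse_factor(void) {
    skip_spaces();
    if (*cursor == '(') {
        cursor++;
        long value = parse_expr();
        skip_spaces();
        assert(*cursor == ')');          /* syntax errors simply abort in this sketch */
        cursor++;
        return value;
    }
    assert(isdigit((unsigned char)*cursor));
    long value = 0;
    while (isdigit((unsigned char)*cursor)) {
        value = value * 10 + (*cursor - '0');
        cursor++;
    }
    return value;
}

static long parse_term(void) {
    long value = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cursor != '*') return value;
        cursor++;
        value *= parse_factor();
    }
}

static long parse_expr(void) {
    long value = parse_term();
    for (;;) {
        skip_spaces();
        if (*cursor == '+')      { cursor++; value += parse_term(); }
        else if (*cursor == '-') { cursor++; value -= parse_term(); }
        else return value;
    }
}

/* Parses and evaluates a complete expression string. */
long evaluate(const char *text) {
    cursor = text;
    long value = parse_expr();
    skip_spaces();
    assert(*cursor == '\0');             /* the whole input must be consumed */
    return value;
}
```

A compiler front end written in this style would return a tree node from each procedure where this sketch returns a number.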

Until recently, the tree representation of the program was not fully independent of the processor being targeted.

The meaning of a tree was somewhat different for different language front-ends, and front-ends could provide their own tree codes. This was simplified with the introduction of GENERIC and GIMPLE, two new forms of language-independent trees introduced with GCC 4.0. GENERIC is the more complex form, based on the GCC 3.x Java front-end's intermediate representation. GIMPLE is a simplified GENERIC, in which various constructs are lowered to multiple GIMPLE instructions. The C, C++ and Java front ends produce GENERIC directly. Other front ends instead use different intermediate representations after parsing and convert these to GENERIC.

In either case, the so-called "gimplifier" then lowers this more complex form into the simpler SSA-based GIMPLE form, which is the common language for a large number of powerful language- and architecture-independent global (function-scope) optimizations.

GENERIC and GIMPLE


GENERIC is an intermediate representation language used by the "middle-end" while compiling source code into executable binaries. A subset, called GIMPLE, is targeted by all the front-ends of GCC.

The middle stage of GCC does all the code analysis and optimization, working independently of both the compiled language and the target architecture, starting from the GENERIC representation and expanding it to Register Transfer Language. The GENERIC representation contains only the subset of the imperative programming constructs optimised by the middle-end.

In transforming the source code to GIMPLE, complex expressions are split into three-address code using temporary variables. This representation was inspired by the SIMPLE representation proposed in the McCAT compiler by Laurie J. Hendren for simplifying the analysis and optimization of imperative programs.
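The splitting into three-address code can be imitated by hand in C. In the sketch below, the function names and the temporaries t1 to t3 are hypothetical; GCC chooses its own temporary names, and its actual GIMPLE dump (obtainable with the -fdump-tree-gimple option) will look different in detail.

```c
#include <assert.h>

/* Source form: one compound expression. */
int compound(int a, int b, int c, int d) {
    return (a + b) * (c - d);
}

/* Hand-lowered form: each statement performs at most one operation
 * and stores its result in a temporary. */
int three_address(int a, int b, int c, int d) {
    int t1 = a + b;
    int t2 = c - d;
    int t3 = t1 * t2;
    return t3;
}

/* Both forms must compute the same value for any inputs. */
void check_equivalent(int a, int b, int c, int d) {
    assert(compound(a, b, c, d) == three_address(a, b, c, d));
}
```

Because every intermediate result now has a name, later analyses can reason about each operation separately.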

Optimization


Optimization can occur during any phase of compilation; however, the bulk of optimizations are performed after the syntax and semantic analysis of the front-end and before the code generation of the back-end. A common, even though somewhat contradictory, name for this part of the compiler is therefore the "middle end."

The exact set of GCC optimizations varies from release to release as it develops, but includes the standard algorithms, such as loop optimization, jump threading, common subexpression elimination, instruction scheduling, and so forth. The RTL optimizations are of less importance with the addition of global SSA-based optimizations on GIMPLE trees, as RTL optimizations have a much more limited scope and less high-level information.
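Common subexpression elimination, one of the standard algorithms named above, can be shown as a source-level before and after pair. This is a hand-written sketch with hypothetical function names; GCC performs the transformation on its internal representations, not on C source.

```c
#include <assert.h>

/* Before: the product b * c is computed twice. */
int before_cse(int b, int c, int g) {
    int a = b * c + g;
    int d = b * c * 2;
    return a + d;
}

/* After: the shared subexpression is computed once and reused. */
int after_cse(int b, int c, int g) {
    int t = b * c;                       /* hypothetical temporary holding the common value */
    int a = t + g;
    int d = t * 2;
    return a + d;
}

/* The rewrite must not change the observable result. */
void check_cse_preserves_result(int b, int c, int g) {
    assert(before_cse(b, c, g) == after_cse(b, c, g));
}
```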

Optimizations performed at the GIMPLE level include dead code elimination, partial redundancy elimination, global value numbering, sparse conditional constant propagation, and scalar replacement of aggregates. Array dependence based optimizations such as automatic vectorization and automatic parallelization are also performed. Profile-guided optimization is also possible, as demonstrated here: http://gcc.gnu.org/install/build.html#TOC4
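The combined effect of constant propagation and dead code elimination can be sketched in the same before and after style. The function names and the tracing flag are hypothetical, and the second function only shows what an optimizer would be entitled to produce, not the output of any particular GCC release.

```c
#include <assert.h>

/* Before: contains a dead store and a branch on a known constant. */
int before_passes(int x) {
    assert(x > -1000000 && x < 1000000); /* keeps x * 37 far from overflow */
    const int tracing = 0;               /* known constant */
    int result = x * 37;                 /* dead store: overwritten before it is read */
    if (tracing) {
        return -1;                       /* unreachable once tracing is known to be 0 */
    }
    result = x + 1;
    return result;
}

/* After: the constant branch and the dead store have been removed. */
int after_passes(int x) {
    return x + 1;
}
```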

Back-end


The behavior of GCC's back end is partly specified by preprocessor macros
C preprocessor
The C preprocessor is the preprocessor for the C and C++ computer programming languages. The preprocessor handles directives for source file inclusion , macro definitions , and conditional inclusion ....

 and functions specific to a target architecture, for instance to define the endianness
Endianness
In computing, the term endian or endianness refers to the ordering of individually addressable sub-components within the representation of a larger data item as stored in external memory. Each sub-component in the representation has a unique degree of significance, like the place value of digits...

, word size, and calling convention
Calling convention
In computer science, a calling convention is a scheme for how subroutines receive parameters from their caller and how they return a result; calling conventions can differ in:...

s. The front part of the back end uses these definitions to guide RTL generation, so although GCC's RTL is nominally processor-independent, the initial sequence of abstract instructions is already adapted to the target. At every stage, the RTL instructions that make up the program representation must comply with the machine description of the target architecture.

The machine description file contains RTL patterns, along with operand constraints and code snippets that output the final assembly. The constraints indicate, for example, that a particular RTL pattern applies only to certain hardware registers, or that it allows immediate operand offsets of only a limited size (such as 12, 16, or 22 bits). During RTL generation, the constraints for the given target architecture are checked. To be issued, a given snippet of RTL must match one or more of the RTL patterns in the machine description file and satisfy the constraints for that pattern; otherwise, it would be impossible to convert the final RTL into machine code.
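
The kind of test that an immediate-offset constraint expresses can be sketched in C as follows. This is a minimal illustration, not code from GCC: the function name is hypothetical, and it assumes the immediate field is a signed two's-complement value, which is true of many targets but not all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if 'offset' can be encoded in a signed immediate field
   that is 'bits' wide (assumption: two's-complement encoding). */
bool fits_signed_immediate(int64_t offset, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    int64_t min = -((int64_t)1 << (bits - 1));
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    return offset >= min && offset <= max;
}
```

With a 12-bit field, for instance, offsets from -2048 to 2047 are accepted, so a pattern carrying such a constraint would not match an offset of 2048.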

Towards the end of compilation, valid RTL is reduced to a strict form in which each instruction refers to real machine registers and a pattern from the target's machine description file. Forming strict RTL is a complicated task; an important step is register allocation
Register allocation
In compiler optimization, register allocation is the process of assigning a large number of target program variables onto a small number of CPU registers...

, where real hardware registers are chosen to replace the initially assigned pseudo-registers. This is followed by a "reloading" phase: any pseudo-registers that were not assigned a real hardware register are "spilled" to the stack, and RTL to perform this spilling is generated. Likewise, offsets that are too large to fit in an actual instruction must be broken up and replaced by RTL sequences that obey the offset constraints.
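
The outcome of allocation and spilling can be pictured with a deliberately naive C sketch. It is not GCC's allocator: it ignores live ranges and simply hands out registers in order, and the register count, type, and function names are all hypothetical. It only shows the end state the paragraph describes, in which every pseudo-register ends up either in a hardware register or in a stack slot.

```c
#include <assert.h>

#define NUM_HARD_REGS 4  /* hypothetical target with four usable registers */

typedef struct {
    int hard_reg;    /* hardware register number, or -1 if spilled */
    int stack_slot;  /* stack slot index, or -1 if kept in a register */
} Location;

/* Assigns a location to pseudo-registers 0..n_pseudos-1 in order.
   The first NUM_HARD_REGS get hardware registers; the rest are spilled
   to consecutive stack slots. Returns the number of spilled pseudos. */
int assign_locations(Location *loc, int n_pseudos)
{
    assert(loc != 0 && n_pseudos >= 0);
    int spilled = 0;
    for (int i = 0; i < n_pseudos; i++) {
        if (i < NUM_HARD_REGS) {
            loc[i].hard_reg = i;
            loc[i].stack_slot = -1;
        } else {
            loc[i].hard_reg = -1;
            loc[i].stack_slot = spilled++;
        }
    }
    return spilled;
}
```

A real allocator decides which pseudo-registers to spill from their live ranges and usage frequency; the fixed first-come order here is only a placeholder for that decision.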

In the final phase, the machine code is built by calling a small snippet of code associated with each pattern to generate the real instructions from the target's instruction set
Instruction set
An instruction set, or instruction set architecture, is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O...

, using the final registers, offsets, and addresses chosen during the reload phase. The assembly-generation snippet may be just a string, in which case the registers, offsets, and/or addresses are simply substituted into it. It may also be a short block of C code that performs some additional work but ultimately returns a string containing the valid assembly code.
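
The string-substitution case can be sketched in C as below. GCC's output templates refer to operands as %0, %1, and so on; everything else here (the function name, its signature, the error convention, and the sample mnemonic and register names) is a hypothetical simplification rather than GCC's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expands "%0".."%9" in 'tmpl' with the matching entry of 'operands'
   and writes the NUL-terminated result to 'out'.
   Returns 0 on success, or -1 if an operand index is out of range or
   the result does not fit in 'out_size' bytes. */
int expand_template(const char *tmpl, const char *const operands[],
                    size_t n_operands, char *out, size_t out_size)
{
    assert(tmpl != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    size_t pos = 0;
    for (const char *p = tmpl; *p != '\0'; p++) {
        if (p[0] == '%' && p[1] >= '0' && p[1] <= '9') {
            size_t idx = (size_t)(p[1] - '0');
            if (idx >= n_operands)
                return -1;
            size_t len = strlen(operands[idx]);
            if (pos + len >= out_size)
                return -1;
            memcpy(out + pos, operands[idx], len);
            pos += len;
            p++;  /* skip the operand digit */
        } else {
            if (pos + 1 >= out_size)
                return -1;
            out[pos++] = *p;
        }
    }
    out[pos] = '\0';
    return 0;
}
```

Given the template "add %0, %1, %2" and the operands "r1", "r2", and "8", the function produces "add r1, r2, 8".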

Compatible IDEs


Most integrated development environment
Integrated development environment
An integrated development environment is a software application that provides comprehensive facilities to computer programmers for software development...

s written for GNU/Linux and some for other operating systems support GCC. These include:
  • Anjuta
    Anjuta
    Anjuta is an integrated development environment for the C, C++, Java, JavaScript, Python and Vala computer programming languages, written for the GNOME project...

  • Code::Blocks
    Code::Blocks
    Code::Blocks is a free and open source, cross-platform IDE which supports multiple compilers including GCC and MSVC. It is developed in C++ using wxWidgets as the GUI toolkit. Using a plugin architecture, its capabilities and features are defined by the provided plugins. Currently, Code::Blocks is...

  • CodeLite
    CodeLite
    CodeLite is a free, open-source, cross-platform IDE for the C/C++ programming languages. In August 2006, Eran Ifrah, CodeLite's author, started a project named CodeLite...

  • Dev-C++
    Dev-C++
    Dev-C++ is a free integrated development environment distributed under the GNU General Public License for programming in C and C++. MinGW, a free compiler, is bundled with it. The IDE is written in Delphi....

  • Eclipse
    Eclipse (software)
    Eclipse is a multi-language software development environment comprising an integrated development environment and an extensible plug-in system...

  • Geany
    Geany
    Geany is a lightweight cross-platform GTK+ text editor based on Scintilla and including basic Integrated Development Environment features. It is designed to have short load times, with limited dependency on separate packages or external libraries. It is available for a wide range of operating...

  • KDevelop
    KDevelop
    KDevelop is a free software integrated development environment for the KDE Platform on Unix-like computer operating systems. KDevelop includes no compiler. Instead, it uses an external compiler such as gcc to produce executable code....

  • NetBeans
    NetBeans
    NetBeans refers to both a platform framework for Java desktop applications, and an integrated development environment for developing with Java, JavaScript, PHP, Python, Groovy, C, C++, Scala, Clojure, and others...

  • Qt Creator
    Qt Creator
    Qt Creator is a cross-platform C++ integrated development environment which is part of the Qt SDK. It includes a visual debugger and an integrated GUI layout and forms designer. The editor's features include syntax highlighting and autocompletion, but not tabs. Qt Creator uses the C++ compiler...

  • Xcode
    Xcode
    Xcode is a suite of tools, developed by Apple, for developing software for Mac OS X and iOS. Xcode 4.2, the latest major version, is available on the Mac App Store for free for Mac OS X 10.7, and on the Apple Developer Connection website for free to registered developers...



Debugging GCC programs


The primary tool used to debug code compiled with GCC is the GNU Debugger
GNU Debugger
The GNU Debugger, usually called just GDB and named gdb as an executable file, is the standard debugger for the GNU software system. It is a portable debugger that runs on many Unix-like systems and works for many programming languages, including Ada, C, C++, Objective-C, Free Pascal, Fortran, Java...

 (gdb). Among more specialized tools are Valgrind
Valgrind
Valgrind is a GPL licensed programming tool for memory debugging, memory leak detection, and profiling. The name valgrind comes from the main entrance to Valhalla in Norse mythology....

, for finding memory errors and leaks, and the graph profiler (gprof), which reports how much time is spent in each routine and how often each is called; gprof requires programs to be compiled with profiling options (for example, -pg).

Further reading


  • Richard Stallman
    Richard Stallman
    Richard Matthew Stallman, often shortened to rms...

    : Using the GNU Compiler Collection (GCC), Free Software Foundation
    Free Software Foundation
    The Free Software Foundation is a non-profit corporation founded by Richard Stallman on 4 October 1985 to support the free software movement, a copyleft-based movement which aims to promote the universal freedom to create, distribute and modify computer software...

    , 2008.
  • Richard Stallman: GNU Compiler Collection (GCC) Internals, Free Software Foundation, 2008.
  • Brian J. Gough: An Introduction to GCC, Network Theory Ltd., 2004 (Revised August 2005). ISBN 0-9541617-9-3.
  • Arthur Griffith: GCC: The Complete Reference, McGraw-Hill/Osborne, 2002. ISBN 0-07-222405-3.


External links