Clang
Clang is a compiler front end for the C, C++, Objective-C, and Objective-C++ programming languages. It uses the Low Level Virtual Machine (LLVM) as its back end, and Clang has been part of LLVM releases since LLVM 2.6.

Its goal is to offer a replacement for the GNU Compiler Collection (GCC). Development is sponsored by Apple. Clang is available under a free software license.

The Clang project includes the Clang front end and the Clang static analyzer among others.

Background

Since 2005, Apple has made extensive use of LLVM in a number of commercial systems, including the iPhone development kit and Xcode 3.1.

One of the first uses of LLVM was an OpenGL code compiler for Mac OS X that converts OpenGL calls into more fundamental calls for graphics processing units (GPUs) that do not support certain features. This allowed Apple to support the entire OpenGL application programming interface (API) on computers using Intel Graphics Media Accelerator (GMA) chipsets, increasing performance on those machines. For sufficiently capable GPUs, the code is compiled to take full advantage of the underlying hardware, but on GMA machines, LLVM compiles the same OpenGL code into subroutines to ensure it continues to work properly.

LLVM was originally intended to use GCC's front end, but GCC turned out to cause some problems for both the LLVM developers and Apple. GCC is a large and somewhat cumbersome system to develop; as one long-time GCC developer put it, "Trying to make the hippo dance is not really a lot of fun", and a Google Summer of Code intern commented, "Reading GCC codebase has been a hard exercise for me. In fact it's the only project I know of that becomes more and more difficult as time passes."

Apple software makes heavy use of Objective-C, but the Objective-C front end in GCC is a low priority for the current GCC developers. Also, GCC does not fit smoothly into Apple's IDE. Finally, GCC is GPL licensed, which requires developers who distribute extensions for (or modified versions of) GCC to make their source code available, whereas LLVM has a BSD-like license which permits including the source in proprietary software.

Apple chose to develop a new compiler front end from scratch, supporting only C99, Objective-C and C++. This "clang" project was open-sourced in July 2007.

Overview

Clang is a new compiler for C-family languages, intended specifically to work on top of LLVM. The combination of Clang and LLVM provides the majority of a toolchain, allowing the replacement of the whole GCC stack. Because it is built with a library-based design, like the rest of LLVM, Clang is easy to embed into other applications. This is one reason why a majority of the OpenCL implementations are built with Clang and LLVM.

One of Clang's primary goals is to better support incremental compilation to allow the compiler to be more tightly tied to the IDE GUI. GCC is designed to work in a "classic" compile-link-debug cycle, and although it provides useful ways to support incremental and interrupted compiling on-the-fly, integrating them with other tools is not always easy. For instance, GCC uses a step called "fold" that is key to the overall compile process, which has the side effect of translating the code tree into a form that does not look very much like the original source code. If an error is found during or after the fold step, it can be difficult to translate that back into a single location in the original source. Additionally, vendors using the GCC stack within IDEs used separate tools to index the code to provide features like syntax highlighting and autocomplete.
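
The difficulty described above can be illustrated with a toy model. The following Python sketch is not GCC's or Clang's actual implementation, and the names Loc, Num, BinOp, FoldError and fold are hypothetical. It folds constant sub-expressions while keeping a source position on every node, so an error found during folding can still be reported against a single location in the original text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Loc:
    """A position in the original source text (hypothetical helper type)."""
    line: int
    col: int


@dataclass(frozen=True)
class Num:
    value: int
    loc: Loc


@dataclass(frozen=True)
class BinOp:
    op: str            # one of "+", "-", "/"
    left: "Expr"
    right: "Expr"
    loc: Loc           # position of the operator token


Expr = Union[Num, BinOp]


class FoldError(Exception):
    """Error raised during folding, tagged with the offending source location."""

    def __init__(self, message: str, loc: Loc):
        super().__init__(f"{loc.line}:{loc.col}: error: {message}")
        self.loc = loc


def fold(expr: Expr) -> Num:
    """Fold a constant expression into a single Num, preserving locations."""
    if isinstance(expr, Num):
        return expr
    left, right = fold(expr.left), fold(expr.right)
    if expr.op == "+":
        value = left.value + right.value
    elif expr.op == "-":
        value = left.value - right.value
    elif expr.op == "/":
        if right.value == 0:
            # The operator's own location survives folding of its operands.
            raise FoldError("division by zero", expr.loc)
        value = left.value // right.value
    else:
        raise ValueError(f"unknown operator {expr.op!r}")
    # The folded constant inherits the location of the operator it replaces.
    return Num(value, expr.loc)


if __name__ == "__main__":
    # Tree for the source text "8 / (2 - 2)" on line 1.
    tree = BinOp("/", Num(8, Loc(1, 1)),
                 BinOp("-", Num(2, Loc(1, 6)), Num(2, Loc(1, 10)), Loc(1, 8)),
                 Loc(1, 3))
    try:
        fold(tree)
    except FoldError as err:
        print(err)
```

If the folded nodes did not carry a location, the division-by-zero report could only point at the folded result, not at the operator the programmer wrote.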

Clang is designed to retain more information during the compilation process than GCC, and to preserve the overall form of the original code. The objective is to make it easier to map errors back into the original source. The error reports offered by Clang are also intended to be more detailed and specific, as well as machine-readable, so IDEs can index the output of the compiler during compilation. The modular design of the compiler can offer source code indexing, syntax checking, and other features normally associated with rapid application development systems. The parse tree is also more suitable for supporting automated code refactoring, as it remains in a parsable text form at all times. Changes to the compiler can be checked by diffing the intermediate form (IF).
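
As a rough illustration of how a tool such as an IDE might consume compiler output, the following Python sketch parses one diagnostic line in the file:line:column: severity: message layout that Clang prints. The Diagnostic type and the parse_diagnostic function are hypothetical names, and the sample line is hand-written for illustration rather than captured from a real compiler run.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    """One parsed compiler diagnostic (hypothetical record type)."""
    path: str
    line: int
    column: int
    severity: str
    message: str


# Matches lines shaped like "main.c:3:12: error: some message".
_DIAG_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<severity>fatal error|error|warning|note): (?P<message>.*)$"
)


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Return a Diagnostic for a matching line, or None for any other line."""
    match = _DIAG_RE.match(text)
    if match is None:
        return None
    return Diagnostic(
        path=match["path"],
        line=int(match["line"]),
        column=int(match["column"]),
        severity=match["severity"],
        message=match["message"],
    )


if __name__ == "__main__":
    # Hand-written sample line, not real compiler output.
    sample = "main.c:3:12: error: use of undeclared identifier 'x'"
    print(parse_diagnostic(sample))
```

Lines that do not follow this layout, such as the echoed source line or a summary line, yield None and can be skipped by the caller.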

Although development on GCC may be difficult, the reasons for this have been well explored by its developers. This allowed the Clang team to avoid these problems and make a more flexible system. Clang is highly modularized, based almost entirely on replaceable link-time libraries (as opposed to source-code modules that are combined at compile time), and well-documented. This makes it much easier for new developers to get up to speed in Clang and add to the project. In some cases the libraries are provided in several versions that can be swapped out at runtime; for instance, the parser comes with a version that offers performance measurement of the compile process.

Clang, as the name implies, is a compiler only for C and C-like languages. It does not offer compiler front ends for languages other than C, C++, Objective-C, and Objective-C++. For other languages, including Java, Fortran, and Ada, LLVM remains dependent on GCC. In many cases, Clang can be used in place of GCC as needed, with no other effects on the toolchain as a whole. It supports most of the commonly used GCC options.
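
A minimal sketch of what swapping the compiler can look like in a build script, assuming both compilers are installed under the names gcc and clang. The compile_command helper is hypothetical; the options used (-O2, -Wall, -c, -o) are accepted by both compilers.

```python
from typing import List, Sequence


def compile_command(compiler: str, source: str, output: str,
                    flags: Sequence[str] = ("-O2", "-Wall")) -> List[str]:
    """Build an argument list that compiles one source file to an object file.

    Only the first element (the compiler name) differs between GCC and Clang.
    """
    return [compiler, *flags, "-c", source, "-o", output]


if __name__ == "__main__":
    for cc in ("gcc", "clang"):
        # Print the command instead of running it, so no compiler is required.
        print(" ".join(compile_command(cc, "main.c", "main.o")))
```

The resulting list can be passed to subprocess.run to invoke whichever compiler is selected.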

Performance and GCC compatibility

Clang's developers claim that it provides a reduced memory footprint and increased speed compared to competing compilers, such as GCC. To support their claim, they report that, as of October 2007, Clang compiled the Carbon libraries well over twice as fast as GCC, while using about one-sixth of GCC's memory and disk space.

Although Clang's overall compatibility with GCC is very good, and its compilation speed typically better than GCC's, as of early 2011 the runtime performance of clang/LLVM output is sometimes worse than GCC's.

Status history

This table presents only the significant steps in Clang's history.
Date Highlights
25 February 2009 Clang/LLVM able to compile a working FreeBSD kernel. Currently all of the FreeBSD source code - both kernel and userland - can be compiled with Clang.
16 March 2009 Clang/LLVM able to compile a working DragonFly BSD kernel.
23 October 2009 Clang 1.0 released along with LLVM 2.6 for the first time.
December 2009 Code generation for C and Objective-C reaches production quality (support for C++ and Objective-C++ still incomplete). Clang C++ is able to parse GCC 4.2 libstdc++, generate working code for non-trivial programs, and compile itself.
2 February 2010 Clang self-hosting.
20 February 2010 The source code of HelenOS was modified to successfully compile with Clang, and passed all kernel and user space regression tests on IA-32.
20 May 2010 The latest version of Clang successfully built the Boost C++ libraries, and passed nearly all tests.
10 June 2010 Clang/LLVM became an integral part of FreeBSD (the default compiler is still GCC).
25 October 2010 Clang/LLVM able to compile a working Linux kernel.
January 2011 Preliminary work completed to support the draft C++0x standard, with a few of the draft's new features supported in the development version of Clang.
10 February 2011 Clang able to compile a working HotSpot Java Virtual Machine.


See also

  • LLDB
  • Portable C Compiler
  • GCC


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 