Tiny C Compiler
The Tiny C Compiler (TCC) is an x86 and x86-64 C compiler created by Fabrice Bellard. It is designed to work on slow computers with little disk space (e.g. on rescue disks). Windows operating system support was added in version 0.9.23 (17 Jun 2005). TCC is distributed under the GNU Lesser General Public License (LGPL).

TCC claims to implement all of ANSI C (C89/C90), much of the new ISO C99 standard, and many GNU C extensions including inline assembly.

Features

TCC has a number of features which differentiate it from other current C compilers:
  • Its small file size (about 100 KB for the x86 TCC executable) and memory footprint allow it to be used directly from a single 1.44 MB floppy disk, such as a rescue disk.
  • TCC is intended to produce native x86 and x86-64 code very quickly; according to Bellard, it compiles, assembles and links the Links web browser about 9 times faster than GCC does (http://bellard.org/tcc/).
  • TCC has a number of compiler-specific language features intended to improve its practicality, such as an optional memory and bound checker, for improved code stability.
  • TCC can run a program immediately after compiling it, using a command-line switch (-run). This allows C source files to be run as shell scripts on Unix-like systems that support the shebang interpreter directive syntax.

Compiled program performance

Although the TCC compiler itself is exceptionally fast and produces very small executables, there is an inherent trade-off between the size of the compiler and the performance of the code that TCC produces.

TCC does perform a few optimizations: constant propagation is applied to all operations, multiplications and divisions are optimized to shifts when appropriate, and comparison operators are specially optimized (by maintaining a special cache for the processor flags). It also does some simple register allocation, which prevents many extraneous save/load pairs inside a single statement.

In general, however, TCC's implementation emphasizes small size over optimally performing output. TCC generates code in a single pass and does not perform most of the optimizations performed by other compilers such as GCC. TCC compiles every statement on its own, and at the end of each statement register values are written back to the stack and must be re-read even if the next line uses the values in registers (creating extraneous save/load pairs between statements). TCC uses only some of the available registers (e.g., on x86 it never uses ebx, esi, or edi because they need to be preserved across function calls).

Here are two benchmark examples:
  • Rough benchmarks of a recursive Fibonacci algorithm on a 1.8 GHz Intel Centrino laptop with 512 MB RAM yield a noticeable difference in results between Microsoft Visual C++ compiler 13.10.3052 and TCC. To calculate the 49th Fibonacci number, a TCC-compiled program took approximately 110 seconds, whereas the same program compiled by VC++ took approximately 93 seconds. Here, the TCC-compiled program takes about 18% longer.
  • With a TCC modified to compile GCC, running cc1 (the GCC C compiler) on itself required 518 seconds when compiled using GCC 3.4.2, 558 seconds using GCC 2.95.3, 545 seconds using the Microsoft C compiler, and 1145 seconds using TCC. The level of optimization in each compiler was -O1 or similar.

Uses

Well-known uses of tcc include:
  • TCCBOOT, a hack where TCC loads and boots a Linux kernel from source in about 10 seconds. That is to say, it is a "boot loader" which reads Linux kernel source code from disk, writes executable instructions to memory, and begins running it. This did require changes to the Linux build process.
  • TCC was used to demonstrate a defense against the trusting trust attack.
  • TCC has been used to compile GCC, though various patches were required to make this work (http://lists.gnu.org/archive/html/tinycc-devel/2005-09/threads.html).
  • Cinpy is a Python library that allows functions in Python modules to be implemented in C. The functions are compiled with TCC at run time, and the results are made callable in Python through the ctypes library.
  • TCC comes installed on JavaScript Linux (also by Bellard).

History

TCC has its origins in the Obfuscated Tiny C Compiler (OTCC), a program Bellard wrote to win the International Obfuscated C Code Contest (IOCCC) in 2001. Bellard later expanded and un-obfuscated the program to produce TCC.

Current status

TCC has an active mailing list, and Fabrice Bellard's current version is available through CVS. However, official TCC development slowed due to Bellard's work on other projects.

Rob Landley created a fork of TCC that incorporated various patches from others, using the Mercurial SCM; Landley's Mercurial branch showed its current status while the project was active. The project was discontinued on October 4, 2007, resumed as a fork on October 27, 2007 (http://landley.net/code/tinycc/), and then discontinued until further notice on September 5, 2008 (http://lists.gnu.org/archive/html/tinycc-devel/2008-09/msg00013.html).

Various others have distributed patches or download sites of various improved versions of tcc, such as Dave Dodge's collection of unofficial tcc patches, Debian and kfreebsd downstream patches, and grischka's gcc patches. grischka's Public Git Hosting contains a mob branch with recent contributions, including a shared build, cross-compilers, and SELinux compatibility.

TCC 0.9.23 is the subject of vulnerability number CVE-2006-0635, which is also Open Source Vulnerability Database vulnerability 22956. The report is that TCC "contains a flaw that may have security implications on programs compiled with it. The compiler fails to return unsigned values for the sizeof operator, resulting in potential integer overflows in the objects it compiles." This vulnerability was fixed in version 0.9.24 of TCC.

As of November 1, 2010, a proposal on the mailing list would extend TCC with a subset of C++, including classes, public/protected/private access, inheritance, member functions and variables, and virtual function support. Discussion is ongoing about whether it will be included.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.