C-- is a C-like programming language. Its creators, functional programming researchers Simon Peyton Jones and Norman Ramsey, designed it to be generated mainly by compilers for very high-level languages rather than written by human programmers. Unlike many other intermediate languages, its representation is plain text, not bytecode or another binary format.
Design
C-- is a "portable assembly language", designed to ease the task of implementing a compiler which produces high-quality machine code by having the compiler generate C-- code, delegating the harder work of low-level code generation and optimisation to a C-- compiler.
Work on C-- began in the late 1990s. Since writing a custom code generator is a challenge in itself, and the compiler back ends available to researchers at that time were complex and poorly documented, several projects had written compilers which generated C code (for instance, the original Modula-3 compiler). However, C is a poor target for functional languages: it does not guarantee tail-call optimisation, and it offers no support for accurate garbage collection or efficient exception handling. C-- is a simpler, tightly defined alternative to C which does support all of these things. Its most innovative feature is a run-time interface which allows portable garbage collectors, exception handling systems and other run-time features to be written so that they work with any C-- compiler.
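The tail-call limitation mentioned above can be illustrated with a small C sketch. The function names `sum_to` and `sum_to_loop` are hypothetical and chosen only for this example. The first function ends in a tail call, but the C standard does not require a compiler to reuse the stack frame for it, so a deep recursion may exhaust the stack. The second shows the kind of explicit loop that a compiler targeting C may have to emit instead.

```c
#include <stdint.h>

/* Tail-recursive sum of 1..n, accumulated in acc.
 * The recursive call is the last action of the function, but standard C
 * gives no guarantee that it is compiled into a jump. */
uint64_t sum_to(uint64_t n, uint64_t acc) {
    if (n == 0) {
        return acc;
    }
    return sum_to(n - 1, acc + n);
}

/* The same computation written as a loop, which uses constant stack
 * space regardless of how the C compiler treats tail calls. */
uint64_t sum_to_loop(uint64_t n, uint64_t acc) {
    while (n != 0) {
        acc += n;
        n -= 1;
    }
    return acc;
}
```

Both functions compute the same result for a call such as `sum_to(10, 0)`; only the loop version is certain to run in constant stack space for very large `n`.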
The language's syntax borrows heavily from C. It omits or changes standard C features such as variadic functions, pointer syntax, and aspects of C's type system, because they hamper certain essential features of C-- and the ease with which code-generation tools can produce it.
The name of the language is an in-joke, indicating that C-- is a reduced form of C, in the same way that C++ is basically an expanded form of C. (In C and C++, "--" and "++" are operators meaning "subtract 1 from" and "add 1 to".)
C-- is a target platform for the Glasgow Haskell Compiler, and an adaptation of C-- will eventually become the main code-generation path. Some of C--'s developers, including Simon Peyton Jones, João Dias, and Norman Ramsey, also work or have worked on the Glasgow Haskell Compiler. The GHC codebase and development are based at Microsoft Research in Cambridge, though it is not a Microsoft project.
Type system
The C-- type system is deliberately designed to reflect constraints imposed by hardware rather than conventions imposed by higher-level languages. In C-- a value stored in a register or memory may have only one type: bit vector. However, bit vector is a polymorphic type and may come in several widths, e.g., bits8, bits32, or bits64. In addition to the bit-vector type C-- also provides a Boolean type bool, which can be computed by expressions and used for control flow but cannot be stored in a register or in memory. As in an assembly language, any higher type discipline, such as distinctions between signed, unsigned, float, and pointer, is imposed by the C-- operators or other syntactic constructs in the language.
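The idea that the operator, not the value, carries the interpretation can be sketched in C. This is an analogy only and is not C-- syntax: the alias `bits32` and the functions `lt_unsigned` and `lt_signed` are hypothetical names introduced for this example. The same 32-bit pattern is compared in two ways, and the result differs depending on which comparison is chosen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical alias echoing the C-- name: just 32 bits, no signedness
 * implied by the type itself. */
typedef uint32_t bits32;

/* "Less than" that interprets both bit patterns as unsigned integers. */
bool lt_unsigned(bits32 a, bits32 b) {
    return a < b;
}

/* "Less than" that interprets both bit patterns as two's-complement
 * signed integers. Flipping the sign bit maps signed order onto unsigned
 * order, which avoids implementation-defined conversions in C. */
bool lt_signed(bits32 a, bits32 b) {
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}
```

With the pattern `0xFFFFFFFF`, `lt_unsigned` treats the value as 4294967295 and `lt_signed` treats it as -1, so comparing it against 1 gives opposite answers.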
Sphinx C--
The name "C--" was also used for an earlier programming language developed in the 1990s by Peter Cellik for x86 computers.
Sphinx C-- mixes C with x86 assembly language.