All Topics  
Type system

 

   Email Print
   Bookmark   Link






 

Type system



 
 
In computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute.". Loosely, a type system associates one (or more) type(s) with each program value; by examining the flow of these values, a type system attempts to prove that no "type errors" can occur. The type system in question determines what constitutes a "type error", but a type system generally seeks to guarantee that operations expecting a certain kind of value are not used with values for which that operation makes no sense.

A compiler
Compiler

A compiler is a computer program that transforms source code written in a programming language into another computer language . The most common reason for wanting to transform source code is to create an executable program....
 may use the static type of a value to optimize the storage it needs and the choice of algorithms for operations on the value.






Discussion
Ask a question about 'Type system'
Start a new discussion about 'Type system'
Answer questions from other users
Full Discussion Forum



Encyclopedia


In computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute.". Loosely, a type system associates one (or more) type(s) with each program value; by examining the flow of these values, a type system attempts to prove that no "type errors" can occur. The type system in question determines what constitutes a "type error", but a type system generally seeks to guarantee that operations expecting a certain kind of value are not used with values for which that operation makes no sense.

A compiler
Compiler

A compiler is a computer program that transforms source code written in a programming language into another computer language . The most common reason for wanting to transform source code is to create an executable program....
 may use the static type of a value to optimize the storage it needs and the choice of algorithms for operations on the value. In many C
C (programming language)

C is a general-purpose computer programming language originally developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories to implement the Unix operating system....
 compilers the "float" data type
Data type

A data type in programming languages is an attribute of a data which tells the computer something about the kind of data it is. This involves setting constraints on the datum, such as what values it can take and what operations may be performed upon it....
, for example, is represented in 32 bit
Bit

A bit is a binary numeral system numerical digit, taking a value of either 0 or 1. Binary digits are a basic unit of information Computer data storage and transmission in digital computing and digital information theory....
s, in accordance with the IEEE specification
IEEE floating-point standard

The first IEEE Standard for Binary Floating-Point Arithmetic set the standard for floating-point computation for 23 years. It became the most widely-used standard for floating point computation, and is followed by many Central processing unit and floating point unit implementations....
 for single-precision floating point numbers. C thus uses floating-point-specific operations on those values (floating-point addition, multiplication, etc.).

The depth of type constraints and the manner of their evaluation affect the typing of the language. A programming language may further associate an operation with varying concrete algorithms on each type in the case of type polymorphism
Type polymorphism

In computer science, polymorphism is a programming language feature that allows values of different data types to be handled using a uniform interface....
. Type theory
Type theory

In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general....
 is the study of type systems, although the concrete type systems of programming languages originate from practical issues of computer architecture, compiler implementation, and language design.

Basis

Assigning data types (typing) gives meaning to collections of bit
Bit

A bit is a binary numeral system numerical digit, taking a value of either 0 or 1. Binary digits are a basic unit of information Computer data storage and transmission in digital computing and digital information theory....
s. Types usually have associations either with values in memory
Computer memory

Computer memory is usually meant to refer to the semiconductor technology that is used to store information in Electronics devices. Current primary computer memory makes use of integrated circuits consisting of silicon-based transistors....
 or with objects
Object (computer science)

In its simplest embodiment, an object is an allocated region of storage. Since programming languages use variable#Computer_programmings to access objects, the terms object and variable are often used interchangeably....
 such as variables. Because any value simply consists of a set of bits in a computer
Computer

A computer is a machine that manipulates Data according to a list of Code .The first devices that resemble modern computers date to the mid-20th century , although the computer concept and various machines similar to computers existed earlier....
, hardware makes no distinction even between memory address
Memory address

In computer science, a memory address is an identifier for a computer memory location, at which a computer program or a hardware device can store a piece of data and later retrieve it....
es, instruction code, characters
Character (computing)

In computer and machine-based telecommunications terminology, a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written language form of a natural language....
, integer
Integer

The integers are natural numbers including 0 and their negative and non-negative numberss . They are numbers that can be written without a fractional or decimal component, and fall within the set ....
s and floating-point numbers. Assignment to a type informs programs and programmers how those bit collections should be treated.

Major functions type systems provide include:
  • Safety - Use of types may allow a compiler
    Compiler

    A compiler is a computer program that transforms source code written in a programming language into another computer language . The most common reason for wanting to transform source code is to create an executable program....
     to detect meaningless or probably invalid code. For example, we can identify an expression 3 / "Hello, World" as invalid because one cannot divide (in any usual sense) an integer
    Integer

    The integers are natural numbers including 0 and their negative and non-negative numberss . They are numbers that can be written without a fractional or decimal component, and fall within the set ....
     by a string
    String (computer science)

    In computer programming and some branches of mathematics, a string is an ordered sequence of symbols. These symbols are chosen from a predetermined set or alphabet....
    . As discussed below, strong typing offers more safety, but does not necessarily guarantee complete safety (see type-safety for more information).
  • Optimization - Static type-checking may provide useful compile-time information. For example, if a type requires that a value must align in memory at a multiple of 4 bytes, the compiler may be able to use more efficient machine instructions.
  • Documentation - In more expressive type systems, types can serve as a form of documentation
    Documentation

    Documentation may refer to the process of providing evidence or to the communicable material used to provide such documentation . Documentation may also refer to tools aiming at identifying documents or to the field of study devoted to the study of documents and bibliographies ....
    , since they can illustrate the intent of the programmer. For instance, timestamps may be represented as integers—but if a programmer declares a function as returning a timestamp type rather than merely an integer type, this documents part of the meaning of the function.
  • Abstraction (or modularity) - Types allow programmers to think about programs at a higher level than the bit or byte, not bothering with low-level implementation. For example, programmers can think of a string as a collection of character values instead of as a mere array of bytes. Or, types can allow programmers to express the interface
    Interface (computer science)

    Interface generally refers to an Abstraction_%28computer_science%29 that an entity provides of itself to the outside. This separates the methods of external communication from internal operation, and allows it to be internally modified without affecting the way outside entities interact with it, as well as provide Polymorphism in object-orien...
     between two subsystems. This helps localize the definitions required for interoperability of the subsystems and prevents inconsistencies when those subsystems communicate.
Notice, however, that type safety is not equivalent to program correctness. A program may give the wrong result and be safely typed, producing no compiler errors. Other methods
Software testing

Software Testing is an empirical investigation conducted to provide stakeholders with information about the quality of the product or service under test , with respect to the context in which it is intended to operate....
 are needed to ensure a program is correct.

A program typically associates each value with one particular type (although a type may have more than one subtype
Subtype

In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
). Other entities, such as objects
Object (computer science)

In its simplest embodiment, an object is an allocated region of storage. Since programming languages use variable#Computer_programmings to access objects, the terms object and variable are often used interchangeably....
, modules, communication channels, dependencies, or even types themselves, can become associated with a type. Some implementations might make the following identifications (though these are technically different concepts):
  • data type
    Data type

    A data type in programming languages is an attribute of a data which tells the computer something about the kind of data it is. This involves setting constraints on the datum, such as what values it can take and what operations may be performed upon it....
     - a type of a value
  • class
    Class (computer science)

    In object-oriented programming, a class is a programming language construct that is used as a blueprint to create Object s. This blueprint includes Attribute s and Method s that the created objects all share....
     - a type of an object
  • kind
    Kind (type theory)

    In computer science and type theory, a kind is a "type of a type". Kinds appear, either explicitly or implicitly, in certain languages with complex type systems, such as Haskell and Scala ....
     - a type of a type


A type system, specified in for each programming language, controls the ways typed programs may behave, and makes behavior outside these rules illegal. An effect system
Effect system

An effect system is a formal system which describes the computational effects of computer programs, such as side effect s. An effect system can be used to provide a compile-time checking of the possible effects of the program....
 typically provides more fine-grained control than does a type system.

More formally, type theory
Type theory

In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general....
 studies type systems.

Type checking

The process of verifying and enforcing the constraints of types – type checking – may occur either at compile-time (a static check) or run-time (a dynamic check). If a language specification requires its typing rules strongly (ie, more or less allowing only those automatic type conversions which do not lose information), one can refer to the process as strongly typed, if not, as weakly typed. The terms are not used in a strict sense.

Static typing

A programming language is said to use static typing when type checking is performed during compile-time as opposed to run-time. Examples of languages that use static typing include Ada
Ada (programming language)

Ada is a structured programming, statically typed, Imperative programming, and Object-oriented programming high-level language computer programming programming language, extended from Pascal and other languages....
, C
C (programming language)

C is a general-purpose computer programming language originally developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories to implement the Unix operating system....
, C++
C++

C++ is a general-purpose programming language. It is regarded as a middle-level language, as it comprises a combination of both high-level programming language and low-level programming language language features....
, C#, Java
Java (programming language)

Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java ....
, Fortran
Fortran

Fortran is a general-purpose programming language, procedural programming language, imperative programming language programming language that is especially suited to numerical analysis and scientific computing....
, ML, Pascal
Pascal (programming language)

Pascal is an influential imperative programming and Procedural programming programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structure....
, and Haskell
Haskell (programming language)

Haskell is a standardized, purely functional programming language with non-strict programming language, named after logician Haskell Curry. The goals of the language are described as:...
. Static typing is a limited form of program verification (see type safety
Type safety

In computer science, type safety is a property of some programming languages that is defined differently by different communities, but most definitions involve the use of a type system to prevent certain erroneous or undesirable program behavior ....
): accordingly, it allows many errors to be caught early
Fail-fast

Fail-fast is a property of a system or module with respect to its response to failures. A fail-fast system is designed to immediately report at its interface any failure or condition that is likely to lead to failure....
 in the development cycle. Program execution may also be made more efficient (i.e. faster or taking reduced memory).

Because they evaluate type information during compilation, and therefore lack type information that is only available at run-time, static type checkers are conservative. They will reject some programs that may be well-behaved at run-time, but that cannot be statically determined to be well-typed. For example, even if an expression always evaluates to true at run-time, a program containing the code

if then 42 else


will be rejected as ill-typed, because a static analysis cannot determine that the else branch won't be taken.

Some statically typed languages have a "loophole
Loophole

A loophole is a weakness that allows a system to be circumvented. The term loophole could also refer to:* Embrasure, a slit in a castle wall* Loophole , a short science fiction story by Arthur C....
" in the programming language specification
Programming language specification

A programming language specification is an artifact that defines a programming language so that programmers and programming language implementation can agree on what programs in that language mean....
 that enables programmers to write pieces of code that circumvent the default verification performed by a static type checker. For example, Java and most C-style languages have type conversion
Type conversion

In computer science, type conversion or typecasting refers to changing an entity of one data type into another. This is done to take advantage of certain features of type hierarchies....
; such operations may be unsafe at runtime, in that they can cause unwanted behavior due to incorrect typing of values when the program runs.

Dynamic typing


A programming language is said to be dynamically typed, or just 'dynamic', when the majority of its type checking is performed at run-time as opposed to at compile-time. Examples of languages that are dynamically typed include Clojure
Clojure

Clojure is a modern dialect of the Lisp programming language. It is a general-purpose language sporting interactive development, and it encourages a functional programming style that enables simplified Thread programming....
, Groovy, JavaScript
JavaScript

JavaScript is a scripting language widely used for client-side web development. It was the originating Programming language dialect of the ECMAScript standard....
, Lisp, Objective-C
Objective-C

Objective-C is a Reflection , Object-oriented programming programming language which adds Smalltalk-style message passing to C .Today it is used primarily on Mac OS X, iPhone OS, and GNUstep, three environments based on the OpenStep standard, and is the primary language used for the NEXTSTEP, OpenStep#OPENSTEP, and Cocoa application framew...
, Perl
Perl

In computer programming, Perl is a high-level programming language, List of programming languages by category, Interpreter , dynamic programming language....
, PHP
PHP

PHP is a scripting language originally designed for producing dynamic web pages. It has evolved to include a command line interface capability and can be used in Standalone software Graphical user interface....
, Prolog, Python
Python (programming language)

Python is a general-purpose high-level programming language. Its design philosophy emphasizes code readability. Python's core syntax and semantics are Minimalism , while the standard library is large and comprehensive....
, Ruby
Ruby (programming language)

Ruby is a dynamic programming language, reflection , general purpose object-oriented programming language that combines syntax inspired by Perl with Smalltalk-like features....
, and Smalltalk
Smalltalk

Smalltalk is an Object-oriented programming, Type system, reflection computer programming programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human?computer symbiosis." It was designed and created in part for educational use, more so for constructionist learning, at PARC by Al...
. Dynamic typing can be more flexible than static typing (for example by allowing programs to generate types based on run-time data), since static type checkers may conservatively reject programs that actually have acceptable run-time behavior. The cost of this additional flexibility is fewer a priori guarantees, since static type checking ensures that type errors will not occur in any possible execution of a program.

Dynamically-typed language systems, compared to their statically-typed cousins, make fewer "compile-time" checks on the source code (but will check, for example, that the program is syntactically correct). Run-time checks can potentially be more sophisticated, since they can use dynamic information as well as any information that was present during compilation. On the other hand, runtime checks only assert that conditions hold in a particular execution of the program, and these checks are repeated for every execution of the program. Static type checkers evaluate only the type information that can be determined at compile time, but are able to verify that the checked conditions hold for all possible executions of the program, which eliminates the need to repeat type checks every time the program is executed.

Development in dynamically-typed languages is often supported by programming practices such as unit testing. Testing is a key practice in professional software development, and is particularly important in dynamically-typed languages. In practice, the testing done to ensure correct program operation can detect a much wider range of errors than static type-checking, but conversely cannot search as comprehensively for the errors that both testing and static type checking are able to detect. Testing can be incorporated into the software build
Software build

In the field of computer software, the term software build refers either to the process of converting source code files into standalone software artifact that can be run on a computer, or the result of doing so....
 cycle, in which case it can be thought of as a "compile-time" check, in that the program user will not have to manually run such tests.

Combinations of dynamic and static typing


The presence of static typing in a programming language does not necessarily imply the absence of all dynamic typing mechanisms. For example, Java, and various other object-oriented languages, while using static typing, require for certain operations (downcasting
Downcasting

In object-oriented programming, downcasting or type refinement is the act of Cast a reference to a base class to one of its derived classes.In most programming languages, it is usually possible to check through Runtime Type Information whether the type of the referenced object is indeed the one being cast to or a derived type of it, and th...
) the support of runtime type tests, a form of dynamic typing. See programming language
Programming language

A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer....
 for more discussion of the interactions between static and dynamic typing.

Static and dynamic type checking in practice

The choice between static and dynamic typing requires trade-off
Trade-off

A trade-off is a situation that involves losing one quality or aspect of something in return for gaining another quality or aspect. It implies a decision to be made with full comprehension of both the upside and downside of a particular choice....
s.

Static typing can find type errors reliably at compile time. This should increase the reliability of the delivered program. However, programmers disagree over how commonly type errors occur, and thus what proportion of those bugs which are written would be caught by static typing. Static typing advocates believe programs are more reliable when they have been well type-checked, while dynamic typing advocates point to distributed code that has proven reliable and to small bug databases. The value of static typing, then, presumably increases as the strength of the type system is increased. Advocates of dependently-typed languages
Dependent type

In computer science and logic, a dependent type is a type which depends on a value. Dependent types play a central role in intuitionistic type theory and in the design of experimental functional programming languages like Dependent ML and Epigram ....
 such as Dependent ML
Dependent ML

Dependent ML is an experimental functional programming language proposed by Frank Pfenning and Hongwei Xi. Dependent ML extends ML programming language by a restricted notion of dependent types: types may dependent on static indices of type Nat....
 and Epigram have suggested that almost all bugs can be considered type errors, if the types used in a program are properly declared by the programmer or correctly inferred by the compiler.

Static typing usually results in compiled code that executes more quickly. When the compiler knows the exact data types that are in use, it can produce optimized machine code. Further, compilers for statically typed languages can find assembler shortcuts more easily. Some dynamically-typed languages such as Common Lisp
Common Lisp

Common Lisp, commonly abbreviated CL, is a dialect of the Lisp programming language, published in American National Standards Institute standard document Information Technology - Programming Language - Common Lisp, formerly X3.226-1994 ....
 allow optional type declarations for optimization for this very reason. Static typing makes this pervasive. See optimization
Optimization (computer science)

In computing, optimization is the process of modifying a system to make some aspect of it work more efficiently or use fewer resources. For instance, a computer program may be optimized so that it executes more rapidly, or is capable of operating with less Computer data storage or other resources, or draw less power....
.

By contrast, dynamic typing may allow compilers to run more quickly and allow interpreters to dynamically load new code, since changes to source code in dynamically-typed languages may result in less checking to perform and less code to revisit. This too may reduce the edit-compile-test-debug cycle.

Statically-typed languages which lack type inference
Type inference

Type inference, or implicit typing, refers to the ability to deduce automatically the type of a value in a programming language. It is a feature present in some strongly-typed programming language static typing#Static and dynamic typing languages....
 (such as Java and C) require that programmers declare the types they intend a method or function to use. This can serve as additional documentation for the program, which the compiler will not permit the programmer to ignore or permit to drift out of synchronization. However, a language can be statically typed without requiring type declarations (examples include Haskell
Haskell (programming language)

Haskell is a standardized, purely functional programming language with non-strict programming language, named after logician Haskell Curry. The goals of the language are described as:...
, Scala and C#3.0), so this is not a necessary consequence of static typing.

Dynamic typing allows constructs that some static type checking would reject as illegal. For example, eval
Eval

In some programming languages, eval is a subroutine which evaluates a string as though it were an expression and returns a result; in others, it executes multiple lines of code as though they had been included instead of the line including the eval....
 functions, which execute arbitrary data as code, become possible (however, the typing within that evaluated code might remain static). Furthermore, dynamic typing better accommodates transitional code and prototyping, such as allowing a placeholder data structure (mock object
Mock object

In object-oriented programming, mock objects are simulated objects that mimic the behavior of real objects in controlled ways. A computer programming typically creates a mock object to test the behavior of some other object, in much the same way that a car designer uses a crash test dummy to Simulation the dynamic behavior of a human in vehi...
) to be transparently used in place of a full-fledged data structure (usually for the purposes of experimentation and testing). Recent enhancements to statically typed languages (e.g. Haskell Generalized algebraic data types) have allowed eval functions to be written in a type-safe way.

Dynamic typing typically makes metaprogramming
Metaprogramming

Metaprogramming is the writing of computer programs that write or manipulate other programs as their data, or that do part of the work at runtime that would otherwise be done at compile time....
 more effective and easier to use. For example, C++
C++

C++ is a general-purpose programming language. It is regarded as a middle-level language, as it comprises a combination of both high-level programming language and low-level programming language language features....
 templates
Template (programming)

Templates are a feature of the C++ programming language that allow functions and classes to operate with Generic programming. This allows a function or class to work on many different datatype without being rewritten for each one....
 are typically more cumbersome to write than the equivalent Ruby or Python
Python (programming language)

Python is a general-purpose high-level programming language. Its design philosophy emphasizes code readability. Python's core syntax and semantics are Minimalism , while the standard library is large and comprehensive....
 code. More advanced run-time constructs such as metaclass
Metaclass

In object-oriented programming Computer programming, a metaclass is a Class whose instances are classes. Just as an ordinary class defines the behavior of certain objects, a metaclass defines the behavior of certain classes and their instances....
es and introspection are often more difficult to use in statically-typed languages.

Static typing expands the possibilities for programmatic refactoring. For example, in a statically typed language, analysis of the source code can reveal all callers of a method, and hence a tool can consistently rename the method throughout all of its uses. In dynamic languages, this kind of analysis is either impossible or more difficult because the reference of a name (e.g. a method name) cannot surely be determined until runtime.

Strong and weak typing


One definition of strongly typed involves preventing success for an operation on arguments which have the wrong type. A C
C (programming language)

C is a general-purpose computer programming language originally developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories to implement the Unix operating system....
 cast
Type conversion

In computer science, type conversion or typecasting refers to changing an entity of one data type into another. This is done to take advantage of certain features of type hierarchies....
 gone wrong exemplifies the problem of absent strong typing; if a programmer casts a value from one type to another in C, not only must the compiler allow the code at compile time, but the runtime must allow it as well. This may permit more compact and faster C code, but it can make debugging
Debugging

Debugging is a methodical process of finding and reducing the number of computer bugs, or defects, in a computer program or a piece of electronic hardware thus making it behave as expected....
 more difficult.

Some observers use the term memory-safe language (or just safe language) to describe languages that do not allow undefined operations to occur. For example, a memory-safe language will check array bounds
Bounds checking

In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before its use. It is particularly relevant to a variable used as an index into an array to ensure its value lies within the bounds of the array....
, or else statically guarantee (i.e., at compile time before execution) that array accesses out of the array boundaries will cause compile-time and perhaps runtime errors.

Weak typing means that a language implicitly converts (or casts) types when used. Revisiting the previous example, we have:

var x := 5; // (1) (x is an integer) var y := "37"; // (2) (y is a string) x + y; // (3)

It is not clear what result one would get in a weakly typed language. Some languages, such as Visual Basic
Visual Basic

'Visual Basic' is the third-generation programming language event-driven programming and integrated integrated development environment from Microsoft for its Component Object Model programming model....
, would produce runnable code producing the result 42: the system would convert the string "37" into the number 37 to forcibly make sense of the operation. Other languages like JavaScript
JavaScript

JavaScript is a scripting language widely used for client-side web development. It was the originating Programming language dialect of the ECMAScript standard....
 would produce the result "537": the system would convert the number 5 to the string "5" and then concatenate the two. In both Visual Basic and JavaScript, the resulting type is determined by rules that take both operands into consideration. In some languages, such as AppleScript
AppleScript

AppleScript is a scripting language devised by Apple Inc., and built into Mac OS. More generally, "AppleScript" is the word used to designate the Mac OS scripting interface, which is meant to operate in parallel with the graphical user interface....
, the type of the resulting value is determined by the type of the left-most operand only.

Reduction of operator overloading
Operator overloading

In computer programming, operator overloading is a specific case of polymorphism in which some or all of operator s like +, =, or have different implementations depending on the types of their arguments....
, such as not using "+" for string concatenation in addition to arithmetic addition, can reduce some of the confusion caused by weak typing. For example, PHP
PHP

PHP is a scripting language originally designed for producing dynamic web pages. It has evolved to include a command line interface capability and can be used in Standalone software Graphical user interface....
 uses periods (.) for string concatenation, and similarly Ada
Ada (programming language)

Ada is a structured programming, statically typed, Imperative programming, and Object-oriented programming high-level language computer programming programming language, extended from Pascal and other languages....
 uses the ampersand (&).

Safely and unsafely typed systems


A third way of categorizing the type system of a programming language uses the safety of typed operations and conversions. Computer scientists consider a language "type-safe" if it does not allow operations or conversions which lead to erroneous conditions.

Let us again have a look at the pseudocode example:

var x := 5; // (1) var y := "37"; // (2) var z := x + y; // (3)

In languages like Visual Basic variable z in the example acquires the value 42. While the programmer may or may not have intended this, the language defines the result specifically, and the program does not crash or assign an ill-defined value to z. In this respect, such languages are type-safe, however, if the value of y was a string that could not be converted to a number (eg "hello world"), the results would be undefined. Such languages are type-safe (in that they will not crash) but can easily produce undesirable results.

Now let us look at the same example in C:

int x = 5; char y[] = "37"; char* z = x + y;

In this example z will point to a memory address five characters beyond y, equivalent to two characters after the terminating zero character of the string pointed to by y. The content of that location is undefined, and might lie outside addressable memory. The mere computation of such a pointer may result in undefined behavior (including the program crashing) according to C standards, and in typical systems dereferencing z at this point could cause the program to crash. We have a well-typed, but not memory-safe program — a condition that cannot occur in a type-safe language.

Polymorphism and types


The term "polymorphism" refers to the ability of code (in particular, methods or classes) to act on values of multiple types, or to the ability of different instances of the same data-structure to contain elements of different types. Type systems that allow polymorphism generally do so in order to improve the potential for code re-use: in a language with polymorphism, programmers need only implement a data structure such as a list or an associative array
Associative array

An associative array is an abstract data type composed of a Collection of unique keys and a collection of values, where each key is associated with one value ....
 once, rather than once for each type of element with which they plan to use it. For this reason computer scientists sometimes call the use of certain forms of polymorphism generic programming
Generic programming

Generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters and was pioneered by Ada which appeared in 1983....
. The type-theoretic foundations of polymorphism are closely related to those of abstraction
Abstraction (computer science)

In computer science, abstraction is a mechanism and practice to reduce and factor out details so that one can focus on a few concepts at a time....
, modularity and (in some cases) subtyping
Subtype

In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
.

Duck typing

In "duck typing," a statement calling a method m on an object does not rely on the declared type of the object; only that the object, of whatever type, must implement the method called. One way of looking at this is that in "duck" typing systems the type of an object is intrinsic to the object and is determined by what methods it implements, and hence that a "duck" typing system is by definition type-safe since one can only invoke operations an object actually implements. Another way of looking at this is that the object is a member of several types, including a type that describes the fact that it "has a method m." Type checking however occurs only on demand at runtime, every time the method m needs to be executed, not at compile-time or load-time.

Duck typing differs from structural typing
Structural type system

A structural type system is a major class of type system, in which type compatibility and equivalence are determined by the type's structure, and not through explicit declarations....
 in that, if the part (of the whole module structure) needed for a given local computation is present at runtime, the duck type system is satisfied in its type identity analysis. On the other hand, a structural type system would require the analysis of the whole module structure at compile-time to determine type identity or type dependence.

Duck typing differs from a nominative type system
Nominative type system

In computer science nominative type system is a major class of type system, in which type compatibility and equivalence is determined by explicit declarations and/or the name of the types....
 in a number of aspects. The most prominent ones are that, for duck typing, type information is determined at runtime (as contrasted to compile-time) and the name of the type is irrelevant to determine type identity or type dependence; only partial structure information is required for that, for a given point in the program execution.

Initially coined by Alex Martelli
Alex Martelli

Alex Martelli is an Italian computer scientist and member of the Python Software Foundation. Since early 2005, he worked as "?ber Tech Lead" for Google, Inc....
 in the Python
Python (programming language)

Python is a general-purpose high-level programming language. Its design philosophy emphasizes code readability. Python's core syntax and semantics are Minimalism , while the standard library is large and comprehensive....
 community, duck typing uses the premise that (referring to a value) "if it walks like a duck, and quacks like a duck, then it is a duck".

Specialized type systems


Many type systems have been created that are specialized for use in certain environments, with certain types of data, or for out-of-band static program analysis. Frequently these are based on ideas from formal type theory
Type theory

In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general....
 and are only available as part of prototype research systems.

Dependent types


Dependent type
Dependent type

In computer science and logic, a dependent type is a type which depends on a value. Dependent types play a central role in intuitionistic type theory and in the design of experimental functional programming languages like Dependent ML and Epigram ....
s are based on the idea of using scalars or values to more precisely describe the type of some other value. For example, "matrix(3,3)" might be the type of a 3×3 matrix. We can then define typing rules such as the following rule for matrix multiplication:

matrix_multiply : matrix(k,m) × matrix(m,n) → matrix(k,n)


where k, m, n are arbitrary positive integer values. A variant of ML
ML programming language

ML is a general-purpose functional programming language developed by Robin Milner and others in the late 1970s at the University of Edinburgh, whose syntax is inspired by ISWIM....
 called Dependent ML
Dependent ML

Dependent ML is an experimental functional programming language proposed by Frank Pfenning and Hongwei Xi. Dependent ML extends ML programming language by a restricted notion of dependent types: types may dependent on static indices of type Nat....
 has been created based on this type system, but because type-checking conventional dependent types is undecidable, not all programs using them can be type-checked without some kind of limitations. Dependent ML limits the sort of equality it can decide to Presburger arithmetic
Presburger arithmetic

Presburger arithmetic is the first-order predicate calculus of the natural numbers with addition, named in honor of Mojzesz Presburger, who introduced it in 1929....
;other languages such as Epigram make the value of all expressions in the language decidable so that type checking can be decidable, it is also possible to make the language Turing complete at the price of undecidable type checking like in Cayenne .

Linear types


Linear types, based on the theory of linear logic
Linear logic

fr:Logique lin?aireLinear logic is a substructural logic invented by Jean-Yves Girard as a refinement of both classical_logic and intuitionistic logic, combining the symmetries of the former with many of the Constructivism_ properties of the latter....
, and closely related to uniqueness type
Uniqueness type

In computing, a unique type guarantees that an object is used in a single-threaded way, with at most a single reference to it. If a value has a unique type, a function applied to it can be made to update the value in-place in the object code....
s, are types assigned to values having the property that they have one and only one reference to them at all times. These are valuable for describing large immutable values such as strings, files, and so on, because any operation that simultaneously destroys a linear object and creates a similar object (such as 'str = str + "a"') can be optimized "under the hood" into an in-place mutation. Normally this is not possible because such mutations could cause side effects on parts of the program holding other references to the object, violating referential transparency. They are also used in the prototype operating system Singularity for interprocess communication, statically ensuring that processes cannot share objects in shared memory in order to prevent race conditions. The Clean language (a Haskell
Haskell (programming language)

Haskell is a standardized, purely functional programming language with non-strict programming language, named after logician Haskell Curry. The goals of the language are described as:...
-like language) uses this type system in order to gain a lot of speed while remaining safe.

Intersection types


Intersection types are types describing values that belong to both of two other given types with overlapping value sets. For example, in most implementations of C the signed char has range -128 to 127 and the unsigned char has range 0 to 255, so the intersection type of these two types would have range 0 to 127. Such an intersection type could be safely passed into functions expecting either signed or unsigned chars, because it is compatible with both types.

Intersection types are useful for describing overloaded function types: For example, if "int ? int" is the type of functions taking an integer argument and returning an integer, and "float ? float" is the type of functions taking a float argument and returning a float, then the intersection of these two types can be used to describe functions that do one or the other, based on what type of input they are given. Such a function could be passed into another function expecting an "int ? int" function safely; it simply would not use the "float ? float" functionality.

In a subclassing hierarchy, the intersection of a type and an ancestor type (such as its parent) is the most derived type. The intersection of sibling types is empty.

The Forsythe language includes a general implementation of intersection types. A restricted form is refinement types.

Union types


Union types are types describing values that belong to either of two types. For example, in C, the signed char has range -128 to 127, and the unsigned char has range 0 to 255, so the union of these two types would have range -128 to 255. Any function handling this union type would have to deal with integers in this complete range. More generally, the only valid operations on a union type are operations that are valid on both types being unioned. C's "union" concept is similar to union types, but is not typesafe because it permits operations that are valid on either type, rather than both. Union types are important in program analysis, where they are used to represent symbolic values whose exact nature (eg, value or type) is not known.

In a subclassing hierarchy, the union of a type and an ancestor type (such as its parent) is the ancestor type. The union of sibling types is a subtype of their common ancestor (that is, all operations permitted on their common ancestor are permitted on the union type, but they may also have other valid operations in common).

Existential types


Existential types are frequently used to represent module
Module

Module or modular may refer to:...
s and abstract data type
Abstract data type

In computing, an abstract data type is a specification of a set of data and the set of operations that can be performed on the data. Such a data type is abstract in the sense that it is independent of various concrete implementations....
s because of their ability to separate implementation from interface. For example, in C pseudocode, the type "T = ?X " describes a module interface that has a data member of type X and a function that takes a parameter of the same type X and returns an integer. This could be implemented in different ways; for example:

  • intT
  • floatT


These types are both subtypes of the more general existential type T and correspond to concrete implementation types, so any value of one of these types is a value of type T. Given a value "t" of type "T", we know that "t.f(t.a)" is well-typed, regardless of what the abstract type X is. This gives flexibility for choosing types suited to a particular implementation while clients that use only values of the interface type — the existential type — are isolated from these choices.

Explicit or implicit declaration and inference

Many static type systems, such as C's and Java's, require type declarations: The programmer must explicitly associate each variable with a particular type. Others, such as Haskell's, perform type inference
Type inference

Type inference, or implicit typing, refers to the ability to deduce automatically the type of a value in a programming language. It is a feature present in some strongly-typed programming language static typing#Static and dynamic typing languages....
: The compiler draws conclusions about the types of variables based on how programmers use those variables. For example, given a function f(x,y) which adds x and y together, the compiler can infer that x and y must be numbers – since addition is only defined for numbers. Therefore, any call to f elsewhere in the program that specifies a non-numeric type (such as a string or list) as an argument would signal an error.

Numerical and string constants and expressions in code can and often do imply type in a particular context. For example, an expression 3.14 might imply a type of floating-point, while [1, 2, 3] might imply a list of integers – typically an array
Array

In computer science, an array is a data structure consisting of a group of element s that are accessed by index . In most programming languages each element has the same data type and the array occupies a contiguous area of computer memory....
.

Type inference is in general possible if it is decidable in the type theory in question. Moreover, even if inference is undecidable in general for a given type theory, inference is often possible for a large subset of real-world programs. Haskell's type system, a version of Hindley-Milner
Type inference

Type inference, or implicit typing, refers to the ability to deduce automatically the type of a value in a programming language. It is a feature present in some strongly-typed programming language static typing#Static and dynamic typing languages....
, is a restriction of System F?
System F

System F, also known as the polymorphic lambda calculus or the second-order lambda calculus, is a typed lambda calculus. It was discovered independently by the logician Jean-Yves Girard and the computer scientist John C....
 to so-called rank-1 polymorphic types, in which type inference is decidable. Most Haskell compilers allow arbitrary-rank polymorphism as an extension, but this makes type inference undecidable. (Type checking is decidable, however, and rank-1 programs still have type inference; higher rank polymorphic programs are rejected unless given explicit type annotations.)

Types of types



A type of types is a kind
Kind (type theory)

In computer science and type theory, a kind is a "type of a type". Kinds appear, either explicitly or implicitly, in certain languages with complex type systems, such as Haskell and Scala ....
. Kinds appear explicitly in typeful programming
Typeful programming

In computer science typeful programming is a programming style identified by widespread use of type theory handled through mechanical typechecking techniques....
, such as a type constructor in the Haskell
Haskell (programming language)

Haskell is a standardized, purely functional programming language with non-strict programming language, named after logician Haskell Curry. The goals of the language are described as:...
 programming language.

Types fall into several broad categories:
  • primitive type
    Primitive type

    In computer science, primitive type can refer to either of the following concepts:* a basic type is a data type provided by a programming language as a basic building block....
    s
    — the simplest kind of type; e.g., integer
    Integer

    The integers are natural numbers including 0 and their negative and non-negative numberss . They are numbers that can be written without a fractional or decimal component, and fall within the set ....
     and floating-point number
    boolean
    Boolean

    Boolean , as a noun or an adjective, may refer to* Boolean algebra , a logical calculus of truth values or set membership* Boolean algebra , a set with operations resembling logical ones...
    integral types — types of whole numbers; e.g., integers and natural numbers
    floating point types — types of numbers in floating-point representation
  • reference type
    Reference type

    In Object-oriented programming, a reference type is a data type that can only be accessed by Reference . Unlike objects of value types, objects of reference types cannot be directly embedded into composite type and are always Dynamic memory allocation....
    s
  • null types
  • composite type
    Composite type

    In computer science, composite types are datatypes which can be constructed in a programming language out of that language's basic primitive types and other composite types....
    s
    — types composed of basic types; e.g., array
    Array

    In computer science, an array is a data structure consisting of a group of element s that are accessed by index . In most programming languages each element has the same data type and the array occupies a contiguous area of computer memory....
    s or records
    Record (computer science)

    In computer science, a record type or struct is a type whose values are records, i.e. aggregates of several items of possibly different types....
    .
    abstract data type
    Abstract data type

    In computing, an abstract data type is a specification of a set of data and the set of operations that can be performed on the data. Such a data type is abstract in the sense that it is independent of various concrete implementations....
    s
  • algebraic types
  • subtype
    Subtype

    In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
  • derived type
    Derived type

    A derived type is a type given a new type but structurally the same as the original type. The purpose of this type is to create a new type name so that two values can have two distinct types in terms of name....
  • object type
    Object type

    In computer science, an object type is a datatype which is used in object-oriented programming to wrapper pattern a non-object type to make it look like a Reference type object ....
    s
    ; e.g., type variable
  • partial type
  • recursive type
    Recursive type

    In computer programming languages, a recursive type is a data type for values that may contain other values of the same type.An example is the List type, in Haskell :...
  • function types; e.g., binary functions
  • universally quantified
    Universal quantification

    In predicate logic, universal quantification formalizes the notion that something is true for everything, or every relevant thing.The resulting statement is a universally quantified statement, and we have universally quantified over the predicate....
     types
    , such as parameterized types
  • existentially quantified
    Existential quantification

    In predicate logic, an existential quantification is the predication of a property or relation to at least one member of the domain. In laymen's terms, it simply refers to something....
     types
    , such as modules
  • refinement types — types which identify subsets of other types
  • dependent type
    Dependent type

    In computer science and logic, a dependent type is a type which depends on a value. Dependent types play a central role in intuitionistic type theory and in the design of experimental functional programming languages like Dependent ML and Epigram ....
    s
    — types which depend on run-time values
  • ownership types — types which describe or constrain the structure of object-oriented systems


Compatibility: equivalence and subtyping

A type-checker for a statically typed language must verify that the type of any expression
Expression (programming)

An expression in a programming language is a combination of value s, variables, operator s, and function s that are interpreted according to the particular Order of operations and of association for a particular programming language, which computes and then produces another value....
 is consistent with the type expected by the context in which that expression appears. For instance, in an assignment statement of the form x := e, the inferred type of the expression e must be consistent with the declared or inferred type of the variable x. This notion of consistency, called compatibility, is specific to each programming language.

If the type of e and the type of x are the same and assignment is allowed for that type, then this is a valid expression. In the simplest type systems, therefore, the question of whether two types are compatible reduces to that of whether they are equal (or equivalent). Different languages, however, have different criteria for when two type expressions are understood to denote the same type. These different equational theories of types vary widely, two extreme cases being structural type system
Structural type system

A structural type system is a major class of type system, in which type compatibility and equivalence are determined by the type's structure, and not through explicit declarations....
s,
in which any two types are equivalent that describe values with the same structure, and nominative type system
Nominative type system

In computer science nominative type system is a major class of type system, in which type compatibility and equivalence is determined by explicit declarations and/or the name of the types....
s,
in which no two syntactically distinct type expressions denote the same type (i.e., types must have the same "name" in order to be equal).

In languages with subtyping
Subtype

In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
, the compatibility relation is more complex. In particular, if A is a subtype of B, then a value of type A can be used in a context where one of type B is expected, even if the reverse is not true. Like equivalence, the subtype relation is defined differently for each programming language, with many variations possible. The presence of parametric or ad hoc polymorphism in a language may also have implications for type compatibility.

Controversy

There are often conflicts between those who prefer statically-typed languages and those who prefer dynamically typed languages. The first group advocates for the early detection of type errors during compilation and increased runtime performance, while the latter group advocates for rapid prototyping that is possible with a more dynamic typing system and that type errors are only a small subset of errors in a program.. Related to this is the consideration that often there is no need to manually declare all types in statically typed programming languages with type inference; thus, the need for the programmer to explicitly specify types of variables is automatically lowered for such languages; and some dynamic languages have run-time optimisers that can generate fast code approaching the speed of static language compilers, often by using partial type inference.

See also

  • Type theory
    Type theory

    In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general....
  • Operator overloading
    Operator overloading

    In computer programming, operator overloading is a specific case of polymorphism in which some or all of operator s like +, =, or have different implementations depending on the types of their arguments....
  • Polymorphism in object-oriented programming
    Polymorphism in object-oriented programming

    In simple terms, polymorphism is the ability of one type, A, to appear as and be used like another type, B. In strongly typed languages, this usually means that type A somehow derives from type B, or type A implements an interface that represents type B....
  • Programming language
    Programming language

    A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer....
  • Type signature
    Type signature

    Type signature is a term that is used in computer programming. A type signature defines the inputs and outputs for a subroutine or method ....
  • Signedness
    Signedness

    In computing, signedness is a property of variables representing numbers in computer programs. A numeric variable is signed if it can represent both negative and non-negative numbers, and unsigned if it can only represent positive numbers....
  • Type system cross reference list


External links

  • , by Chris Smith