Home      Discussion      Topics      Dictionary      Almanac
Signup       Login
Type safety

Type safety

Overview
In computer science
Computer science
Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems. It is frequently described as the systematic study of algorithmic processes that create, describe and transform...

, type safety is a property of some programming language
Programming language
A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human...

s that is defined differently by different communities, but most definitions involve the use of a type system
Type system
In computer science, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute."...

 to prevent certain erroneous or undesirable program behavior (called type errors). The formal type-theoretic
Type theory
In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general...

 definition is considerably stronger than what is understood by most programmers.

The enforcement can be static, catching potential errors at compile time
Compile time
In computer science, compile time refers to either the operations performed by a compiler , programming language requirements that must be met by source code for it to be successfully compiled , or properties of the program that can be reasoned about at compile time.The operations performed at...

, or dynamic, associating type information with values at run time and consulting them as needed to detect imminent errors, or a combination of both.
Discussion
Ask a question about 'Type safety'
Start a new discussion about 'Type safety'
Answer questions from other users
Full Discussion Forum
 
Encyclopedia
In computer science
Computer science
Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems. It is frequently described as the systematic study of algorithmic processes that create, describe and transform...

, type safety is a property of some programming language
Programming language
A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human...

s that is defined differently by different communities, but most definitions involve the use of a type system
Type system
In computer science, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute."...

 to prevent certain erroneous or undesirable program behavior (called type errors). The formal type-theoretic
Type theory
In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general...

 definition is considerably stronger than what is understood by most programmers.

The enforcement can be static, catching potential errors at compile time
Compile time
In computer science, compile time refers to either the operations performed by a compiler , programming language requirements that must be met by source code for it to be successfully compiled , or properties of the program that can be reasoned about at compile time.The operations performed at...

, or dynamic, associating type information with values at run time and consulting them as needed to detect imminent errors, or a combination of both. Type safety is a property of the programming language, not of the programs themselves. For example, it is possible to write a safe program in a type-unsafe language.

The behaviors classified as type errors by a given programming language are usually those that result from attempts to perform operations on values
Value (computer science)
In computer science, a value is a sequence of bits that is interpreted according to some data type. It is possible for the same sequence of bits to have different values, depending on the type used to interpret its meaning...

 that are not of the appropriate data type
Data type
A data type in programming languages is a set of values and the operations on those values.-Overview:Almost all programming languages explicitly include the notion of data type, though different languages may use different terminology...

. This classification is partly based on opinion; some language designers and programmers argue that any operation not leading to program crashes, security flaws or other obvious failures is legitimate and need not be considered an error, while others consider any contravention of the programmer's explicit intent (as communicated via typing annotations) to be erroneous and not "type safe."

In the context of static type systems, type safety usually involves (among other things) a guarantee that the eventual value of any expression
Expression (programming)
An expression in a programming language is a combination of values, variables, operators, and functions that are interpreted according to the particular rules of precedence and of association for a particular programming language, which computes and then produces another value...

 will be a legitimate member of that expression's static type. The precise requirement is more subtle than this — see, for example, subtype
Subtype
In programming language theory, subtyping or subtype polymorphism is a form of type polymorphism in which a subtype is a datatype that is related to another datatype by some notion of substitutability, meaning that program constructs, typically subroutines or functions, written to operate on...

 and polymorphism for complications.

Type safety is closely linked to memory safety
Memory safety
Memory safety is a concern in software development that aims to avoid software bugs that cause security vulnerabilities dealing with RAM memory access, such as buffer overflows and dangling pointers....

, a restriction on the ability to copy arbitrary bit patterns from one memory location to another. For instance, in an implementation of a language that has some type , such that some sequence of bits (of the appropriate length) does not represent a legitimate member of , if that language allows data to be copied into a variable
Variable (programming)
In computer programming, a variable is a facility for storing data. The current value of the variable is the data actually stored in the variable. Depending on the programming language in question, the data stored in the variable can be intentionally altered during the program run. This is why it...

 of type , then it is not type safe because such an operation might assign a non- value to that variable. Conversely, if the language is type unsafe to the extent of allowing an arbitrary integer to be used as a pointer, then it is clearly not memory safe.

Most statically typed languages provide a degree of type safety that is strictly stronger than memory safety, because their type systems enforce the proper use of abstract data type
Abstract data type
In computing, an abstract data type or abstract data structure is a mathematical model for a certain class of data structures that have similar behavior; or for certain data types of one or more programming languages that have similar semantics...

s defined by programmers even when this is not strictly necessary for memory safety or for the prevention of any kind of catastrophic failure.

Definitions


Robin Milner
Robin Milner
Arthur John Robin Gorell Milner FRS FRSE is a prominent British computer scientist.-Life, education and career:...

 provided the following slogan to describe type safety:
"Well-typed programs never go wrong."

The appropriate formalization of this slogan depends on the style of formal semantics used for a particular language. In the context of denotational semantics
Denotational semantics
In computer science, denotational semantics is an approach to formalizing the meanings of programming languages by constructing mathematical objects which describe the meanings of expressions from the languages...

, type safety means that the value of an expression that is well-typed, say with type τ, is a bona fide member of the set corresponding to τ.

In 1994, Andrew Wright and Matthias Felleisen
Matthias Felleisen
Matthias Felleisen is a computer science professor and an author of German background.Felleisen is currently a Trustee Professor in the College of Computer and Information Science at Northeastern University in Boston, Massachusetts. In the past he has taught at Rice University after receiving his...

 formulated what is now the standard definition and proof technique for type safety in languages defined by operational semantics
Operational semantics
In computer science, operational semantics is a way to give meaning to computer programs in a mathematically rigorous way. Other approaches to providing a formal semantics of programming languages include axiomatic semantics and denotational semantics....

. Under this approach, type safety is determined by two properties of the semantics of the programming language:

(Type-) preservation: "Well typedness" of programs remains invariant under the transition rules (i.e. evaluation rules or reduction rules) of the language.
Progress: A well typed program never gets "stuck", i.e., never gets into an undefined state where no further transitions are possible.

These properties do not exist in a vacuum; they are linked to the semantics of the programming language they describe, and there is a large space of varied languages that can fit these criteria, since the notion of "well typed" program is part of the static semantics of the programming language and the notion of "getting stuck" (or "going wrong") is a property of its dynamic semantics.

Vijay Saraswat provides the following definition:
"A language is type-safe if the only operations that can be performed on data in the language are those sanctioned by the type of the data."

Type-safe and type-unsafe languages


Type safety is usually a requirement for any toy language
Toy language
----For the toy language named TOY, see TOY .----A toy language is a term for a computer programming language that it not considered to fulfill the robustness or completeness requirement of a computer programming language...

 proposed in academic programming language research. Many languages, on the other hand, are too big for human-generated type-safety proofs, as they often require checking thousands of cases. Nevertheless, some languages such as Standard ML
Standard ML
Standard ML is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.SML is a modern descendant of the ML...

, which has rigorously defined semantics, have been proved to meet one definition of type safety. On the other hand, some languages, like Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

, have been proved to not meet this definition of type safety. Some other languages such as Haskell
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry.- History :...

 are believed to meet some definition of type safety, provided certain "escape" features are not used (for example Haskell's unsafePerformIO, used to "escape" from the usual restricted environment in which I/O is possible, circumvents the type system and so can be used to break type safety). Type punning
Type punning
__FORCETOC__In computer science, type punning is a common term for any programming technique that subverts or circumvents the type system of a programming language in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal language.In C and C++,...

 is another example of such an "escape" feature. Regardless of the properties of the language definition, certain errors may occur at runtime due to bugs in the implementation, or in linked libraries
Library (computer science)
In computer science, a library is a collection of subroutines or classes used to develop software. Libraries contain code and data that provide services to independent programs. This allows the sharing and changing of code and data in a modular fashion. Some executables are both standalone programs...

 written in other languages; such errors could render a given implementation type unsafe in certain circumstances.

Memory management in type-safe languages


To be type-safe, a language must have garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. It is a special case of resource management, in which the limited resource being managed is memory. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no...

 or otherwise restrict the allocation and de-allocation of memory. Specifically, it must not allow dangling pointer
Dangling pointer
Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type.Dangling pointers arise when an object is deleted or deallocated, without modifying the value of the pointer, so that the pointer still points to the memory location...

s across structurally different types. This is because if a typed language (like Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...

) required that allocated memory be explicitly released, and a dangling pointer still points to the old memory location, then a new data structure may be allocated in the same space with the slot the pointer refers to but point to a different type. For example, if the pointer initially points to a structure with an integer field, but in the new object a pointer field is allocated to the place of the integer, then the pointer field could be changed to anything by changing the value of the integer field (via dereferencing the dangling pointer). Because it is not specified what would happen when such a pointer is changed, the language is not type safe. Most type-safe languages satisfy these restrictions by using garbage collection to implement memory management.

Garbage collectors themselves are best implemented in languages that allow pointer arithmetic, so the library that implements the collector is best done in a type-unsafe language or a language where type safety can be deactivated. C and C++ are often used.

Type safety and strong typing


Type safety is synonymous with one of the many definitions of strong typing; but type safety and dynamic typing are mutually compatible. A dynamically typed language such as Smalltalk
Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

 can be seen as a strongly typed language with a very permissive type system where any syntactically correct program is well-typed; as long as its dynamic semantics ensures that no such program ever "goes wrong" in an appropriate sense, it satisfies the definition above and can be called type-safe.

Ada



Ada was designed to be suitable for embedded system
Embedded system
An embedded system is a computer system designed to perform one or a few dedicated functions , often with real-time computing constraints. It is embedded as part of a complete device often including hardware and mechanical parts. In contrast, a general-purpose computer, such as a personal...

s, device driver
Device driver
In computing, a device driver or software driver is a computer program allowing higher-level computer programs to interact with a hardware device....

s and other forms of system programming
System programming
System programming is the activity of programming system software. The primary distinguishing characteristic of systems programming when compared to application programming is that application programming aims to produce software which provides services to the user System programming (or systems...

, but also to encourage type safe programming. To resolve these conflicting goals, Ada confines type-unsafety to a certain set of special constructs whose names usually begin with the string Unchecked_. Unchecked_Deallocation can be effectively banned from a unit of Ada text by applying pragma Pure to this unit. It is expected that programmers will use Unchecked_ constructs very carefully and only when necessary; programs that do not use them are type safe.

The SPARK programming language
SPARK programming language
SPARK is a formally-defined computer programming language based on the Ada programming language, intended to be secure and to support the development of high integrity software used in applications and systems where predictable and highly reliable operation is essential either for reasons of safety...

 is a subset of Ada eliminating all its potential ambiguities and insecurities while at the same time adding statically checked contracts
Design by contract
Design by Contract or Programming by Contract is an approach to designing computer software. It prescribes that software designers should define formal, precise and verifiable interface specifications for software components, which extend the ordinary definition of abstract data types with...

 to the language features available. SPARK avoids the issues with dangling pointers by disallowing allocation at run time entirely.

C



The C programming language
C (programming language)
C is a general-purpose computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 is widely considered to be the poster child
Poster child
A poster child is a child afflicted by some disease or deformity whose picture is used on posters or other media as part of a campaign to raise money or enlist volunteers for a cause or organization...

 of type-unsafe languages. While the language definition explicitly calls out the fact that behavior of 'type-unsafe' conversions is not defined, most implementations perform conversions that programmers find useful. The widespread use of C idioms that make use of unspecified or undefined conversions has helped give it a reputation for being a type-unsafe language (the same kind of conversions can be performed in Ada using unchecked conversions, however this usage is much less common than in C).

Nevertheless, C# (the latest in the C languages tree), amongst other new features that it introduced, includes type safety.

Standard ML



SML
Standard ML
Standard ML is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.SML is a modern descendant of the ML...

 has a rigorously defined semantics and is known to be type safe. However, some implementations of SML, including Standard ML of New Jersey
Standard ML of New Jersey
Standard ML of New Jersey is a compiler and programming environment for Standard ML. Aside from its runtime system, which is written in C, SML/NJ is written in Standard ML...

 (SML/NJ), its syntactic variant Mythryl
Mythryl
Mythryl is a general-purpose, modular, functional programming language with compile-time type checking and type inferencesupporting both scripting and application development....

 and Mlton
MLton
MLton is an open source, whole-program optimizing compiler for the Standard ML programming language.MLton aims to produce fast executables, and to encourage rapid prototyping and modular programming by eliminating performance penalties often associated with the use of high-level language...

, provide libraries that offer certain unsafe operations. These facilities are often used in conjunction with those implementations' foreign function interface
Foreign function interface
A foreign function interface is a mechanism by which a program written in one programming language can call routines or make use of services written in another. The term comes from the specification for Common Lisp, which explicitly refers to the language features for inter-language calls as...

s to interact with non-ML code (such as C libraries) that may require data laid out in specific ways. Another example is the SML/NJ interactive toplevel
Read-eval-print loop
A read-eval-print loop , also known as an interactive toplevel, is a simple, interactive computer programming environment. The term is most usually used to refer to a Lisp interactive environment, but can be applied to command line shells and similar environments for Smalltalk, Perl, Scala,...

 itself, which must use unsafe operations to execute ML code entered by the user.

Pascal



Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...

has had a number of type safety requirements, some of which are kept in some compilers. Where a Pascal compiler dictates "strict typing", two variables cannot be assigned to each other unless they are either compatible (such as conversion of integer to real) or assigned to the identical subtype. For example, if you have the following code fragment:


type
TwoTypes = record
I: Integer;
Q: Real;
end;

DualTypes = record
I: Integer;
Q: Real;
end;

var
T1, T2: TwoTypes;
D1, D2: DualTypes;


Under strict typing, a variable defined as TwoTypes is not compatible with DualTypes (because they are not identical, even though the components of that user defined type are identical) and an assignment of T1 := D2; is illegal. An assignment of T1 := T2; would be legal because the subtypes they are defined to are identical. However, an assignment such as T1.Q := D1.Q; would be legal.