All Topics  
Covariance and contravariance (computer science)

 

   Email Print
   Bookmark   Link






 

Covariance and contravariance (computer science)



 
 
Within the type system
Type system

In computer science, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute."....
 of a programming language
Programming language

A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer....
, an operator from types to types is covariant if it preserves the ordering, =, of types
Subtype

In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
, which orders types from more specific ones to more generic ones; it is contravariant if it reverses this ordering. If neither of these apply, the operator is invariant. These terms come from category theory
Category theory

In mathematics, category theory deals in an abstract way with mathematical structures and relationships between them: it abstracts from set s and function s to objects linked in diagrams by morphisms or arrows....
, which has a general definition of covariance and contravariance that unifies the computer science definition of these terms with the defintion used in vector spaces.

This distinction is important in considering argument and return types of methods in class hierarchies
Class hierarchy

As in taxonomy, the classifications of species, a class hierarchy in computer science is a classification of object types, denoting objects as the instantiations of classes inter-relating the various classes by relationships such as "inherits", "extends", "is an abstraction of", "an interface definition"....
.






Discussion
Ask a question about 'Covariance and contravariance (computer science)'
Start a new discussion about 'Covariance and contravariance (computer science)'
Answer questions from other users
Full Discussion Forum



Encyclopedia


Within the type system
Type system

In computer science, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute."....
 of a programming language
Programming language

A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer....
, an operator from types to types is covariant if it preserves the ordering, =, of types
Subtype

In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
, which orders types from more specific ones to more generic ones; it is contravariant if it reverses this ordering. If neither of these apply, the operator is invariant. These terms come from category theory
Category theory

In mathematics, category theory deals in an abstract way with mathematical structures and relationships between them: it abstracts from set s and function s to objects linked in diagrams by morphisms or arrows....
, which has a general definition of covariance and contravariance that unifies the computer science definition of these terms with the defintion used in vector spaces.

This distinction is important in considering argument and return types of methods in class hierarchies
Class hierarchy

As in taxonomy, the classifications of species, a class hierarchy in computer science is a classification of object types, denoting objects as the instantiations of classes inter-relating the various classes by relationships such as "inherits", "extends", "is an abstraction of", "an interface definition"....
. In object-oriented languages such as C++, if class B is a subtype of class A, then all member functions of B must return the same or narrower set of types as A; the return type is said to be covariant. On the other hand, the member functions of B must take the same or broader set of arguments compared with the member functions of A; the argument type is said to be contravariant. The problem for instances of B is how to be perfectly substitutable
Liskov substitution principle

In object-oriented programming, the Liskov substitution principle is a particular definition of subtype that was introduced by Barbara Liskov in a 1987 conference keynote address entitled Data abstraction and hierarchy...
 for instances of A. The only way to guarantee type safety and substitutability is to be equally or more liberal than A on inputs, and to be equally or more strict than A on outputs. Note that not all programming languages guarantee both properties in every context, and that some are unnecessarily strict; they are said not to support covariance or contravariance in a given context; the behavior of some programming languages is discussed below.

Typical examples:
  • The operator which constructs array types
    Array

    In computer science, an array is a data structure consisting of a group of element s that are accessed by index . In most programming languages each element has the same data type and the array occupies a contiguous area of computer memory....
     from element types is usually covariant on the base type: since String = Object then ArrayOf(String) = ArrayOf(Object). Note that this is only correct (i.e. type safe) if the array is immutable; if insert and remove operators are permitted, then the insert operator is covariant (e.g. one can insert a String into an ArrayOf(Object)) and the remove operator is contravariant (e.g. one can remove an Object from an ArrayOf(String)). Since the mutators have conflicting variance, mutable arrays should be invariant on the base type.
  • A function with a parameter of type T (defined as fun f (x : T) : Integer) can be replaced by a function g (defined as fun g (x : S) : Integer) if T = S. In other words, if g cares less about the type of its parameter, then it can replace f anywhere, since both return an Integer. So, in a language accepting function arguments, g = f and the type of the parameter to f is said to be contravariant.
  • In the general case, the type of the result is covariant.


In object-oriented programming
Object-oriented programming

Object-oriented programming is a programming paradigm that uses "Object_" and their interactions to design applications and computer programs....
, substitution is also implicitly invoked by overriding
Method overriding (programming)

Method overriding, in object oriented programming, is a language feature that allows a Subclass to provide a specific implementation of a method that is already provided by one of its superclass es....
 methods
Method (computer science)

In object-oriented programming, a method is a subroutine that is exclusively associated either with a class or with an object . Like a procedure in procedural programming languages, a method usually consists of a sequence of statement to perform an action, a set of input parameter to customize those actions, and possibly an output value...
 in subclass
Subclass (computer science)

In object-oriented programming, a subclass is a class that Inheritance some properties from its superclass .You can usually think of the subclass as being "a kind of" its superclass, as in a "a Manx is a kind of cat", or "a square is a kind of rectangle":...
es: the new method can be used where the old method was invoked in the original code. Programming languages vary widely on their allowed forms of overriding, and on the variance of overridden methods' types.

Origin of the terms


The origin of these terms is in category theory
Category (mathematics)

In mathematics, a category is a fundamental and abstract way to describe mathematical entities and their relationships. A category is composed of a collection of abstract "objects" of any kind, linked together by a collection of abstract "morphism" of any kind that have a few basic properties ....
, where the types in the type system
Type system

In computer science, a type system may be defined as "a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute."....
 form a category C, with arrows representing the subtype relationship. The subtype relationship supposedly reflects the substitution principle: that any expression of type t can be substituted by an expression of type s if s = t.

Defining a function that accepts type p and returns type r creates a new type p ? r in the type system which the new function name is associated with. This function definition operator is actually a functor
Functor

In category theory, a branch of mathematics, a functor is a special type of mapping between categories. Functors can be thought of as morphisms in the category of small categories....
 F : C × C ? C that creates the said type. From the substitution principle above, this functor must be contravariant
Functor

In category theory, a branch of mathematics, a functor is a special type of mapping between categories. Functors can be thought of as morphisms in the category of small categories....
 in the first argument and covariant
Functor

In category theory, a branch of mathematics, a functor is a special type of mapping between categories. Functors can be thought of as morphisms in the category of small categories....
 in the second.

Need for covariant argument types?


In many strictly-typed languages (with the notable exception of Eiffel
Eiffel

Eiffel can refer to several things:* The Eiffel Tower in Paris* Gustave Eiffel, engineer and designer of the Eiffel Tower* Eiffel , successor of Gustave Eiffel's engineering company...
, see below), subclassing must allow for substitution. That is, a child class can always stand in for a parent class. This places restrictions on the sorts of relationships that subclassing can represent. In particular, it means that arguments to member functions must be contravariant and return types must be covariant, as explained in previous section.

This creates problems in some situations, where argument types should be covariant to model real-life requirements. Suppose you have a class representing a person. A person can see the doctor, so this class might have a method virtual void Personsee(Doctor d). Now suppose you want to make a subclass of the Person class, Child. That is, a Child is a Person. One might then like to make a subclass of Doctor, Pediatrician. If children only visit pediatricians, we would like to enforce that in the type system. However, a naive implementation fails: because a Child is a Person, Childsee(d) must take any Doctor, not just a Pediatrician.

We could try moving the see method to the Doctor class hierarchy, but we would have the same problem: If a Doctor could see a Person and a Child is a Person, then there is still no way to enforce that a Child must see a Pediatrician and that a Person who is not a Child cannot see a Pediatrician and must see another Doctor.

In this case, the visitor pattern
Visitor pattern

In object-oriented programming and software engineering, the visitor design pattern is a way of separating an algorithm from an object structure upon which it operates....
 could be used to enforce this relationship. Another way to solve the problems, in C++
C++

C++ is a general-purpose programming language. It is regarded as a middle-level language, as it comprises a combination of both high-level programming language and low-level programming language language features....
, is using generic programming (see below).

Avoiding the need for covariant argument types

The problem arises since different object oriented languages have different strategies to select the actual code used in a particular context and the first parameter is the object itself (which is not contravariant).

However, Castagna showed that all depends on the correct method fetching algorithm: types used for runtime selection of the right method are covariant; types not used for runtime selection of the method are contravariant. In Castagna's work, examples which would suggest the usage of covariance for parameter types are treated with the usage of multiple dispatch
Multiple dispatch

Multiple dispatch or multimethods is the feature of some object-oriented programming languages in which a function or method can be dynamically dispatched based on the run time type of more than one of its arguments....
, i.e. overriding where the right method is selected also based on the type of some arguments; applying the rule, covariance is allowed for those argument types. However, this solution cannot be applied to most programming languages, since they do not support multiple dispatch
Multiple dispatch

Multiple dispatch or multimethods is the feature of some object-oriented programming languages in which a function or method can be dynamically dispatched based on the run time type of more than one of its arguments....


Note that for (static) overload resolution, the opposite rule applies: types used for compile-time method selection (i.e. parameter types) are contravariant; types not used to select the method are covariant.

These terms are also used in the context of modern programming languages that offer other functors to create new types with type variables, e.g., generic programming
Generic programming

Generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters and was pioneered by Ada which appeared in 1983....
 or parametric polymorphism, and exception handling
Exception handling

Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of exceptions - special conditions that change the normal flow of execution....
 where method definitions are enriched with annotations that indicate possible failures.

Overview of covariance/contravariance in some programming languages


Both the subtype
Subtype

In computer science, a subtype is a datatype that is generally related to another datatype by some notion of substitutability, meaning that computer programs written to operate on elements of the supertype can also operate on elements of the subtype....
 and method overriding concepts are defined differently between programming languages. They do not necessarily follow the substitution principle above, sometimes adding runtime checking instead. What follows is a simple comparison of how overriding methods behave in some common programming languages.

C++


C++
C++

C++ is a general-purpose programming language. It is regarded as a middle-level language, as it comprises a combination of both high-level programming language and low-level programming language language features....
 supports covariant return type
Return type

In computer programming, the return type defines and constrains the type of value returned from a method or Subroutine. In many programming languages the return type must be explicitly specified when declaring a function....
s in overridden virtual function
Virtual function

In object-oriented programming, a virtual function or virtual method is one whose behavior can be Method overriding within an inheriting class by a function with the same Method signature....
s. Adding the covariant return type was the first modification of the C++ language approved by the standards committee in 1998. See

With generic programming
Generic programming

Generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters and was pioneered by Ada which appeared in 1983....
, C++ allows for what amounts to covariance in argument and return type alike. For example, the argument and return types of member functions of the stdvector<T> class vary with T. The push_back method takes a const T&, so one pushes an int onto a vector<int> but a stdstring onto a vector<string>. This is done at compile time (statically) and, strictly speaking, is parametric polymorphism, because neither of vector<int> and vector<string> is a subtype of the other; this allows offering covariance for argument types without the undesirable effects discussed in the introduction.

Arrays in C# and Java


In the above discussion we have shown that type safety requires invariance of array types. However, arrays of reference types are covariant in both languages, and this leads to lack of static type safety: for instance, in C# string[] is a subtype of object[], and in Java [] is a subtype of [], although with some caveats. For instance, in C#, we have:

// a is a single-element array of System.String
string[] a = new string[1];

// b is an array of System.Object object[] b = a;

// Assign an integer to b. This would be possible if b really were // an array of objects, but since it really is an array of strings, // we will get an ArrayTypeMismatchException with the following message: // "Attempted to store an element of the incorrect type into the array". b[0] = 1;


The same problem exists in Java, too:
// a is a single-element array of String
String[] a = new String[1];

// b is an array of Object Object[] b = a;

// Assign an Integer to b. This would be possible if b really were // an array of Object, but since it really is an array of String, // we will get a java.lang.ArrayStoreException. b[0] = 1;


Note: In the above cases you can read from b without problem. It is only when trying to write to the array that you must know its real type.

Arrays of primitive types are invariant: int[] is not a subtype of double[], although int is in some sense a subtype of double.

C#


In C# it is possible to store an object which is an instance of an equal or smaller type in that storage location.

Ever since C# 1.0, arrays where the element type is a reference type are covariant.

Method group to delegate conversions are contravariant in their argument types.

Generic delegate types are always invariant in C# 3.0

A variant interface which inherits from another variant interface must do so in a manner which does not introduce problems in the type system

C# 4.0 is planned to allow co- and contravariance on parameterized interface and delegate types

D

The D Programming Language
D (programming language)

The D programming language, also known simply as D, is an Object-oriented programming, Imperative programming, Multi-paradigm programming language system programming language by Walter Bright of Digital Mars....
 supports covariance for method overriding:
interface IFactory 

class X

class XFactory : IFactory


Java


Return type covariance is implemented in the Java
Java (programming language)

Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java ....
 programming language version J2SE 5.0
Java Platform, Standard Edition

Java Platform, Standard Edition or Java SE is a widely used Platform for programming in the Java language. It is the Java Platform used to deploy porting Application software for general use....
. Parameter types have to be exactly the same (invariant) for method overriding
Method overriding (programming)

Method overriding, in object oriented programming, is a language feature that allows a Subclass to provide a specific implementation of a method that is already provided by one of its superclass es....
, otherwise the method is overloaded
Type polymorphism

In computer science, polymorphism is a programming language feature that allows values of different data types to be handled using a uniform interface....
 with a parallel definition instead.

Generics
Generics in Java

Generics are a facility of generic programming that was added to the Java in 2004 as part of Java Platform, Standard Edition 5.0. They allow "a type or method to operate on objects of various types while providing compile-time type safety."...
 were introduced in Java in Java 5.0 to allow type-safe generic programming. Unlike arrays, generic classes are neither covariant nor contravariant. For example, neither List nor List is a subtype of the other:

// a is a single-element List of String
List a = new ArrayList;
a.add("foo");

// b is a List of Object List b = a; // This is a compile-time error

However, generic type parameters can contain wildcards (a shortcut for an extra type parameter that is only used once). Example: Given a requirement for a method which operates on Lists, of any object, then the only operations that can be performed on the object are those for which the type relationships can be guaranteed to be safe.
// a is a single-element List of String
List a = new ArrayList;
a.add("foo");

// b is a List of anything List b = a;

// retrieve the first element Object c = b.get(0); // This is legal, because we can guarantee that the return type "?" is a subtype of Object

// Add an Integer to b. b.add(new Integer (1)); // This is a compile-time error; we cannot guarantee that Integer is a subtype of the parameter type "?"


Wildcards can also be bound, e.g. "? extends Foo" or "? super Foo" for upper and lower bounds, respectively. This allows to refine permitted performance. Example: given a List, then an element can be retrieved and safely assigned to a Foo type (contravariance). Given a List, then a Foo object can be safely added as an element (covariance).

Eiffel


Eiffel
Eiffel (programming language)

Eiffel is an International Organization for Standardization-standardized, object-oriented programming language designed to enable programmers to efficiently develop extensible, reusable, reliable software....
 allows covariant return and parameter types in overriding methods. This is possible because Eiffel does not require subclasses to be substitutable
Substitutability

Substitutability is a principle in computer programming. It states that, if S is a subtype of T, then objects of datatype T in a computer program may be replaced with objects of type S , without altering any of the desirable properties of that program ....
 for superclasses — that is, subclasses are not necessarily subtypes.

However, this can lead to surprises if subclasses with such covariant parameter types are operated upon presuming they were a more general class (polymorphism), leading to the possibility of compiler
Compiler

A compiler is a computer program that transforms source code written in a programming language into another computer language . The most common reason for wanting to transform source code is to create an executable program....
 errors.

REALbasic


REALbasic
REALbasic

REALbasic is an object-oriented dialect of the BASIC programming language developed and commercially marketed by REAL Software, Inc in Austin, Texas for Mac OS X, Microsoft Windows, and Linux....
 added support for return type covariance in version 5.5. Like with Java, the parameter types of the overriding method must be the same.

Scala


Scala supports use-site declarations of covariance and contravariance. Its arrays are invariant in the base type.

Sather


Sather
Sather

Sather is an object-oriented programming language. It originated circa 1990 at the International Computer Science Institute at the University of California, Berkeley, developed by an international team led by Steve Omohundro....
 supports both covariance and contravariance. Calling convention for overridden methods are covariant with out arguments and return values, and contravariant with normal arguments (with the mode in).

See also


  • Polymorphism (computer science)
  • Inheritance (computer science)
    Inheritance (computer science)

    In object-oriented programming, inheritance is a way to form new class es using classes that have already been defined. The inheritance concept was invented in 1967 for Simula....


External links

  • : An article series about implementation concerns surrounding co/contravariance in C#
  • (note this article is not updated about C++)