List comprehension
Encyclopedia
A list comprehension is a syntactic
Syntax of programming languages
In computer science, the syntax of a programming language is the set of rules that define the combinations of symbols that are considered to be correctly structured programs in that language. The syntax of a language defines its surface form...

 construct available in some programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

s for creating a list based on existing lists. It follows the form of the mathematical set-builder notation
Set-builder notation
In set theory and its applications to logic, mathematics, and computer science, set-builder notation is a mathematical notation for describing a set by stating the properties that its members must satisfy...

(set comprehension) as distinct from the use of map
Map (higher-order function)
In many programming languages, map is the name of a higher-order function that applies a given function to each element of a list, returning a list of results. They are examples of both catamorphisms and anamorphisms...

 and filter
Filter (higher-order function)
In functional programming, filter is a higher-order function that processes a data structure in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate returns the boolean value true.-Example:In Haskell, the code...

 functions.

Overview

Consider the following example in set builder notation.


This can be read, " is the set of all numbers "2 times " where is an item in the set of natural number
Natural number
In mathematics, the natural numbers are the ordinary whole numbers used for counting and ordering . These purposes are related to the linguistic notions of cardinal and ordinal numbers, respectively...

s (), for which squared is greater than ."

In this annotated version of the example:
  • is the variable representing members of an input set.
  • represents the input set, which in this example is the set of natural numbers
  • is a predicate function acting as a filter on members of the input set.
  • is an output function producing members of the new set from members of the input set that satisfy the predicate function.
  • brackets contain the expression
  • the vertical bar and the comma are separators.


A list comprehension has the same syntactic components to represent generation of a list in order from an input list or iterator
Iterator
In computer programming, an iterator is an object that enables a programmer to traverse a container. Various types of iterators are often provided via a container's interface...

:
  • A variable representing members of an input list.
  • An input list (or iterator).
  • An optional predicate expression.
  • And an output expression producing members of the output list from members of the input iterable that satisfy the predicate.

The order of generation of members of the output list is based on the order of items in the input.

In Haskell's
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...

 list comprehension syntax, this set-builder construct would be written similarly, as:

s = [ 2*x | x <- [0..], x^2 > 3 ]


Here, the list [0..] represents , x^2>3 represents the predicate, and 2*x represents the output expression.

List comprehensions give results in a defined order (unlike the members of sets); and list comprehensions may generate
Generator (computer science)
In computer science, a generator is a special routine that can be used to control the iteration behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values...

 the members of a list in order, rather than produce the entirety of the list thus allowing, for example, the previous Haskell definition of the members of an infinite list.

History

The SETL programming language (later 1960s) had a set formation construct, and the computer algebra system
Computer algebra system
A computer algebra system is a software program that facilitates symbolic mathematics. The core functionality of a CAS is manipulation of mathematical expressions in symbolic form.-Symbolic manipulations:...

 AXIOM
Axiom
In traditional logic, an axiom or postulate is a proposition that is not proven or demonstrated but considered either to be self-evident or to define and delimit the realm of analysis. In other words, an axiom is a logical statement that is assumed to be true...

 (1973) has a similar construct that processes streams,
but the first use of the term "comprehension" for such constructs was in Rod Burstall and John Darlington's description of their functional programming language NPL
NPL programming language
NPL was a functional language with pattern matching designed by Rod Burstall and John Darlington in 1977. The language allowed certain sets and logic constructs to appear on the right hand side of definitions, E.g. setofeven...

 from 1977.

Smalltalk
Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

 block context messages which constitute list comprehensions have been in that language since at least Smalltalk-80.

Burstall and Darlington's work with NPL influenced many functional programming languages during the 1980s but not all included list comprehensions. An exception was the influential pure lazy functional programming language Miranda
Miranda programming language
Miranda is a non-strict purely functional programming language designed by David Turner as a successor to his earlier programming languages SASL and KRC, using some concepts from ML and Hope. It was produced by Research Software Ltd...

 which was released in 1985. The subsequently developed standard pure lazy functional language, Haskell, includes many of Miranda's features including list comprehensions. The Python programming language was heavily influenced by the pure lazy school of functional programming and adopted list comprehensions. Pure functional programming remains a niche while Python achieved wider adoption which introduced list comprehensions to a wider audience.

Comprehensions were proposed as a query notation for databases and were implemented in the Kleisli database query language.

Examples in different programming languages

The following provides a few examples of specific syntax used in programming languages.

Although the original example denotes an infinite list, few languages can express that, so in some of those cases we show how to take a subset of rather than a subset of .

B-Prolog


L @= [Y : X in 1..100, [Y], (X*X>3, Y is 2*X)]


A list of the form [T : E1 in D1, ..., En in Dn, LocalVars, Goal] is interpreted as a list comprehension in calls to @=/2 and constraints. A list comprehension is translated into a foreach construct with an accumulator.

Clojure

Clojure generates infinite lazy sequences (similar to Haskell's lazy lists or Python's generators). Use take to get the first N results from the infinite sequence.


(take 20 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))

Common Lisp

List comprehensions can be expressed with the loop macro's collect keyword. Conditionals are expressed with if, as follows:


(loop for x from 0 to 100 if (> (* x x) 3) collect (* 2 x))


An infinite lazy sequence can be created in a variety of ways, such as the CLOS
CLOS
The Common Lisp Object System is the facility for object-oriented programming which is part of ANSI Common Lisp. CLOS is a powerful dynamic object system which differs radically from the OOP facilities found in more static languages such as C++ or Java. CLOS was inspired by earlier Lisp object...

 object system or a yield
Generator (computer science)
In computer science, a generator is a special routine that can be used to control the iteration behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values...

 macro.

Erlang

The same example in Erlang:

S = [2*X || X <- lists:seq(0,100), X*X > 3].

F#

The F# generator comprehension has the list comprehension syntax elements.
Generator comprehensions can be used to generate Lists, Sequences (lazy lists) and Arrays (not discussed here).

Generators are of the form [for x in collection do ... yield expr] for lists and seq {for x in collection do ... yield expr} for sequences.
For example:


> seq { for x in 0..100 do
if x*x > 3 then yield 2*x } ;;
val it : seq = seq [4; 6; 8; 10; ...]

Falcon

The "comp" generic method family provides wide support for comprehension. For example, the "mfcomp" method can be applied to an array:

s = [].mfcomp( { i => if i*i > 3: return 2*i; return oob(1)}, [1:101] )

Falcon can also use functional generators to provide input lists. For example, the following code uses a continuation to create a set of pairs.

gen = Continuation( function( max, c )
i = 0
while i < max: c(++i)
return oob(0)
end )
data = [10,11,12]
s = Set.mfcomp( {x, y => x+y}, .[gen 3], data )

Method "comp" was introduced in version 0.9.6, and methods "mcomp" and "mfcomp" in version 0.9.6.2.

Haskell

Please refer to the main example in the overview.

s = [ 2*x | x <- [0..], x^2 > 3 ]


Here, the list [0..] generates natural numbers one by one which get bound to variable x, x^2>3 represents the predicate that either accepts or rejects a given variable's value, and 2*x represents the result expression. There might be several generators and test predicates in one list compehension expression in Haskell, in effect defining nested loops, e.g.:

s = [ 2*x*y | x <- [0..], x^2 > 3, y <- [1,3..x], y^2 < 100-x^2]
-- for each x from 0 by 1:
-- if x^2 > 3:
-- for each y from 1 by 2 upto x:
-- if y^2 < 100 - x^2:
-- produce 2*x*y

The above expression becomes unproductive ("stuck") at some point, when new xs keep being generated only to be rejected later on. This is so because any test can only reject a value it is given, not any future ones (there is no cut mechanism here, in Prolog
Prolog
Prolog is a general purpose logic programming language associated with artificial intelligence and computational linguistics.Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is declarative: the program logic is expressed in terms of...

 terms - a generator in general might produce its values unordered, like e.g. the above expression itself). This can be delt with using bounded list generators always or by enclosing a generator inside a take or takeWhile call, limiting the amount of generated values.

JavaFX

Using the for statement and a boolean expression:

var s = for (i in [1..100][n | n*n > 3]) { 2*i }

JavaScript 1.8


Number.prototype.__iterator__=function{for (var i=0; i var s = [2*i for (i in 101) if (i*i>3)]

Or, without prototyping the Number object,

var range = function(start,end){for (var i=start; i<=end; i++) yield i}
var s = [2*i for (i in range(0,100)) if (i*i>3)]

OCaml

OCaml Batteries Included has uniform comprehension syntax for lists, arrays, enumerations (like streams), lazy lists (like lists but evaluated on-demand), sets, hashtables, etc.

Comprehension are of the form [? expression | x <- enumeration ; condition; condition ; ... ?].

For instance,
  1. [? 2 * x | x <- 0 -- max_int ; x * x > 3 ?];;

- : int Enum.t =

or, to compute a list,
  1. [? List: 2 * x | x <- 0 -- 5 ; x * x > 3 ?];;

- : int list = [4; 6; 8; 10]

or, to compute a set,
  1. [? PSet: 2 * x | x <- 0 -- 5 ; x * x > 3 ?];;

- : int PSet.t =


etc.

Octave

GNU Octave
GNU Octave
GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command-line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with MATLAB...

 can do list (vector) comprehensions in the form (vector expression)(vector condition).

For example,

octave:1> x=0:100; s=(2*x)(x.**2<3)
s =
0 2

Pure

The same example in Pure:
s = [2*n | n=1..100; n*n > 3];

Python

The Python programming language
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

 has a corresponding syntax for expressing list comprehensions.
The near-equivalent in Python to the example above is as follows:


S = [2*x for x in range(101) if x**2 > 3]


List comprehensions were introduced in Python version 2.0.

A generator expression may be used in Python versions 2.4 and above to achieve functional equivalence with S using a generator
Generator (computer science)
In computer science, a generator is a special routine that can be used to control the iteration behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values...

 to iterate a (finite or infinite) sequence:


from itertools import count
S = (2*x for x in count if x**2 > 3)

Racket

Racket provides functional versions of for-loops, which are essentially list comprehension syntax:

(for/list ([x (in-range 100)] #:when (> (* x x) 3))
(* 2 x))

The imperative for can also be used, combined with Racket's generator library to produce an infinite generator:

(require racket/generator)
(generator
(for ([x (in-naturals)] #:when (> (* x x) 3))
(yield (* 2 x))))

Scala

Using the for-comprehension:


val s = for (x <- Stream.from(0); if x*x > 3) yield 2*x

Scheme

Although there is no standard list comprehension syntax in R5RS, many implementations provide an extension for this. For example, in Chicken Scheme:

(require-extension list-of)
(list-of (* 2 x) (x range 0 101) (> (* x x) 3))

There is also a portable library SRFI/42 "Eager Comprehensions", which in particular includes list comprehensions:

(require srfi/42) ; example import into Racket Scheme
(list-ec (: x 101) (if (> (* x x) 3)) (* 2 x))

Smalltalk


((1 to: 100) select: [:x|x*x>3]) collect: [:x|2*x]

Visual Prolog

S = [ 2*X || X = std::fromTo(1, 100), X^2 > 3 ]

PowerShell


0..100 | Where {$_ * $_ -gt 3} | ForEach {$_ * 2}

Monad comprehension

In Haskell, a monad comprehension is a generalization of the list comprehension to other monads in functional programming
Monads in functional programming
In functional programming, a monad is a programming structure that represents computations. Monads are a kind of abstract data type constructor that encapsulate program logic instead of data in the domain model...

.

Set comprehension

Version 3.x and 2.7 of the Python language introduces syntax for set
Set (computer science)
In computer science, a set is an abstract data structure that can store certain values, without any particular order, and no repeated values. It is a computer implementation of the mathematical concept of a finite set...

 comprehensions. Similar in form to list comprehensions, set comprehensions generate Python sets instead of lists.

>>> s = {v for v in 'ABCDABCD' if v not in 'CB'}
>>> print(s)
{'A', 'D'}
>>> type(s)

>>>


Racket set comprehensions generate Racket sets instead of lists.

(for/set ([v "ABCDABCD"] #:unless (member v (string->list "CB")))
v))

Dictionary comprehension

Version 3.x and 2.7 of the Python language introduced a new syntax for dictionary
Associative array
In computer science, an associative array is an abstract data type composed of a collection of pairs, such that each possible key appears at most once in the collection....

 comprehensions, similar in form to list comprehensions but which generate Python dicts instead of lists.

>>> s = {key: val for key, val in enumerate('ABCD') if val not in 'CB'}
>>> s
{0: 'A', 3: 'D'}
>>>


Racket hash table comprehensions generate Racket hash tables (one implementation of the Racket dictionary type).

(for/hash ([(val key) (in-indexed "ABCD")]
#:unless (member val (string->list "CB")))
(values key val))

Parallel list comprehension

The Glasgow Haskell Compiler
Glasgow Haskell Compiler
The Glorious Glasgow Haskell Compilation System, more commonly known as the Glasgow Haskell Compiler or GHC, is an open source native code compiler for the functional programming language Haskell. The lead developers are Simon Peyton Jones and Simon Marlow...

 has an extension called parallel list comprehension (also known as zip-comprehension) that permits multiple independent branches of qualifiers within the list comprehension syntax.
Whereas qualifiers separated by commas are dependent ("nested"), qualifier branches separated by pipes are evaluated in parallel (this does not refer to any form of multithreadedness: it merely means that the branches are zipped
Map (higher-order function)
In many programming languages, map is the name of a higher-order function that applies a given function to each element of a list, returning a list of results. They are examples of both catamorphisms and anamorphisms...

).

-- regular list comprehension
a = [(x,y) | x <- [1..5], y <-[3..5]]
-- [(1,3),(1,4),(1,5),(2,3),(2,4) ...

-- zipped list comprehension
b = [(x,y) | (x,y) <- zip [1..5] [3..5]]
-- [(1,3),(2,4),(3,5)]

-- parallel list comprehension
c = [(x,y) | x <- [1..5] | y <- [3..5]]
-- [(1,3),(2,4),(3,5)]


Racket's comprehensions standard library contains parallel and nested versions of its comprehensions, distinguished by "for" vs "for*" in the name. For example, the vector comprehensions "for/vector" and "for*/vector" create vectors by parallel versus nested iteration over sequences. The following is Racket code for the Haskell list comprehension examples.

> (for*/list ([x (in-range 1 6)] [y (in-range 3 6)]) (list x y))
'((1 3) (1 4) (1 5) (2 3) (2 4) (2 5) (3 3) (3 4) (3 5) (4 3) (4 4) (4 5) (5 3) (5 4) (5 5))
> (for/list ([x (in-range 1 6)] [y (in-range 3 6)]) (list x y))
'((1 3) (2 4) (3 5))


In Python we could do as follows:
  1. regular list comprehension

>>> a = [(x, y) for x in range(1, 6) for y in range(3, 6)]
[(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), ...
  1. parallel/zipped list comprehension

>>> b = [x for x in zip(range(1, 6), range(3, 6))]
[(1, 3), (2, 4), (3, 5)]

XQuery and XPath

Like the original NPL use, these are fundamentally database access languages.

This makes the comprehension concept more important, because it is computationally infeasible to retrieve the entire list and operate on it (the initial 'entire list' may be an entire XML database).

In XPath, the expression:

/library/book//paragraph[@style='first-in-chapter']

is conceptually evaluated as a series of "steps" where each step produces a list and the next step applies a filter function to each element in the previous step's output.

In XQuery, full XPath is available, but FLWOR statements are also used, which is a more powerful comprehension construct.

for $b in //book
where $b[@pages < 400]
order by $b//title
return

{$b//title}
{($book//paragraph)[1]}


Here the XPath //book is evaluated to create a sequence (aka list); the where clause is a functional "filter", the order by sorts the result, and the ... XML snippet is actually an anonymous function that builds/transforms XML for each element in the sequence using the 'map' approach found in other functional languages.

So, in another functional language the above FLWOR statement may be implemented like this:

map(
newXML(shortBook, newXML(title, $1.title), newXML(firstPara, $1...))
filter(
lt($1.pages, 400),
xpath(//book)
)
)

LINQ in C#

C# 3.0 has a group of related features called LINQ
LINQ
Linq is a word-based card game from Endless Games, introduced at the American International Toy Fair in 2005.Game play requires at least four players, two of whom are dealt cards with the same word, while the others receive blanks. The goal is to gain points by correctly naming the players with...

, which defines a set of query operators for manipulating list-like structures (object enumerations, SQL datasets, …). On top of these operators it offers a comprehension syntax, reminiscent of SQL:


var s = from x in Enumerable.Range(0, 100) where x*x > 3 select x*2;

C++

List comprehensions can be constructed using the C++ STL algorithms remove_copy_if to select elements in a list and for_each to transform them.
  1. include
  2. include


int linspace { static int i=1; return i++; }
bool sqrle3(int x) { return x*x <= 3; }
void into2(int &x) { x*=2; }

int main
{
list N1(10), N2(10);
// N1 and N2 are both empty lists of 10 elements each
generate_n(N1.begin, 10, linspace);
// N1 now contains 1,2,...,10
for_each(N2.begin,
remove_copy_if(N1.begin, N1.end, N2.begin, sqrle3),
into2);
// N2 now contains 4,6,...,20
}

There is some effort in providing C++ with list-comprehension constructs/syntax similar to the set builder notation.

counting_range(1,10) | filtered( _1*_1 > 3 ) | transformed(ret( _1*2 ))

Full example is here: http://codepad.org/y4bpgLJu
  • This implementation uses a macro and overloads the << operator. It evaluates any expression valid inside an 'if', and any variable name may be chosen. It's not threadsafe, however. Usage example:


list N;
list S;

for (int i = 0; i < 10; i++)
N.push_back(i);

S << list_comprehension(3.1415 * x, x, N, x*x > 3)

  • This implementation provides head/tail slicing using classes and operator overloading, and the | operator for filtering lists (using functions). Usage example:


bool even(int x) { return x % 2

0; }
bool x2(int &x) { x *= 2; return true; }

list l, t;
int x, y;

for (int i = 0; i < 10; i++)
l.push_back(i);

(x, t) = l | x2;
(t, y) = t;

t = l < 9;
t = t < 7 | even | x2;

See also
  • The SELECT
    Select
    Select or SELECT may refer to:* Select , an album by Kim Wilde* Select , a British music magazine* Select , a keyword in SQL* select , a system call for polling multiple file descriptors...

     statement together with its FROM and WHERE clauses in SQL
  • Programming language
    Programming language
    A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

  • Mathematical notation
    Mathematical notation
    Mathematical notation is a system of symbolic representations of mathematical objects and ideas. Mathematical notations are used in mathematics, the physical sciences, engineering, and economics...

  • Monads in functional programming
    Monads in functional programming
    In functional programming, a monad is a programming structure that represents computations. Monads are a kind of abstract data type constructor that encapsulate program logic instead of data in the domain model...

     for monads and monadic notation in general
  • For other programming language constructs used to process sequences:
    • Generator (computer science)
      Generator (computer science)
      In computer science, a generator is a special routine that can be used to control the iteration behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values...

    • Map (higher-order function)
      Map (higher-order function)
      In many programming languages, map is the name of a higher-order function that applies a given function to each element of a list, returning a list of results. They are examples of both catamorphisms and anamorphisms...

  • For other programming language constructs copied from the mathematical notation:
    • Guard (computing)
      Guard (computing)
      In computer programming, a guard is a boolean expression that must evaluate to true if the program execution is to continue in the branch in question. The term is used at least in Haskell, Clean, Erlang, occam, Promela, OCaml and Scala programming languages. In Mathematica, guards are called...

    • Pattern matching
      Pattern matching
      In computer science, pattern matching is the act of checking some sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact. The patterns generally have the form of either sequences or tree structures...

    • Operator (programming)
      Operator (programming)
      Programming languages typically support a set of operators: operations which differ from the language's functions in calling syntax and/or argument passing mode. Common examples that differ by syntax are mathematical arithmetic operations, e.g...


Haskell


OCaml


Python


Common Lisp


External links
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK