Array



 
 
In computer science, an array is a data structure consisting of a group of elements that are accessed by indexing. In most programming languages each element has the same data type and the array occupies a contiguous area of storage.

Overview

Most programming languages have a built-in array data type, although what is called an array in the language documentation is sometimes really an associative array. Conversely, the contiguous storage kind of array discussed here may alternatively be called a vector, list, or table.

Some programming languages support array programming (e.g., APL, newer versions of Fortran), which generalises operations and functions to work transparently over arrays as they do with scalars, instead of requiring looping over array members.

Multi-dimensional arrays are accessed using more than one index: one for each dimension. Multidimensional indexing can be reduced internally to linear indexing; for example, a two-dimensional array with 6 rows and 5 columns is typically represented by a one-dimensional array of 30 elements.
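The reduction to linear indexing can be sketched as follows. This is a minimal illustration, assuming zero-based indexes and row-by-row storage; the names ROWS, COLS, flat and flat_index are hypothetical and chosen only for this example.

```python
ROWS, COLS = 6, 5

# One flat list of 30 slots stands in for the 6 x 5 two-dimensional array.
flat = [0] * (ROWS * COLS)


def flat_index(row: int, col: int) -> int:
    """Map a zero-based (row, col) pair to a position in the flat list."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise IndexError("index out of range")
    return row * COLS + col


flat[flat_index(2, 3)] = 42   # element in row 2, column 3
```

Storing column by column instead would only change the formula to col * ROWS + row.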

Arrays can be classified as fixed-sized arrays (sometimes known as static arrays), whose size cannot change once their storage has been allocated, or dynamic arrays, which can be resized.

Properties

Arrays permit constant time (O(1)) random access to individual elements, which is optimal, but moving elements requires time proportional to the number of elements moved. On actual hardware, the presence of caches, for example, can make sequential iteration over an array noticeably faster than random access. This is a consequence of arrays having good locality of reference, because their elements occupy contiguous memory locations, but it does not change the asymptotic complexity of access. Likewise, there are often facilities (such as memcpy) that can move contiguous blocks of array elements faster than individual element access can, but that does not change the asymptotic complexity either.
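As a rough illustration of why moving elements costs time proportional to the number moved, the sketch below inserts a value into a fixed-capacity array by shifting the tail one slot to the right. The function insert_at is hypothetical, and the Python slice assignment merely stands in for a memcpy-style block move; it is faster in practice than an element-by-element loop but still touches every moved element.

```python
def insert_at(storage: list, length: int, pos: int, value) -> int:
    """Insert value at pos in a fixed-capacity array holding `length` items.

    Returns the number of elements that had to be moved.
    """
    if length >= len(storage):
        raise OverflowError("array is full")
    if not 0 <= pos <= length:
        raise IndexError("position out of range")
    # Block move: shift storage[pos:length] one slot to the right.
    # The slice assignment plays the role of a bulk copy routine.
    storage[pos + 1:length + 1] = storage[pos:length]
    storage[pos] = value
    return length - pos


data = [10, 20, 30, 40, None, None]   # capacity 6, 4 slots in use
moved = insert_at(data, 4, 1, 15)
```

Reading or writing data[2] afterwards is still a single constant-time step, whatever the array length.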

Memory-wise, arrays are compact data structures with no per-element overhead. There may be a per-array overhead, e.g. to store index bounds, but this is language-dependent. It can also happen that elements stored in an array require less memory than the same elements stored in individual variables, because several array elements can be stored in a single word; such arrays are often called packed arrays.
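One possible form of packed array is a bit array that stores eight boolean elements in each byte. The class below is a sketch under that assumption; PackedBits and its methods are illustrative names, not part of any standard library.

```python
class PackedBits:
    """Fixed-size array of booleans stored eight to a byte."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._bytes = bytearray((size + 7) // 8)

    def _locate(self, index: int):
        """Return (byte position, bit mask) for a valid index."""
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return index // 8, 1 << (index % 8)

    def __getitem__(self, index: int) -> bool:
        byte, mask = self._locate(index)
        return bool(self._bytes[byte] & mask)

    def __setitem__(self, index: int, value: bool) -> None:
        byte, mask = self._locate(index)
        if value:
            self._bytes[byte] |= mask
        else:
            self._bytes[byte] &= ~mask & 0xFF


flags = PackedBits(20)
flags[3] = True
flags[17] = True
flags[3] = False
```

Twenty flags fit in three bytes here, at the cost of a shift and a mask on every access.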

Properties in comparison

Dynamic arrays have similar characteristics to arrays, but can grow. The price for this is a memory overhead, due to elements being allocated but not used. With a constant per-element bound on the memory overhead, dynamic arrays can grow in constant amortized time per element.
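The amortized bound can be made concrete with a small sketch. It assumes a doubling growth policy, which is one common choice rather than the only one; the DynamicArray class and its copies counter are hypothetical and exist only to count the work done by resizing.

```python
class DynamicArray:
    """Growable array on top of fixed-size backing storage (doubling growth)."""

    def __init__(self) -> None:
        self._capacity = 1
        self._length = 0
        self._storage = [None] * self._capacity
        self.copies = 0   # elements copied during all resizes so far

    def __len__(self) -> int:
        return self._length

    def __getitem__(self, index: int):
        if not 0 <= index < self._length:
            raise IndexError("index out of range")
        return self._storage[index]

    def append(self, value) -> None:
        if self._length == self._capacity:
            self._grow()
        self._storage[self._length] = value
        self._length += 1

    def _grow(self) -> None:
        # Allocate a block twice as large and copy the live elements over.
        new_storage = [None] * (2 * self._capacity)
        for i in range(self._length):
            new_storage[i] = self._storage[i]
        self.copies += self._length
        self._storage = new_storage
        self._capacity *= 2


arr = DynamicArray()
for i in range(1000):
    arr.append(i)
```

With doubling, the total number of copies stays below twice the number of appended elements, and the unused capacity never exceeds the number of elements stored right after a resize.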

Associative arrays provide a mechanism for array-like functionality without huge storage overheads when the index values are sparse. Specialized associative arrays with integer keys include Patricia tries and Judy arrays.
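A minimal sketch of the idea follows. It uses Python's built-in dict (a hash table, not a Patricia trie or Judy array) as the associative array; SparseArray is a hypothetical name introduced for this example.

```python
class SparseArray:
    """Array-like container that stores only the non-default elements."""

    def __init__(self, size: int, default=0) -> None:
        self.size = size
        self.default = default
        self._items = {}   # index -> value, for non-default entries only

    def _check(self, index: int) -> None:
        if not 0 <= index < self.size:
            raise IndexError("index out of range")

    def __getitem__(self, index: int):
        self._check(index)
        return self._items.get(index, self.default)

    def __setitem__(self, index: int, value) -> None:
        self._check(index)
        if value == self.default:
            self._items.pop(index, None)   # keep the table sparse
        else:
            self._items[index] = value


readings = SparseArray(10**9)
readings[5] = 1.5
readings[999_999_999] = 2.5
```

A contiguous array with a billion slots would be needed to hold the same two values by position.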

Balanced trees require O(log n) time for index access, but also permit inserting or deleting elements in O(log n) time. Arrays require O(n) time for insertion and deletion of elements.

Applications

Arrays are used to implement mathematical vectors and matrices, as well as other kinds of rectangular tables. In early programming languages, these were often the applications that motivated having arrays.

Because of their performance characteristics, arrays are used to implement other data structures, such as heaps, hash tables, deques, queues, stacks, strings, and VLists.
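As one example of a structure built on an array, a binary min-heap can be kept in a flat list with no pointers at all, because the children of the element at position i sit at positions 2i + 1 and 2i + 2. The functions heap_push and heap_pop below are an illustrative sketch, not a reference implementation.

```python
def heap_push(heap: list, value) -> None:
    """Add value to a min-heap stored in a list."""
    heap.append(value)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def heap_pop(heap: list):
    """Remove and return the smallest value."""
    smallest = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while True:
            left, right, least = 2 * i + 1, 2 * i + 2, i
            if left < n and heap[left] < heap[least]:
                least = left
            if right < n and heap[right] < heap[least]:
                least = right
            if least == i:
                break
            heap[i], heap[least] = heap[least], heap[i]
            i = least
    return smallest


heap = []
for x in [5, 3, 8, 1, 9, 2]:
    heap_push(heap, x)
ordered = [heap_pop(heap) for _ in range(6)]
```

The constant-time index arithmetic is what makes the array a good fit here.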

One or more large arrays are sometimes used to emulate in-program dynamic memory allocation, particularly memory pool allocation. Historically, this has sometimes been the only way to allocate "dynamic memory" portably.
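A fixed-size-block pool built on one preallocated array might look like the sketch below. BlockPool and its methods are hypothetical names; the free list is kept in a second index array, and error handling is kept to a minimum for brevity.

```python
class BlockPool:
    """Fixed-size-block allocator carved out of one preallocated array."""

    def __init__(self, block_size: int, block_count: int) -> None:
        self.block_size = block_size
        self.memory = bytearray(block_size * block_count)   # the big array
        # Free list threaded through an index array; -1 marks the end.
        self._next_free = list(range(1, block_count)) + [-1]
        self._head = 0 if block_count > 0 else -1

    def allocate(self) -> int:
        """Return the offset of a free block, or raise MemoryError."""
        if self._head == -1:
            raise MemoryError("pool exhausted")
        block = self._head
        self._head = self._next_free[block]
        return block * self.block_size

    def free(self, offset: int) -> None:
        """Return the block starting at offset to the pool."""
        block = offset // self.block_size
        self._next_free[block] = self._head
        self._head = block


pool = BlockPool(block_size=16, block_count=4)
a = pool.allocate()
b = pool.allocate()
pool.memory[a:a + 5] = b"hello"   # use block a as raw storage
pool.free(a)
c = pool.allocate()               # reuses the block just freed
```

Both allocate and free run in constant time, and because every block has the same size the pool cannot fragment.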

Array accesses with statically predictable access patterns are a major source of data parallelism.

Some algorithms store a variable number of elements in part of a fixed-size array, which is equivalent to using a dynamic array with a fixed capacity; the so-called Pascal strings are examples of this.
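A length-prefixed string in the Pascal style can be sketched as follows, assuming a one-byte length prefix and therefore a capacity of 255 bytes. The helper names make_pstring, pstring_append and pstring_value are invented for this illustration.

```python
CAPACITY = 255   # largest length that fits in the one-byte length prefix


def make_pstring() -> bytearray:
    """Fixed-size buffer: byte 0 holds the length, bytes 1..255 the text."""
    return bytearray(1 + CAPACITY)


def pstring_append(buf: bytearray, data: bytes) -> None:
    """Append data in place, refusing to exceed the fixed capacity."""
    length = buf[0]
    if length + len(data) > CAPACITY:
        raise OverflowError("string would exceed fixed capacity")
    buf[1 + length:1 + length + len(data)] = data
    buf[0] = length + len(data)


def pstring_value(buf: bytearray) -> bytes:
    """Return only the bytes currently in use."""
    return bytes(buf[1:1 + buf[0]])


s = make_pstring()
pstring_append(s, b"array")
pstring_append(s, b"s")
```

The buffer never changes size; only the length byte records how much of it is occupied.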

Indexing


The valid index values of each dimension of an array are a bounded set of integers (or values of some enumerated type). Programming environments that check indexes for validity are said to perform bounds checking.

Index of the first element

The index of the first element (sometimes called the "origin") varies by language. There are three main implementations: zero-based, one-based, and n-based arrays, for which the first element has an index of zero, one, or a programmer-specified value. The zero-based array is more natural in the underlying machine language and was popularized by the C programming language, where the abstraction of array is very weak, and an index n of a one-dimensional array is simply the offset of the element accessed from the address of the first (or "zeroth") element (scaled by the size of the element). One-based arrays are based on traditional mathematical notation for matrices and most, but not all, mathematical sequences. n-based arrays let the programmer choose the lower bound, which may even be negative, so as to use whichever bound is most naturally suited to the problem at hand.
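An n-based array can be layered on zero-based storage by subtracting the lower bound on every access. The sketch below assumes inclusive lower and upper bounds; OffsetArray is a hypothetical class written for this example, and its range test is a simple form of bounds checking.

```python
class OffsetArray:
    """One-dimensional array whose valid indexes run from lower to upper."""

    def __init__(self, lower: int, upper: int, fill=0) -> None:
        if upper < lower:
            raise ValueError("upper bound must not be below lower bound")
        self.lower = lower
        self.upper = upper
        self._storage = [fill] * (upper - lower + 1)   # zero-based inside

    def _position(self, index: int) -> int:
        if not self.lower <= index <= self.upper:
            raise IndexError("index out of range")   # bounds check
        return index - self.lower

    def __getitem__(self, index: int):
        return self._storage[self._position(index)]

    def __setitem__(self, index: int, value) -> None:
        self._storage[self._position(index)] = value


temperatures = OffsetArray(-5, 5)   # indexes -5 .. 5, eleven elements
temperatures[-5] = 1.0
temperatures[0] = 2.0
temperatures[5] = 3.0
```

A one-based array is the special case with a lower bound of 1.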

The article Comparison of programming languages (array) indicates the base index used by various languages.

Supporters of zero-based indexing sometimes criticize one-based and n-based arrays for being slower. Often this criticism is mistaken when one-based or n-based array accesses are optimized with common subexpression elimination (for single-dimensioned arrays) and/or with well-defined dope vectors (for multi-dimensioned arrays). However, in multidimensional arrays where the net offset into linear memory is computed from all of the indices, zero-based indexing is more natural, simpler, and faster. Edsger W. Dijkstra expressed an opinion in this debate in his note "Why numbering should start at zero".

The 0-based/1-based debate is not limited to programming languages. For example, the ground floor of a building is labelled "0" on elevator buttons in France, but "1" in the USA.

Index of the last element

The relation between numbers appearing in an array declaration and the index of that array's last element also varies by language. In some languages (e.g. C) the number of elements contained in the array must be specified, whereas in others (e.g. Visual Basic .NET) the numeric value of the index of the last element must be specified.

Indexing methods

When an array is implemented as contiguous storage, index-based access to element n is done (for zero-based indexing) by taking the address of the first element and adding n · sizeof(one element). This is therefore an O(1) operation.
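The address calculation can be imitated with a block of raw bytes standing in for memory. This is a sketch that assumes 32-bit little-endian integer elements and an arbitrary base offset of 8; BASE, memory, element_address and load are names made up for the example.

```python
import struct

ELEMENT_FORMAT = "<i"                              # 32-bit little-endian int
ELEMENT_SIZE = struct.calcsize(ELEMENT_FORMAT)     # 4 bytes

# "Memory": raw bytes in which the array starts at byte offset BASE.
BASE = 8
memory = bytearray(64)
for n, value in enumerate([100, 200, 300, 400]):
    struct.pack_into(ELEMENT_FORMAT, memory, BASE + n * ELEMENT_SIZE, value)


def element_address(base: int, n: int, element_size: int) -> int:
    """Address of element n: base address plus n times the element size."""
    return base + n * element_size


def load(n: int) -> int:
    """Fetch element n with one multiplication, one addition and one read."""
    address = element_address(BASE, n, ELEMENT_SIZE)
    return struct.unpack_from(ELEMENT_FORMAT, memory, address)[0]
```

The cost of load does not depend on n, which is the constant-time property described above.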

Multi-dimensional arrays

Ordinary arrays are indexed by a single integer. Also useful, particularly in numerical and graphics applications, is the concept of a multi-dimensional array, in which we index into the array using an ordered list of integers, such as in a[3,1,5]. The number of integers in the list used to index into the multi-dimensional array is always the same and is referred to as the array's dimensionality, and the bounds on each of these are called the array's dimensions. An array with dimensionality k is often called k-dimensional. One-dimensional arrays correspond to the simple arrays discussed thus far; two-dimensional arrays are a particularly common representation for matrices. In practice, the dimensionality of an array rarely exceeds three.

Mapping a one-dimensional array into memory is obvious, since memory is logically itself a (very large) one-dimensional array. When we reach higher-dimensional arrays, however, the mapping is no longer obvious. For a two-dimensional array, it is most common to use the RC-convention, in which elements are referred to in row, column fashion.

Common ways to index into multi-dimensional arrays include:

  • Row-major order. Used most notably by statically-declared arrays in C. The elements of each row are stored in order.

  • Column-major order. Used most notably in Fortran. The elements of each column are stored in order.

  • Arrays of arrays. In this scheme, multi-dimensional arrays are represented by one-dimensional arrays of references (Iliffe vectors) to other one-dimensional arrays. The subarrays can be either the rows or columns.
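The three layouts listed above can be compared on a small example. The sketch assumes zero-based indexes; the 2 × 3 array of letters and the function names row_major_offset and column_major_offset are illustrative only.

```python
def row_major_offset(index, dims):
    """Offset of a zero-based index tuple when the last index varies fastest."""
    offset = 0
    for i, d in zip(index, dims):
        offset = offset * d + i
    return offset


def column_major_offset(index, dims):
    """Offset of a zero-based index tuple when the first index varies fastest."""
    offset = 0
    for i, d in zip(reversed(index), reversed(dims)):
        offset = offset * d + i
    return offset


# Illustrative 2 x 3 array:   a b c
#                             d e f
dims = (2, 3)
cells = {(0, 0): "a", (0, 1): "b", (0, 2): "c",
         (1, 0): "d", (1, 1): "e", (1, 2): "f"}

row_major = [None] * 6
column_major = [None] * 6
for index, value in cells.items():
    row_major[row_major_offset(index, dims)] = value
    column_major[column_major_offset(index, dims)] = value

# Array of arrays: an outer list of references to separate row lists.
array_of_arrays = [["a", "b", "c"], ["d", "e", "f"]]
```

Row-major storage yields a, b, c, d, e, f and column-major storage yields a, d, b, e, c, f; both offset functions work for any number of dimensions.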


Figure: Array of array storage.
The first two forms are more compact and have potentially better locality of reference, but are also more limiting; the arrays must be rectangular, meaning that no row can contain more elements than any other. Arrays of arrays, on the other hand, allow the creation of ragged arrays, also called jagged arrays, in which the valid range of one index depends on the value of another, or in this case, simply that different rows can be different sizes. Arrays of arrays are also of value in programming languages that only supply one-dimensional arrays as primitives.
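A ragged array is easy to build when rows are separate one-dimensional arrays. The sketch below uses Pascal's triangle purely as a convenient example in which row r has r + 1 elements; pascal_rows is a hypothetical helper.

```python
# Pascal's triangle as a ragged array: row r has r + 1 elements.
def pascal_rows(count: int) -> list:
    rows = []
    for r in range(count):
        row = [1] * (r + 1)
        for c in range(1, r):
            row[c] = rows[r - 1][c - 1] + rows[r - 1][c]
        rows.append(row)
    return rows


triangle = pascal_rows(5)
row_lengths = [len(row) for row in triangle]
```

A rectangular layout would need to pad every row to the length of the longest one.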

In many applications, such as numerical applications working with matrices, we iterate over rectangular two-dimensional arrays in predictable ways. For example, computing an element of the matrix product AB involves iterating over a row of A and a column of B simultaneously. In mapping the individual array indexes into memory, we wish to exploit locality of reference as much as we can. A compiler can sometimes automatically choose the layout for an array so that sequentially accessed elements are stored sequentially in memory; in our example, it might choose row-major order for A, and column-major order for B. Even more exotic orderings can be used, for example if we iterate over the main diagonal of a matrix.
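The layout choice in the matrix product example can be sketched by hand, with A held in a flat row-major list and B in a flat column-major list. The function matmul_mixed_layout is hypothetical, and in Python the benefit is only conceptual, since list elements are references rather than contiguous numbers.

```python
def matmul_mixed_layout(a_rows, b_cols, n, m, p):
    """Multiply A (n x m, flat row-major) by B (m x p, flat column-major).

    Returns the n x p product as a flat row-major list.
    """
    product = [0] * (n * p)
    for i in range(n):
        row_start = i * m        # row i of A starts here in a_rows
        for j in range(p):
            col_start = j * m    # column j of B starts here in b_cols
            total = 0
            for k in range(m):
                # Both operands are read at consecutive positions.
                total += a_rows[row_start + k] * b_cols[col_start + k]
            product[i * p + j] = total
    return product


# A = [[1, 2], [3, 4]] stored by rows; B = [[5, 6], [7, 8]] stored by columns.
a_rows = [1, 2, 3, 4]
b_cols = [5, 7, 6, 8]
c = matmul_mixed_layout(a_rows, b_cols, 2, 2, 2)
```

In the inner loop both index expressions advance by one on each step, which is the sequential access pattern the paragraph describes.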

See also

  • Array slicing
  • Collection class
  • Comparison of programming languages (array)
  • Parallel array
  • Set (computer science)
  • Sparse array
  • Variable-length array


External links