Scalable parallelism
Software is said to exhibit scalable parallelism if it can make use of additional processors to solve larger problems, i.e., this term refers to software for which Gustafson's law holds.
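Gustafson's law states that the scaled speedup attainable with P processors is S(P) = P - α(P - 1), where α is the fraction of the workload that remains serial; as long as α stays small while the problem grows, the speedup grows roughly in proportion to P.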
Consider a program whose execution time is dominated by one or more loops, each of which updates every element of an array; for example, the following finite-difference heat-equation stencil calculation:

for t := 0 to T do
    for i := 1 to N-1 do
        // explicit forward-difference with R = 0.25
        new(i) := (A(i-1) + A(i) + A(i) + A(i+1)) * .25
    end
    for i := 1 to N-1 do
        A(i) := new(i)
    end
end

In the above code, we can execute all iterations of each "i" loop concurrently, i.e., turn each into a parallel loop.
In such cases,
it is often possible to make effective use of twice as many processors for a problem of array size 2N
as for a problem of array size N.
As in this example, scalable parallelism is typically a form of data parallelism.
This form of parallelism is often the target of automatic parallelization of loops.
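As a concrete illustration, the two "i" loops might be parallelized as in the following C/OpenMP sketch; the choice of language, the OpenMP directives, and the values of N and T are not part of the original example and serve only to show the data-parallel structure:

/* compile with: cc -O2 -fopenmp stencil.c */
#include <stdio.h>
#include <stdlib.h>

#define T 1000    /* number of time steps (illustrative value)        */
#define N 100000  /* array size; doubling N doubles the parallel work */

int main(void)
{
    double *a    = calloc(N, sizeof *a);    /* current time step */
    double *next = calloc(N, sizeof *next); /* next time step    */
    if (!a || !next) return 1;
    a[N / 2] = 1.0;                         /* initial heat spike */

    for (int t = 0; t < T; t++) {
        /* Each iteration reads only values from the previous time
           step, so all N-2 iterations can run concurrently.         */
        #pragma omp parallel for
        for (int i = 1; i < N - 1; i++)
            next[i] = (a[i-1] + a[i] + a[i] + a[i+1]) * 0.25;

        /* The copy-back loop is embarrassingly parallel.            */
        #pragma omp parallel for
        for (int i = 1; i < N - 1; i++)
            a[i] = next[i];
    }

    printf("a[N/2] = %f\n", a[N / 2]);
    free(a);
    free(next);
    return 0;
}

Because the work in each parallel region grows with N, doubling the array size supplies twice as much work to distribute over the available processors.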

Distributed computing systems and non-uniform memory access architectures
are typically the most easily scaled to large numbers of processors,
and thus would seem a natural target for software that exhibits scalable parallelism.
However, applications with scalable parallelism may not have parallelism of
sufficiently coarse grain
to run effectively on such systems (unless the software is embarrassingly parallel).
In our example above, the second "i" loop is embarrassingly parallel,
but in the first loop each iteration requires results produced by several iterations of the previous time step.
Thus, for the first loop, parallelization may involve extensive communication or synchronization among processors,
and thus only results in a net speedup if such interactions have very low overhead,
or if the code can be transformed to resolve this issue (e.g., by a combined scalable locality/scalable parallelism optimization).
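For instance, if the array is block-partitioned across the nodes of a distributed-memory machine, each node must exchange its boundary elements with its neighbours at every time step. The following C/MPI halo-exchange sketch shows where this per-step communication arises; the decomposition, the constants, and the zero boundary condition are assumptions made for illustration:

/* compile with: mpicc -O2 halo.c ; run with: mpirun -n 4 ./a.out */
#include <mpi.h>
#include <stdlib.h>

#define STEPS   1000   /* time steps (illustrative)                   */
#define LOCAL_N 10000  /* interior points per process (illustrative)  */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Local block plus one ghost ("halo") cell at each end; calloc
       zeroes the ghosts, giving a zero global boundary condition.    */
    double *a    = calloc(LOCAL_N + 2, sizeof *a);
    double *next = calloc(LOCAL_N + 2, sizeof *next);
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;
    a[LOCAL_N / 2] = 1.0;  /* some per-process initial data */

    for (int t = 0; t < STEPS; t++) {
        /* Exchange one boundary element with each neighbour before
           every step: this is the per-step interaction that must stay
           cheap relative to the O(LOCAL_N) of local computation.     */
        MPI_Sendrecv(&a[1],           1, MPI_DOUBLE, left,  0,
                     &a[LOCAL_N + 1], 1, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&a[LOCAL_N],     1, MPI_DOUBLE, right, 1,
                     &a[0],           1, MPI_DOUBLE, left,  1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        for (int i = 1; i <= LOCAL_N; i++)
            next[i] = (a[i-1] + a[i] + a[i] + a[i+1]) * 0.25;
        for (int i = 1; i <= LOCAL_N; i++)
            a[i] = next[i];
    }

    free(a);
    free(next);
    MPI_Finalize();
    return 0;
}

Here each node communicates a constant amount of data per time step while its computation grows with LOCAL_N, so the ratio of communication to computation, and hence the prospect of a net speedup, improves as the local problem size grows.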

Languages

Ateji PX, an extension of Java making scalable parallelism possible on the Java Virtual Machine (JVM).