SIMD

Single instruction, multiple data (SIMD) is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously. Thus, such machines exploit data-level parallelism.

History

The first use of SIMD instructions was in vector supercomputers of the early 1970s such as the CDC STAR-100 and the Texas Instruments ASC, which could operate on a vector of data with a single instruction. Vector processing was especially popularized by Cray in the 1970s and 1980s. Vector-processing architectures are now considered separate from SIMD machines, based on the fact that vector machines processed the vectors one word at a time through pipelined processors (though still based on a single instruction), whereas modern SIMD machines process all elements of the vector simultaneously.

The first era of modern SIMD machines was characterized by massively parallel processing-style supercomputers such as the Thinking Machines CM-1 and CM-2. These machines had many limited-functionality processors that would work in parallel. For example, each of 64,000 processors in a Thinking Machines CM-2 would execute the same instruction at the same time, to do multiplications on 64,000 pairs of numbers at a time.

Supercomputing moved away from the SIMD approach when inexpensive scalar MIMD approaches based on commodity processors such as the Intel i860 XP (http://www.cs.kent.edu/~walker/classes/pdc.f01/lectures/MIMD-1.pdf) became more powerful, and interest in SIMD waned.

The current era of SIMD processors grew out of the desktop-computer market rather than the supercomputer market. As desktop processors became powerful enough to support real-time gaming and video processing, demand grew for this particular type of computing power, and microprocessor vendors turned to SIMD to meet the demand. Sun Microsystems introduced SIMD integer instructions in its "VIS" instruction set extensions in 1995, in its UltraSPARC I microprocessor. The first widely deployed desktop SIMD came with Intel's MMX extensions to the x86 architecture in 1996. IBM and Motorola then added AltiVec to the POWER architecture, and Intel followed in 1999 with SSE. Since then, there have been several extensions to the SIMD instruction sets for both architectures. All of these developments have been oriented toward support for real-time graphics, and are therefore oriented toward processing in two, three, or four dimensions, usually with vector lengths of between two and sixteen words, depending on data type and architecture. When new SIMD architectures need to be distinguished from older ones, the newer architectures are considered "short-vector" architectures, as earlier SIMD and vector supercomputers had vector lengths from 64 to 64,000. A modern supercomputer is almost always a cluster of MIMD machines, each of which implements (short-vector) SIMD instructions. A modern desktop computer is often a multiprocessor MIMD machine where each processor can execute short-vector SIMD instructions.

Advantages

An application that may take advantage of SIMD is one where the same value is being added to (or subtracted from) a large number of data points, a common operation in many multimedia applications. One example would be changing the brightness of an image. Each pixel of an image consists of three values for the brightness of the red (R), green (G) and blue (B) portions of the color. To change the brightness, the R, G and B values are read from memory, a value is added to (or subtracted from) them, and the resulting values are written back out to memory.

With a SIMD processor there are two improvements to this process. First, the data is understood to be in blocks, and a number of values can be loaded all at once. Instead of a series of instructions saying "get this pixel, now get the next pixel", a SIMD processor will have a single instruction that effectively says "get lots of pixels" ("lots" is a number that varies from design to design). For a variety of reasons, this can take much less time than "getting" each pixel individually, as with a traditional CPU design.

Another advantage is that SIMD systems typically include only those instructions that can be applied to all of the data in one operation. In other words, if the SIMD system works by loading up eight data points at once, the add operation being applied to the data will happen to all eight values at the same time. Although the same is true for any superscalar processor design, the level of parallelism in a SIMD system is typically much higher.
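
The brightness adjustment described above can be sketched in C. This is a minimal illustration rather than a reference implementation: the function names brighten_scalar and brighten_simd are hypothetical, the image is assumed to be a flat buffer of 8-bit channel values, and the SIMD path assumes an x86 compiler that defines __SSE2__ and provides the SSE2 intrinsics in <emmintrin.h>. On any other target the code simply falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics (assumed available on this target) */
#endif

/* Scalar version: one channel value is read, adjusted and written per iteration. */
void brighten_scalar(uint8_t *channels, size_t n, uint8_t delta)
{
    assert(channels != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)channels[i] + delta;
        channels[i] = (uint8_t)(sum > 255u ? 255u : sum);  /* clamp at full brightness */
    }
}

/* SIMD version: sixteen channel values are loaded, adjusted and stored per iteration. */
void brighten_simd(uint8_t *channels, size_t n, uint8_t delta)
{
    size_t i = 0;
    assert(channels != NULL);
#if defined(__SSE2__)
    const __m128i vdelta = _mm_set1_epi8((char)delta);  /* delta copied into all 16 lanes */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(channels + i));  /* "get lots of pixels" */
        v = _mm_adds_epu8(v, vdelta);                                   /* 16 saturating adds at once */
        _mm_storeu_si128((__m128i *)(channels + i), v);
    }
#endif
    /* Remaining values (or the whole buffer when SSE2 is unavailable). */
    brighten_scalar(channels + i, n - i, delta);
}
```

Both functions saturate at 255 instead of wrapping around, so an already bright channel stays at full brightness. The unaligned load and store are used so that the sketch makes no assumption about how the buffer was allocated.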

Disadvantages

  • Not all algorithms can be vectorized. For example, a flow-control-heavy task like code parsing wouldn't benefit from SIMD.
  • SIMD hardware also has large register files, which increase power consumption and chip area.
  • Currently, implementing an algorithm with SIMD instructions usually requires human labor; most compilers don't generate SIMD instructions from a typical C program, for instance. Vectorization in compilers is an active area of computer science research. (Compare vector processing.)
  • Programming with particular SIMD instruction sets can involve numerous low-level challenges.
    • SSE (Streaming SIMD Extensions) has restrictions on data alignment; programmers familiar with the x86 architecture may not expect this.
    • Gathering data into SIMD registers and scattering it to the correct destination locations is tricky and can be inefficient.
    • Specific instructions like rotations or three-operand addition aren't in some SIMD instruction sets.
    • Instruction sets are architecture-specific: old processors and non-x86 processors lack SSE entirely, for instance, so programmers must provide non-vectorized implementations (or different vectorized implementations) for them.
    • The early MMX instruction set shared a register file with the floating-point stack, which caused inefficiencies when mixing floating-point and MMX code. However, SSE2 corrects this.
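
Two of the challenges listed above, the alignment restrictions of SSE and the need for a non-vectorized fallback on processors without it, can be illustrated with a short C sketch. The names is_aligned16 and add_floats are hypothetical, and the code assumes a compiler that defines __SSE__ and ships <xmmintrin.h> on x86 targets; it is one possible arrangement, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics (assumed available on this target) */
#endif

/* Returns nonzero if p lies on a 16-byte boundary, as aligned SSE loads require. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Computes dst[i] = a[i] + b[i] for i in [0, n). */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__SSE__)
    if (is_aligned16(dst) && is_aligned16(a) && is_aligned16(b)) {
        /* Aligned path: _mm_load_ps and _mm_store_ps fault on misaligned addresses. */
        for (; i + 4 <= n; i += 4)
            _mm_store_ps(dst + i, _mm_add_ps(_mm_load_ps(a + i), _mm_load_ps(b + i)));
    } else {
        /* Unaligned path: accepts any address, possibly at some cost in speed. */
        for (; i + 4 <= n; i += 4)
            _mm_storeu_ps(dst + i, _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    }
#endif
    /* Non-vectorized fallback: finishes the tail, or does all the work without SSE. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The alignment check is done at run time here for simplicity. Code that controls its own allocations would more typically guarantee 16-byte alignment up front and use only the aligned path.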

Chronology

Examples of SIMD supercomputers (not including vector processors):
  • ILLIAC IV, circa 1974
  • ICL Distributed Array Processor (DAP), circa 1974
  • Burroughs Scientific Processor, circa 1976
  • Geometric-Arithmetic Parallel Processor (GAPP), from Martin Marietta, starting in 1981, continued at Lockheed Martin, then at Teranex and Silicon Optix
  • Massively Parallel Processor (MPP), from NASA/Goddard Space Flight Center, circa 1983-1991
  • Connection Machine, models 1 and 2 (CM-1 and CM-2), from Thinking Machines Corporation, circa 1985
  • MasPar MP-1 and MP-2, circa 1987-1996
  • Zephyr DC computer from Wavetracer, circa 1991
  • Xplor, from Pyxsys, Inc., circa 2001.


There were many others from that era too.

Hardware

Small-scale (64 or 128 bits) SIMD became popular on general-purpose CPUs in the early 1990s and continued through 1997 and later with Motion Video Instructions (MVI) for Alpha. SIMD instructions can be found, to one degree or another, on most CPUs, including IBM's AltiVec and SPE for PowerPC, HP's PA-RISC Multimedia Acceleration eXtensions (MAX), Intel's MMX and iwMMXt, SSE, SSE2, SSE3 and SSSE3, AMD's 3DNow!, ARC's ARC Video subsystem, SPARC's VIS and VIS2, Sun's MAJC, ARM's NEON technology, and MIPS' MDMX (MaDMaX) and MIPS-3D. The instruction set of the SPU in the Cell processor, co-developed by IBM, Sony and Toshiba, is heavily SIMD-based. NXP, founded by Philips, developed several SIMD processors named Xetal. The Xetal has 320 16-bit processor elements designed especially for vision tasks.

Modern graphics processing units (GPUs) are often wide SIMD implementations, capable of branches, loads, and stores on 128 or 256 bits at a time.

Intel's AVX SIMD instructions now process 256 bits of data at once. Intel's Larrabee prototype microarchitecture includes more than two 512-bit SIMD registers on each of its cores (VPU: Wide Vector Processing Units), and this 512-bit SIMD capability is being continued in Intel's future Many Integrated Core Architecture (Intel MIC).

Software

SIMD instructions are widely used to process 3D graphics, although modern graphics cards with embedded SIMD have largely taken over this task from the CPU. Some systems also include permute functions that re-pack elements inside vectors, making them particularly useful for data processing and compression. They are also used in cryptography. The trend of general-purpose computing on GPUs (GPGPU) may lead to wider use of SIMD in the future.

Adoption of SIMD systems in personal computer
Personal computer
A personal computer is any general-purpose computer whose size, capabilities, and original sales price make it useful for individuals, and which is intended to be operated directly by an end-user with no intervening computer operator...

 software was at first slow, due to a number of problems. One was that many of the early SIMD instruction sets tended to slow overall performance of the system due to the re-use of existing floating point registers. Other systems, like MMX and 3DNow!
3DNow!
3DNow! is an extension to the x86 instruction set developed by Advanced Micro Devices . It adds single instruction multiple data instructions to the base x86 instruction set, enabling it to perform simple vector processing, which improves the performance of many graphic-intensive applications...

, offered support for data types that were not interesting to a wide audience and had expensive context switching instructions to switch between using the FPU and MMX registers
Processor register
In computer architecture, a processor register is a small amount of storage available as part of a CPU or other digital processor. Such registers are addressed by mechanisms other than main memory and can be accessed more quickly...

. Compilers also often lacked support requiring programmers to resort to assembly language
Assembly language
An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture...

 coding.

SIMD on x86 had a slow start. The introduction of 3DNow! by AMD and SSE by Intel confused matters somewhat, but today the system seems to have settled down (after AMD adopted SSE) and newer compilers should result in more SIMD-enabled software. Intel and AMD now both provide optimized math libraries that use SIMD instructions, and open source alternatives like libSIMD and SIMDx86 have started to appear.

Apple Computer had somewhat more success, even though they entered the SIMD market later than the rest. AltiVec offered a rich system and can be programmed using increasingly sophisticated compilers from Motorola, IBM and GNU, so assembly language programming is rarely needed. Additionally, many of the systems that would benefit from SIMD were supplied by Apple itself, for example iTunes and QuickTime. However, in 2006, Apple computers moved to Intel x86 processors. Apple's APIs and development tools (Xcode) were rewritten to use SSE2 and SSE3 instead of AltiVec. Apple was the dominant purchaser of PowerPC chips from IBM and Freescale Semiconductor, and even though they abandoned the platform, further development of AltiVec continues in several Power Architecture designs from Freescale and IBM.

SIMD within a register, or SWAR, is a range of techniques and tricks used for performing SIMD in general-purpose registers on hardware that doesn't provide any direct support for SIMD instructions. This makes it possible to exploit parallelism in certain algorithms even on such hardware.
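
As a hedged illustration of one well-known SWAR trick, the following C sketch adds four 8-bit values packed into an ordinary 32-bit integer, using only general-purpose arithmetic. The helper names swar_pack, swar_lane and swar_add_u8x4 are hypothetical, and lane 0 is assumed to be the least significant byte.

```c
#include <assert.h>
#include <stdint.h>

/* Packs four 8-bit lanes into one 32-bit word; lane 0 is the least significant byte. */
uint32_t swar_pack(uint8_t l0, uint8_t l1, uint8_t l2, uint8_t l3)
{
    return (uint32_t)l0 | ((uint32_t)l1 << 8) | ((uint32_t)l2 << 16) | ((uint32_t)l3 << 24);
}

/* Extracts lane i (0..3) from a packed word. */
uint8_t swar_lane(uint32_t word, unsigned i)
{
    assert(i < 4);
    return (uint8_t)(word >> (8u * i));
}

/* Adds the four 8-bit lanes of x and y independently, each wrapping modulo 256. */
uint32_t swar_add_u8x4(uint32_t x, uint32_t y)
{
    const uint32_t low7 = 0x7F7F7F7Fu;  /* low seven bits of every lane */
    const uint32_t top  = 0x80808080u;  /* top bit of every lane */

    /* Adding only the low seven bits means a carry can reach the top bit of
       its own lane but can never spill into the neighbouring lane. */
    uint32_t sum = (x & low7) + (y & low7);

    /* The top bits are then combined with XOR, which adds them without
       producing a carry out of the lane. */
    return sum ^ ((x ^ y) & top);
}
```

A plain 32-bit addition of the two packed words would let a carry from one lane corrupt the next, which is why the top bit of each lane is handled separately.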

Commercial applications

Though it has generally proven difficult to find sustainable commercial applications for SIMD-only processors, one that has had some measure of success is the GAPP
Geometric-Arithmetic Parallel Processor
The GAPP, invented by Polish mathematician Włodzimierz Holsztyński in 1981, was patented by Martin Marietta and is now owned by Silicon Optix, Inc. In terms of network topology, the GAPP is a mesh-connected array of single-bit SIMD processing elements (PEs), where each PE can communicate with its...

, which was developed by Lockheed Martin
Lockheed Martin
Lockheed Martin is an American global aerospace, defense, security, and advanced technology company with worldwide interests. It was formed by the merger of Lockheed Corporation with Martin Marietta in March 1995. It is headquartered in Bethesda, Maryland, in the Washington Metropolitan Area....

 and taken to the commercial sector by its spin-off Teranex. The GAPP's recent incarnations have become a powerful tool in real-time video processing
Digital image processing
Digital image processing is the use of computer algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing...

 applications like conversion between various video standards and frame rates (NTSC
NTSC
NTSC, named for the National Television System Committee, is the analog television system that is used in most of North America, most of South America, Burma, South Korea, Taiwan, Japan, the Philippines, and some Pacific island nations and territories. Most countries using the NTSC standard, as...

 to/from PAL
PAL
PAL, short for Phase Alternating Line, is an analogue television colour encoding system used in broadcast television systems in many countries. Other common analogue television systems are NTSC and SECAM. This page primarily discusses the PAL colour encoding system...

, NTSC to/from HDTV
High-definition television
High-definition television is video that has resolution substantially higher than that of traditional television systems. HDTV has one or two million pixels per frame, roughly five times that of SD...

 formats, etc.), deinterlacing
Deinterlacing
Deinterlacing is the process of converting interlaced video, such as common analog television signals or 1080i format HDTV signals, into a non-interlaced form....

, image noise reduction
Noise reduction
Noise reduction is the process of removing noise from a signal. All recording devices, both analogue and digital, have traits which make them susceptible to noise...

, adaptive video compression, and image enhancement.

A more ubiquitous application for SIMD is found in video games: nearly every modern video game console
Video game console
A video game console is an interactive entertainment computer or customized computer system that produces a video display signal which can be used with a display device to display a video game...

 since 1998
History of video game consoles (sixth generation)
The sixth-generation era refers to the computer and video games, video game consoles, and video game handhelds available at the turn of the 21st century. Platforms of the sixth generation include the Sega Dreamcast, Sony PlayStation 2, Nintendo GameCube, and Microsoft Xbox...

 has incorporated a SIMD processor somewhere in its architecture. The PlayStation 2
PlayStation 2
The PlayStation 2 is a sixth-generation video game console manufactured by Sony as part of the PlayStation series. Its development was announced in March 1999 and it was first released on March 4, 2000, in Japan...

 was unusual in that one of its vector-float units could function as an autonomous DSP executing its own instruction stream, or as a coprocessor driven by ordinary CPU instructions. 3D graphics applications tend to lend themselves well to SIMD processing as they rely heavily on operations with 4-dimensional vectors. Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

's Direct3D 9.0
DirectX
Microsoft DirectX is a collection of application programming interfaces for handling tasks related to multimedia, especially game programming and video, on Microsoft platforms. Originally, the names of these APIs all began with Direct, such as Direct3D, DirectDraw, DirectMusic, DirectPlay,...

 now chooses at runtime processor-specific implementations of its own math operations, including the use of SIMD-capable instructions.
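
The fit between 3D graphics and SIMD can be illustrated with a short sketch that adds two 4-component vectors, such as homogeneous coordinates (x, y, z, w). The function name vec4_add is hypothetical. The SSE path uses the Intel intrinsics _mm_loadu_ps, _mm_add_ps and _mm_storeu_ps, and is selected at compile time when the compiler defines __SSE__; otherwise a scalar loop is used. This is only an illustration of the idea, not how Direct3D or any console SDK implements its math routines.

```c
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */
#endif

/* Hypothetical helper: out = a + b for 4-component float vectors. */
void vec4_add(const float a[4], const float b[4], float out[4])
{
#if defined(__SSE__)
    /* One 128-bit register holds all four floats, so a single
     * instruction adds x, y, z and w at the same time. */
    __m128 va = _mm_loadu_ps(a);   /* unaligned load of 4 floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
#else
    /* Scalar fallback: the same result, one component at a time. */
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

Both paths produce the same result; the SIMD path simply replaces four scalar additions with one vector addition, which is why vector-heavy 3D code benefits so directly.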

One of the recent processors to use vector processing is the Cell Processor
Cell (microprocessor)
Cell is a microprocessor architecture jointly developed by Sony, Sony Computer Entertainment, Toshiba, and IBM, an alliance known as "STI". The architectural design and first implementation were carried out at the STI Design Center in Austin, Texas over a four-year period beginning March 2001 on a...

 developed by IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

 in cooperation with Toshiba
Toshiba
Toshiba is a multinational electronics and electrical equipment corporation headquartered in Tokyo, Japan. It is a diversified manufacturer and marketer of electrical products, spanning information & communications equipment and systems, Internet-based solutions and services, electronic components and...

 and Sony
Sony
Sony Corporation, commonly referred to as Sony, is a Japanese multinational conglomerate corporation headquartered in Minato, Tokyo, Japan and the world's fifth largest media conglomerate measured by revenues....

. It uses a number of SIMD processors (each with independent RAM
Random-access memory
Random access memory is a form of computer data storage. Today, it takes the form of integrated circuits that allow stored data to be accessed in any order with a worst case performance of constant time. Strictly speaking, modern types of DRAM are therefore not random access, as data is read in...

 and controlled by a general-purpose CPU) and is geared towards the huge datasets required by 3D and video processing applications.

A recent advancement by ZiiLabs was the production of a SIMD-type processor that can be used in mobile devices, such as media players and mobile phones.

Larger-scale commercial SIMD processors are available from ClearSpeed Technology, Ltd. and Stream Processors, Inc. ClearSpeed
ClearSpeed
ClearSpeed Technology Ltd is a semiconductor company, formed in 2002 to develop enhanced SIMD processors for use in high-performance computing and embedded systems. Based in Bristol, UK, the company has been selling its processors since 2005...

's CSX600 (2004) has 96 cores, each with two double-precision floating-point units, while the CSX700 (2008) has 192 cores. Stream Processors is headed by computer architect Bill Dally. The company's Storm-1 processor (2007) contains 80 SIMD cores controlled by a MIPS CPU.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.