OpenCL
OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. OpenCL includes a language (based on C99) for writing kernels (functions that execute on OpenCL devices), plus APIs that are used to define and then control the platforms. OpenCL provides parallel computing using task-based and data-based parallelism. It has been adopted by Intel, AMD, Nvidia, and ARM.

OpenCL gives any application access to the graphics processing unit for non-graphical computing. Thus, OpenCL extends the power of the graphics processing unit beyond graphics (general-purpose computing on graphics processing units, GPGPU).
Academic researchers have investigated automatically compiling OpenCL programs into application-specific processors running on FPGAs, and commercial FPGA vendors are developing tools to translate OpenCL to run on their FPGA devices.

OpenCL is analogous to the open industry standards OpenGL and OpenAL, for 3D graphics and computer audio, respectively. OpenCL is managed by the non-profit technology consortium Khronos Group.

History

OpenCL was initially developed by Apple Inc., which holds trademark rights, and refined into an initial proposal in collaboration with technical teams at AMD, IBM, Intel, and Nvidia. Apple submitted this initial proposal to the Khronos Group. On June 16, 2008, the Khronos Compute Working Group was formed with representatives from CPU, GPU, embedded-processor, and software companies. This group worked for five months to finish the technical details of the specification for OpenCL 1.0 by November 18, 2008. This technical specification was reviewed by the Khronos members and approved for public release on December 8, 2008.

OpenCL 1.0 was released with Mac OS X Snow Leopard. According to an Apple press release:

Snow Leopard further extends support for modern hardware with Open Computing Language (OpenCL), which lets any application tap into the vast gigaflops of GPU computing power previously available only to graphics applications. OpenCL is based on the C programming language and has been proposed as an open standard.


AMD decided to support OpenCL (and DirectX 11) instead of the now deprecated Close to Metal in its Stream framework. RapidMind announced its adoption of OpenCL underneath its development platform, in order to support GPUs from multiple vendors with one interface. On December 9, 2008, Nvidia announced its intention to add full support for the OpenCL 1.0 specification to its GPU Computing Toolkit. On October 30, 2009, IBM released its first OpenCL implementation as a part of the XL compilers.

OpenCL 1.1 was ratified by the Khronos Group on June 14, 2010, and adds functionality for enhanced parallel programming flexibility and performance, including:
  • New data types including 3-component vectors and additional image formats;
  • Handling commands from multiple host threads and processing buffers across multiple devices;
  • Operations on regions of a buffer including read, write and copy of 1D, 2D or 3D rectangular regions;
  • Enhanced use of events to drive and control command execution;
  • Additional OpenCL built-in C functions such as integer clamp, shuffle and asynchronous strided copies;
  • Improved OpenGL interoperability through efficient sharing of images and buffers by linking OpenCL and OpenGL events.
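
To make the rectangular-region operations in the list above more concrete, here is a minimal plain-C sketch of what a 2D rectangular copy between two linear buffers amounts to. It runs on the host without OpenCL. The name copy_rect_2d and all of its parameters are hypothetical, chosen only for illustration, and the addressing used (row index times row pitch, plus a byte offset within the row) is an assumption about how pitched 2D data is laid out, not code taken from any OpenCL implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a rectangle of `rows` rows, each `width_bytes`
 * wide, between two linear buffers whose rows start src_pitch and dst_pitch
 * bytes apart. Origins are given as (x in bytes, y in rows). */
void copy_rect_2d(const unsigned char *src, size_t src_pitch,
                  size_t src_x, size_t src_y,
                  unsigned char *dst, size_t dst_pitch,
                  size_t dst_x, size_t dst_y,
                  size_t width_bytes, size_t rows)
{
    /* The rectangle must fit inside one row of each buffer. */
    assert(src_x + width_bytes <= src_pitch);
    assert(dst_x + width_bytes <= dst_pitch);

    for (size_t r = 0; r < rows; ++r) {
        const unsigned char *s = src + (src_y + r) * src_pitch + src_x;
        unsigned char *d = dst + (dst_y + r) * dst_pitch + dst_x;
        memcpy(d, s, width_bytes); /* one contiguous row of the region */
    }
}
```

Each row of the region is contiguous, so the copy reduces to one memcpy per row; only the starting offsets differ between source and destination.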


On November 15, 2011, the Khronos Group announced the OpenCL 1.2 specification, which added significant functionality over the previous versions in terms of performance and features for parallel programming. The most notable features include:
  • Device partitioning: the ability to partition a device into sub-devices so that work assignments can be allocated to individual compute units. This is useful for reserving areas of the device in order to reduce latency for time-critical tasks.
  • Separate compilation and linking of objects: the functionality to compile OpenCL into external libraries for inclusion into other programs.
  • Enhanced image support: 1.2 adds support for 1D images and 1D/2D image arrays. Furthermore, the OpenGL sharing extensions now allow for OpenGL 1D textures and 1D/2D texture arrays to be used to create OpenCL images.
  • Built-in kernels: custom devices that contain specific unique functionality are now integrated more closely into the OpenCL framework. Kernels can be called to use specialised or non-programmable aspects of underlying hardware. Examples include video encoding/decoding and digital signal processors.
  • DirectX functionality: DX9 media surface sharing allows for efficient sharing between OpenCL and DX9 or DXVA media surfaces. Likewise, seamless sharing between OpenCL and DX11 surfaces is enabled.


The OpenCL specification is under development at Khronos, which is open to any interested company to join.

Implementation

  • On December 10, 2008, AMD and Nvidia held the first public OpenCL demonstration, a 75-minute presentation at Siggraph Asia 2008. AMD showed a CPU-accelerated OpenCL demo explaining the scalability of OpenCL on one or more cores while Nvidia showed a GPU-accelerated demo.
  • On March 16, 2009, at the 4th Multicore Expo, Imagination Technologies announced the PowerVR SGX543MP, the first GPU of this company to feature OpenCL support.
  • On March 26, 2009, at GDC 2009, AMD and Havok demonstrated the first working implementation of OpenCL, accelerating Havok Cloth on an AMD Radeon HD 4000 series GPU.
  • On April 20, 2009, Nvidia announced the release of its OpenCL driver and SDK to developers participating in its OpenCL Early Access Program.
  • On August 5, 2009, AMD unveiled the first development tools for its OpenCL platform as part of its ATI Stream SDK v2.0 Beta Program.
  • On August 28, 2009, Apple released Mac OS X Snow Leopard, which contains a full implementation of OpenCL.
    OpenCL in Snow Leopard is supported on the NVIDIA GeForce 320M, GeForce GT 330M, GeForce 9400M, GeForce 9600M GT, GeForce 8600M GT, GeForce GT 120, GeForce GT 130, GeForce GTX 285, GeForce 8800 GT, GeForce 8800 GS, Quadro FX 4800, Quadro FX 5600, ATI Radeon HD 4670, ATI Radeon HD 4850, ATI Radeon HD 4870, ATI Radeon HD 5670, ATI Radeon HD 5750, ATI Radeon HD 5770 and ATI Radeon HD 5870.
  • On September 28, 2009, NVIDIA released its own OpenCL drivers and SDK implementation.
  • On October 13, 2009, AMD released the fourth beta of the ATI Stream SDK 2.0, which provides a complete OpenCL implementation on both R700/R800 GPUs and SSE3-capable CPUs. The SDK is available for both Linux and Windows.
  • On November 26, 2009, NVIDIA released drivers for OpenCL 1.0 (rev 48).
The Apple, Nvidia, RapidMind and Gallium3D implementations of OpenCL are all based on the LLVM compiler technology and use the Clang compiler as their frontend.
  • On October 27, 2009, S3 released their first product supporting native OpenCL 1.0, the Chrome 5400E embedded graphics processor.
  • On December 10, 2009, VIA released their first product supporting OpenCL 1.0, the ChromotionHD 2.0 video processor included in the VN1000 chipset.
  • On December 21, 2009, AMD released the production version of the ATI Stream SDK 2.0, which provides OpenCL 1.0 support for R800 GPUs and beta support for R700 GPUs.
  • On June 1, 2010, ZiiLABS released details of their first OpenCL implementation for the ZMS processor for handheld, embedded and digital home products.
  • On June 30, 2010, IBM released a fully conformant version of OpenCL 1.0.
  • On September 13, 2010, Intel released details of their first OpenCL implementation for the Sandy Bridge chip architecture. Sandy Bridge will integrate Intel's newest graphics chip technology directly onto the central processing unit.
  • On November 15, 2010, Wolfram Research released Mathematica 8 with the OpenCLLink package.
  • On March 3, 2011, Khronos Group announced the formation of the WebCL working group to explore defining a JavaScript binding to OpenCL. This creates the potential to harness GPU and multi-core CPU parallel processing from a Web browser.
  • On March 31, 2011, IBM released a fully conformant version of OpenCL 1.1.
  • On April 25, 2011, IBM released OpenCL Common Runtime v0.1 for Linux on x86 Architecture.
  • On May 4, 2011, Nokia Research released an open source WebCL extension for the Firefox web browser, providing a JavaScript binding to OpenCL.
  • On July 1, 2011, Samsung Electronics released an open source prototype implementation of WebCL for WebKit, providing a JavaScript binding to OpenCL.
  • On August 8, 2011, AMD released the OpenCL-driven AMD Accelerated Parallel Processing (APP) Software Development Kit (SDK) v2.5, replacing the ATI Stream SDK as technology and concept.

OpenCL language

The programming language used to write computation kernels is based on C99 with some limitations and additions. It omits the use of function pointers, recursion, bit fields, variable-length arrays, and standard C99 header files. The language is extended to make parallelism easy to use, with vector types and operations, synchronization, and functions to work with work-items and work-groups. It has memory region qualifiers: __global, __local, __constant, and __private. Many built-in functions are also added.
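
The functions for working with work-items and work-groups can be pictured with a small host-side model. The following plain-C sketch does not use OpenCL. It only mimics, for one dimension, how a global ID splits into a work-group ID and a local ID, which is what get_global_id(0), get_group_id(0) and get_local_id(0) report inside a kernel. The types ndrange_1d and work_item_ids and both function names are hypothetical, and the model assumes a zero global offset and a local size that divides the global size evenly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side model of a one-dimensional index space. */
typedef struct {
    size_t global_size; /* total number of work-items */
    size_t local_size;  /* work-items per work-group  */
} ndrange_1d;

/* The three IDs a work-item can query in one dimension. */
typedef struct {
    size_t global_id; /* models get_global_id(0) */
    size_t group_id;  /* models get_group_id(0)  */
    size_t local_id;  /* models get_local_id(0)  */
} work_item_ids;

/* Number of work-groups; assumes local_size divides global_size evenly. */
size_t num_groups(ndrange_1d r)
{
    assert(r.local_size > 0 && r.global_size % r.local_size == 0);
    return r.global_size / r.local_size;
}

/* Split a global ID into its work-group and its position in that group. */
work_item_ids ids_for(ndrange_1d r, size_t global_id)
{
    assert(r.local_size > 0 && global_id < r.global_size);
    work_item_ids w;
    w.global_id = global_id;
    w.group_id  = global_id / r.local_size;
    w.local_id  = global_id % r.local_size;
    return w;
}
```

Under these assumptions the three values always satisfy global_id = group_id * local_size + local_id, which is why a kernel can derive its data offset from either the global ID or the group and local IDs.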

Example

This example loads a fast Fourier transform (FFT) implementation and executes it:


// create a compute context with GPU device
context = clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);

// look up the devices in the context: the command queue needs a valid device, not NULL
// (devices_size and devices are illustrative names: a size_t and a cl_device_id *)
clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &devices_size);
devices = malloc(devices_size);
clGetContextInfo(context, CL_CONTEXT_DEVICES, devices_size, devices, NULL);

// create a command queue on the first device
queue = clCreateCommandQueue(context, devices[0], 0, NULL);

// allocate the buffer memory objects
memobjs[0] = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float)*2*num_entries, srcA, NULL);
memobjs[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(float)*2*num_entries, NULL, NULL);

// create the compute program
program = clCreateProgramWithSource(context, 1, &fft1D_1024_kernel_src, NULL, NULL);

// build the compute program executable
clBuildProgram(program, 0, NULL, NULL, NULL, NULL);

// create the compute kernel
kernel = clCreateKernel(program, "fft1D_1024", NULL);

// choose the work-item dimensions first: the local memory sizes below depend on local_work_size
global_work_size[0] = num_entries;
local_work_size[0] = 64;

// set the args values (a NULL value with a size reserves __local memory)
clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memobjs[0]);
clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&memobjs[1]);
clSetKernelArg(kernel, 2, sizeof(float)*(local_work_size[0]+1)*16, NULL);
clSetKernelArg(kernel, 3, sizeof(float)*(local_work_size[0]+1)*16, NULL);

// execute the kernel over the one-dimensional N-D range
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, NULL);
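
The size expressions passed to clCreateBuffer and clSetKernelArg above can be checked with a little arithmetic. This is a minimal plain-C sketch with hypothetical helper names (complex_buffer_bytes and local_scratch_bytes). It assumes the buffers hold interleaved real and imaginary float values, as the float2 kernel arguments suggest, and it simply mirrors the expressions in the host code rather than calling any OpenCL function.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes for a buffer of num_entries complex values stored as
 * interleaved (real, imaginary) float pairs. */
size_t complex_buffer_bytes(size_t num_entries)
{
    return sizeof(float) * 2 * num_entries;
}

/* Bytes for one __local scratch array, following the host code's
 * (local_work_size + 1) * 16 floats. */
size_t local_scratch_bytes(size_t local_work_size)
{
    assert(local_work_size > 0);
    return sizeof(float) * (local_work_size + 1) * 16;
}
```

With a 4-byte float, 1024 complex entries need 8192 bytes per buffer, and a local work size of 64 gives 65 * 16 = 1040 floats, or 4160 bytes, for each of the two scratch arrays.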


The actual calculation (based on Fitting FFT onto the G80 Architecture):

// This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
// calls to a radix 16 function, another radix 16 function and then a radix 4 function

__kernel void fft1D_1024 (__global float2 *in, __global float2 *out,
                          __local float *sMemx, __local float *sMemy) {
    int tid = get_local_id(0);
    int blockIdx = get_group_id(0) * 1024 + tid;
    float2 data[16];

    // starting index of data to/from global memory
    in = in + blockIdx;
    out = out + blockIdx;

    globalLoads(data, in, 64); // coalesced global reads
    fftRadix16Pass(data); // in-place radix-16 pass
    twiddleFactorMul(data, tid, 1024, 0);

    // local shuffle using local memory
    localShuffle(data, sMemx, sMemy, tid, (((tid & 15) * 65) + (tid >> 4)));
    fftRadix16Pass(data); // in-place radix-16 pass
    twiddleFactorMul(data, tid, 64, 4); // twiddle factor multiplication

    localShuffle(data, sMemx, sMemy, tid, (((tid >> 4) * 64) + (tid & 15)));

    // four radix-4 function calls
    fftRadix4Pass(data); // radix-4 function number 1
    fftRadix4Pass(data + 4); // radix-4 function number 2
    fftRadix4Pass(data + 8); // radix-4 function number 3
    fftRadix4Pass(data + 12); // radix-4 function number 4

    // coalesced global writes
    globalStores(data, out, 64);
}
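
The kernel calls helper functions such as fftRadix4Pass whose bodies are not shown here. As a rough illustration of what a radix-4 step computes, the following plain-C sketch performs an in-place forward 4-point discrete Fourier transform on four complex values. It is not the helper from the referenced implementation: the type cfloat (a stand-in for float2), the function names, the forward sign convention, and the natural output order are all assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical stand-in for OpenCL's float2: x = real part, y = imaginary part. */
typedef struct { float x, y; } cfloat;

cfloat c_add(cfloat a, cfloat b) { cfloat r = { a.x + b.x, a.y + b.y }; return r; }
cfloat c_sub(cfloat a, cfloat b) { cfloat r = { a.x - b.x, a.y - b.y }; return r; }

/* Multiply by -i: (x + iy) * (-i) = y - ix. */
cfloat c_mul_neg_i(cfloat a) { cfloat r = { a.y, -a.x }; return r; }

/* In-place forward 4-point DFT of v[0..3] = (a, b, c, d), natural output order. */
void radix4_butterfly(cfloat *v)
{
    assert(v != 0);
    cfloat t0 = c_add(v[0], v[2]);              /* a + c      */
    cfloat t1 = c_sub(v[0], v[2]);              /* a - c      */
    cfloat t2 = c_add(v[1], v[3]);              /* b + d      */
    cfloat t3 = c_mul_neg_i(c_sub(v[1], v[3])); /* -i (b - d) */

    v[0] = c_add(t0, t2); /* X0 = a + b + c + d   */
    v[1] = c_add(t1, t3); /* X1 = a - ib - c + id */
    v[2] = c_sub(t0, t2); /* X2 = a - b + c - d   */
    v[3] = c_sub(t1, t3); /* X3 = a + ib - c - id */
}
```

A 4-point transform needs no general complex multiplications, only additions, subtractions and a swap of real and imaginary parts, which is why radix-4 passes are a common building block for larger FFT sizes.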

A full, open source implementation of an OpenCL FFT can be found on Apple's website.

OpenCL conformant products

The Khronos Group publishes an extended list of OpenCL conformant products; see OpenCL Conformant Products.
Synopsis
Synopsis
A synopsis is a brief summary of the major points of a written work, either as prose or as a table; an abridgment or condensation of a work.-See also:*Synopsys, an electronic design automation company based in Mountain View, California...

 of OpenCL conformant products
AMD APP SDK (supports OpenCL CPU and Accelerated processing unit
Accelerated processing unit
An accelerated processing unit is a processing system that includes additional processing capability designed to accelerate one or more types of computations outside of a CPU. This may include a graphics processing unit used for general-purpose computing , a field-programmable gate array , or...

 Devices)
X86  + SSE2
SSE2
SSE2, Streaming SIMD Extensions 2, is one of the Intel SIMD processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2001. It extends the earlier SSE instruction set, and is intended to fully supplant MMX. Intel extended SSE2 to create SSE3...

 (or higher) compatible CPUs 64bit & 32bit; Linux 2.6 PC, Windows Vista/7 PC
AMD Fusion
AMD Fusion
AMD Fusion is the marketing name for a series of APUs by AMD. There are two flavors of Fusion currently available, one with its CPU logic based on the Bobcat core and the other its CPU logic based on the 10h core. In both cases the GPU logic is HD6xxx, which itself is based on the mobile variant of...

 E-350, E-240, C-50, C-30 with HD 6310/HD 6250
AMD Radeon
Radeon
Radeon is a brand of graphics processing units and random access memory produced by Advanced Micro Devices , first launched in 2000 by ATI Technologies, which was acquired by AMD in 2006. Radeon is the successor to the Rage line. There are four different groups, which can be differentiated by...

/Mobility HD 6800, HD 5x00 series GPU, iGPU HD 6310/HD 6250
ATI FirePro Vx800 series GPU
Intel OpenCL SDK 1.1 (supports only OpenCL Intel Core based CPU Device) Intel CPUs with SSE
SSE
-Computing:*Server-sent events, a technology to push content to web clients*Simple Sharing Extensions, a specification that extends RSS from unidirectional to bidirectional information flows*SPARQL Syntax Expressions*Microsoft SQL Server Express Edition...

 4.1, SSE 4.2 or AVX
Advanced Vector Extensions
Advanced Vector Extensions is an extension to the x86 instruction set architecture for microprocessors from Intel and AMD proposed by Intel in March 2008 and first supported by Intel with the Westmere processor shipping in Q1 2011 and now by AMD with the Bulldozer processor shipping in Q3 2011.AVX...

 support. Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

, Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

Intel Core
Intel Core
Yonah was the code name for Intel's first generation of 65 nm process mobile microprocessors, based on the Banias/Dothan-core Pentium M microarchitecture. SIMD performance has been improved through the addition of SSE3 instructions and improvements to SSE and SSE2 implementations, while integer...

 i7, i5, i3; 2nd Generation Intel Core i7/5/3
Intel Core 2 Solo, Duo Quad, Extreme Intel Xeon 7x00,5x00,3x00 (Core based)
IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

 Servers with OpenCL Development Kit for Linux on Power running on Power VSX
IBM Power
IBM Power Systems
Power Systems is the name of IBM's Power Architecture-based server line.Before the Power Systems line was announced on April 2, 2008, IBM had two distinct Power-based lines: the System i running IBM i - and the System p series running AIX or Linux.- History :IBM had two discrete Power Architecture...

 755 (PERCS
PERCS
PERCS , officially known as the Power 775, is IBM's answer to DARPA's High Productivity Computing Systems initiative....

), 750
IBM BladeCenter
IBM Power Systems
Power Systems is the name of IBM's Power Architecture-based server line.Before the Power Systems line was announced on April 2, 2008, IBM had two distinct Power-based lines: the System i running IBM i - and the System p series running AIX or Linux.- History :IBM had two discrete Power Architecture...

 PS70x Express
IBM BladeCenter JS2x, JS43 IBM BladeCenter QS22
IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

 OpenCL Common Runtime (OCR)
X86  + SSE2
SSE2
SSE2, Streaming SIMD Extensions 2, is one of the Intel SIMD processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2001. It extends the earlier SSE instruction set, and is intended to fully supplant MMX. Intel extended SSE2 to create SSE3...

 (or higher) compatible CPUs 64bit & 32bit; Linux 2.6 PC
AMD Fusion
AMD Fusion
AMD Fusion is the marketing name for a series of APUs by AMD. There are two flavors of Fusion currently available, one with its CPU logic based on the Bobcat core and the other its CPU logic based on the 10h core. In both cases the GPU logic is HD6xxx, which itself is based on the mobile variant of...

, NVIDIA ION
Ion
An ion is an atom or molecule in which the total number of electrons is not equal to the total number of protons, giving it a net positive or negative electrical charge. The name was given by physicist Michael Faraday for the substances that allow a current to pass between electrodes in a...

 and Intel Core
Intel Core
Yonah was the code name for Intel's first generation of 65 nm process mobile microprocessors, based on the Banias/Dothan-core Pentium M microarchitecture. SIMD performance has been improved through the addition of SSE3 instructions and improvements to SSE and SSE2 implementations, while integer...

 i7, i5, i3; 2nd Generation Intel Core i7/5/3
AMD Radeon, NVIDIA GeForce and Intel Core 2 Solo, Duo, Quad, Extreme
ATI FirePro, NVIDIA Quadro and Intel Xeon 7x00, 5x00, 3x00 (Core-based)
NVIDIA OpenCL Driver and Tools
NVIDIA Tesla C/D/S
NVIDIA GeForce GTS/GT/GTX
NVIDIA ION
NVIDIA Quadro FX/NVX/Plex

See also

  • GPGPU
  • CUDA
  • OpenHMPP
  • DirectCompute
  • FireStream
  • Larrabee
  • Close to Metal
  • BrookGPU
  • Lib Sh
  • CLyther
  • SIMD

Documentation


Drivers

  • OpenCL for Nvidia (Download page)
  • OpenCL for AMD (Download page)
  • OpenCL for Intel (Download page)
  • OpenCL for IBM (Download page)

Libraries


Language bindings and wrappers

  • WebCL JavaScript bindings for Firefox
  • WebCL JavaScript bindings for WebKit
  • cl4d D bindings
  • JOCL Java bindings
  • Aparapi Java bindings
  • JogAmp Java bindings
  • PyOpenCL Python bindings
  • The Open Toolkit library, a cross-platform C# wrapper for OpenGL, OpenAL and OpenCL on Mono/.NET
  • OpenCL 1.1 headers for thinBasic
  • CLoo C# bindings

Tools

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.