OpenCL - AbsoluteAstronomy.com

OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. OpenCL includes a language (based on C99) for writing kernels (functions that execute on OpenCL devices), plus APIs that are used to define and then control the platforms. OpenCL provides parallel computing using task-based and data-based parallelism. It has been adopted by Intel, AMD, Nvidia, and ARM.

OpenCL gives any application access to the graphics processing unit for non-graphical computing. Thus, OpenCL extends the power of the graphics processing unit beyond graphics (general-purpose computing on graphics processing units, or GPGPU).
Academic researchers have investigated automatically compiling OpenCL programs into application-specific processors running on FPGAs, and commercial FPGA vendors are developing tools to translate OpenCL to run on their FPGA devices.

OpenCL is analogous to the open industry standards OpenGL and OpenAL, for 3D graphics and computer audio, respectively. OpenCL is managed by the non-profit technology consortium Khronos Group.

History

OpenCL was initially developed by Apple Inc., which holds trademark rights, and refined into an initial proposal in collaboration with technical teams at AMD, IBM, Intel, and Nvidia. Apple submitted this initial proposal to the Khronos Group. On June 16, 2008, the Khronos Compute Working Group was formed with representatives from CPU, GPU, embedded-processor, and software companies. This group worked for five months to finish the technical details of the specification for OpenCL 1.0 by November 18, 2008. This technical specification was reviewed by the Khronos members and approved for public release on December 8, 2008.

OpenCL 1.0 was released with Mac OS X Snow Leopard. According to an Apple press release:

Snow Leopard further extends support for modern hardware with Open Computing Language (OpenCL), which lets any application tap into the vast gigaflops of GPU computing power previously available only to graphics applications. OpenCL is based on the C programming language and has been proposed as an open standard.

AMD has decided to support OpenCL (and DirectX 11) instead of the now deprecated Close to Metal in its Stream framework.
RapidMind announced their adoption of OpenCL underneath their development platform, in order to support GPUs from multiple vendors with one interface.
On December 9, 2008, Nvidia announced its intention to add full support for the OpenCL 1.0 specification to its GPU Computing Toolkit. On October 30, 2009, IBM released its first OpenCL implementation as a part of the XL compilers.

OpenCL 1.1 was ratified by the Khronos Group on June 14, 2010, and adds functionality for enhanced parallel programming flexibility and performance, including:

New data types including 3-component vectors and additional image formats;
Handling commands from multiple host threads and processing buffers across multiple devices;
Operations on regions of a buffer including read, write and copy of 1D, 2D or 3D rectangular regions;
Enhanced use of events to drive and control command execution;
Additional OpenCL built-in C functions such as integer clamp, shuffle and asynchronous strided copies;
Improved OpenGL interoperability through efficient sharing of images and buffers by linking OpenCL and OpenGL events.
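
The rectangular-region operations listed above can be pictured with a small host-side sketch in plain C. It does not call the OpenCL API. It only emulates, under stated assumptions, the addressing rule that the OpenCL 1.1 rectangular commands (such as clEnqueueReadBufferRect) use, where the byte offset of a region is origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]. The helper names rect_offset and copy_rect are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of a region's first element inside a buffer:
   origin[2] * slice_pitch + origin[1] * row_pitch + origin[0].
   origin[0] is in bytes; origin[1] and origin[2] count rows and slices. */
static size_t rect_offset(const size_t origin[3], size_t row_pitch, size_t slice_pitch)
{
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}

/* Host-side emulation (hypothetical helper, not an OpenCL call): copy a 2D/3D
   rectangle out of `src` into a tightly packed `dst`.
   region[0] is a width in bytes; region[1] and region[2] count rows and slices. */
static void copy_rect(const unsigned char *src, unsigned char *dst,
                      const size_t origin[3], const size_t region[3],
                      size_t row_pitch, size_t slice_pitch)
{
    assert(src != NULL && dst != NULL && region[0] <= row_pitch);
    size_t base = rect_offset(origin, row_pitch, slice_pitch);
    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            memcpy(dst, src + base + z * slice_pitch + y * row_pitch, region[0]);
            dst += region[0];
        }
    }
}
```

For a 4x4 byte image stored row by row, an origin of {1, 1, 0} and a region of {2, 2, 1} select the four bytes in the middle of the image.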

On November 15, 2011, the Khronos Group announced the OpenCL 1.2 specification, which added significant functionality over the previous versions in terms of performance and features for parallel programming. The most notable features include:

Device partitioning: the ability to partition a device into sub-devices so that work assignments can be allocated to individual compute units. This is useful for reserving areas of the device in order to reduce latency for time-critical tasks.
Separate compilation and linking of objects: the functionality to compile OpenCL into external libraries for inclusion into other programs.
Enhanced image support: 1.2 adds support for 1D images and 1D/2D image arrays. Furthermore, the OpenGL sharing extensions now allow for OpenGL 1D textures and 1D/2D texture arrays to be used to create OpenCL images.
Built-in kernels: custom devices that contain specific unique functionality are now integrated more closely into the OpenCL framework. Kernels can be called to use specialised or non-programmable aspects of underlying hardware. Examples include video encoding/decoding and digital signal processors.
DirectX functionality: DX9 media surface sharing allows for efficient sharing between OpenCL and DX9 or DXVA media surfaces. Likewise, seamless sharing between OpenCL and DX11 surfaces is enabled.

The OpenCL specification is under development at Khronos, which is open to any interested company to join.

Implementation

On December 10, 2008, AMD and Nvidia held the first public OpenCL demonstration, a 75-minute presentation at Siggraph Asia 2008. AMD showed a CPU-accelerated OpenCL demo explaining the scalability of OpenCL on one or more cores while Nvidia showed a GPU-accelerated demo.
On March 16, 2009, at the 4th Multicore Expo, Imagination Technologies announced the PowerVR SGX543MP, the first GPU of this company to feature OpenCL support.
On March 26, 2009, at GDC 2009, AMD and Havok demonstrated the first working implementation for OpenCL accelerating Havok Cloth on an AMD Radeon HD 4000 series GPU.
On April 20, 2009, Nvidia announced the release of its OpenCL driver and SDK to developers participating in its OpenCL Early Access Program.
On August 5, 2009, AMD unveiled the first development tools for its OpenCL platform as part of its ATI Stream SDK v2.0 Beta Program.
On August 28, 2009, Apple released Mac OS X Snow Leopard, which contains a full implementation of OpenCL.

OpenCL in Snow Leopard is supported on the NVIDIA GeForce 320M, GeForce GT 330M, GeForce 9400M, GeForce 9600M GT, GeForce 8600M GT, GeForce GT 120, GeForce GT 130, GeForce GTX 285, GeForce 8800 GT, GeForce 8800 GS, Quadro FX 4800, Quadro FX 5600, ATI Radeon HD 4670, ATI Radeon HD 4850, ATI Radeon HD 4870, ATI Radeon HD 5670, ATI Radeon HD 5750, ATI Radeon HD 5770 and ATI Radeon HD 5870.

On September 28, 2009, NVIDIA released its own OpenCL drivers and SDK implementation.
On October 13, 2009, AMD released the fourth beta of the ATI Stream SDK 2.0, which provides a complete OpenCL implementation on both R700/R800 GPUs and SSE3 capable CPUs. The SDK is available for both Linux and Windows.
On November 26, 2009, NVIDIA released drivers for OpenCL 1.0 (rev 48).

The Apple, Nvidia, RapidMind and Gallium3D implementations of OpenCL are all based on the LLVM compiler technology and use the Clang compiler as their frontend.

On October 27, 2009, S3 released their first product supporting native OpenCL 1.0, the Chrome 5400E embedded graphics processor.
On December 10, 2009, VIA released their first product supporting OpenCL 1.0, the ChromotionHD 2.0 video processor included in the VN1000 chipset.
On December 21, 2009, AMD released the production version of the ATI Stream SDK 2.0, which provides OpenCL 1.0 support for R800 GPUs and beta support for R700 GPUs.
On June 1, 2010, ZiiLABS released details of their first OpenCL implementation for the ZMS processor for handheld, embedded and digital home products.
On June 30, 2010, IBM released a fully conformant version of OpenCL 1.0.
On September 13, 2010, Intel released details of their first OpenCL implementation for the Sandy Bridge chip architecture. Sandy Bridge will integrate Intel's newest graphics chip technology directly onto the central processing unit.
On November 15, 2010, Wolfram Research released Mathematica 8 with the OpenCLLink package.
On March 3, 2011, Khronos Group announced the formation of the WebCL working group to explore defining a JavaScript binding to OpenCL. This creates the potential to harness GPU and multi-core CPU parallel processing from a Web browser.
On March 31, 2011, IBM released a fully conformant version of OpenCL 1.1.
On April 25, 2011, IBM released OpenCL Common Runtime v0.1 for Linux on x86 Architecture.
On May 4, 2011, Nokia Research released an open source WebCL extension for the Firefox web browser, providing a JavaScript binding to OpenCL.
On July 1, 2011, Samsung Electronics released an open source prototype implementation of WebCL for WebKit, providing a JavaScript binding to OpenCL.
On August 8, 2011, AMD released the OpenCL-driven AMD Accelerated Parallel Processing (APP) Software Development Kit (SDK) v2.5, replacing the ATI Stream SDK as technology and concept.

OpenCL language

The programming language used to write computation kernels is based on C99 with some limitations and additions. It omits function pointers, recursion, bit fields, variable-length arrays, and standard C99 header files. The language is extended to express parallelism with vector types and operations, synchronization, and functions to work with work-items and work-groups. It has memory region qualifiers: __global, __local, __constant, and __private. Many built-in functions are also added.
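
As a rough illustration of the kernel style described above, the following sketch writes a one-element-per-work-item kernel the way it would look in OpenCL C, then runs it on the host in plain C. The macros that blank out __kernel and __global, the host-side get_global_id, and the names scale_add and run_scale_add are assumptions made only so the example runs without an OpenCL runtime. A real program would hand the kernel source to the OpenCL compiler and let the device schedule the work-items.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side shims (assumptions for illustration): they let the kernel body
   below compile as ordinary C. In OpenCL C these are language keywords. */
#define __kernel
#define __global

/* Emulated work-item index; in OpenCL C, get_global_id is a built-in. */
static size_t current_global_id;
static size_t get_global_id(unsigned int dim) { (void)dim; return current_global_id; }

/* Written as an OpenCL C kernel would be: each work-item updates one element
   of buffers that live in the __global memory region. */
__kernel void scale_add(__global const float *x, __global float *y, float a)
{
    size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}

/* Emulate a 1-D index space by running the kernel once per work-item,
   sequentially. A device would run these work-items in parallel. */
static void run_scale_add(const float *x, float *y, float a, size_t global_size)
{
    assert(x != NULL && y != NULL);
    for (current_global_id = 0; current_global_id < global_size; ++current_global_id)
        scale_add(x, y, a);
}
```

The kernel contains no loop over the data: the index space supplied at launch decides how many times the body runs and which element each run touches.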

Example

This example loads a fast Fourier transform (FFT) kernel and executes it:

// create a compute context with GPU device
context = clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);

// get one GPU device and create a command queue on it
// (device is assumed to be a caller-declared cl_device_id; a NULL device is not valid here)
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
queue = clCreateCommandQueue(context, device, 0, NULL);

// allocate the buffer memory objects
memobjs[0] = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float)*2*num_entries, srcA, NULL);
memobjs[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(float)*2*num_entries, NULL, NULL);

// create the compute program
program = clCreateProgramWithSource(context, 1, &fft1D_1024_kernel_src, NULL, NULL);

// build the compute program executable
clBuildProgram(program, 0, NULL, NULL, NULL, NULL);

// create the compute kernel
kernel = clCreateKernel(program, "fft1D_1024", NULL);

// choose the work-item dimensions before they are used to size the local-memory arguments
global_work_size[0] = num_entries;
local_work_size[0] = 64;

// set the args values (a NULL value with a non-zero size reserves __local memory)
clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memobjs[0]);
clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&memobjs[1]);
clSetKernelArg(kernel, 2, sizeof(float)*(local_work_size[0]+1)*16, NULL);
clSetKernelArg(kernel, 3, sizeof(float)*(local_work_size[0]+1)*16, NULL);

// execute the kernel over the 1-D range
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, NULL);
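
How global_work_size and local_work_size divide the index space can be sketched in plain C without any OpenCL calls. The helper names num_work_groups and global_id are hypothetical. The divisibility check reflects the OpenCL 1.x rule that the global size must be an exact multiple of the local size when a local size is given. The host code does not state a value for num_entries, so the figure of 1024 used in the note after the code is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in one dimension. OpenCL 1.x rejects a launch whose
   global size is not an exact multiple of the local size, so assert that. */
static size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Global ID of a work-item from its group and its position inside the group,
   with no global offset (the host code passes NULL for the offset). */
static size_t global_id(size_t group_id, size_t local_size, size_t local_id)
{
    assert(local_id < local_size);
    return group_id * local_size + local_id;
}
```

With an assumed num_entries of 1024 and the local size of 64 used above, num_work_groups(1024, 64) gives 16 work-groups, and work-item 5 of group 3 has global ID 197.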

The actual calculation: (Based on Fitting FFT onto the G80 Architecture)

// This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
// calls to a radix 16 function, another radix 16 function and then a radix 4 function

__kernel void fft1D_1024 (__global float2 *in, __global float2 *out,
                          __local float *sMemx, __local float *sMemy) {
    int tid = get_local_id(0);
    int blockIdx = get_group_id(0) * 1024 + tid;
    float2 data[16];

    // starting index of data to/from global memory
    in = in + blockIdx;
    out = out + blockIdx;

    globalLoads(data, in, 64); // coalesced global reads
    fftRadix16Pass(data); // in-place radix-16 pass
    twiddleFactorMul(data, tid, 1024, 0);

    // local shuffle using local memory
    localShuffle(data, sMemx, sMemy, tid, (((tid & 15) * 65) + (tid >> 4)));
    fftRadix16Pass(data); // in-place radix-16 pass
    twiddleFactorMul(data, tid, 64, 4); // twiddle factor multiplication

    localShuffle(data, sMemx, sMemy, tid, (((tid >> 4) * 64) + (tid & 15)));

    // four radix-4 function calls
    fftRadix4Pass(data); // radix-4 function number 1
    fftRadix4Pass(data + 4); // radix-4 function number 2
    fftRadix4Pass(data + 8); // radix-4 function number 3
    fftRadix4Pass(data + 12); // radix-4 function number 4

    // coalesced global writes
    globalStores(data, out, 64);
}
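
The kernel's index arithmetic can be checked on the host with a short plain C sketch. The helpers globalLoads and globalStores are not defined in this article, so the sketch assumes that globalLoads(data, in, 64) reads data[i] = in[i * 64] for i from 0 to 15. Under that assumption, the 64 work-items of a group, each holding 16 values, touch every one of the group's 1024 entries exactly once. The names load_index and group_covers_block_once are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { FFT_LEN = 1024, LOCAL_SIZE = 64, PER_ITEM = 16 };

/* Index of the i-th value read by work-item `tid` of work-group `group`,
   ASSUMING globalLoads(data, in, 64) reads data[i] = in[i * 64]. */
static size_t load_index(size_t group, size_t tid, size_t i)
{
    assert(tid < LOCAL_SIZE && i < PER_ITEM);
    size_t block_idx = group * FFT_LEN + tid; /* same formula as blockIdx in the kernel */
    return block_idx + i * LOCAL_SIZE;
}

/* Returns 1 if the work-items of `group` read each of its 1024 entries
   exactly once, and 0 otherwise. */
static int group_covers_block_once(size_t group)
{
    unsigned char hits[FFT_LEN] = {0};
    for (size_t tid = 0; tid < LOCAL_SIZE; ++tid) {
        for (size_t i = 0; i < PER_ITEM; ++i) {
            size_t idx = load_index(group, tid, i) - group * FFT_LEN;
            if (idx >= FFT_LEN || hits[idx]++ != 0)
                return 0;
        }
    }
    return 1;
}
```

Neighbouring work-items read neighbouring addresses at each step under this assumed layout, which is the access pattern the kernel's comments call coalesced.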

A full, open source implementation of an OpenCL FFT can be found on Apple's website.

OpenCL conformant products

The Khronos Group maintains an extended list of OpenCL conformant products; see OpenCL Conformant Products.

Synopsis of OpenCL conformant products
AMD APP SDK (supports OpenCL CPU and accelerated processing unit devices)	X86 + SSE2 (or higher) compatible CPUs 64bit & 32bit; Linux 2.6 PC, Windows Vista/7 PC	AMD Fusion E-350, E-240, C-50, C-30 with HD 6310/HD 6250	AMD Radeon/Mobility HD 6800, HD 5x00 series GPU, iGPU HD 6310/HD 6250	ATI FirePro Vx800 series GPU
Intel OpenCL SDK 1.1 (supports only OpenCL Intel Core based CPU device)	Intel CPUs with SSE 4.1, SSE 4.2 or AVX support; Microsoft Windows, Linux	Intel Core i7, i5, i3; 2nd Generation Intel Core i7/5/3	Intel Core 2 Solo, Duo, Quad, Extreme	Intel Xeon 7x00, 5x00, 3x00 (Core based)
IBM Servers with OpenCL Development Kit for Linux on Power running on Power VSX	IBM Power 755 (PERCS), 750	IBM BladeCenter PS70x Express	IBM BladeCenter JS2x, JS43	IBM BladeCenter QS22
IBM OpenCL Common Runtime (OCR)	X86 + SSE2 (or higher) compatible CPUs 64bit & 32bit; Linux 2.6 PC	AMD Fusion, NVIDIA ION and Intel Core i7, i5, i3; 2nd Generation Intel Core i7/5/3	AMD Radeon, NVIDIA GeForce and Intel Core 2 Solo, Duo, Quad, Extreme	ATI FirePro, NVIDIA Quadro and Intel Xeon 7x00, 5x00, 3x00 (Core based)
NVIDIA OpenCL Driver and Tools	NVIDIA Tesla C/D/S	NVIDIA GeForce GTS/GT/GTX	NVIDIA ION	NVIDIA Quadro FX/NVX/Plex
