rCUDA
rCUDA is a middleware that enables Compute Unified Device Architecture (CUDA) remoting over a commodity network. That is, the middleware allows an application to use a CUDA-compatible graphics processing unit (GPU) installed in a remote computer as if it were installed in the computer where the application is being executed. This approach is based on the observation that the GPUs in a cluster are usually not fully utilized, and it is intended to reduce the number of GPUs in the cluster, thus lowering the costs related to acquisition and maintenance while keeping performance close to that of the fully equipped configuration.

Following a proposed distributed acceleration architecture for high performance computing clusters with GPUs attached to only a few of its nodes (see Figure 1), when a node without a local GPU executes an application that makes use of a GPU to accelerate part of its code (usually referred to as a kernel), some support has to be provided to deal with the data and code transfers between the local main memory and the remote GPU memory, as well as with the remote execution of the kernel.
rCUDA is designed following the client-server distributed architecture: on one side, clients employ a library of wrappers to the high-level CUDA Runtime API; on the other side, a GPU network service listens for requests on a TCP port. Figure 1 illustrates this proposal, where several nodes running different GPU-accelerated applications can concurrently make use of the whole set of accelerators installed in the cluster. When an application demands a GPU service, its request is directed to the client side of the architecture, running on that computer.

The client forwards the request to one of the servers, which accesses the GPU installed in that computer and executes the request on it. Time-multiplexing (sharing) the GPU is accomplished by spawning a separate server process for each remote execution, over a new GPU context.

rCUDA 3.1

The rCUDA framework enables the concurrent usage of CUDA-compatible devices from remote nodes.

rCUDA employs the socket API for the communication between clients and servers. Thus, it can be useful in three different environments:
  • Clusters. To reduce the number of GPUs installed in High Performance Clusters. This leads to energy savings, as well as other related savings like acquisition costs, maintenance, space, cooling, etc.
  • Academia. In commodity networks, to offer access to a few high performance GPUs concurrently to many students.
  • Virtual Machines. To enable access to the CUDA facilities of the physical machine.


The current version of rCUDA (v3.1) implements all functions in the CUDA Runtime API version 4.0, excluding graphics interoperability. rCUDA 3.1 targets the Linux OS (for 32- and 64-bit architectures) on both client and server sides.

Currently, rCUDA-ready applications have to be programmed using the plain C API. In addition, host and device code need to be compiled separately. Code examples can be found in the rCUDA SDK package, which is based on the NVIDIA CUDA SDK. The rCUDA User's Guide, available on the rCUDA webpage, provides further details.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 