Grid computing



 
 
Grid computing (or the use of a computational grid) is the application of several computers to a single problem at the same time -- usually to a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. According to John Patrick, IBM's vice president for Internet strategies, "the next big thing will be grid computing."


Grid computing depends on software to divide and apportion pieces of a program among several computers, sometimes up to many thousands. Grid computing can also be thought of as distributed and large-scale cluster computing, as well as a form of network-distributed parallel processing. It can be small -- confined to a network of computer workstations within a corporation, for example -- or it can be a large, public collaboration across many companies or networks.

Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers, acting in concert to perform very large tasks. This technology has been applied to computationally intensive scientific, mathematical, and academic problems through volunteer computing, and it is used in commercial enterprises for such diverse applications as drug discovery, economic forecasting, seismic analysis, and back-office data processing in support of e-commerce and Web services.

What distinguishes grid computing from conventional cluster computing systems is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. Also, while a computing grid may be dedicated to a specialized application, it is often constructed with the aid of general-purpose grid software libraries and middleware.

Grids versus conventional supercomputers

"Distributed" or "grid" computing in general is a special type of parallel computing that relies on complete computers (with onboard CPU, storage, power supply, network interface, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus.

The primary advantage of distributed computing is that each node can be purchased as commodity hardware, which when combined can produce similar computing resources to a multiprocessor supercomputer, but at lower cost. This is due to the economies of scale of producing commodity hardware, compared to the lower efficiency of designing and constructing a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors.

The high-end scalability of geographically dispersed grids is generally favorable, due to the low need for connectivity between nodes relative to the capacity of the public Internet.

There are also some differences in programming and deployment. It can be costly and difficult to write programs so that they can be run in the environment of a supercomputer, which may have a custom operating system, or require the program to address concurrency issues. If a problem can be adequately parallelized, a "thin" layer of "grid" infrastructure can allow conventional, standalone programs to run on multiple machines (but each given a different part of the same problem). This makes it possible to write and debug on a single conventional machine, and eliminates complications due to multiple instances of the same program running in the same shared memory and storage space at the same time.
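
As a rough illustration of such a thin layer, the following Python sketch splits one problem into independent parts and runs the same standalone function on each part. The names count_primes, split_range and run_on_grid are hypothetical, and local threads merely stand in for remote machines; a real grid would hand each part to a different node through its middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(start: int, stop: int) -> int:
    """Standalone program logic: count the primes in [start, stop).

    It needs no knowledge of the grid and can be debugged on one machine.
    """
    def is_prime(n: int) -> bool:
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True

    return sum(1 for n in range(start, stop) if is_prime(n))


def split_range(start: int, stop: int, parts: int) -> list[tuple[int, int]]:
    """Divide [start, stop) into contiguous, non-overlapping chunks."""
    step, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + step + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def run_on_grid(start: int, stop: int, workers: int = 4) -> int:
    """The "thin layer": hand each chunk to a worker, then combine results.

    Assumption: threads simulate separate machines in this sketch.
    """
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(lambda c: count_primes(*c), chunks))
    return sum(partial_counts)


if __name__ == "__main__":
    print(run_on_grid(0, 100))
```

Because the parts never exchange intermediate results, the only communication is sending out the chunk boundaries and collecting one number per chunk.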

Design considerations and variations

One feature of distributed grids is that they can be formed from computing resources belonging to multiple individuals or organizations (known as multiple administrative domains). This can facilitate commercial transactions, as in utility computing, or make it easier to assemble volunteer computing networks.

One disadvantage of this feature is that the computers which are actually performing the calculations might not be entirely trustworthy. The designers of the system must thus introduce measures to prevent malfunctions or malicious participants from producing false, misleading, or erroneous results, and from using the system as an attack vector. This often involves assigning work randomly to different nodes (presumably with different owners) and checking that at least two different nodes report the same answer for a given work unit. Discrepancies would identify malfunctioning and malicious nodes.
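
The redundancy check described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names assign_replicas and validate, the quorum of two, and the dictionary layout are invented for illustration and are not the interface of any particular grid middleware.

```python
import random
from collections import Counter


def assign_replicas(work_units, nodes, replicas=2, seed=None):
    """Randomly pick `replicas` distinct nodes for every work unit.

    `nodes` must be a list; `seed` only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    return {unit: rng.sample(nodes, replicas) for unit in work_units}


def validate(results, quorum=2):
    """Accept an answer only when at least `quorum` nodes agree on it.

    results: {unit: {node: answer}} with hashable answers.
    Returns (accepted, suspects): the agreed answers per unit, and the
    nodes whose answer disagreed with an accepted one.
    """
    accepted = {}
    suspects = set()
    for unit, by_node in results.items():
        answer, votes = Counter(by_node.values()).most_common(1)[0]
        if votes >= quorum:
            accepted[unit] = answer
            suspects.update(n for n, a in by_node.items() if a != answer)
        # Without a quorum the unit stays unaccepted and should be reissued.
    return accepted, suspects


if __name__ == "__main__":
    reported = {
        "u1": {"a": 42, "b": 42},
        "u2": {"a": 7, "c": 8, "d": 7},
        "u3": {"b": 1, "c": 2},
    }
    print(validate(reported))
```

A disagreeing node is only a suspect: a single mismatch cannot tell a hardware fault from deliberate cheating, so a real system would likely track discrepancies over many work units.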

Due to the lack of central control over the hardware, there is no way to guarantee that nodes will not drop out of the network at random times. Some nodes (like laptops or dialup Internet customers) may also be available for computation but not network communications for unpredictable periods. These variations can be accommodated by assigning large work units (thus reducing the need for continuous network connectivity) and reassigning work units when a given node fails to report its results as expected.
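
One way to picture the reassignment of overdue work units is the small Python sketch below. The WorkQueue class, its method names and the explicit `now` argument are assumptions made for illustration; the time is passed in by the caller so that the behaviour is easy to follow and test.

```python
class WorkQueue:
    """Hands out work units and reissues those not reported in time."""

    def __init__(self, units, timeout):
        self.pending = list(units)   # units not currently assigned
        self.in_flight = {}          # unit -> (node, deadline)
        self.done = {}               # unit -> result
        self.timeout = timeout

    def _expire(self, now):
        """Move every overdue unit back to the pending list."""
        for unit, (_node, deadline) in list(self.in_flight.items()):
            if now >= deadline:
                del self.in_flight[unit]
                self.pending.append(unit)

    def request_work(self, node, now):
        """Give `node` the next unit, or None if nothing is available."""
        self._expire(now)
        if not self.pending:
            return None
        unit = self.pending.pop(0)
        self.in_flight[unit] = (node, now + self.timeout)
        return unit

    def report(self, node, unit, result):
        """Record a result; reject it if the unit is no longer assigned to `node`."""
        owner = self.in_flight.get(unit)
        if owner is None or owner[0] != node:
            return False
        del self.in_flight[unit]
        self.done[unit] = result
        return True


if __name__ == "__main__":
    q = WorkQueue(["a", "b"], timeout=10)
    q.request_work("n1", now=0)         # n1 takes "a" and then disappears
    print(q.request_work("n2", now=10)) # "b" is handed out next
    print(q.request_work("n3", now=11)) # "a" has expired and is reissued
```

Larger work units allow a longer timeout, which matches the trade-off noted above: fewer network round trips, at the cost of more lost work when a node drops out.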

The impacts of trust and availability on performance and development difficulty can influence the choice of whether to deploy onto a dedicated computer cluster, to idle machines internal to the developing organization, or to an open external network of volunteers or contractors.

In many cases, the participating nodes must trust the central system not to abuse the access that is being granted, by interfering with the operation of other programs, mangling stored information, transmitting private data, or creating new security holes. Other systems employ measures to reduce the amount of trust "client" nodes must place in the central system such as placing applications in virtual machines.

Public systems or those crossing administrative domains (including different departments in the same organization) often result in the need to run on heterogeneous systems, using different operating systems and hardware architectures. With many languages, there is a tradeoff between investment in software development and the number of platforms that can be supported (and thus the size of the resulting network). Cross-platform languages can reduce the need to make this tradeoff, though potentially at the expense of high performance on any given node (due to run-time interpretation or lack of optimization for the particular platform).

Various middleware projects have created generic infrastructure, to allow diverse scientific and commercial projects to harness a particular associated grid, or for the purpose of setting up new grids. BOINC is a common one for academic projects seeking public volunteers; more are listed at the end of the article.

In fact, the middleware can be seen as a layer between the hardware and the software. On top of the middleware, a number of technical areas have to be considered, and these may or may not be middleware independent. Example areas include service level agreement (SLA) management, trust and security, virtual organization (VO) management, license management, portals, and data management. These technical areas may be taken care of in a commercial solution, though the cutting edge of each area is often found within specific research projects examining the field.

Market Segmentation of the Grid computing market

For the segmentation of the Grid computing market, two perspectives need to be considered: the provider side and the user side.

The Provider Side

The overall Grid market comprises several specific markets. These are the Grid middleware market, the market for Grid-enabled applications, the utility computing market, and the Software as a Service (SaaS) market.

Grid middleware is a specific software product that enables the sharing of heterogeneous resources and supports virtual organizations. It is installed and integrated into the existing infrastructure of the involved company or companies, and provides a special layer placed between the heterogeneous infrastructure and the specific user applications. Major Grid middlewares are Globus Toolkit, gLite, and UNICORE.

Utility computing refers to the provision of Grid computing and applications as a service, either as an open grid utility or as a hosting solution for one organization or a virtual organization. Major players in the utility computing market are Sun Microsystems, IBM, and HP.

Grid-enabled applications are specific software applications that can utilize Grid infrastructure. This is made possible by the use of Grid middleware, as pointed out above.

Software as a Service (SaaS) is “software that is owned, delivered and managed remotely by one or more providers” (Gartner 2007). Additionally, SaaS applications are based on a single set of common code and data definitions. They are consumed in a one-to-many model, and SaaS uses a Pay As You Go (PAYG) model or a subscription model that is based on usage. Providers of SaaS do not necessarily own the computing resources required to run their SaaS. They may therefore draw upon the utility computing market, which provides computing resources for SaaS providers.

The User Side

For companies on the demand or user side of the Grid computing market, the different segments have significant implications for their IT deployment strategy. The IT deployment strategy as well as the type of IT investments made are relevant aspects for potential Grid users and play an important role for Grid adoption.

CPU scavenging


CPU-scavenging, cycle-scavenging, cycle stealing, or shared computing creates a "grid" from the unused resources in a network of participants (whether worldwide or internal to an organization). Typically this technique uses desktop computer instruction cycles that would otherwise be wasted at night, during lunch, or even in the scattered seconds throughout the day when the computer is waiting for user input or slow devices.

Volunteer computing projects use the CPU scavenging model almost exclusively.

In practice, participating computers also donate some supporting amount of disk storage space, RAM, and network bandwidth, in addition to raw CPU power. Since nodes are likely to go "offline" from time to time, as their owners use their resources for their primary purpose, this model must be designed to handle such contingencies.
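
One common way to cope with nodes going offline is to checkpoint progress so that an interrupted work unit can resume later. The Python sketch below illustrates the idea under stated assumptions: the function names, the JSON checkpoint layout and the is_idle callback (which would report whether the owner is currently using the machine) are all hypothetical, and the "work" is a trivial sum of squares.

```python
import json
import os


def load_checkpoint(path):
    """Return saved progress, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next": 0, "total": 0}


def save_checkpoint(path, state):
    """Write progress to a temporary file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def scavenge(path, n_items, is_idle):
    """Sum the squares of 0..n_items-1, working only while is_idle() is true.

    Returns the total when the unit is finished, or None if the owner
    reclaimed the machine first; progress so far is kept in `path`.
    """
    state = load_checkpoint(path)
    while state["next"] < n_items:
        if not is_idle():
            return None  # yield to the owner; resume on the next call
        i = state["next"]
        state["total"] += i * i
        state["next"] = i + 1
        save_checkpoint(path, state)
    return state["total"]
```

Saving after every step keeps the sketch simple; a real client would more likely checkpoint at intervals to limit disk activity on the donor's machine.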

Taxation Issues in Grid Computing


The BEinGRID project has studied the legal issues involved in Grid computing. In particular, the issue of tax is crucial:

The first question to answer is why taxation issues are likely to be relevant in a Grid environment. This is a consequence of the distributed nature of the service provided: potentially taxable income may be generated from the combined use of servers that are located in various tax domains. The problems that tax consultants and managers have to face are thus in many cases cumbersome and novel. It is therefore necessary to provide answers to the following questions:

Firstly, which VAT rules are applicable to a European ICT company that is willing to provide e-services to businesses or individuals located in the same country, in another European country or outside the EU? In particular, what services can be considered as e-services (electronically supplied services)?

Secondly, as regards international income taxation, how should a single server, node, etc. of a Grid infrastructure be considered? Is it a permanent establishment (hereinafter, PE) of the company? These questions and the corresponding solutions are likely to have a great impact on the concrete business of ICT undertakings, and taxation is one of the most important drivers when drafting business plans. As regards the former question, the solutions are based on the applicable EC law sources, namely Directive 2006/112/EC, including the amendments introduced by Directive 2008/8/EC, while as regards the international aspects of treating a server (and, in more general terms, Grid components) as a PE, reference is made to the Model Tax Convention and its Commentaries drafted by the Organisation for Economic Co-operation and Development (OECD).

History


The term grid computing originated in the early 1990s as a metaphor for making computer power as easy to access as an electric power grid in Ian Foster and Carl Kesselman's seminal work, "The Grid: Blueprint for a new computing infrastructure."

CPU scavenging and volunteer computing were popularized beginning in 1997 by distributed.net and later in 1999 by SETI@home to harness the power of networked PCs worldwide, in order to solve CPU-intensive research problems.

The ideas of the grid (including those from distributed computing, object-oriented programming, and Web services) were brought together by Ian Foster, Carl Kesselman, and Steve Tuecke, widely regarded as the "fathers of the grid." They led the effort to create the Globus Toolkit, incorporating not just computation management but also storage management, security provisioning, data movement, monitoring, and a toolkit for developing additional services based on the same infrastructure, including agreement negotiation, notification mechanisms, trigger services, and information aggregation. While the Globus Toolkit remains the de facto standard for building grid solutions, a number of other tools have been built that answer some subset of services needed to create an enterprise or global grid.

In 2007 the term cloud computing came into popularity. The concept is similar to the canonical Foster definition of grid computing (in terms of computing resources being consumed as electricity is from the power grid). Indeed, grid computing is often (but not always) associated with the delivery of cloud computing systems, as exemplified by the AppLogic system from 3tera.

Fastest virtual supercomputers


  • BOINC -- 1.3 PFLOPS as of February 9, 2009.
  • Folding@home -- 4.7 PFLOPS as of February 8, 2009.


Current projects and applications


Grids offer a way to solve Grand Challenge problems such as protein folding, financial modeling, earthquake simulation, and climate/weather modeling. Grids offer a way of using the information technology resources optimally inside an organization. They also provide a means for offering information technology as a utility for commercial and noncommercial clients, with those clients paying only for what they use, as with electricity or water.

Grid computing is being applied by the National Science Foundation's National Technology Grid, NASA's Information Power Grid, Pratt & Whitney, Bristol-Myers Squibb Co., and American Express.

One of the most famous cycle-scavenging networks is SETI@home, which was using more than 3 million computers to achieve 23.37 sustained teraflops (979 lifetime teraflops).

As of March 2008, Folding@home had achieved peaks of 1,502 teraflops on over 270,000 machines.

The European Union has been a major proponent of Grid computing. Many projects have been funded through the framework programme of the European Commission. Many of the projects are highlighted below, but two deserve special mention: BEinGRID and Enabling Grids for E-sciencE.

BEinGRID (Business Experiments in Grid) is a research project partly funded by the European Commission as an Integrated Project under the Sixth Framework Programme (FP6) sponsorship program. Started on June 1, 2006, the project will run 42 months, until November 2009. The project is coordinated by Atos Origin. According to the project factsheet, its mission is "to establish effective routes to foster the adoption of Grid Computing across the EU and to stimulate research into innovative business models using Grid technologies." To extract best practice and common themes from the experimental implementations, two groups of consultants are analysing a series of pilots, one technical, one business. The results of these cross analyses are provided on the project website. The project is significant not only for its long duration, but also for its budget, which at 24.8 million euros is the largest of any FP6 integrated project. Of this, 15.7 million is provided by the European Commission and the remainder by its 98 contributing partner companies.

The Enabling Grids for E-sciencE project, which is based in the European Union and includes sites in Asia and the United States, is a follow-up project to the European DataGrid (EDG) and is arguably the largest computing grid on the planet. This, along with the LHC Computing Grid (LCG), has been developed to support the experiments using the CERN Large Hadron Collider. The LCG project is driven by CERN's need to handle huge amounts of data, where storage rates of several gigabytes per second (10 petabytes per year) are required. A list of active sites participating within LCG can be found online, as can real-time monitoring of the EGEE infrastructure. The relevant software and documentation are also publicly accessible.
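
The two figures quoted above (a rate in gigabytes per second and a yearly volume in petabytes) can be related with some simple arithmetic. The sketch below is only a back-of-the-envelope check, assuming decimal units and a 365-day year; the function names are made up for illustration. Under those assumptions, 10 petabytes per year averages out to roughly a third of a gigabyte per second, so a rate of several gigabytes per second would most plausibly describe peak rather than year-round operation.

```python
# Back-of-the-envelope conversion between a yearly data volume and a
# sustained data rate.
# Assumptions (not from the article): decimal units (1 PB = 10**15 bytes,
# 1 GB = 10**9 bytes) and a 365-day year.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
BYTES_PER_PB = 10**15
BYTES_PER_GB = 10**9


def average_rate_gb_per_s(petabytes_per_year: float) -> float:
    """Average rate (GB/s) needed to accumulate the given volume in one year."""
    return petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / BYTES_PER_GB


def yearly_volume_pb(gb_per_s: float) -> float:
    """Volume (PB) accumulated in one year at a constant rate in GB/s."""
    return gb_per_s * BYTES_PER_GB * SECONDS_PER_YEAR / BYTES_PER_PB


if __name__ == "__main__":
    print(f"10 PB/year is about {average_rate_gb_per_s(10):.2f} GB/s on average")
    print(f"1 GB/s sustained is about {yearly_volume_pb(1):.1f} PB/year")
```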

Another well-known project is distributed.net, which was started in 1997 and has run a number of successful projects in its history.

The NASA Advanced Supercomputing facility (NAS) has run genetic algorithms using the Condor cycle scavenger running on about 350 Sun and SGI workstations.
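
The idea behind cycle scavenging can be illustrated with a small simulation. The sketch below is not Condor's API or implementation; `ScavengingWorker`, `run_simulation` and the idle pattern are hypothetical names and data invented for this example. It only shows the basic rule: queued work runs while the host machine is idle and waits whenever the owner is using it.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence


class ScavengingWorker:
    """Runs queued tasks only while the host machine reports itself idle.

    Hypothetical illustration of cycle scavenging, not a real scheduler.
    """

    def __init__(self, is_idle: Callable[[], bool]) -> None:
        self.is_idle = is_idle
        self.queue: Deque[Callable[[], int]] = deque()
        self.results: List[int] = []

    def submit(self, task: Callable[[], int]) -> None:
        self.queue.append(task)

    def tick(self) -> bool:
        """Perform one scheduling step; return True if a task was run."""
        if not self.queue or not self.is_idle():
            return False  # nothing to do, or the owner is using the machine
        task = self.queue.popleft()
        self.results.append(task())
        return True


def run_simulation(idle_pattern: Sequence[bool],
                   tasks: Sequence[Callable[[], int]]) -> List[int]:
    """Replay a recorded idle/busy pattern, one entry per scheduling step."""
    pattern = iter(idle_pattern)
    worker = ScavengingWorker(is_idle=lambda: next(pattern))
    for task in tasks:
        worker.submit(task)
    for _ in idle_pattern:
        worker.tick()
    return worker.results


if __name__ == "__main__":
    squares = [lambda n=n: n * n for n in range(1, 5)]
    # The machine is busy at steps 0 and 3, so only three tasks complete.
    print(run_simulation([False, True, True, False, True], squares))
```

A real scavenger would also need to suspend or migrate a task when the owner returns in the middle of a run; this sketch sidesteps that by treating each task as a single indivisible step.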

Until April 27, 2007, United Devices operated the United Devices Cancer Research Project based on its Grid MP product, which cycle-scavenges on volunteer PCs connected to the Internet. The Grid MP ran on about 3.1 million machines.

Another well-known project is the World Community Grid. Its mission is to create the largest public computing grid that benefits humanity. This work is built on the belief that technological innovation combined with visionary scientific research and large-scale volunteerism can change our world for the better. IBM Corporation has donated the hardware, software, technical services, and expertise to build the infrastructure for World Community Grid and provides free hosting, maintenance, and support.

Definitions

Today there are many definitions of Grid computing:
  • In his article "What is the Grid? A Three Point Checklist", Ian Foster lists these primary attributes:
    • Computing resources are not administered centrally.
    • Open standards are used.
    • Nontrivial quality of service is achieved.
  • Plaszczak/Wellner define grid technology as "the technology that enables resource virtualization, on-demand provisioning, and service (resource) sharing between organizations."
  • IBM defines grid computing as "the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet. A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across 'multiple' administrative domains based on their (resources) availability, capacity, performance, cost and users' quality-of-service requirements".
  • An earlier example of the notion of computing as a utility came in 1965 from MIT's Fernando Corbató. Corbató and the other designers of the Multics operating system envisioned a computer facility operating "like a power company or water company". http://www.multicians.org/fjcc3.html
  • Buyya/Venugopal define grid as "a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements".
  • CERN, one of the largest users of grid technology, talks of The Grid: "a service for sharing computer power and data storage capacity over the Internet."

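Several of the definitions above describe selecting resources across administrative domains according to availability, capability, cost and the user's requirements. A minimal sketch of that selection step might look like the following. Everything in it is an assumption made for illustration: the `Resource` and `JobRequirements` types, the use of core count as a stand-in for capability, and the cheapest-first policy are not taken from any of the quoted sources or from real grid middleware.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    """A hypothetical grid resource owned by some administrative domain."""
    name: str
    domain: str           # administrative domain that owns the resource
    available: bool
    cpu_cores: int        # crude stand-in for "capability"
    cost_per_hour: float


@dataclass(frozen=True)
class JobRequirements:
    """Hypothetical user requirements for a single job."""
    min_cores: int
    max_cost_per_hour: float


def select_resource(resources: List[Resource],
                    req: JobRequirements) -> Optional[Resource]:
    """Pick the cheapest available resource meeting the requirements.

    Ties on cost are broken in favour of the resource with more cores.
    Returns None when no resource qualifies.
    """
    candidates = [
        r for r in resources
        if r.available
        and r.cpu_cores >= req.min_cores
        and r.cost_per_hour <= req.max_cost_per_hour
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.cost_per_hour, -r.cpu_cores))


if __name__ == "__main__":
    pool = [
        Resource("lab-cluster", "university-a", True, 64, 2.0),
        Resource("office-desktops", "company-b", True, 8, 0.5),
        Resource("hpc-center", "institute-c", False, 512, 1.0),
        Resource("render-nodes", "company-b", True, 32, 1.5),
    ]
    chosen = select_resource(pool, JobRequirements(min_cores=16, max_cost_per_hour=3.0))
    print(chosen.name if chosen else "no suitable resource")
```

In practice the selection would be repeated at runtime as availability and prices change, which is the "dynamically at runtime" part of the Buyya/Venugopal definition; the sketch evaluates a single snapshot only.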

Grids can be categorized with a three-stage model of departmental grids, enterprise grids and global grids. These correspond to a firm initially utilising resources within a single group, e.g. an engineering department connecting desktop machines, clusters and equipment. This progresses to enterprise grids, where nontechnical staff's computing resources can be used for cycle-stealing and storage. A global grid is a connection of enterprise and departmental grids that can be used in a commercial or collaborative manner.

See also


Concepts and related technology

  • Cloud Computing
  • Data Grid
  • Computer cluster
  • Computon
  • Distributed computing
  • Edge computing
  • Grid FileSystem
  • High-performance computing
  • List of distributed computing projects
  • Metacomputing
  • Network Agility
  • Render farm
  • Semantic grid
  • Space based architecture (SBA)
  • Tuple Space
  • Supercomputer
  • Wireless


Alliances and organizations

  • Open Grid Forum (Formerly Global Grid Forum)
  • Object Management Group


Production grids


  • Enabling Grids for E-sciencE
  • NorduGrid
  • Open Science Grid
  • OurGrid
  • Sun Grid
  • Xgrid


International Grid Projects

Name | Region | Start | End
Open Middleware Infrastructure Institute Europe (OMII-Europe) | Europe | May 2006 | May 2008
Enabling Grids for E-sciencE (EGEE) | Europe | March 2004 | March 2006
Enabling Grids for E-sciencE II (EGEE II) | Europe | April 2006 | April 2008
D4Science (DIstributed colLaboratories Infrastructure on Grid ENabled Technology 4 Science) | Europe and Asia and the Pacific | January 2008 | December 2009
E-science grid facility for Europe and Latin America (EELA-2) | Europe and Latin America | April 2008 | March 2010
E-Infrastructure shared between Europe and Latin America (EELA) | Europe and Latin America | January 2006 | December 2008
Business Experiments in GRID (BEinGRID) | Europe | June 2006 | November 2009
BREIN | Europe | September 2006 | January 2010
KnowARC | Europe | June 2006 | August 2009
Nordic Data Grid Facility | Scandinavia and Finland | June 2006 | December 2010
DataTAG | Europe and North America | January 2001 | January 2003
European DataGrid (EDG) | Europe | March 2001 | March 2004
BalticGrid/BalticGrid II | Europe (Baltic States) | November 2005 | April 2010
EUFORIA (EU Fusion fOR Iter Applications) | Europe | January 2008 | December 2010
World Community Grid | Global | November 2004 | unknown
XtreemOS | Europe | June 2006 | June 2010
GridEcon | Europe | June 2006 | April 2009


National Grid Projects


  • D-Grid (German)
  • GARUDA (Indian)
  • National Grid Service (UK)
  • Open Science Grid (USA)
  • VECC (Calcutta, India)
  • INFN Grid (Italian)


Standards and APIs

  • A Simple API for Grid Applications (SAGA)
  • Distributed Resource Management Application API (DRMAA)
  • Grid Security Infrastructure (GSI)
  • Open Grid Services Architecture (OGSA)
  • Open Grid Services Infrastructure (OGSI)
  • Web Services Resource Framework (WSRF)


Software implementations and middleware


  • Advanced Resource Connector (NorduGrid's ARC)
  • Berkeley Open Infrastructure for Network Computing (BOINC)
  • Globus Toolkit
  • Platform LSF
  • Message Passing Interface (MPI)
  • OurGrid
  • Simple Grid Protocol
  • Sun Grid Engine
  • ProActive
  • UNICORE
  • SDSC Storage resource broker (data grid)
  • GridWay

