FLOPS



 
 




Computer performance
Name        FLOPS
megaflop    10^6
gigaflop    10^9
teraflop    10^12
petaflop    10^15
exaflop     10^18
zettaflop   10^21
yottaflop   10^24


In computing, FLOPS (or flops or flop/s) is an acronym meaning FLoating point Operations Per Second. FLOPS is a measure of a computer's performance, especially in fields of scientific calculation that make heavy use of floating point calculations, similar to instructions per second. Since the final S stands for "second", conservative speakers consider "FLOPS" as both the singular and plural of the term, although the singular "FLOP" is frequently encountered. Alternatively, the singular FLOP (or flop) is used as an abbreviation for "FLoating-point OPeration", and a flop count is a count of these operations (e.g., required by a given algorithm or computer program). In this context, "flops" is simply the plural rather than a rate.
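The difference between a flop count and a FLOPS rate can be sketched in a few lines of Python. The function names are illustrative only, the count assumes a textbook dot product (n multiplications and n - 1 additions), and the 0.5-second run time is an invented figure.

```python
def dot_product_flop_count(n: int) -> int:
    """Flop count of a length-n dot product: n multiplies plus n - 1 adds."""
    return 2 * n - 1 if n > 0 else 0


def flops_rate(flop_count: int, seconds: float) -> float:
    """FLOPS as a rate: operations performed divided by elapsed seconds."""
    return flop_count / seconds


# A count of operations (no time involved).
count = dot_product_flop_count(1_000_000)

# A rate, under the assumption that the run took 0.5 seconds.
rate = flops_rate(count, 0.5)
print(count, rate)
```

The first value is a pure count ("flops" as a plural); only the second is a FLOPS figure in the per-second sense.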

NEC's SX-9 supercomputer was the world's first vector processor to exceed 100 gigaFLOPS per single core. IBM's supercomputer dubbed Blue Gene/P is designed to eventually operate at three petaFLOPS. However, the IBM Roadrunner is the first supercomputer to sustain one petaFLOPS.

A basic calculator performs relatively few FLOPS. Each calculation request to a typical calculator requires only a single operation, so there is rarely any need for it to respond faster than the operator requires. A response time below 0.1 second in a calculation context is usually perceived as instantaneous by a human operator, so a simple calculator with multiply and divide needs only about 10 FLOPS.
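The 10 FLOPS figure follows from one operation per request and a 0.1-second response budget. A minimal Python sketch of that arithmetic, with constant names chosen here for illustration:

```python
# Figures taken from the paragraph above; the names are illustrative.
PERCEIVED_INSTANT_SECONDS = 0.1   # response time felt as instantaneous
OPERATIONS_PER_REQUEST = 1        # one floating-point operation per key press

# Required rate = operations per request / time allowed per request.
required_flops = OPERATIONS_PER_REQUEST / PERCEIVED_INSTANT_SECONDS
print(required_flops)
```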

Measuring performance

In order for FLOPS to be useful as a measure of floating-point performance, a standard benchmark must be available on all computers of interest. One example is the LINPACK benchmark.
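As a rough illustration of what a benchmark does, the Python sketch below times a fixed number of floating-point operations and divides the count by the elapsed time. This is not LINPACK and is not comparable to it; the function name and loop body are assumptions made for the example, and in an interpreted language the result mostly reflects interpreter overhead rather than hardware capability.

```python
import time


def measure_flops(iterations: int = 200_000) -> float:
    """Time a loop of multiply-add pairs and return operations per second."""
    x = 1.0
    start = time.perf_counter()
    for _ in range(iterations):
        x = x * 1.0000001 + 0.0000001   # one multiply and one add
    elapsed = time.perf_counter() - start
    return (2 * iterations) / elapsed   # two flops per iteration


print(f"{measure_flops():.3e} FLOPS (interpreter-bound estimate)")
```

Running the same fixed workload on every machine of interest is what makes the resulting numbers comparable, which is the role a standard benchmark plays.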

There are many factors in computer performance other than raw floating-point computation speed, such as I/O performance, interprocessor communication, cache coherence, and the memory hierarchy. This means that supercomputers are in general only capable of a small fraction of their "theoretical peak" FLOPS throughput (obtained by adding together the theoretical peak FLOPS performance of every element of the system). Even when operating on large, highly parallel problems, their performance will be bursty, mostly due to the residual effects of Amdahl's law. Real benchmarks therefore measure both peak actual FLOPS performance and sustained FLOPS performance.
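The gap between theoretical peak and achievable performance can be sketched in Python using the standard form of Amdahl's law, speedup = 1 / ((1 - p) + p / N). The element count, per-element peak, and 95% parallel fraction below are invented inputs, not figures for any real machine.

```python
def theoretical_peak(element_peaks):
    """Theoretical peak: the sum of the peak FLOPS of every element."""
    return sum(element_peaks)


def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Amdahl's law: overall speedup when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Hypothetical system: 1000 elements, each with a 1 GFLOPS peak.
peak = theoretical_peak([1e9] * 1000)

# If 95% of a job parallelises, 1000 processors give under a 20x speedup.
speedup = amdahl_speedup(0.95, 1000)
print(peak, speedup)
```

Under these assumptions the summed peak is 1000 times a single element, while the achievable speedup stays below 20, which is why sustained figures fall so far short of the theoretical peak.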

For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Measuring floating point operation speed, therefore, does not accurately predict how the processor will perform on just any problem. However, for many scientific jobs such as analysis of data, a FLOPS rating is effective.

Historically, the earliest reliably documented serious use of the floating point operation as a metric appears to be the AEC's justification to Congress for purchasing a Control Data CDC 6600 in the mid-1960s.

The terminology is currently so confusing that until April 24, 2006, U.S. export control was based upon measurement of "Composite Theoretical Performance" (CTP) in millions of "Theoretical Operations Per Second" or MTOPS. On that date, however, the U.S. Department of Commerce's Bureau of Industry and Security amended the Export Administration Regulations to base controls on Adjusted Peak Performance (APP) in Weighted TeraFLOPS (WT).

Records

In November 2008, the latest upgrade to the Cray XT Jaguar supercomputer at the Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) increased the system's computing power to a peak 1.64 “petaflops,” or quadrillion mathematical calculations per second, making Jaguar the world’s first petaflop system dedicated to open research.

In June 2008, AMD released the ATI Radeon HD4800 series, which are reported to be the first GPUs to reach the one-teraFLOPS scale. On August 12, 2008, AMD released a graphics card with two Radeon R770 GPUs totalling 2.4 teraFLOPS.

On May 25, 2008, an American military supercomputer built by IBM, named 'Roadrunner', reached the computing milestone of one petaflop by processing more than 1.026 quadrillion calculations per second. It headed the June 2008 and November 2008 TOP500 lists of the most powerful supercomputers (excluding grid computers). The computer's name, Roadrunner, refers to the state bird of New Mexico.

On February 4, 2008, the NSF and the University of Texas opened full scale research runs on an AMD, Sun supercomputer, the most powerful supercomputing system in the world for open science research, which operates at a sustained speed of half a petaflop.

On October 25, 2007, NEC Corporation of Japan issued a press release announcing its SX series model SX-9, claiming it to be the world's fastest vector supercomputer with a peak processing performance of 839 teraFLOPS. The SX-9 features the first CPU capable of a peak vector performance of 102.4 gigaFLOPS per single core.

On June 26, 2007, IBM announced the second generation of its top supercomputer, dubbed Blue Gene/P and designed to continuously operate at speeds exceeding one petaFLOPS. When configured to do so, it can reach speeds in excess of three petaFLOPS. In June 2007, Top500.org reported the fastest computer in the world to be the IBM Blue Gene/L supercomputer, measuring a peak of 596 TFLOPS. The Cray XT4 hit second place with 101.7 TFLOPS.

In June 2006, a new computer, the MDGRAPE-3, was announced by the Japanese research institute RIKEN. The computer's performance tops out at one petaFLOPS, almost two times faster than the Blue Gene/L, but MDGRAPE-3 is not a general purpose computer, which is why it does not appear in the Top500.org list. It has special-purpose pipelines for simulating molecular dynamics.

Distributed computing uses the Internet to link personal computers to achieve a similar effect:
  • Folding@home, as of February 2009, is sustaining over 4.9 PFLOPS, the first computing project of any kind to cross the four petaFLOPS milestone. This level of performance is primarily enabled by the cumulative effort of a vast array of PlayStation 3 and powerful GPU units.
  • BOINC as a whole averages over 1.1 PFLOPS as of August 4, 2008.
  • SETI@home averages more than 528 TFLOPS.
  • Einstein@Home is crunching more than 150 TFLOPS.
  • GIMPS is sustaining 27 TFLOPS.

Intel Corporation has recently unveiled the experimental multi-core POLARIS chip, which achieves 1 TFLOPS at 3.2 GHz. The 80-core chip can increase this to 1.8 TFLOPS at 5.6 GHz, although the thermal dissipation at this frequency exceeds 260 watts.

As of 2008, the fastest PC processors (quad-core) perform over 37 GFLOPS (Intel QX9775). GPUs are considerably more powerful; for example, in the GeForce 8 Series, the nVidia 8800 Ultra performs around 576 GFLOPS on 128 processing elements. The 8800 series performs only single precision calculations, and while GPUs are highly efficient at calculations, they are not as flexible as a general purpose CPU. There are now graphics cards, such as the ATi Radeon HD 4870X2, which can run at over 2.4 teraFLOPS.

Future developments

In May 2008 a collaboration was announced between NASA, SGI and Intel to build a 1 petaflop computer in 2009, scaling up to 10 PFLOPS by 2012.

Given the current speed of progress, supercomputers are projected to reach 1 exaflop in 2019. Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaflop computer is required to accomplish full weather modeling, which could cover a two week time span accurately. Such systems might be built around 2030.

Cost of computing


Hardware costs
The following list of example computers shows how drastically performance has increased and price has decreased. The "cost per GFLOPS" is the cost for a set of hardware that would theoretically operate at one GFLOPS. During the era when no single computation platform was able to achieve one GFLOPS, this table lists the total cost for multiple instances of a fast computation platform whose speed sums to one GFLOPS. Otherwise, the least expensive computing platform able to achieve one GFLOPS is listed.

Date | Approximate cost per GFLOPS | Technology | Comments
1961 | US$1,100,000,000,000 ($1.1 trillion), or US$1,100 per FLOPS | About 17 million IBM 1620 units costing $64,000 each | The 1620's multiplication operation takes 17.7 ms.
1984 | US$15,000,000 | Cray X-MP |
1997 | US$30,000 | Two 16-processor Beowulf clusters with Pentium Pro microprocessors |
2000, April | $1,000 | Bunyip | Developed at Australian National University; the first sub-US$1/MFLOPS computing technology. It won the Gordon Bell Prize in 2000.
2000, May | $640 | KLAT2 | Developed at the University of Kentucky.
2003, August | $82 | KASY0 | Also developed at the University of Kentucky.
2007, March | $0.42 | Ambric AM2045 |
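The 1961 row can be reproduced approximately from the two figures it quotes. The Python sketch below assumes one floating-point operation per 17.7 ms multiplication, which is a simplification made here for illustration; the variable names are likewise invented.

```python
# Figures quoted in the 1961 row of the table.
SECONDS_PER_MULTIPLY = 17.7e-3   # one IBM 1620 multiplication
UNIT_COST_USD = 64_000           # price of one IBM 1620

# Assumption: each unit sustains exactly one operation per multiply time.
flops_per_unit = 1 / SECONDS_PER_MULTIPLY

# Units whose combined speed sums to one GFLOPS (1e9 FLOPS).
units_needed = 1e9 / flops_per_unit

total_cost_usd = units_needed * UNIT_COST_USD
print(f"{units_needed:.3g} units, about ${total_cost_usd:.3g}")
```

The result lands near 17.7 million units and roughly $1.1 trillion, consistent with the rounded values in the table.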
 


The trend toward ever higher numbers of transistors that can be placed inexpensively on an integrated circuit follows Moore's law. This trend explains the increasing speed and decreasing cost of computer processing.

Operation costs

In energy cost, according to the Green500 list, as of November 2008 the most efficient TOP500 supercomputer runs at 536.24 MFLOPS per watt. This translates to an energy requirement of 1.86 watts per GFLOPS; however, this energy requirement will be much greater for less efficient supercomputers.
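The conversion from MFLOPS per watt to watts per GFLOPS is a reciprocal with a factor of 1000, as this short Python sketch shows (the variable names are illustrative):

```python
mflops_per_watt = 536.24          # Green500 figure quoted above

# 1 GFLOPS = 1000 MFLOPS, so watts per GFLOPS = 1000 / (MFLOPS per watt).
watts_per_gflops = 1000 / mflops_per_watt
print(round(watts_per_gflops, 2))
```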

Hardware costs for low cost supercomputers may be less significant than energy costs when running continuously for several years. A PlayStation 3 (PS3) 40 GiB (65 nm Cell) costs $399 and consumes 135 watts, or $118 of electricity each year if operated 24 hours per day, conservatively assuming U.S. national average residential electric rates of $0.10/kWh (0.135 kW × 24 h × 365 d × 0.10 $/kWh = $118.26). The operating cost of electricity for 3.5 years ($413) is more than the cost of the PS3. However, "extreme gamers" only spend about 45 hours per week gaming, so in an "extreme" case, only 317 kWh are consumed annually at a cost of $31.68. Therefore a more realistic "extreme gamer" would require more than 12.5 years for total operating costs to exceed the original purchase price.
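The electricity figures above can be checked with a short Python sketch. The constants come from the paragraph; the function name and the choice of 365/7 weeks per year are assumptions made for this example.

```python
POWER_KW = 0.135          # PS3 power draw in kilowatts
RATE_USD_PER_KWH = 0.10   # assumed average residential rate
PURCHASE_PRICE_USD = 399.0


def annual_cost(hours_per_year: float) -> float:
    """Yearly electricity cost in dollars for the given hours of use."""
    return POWER_KW * hours_per_year * RATE_USD_PER_KWH


always_on = annual_cost(24 * 365)          # running continuously
extreme_gamer = annual_cost(45 * 365 / 7)  # 45 hours per week

# Years until electricity spending matches the purchase price.
years_to_match_price = PURCHASE_PRICE_USD / extreme_gamer
print(round(always_on, 2), round(extreme_gamer, 2), round(years_to_match_price, 1))
```

Changing the rate or the weekly hours shows how sensitive the break-even point is to usage.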

See also

  • Gordon Bell Prize


External links

  • Linux High Performance Computing and Clustering Portal
  • Windows High Performance Computing and Clustering Portal
  • Linpack, Livermore Loops, Whetstone MFLOPS