FLOPS
Computer performance
| Name | flops |
|---|---|
| megaflop | 10^6 |
| gigaflop | 10^9 |
| teraflop | 10^12 |
| petaflop | 10^15 |
| exaflop | 10^18 |
| zettaflop | 10^21 |
| yottaflop | 10^24 |
In computing, FLOPS (or flops or flop/s) is an acronym meaning FLoating point Operations Per Second. Like instructions per second, FLOPS is a measure of a computer's performance; it is used especially in fields of scientific computing that make heavy use of floating-point calculations. Since the final S stands for "second", conservative speakers consider "FLOPS" as both the singular and plural of the term, although the singular "FLOP" is frequently encountered. Alternatively, the singular FLOP (or flop) is used as an abbreviation for "FLoating-point OPeration", and a flop count is a count of these operations (e.g., required by a given algorithm or computer program). In this context, "flops" is simply the plural rather than a rate.
NEC's SX-9 supercomputer was the world's first vector processor to exceed 100 gigaFLOPS per single core. IBM's supercomputer dubbed Blue Gene/P is designed to eventually operate at three petaFLOPS. However, the IBM Roadrunner is the first supercomputer to sustain one petaFLOPS.
A basic calculator performs relatively few FLOPS. Each calculation request to a typical calculator requires only a single operation, so there is rarely any need for it to respond faster than the operator can perceive. A response time below 0.1 second in a calculation context is usually perceived as instantaneous by a human operator, so a simple calculator with multiply and divide needs only about 10 FLOPS.
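The calculator estimate is one operation divided by the acceptable response time. A minimal sketch of that arithmetic follows; the function name `required_flops` is an illustrative assumption, not an established term.

```python
def required_flops(operations: float, response_time_s: float) -> float:
    """Rate needed to finish `operations` floating-point operations
    within `response_time_s` seconds."""
    return operations / response_time_s


# One operation per key press, answered within 0.1 second.
calculator_rate = required_flops(1, 0.1)
print(f"A basic calculator needs about {calculator_rate:.0f} FLOPS")
```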
Measuring performance
In order for FLOPS to be useful as a measure of floating-point performance, a standard benchmark must be available on all computers of interest. One example is the LINPACK benchmark.
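As a rough illustration of what such a benchmark does, the sketch below times a fixed number of multiply-add steps and divides the operation count by the elapsed time. It is not LINPACK, and the function name `measure_flops` is an assumption for this example. In pure Python the result mostly reflects interpreter overhead rather than the floating-point hardware, so the figure is only useful for comparing runs of the same script.

```python
import time


def measure_flops(n_iterations: int = 1_000_000) -> float:
    """Estimate achieved FLOPS for a simple multiply-add loop."""
    x = 1.0
    a = 1.000000001
    b = 1e-9
    start = time.perf_counter()
    for _ in range(n_iterations):
        x = x * a + b  # one multiply plus one add: 2 floating-point operations
    elapsed = time.perf_counter() - start
    return 2 * n_iterations / elapsed


print(f"{measure_flops() / 1e6:.1f} MFLOPS (interpreter-bound estimate)")
```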
There are many factors in computer performance other than raw floating-point computation speed, such as I/O performance, interprocessor communication, cache coherence, and the memory hierarchy. This means that supercomputers are in general only capable of a small fraction of their "theoretical peak" FLOPS throughput (obtained by adding together the theoretical peak FLOPS performance of every element of the system). Even when operating on large highly parallel problems, their performance will be bursty, mostly due to the residual effects of Amdahl's law. Real benchmarks therefore measure both peak actual FLOPS performance as well as sustained FLOPS performance.
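The "theoretical peak" described above is a plain sum over the system's elements, and the gap between peak and sustained performance can be expressed as a ratio. The sketch below uses made-up element counts and rates purely for illustration; none of the numbers describe a real machine.

```python
def theoretical_peak_flops(elements):
    """Sum per-element peaks. `elements` is a list of
    (count, peak_flops_per_element) pairs."""
    return sum(count * peak for count, peak in elements)


def efficiency(sustained_flops: float, peak_flops: float) -> float:
    """Fraction of the theoretical peak that is actually sustained."""
    return sustained_flops / peak_flops


# Hypothetical system: 1,000 CPUs at 40 GFLOPS each
# and 500 accelerators at 100 GFLOPS each.
hypothetical_elements = [(1000, 40e9), (500, 100e9)]
peak = theoretical_peak_flops(hypothetical_elements)

# Hypothetical sustained benchmark result of 63 TFLOPS.
sustained = 63e12
print(f"peak = {peak / 1e12:.0f} TFLOPS, "
      f"efficiency = {efficiency(sustained, peak):.0%}")
```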
For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Measuring floating point operation speed, therefore, does not predict accurately how the processor will perform on just any problem. However, for many scientific jobs such as analysis of data, a FLOPS rating is effective.
Historically, the earliest reliably documented serious use of the Floating Point Operation as a metric appears to be AEC justification to Congress for purchasing a Control Data CDC 6600 in the mid-1960s.
The terminology has been confusing enough that, until April 24, 2006, U.S. export control was based on measurement of "Composite Theoretical Performance" (CTP) in millions of "Theoretical Operations Per Second", or MTOPS. On that date, the U.S. Department of Commerce's Bureau of Industry and Security amended the Export Administration Regulations to base controls on Adjusted Peak Performance (APP) in Weighted TeraFLOPS (WT).
Records
In November 2008, the latest upgrade to the Cray XT Jaguar supercomputer at the Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) increased the system's computing power to a peak of 1.64 “petaflops,” or quadrillion mathematical calculations per second, making Jaguar the world’s first petaflop system dedicated to open research.
In June 2008, AMD released the ATI Radeon HD4800 series, which are reported to be the first GPUs to reach the one-teraFLOPS scale. On August 12, 2008, AMD released a card with two Radeon R770 GPUs totalling 2.4 teraFLOPS.
On May 25, 2008, an American military supercomputer built by IBM, named 'Roadrunner', reached the computing milestone of one petaflop by processing more than 1.026 quadrillion calculations per second. It headed the June 2008 and November 2008 TOP500 lists of the most powerful supercomputers (excluding grid computers). The computer's name, Roadrunner, refers to the state bird of New Mexico.
On February 4, 2008, the NSF and the University of Texas opened full-scale research runs on an AMD and Sun supercomputer, the most powerful supercomputing system in the world for open science research, which operates at a sustained speed of half a petaflop.
On October 25, 2007, NEC Corporation of Japan issued a press release announcing its SX series model SX-9, claiming it to be the world's fastest vector supercomputer with a peak processing performance of 839 teraFLOPS. The SX-9 features the first CPU capable of a peak vector performance of 102.4 gigaFLOPS per single core.
On June 26, 2007, IBM announced the second generation of its top supercomputer, dubbed Blue Gene/P and designed to continuously operate at speeds exceeding one petaFLOPS. When configured to do so, it can reach speeds in excess of three petaFLOPS.
In June 2007, Top500.org reported the fastest computer in the world to be the IBM Blue Gene/L supercomputer, measuring a peak of 596 TFLOPS. The Cray XT4 hit second place with 101.7 TFLOPS.
In June 2006, a new computer was announced by Japanese research institute RIKEN, the MDGRAPE-3. The computer's performance tops out at one petaFLOPS, almost two times faster than the Blue Gene/L, but MDGRAPE-3 is not a general purpose computer, which is why it does not appear in the Top500.org list. It has special-purpose pipelines for simulating molecular dynamics.
Distributed computing uses the Internet to link personal computers to achieve a similar effect:
- Folding@Home, as of February 2009, is sustaining over 4.9 PFLOPS and was the first computing project of any kind to cross the four-petaFLOPS milestone. This level of performance is primarily enabled by the cumulative effort of a vast array of PlayStation 3 and powerful GPU units.
- The entire BOINC network averages over 1.1 PFLOPS as of August 4, 2008.
- SETI@Home averages more than 528 TFLOPS.
- Einstein@Home is crunching more than 150 TFLOPS.
- GIMPS is sustaining 27 TFLOPS.
Intel Corporation has recently unveiled the experimental multi-core POLARIS chip, which achieves 1 TFLOPS at 3.2 GHz. The 80-core chip can increase this to 1.8 TFLOPS at 5.6 GHz, although the thermal dissipation at this frequency exceeds 260 watts.
As of 2008, the fastest PC processors (quad-core) perform over 37 GFLOPS (Intel QX9775). GPUs are considerably more powerful; for example, in the GeForce 8 Series, the nVidia 8800 Ultra performs around 576 GFLOPS on 128 processing elements. The 8800 series performs only single-precision calculations, and while GPUs are highly efficient at calculations, they are not as flexible as a general-purpose CPU. There are now graphics cards, such as the ATi Radeon HD 4870X2, which can run at over 2.4 teraFLOPS.
Future developments
In May 2008, a collaboration was announced between NASA, SGI and Intel to build a 1 PFLOPS computer in 2009, scaling up to 10 PFLOPS by 2012.
Given the current speed of progress, supercomputers are projected to reach 1 exaFLOPS in 2019. Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaFLOPS computer is required to accomplish full weather modeling, which could cover a two-week time span accurately. Such systems might be built around 2030.
Cost of computing
Hardware costs
The following list of example computers demonstrates how drastically performance has increased and price has decreased. The "cost per GFLOPS" is the cost of a set of hardware that would theoretically operate at one GFLOPS. For the era when no single computation platform was able to achieve one GFLOPS, this table lists the total cost of multiple instances of a fast computation platform whose speeds sum to one GFLOPS. Otherwise, the least expensive computing platform able to achieve one GFLOPS is listed.
| Date | Approximate cost per GFLOPS | Technology | Comments |
|---|---|---|---|
| 1961 | US$1,100,000,000,000 ($1.1 trillion), or US$1,100 per FLOPS | About 17 million IBM 1620 units costing $64,000 each | The 1620's multiplication operation takes 17.7 ms. |
| 1984 | US$15,000,000 | Cray X-MP | |
| 1997 | US$30,000 | Two 16-processor Beowulf clusters with Pentium Pro microprocessors | |
| 2000, April | $1,000 | | Bunyip was developed at Australian National University, and was the first sub-US$1/MFLOPS computing technology. It won the Gordon Bell Prize in 2000. |
| 2000, May | $640 | | KLAT2 was developed at the University of Kentucky. |
| 2003, August | $82 | | KASY0 was also developed at the University of Kentucky. |
| 2007, March | $0.42 | Ambric AM2045 | |
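The 1961 row can be reproduced from the figures in its own cells. The sketch below assumes that each 17.7 ms multiplication counts as one floating-point operation and that speed scales linearly with the number of machines; the function name is illustrative.

```python
def cost_for_one_gflops(seconds_per_op: float, unit_cost_usd: float):
    """Return (units needed, total cost in USD) to reach 1 GFLOPS
    by adding up identical machines."""
    flops_per_unit = 1.0 / seconds_per_op   # about 56.5 FLOPS per IBM 1620
    units_needed = 1e9 / flops_per_unit     # machines whose speeds sum to 1 GFLOPS
    return units_needed, units_needed * unit_cost_usd


units, total_cost = cost_for_one_gflops(17.7e-3, 64_000)
print(f"{units / 1e6:.1f} million units, about ${total_cost / 1e12:.2f} trillion")
```

Under these assumptions the result is roughly 17.7 million units and about $1.13 trillion, in line with the rounded figures in the table.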
The trend toward ever higher numbers of transistors that can be placed inexpensively on an integrated circuit follows Moore's law. This trend explains the increasing speed and decreasing cost of computer processing.
Operation costs
In terms of energy cost, according to the Green500 list, as of November 2008 the most efficient TOP500 supercomputer runs at 536.24 MFLOPS per watt. This translates to an energy requirement of 1.86 watts per GFLOPS; the requirement is much greater for less efficient supercomputers.
Hardware costs for low-cost supercomputers may be less significant than energy costs when running continuously for several years. A PlayStation 3 (PS3) 40 GiB (65 nm Cell) costs $399 and consumes 135 watts, or $118 of electricity each year if operated 24 hours per day, conservatively assuming U.S. national average residential electric rates of $0.10/kWh (0.135 kW × 24 h × 365 d × 0.10 $/kWh = $118.26). The operating cost of electricity for 3.5 years ($413) is more than the cost of the PS3. However, "extreme gamers" only spend about 45 hours per week gaming, so in an "extreme" case, only 317 kWh are consumed annually at a cost of $31.68. A more realistic "extreme gamer" would therefore need more than 12.5 years for total operating costs to exceed the original purchase price.
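The energy figures in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted above (536.24 MFLOPS per watt, 135 W, $0.10/kWh, $399, 45 hours per week); the function names are illustrative, and the weekly figure is converted to a year as 365/7 weeks, which is an assumption about how the original numbers were derived.

```python
def watts_per_gflops(mflops_per_watt: float) -> float:
    """Convert an efficiency in MFLOPS/W to power needed per GFLOPS."""
    return 1000.0 / mflops_per_watt


def annual_energy_cost(power_kw: float, hours_per_year: float,
                       usd_per_kwh: float) -> float:
    """Yearly electricity cost in USD."""
    return power_kw * hours_per_year * usd_per_kwh


green500_best = watts_per_gflops(536.24)                # about 1.86 W per GFLOPS

always_on = annual_energy_cost(0.135, 24 * 365, 0.10)   # about $118.26 per year

gamer_hours = 45 * 365 / 7                              # 45 h/week over one year
gamer = annual_energy_cost(0.135, gamer_hours, 0.10)    # about $31.68 per year
years_to_match_price = 399 / gamer                      # about 12.6 years

print(f"{green500_best:.2f} W/GFLOPS, ${always_on:.2f}/yr always on, "
      f"${gamer:.2f}/yr at 45 h/week, {years_to_match_price:.1f} years")
```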
External links
- Linux High Performance Computing and Clustering Portal
- Windows High Performance Computing and Clustering Portal
- Linpack, Livermore Loops, Whetstone MFLOPS