Cray-3
The Cray-3 was a vector supercomputer intended to be Cray Research's successor to the Cray-2. The system was to be the first major application of gallium arsenide (GaAs) semiconductors in computing. The project was not considered a success, and the parent company in Minneapolis decided to end work on the Cray-3 in favour of their own design, the Cray C90. The Cray-3 project was spun off to the newly formed Cray Computer Corporation, but only one Cray-3 was delivered, and it was never paid for. Seymour Cray moved on to the Cray-4 design, but the company went bankrupt before the project was completed.

Background

Cray generally set himself the goal of producing new machines with ten times the performance of the previous models. Although the machines did not always meet this goal, this was a useful technique in defining the project and clarifying what sort of process improvements would be needed to meet it. Cray had always attacked the problem of increased speed with three simultaneous advances: more functional units to give the system higher parallelism, tighter packaging to decrease signal delays, and faster components to allow for a higher clock speed. Of the three, Cray was normally least aggressive on the last; his designs tended to use only components that were already in widespread use, as opposed to leading-edge designs.

For the Cray-3, he decided to set an even higher performance improvement goal, an increase of 12x over the Cray-2. For the Cray-2 they had introduced a novel 3D-packaging system for its integrated circuits to allow higher densities, and it appeared that there was some room for improvement in this process. But for a 12x performance increase, packaging alone would not be enough. The Cray-2 appeared to be pushing the limits of speed of silicon-based transistors at 4.1 ns (244 MHz), and it didn't appear that anything more than another 2x would be possible. If the goal of 12x was to be met, more radical changes would be needed, and a "high tech" approach would have to be used.

Cray had intended to use gallium arsenide circuitry in the Cray-2, which would not only have offered much higher switching speeds, but would also have used less energy and thus run cooler. At the time the Cray-2 was being designed, the state of GaAs manufacturing simply wasn't up to the task of supplying a supercomputer. By the mid-1980s, things had changed and Cray decided it was the only way forward. Given a lack of investment on the part of large chip makers, Cray decided the only solution was to invest in a GaAs chipmaking startup, GigaBit Logic, and use them as an internal supplier.

Development

Development of the Cray-3 started in 1988, with delivery originally slated for 1991. Development quickly overran this date, while at the same time growth in the supercomputer market was slowing rapidly, from 50% a year in 1980 to 10% in 1988.

During 1989 the company was in the process of developing both the Cray-3 and the C90, two machines of roughly similar power, yet the Cray-3 was compatible with the Cray-2, of which 25 had been sold, while the C90 was compatible with the earlier Cray Y-MP as well as earlier machines. Given these issues, management decided that the Cray-3 should be put on "low priority" development. This was not the first time this had happened, and as in the past, Cray decided to form his own company. The result was Cray Computer Corporation, which Cray had no equity stake in, and worked under contract.

The Cray-3 was due to be delivered in 1991, but development quickly overran this date. Development slowed even more when Lawrence Livermore National Laboratory cancelled its order for the first machine, in favor of the C90. Several executives, including the CEO, left the company. The company then announced they would be looking for a customer that needed a smaller version of the machine, with four to eight processors.

The first (and only) customer system (serial number S5, named Graywolf) was not delivered to NCAR until May 1993. NCAR's model was configured with 4 processors and a 128 MWord (64-bit words, 1 GB) common memory. In production it was learned that the square root code contained a bug, and that one of the four CPUs was not running reliably. Replacements to fix both problems were developed. NCAR had not yet paid for the machine, and CCC folded in March 1995 after burning through about $300 million of financing. NCAR's machine was officially decommissioned the next day. In practice, two of the processors were removed and the machine was used unofficially for some time after that.

Seven system cabinets, or "tanks" (with serial numbers S1 to S7), were built for Cray-3 machines (most for smaller two-CPU machines), but NCAR's was the only one ever delivered. Three of the smaller tanks were used on the Cray-4 project, essentially a Cray-3 with 64 faster CPUs running at 1 ns (1 GHz). Another was used for the Cray-3/SSS project.

The failure of the Cray-3 seems to have little to do with the machine itself, however, and everything to do with the changing political and technical climate. The machine was being designed during the collapse of the Warsaw Pact and the end of the Cold War, which led to a massive downsizing in "large machine" supercomputer purchases. At the same time, the market was increasingly investing in massively parallel designs. Cray was critical of this approach, and was quoted by the Wall Street Journal as saying that MPP systems had not yet proven their supremacy over vector computers, noting the difficulty many users had had programming for large parallel machines. "I don't think they'll ever be universally successful, at least not in my lifetime", a statement that proved true, if for no other reason than his untimely death as the result of a car accident.

Logical design

The Cray-3 system architecture comprised a foreground processing system, up to 16 background processors and up to 2 gigawords (16 GB) of common memory. The foreground system was dedicated to input/output and system management. It included a 32-bit processor and four synchronous data channels for mass storage and network devices, primarily via HiPPI channels.

Each background processor consisted of a computation section, a control section and local memory. The computation section performed 64-bit scalar, floating point and vector arithmetic. The control section provided instruction buffers, memory management functions, and a real-time clock. 16 kwords (128 kbytes) of high-speed local memory was incorporated into each background processor for use as temporary scratch memory.

Common memory consisted of silicon CMOS SRAM, organized into octants of 64 banks each, with up to eight octants possible. The word size was 64 bits plus eight error-correction bits, and total memory bandwidth was rated at 128 gigabytes per second.
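The memory figures above can be checked with a little arithmetic. The following Python sketch is illustrative only: the names are hypothetical, and the even split of bandwidth across 16 background processors is an assumption rather than a documented property of the machine. Under that assumption the per-processor share works out to 8 GB/s, the same value as the burst rate quoted for each processor in the CPU design section.

```python
# Figures taken from the description of common memory.
BANKS_PER_OCTANT = 64
MAX_OCTANTS = 8
DATA_BITS = 64
ECC_BITS = 8
TOTAL_BANDWIDTH_GB_S = 128
MAX_PROCESSORS = 16  # maximum number of background processors


def total_banks(octants: int = MAX_OCTANTS) -> int:
    """Number of memory banks for a given number of octants."""
    return octants * BANKS_PER_OCTANT


def stored_bits_per_word() -> int:
    """Bits physically stored per word, including error correction."""
    return DATA_BITS + ECC_BITS


def ecc_overhead() -> float:
    """Error-correction bits as a fraction of the data bits."""
    return ECC_BITS / DATA_BITS


def bandwidth_share_gb_s(processors: int = MAX_PROCESSORS) -> float:
    """Assumption: total bandwidth divided evenly among the processors."""
    return TOTAL_BANDWIDTH_GB_S / processors


print(total_banks())            # banks in a full eight-octant memory
print(stored_bits_per_word())   # stored bits per word
print(ecc_overhead())           # ECC overhead as a fraction
print(bandwidth_share_gb_s())   # GB/s per processor under the even-split assumption
```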

CPU design

As with previous designs, the core of the Cray-3 consisted of a number of "modules", each containing several circuit boards packed with parts. In order to increase density, the individual GaAs chips were not "packaged", and instead several were mounted directly with ultrasonic gold bonding to a board approximately 1 inch square. The boards were then turned over and mated to a second board carrying the electrical wiring, with wires on this card running through holes to the "bottom" (opposite the chips) side of the chip carrier where they were bonded, hence sandwiching the chip between the two layers of board. These "submodules" were then stacked four-deep and, as in the Cray-2, wired to each other to make a 3D circuit.

Unlike the Cray-2, the Cray-3 modules also included edge connectors. 16 such submodules were connected together in a 4×4 array to make a single module measuring 121 × 107 × 7 mm (approximately 4 inches square by 0.25 inch deep). Even with this advanced packaging, the circuit density was low by 1990s standards, at about 96,000 gates per cubic inch. Modern CPUs offer gate counts of millions per square inch, and the move to 3D circuits was still only being considered as of 2011.

Thirty-two such modules were then stacked and wired together with a mass of twisted-pair wires into a single processor. The basic cycle time was 2.11 ns, or 474 MHz, allowing each processor to reach about 0.948 GFLOPS, and a 16-processor machine a theoretical 15.17 GFLOPS. Key to the high performance was the high-speed access to main memory, which allowed each processor to burst up to 8 GB/s.
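The clock rate and peak performance figures follow from the cycle time. The sketch below reproduces them in Python, assuming two floating-point results per clock cycle per processor. That value is not stated in the text; it is inferred from the ratio of 0.948 GFLOPS to 474 MHz. The function and constant names are illustrative.

```python
CYCLE_TIME_NS = 2.11   # basic cycle time quoted in the text
FLOPS_PER_CYCLE = 2    # assumption: inferred from 0.948 GFLOPS / 474 MHz
MAX_PROCESSORS = 16


def clock_mhz(cycle_time_ns: float) -> float:
    """Clock frequency in MHz for a cycle time in nanoseconds."""
    return 1000.0 / cycle_time_ns


def peak_gflops(cycle_time_ns: float, flops_per_cycle: int, processors: int = 1) -> float:
    """Theoretical peak in GFLOPS: results per cycle times clock rate in GHz."""
    return flops_per_cycle * processors / cycle_time_ns


print(round(clock_mhz(CYCLE_TIME_NS)))                                      # 474
print(round(peak_gflops(CYCLE_TIME_NS, FLOPS_PER_CYCLE), 3))                # 0.948
print(round(peak_gflops(CYCLE_TIME_NS, FLOPS_PER_CYCLE, MAX_PROCESSORS), 2))  # 15.17
```

These are theoretical peaks; sustained performance on real workloads would have been lower.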

Mechanical design

The modules were held together in an aluminum chassis known as a "brick". The bricks were immersed in liquid Fluorinert for cooling, as in the Cray-2. A four-processor system with 64 memory modules dissipated about 88 kW of power. The entire four-processor system was about 20 inches tall and front-to-back, and a little over two feet wide.

For systems with up to four processors, the processor assembly sat under a translucent bronzed acrylic cover at the top of a cabinet 42 inches (1.1 m) wide, 28 inches (0.7 m) deep and 50 inches (1.3 m) high, with the memory below it, and then the power supplies and cooling systems on the bottom. Eight- and 16-processor systems would have been housed in a larger octagonal cabinet. All in all, the Cray-3 was considerably smaller than the Cray-2, itself relatively small compared to other supercomputers.

In addition to the system cabinet, a Cray-3 system also needed one or two (depending on number of processors) system control pods (or "C-Pods"), 52.5 inches (1.3 m) square and 55.3 inches (1.4 m) high, containing power and cooling control equipment.

System configurations

The following possible Cray-3 configurations were officially specified:

Name              CPUs   Memory (Mwords)   I/O Modules
Cray-3/1-256         1               256             1
Cray-3/2-256         2               256             1
Cray-3/4-512         4               512             3
Cray-3/4-1024        4              1024             3
Cray-3/4-2048        4              2048             3
Cray-3/8-1024        8              1024             7
Cray-3/8-2048        8              2048             7
Cray-3/16-2048      16              2048            15
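
The model names appear to encode the CPU count and the memory size in Mwords. The Python sketch below checks that reading against the configurations listed above and converts memory sizes to bytes. It assumes 8-byte (64-bit) words and a binary megaword of 2^20 words, which is consistent with the "128 MWord ... 1 GB" and "2 gigawords (16 GB)" figures given elsewhere in the article. The helper names are hypothetical.

```python
BYTES_PER_WORD = 8   # 64-bit words
MWORD = 2 ** 20      # assumption: binary megaword, so 128 Mwords = 1 GB

# (name, CPUs, memory in Mwords, I/O modules), as listed in the article
CONFIGS = [
    ("Cray-3/1-256", 1, 256, 1),
    ("Cray-3/2-256", 2, 256, 1),
    ("Cray-3/4-512", 4, 512, 3),
    ("Cray-3/4-1024", 4, 1024, 3),
    ("Cray-3/4-2048", 4, 2048, 3),
    ("Cray-3/8-1024", 8, 1024, 7),
    ("Cray-3/8-2048", 8, 2048, 7),
    ("Cray-3/16-2048", 16, 2048, 15),
]


def parse_model(name: str) -> tuple[int, int]:
    """Split a name such as 'Cray-3/4-512' into (CPUs, Mwords)."""
    cpus, mwords = name.split("/")[1].split("-")
    return int(cpus), int(mwords)


def memory_gb(mwords: int) -> float:
    """Common memory size in binary gigabytes."""
    return mwords * MWORD * BYTES_PER_WORD / 2 ** 30


for name, cpus, mwords, io_modules in CONFIGS:
    # Every listed name should agree with its CPU and memory columns.
    assert parse_model(name) == (cpus, mwords)
    print(f"{name}: {cpus} CPUs, {memory_gb(mwords):.0f} GB, {io_modules} I/O modules")
```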

Software

The Cray-3 ran a Unix-like operating system which included the X Window System plus FORTRAN and C compilers.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.