R8000



 
 
The R8000 is a microprocessor chipset that implements the MIPS IV instruction set architecture (ISA). It was developed jointly by MIPS Technologies, Inc. (MTI), then a subsidiary of Silicon Graphics, Inc. (SGI), together with Toshiba and Weitek.

History


The R8000 was designed specifically to bring the performance of contemporary supercomputers to a microprocessor. At the time, the performance advantage of supercomputers was rapidly diminishing as reduced instruction set computer (RISC) implementations advanced. The R8000 was the first implementation of MIPS IV and the first superscalar MIPS microprocessor.

The R8000 was, however, significantly delayed, and it was introduced in late 1994 in systems from SGI. Its high cost and narrow market (technical and scientific computing) restricted its market share. Although it was popular in its intended market, it was largely replaced by the cheaper and better-performing R10000 in January 1996.

SGI used the R8000 in its Power Indigo2 workstations, Power Challenge servers, Power CHALLENGEarray clusters and Power Onyx visualization systems. In the November 1994 TOP500 list, R8000-based systems accounted for 50 of the 500 entries. Four Power Challenge systems, each with 18 microprocessors, were ranked 154 to 157.

Description


The R8000 chipset consisted of two chips: the R8000 microprocessor and the R8010 floating-point unit. The two chips were accompanied by application-specific integrated circuits (ASICs) that implemented the control hardware for the secondary cache.

R8000


The R8000 microprocessor chip contained the majority of the chipset's logic. It executed integer instructions in its integer functional units and contained the integer register file, the primary caches, and the hardware for instruction fetch, branch prediction and the translation lookaside buffers (TLBs).

The integer register file consisted of thirty-two 64-bit entries with nine read ports and four write ports. Four read ports supplied operands to two of the four functional units, and another four supplied operands to the two address generators. A single read port delivered data to the two banks of the data cache. Two write ports wrote results from two functional units to the register file, and the other two write ports received data loaded from the data cache, one for each bank.
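The port allocation described above can be tallied as a quick consistency check. The following is a minimal sketch only; the dictionary keys are hypothetical labels chosen for this illustration and do not come from R8000 documentation.

```python
# Port allocation of the integer register file, as described in the text.
# The key names are hypothetical labels invented for this sketch.
READ_PORTS = {
    "functional_unit_operands": 4,    # operands for two functional units
    "address_generator_operands": 4,  # operands for the two address generators
    "data_to_cache_banks": 1,         # data delivered to the data cache banks
}
WRITE_PORTS = {
    "functional_unit_results": 2,     # results from two functional units
    "data_from_cache_banks": 2,       # loaded data, one port per cache bank
}


def total_ports(allocation):
    """Sum the ports assigned to every consumer in an allocation table."""
    return sum(allocation.values())


if __name__ == "__main__":
    print("read ports:", total_ports(READ_PORTS))    # read ports: 9
    print("write ports:", total_ports(WRITE_PORTS))  # write ports: 4
```

The totals agree with the nine read ports and four write ports stated for the register file.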

The integer functional units consisted of two integer units, a shift unit, a multiply-divide unit, and two address generator units. Multiply and divide instructions were executed in the multiply-divide unit, which was not pipelined. As a result, the latency of a multiply instruction was four cycles for 32-bit operands and six cycles for 64-bit operands. The latency of a divide instruction depended on the number of significant digits in the result and thus varied from 21 to 73 cycles.
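To illustrate what a non-pipelined unit means for a run of multiplies or divides, the sketch below uses the latencies quoted above. It assumes that the unit cannot accept a new operation until the previous one has finished, so that the repeat rate equals the latency; the text does not state this explicitly, and the function names are made up for this example.

```python
# Latencies in cycles, taken from the text.
MULTIPLY_LATENCY = {32: 4, 64: 6}   # keyed by operand width in bits
DIVIDE_LATENCY_RANGE = (21, 73)     # depends on significant digits in the result


def back_to_back_multiply_cycles(count, width_bits):
    """Cycles for `count` consecutive multiplies on the non-pipelined unit.

    Assumption: each multiply occupies the unit for its full latency.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return count * MULTIPLY_LATENCY[width_bits]


def back_to_back_divide_bounds(count):
    """Best-case and worst-case cycles for `count` consecutive divides."""
    if count < 0:
        raise ValueError("count must be non-negative")
    best, worst = DIVIDE_LATENCY_RANGE
    return count * best, count * worst


if __name__ == "__main__":
    print(back_to_back_multiply_cycles(10, 64))  # 60
    print(back_to_back_divide_bounds(3))         # (63, 219)
```

Under this assumption, ten 64-bit multiplies take 60 cycles, whereas a fully pipelined unit with the same latency would need only 15.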

R8010


The R8010 executed floating-point instructions. It was decoupled from the integer pipeline, thus implementing a limited form of out-of-order execution. This was done to hide the remaining latency of the secondary cache. The R8010 consisted of two execution units, the floating-point register file, a load queue and a store queue. The two execution units were identical, and each executed double-precision fused multiply-adds, adds, multiplies, divides and floating-point to integer conversions. The execution units were fully pipelined and bypassed, except for the divide and square-root hardware, which was not pipelined. As a result, single- and double-precision divides required 14 and 20 cycles, respectively, and single- and double-precision square roots required 14 and 23 cycles, respectively.
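A rough upper bound on floating-point throughput follows from the two pipelined execution units. The sketch below rests on assumptions not stated in this section: that each unit can complete one fused multiply-add per cycle, that a fused multiply-add is counted as two floating-point operations, and that the clock is the 75 MHz figure quoted elsewhere in this article for the cache drivers. The names are illustrative only.

```python
# Latencies in cycles for the non-pipelined operations, as quoted in the text.
NON_PIPELINED_LATENCY = {
    ("divide", "single"): 14,
    ("divide", "double"): 20,
    ("sqrt", "single"): 14,
    ("sqrt", "double"): 23,
}


def peak_mflops(clock_mhz, units=2, flops_per_fma=2):
    """Upper bound in MFLOPS.

    Assumptions: every unit completes one fused multiply-add per cycle,
    and each fused multiply-add counts as `flops_per_fma` operations.
    """
    return clock_mhz * units * flops_per_fma


if __name__ == "__main__":
    print(peak_mflops(75))                               # 300
    print(NON_PIPELINED_LATENCY[("divide", "double")])   # 20
```

This is a ceiling rather than a sustained figure: any divide or square root occupies its non-pipelined hardware for the number of cycles given in the table.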

Cache


The R8000 used a split-level cache. Integer data is fetched from a primary cache located on the R8000 die. Floating-point data is fetched from a primary cache located externally. The same external cache also serves as a unified secondary cache for integer instructions and data.

This scheme was used because the R8000 was designed for sustained floating-point performance. The R8010 floating-point unit required large amounts of data and executed a large number of instructions, so a small, high-bandwidth primary cache would have been emptied rapidly, forcing the microprocessor to refill it with new data. Because this would have restricted floating-point performance, a large external cache with high bandwidth, but also higher latency, was used instead, as it could supply the microprocessor with data continuously.

To mitigate some of the latency, the cache was pipelined in five stages. During the first stage, the R8000 sends addresses to the tag RAMs, which are accessed during the second stage. The third stage allows the signals from the tag RAMs to propagate to the data SRAMs. The data SRAMs are accessed during the fourth stage, and the data is returned to the R8000 and R8010 during the fifth stage. A full stage was allotted to signal propagation because transistor-transistor logic (TTL) drivers operating at 75 MHz under heavy loading require an entire cycle, and because the extra stage avoided restricting the microprocessor from operating at higher clock frequencies in the future.
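The effect of pipelining the five stages can be sketched with a simple cycle count. The model below assumes an ideal pipeline that accepts one new access every cycle and never stalls, which is a simplification; the stage labels paraphrase the description above and the function names are hypothetical.

```python
# The five stages of the external cache pipeline, paraphrased from the text.
STAGES = (
    "send address to tag RAMs",
    "access tag RAMs",
    "propagate signals to data SRAMs",
    "access data SRAMs",
    "return data to R8000 and R8010",
)


def pipelined_cycles(accesses, stages=len(STAGES)):
    """Cycles for back-to-back accesses in an ideal, stall-free pipeline.

    The first access takes `stages` cycles; each later access completes
    one cycle after the one before it.
    """
    if accesses <= 0:
        return 0
    return stages + (accesses - 1)


def unpipelined_cycles(accesses, stages=len(STAGES)):
    """Cycles if each access had to finish before the next could start."""
    return max(accesses, 0) * stages


if __name__ == "__main__":
    print(pipelined_cycles(1))      # 5
    print(pipelined_cycles(100))    # 104
    print(unpipelined_cycles(100))  # 500
```

In this idealized model the latency of a single access is still five cycles, but a long stream of accesses approaches one result per cycle, which is how pipelining hides part of the latency.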

Primary caches


The data cache has a capacity of 16 KB. It is dual-ported, with two 64-bit buses, and can service two loads or one load and one store per cycle. The cache is virtually indexed, physically tagged and direct-mapped, has a 32-byte line size, and uses a write-through with allocate protocol. It is not protected by parity or by ECC. In the event of a miss, the data must be loaded from the external streaming cache with an eight-cycle penalty.
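The stated geometry (16 KB, direct-mapped, 32-byte lines) implies 512 lines, a 5-bit byte offset and a 9-bit index. The sketch below shows how a virtual address would select a line under those parameters. It models only the indexing; the physical tag comparison and the two banks are not modelled, and the function name is made up for this illustration.

```python
CACHE_BYTES = 16 * 1024   # 16 KB capacity, from the text
LINE_BYTES = 32           # 32-byte line size, from the text

NUM_LINES = CACHE_BYTES // LINE_BYTES       # 512 lines (direct-mapped)
OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 5 bits select a byte in the line
INDEX_BITS = NUM_LINES.bit_length() - 1     # 9 bits select the line


def split_virtual_address(vaddr):
    """Return (line index, byte offset) for a virtual address.

    The cache is virtually indexed, so the index comes from the virtual
    address; the physical tag check is outside the scope of this sketch.
    """
    offset = vaddr & (LINE_BYTES - 1)
    index = (vaddr >> OFFSET_BITS) & (NUM_LINES - 1)
    return index, offset


if __name__ == "__main__":
    print(NUM_LINES, OFFSET_BITS, INDEX_BITS)            # 512 5 9
    print(split_virtual_address(0x12345))                # (282, 5)
    # Addresses exactly 16 KB apart map to the same line in a direct-mapped cache.
    print(split_virtual_address(0x12345 + CACHE_BYTES))  # (282, 5)
```

Because the cache is direct-mapped, two addresses that differ by a multiple of 16 KB compete for the same line, as the last two calls show.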

Physical


The R8000 consisted of 2.6 million transistors and the R8010 of 830,000, for a total of 3.43 million. Both dies were fabricated in VHMOSIII, a 0.7 µm process by Toshiba. Both are packaged in a 591-pin pin grid array (PGA). The chipset used a 3.3 V power supply.