R10000
The R10000, code-named "T5", is a microprocessor implementation of the MIPS IV instruction set architecture (ISA) developed by MIPS Technologies, Inc. (MTI), then a division of Silicon Graphics, Inc. (SGI). The microarchitecture of the R10000 was known as ANDES, an abbreviation for Architecture with Non-sequential Dynamic Execution Scheduling.
It was originally intended to be the last high-performance, non-embedded MIPS microprocessor to be developed for SGI, who had opted to replace MIPS with the Itanium.

Because the Itanium was repeatedly delayed, the R10000's basic microarchitecture became the basis for successive derivatives. As MTI was a fabless semiconductor company, the R10000 was fabricated by NEC and Toshiba. Earlier fabricators of MIPS microprocessors, such as Integrated Device Technology (IDT) and three others, did not fabricate the R10000 because its design made it expensive to manufacture.
History
The R10000 was introduced in January 1996 at clock frequencies ranging from 150 MHz to 195 MHz (the 200 MHz version was not introduced because of yield problems at the foundries). Later, the R10000 was fabricated in a 0.25 µm process, which enabled it to reach 250 MHz.
R10000 users were SGI, NEC and others. SGI used the R10000 in their workstations, servers and supercomputers. NEC built supercomputers utilizing the R10000, and other manufacturers built both workstations and servers.
Description
MIPS IV is a 64-bit architecture, but to reduce cost the R10000 did not implement the full 64-bit physical or virtual address. Instead, it has a 40-bit physical address and a 44-bit virtual address, so it can address 1 TB of physical memory and 16 TB of virtual memory.
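The memory sizes above follow directly from the address widths. A minimal check, assuming "TB" here means the binary unit of 2^40 bytes (the constant and function names are illustrative only):

```python
# Address widths quoted in the text for the R10000.
PHYSICAL_ADDRESS_BITS = 40
VIRTUAL_ADDRESS_BITS = 44

# Assumption: "TB" in the text is the binary terabyte, 2**40 bytes.
TB = 2 ** 40


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given width."""
    return 2 ** address_bits


physical_tb = addressable_bytes(PHYSICAL_ADDRESS_BITS) // TB
virtual_tb = addressable_bytes(VIRTUAL_ADDRESS_BITS) // TB

print(physical_tb, virtual_tb)  # 1 16
```

Each additional address bit doubles the reachable space, so the four extra virtual bits account for the factor of 16.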
Integer unit
The integer unit consists of three pipelines: two integer and one load/store. The integer register file is 64 bits wide and contains 64 entries, of which 32 are architectural registers and 32 are rename registers used to implement register renaming. The register file has seven read ports and three write ports.
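To illustrate how 32 rename registers can back 32 architectural registers, here is a generic textbook-style sketch of a rename map with a free list. It is not a description of the R10000's actual renaming logic: the class name `RenameMap`, its methods, and the policy of freeing the old mapping at retirement are assumptions made for illustration, and the hardwired-zero register is ignored for simplicity.

```python
from collections import deque

NUM_ARCH = 32  # architectural registers (from the text)
NUM_PHYS = 64  # entries in the physical register file (from the text)


class RenameMap:
    """Hypothetical rename table: architectural register -> physical register."""

    def __init__(self):
        # Initially, architectural register i lives in physical register i.
        self.table = list(range(NUM_ARCH))
        # The remaining 32 physical registers are free for renaming.
        self.free = deque(range(NUM_ARCH, NUM_PHYS))

    def rename(self, dest, srcs):
        """Rename one instruction.

        Returns (new_phys_dest, phys_srcs, old_phys_dest).
        """
        # Sources are looked up before the destination mapping is replaced.
        phys_srcs = [self.table[s] for s in srcs]
        if not self.free:
            raise RuntimeError("no free physical register: rename must stall")
        new_phys = self.free.popleft()
        old_phys = self.table[dest]
        self.table[dest] = new_phys
        return new_phys, phys_srcs, old_phys

    def retire(self, old_phys):
        """Return a superseded physical register to the free list."""
        self.free.append(old_phys)


rm = RenameMap()
first = rm.rename(dest=1, srcs=[2, 3])   # r1 <- r2 op r3
second = rm.rename(dest=1, srcs=[1, 4])  # r1 <- r1 op r4
print(first)   # (32, [2, 3], 1)
print(second)  # (33, [32, 4], 32)
```

The second instruction reads the value produced by the first from physical register 32 while writing its own result to physical register 33, so the two writes to r1 no longer conflict.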
Floating-point unit
The floating-point unit consists of four functional units: an adder, a multiplier, a divide unit and a square root unit. The adder and multiplier are pipelined, but the divide and square root units are not. Adds and multiplies have a latency of three cycles, and the adder and multiplier can each accept a new instruction every cycle. The divide unit has a 12- or 19-cycle latency, depending on whether the divide is single precision or double precision, respectively.
The square root unit executes square root and reciprocal square root instructions. Square root instructions have an 18- or 33-cycle latency for single precision or double precision, respectively. A new square root instruction can be issued to the divide unit every 20 or 35 cycles for single precision or double precision, respectively. Reciprocal square roots have longer latencies of 30 or 52 cycles for single precision or double precision, respectively.
The floating-point register file contains sixty-four 64-bit registers, of which thirty-two are architectural and the remaining are rename registers.
The adder has its own dedicated read and write ports, whereas the multiplier shares its ports with the divide and square root units.
The divide and square root units use the SRT algorithm. The MIPS IV ISA has a multiply-add instruction. The R10000 implements this instruction with a bypass: the result of the multiply can bypass the register file and be delivered to the add pipeline as an operand. It is therefore not a fused multiply-add, and it has a four-cycle latency.
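The practical difference between a fused and an unfused multiply-add is the number of roundings. The sketch below contrasts the two on ordinary double-precision values; it assumes Python floats are IEEE 754 doubles and emulates the fused operation with exact rational arithmetic. The function names are illustrative and do not model the R10000's datapath.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    # Two roundings: the product is rounded to a double, then the sum is.
    return (a * b) + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    # One rounding: a*b + c is computed exactly, then rounded once.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# a*b = 1 + 2**-26 + 2**-54 exactly; the last term is lost when the
# product is rounded to a double on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 + 2.0 ** -27
c = -(1.0 + 2.0 ** -26)

unfused = unfused_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)

print(unfused)               # 0.0
print(fused == 2.0 ** -54)   # True
```

With inputs chosen this way, the unfused form loses the low-order term of the product, whereas the single-rounding form preserves it.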
Caches
The R10000 has a 32 KB instruction cache and a 32 KB data cache, which was large for the time (1996). The instruction cache is two-way set associative and has a 64-byte line size. Before instructions are placed in the cache, they are partially decoded: four bits are appended to each 32-bit instruction to identify the execution unit in which it will execute. The 32 KB data cache is two-way interleaved, consisting of two 16 KB banks that are two-way set associative. It is virtually indexed and physically tagged, which allows the cache to be indexed in the same clock cycle as address translation and helps maintain coherency with the secondary cache.
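The cache parameters above determine how an address is split into offset and index bits. The following sketch works this out under stated assumptions: the 4 KB page size is not given in the text and is assumed here, the data cache is treated as a single 32 KB two-way cache for the purpose of counting bits, and all names are hypothetical.

```python
KB = 1024


def log2_exact(n: int) -> int:
    """log2 of a power of two."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


def cache_geometry(capacity_bytes: int, ways: int, line_bytes: int):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = capacity_bytes // (ways * line_bytes)
    return sets, log2_exact(line_bytes), log2_exact(sets)


# Instruction cache from the text: 32 KB, two-way, 64-byte lines.
icache_sets, icache_offset_bits, icache_index_bits = cache_geometry(
    32 * KB, ways=2, line_bytes=64
)

# Predecode: each 32-bit instruction is stored with 4 extra bits.
stored_bits_per_instruction = 32 + 4

# Data cache: offset plus index bits depend only on capacity / ways,
# so the line size (not given in the text) is not needed.
dcache_way_bits = log2_exact((32 * KB) // 2)

# Assumption: 4 KB pages, so only the low 12 address bits are untranslated.
ASSUMED_PAGE_OFFSET_BITS = 12
virtual_index_bits = dcache_way_bits - ASSUMED_PAGE_OFFSET_BITS

print(icache_sets, icache_offset_bits, icache_index_bits)  # 256 6 8
print(stored_bits_per_instruction)                          # 36
print(dcache_way_bits, virtual_index_bits)                  # 14 2
```

Under the assumed page size, two of the index bits lie above the page offset, which is why the index has to come from the virtual address if the lookup is to start before translation finishes.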
The secondary cache ranged in capacity from 512 KB to 16 MB and was built from synchronous static random access memory (SSRAM). It was accessed via a dedicated 128-bit bus with 9 bits for ECC. The cache and bus operated at the same clock frequency as the R10000, whose maximum was 200 MHz. At 200 MHz, the bus yielded a peak bandwidth of 3.2 GB/s.
Avalanche system bus
The R10000 used the Avalanche bus, a 64-bit bus that operated at frequencies up to 100 MHz. At 100 MHz it had a theoretical maximum bandwidth of 800 MB/s, but because Avalanche was a multiplexed address and data bus, some cycles were needed to transmit addresses and its peak bandwidth was 640 MB/s.
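The bandwidth figures quoted for the secondary cache bus and the Avalanche bus follow from bus width multiplied by clock frequency. A small sketch of that arithmetic, using decimal units (1 MB/s = 10^6 bytes per second) and hypothetical names; the fraction of data cycles is simply derived from the two Avalanche figures in the text and is not a documented bus parameter:

```python
MHZ = 1_000_000


def peak_bandwidth_bytes_per_s(bus_width_bits: int, clock_hz: float) -> float:
    """Bytes per second if every clock cycle transfers one full bus width."""
    return (bus_width_bits / 8) * clock_hz


# Secondary cache bus: 128 data bits at 200 MHz (ECC bits carry no payload).
l2_bandwidth = peak_bandwidth_bytes_per_s(128, 200 * MHZ)

# Avalanche: 64 bits at 100 MHz, if every cycle carried data.
avalanche_theoretical = peak_bandwidth_bytes_per_s(64, 100 * MHZ)

# Peak figure quoted in the text for the multiplexed bus.
AVALANCHE_QUOTED_PEAK = 640e6
data_cycle_fraction = AVALANCHE_QUOTED_PEAK / avalanche_theoretical

print(l2_bandwidth / 1e9)           # 3.2 (GB/s)
print(avalanche_theoretical / 1e6)  # 800.0 (MB/s)
print(data_cycle_fraction)          # 0.8
```

Read this way, the quoted peak corresponds to about four in every five bus cycles carrying data, with the remainder spent on addresses.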
The system interface controller supported glue-less symmetrical multiprocessing (SMP) of up to four microprocessors. Systems using the R10000 with external logic could scale to hundreds of processors, such as the Origin 2000.
Fabrication
The R10000 consisted of approximately 6.7 million transistors, of which approximately 4.4 million are contained in the primary caches. The die measured 16.64 mm by 17.934 mm, for a die area of 298 mm². It was fabricated in a 0.35 µm process and packaged in a 599-pad ceramic land grid array (LGA). Before the introduction of the R10000 at the 1994 Microprocessor Forum, the Microprocessor Report reported that the R10000 was packaged in a 527-pin ceramic pin grid array (CPGA) and that vendors had also investigated the possibility of using a 339-pin multi-chip module (MCM) containing the microprocessor die and 1 MB of cache.
Derivatives
R12000
The R12000 was a further development of the R10000 introduced in November 1998. The R12000 was developed as a stop-gap solution following the cancellation of the "Beast" project, which was intended to deliver a successor to the R10000. R12000 users included SGI, Tandem Computers (later Compaq, which had acquired Tandem) and Siemens-Nixdorf.
The microarchitecture of the R12000 was improved and it was fabricated in a 0.25 µm CMOS process with four levels of interconnect. The use of a new process did not mean that the R12000 was a simple die shrink with a tweaked microarchitecture: the layout of the die was optimized to take advantage of the 0.25 µm process and the extra level of interconnect.
R12000A
The R12000A was an R12000 fabricated in a 0.18 µm process. It operated at up to 400 MHz and was introduced in July 2000.
R14000
The R14000 was a further development of the R12000 announced in July 2001. The R14000 operated at 500 MHz, enabled by the 0.13 µm CMOS process with five levels of copper interconnect in which it was fabricated. It improved on the microarchitecture of the R12000 by supporting double data rate (DDR) SSRAMs for the secondary cache and a 200 MHz system bus.
R14000A
The R14000A was a further development of the R14000 announced in February 2002. It operated at 600 MHz, dissipated approximately 17 W, and was fabricated by NEC Corporation in a 0.13 µm CMOS process with seven levels of copper interconnect.
R16000
The R16000 was the last MIPS microprocessor for general-purpose computing. Improvements included higher clock frequencies and 64 KB instruction and data caches. Originally, the fastest R16000 publicly known operated at 800 MHz, but SGI later revealed that 1.0 GHz R16000s had been shipped to selected customers. R16000 users included SGI, for its Tezro workstations and Origin 3000 servers and supercomputers, and HP, for its NonStop Himalaya S-Series fault-tolerant servers inherited from Compaq via Tandem.