UltraSPARC T1
Sun Microsystems' UltraSPARC T1 microprocessor, known until its 14 November 2005 announcement by its development codename "Niagara", is a multithreading, multicore CPU. Designed to lower the energy consumption of server computers, the CPU typically uses 72 W of power at 1.4 GHz.
The T1 is a new-from-the-ground-up SPARC microprocessor implementation that conforms to the UltraSPARC Architecture 2005 specification and executes the full SPARC V9 instruction set. Sun has produced two previous multicore processors (UltraSPARC IV and IV+), but the UltraSPARC T1 is its first microprocessor that is both multicore and multithreaded. The processor is available with four, six or eight CPU cores, each core able to handle four threads concurrently. An eight-core processor can therefore process up to 32 threads concurrently.
Like high-end Sun SMP systems, the UltraSPARC T1 can be partitioned. Several cores can be set aside to run a single process or a group of processes and/or threads, whilst the other cores deal with the rest of the processes on the system.
Cores
The UltraSPARC T1 was designed from scratch as a multi-threaded, special-purpose processor, and thus introduces a whole new architecture for obtaining high performance. Rather than trying to make each core as intelligent and optimized as possible, Sun's goal was to run as many concurrent threads as possible and to maximize utilization of each core's pipeline.
The T1's cores are less complex than those of current high-end processors in order to allow 8 cores to fit on the same die. The cores do not feature out-of-order execution or a sizable amount of cache. Single-thread processors depend heavily on large caches for their performance because a cache miss results in a wait while the data is fetched from main memory. Making the cache larger reduces the probability of a cache miss, but the impact of each miss stays the same.
The T1 cores largely side-step the issue of cache misses by multithreading. Each core is a barrel processor, meaning it switches between available threads each cycle. When a long-latency event, such as a cache miss, occurs, the thread is taken out of rotation while the data is fetched into cache in the background. Once the long-latency event completes, the thread is made available for execution again. Sharing the pipeline among multiple threads may make each thread slower, but the overall throughput (and utilization) of each core is much higher. It also means that the impact of cache misses is greatly reduced, and the T1 can maintain high throughput with a smaller amount of cache. The cache no longer needs to be large enough to hold all or most of the "working set", just the recent cache misses of each thread.
Benchmarks demonstrate that this approach has worked very well on commercial (integer), multithreaded workloads such as Java application servers, Enterprise Resource Planning (ERP) application servers, email servers (such as Lotus Domino), and web servers. These benchmarks suggest that each core in the UltraSPARC T1 is more powerful than the circa 2001, single-core, single-threaded UltraSPARC III, and that in a chip-to-chip comparison the T1 significantly outperforms other processors on multithreaded integer workloads.
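The rotation described above can be illustrated with a toy simulation. This is a minimal sketch, not a model of the actual T1 thread-select logic: the names `HardwareThread` and `simulate_core`, the fixed miss interval, and the fixed miss latency are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HardwareThread:
    """One hardware thread context within a core (hypothetical model)."""
    name: str
    stalled_until: int = 0  # first cycle at which the thread is ready again
    retired: int = 0        # instructions completed so far


def simulate_core(num_threads: int, cycles: int,
                  miss_every: int, miss_latency: int) -> float:
    """Return pipeline utilization (0.0 to 1.0) of a round-robin core.

    Assumption: every `miss_every`-th instruction of a thread misses the
    cache and takes that thread out of rotation for `miss_latency` cycles.
    """
    threads = [HardwareThread(f"t{i}") for i in range(num_threads)]
    busy_cycles = 0
    next_index = 0  # round-robin pointer

    for cycle in range(cycles):
        # Issue from the first ready thread at or after the pointer.
        for offset in range(num_threads):
            thread = threads[(next_index + offset) % num_threads]
            if thread.stalled_until <= cycle:
                thread.retired += 1
                busy_cycles += 1
                if thread.retired % miss_every == 0:
                    # Long-latency event: remove the thread from rotation.
                    thread.stalled_until = cycle + 1 + miss_latency
                next_index = (next_index + offset + 1) % num_threads
                break
        # If no thread was ready, the pipeline idles for this cycle.

    return busy_cycles / cycles


if __name__ == "__main__":
    one = simulate_core(num_threads=1, cycles=400, miss_every=10, miss_latency=30)
    four = simulate_core(num_threads=4, cycles=400, miss_every=10, miss_latency=30)
    print(f"1 thread: {one:.2f}, 4 threads: {four:.2f}")
```

With these made-up parameters a single thread keeps the pipeline busy a quarter of the time, and four threads sharing the same pipeline raise utilization well above that, even though each individual thread progresses more slowly.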
At the time of its release in December 2005, a single-chip, eight-core, 32-thread, 1.2 GHz UltraSPARC T1 server performed similarly to a two-socket, four-core, eight-thread, 1.9 GHz IBM POWER5 server, performed similarly to a four-socket, eight-core, sixteen-thread 3.0 GHz Intel Xeon "Paxville MP" server, and exceeded the performance of a four-socket, four-core, four-thread 1.6 GHz Intel Itanium server. Arguably, this made the UltraSPARC T1 the world's most powerful general-purpose commercial server processor for multithreaded commercial workloads.
Studies by Intel show that even under full load, a typical x86 server CPU is idle 50 to 60% of the time. The cause is cache misses, which affect all CPU architectures: the CPU must wait for data to arrive from RAM. That is also why modern CPUs have larger caches, complex prefetch logic, and so on. CPUs in the T1 family largely avoid this problem. As soon as a T1 thread stalls on a cache miss, the core switches to another thread within one clock cycle and continues doing work while the data is fetched. On a typical modern CPU, a thread switch takes much longer than one clock cycle. This is why a T1 can do useful work about 95% of the time and wait for data only about 5% of the time. Compare this with an x86 CPU at 3 GHz: because cache misses leave it working only about half of the time, it is roughly comparable to a 1.5 GHz CPU working at full speed. However, a single T1 thread is comparable in computing power to a 1 GHz Intel Pentium III.
The T1 is slow on single-threaded work but shines on multi-threaded work. A common mistake when testing is to load the T1 only lightly, typically with a small data set of 1 GB or so. In that case an x86 CPU easily outperforms the T1. When the machine is heavily loaded with lots of data, however, the T1 easily outperforms the x86 CPU: the x86 CPU stalls while the T1 continues to work, and the T1's performance degrades an order of magnitude more slowly than that of the x86 CPU. The T1 must be loaded heavily to show its true potential.
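The "half speed" comparison above is simple arithmetic: a clock rate scaled by the fraction of time the CPU does useful work. The sketch below works through it; the helper name `effective_clock_ghz` is hypothetical, and treating busy fraction times clock rate as a performance figure is a deliberately crude assumption that ignores instructions per cycle and everything else that differs between architectures.

```python
def effective_clock_ghz(clock_ghz: float, busy_fraction: float) -> float:
    """Clock rate scaled by the fraction of cycles spent on useful work."""
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    return clock_ghz * busy_fraction


# Figures quoted in the text: a 3 GHz x86 CPU busy about half of the time,
# and a 1.4 GHz T1 core busy about 95% of the time.
x86_effective = effective_clock_ghz(3.0, 0.50)
t1_core_effective = effective_clock_ghz(1.4, 0.95)

if __name__ == "__main__":
    print(f"x86 at 3 GHz, 50% busy: {x86_effective:.2f} GHz equivalent")
    print(f"T1 core at 1.4 GHz, 95% busy: {t1_core_effective:.2f} GHz equivalent")
```

The T1 figure applies to each core's pipeline, which is shared by four threads, so it says nothing about the speed of any single thread.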
Systems
The T1 processor can be found in the following products from Sun and Fujitsu Computer Systems:
- Sun/Fujitsu/Fujitsu Siemens SPARC Enterprise T1000 and T2000 servers
- Sun Fire T1000 and T2000 servers
- Sun Netra T2000 Server
- Sun Netra CP3060 Blade
- Sun Blade T6300 Server Module
Target market
The UltraSPARC T1 microprocessor is unique in its strengths and weaknesses, and as such is targeted at specific markets. Rather than being used for high-end number-crunching and ultra-high performance applications, the chip is targeted at network-facing high-demand servers, such as high-traffic web servers, and mid-tier Java, ERP, and CRM application servers, which often utilize a large number of separate threads. One of the limitations of the T1 design is that a single floating point unit (FPU) is shared between all 8 cores, making the T1 unsuitable for applications performing a lot of floating point mathematics. However, since the processor's intended markets do not typically make much use of floating-point operations, Sun does not expect this to be a problem. Sun provides a tool for analysing an application's level of parallelism and use of floating point instructions to determine if it is suitable for use on a T1 or T2 platform.
In addition to web and application tier processing, the UltraSPARC T1 may be well suited for smaller database applications which have a large user count. One customer has published results showing that a MySQL application running on an UltraSPARC T1 server ran 13.5 times faster than on an AMD Opteron server.
Virtualization
T1 is the first SPARC processor that supports the Hyper-Privileged execution mode. The SPARC Hypervisor runs in this mode, and it can partition a T1 system into 32 Logical Domains, each of which can run an operating system instance.
Currently, Solaris and Linux are supported, and FreeBSD support is under development.
Software licensing issues
Traditionally, vendors of commercial software suites such as Oracle Database have charged their customers based on the number of processors the software runs on. In early 2006, Oracle changed its licensing model by introducing the processor factor. With a processor factor of 0.25 for the T1, an 8-core T2000 requires only a 2-CPU license.
In Q3 2006, IBM introduced the concept of Processor Value Unit (PVU) pricing. Each core of the T1 is rated at 30 PVUs instead of the default value of 100 PVUs per core.
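The two licensing schemes reduce to short calculations. The following sketch reproduces the figures quoted above; the function names are hypothetical, and rounding a fractional Oracle license count up to a whole number is an assumption that does not affect the 8-core case.

```python
import math


def oracle_licenses(cores: int, processor_factor: float) -> int:
    """Processor licenses required: cores times the processor factor.

    Assumption: fractional results are rounded up to a whole license.
    """
    return math.ceil(cores * processor_factor)


def ibm_pvus(cores: int, pvus_per_core: int) -> int:
    """Total Processor Value Units for a chip."""
    return cores * pvus_per_core


if __name__ == "__main__":
    # 8-core T1 with a processor factor of 0.25
    print(oracle_licenses(cores=8, processor_factor=0.25))
    # 8-core T1 at 30 PVUs per core, versus the 100 PVU default
    print(ibm_pvus(cores=8, pvus_per_core=30))
    print(ibm_pvus(cores=8, pvus_per_core=100))
```

Under these figures an 8-core T1 needs 2 Oracle processor licenses and is rated at 240 PVUs, compared with 800 PVUs at the default rate.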
Weaknesses
The T1 is only available in uniprocessor systems, limiting vertical scalability in large enterprise environments; Sun has announced that the follow-on "Victoria Falls" processor will address this.
Rock
The UltraSPARC T1 is designed for single-CPU systems only and is not capable of SMP. Future Sun CMT UltraSPARC processors such as Rock will support multiple-chip server architectures. The Rock processor targets traditional data-facing workloads such as databases. As such, it is seen as the logical follow-on to Sun's SMP processors such as UltraSPARC IV, rather than a replacement for the UltraSPARC T1 or T2.
Rock also targets floating point workloads, unlike UltraSPARC T1. Sun has publicly disclosed a feature in the Rock processor called hardware scout, which uses multithreaded hardware to perform prefetching.
Rock is the world's first general purpose processor with hardware transactional memory.
UltraSPARC T2
Formerly known by the codename Niagara 2, the follow-on to the UltraSPARC T1 supports eight threads per core, and each core has its own FPU.
UltraSPARC T2 Plus
In February 2007, Sun announced at its annual analyst summit that its third-generation simultaneous multithreading design, code-named Victoria Falls, was taped out in October 2006. A two-socket server (2 RU) will have 128 threads, 16 cores, and a 65× performance improvement over UltraSPARC III.
At the Hot Chips 19 conference, Sun announced that Victoria Falls will be in 2-way and 4-way servers. Thus, a single 4-way SMP server will support 256 concurrent hardware threads.
In April 2008, Sun released 2-way UltraSPARC T2 Plus servers, the SPARC Enterprise T5140 and T5240.
In October 2008, Sun released the 4-way UltraSPARC T2 Plus-based SPARC Enterprise T5440 server.
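The thread counts quoted for these systems follow from multiplying sockets, cores per chip, and threads per core. The sketch below checks them; the helper name is hypothetical, and the figure of eight cores per T2 Plus chip is inferred from the 16 cores quoted for a two-socket server.

```python
def hardware_threads(sockets: int, cores_per_chip: int, threads_per_core: int) -> int:
    """Concurrent hardware threads in a system."""
    return sockets * cores_per_chip * threads_per_core


# UltraSPARC T1: one socket, up to eight cores, four threads per core.
t1_threads = hardware_threads(sockets=1, cores_per_chip=8, threads_per_core=4)

# UltraSPARC T2 Plus: eight threads per core, in 2-way and 4-way servers.
two_way_threads = hardware_threads(sockets=2, cores_per_chip=8, threads_per_core=8)
four_way_threads = hardware_threads(sockets=4, cores_per_chip=8, threads_per_core=8)

if __name__ == "__main__":
    print(t1_threads, two_way_threads, four_way_threads)
```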
Niagara 3
In October 2006, Sun disclosed that Niagara 3 will be built with a 45 nm process. According to a June 2008 article in The Register, the processor will have 16 cores with 16 threads each.
Open design
On March 21, 2006, Sun made the UltraSPARC T1 processor design available under the GNU General Public License via the OpenSPARC project. The published information includes:
- Verilog source code of the UltraSPARC T1 design;
- Verification suite and simulation models;
- ISA specification (UltraSPARC Architecture 2005);
- The Solaris 10 OS simulation images.