Hyper-threading



 
 


Hyper-threading is Intel's trademarked term for its simultaneous multithreading implementation in its Pentium 4, Atom, and Core i7 CPUs. Hyper-threading (officially termed Hyper-Threading Technology or HTT) is an Intel-proprietary technology used to improve parallelization of computations (doing multiple tasks at once) performed on PC microprocessors. A processor with hyper-threading enabled is treated by the operating system as two processors instead of one. This means that only one processor is physically present, but the operating system sees two virtual processors and shares the workload between them. Hyper-threading requires both operating system and CPU support; conventional multiprocessor support is not enough. For example, Intel does not recommend that hyper-threading be enabled under Windows 2000, even though that operating system supports multiple CPUs.
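As a rough illustration of the "two processors instead of one" view, the short Python sketch below asks the operating system how many processors it reports. The helper name `logical_cpu_count` is hypothetical; `os.cpu_count()` is the standard-library call and counts logical processors, so on a machine with hyper-threading enabled the result is typically twice the number of physical cores (an expectation, not something the script verifies).

```python
import os


def logical_cpu_count() -> int:
    """Return the number of logical processors the OS reports.

    os.cpu_count() may return None when the count cannot be
    determined; fall back to 1 in that case.
    """
    count = os.cpu_count()
    return count if count is not None else 1


# With hyper-threading enabled, each physical core usually
# contributes two logical processors to this number.
print(f"Logical processors visible to the OS: {logical_cpu_count()}")
```

The script only reports what the operating system exposes; it cannot by itself distinguish a hyper-threaded core from two separate physical cores.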

Performance

The advantages of hyper-threading are listed as improved support for multi-threaded code, the ability to run multiple threads simultaneously, and improved reaction and response time.

According to Intel, the first implementation used only 5% more die area than the comparable non-hyper-threaded processor, while delivering 15–30% better performance.
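To put Intel's figures in proportion, the following sketch computes performance per unit of die area relative to the non-hyper-threaded design. The function name `gain_per_area` is hypothetical, and the inputs are simply the 5% area and 15–30% performance figures quoted above, taken at face value.

```python
def gain_per_area(speedup: float, area_increase: float) -> float:
    """Relative performance per unit of die area.

    Both arguments are fractions, e.g. 0.15 for a 15% speedup.
    A result above 1.0 means the extra area pays for itself.
    """
    return (1.0 + speedup) / (1.0 + area_increase)


low = gain_per_area(0.15, 0.05)   # 15% faster for 5% more area
high = gain_per_area(0.30, 0.05)  # 30% faster for 5% more area
print(f"{low:.3f} .. {high:.3f}")  # prints "1.095 .. 1.238"
```

Under these claimed numbers, each unit of die area delivers roughly 10% to 24% more performance than in the baseline design.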

Intel claims up to a 30% speed improvement compared with an otherwise identical Pentium 4 without simultaneous multithreading. Intel also claims significant performance improvements with a hyper-threading-enabled Pentium 4 processor in some artificial intelligence algorithms. The performance improvement is very application-dependent, however, and some programs actually slow down slightly when Hyper-Threading Technology is turned on. This is due to the replay system of the Pentium 4 tying up valuable execution resources, thereby starving the other thread. (The Pentium 4 Prescott core gained a replay queue, which reduces the execution time needed for the replay system, but this is not enough to completely overcome the performance hit.) However, this performance degradation is specific to the Pentium 4 (due to various architectural nuances) and is not characteristic of simultaneous multithreading in general.

Details


Hyper-threading works by duplicating certain sections of the processor, namely those that store the architectural state, but not duplicating the main execution resources. This allows a hyper-threading processor to appear as two "logical" processors to the host operating system, allowing the operating system to schedule two threads or processes simultaneously. When execution resources would not be used by the current task in a processor without hyper-threading, and especially when the processor is stalled, a hyper-threading equipped processor can use those execution resources to execute another scheduled task. (The processor may stall due to a cache miss, branch misprediction, or data dependency.)
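The idea of filling stall cycles with work from a second thread can be sketched with a toy model. The code below is an illustrative simulation only, not a description of Intel's hardware: it assumes a single shared execution unit, threads written as strings in which "X" is a cycle of work and "-" is a stall cycle (for example a cache miss), and a fixed priority that favours the first thread. The names `run_sequential` and `run_smt` are hypothetical.

```python
def run_sequential(threads):
    """Cycles needed when threads run one after another.

    Every stall cycle leaves the execution unit idle.
    """
    return sum(len(t) for t in threads)


def run_smt(threads):
    """Cycles needed when threads share one execution unit.

    In each cycle a stalled thread simply waits out its stall,
    while at most one ready thread uses the execution unit.
    """
    pos = [0] * len(threads)  # next step of each thread
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        unit_busy = False
        for i, t in enumerate(threads):
            if pos[i] >= len(t):
                continue  # thread already finished
            if t[pos[i]] == "-":
                pos[i] += 1  # stall elapses without using the unit
            elif not unit_busy:
                unit_busy = True  # this thread gets the unit
                pos[i] += 1
        cycles += 1
    return cycles


thread_a = "X--X--X"  # frequent stalls
thread_b = "XX-XX"
print(run_sequential([thread_a, thread_b]))  # prints 12
print(run_smt([thread_a, thread_b]))         # prints 7
```

In this model, two threads that never stall gain nothing from sharing the unit (`run_smt(["XXX", "XXX"])` is 6, the same as running them back to back), which matches the point that the benefit comes from otherwise idle execution resources.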

This technology is transparent to operating systems and programs. Because the logical processors appear as standard separate processors, symmetric multiprocessing (SMP) support in the operating system is all that is required to make use of hyper-threading, although an operating system that is unaware of it may not schedule work across the logical processors as well as it could.

It is possible to optimize operating system behavior on multiprocessor systems that are capable of hyper-threading, for example with the scheduler techniques developed for Linux. Consider an SMP system with two physical processors that are both hyper-threaded (for a total of four logical processors). If the operating system's process scheduler is unaware of hyper-threading, it will treat all four processors as being the same. If only two processes are eligible to run, it might choose to schedule those processes on the two logical processors that happen to belong to one of the physical processors; that processor would become extremely busy while the other would be idle, leading to poorer performance than is possible with better scheduling. This problem can be avoided by improving the scheduler to treat logical processors differently from physical processors; in a sense, this is a limited form of the scheduler changes that are required for NUMA systems.
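The placement problem described above can be sketched as follows. This is a minimal illustration, not the algorithm of any real scheduler: the function name `pick_cpus`, the dictionary-based topology, and the CPU numbering are all assumptions made for the example. The function spreads tasks across physical processors first and only then doubles up on sibling logical processors.

```python
from collections import defaultdict


def pick_cpus(topology, n_tasks):
    """Choose logical CPUs for n_tasks, spreading over physical CPUs.

    topology maps a logical CPU id to the id of the physical
    processor it belongs to (hypothetical numbering).
    """
    by_physical = defaultdict(list)
    for logical, physical in sorted(topology.items()):
        by_physical[physical].append(logical)

    chosen = []
    depth = 0  # 0 = first sibling on each processor, 1 = second, ...
    while len(chosen) < n_tasks:
        progressed = False
        for physical in sorted(by_physical):
            siblings = by_physical[physical]
            if depth < len(siblings) and len(chosen) < n_tasks:
                chosen.append(siblings[depth])
                progressed = True
        if not progressed:
            break  # more tasks than logical CPUs
        depth += 1
    return chosen


# Two physical processors, each exposing two logical processors.
topology = {0: 0, 1: 0, 2: 1, 3: 1}

naive = sorted(topology)[:2]     # [0, 1]: both on physical processor 0
aware = pick_cpus(topology, 2)   # [0, 2]: one task per physical processor
print(naive, aware)
```

With two runnable tasks, the naive choice leaves one physical processor idle, while the topology-aware choice uses both, which is the behavior the paragraph above argues for.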

Security

In May 2005 Colin Percival presented a paper demonstrating that a malicious thread operating with limited privileges can monitor the execution of another thread through that thread's influence on a shared data cache, allowing the theft of cryptographic keys. While the attack described in the paper was demonstrated on an Intel Pentium 4 with hyper-threading, the same techniques could theoretically apply to any system where caches are shared between two or more non-mutually-trusted execution threads; see also side channel attack.
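The general principle of leaking information through a shared cache can be illustrated with a toy model. The sketch below is not the attack from the paper: it assumes a tiny direct-mapped cache, a hypothetical victim that touches cache set i only when bit i of its secret is 1, and a spy that is told directly whether each of its accesses hit or missed. A real attack would have to infer hits and misses from access latency. All class and function names are invented for the example.

```python
class SharedCache:
    """Toy direct-mapped cache shared by two threads."""

    def __init__(self, n_sets):
        self.owner = [None] * n_sets  # who last filled each set

    def access(self, who, address):
        """Access an address; return True on a hit for this thread."""
        s = address % len(self.owner)
        hit = self.owner[s] == who
        self.owner[s] = who  # a miss evicts the other thread's line
        return hit


def victim(cache, secret_bits):
    """Hypothetical victim whose memory accesses depend on a secret."""
    for i, bit in enumerate(secret_bits):
        if bit:
            cache.access("victim", i)


def spy_recover(cache, n_sets, run_victim):
    """Fill the cache, let the victim run, then see what was evicted."""
    for s in range(n_sets):
        cache.access("spy", s)  # fill every set with spy data
    run_victim()
    # A miss on re-access means the victim touched that set.
    return [0 if cache.access("spy", s) else 1 for s in range(n_sets)]


secret = [1, 0, 1, 1, 0, 0, 1, 0]
cache = SharedCache(len(secret))
recovered = spy_recover(cache, len(secret), lambda: victim(cache, secret))
print(recovered == secret)  # prints True
```

The spy never reads the victim's data; it learns the secret only from which of its own cache lines were displaced, which is why sharing a cache between non-mutually-trusted threads is the underlying concern.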

Past

Older NetBurst-based Pentium 4 CPUs use hyper-threading, but the newer cores derived from the Pentium M (Merom, Conroe, and Woodcrest) do not. Hyper-threading is a specialized form of simultaneous multithreading (SMT).

Inefficiencies

More recently, hyper-threading has been criticised as being energy-inefficient. For example, the specialist low-power CPU design company ARM has stated that SMT can use up to 46% more power than dual-CPU designs. Furthermore, ARM claims that SMT increases cache thrashing by 42%, whereas a dual-core design results in a 37% decrease. These considerations are claimed to be the reason Intel dropped SMT from the microarchitecture that succeeded NetBurst.

Present & Future

Intel released Nehalem (Core i7) in November 2008, in which hyper-threading makes a return. Nehalem contains four cores and, with hyper-threading, runs up to eight threads simultaneously.

The Intel Atom is an in-order single-core processor with hyper-threading, for low-power mobile PCs and low-price desktop PCs.

See also

  • Multi-core
  • Barrel processor


External links


  • Intel's
  • from OSDEV Community
  • An from Ars Technica


  • [ftp://download.intel.com/technology/itj/2002/volume06issue01/vol6iss1_hyper_threading_technology.pdf Hyper-Threading Technology Architecture and Microarchitecture], technical description of Hyper-Threading (1.2 MB PDF-file)
  • Enter Patent Number 4,847,755
Security
  • KernelTrap discussion:
Performance problems
  • - Outlines problems of SMT solutions


Sources


Replay: Unknown Features of the NetBurst Core