3DNow!

 
3DNow! is the trade name of a multimedia extension created by AMD for its processors, starting with the K6-2 in 1998. It is an addition of SIMD instructions to the traditional x86 instruction set, designed to improve a CPU's ability to perform the vector processing requirements of many graphic-intensive applications.

History

3DNow! was originally developed as an enhancement to the MMX instruction set. The original idea behind its creation was to extend MMX from operating only on integer math to also accelerating floating-point calculations.

AMD had an especially strong strategic and marketing need to accelerate 3D calculations in the floating-point domain. The K6 processor at the time was not well equipped for intensive floating-point mathematics in comparison to the Intel Pentium II.

The 3DNow! instruction set was created during the late 1990s, when 3D graphics was exploding in popularity because of 3D gaming, and 3D games rely heavily on floating-point arithmetic.

Earlier in the 1990s AMD could easily get by with limited floating-point performance, because the vast majority of software was based on integer calculation, at which the K6 was extremely proficient. 3D gaming and advanced multimedia applications were quickly changing that landscape.

Versions


3DNow!


The first implementation of 3DNow! technology contains 21 new instructions that support SIMD floating-point operations. The 3DNow! data format is packed, single-precision, floating-point. The 3DNow! instruction set also includes operations for SIMD integer operations, data prefetch, and faster MMX-to-floating-point switching. Later, Intel would add similar (but incompatible) instructions to the Pentium III, known as SSE for Streaming SIMD Extensions.
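As a rough illustration of the packed single-precision format, the following sketch models one 64-bit register as a Python integer holding two IEEE 754 single-precision values, with the low element in bits 0-31 and the high element in bits 32-63. The helper names `pack_pair` and `unpack_pair` are hypothetical and exist only for this example; the code models the data layout and does not execute any 3DNow! instruction.

```python
import struct

def pack_pair(low, high):
    """Pack two values as single-precision floats into one 64-bit integer.

    The low element lands in bits 0-31, the high element in bits 32-63.
    """
    return int.from_bytes(struct.pack("<ff", low, high), "little")

def unpack_pair(register):
    """Split a 64-bit integer back into its (low, high) float elements."""
    return struct.unpack("<ff", register.to_bytes(8, "little"))

register = pack_pair(1.0, 2.0)
print(hex(register))          # 0x400000003f800000
print(unpack_pair(register))  # (1.0, 2.0)
```

Here 0x3f800000 is the single-precision encoding of 1.0 and 0x40000000 that of 2.0, so both values travel through the processor in one register.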

3DNow! floating-point instructions
  • PAVGUSB - Packed 8-bit unsigned integer averaging
  • PI2FD - Packed 32-bit integer to floating-point conversion
  • PF2ID - Packed floating-point to 32-bit integer conversion
  • PFCMPGE - Packed floating-point comparison, greater or equal
  • PFCMPGT - Packed floating-point comparison, greater
  • PFCMPEQ - Packed floating-point comparison, equal
  • PFACC - Packed floating-point accumulate
  • PFADD - Packed floating-point addition
  • PFSUB - Packed floating-point subtraction
  • PFSUBR - Packed floating-point reverse subtraction
  • PFMIN - Packed floating-point minimum
  • PFMAX - Packed floating-point maximum
  • PFMUL - Packed floating-point multiplication
  • PFRCP - Packed floating-point reciprocal approximation
  • PFRSQRT - Packed floating-point reciprocal square root approximation
  • PFRCPIT1 - Packed floating-point reciprocal, first iteration step
  • PFRSQIT1 - Packed floating-point reciprocal square root, first iteration step
  • PFRCPIT2 - Packed floating-point reciprocal/reciprocal square root, second iteration step
  • PMULHRW - Packed 16-bit integer multiply with rounding


3DNow! performance-enhancement instructions
  • FEMMS - Faster entry/exit of the MMX or floating-point state
  • PREFETCH/PREFETCHW - Prefetch at least a 32-byte line into L1 data cache
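The element-wise arithmetic instructions in the list above can be sketched as a scalar model. This is a simplified illustration under stated assumptions, not the hardware behavior in every corner case: registers are modeled as (low, high) tuples, the helper names `f32` and `packed` are hypothetical, results are rounded to single precision through `struct`, and NaN, infinity, and overflow handling are ignored.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def packed(op):
    """Lift a scalar operation to an element-wise one on (low, high) pairs."""
    def apply(dest, src):
        return (f32(op(dest[0], src[0])), f32(op(dest[1], src[1])))
    return apply

pfadd = packed(lambda d, s: d + s)
pfsub = packed(lambda d, s: d - s)
pfsubr = packed(lambda d, s: s - d)  # reverse subtraction: source minus destination
pfmul = packed(lambda d, s: d * s)
pfmin = packed(min)
pfmax = packed(max)

a = (1.5, -2.0)
b = (4.0, 0.5)
print(pfadd(a, b))   # (5.5, -1.5)
print(pfsubr(a, b))  # (2.5, 2.5)
print(pfmax(a, b))   # (4.0, 0.5)
```

Each call processes both elements at once, which is the point of the SIMD design: one instruction does the work of two scalar floating-point operations.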


3DNow! extensions


There is little or no evidence that the second version of 3DNow! was ever officially given its own trade name, which has led to some confusion in documentation that refers to this new instruction set. The most common terms are Extended 3DNow!, Enhanced 3DNow! and 3DNow!+. The phrase "Enhanced 3DNow!" can be found in a few locations on the AMD website, but the capitalization of "Enhanced" appears to be either purely grammatical or used for emphasis, and it is applied to processors that may or may not have these extensions. The most notable instance is a benchmark page for the K6-III-P, which does not have these extensions.

This extension to the 3DNow! instruction set was introduced with the first-generation Athlon processors. The Athlon added 5 new 3DNow! instructions and 19 new MMX instructions. Later, the K6-2+ and K6-III+ (both targeted at the mobile market) included the 5 new 3DNow! instructions, leaving out the 19 new MMX instructions. The new 3DNow! instructions were added to speed up digital signal processing (DSP). The new MMX instructions were added to speed up streaming media.

3DNow! or MMX extensions?

The 19 new MMX instructions are a subset of Intel's SSE1 instruction set. In its technical manuals, AMD separates these instructions from the 3DNow! extensions. In AMD customer product literature, however, the separation is less clear, and the benefits of all 24 new instructions are credited to enhanced 3DNow! technology. This has led programmers to come up with their own names for the 19 new MMX instructions. The most common appears to be Integer SSE (ISSE); SSEMMX and MMX2 are also found in video filter documentation from the public domain sector. Note that ISSE can also refer to Internet SSE, an early name for SSE.

3DNow! extension DSP instructions
  • PF2IW - Packed floating-point to integer word conversion with sign extend
  • PI2FW - Packed integer word to floating-point conversion
  • PFNACC - Packed floating-point negative accumulate
  • PFPNACC - Packed floating-point mixed positive-negative accumulate
  • PSWAPD - Packed swap doubleword


MMX extension instructions (Integer SSE)
  • MASKMOVQ - Streaming (cache bypass) store using byte mask
  • MOVNTQ - Streaming (cache bypass) store
  • PAVGB - Packed average of unsigned byte
  • PAVGW - Packed average of unsigned word
  • PMAXSW - Packed maximum signed word
  • PMAXUB - Packed maximum unsigned byte
  • PMINSW - Packed minimum signed word
  • PMINUB - Packed minimum unsigned byte
  • PMULHUW - Packed multiply high unsigned word
  • PSADBW - Packed sum of absolute byte differences
  • PSHUFW - Packed shuffle word
  • PEXTRW - Extract word into integer register
  • PINSRW - Insert word from integer register
  • PMOVMSKB - Move byte mask to integer register
  • PREFETCHNTA - Prefetch using the NTA reference
  • PREFETCHT0 - Prefetch using the T0 reference
  • PREFETCHT1 - Prefetch using the T1 reference
  • PREFETCHT2 - Prefetch using the T2 reference
  • SFENCE - Store fence
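Two of the integer instructions above, PAVGB and PSADBW, are typical of video and streaming workloads. The sketch below models them on 8-byte sequences standing in for 64-bit MMX registers. It is an illustration under assumptions: the sample pixel values are made up, and PSADBW is shown returning a plain integer rather than a register whose low word holds the sum.

```python
def pavgb(a, b):
    """Packed average of unsigned bytes, rounding halves upward."""
    return bytes((x + y + 1) >> 1 for x, y in zip(a, b))

def psadbw(a, b):
    """Sum of absolute differences across all eight unsigned bytes."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical rows of eight 8-bit pixel values.
row_a = bytes([0, 1, 2, 3, 250, 251, 252, 255])
row_b = bytes([1, 1, 4, 4, 255, 250, 0, 255])

print(list(pavgb(row_a, row_b)))  # [1, 1, 3, 4, 253, 251, 126, 255]
print(psadbw(row_a, row_b))       # 262
```

Averaging is the core of pixel interpolation, and the sum of absolute differences is a common block-matching metric in motion estimation, which fits the streaming-media purpose the article gives for these instructions.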


3DNow! Professional


3DNow! Professional does not appear to be an extension to the 3DNow! instruction set but rather a trade name created to indicate processors that combine 3DNow! technology with a complete SSE instruction set (such as SSE1, SSE2 or SSE3). The first processor to match this description is the Athlon XP, which added the remainder of the SSE1 instruction set missing from earlier Athlon processors. In total it supports the 21 original 3DNow! instructions, the 5 3DNow! extension DSP instructions, the 19 MMX extension instructions, and 52 additional SSE instructions for complete SSE1 compatibility.
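Software usually tells these feature levels apart through CPUID feature flags. The sketch below decodes such flags from two EDX register values that are assumed to have been read elsewhere. The bit positions are not part of this article; they follow AMD's CPUID documentation (extended function 0x80000001: bit 31 for 3DNow!, bit 30 for the 3DNow! extensions, bit 22 for the MMX extensions; standard function 1: bit 25 for SSE). The function name `decode_features` and the sample register values are hypothetical.

```python
BIT_SSE = 25        # CPUID standard function 1, EDX
BIT_MMX_EXT = 22    # CPUID extended function 0x80000001, EDX
BIT_3DNOW_EXT = 30  # CPUID extended function 0x80000001, EDX
BIT_3DNOW = 31      # CPUID extended function 0x80000001, EDX

def bit(value, position):
    """Return True if the given bit of value is set."""
    return bool((value >> position) & 1)

def decode_features(std_edx, ext_edx):
    """Map raw EDX values to the feature names used in this article."""
    features = {
        "3DNow!": bit(ext_edx, BIT_3DNOW),
        "3DNow! extensions": bit(ext_edx, BIT_3DNOW_EXT),
        "MMX extensions": bit(ext_edx, BIT_MMX_EXT),
        "SSE": bit(std_edx, BIT_SSE),
    }
    # Treated here as the combination the article describes: 3DNow! plus full SSE.
    features["3DNow! Professional"] = features["3DNow!"] and features["SSE"]
    return features

# Made-up register values resembling a K6-2 and an Athlon XP.
k6_2_like = decode_features(0, 1 << 31)
athlon_xp_like = decode_features(1 << 25, (1 << 31) | (1 << 30) | (1 << 22))

print(k6_2_like["3DNow! Professional"])       # False
print(athlon_xp_like["3DNow! Professional"])  # True
```

Taking the register values as parameters keeps the sketch portable; obtaining them requires executing the CPUID instruction, which is outside what plain Python offers.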

3DNow! and the Geode GX/LX


The Geode GX and Geode LX added 2 new 3DNow! instructions that are absent from all other processors.

3DNow! Professional instructions unique to the Geode GX/LX
  • PFRSQRTV - Reciprocal square root approximation for a pair of 32-bit floats
  • PFRCPV - Reciprocal approximation for a pair of 32-bit floats


Advantages and disadvantages

One advantage of 3DNow! is that it can add the two numbers stored in the same register (and, with the extension instructions, subtract them). With SSE, each number can only be combined with a number in the same position in another register. Combining elements within a single register, known as a horizontal operation in Intel terminology, was the major addition to the SSE3 instruction set.
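The difference between an element-wise (vertical) add and a horizontal accumulate can be sketched with a scalar model of PFADD, PFMUL and PFACC. This is an illustration under assumptions: registers are (low, high) tuples, single-precision rounding is omitted, and the helper `dot4` is a hypothetical example of how the horizontal form shortens a four-element dot product.

```python
def pfadd(dest, src):
    """Vertical add: elements in the same position are combined."""
    return (dest[0] + src[0], dest[1] + src[1])

def pfmul(dest, src):
    """Vertical multiply."""
    return (dest[0] * src[0], dest[1] * src[1])

def pfacc(dest, src):
    """Horizontal accumulate: the two elements of each register are summed."""
    return (dest[0] + dest[1], src[0] + src[1])

def dot4(a, b):
    """Dot product of two 4-element vectors held in two registers each."""
    products_low = pfmul((a[0], a[1]), (b[0], b[1]))
    products_high = pfmul((a[2], a[3]), (b[2], b[3]))
    partial = pfacc(products_low, products_high)  # (p0 + p1, p2 + p3)
    total = pfacc(partial, partial)               # low element holds the full sum
    return total[0]

print(pfadd((1.0, 2.0), (10.0, 20.0)))  # (11.0, 22.0)
print(pfacc((1.0, 2.0), (10.0, 20.0)))  # (3.0, 30.0)
print(dot4((1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)))  # 70.0
```

Without a horizontal operation, the final reduction would need extra shuffle or unpack steps to line the partial sums up in matching positions before adding them.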

A disadvantage of 3DNow! compared to SSE is that it stores only two numbers in a register, as opposed to four in SSE. However, 3DNow! instructions can generally be executed with lower latency and higher throughput than SSE instructions.

3DNow! also shares the same physical registers as MMX, while SSE has its own independent registers. Because these MMX and 3DNow! registers are also used by the standard x87 FPU, 3DNow! instructions and x87 instructions cannot be executed simultaneously. However, because they are aliased to the x87 FPU registers, the 3DNow! and MMX register states can be saved and restored by the traditional x87 FNSAVE and FRSTOR instructions. Using the pre-existing x87 instructions meant that no operating system modifications had to be made to support 3DNow!.
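The aliasing described above can be sketched with a toy register file. This is a heavily simplified model under stated assumptions: the class `X87RegisterFile` and its method names are hypothetical, each 80-bit physical register is a Python integer, a 64-bit MMX or 3DNow! write fills the low 64 bits and sets the upper 16 bits to ones, and the save and restore methods copy only the data registers, whereas the real x87 save image also holds control, status and tag words.

```python
MASK64 = (1 << 64) - 1

class X87RegisterFile:
    """Toy model of eight 80-bit x87 registers shared with MMX and 3DNow!."""

    def __init__(self):
        self.physical = [0] * 8

    def write_mmx(self, index, value):
        # The 64-bit value occupies the low bits; the top 16 bits become ones.
        self.physical[index] = (0xFFFF << 64) | (value & MASK64)

    def read_mmx(self, index):
        return self.physical[index] & MASK64

    def fnsave(self):
        """Return a copy of the register contents (the saved state image)."""
        return tuple(self.physical)

    def frstor(self, image):
        """Reload the register contents from a saved state image."""
        self.physical = list(image)

regs = X87RegisterFile()
regs.write_mmx(0, 0x400000003F800000)  # the packed pair (1.0, 2.0)
saved = regs.fnsave()                  # an OS unaware of 3DNow! saves x87 state
regs.write_mmx(0, 0)                   # another task overwrites the register
regs.frstor(saved)
print(hex(regs.read_mmx(0)))           # 0x400000003f800000
```

Because the packed data lives inside the x87 registers, an operating system that only knows how to save x87 state still preserves it across a task switch.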

By contrast, saving and restoring the state of the SSE registers requires the newly added FXSAVE and FXRSTOR instructions. The FX* instructions are an upgrade to the older x87 save and restore instructions because they save not only the SSE register state but also the x87 register state, and therefore the MMX and 3DNow! registers too.

On AMD Athlon XP and K8-based cores (such as the Athlon 64), assembly programmers have noted that it is possible to use both 3DNow! and SSE at the same time. Although the two instruction sets share the same functional unit, combining them can improve performance by relieving some register pressure, but it is difficult to accomplish.

Processors supporting 3DNow!

  • AMD processors from the K6-2 onward (AMD deprecated 3DNow! in 2010, and later designs such as Bulldozer and Bobcat no longer support it)
  • National Semiconductor Geode, later AMD Geode
  • VIA C3 (also known as Cyrix III) "Samuel", "Ezra", and "Eden" cores
  • IDT WinChip 2

