Digital Signal Processor


A digital signal processor (DSP) is a specialized microprocessor designed specifically for digital signal processing, generally in real-time computing.

Typical characteristics


Digital signal processing algorithms typically require a large number of mathematical operations to be performed quickly on a set of data. Signals are converted from analog to digital, manipulated digitally, and then converted back to analog form. Many DSP applications have constraints on latency; that is, for the system to work, the DSP operation must be completed within some time constraint.
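
The latency constraint can be made concrete as a per-sample time budget. The following is a minimal sketch, not taken from the article: the function names are hypothetical, and the 48 kHz sample rate and 100 MHz clock in the usage note are illustrative assumptions.

```python
def sample_period_us(sample_rate_hz: float) -> float:
    """Time available to process one sample, in microseconds."""
    return 1e6 / sample_rate_hz


def cycles_per_sample(clock_hz: float, sample_rate_hz: float) -> int:
    """Whole processor cycles available between two consecutive samples."""
    return int(clock_hz // sample_rate_hz)


def meets_deadline(cycles_needed: int, clock_hz: float, sample_rate_hz: float) -> bool:
    """True if the per-sample work fits inside one sample period."""
    return cycles_needed <= cycles_per_sample(clock_hz, sample_rate_hz)
```

With an assumed 48 kHz audio stream and a 100 MHz clock, `cycles_per_sample(100e6, 48_000)` gives a budget of 2083 cycles per sample; an algorithm that needs more than that per sample cannot keep up in real time on such a processor.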

Most general-purpose microprocessors and operating systems can execute DSP algorithms successfully, but they are not well suited to applications such as mobile telephones and PDAs because of power-supply and space constraints. A specialized digital signal processor, however, tends to provide a lower-cost solution with better performance and lower latency.

The architecture of a digital signal processor is optimized specifically for digital signal processing work. Some useful features for optimizing DSP algorithms are outlined below.

Architecture


  • Hardware modulo addressing, allowing circular buffers to be implemented without having to constantly test for wrapping.
  • A memory architecture designed for streaming data, using direct memory access (DMA) extensively.
  • Separate program and data memories (Harvard architecture)
  • Special SIMD (single instruction, multiple data) operations
  • Special arithmetic operations, such as fast multiply-accumulates (MACs). Many fundamental DSP algorithms, such as FIR filters or the fast Fourier transform (FFT), depend heavily on multiply-accumulate performance.
  • Bit-reversed addressing, a special addressing mode useful only for calculating FFTs
  • Deliberate exclusion of a memory management unit. DSPs frequently use multi-tasking operating systems, but have no support for virtual memory or memory protection. Operating systems that use virtual memory require more time for context switching among processes, which increases latency.
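
Two of the features listed above, modulo addressing and the multiply-accumulate, meet in the inner loop of an FIR filter. The sketch below emulates them in software for illustration only. The class name `CircularFIR` is hypothetical, and the `%` operator stands in for the wrap-around that a DSP's address generator would perform in hardware.

```python
class CircularFIR:
    """FIR filter whose delay line is a fixed-size circular buffer."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.delay = [0.0] * len(self.coeffs)  # fixed-size delay line
        self.head = 0                          # index of the newest sample

    def step(self, x):
        """Push one input sample and return one output sample."""
        n = len(self.coeffs)
        # Modulo addressing: advance the write index with wrap-around
        # instead of shifting every stored sample.
        self.head = (self.head + 1) % n
        self.delay[self.head] = x

        acc = 0.0
        for k in range(n):
            # coeffs[k] pairs with the sample received k steps ago.
            acc += self.coeffs[k] * self.delay[(self.head - k) % n]  # one MAC
        return acc
```

Feeding a unit impulse through `CircularFIR([0.5, 0.25, 0.25])` returns the coefficients in order, which is a quick check that the indexing is right. On a DSP with hardware modulo addressing and a single-cycle MAC, each pass of the inner loop can ideally cost one cycle, with no explicit wrap test.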


Program flow


  • Floating-point unit integrated directly into the datapath
  • Pipelined architecture
  • Highly parallel multiplier-accumulators (MAC units)
  • Hardware-controlled looping, to reduce or eliminate the overhead required for looping operations


Memory architecture


  • DSPs often use special memory architectures that are able to fetch multiple data and/or instructions at the same time:
    • Harvard architecture
    • Modified von Neumann architecture
  • Use of direct memory access
  • Memory-address calculation unit


Data operations


  • Saturation arithmetic, in which operations that overflow clamp to the maximum (or minimum) value the register can hold rather than wrapping around (maximum + 1 does not wrap to minimum, as in many general-purpose CPUs, but stays at maximum). Sometimes various sticky-bit operation modes are available.
  • Fixed-point arithmetic is often used to speed up arithmetic processing
  • Single-cycle operations to increase the benefits of pipelining
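
The difference between saturating and wrapping arithmetic, and the way fixed-point values are scaled, can be sketched in a few lines. This is an illustration under stated assumptions rather than a description of any particular DSP: it assumes 16-bit registers and the common Q15 format (value = integer / 32768), and all function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(value: int) -> int:
    """Clamp to the 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, value))


def wrap16(value: int) -> int:
    """Two's-complement wrap-around, as on many general-purpose CPUs."""
    return (value + 32768) % 65536 - 32768


def sat_add16(a: int, b: int) -> int:
    """Saturating 16-bit addition."""
    return saturate16(a + b)


def float_to_q15(x: float) -> int:
    """Convert a value in [-1, 1) to Q15, saturating at the range limits."""
    return saturate16(round(x * 32768))


def q15_mul(a: int, b: int) -> int:
    """Q15 multiply: widen the product, round, shift back, saturate."""
    return saturate16((a * b + (1 << 14)) >> 15)
```

Here `sat_add16(32767, 1)` stays at 32767, whereas `wrap16(32767 + 1)` jumps to -32768; in an audio signal the wrapped result is a full-scale click, while the saturated one is mild clipping. `q15_mul(float_to_q15(0.5), float_to_q15(0.5))` returns 8192, which is 0.25 in Q15.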


Instruction sets


  • Multiply-accumulate (MAC, including the fused multiply-add, FMA, variant) operations, which are used extensively in all kinds of matrix operations, such as convolution for filtering, dot product, or even polynomial evaluation (see Horner scheme)
  • Instructions to increase parallelism: SIMD, VLIW, superscalar architecture
  • Specialized instructions for modulo addressing in ring buffers and bit-reversed addressing mode for FFT cross-referencing
  • Digital signal processors sometimes use time-stationary encoding to simplify hardware and increase coding efficiency.
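
Two of the items above lend themselves to a short software illustration: the index order that bit-reversed addressing produces for a radix-2 FFT, and polynomial evaluation by the Horner scheme, in which every step is one multiply-accumulate. The function names are hypothetical, and the code only mimics in software what the specialized instructions do in hardware.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n: int) -> list:
    """Index order for an n-point radix-2 FFT (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]


def horner(coeffs, x):
    """Evaluate a polynomial given highest-order coefficient first."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c  # one multiply-accumulate per coefficient
    return acc
```

For an 8-point transform, `bit_reversed_order(8)` yields `[0, 4, 2, 6, 1, 5, 3, 7]`. `horner([2, 3, 4], 5)` evaluates 2x² + 3x + 4 at x = 5 with three multiply-accumulate steps and returns 69.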


History


Prior to the advent of the stand-alone DSP chips discussed below, most DSP applications were implemented using bit-slice processors. The AMD 2901 bit-slice chip with its family of components was a very popular choice. There were reference designs from AMD, but very often the specifics of a particular design were application specific. These bit-slice architectures would sometimes include a peripheral multiplier chip. Examples of these multipliers were a series from TRW including the TRW1008 and TRW1010, some of which included an accumulator, providing the requisite multiply-accumulate (MAC) function.

In 1978, Intel released the 2920 as an "analog signal processor". It had an on-chip ADC/DAC with an internal signal processor, but it didn't have a hardware multiplier and was not successful in the market. In 1979, AMI released the S2811. It was designed as a microprocessor peripheral, and it had to be initialized by the host. The S2811 was likewise not successful in the market.

In 1980 the first stand-alone, complete DSPs, the NEC µPD7720 and the AT&T DSP1, were presented at the IEEE International Solid-State Circuits Conference '80. Both processors were inspired by research in public switched telephone network (PSTN) telecommunications.

The Altamira DX-1 was another early DSP, utilizing quad integer pipelines with delayed branches and branch prediction.

The first DSP produced by Texas Instruments (TI), the TMS32010, presented in 1983, proved to be an even bigger success. It was based on the Harvard architecture, and so had separate instruction and data memory. It already had a special instruction set, with instructions like load-and-accumulate or multiply-and-accumulate. It could work on 16-bit numbers and needed 390 ns for a multiply-add operation. TI is now the market leader in general-purpose DSPs. Another successful design was the Motorola 56000.

About five years later, the second generation of DSPs began to spread. They had three memories for storing two operands simultaneously and included hardware to accelerate tight loops; they also had an addressing unit capable of loop addressing. Some of them operated on 24-bit variables, and a typical model required only about 21 ns for a multiply-accumulate (MAC). Members of this generation include the AT&T DSP16A and the Motorola DSP56001.

The main improvement in the third generation was the appearance of application-specific units and instructions in the data path, or sometimes as coprocessors. These units allowed direct hardware acceleration of very specific but complex mathematical problems, like the Fourier transform or matrix operations. Some chips, like the Motorola MC68356, even included more than one processor core to work in parallel. Other DSPs from 1995 are the TI TMS320C541 and the TMS320C80.

The fourth generation is best characterized by changes in the instruction set and in instruction encoding and decoding. SIMD and MMX extensions were added, and VLIW and superscalar architectures appeared. As always, clock speeds increased, and a 3 ns MAC became possible.

Modern DSPs


Modern signal processors yield greater performance. This is due in part to both technological and architectural advances such as smaller design rules, fast-access two-level caches, (E)DMA circuitry and wider bus systems. Not all DSPs provide the same speed, and many kinds of signal processors exist, each better suited to a specific task, ranging in price from about US$1.50 to US$300. A Texas Instruments C6000 series DSP clocks at 1.2 GHz and implements separate instruction and data caches as well as an 8 MiB second-level cache, and its I/O is fast thanks to its 64 EDMA channels. The top models are capable of as many as 8000 MIPS (million instructions per second), use VLIW (very long instruction word) encoding, perform eight operations per clock cycle and are compatible with a broad range of external peripherals and various buses (PCI, serial, etc.).

Another high-end signal processor manufacturer today is Freescale. The company provides the MSC81xx family of multi-core DSPs, which is based on StarCore architecture processors. The latest, the MSC8144, combines four programmable SC3400 StarCore DSP cores, each running at 1 GHz. The SC3400 scored higher than any other programmable DSP at 1 GHz in the BDTIsimMark2000 results published by Berkeley Design Technology, Inc. (BDTI).

Another major signal processor manufacturer today is Analog Devices. The company provides a broad range of DSPs, but its main portfolio is multimedia processors, such as codecs, filters and digital-analog converters. Its SHARC-based processors range in performance from 66 MHz/198 MFLOPS (million floating-point operations per second) to 400 MHz/2400 MFLOPS. Some models even support multiple multipliers and ALUs, SIMD instructions and audio processing-specific components and peripherals. Another product of the company is the Blackfin family of embedded digital signal processors, with models like the ADSP-BF531 to ADSP-BF536. These processors combine the features of a DSP with those of a general-purpose processor. As a result, they can run simple operating systems like µCLinux, velOSity and Nucleus RTOS while operating relatively efficiently on real-time data.

Another player is NXP Semiconductors, with DSPs based on TriMedia VLIW technology, optimized for audio and video processing. In some products the DSP core is hidden as a fixed-function block inside an SoC, but NXP also provides a range of flexible single-core media processors with a complete software development kit and a library of codecs and filters. The TriMedia media processors support both fixed-point and floating-point arithmetic, and have specific instructions to deal with complex filters and entropy coding.

Most DSPs use fixed-point arithmetic, because in real-world signal processing the additional range provided by floating point is not needed, and there is a large speed and cost benefit due to reduced hardware complexity. Floating-point DSPs may be invaluable in applications where a wide dynamic range is required. Product developers might also use floating-point DSPs to reduce the cost and complexity of software development in exchange for more expensive hardware, since it is generally easier to implement algorithms in floating point.
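
The "additional range" of floating point can be put in numbers. The sketch below is a rough comparison under stated assumptions: it measures dynamic range as the ratio of the largest representable magnitude to the smallest step (fixed point) or the smallest normalized value (floating point), and it assumes IEEE 754 single precision for the floating-point case. The function names are hypothetical.

```python
import math


def fixed_point_dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in decibels."""
    return 20.0 * math.log10(2 ** bits)


def float32_dynamic_range_db() -> float:
    """Largest finite value over smallest normalized value, in decibels."""
    largest = (2 - 2 ** -23) * 2.0 ** 127  # assumed IEEE 754 single precision
    smallest_normal = 2.0 ** -126
    return 20.0 * math.log10(largest / smallest_normal)
```

A 16-bit fixed-point word gives roughly 96 dB, which is adequate for many audio and telecom signals, while single-precision floating point spans on the order of 1500 dB. That gap is why fixed-point code usually needs careful manual scaling where floating-point code does not.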

Generally, DSPs are dedicated integrated circuits; however, DSP functionality can also be realized using field-programmable gate array (FPGA) chips.

Embedded general-purpose RISC processors are becoming increasingly DSP-like in functionality. For example, the ARM Cortex-A8 has a 128-bit-wide SIMD unit that delivers strong 16- and 8-bit performance on industry-standard benchmarks.

See also


  • Digital Signal Controller

