Fast Fourier transform - AbsoluteAstronomy.com

A fast Fourier transform (FFT) is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. "The FFT has been called the most important numerical algorithm of our lifetime (Strang, 1994)." (Kent & Read 2002, 61) There are many distinct FFT algorithms involving a wide range of mathematics, from simple complex-number arithmetic to group theory and number theory; this article gives an overview of the available techniques and some of their general properties, while the specific algorithms are described in subsidiary articles linked below.

A DFT decomposes a sequence of values into components of different frequencies. This operation is useful in many fields (see discrete Fourier transform for properties and applications of the transform) but computing it directly from the definition is often too slow to be practical. An FFT is a way to compute the same result more quickly: computing a DFT of N points in the naive way, using the definition, takes O(N²) arithmetical operations, while an FFT can compute the same result in only O(N log N) operations. The difference in speed can be substantial, especially for long data sets where N may be in the thousands or millions: in practice, the computation time can be reduced by several orders of magnitude in such cases, and the improvement is roughly proportional to N / log(N). This huge improvement made many DFT-based algorithms practical; FFTs are of great importance to a wide variety of applications, from digital signal processing and solving partial differential equations to algorithms for quick multiplication of large integers.

The best-known FFT algorithms depend upon the factorization of N, but (contrary to popular misconception) there are FFTs with O(N log N) complexity for all N, even for prime N. Many FFT algorithms only depend on the fact that $e^{-2\pi i/N}$ is an Nth primitive root of unity, and thus can be applied to analogous transforms over any finite field, such as number-theoretic transforms.
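As a rough illustration of a transform over a finite field, the sketch below replaces the complex root of unity with an integer root modulo a prime. The modulus 17, the root 4, and the function names `ntt` and `intt` are choices made for this example only; it evaluates the sums naively in O(N²) rather than with a fast algorithm.

```python
def ntt(values, root, modulus):
    """Naive number-theoretic transform: a DFT whose root of unity is an integer mod `modulus`."""
    n = len(values)
    return [
        sum(values[j] * pow(root, j * k, modulus) for j in range(n)) % modulus
        for k in range(n)
    ]


def intt(values, root, modulus):
    """Inverse transform: use the inverse root and multiply by the inverse of N (Python 3.8+)."""
    n = len(values)
    inv_root = pow(root, -1, modulus)
    inv_n = pow(n, -1, modulus)
    return [(inv_n * v) % modulus for v in ntt(values, inv_root, modulus)]


MODULUS = 17
ROOT = 4  # 4 is a primitive 4th root of unity mod 17: 4**2 = 16 = -1 and 4**4 = 1 (mod 17)

data = [1, 2, 3, 4]
spectrum = ntt(data, ROOT, MODULUS)
recovered = intt(spectrum, ROOT, MODULUS)
```

Because all arithmetic is done on integers modulo a prime, the round trip is exact, with no round-off error.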

Since the inverse DFT is the same as the DFT, but with the opposite sign in the exponent and a 1/N factor, any FFT algorithm can easily be adapted for it.
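One common way to reuse a forward transform for the inverse is to conjugate the input, apply the forward transform, conjugate the result, and divide by N. The sketch below assumes a forward routine with the sign convention $e^{-2\pi i kn/N}$; the names `dft` and `inverse_dft` are illustrative, and the naive `dft` stands in for any FFT.

```python
import cmath


def dft(x):
    """Forward DFT evaluated directly from the definition (stand-in for any FFT)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def inverse_dft(X, forward=dft):
    """Inverse DFT built from a forward transform.

    Conjugating the input and the output flips the sign of the exponent;
    dividing by N supplies the 1/N factor.
    """
    n = len(X)
    conjugated = [complex(c).conjugate() for c in X]
    return [v.conjugate() / n for v in forward(conjugated)]


signal = [1.0, 2.0, -1.0, 0.5]
round_trip = inverse_dft(dft(signal))
```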

Definition and speed

An FFT computes the DFT and produces exactly the same result as evaluating the DFT definition directly; the only difference is that an FFT is much faster. (In the presence of round-off error, many FFT algorithms are also much more accurate than evaluating the DFT definition directly, as discussed below.)

Let x₀, ..., x_{N−1} be complex numbers. The DFT is defined by the formula

$$X_k = \sum_{n=0}^{N-1} x_n e^{-2\pi i\, kn/N}, \qquad k = 0, \dots, N-1.$$

Evaluating this definition directly requires O(N²) operations: there are N outputs X_k, and each output requires a sum of N terms. An FFT is any method to compute the same results in O(N log N) operations. More precisely, all known FFT algorithms require Θ(N log N) operations (technically, O only denotes an upper bound), although there is no known proof that better complexity is impossible.
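For reference, direct evaluation of the definition might look like the following minimal sketch, which assumes the usual convention $X_k = \sum_n x_n e^{-2\pi i kn/N}$. The two nested loops make the O(N²) cost visible; the name `naive_dft` is illustrative.

```python
import cmath


def naive_dft(x):
    """Evaluate the DFT sums directly: N outputs, each a sum of N terms."""
    n_points = len(x)
    result = []
    for k in range(n_points):
        total = 0j
        for n in range(n_points):
            total += x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
        result.append(total)
    return result


# A constant input has all of its energy in the zero-frequency output.
spectrum = naive_dft([1, 1, 1, 1])
```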

To illustrate the savings of an FFT, consider the count of complex multiplications and additions. Evaluating the DFT's sums directly involves N² complex multiplications and N(N − 1) complex additions [of which O(N) operations can be saved by eliminating trivial operations such as multiplications by 1]. The well-known radix-2 Cooley–Tukey algorithm, for N a power of 2, can compute the same result with only (N/2) log₂ N complex multiplies (again, ignoring simplifications of multiplications by 1 and similar) and N log₂N complex additions. In practice, actual performance on modern computers is usually dominated by factors other than arithmetic and is a complicated subject (see, e.g., Frigo & Johnson, 2005), but the overall improvement from O(N²) to O(N log N) remains.
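The operation counts quoted above can be tabulated with a few lines of arithmetic. This is only a sketch of those formulas (ignoring the trivial-multiplication savings, as the text does); the function names are made up for the example.

```python
def direct_dft_counts(n):
    """Complex (multiplications, additions) for evaluating the DFT sums directly."""
    return n * n, n * (n - 1)


def radix2_counts(n):
    """Complex (multiplications, additions) for radix-2 Cooley-Tukey, n a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    stages = n.bit_length() - 1  # equals log2(n) for a power of two
    return (n // 2) * stages, n * stages


for n in (8, 1024, 2**20):
    d_mul, d_add = direct_dft_counts(n)
    f_mul, f_add = radix2_counts(n)
    print(n, d_mul + d_add, f_mul + f_add)
```

For N = 1024, for instance, the direct sums use 1,048,576 complex multiplications against 5,120 for radix-2.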

Cooley–Tukey algorithm

By far the most common FFT is the Cooley–Tukey algorithm. This is a divide and conquer algorithm that recursively breaks down a DFT of any composite size N = N₁N₂ into many smaller DFTs of sizes N₁ and N₂, along with O(N) multiplications by complex roots of unity traditionally called twiddle factors (after Gentleman and Sande, 1966).

This method (and the general idea of an FFT) was popularized by a publication of J. W. Cooley and J. W. Tukey in 1965, but it was later discovered (Heideman & Burrus, 1984) that those two authors had independently re-invented an algorithm known to Carl Friedrich Gauss around 1805 (and subsequently rediscovered several times in limited forms).

The most well-known use of the Cooley–Tukey algorithm is to divide the transform into two pieces of size N/2 at each step, and is therefore limited to power-of-two sizes, but any factorization can be used in general (as was known to both Gauss and Cooley/Tukey). These are called the radix-2 and mixed-radix cases, respectively (and other variants such as the split-radix FFT have their own names as well). Although the basic idea is recursive, most traditional implementations rearrange the algorithm to avoid explicit recursion. Also, because the Cooley–Tukey algorithm breaks the DFT into smaller DFTs, it can be combined arbitrarily with any other algorithm for the DFT, such as those described below.
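The radix-2 case can be sketched as a short recursive routine: split the input into even- and odd-indexed halves, transform each, and combine them with twiddle factors. This is a textbook-style illustration written with explicit recursion for clarity, not an optimized implementation; the name `fft_radix2` is an assumption of the example.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT for power-of-two lengths."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # DFT of size N/2 over even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of size N/2 over odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor exp(-2*pi*i*k/N) applied to the odd half
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


spectrum = fft_radix2([1, 2, 3, 4, 0, 0, 0, 0])
```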

Other FFT algorithms

There are other FFT algorithms distinct from Cooley–Tukey. For N = N₁N₂ with coprime N₁ and N₂, one can use the Prime-Factor (Good-Thomas) algorithm (PFA), based on the Chinese Remainder Theorem, to factorize the DFT similarly to Cooley–Tukey but without the twiddle factors. The Rader-Brenner algorithm (1976) is a Cooley–Tukey-like factorization but with purely imaginary twiddle factors, reducing multiplications at the cost of increased additions and reduced numerical stability; it was later superseded by the split-radix variant of Cooley–Tukey (which achieves the same multiplication count but with fewer additions and without sacrificing accuracy). Algorithms that recursively factorize the DFT into smaller operations other than DFTs include the Bruun and QFT algorithms. (The Rader-Brenner and QFT algorithms were proposed for power-of-two sizes, but it is possible that they could be adapted to general composite N. Bruun's algorithm applies to arbitrary even composite sizes.) Bruun's algorithm, in particular, is based on interpreting the FFT as a recursive factorization of the polynomial $z^N - 1$, here into real-coefficient polynomials of the form $z^M - 1$ and $z^{2M} + a z^M + 1$.

Another polynomial viewpoint is exploited by the Winograd algorithm, which factorizes $z^N - 1$ into cyclotomic polynomials; these often have coefficients of 1, 0, or −1, and therefore require few (if any) multiplications, so Winograd can be used to obtain minimal-multiplication FFTs and is often used to find efficient algorithms for small factors. Indeed, Winograd showed that the DFT can be computed with only O(N) irrational multiplications, leading to a proven achievable lower bound on the number of multiplications for power-of-two sizes; unfortunately, this comes at the cost of many more additions, a tradeoff no longer favorable on modern processors with hardware multipliers. In particular, Winograd also makes use of the PFA as well as an algorithm by Rader for FFTs of prime sizes.

Rader's algorithm, exploiting the existence of a generator for the multiplicative group modulo prime N, expresses a DFT of prime size N as a cyclic convolution of (composite) size N − 1, which can then be computed by a pair of ordinary FFTs via the convolution theorem (although Winograd uses other convolution methods). Another prime-size FFT is due to L. I. Bluestein, and is sometimes called the chirp-z algorithm; it also re-expresses a DFT as a convolution, but this time of the same size (which can be zero-padded to a power of two and evaluated by radix-2 Cooley–Tukey FFTs, for example), via the identity $nk = -(k-n)^2/2 + n^2/2 + k^2/2$.
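A minimal sketch of Bluestein's approach follows, assuming the identity $nk = -(k-n)^2/2 + n^2/2 + k^2/2$: the input is multiplied by a "chirp", convolved with the conjugate chirp using zero-padded power-of-two FFTs, and multiplied by the chirp again. The helper `fft_pow2` and the name `bluestein_dft` are illustrative, and the code favors readability over speed.

```python
import cmath


def fft_pow2(x, sign=-1):
    """Recursive radix-2 FFT; sign=+1 gives the unnormalized inverse transform."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft_pow2(x[0::2], sign), fft_pow2(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def bluestein_dft(x):
    """DFT of arbitrary length N (including prime N) via a chirp convolution."""
    n = len(x)
    # chirp[j] = exp(-i*pi*j^2/N); j*j is reduced mod 2N, which leaves the value unchanged
    chirp = [cmath.exp(-1j * cmath.pi * ((j * j) % (2 * n)) / n) for j in range(n)]
    m = 1
    while m < 2 * n - 1:  # pad so the cyclic convolution does not wrap around
        m *= 2
    a = [x[j] * chirp[j] for j in range(n)] + [0j] * (m - n)
    b = [0j] * m
    for j in range(n):
        b[j] = chirp[j].conjugate()
        b[-j] = chirp[j].conjugate()  # negative lags wrap to the end; b[-0] is b[0]
    fa, fb = fft_pow2(a), fft_pow2(b)
    conv = fft_pow2([p * q for p, q in zip(fa, fb)], sign=+1)
    return [chirp[k] * conv[k] / m for k in range(n)]


spectrum = bluestein_dft([1.0, 2.0, 3.0, 4.0, 5.0])  # N = 5 is prime
```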

FFT algorithms specialized for real and/or symmetric data

In many applications, the input data for the DFT are purely real, in which case the outputs satisfy the symmetry

$$X_{N-k} = X_k^*$$

and efficient FFT algorithms have been designed for this situation (see e.g. Sorensen, 1987). One approach consists of taking an ordinary algorithm (e.g. Cooley–Tukey) and removing the redundant parts of the computation, saving roughly a factor of two in time and memory. Alternatively, it is possible to express an even-length real-input DFT as a complex DFT of half the length (whose real and imaginary parts are the even/odd elements of the original real data), followed by O(N) post-processing operations.
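The half-length approach can be sketched as follows. The even-indexed samples become the real part and the odd-indexed samples the imaginary part of a complex sequence of length N/2; the two interleaved spectra are then separated using the conjugate symmetry of real-input DFTs and recombined with one twiddle factor per output pair. The naive `dft` here stands in for any FFT, and all names are illustrative.

```python
import cmath


def dft(x):
    """Direct DFT, used here as a stand-in for any complex FFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def real_dft_via_half_length(x):
    """DFT of an even-length real sequence using one complex DFT of half the length."""
    n = len(x)
    if n == 0 or n % 2:
        raise ValueError("length must be even and nonzero")
    half = n // 2
    z = [complex(x[2 * j], x[2 * j + 1]) for j in range(half)]
    Z = dft(z)
    out = [0j] * n
    for k in range(half):
        zk = Z[k]
        zc = Z[-k].conjugate()  # conj(Z[(half - k) mod half]); Z[-0] is Z[0]
        even = (zk + zc) / 2    # DFT of the even-indexed samples
        odd = (zk - zc) / 2j    # DFT of the odd-indexed samples
        t = cmath.exp(-2j * cmath.pi * k / n) * odd
        out[k] = even + t
        out[k + half] = even - t
    return out


samples = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0]
spectrum = real_dft_via_half_length(samples)
```

The post-processing loop does a constant amount of work per output, which is the O(N) step mentioned above.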

It was once believed that real-input DFTs could be more efficiently computed by means of the discrete Hartley transform (DHT), but it was subsequently argued that a specialized real-input DFT algorithm (FFT) can typically be found that requires fewer operations than the corresponding DHT algorithm (FHT) for the same number of inputs. Bruun's algorithm (above) is another method that was initially proposed to take advantage of real inputs, but it has not proved popular.

There are further FFT specializations for the cases of real data that have even/odd symmetry, in which case one can gain another factor of (roughly) two in time and memory and the DFT becomes the discrete cosine/sine transform(s) (DCT/DST). Instead of directly modifying an FFT algorithm for these cases, DCTs/DSTs can also be computed via FFTs of real data combined with O(N) pre/post processing.

Bounds on complexity and operation counts

A fundamental question of longstanding theoretical interest is to prove lower bounds on the complexity and exact operation counts of fast Fourier transforms, and many open problems remain. It is not even rigorously proved whether DFTs truly require Ω(N log N) (i.e., order N log N or greater) operations, even for the simple case of power-of-two sizes, although no algorithms with lower complexity are known. In particular, the count of arithmetic operations is usually the focus of such questions, although actual performance on modern-day computers is determined by many other factors such as cache or CPU pipeline optimization.

Following pioneering work by Winograd (1978), a tight Θ(N) lower bound is known for the number of real multiplications required by an FFT. It can be shown that only $4N - 2\log_2^2 N - 2\log_2 N - 4$ irrational real multiplications are required to compute a DFT of power-of-two length $N = 2^m$. Moreover, explicit algorithms that achieve this count are known (Heideman & Burrus, 1986; Duhamel, 1990). Unfortunately, these algorithms require too many additions to be practical, at least on modern computers with hardware multipliers.

A tight lower bound is not known on the number of required additions, although lower bounds have been proved under some restrictive assumptions on the algorithms. In 1973, Morgenstern proved an Ω(N log N) lower bound on the addition count for algorithms where the multiplicative constants have bounded magnitudes (which is true for most but not all FFT algorithms). Pan (1986) proved an Ω(N log N) lower bound assuming a bound on a measure of the FFT algorithm's "asynchronicity", but the generality of this assumption is unclear. For the case of power-of-two N, Papadimitriou (1979) argued that the number N log₂ N of complex-number additions achieved by Cooley–Tukey algorithms is optimal under certain assumptions on the graph of the algorithm (his assumptions imply, among other things, that no additive identities in the roots of unity are exploited). (This argument would imply that at least 2N log₂ N real additions are required, although this is not a tight bound because extra additions are required as part of complex-number multiplications.) Thus far, no published FFT algorithm has achieved fewer than N log₂ N complex-number additions (or their equivalent) for power-of-two N.

A third problem is to minimize the total number of real multiplications and additions, sometimes called the "arithmetic complexity" (although in this context it is the exact count and not the asymptotic complexity that is being considered). Again, no tight lower bound has been proven. Since 1968, however, the lowest published count for power-of-two N was long achieved by the split-radix FFT algorithm, which requires $4N\log_2 N - 6N + 8$ real multiplications and additions for N > 1. This was recently reduced to $\sim \frac{34}{9} N \log_2 N$ (Johnson and Frigo, 2007; Lundy and Van Buskirk, 2007).

Most of the attempts to lower or prove the complexity of FFT algorithms have focused on the ordinary complex-data case, because it is the simplest. However, complex-data FFTs are so closely related to algorithms for related problems such as real-data FFTs, discrete cosine transforms, discrete Hartley transforms, and so on, that any improvement in one of these would immediately lead to improvements in the others (Duhamel & Vetterli, 1990).

Accuracy and approximations

All of the FFT algorithms discussed above compute the DFT exactly (in exact arithmetic, i.e. neglecting floating-point errors). A few "FFT" algorithms have been proposed, however, that compute the DFT approximately, with an error that can be made arbitrarily small at the expense of increased computations. Such algorithms trade the approximation error for increased speed or other properties. For example, an approximate FFT algorithm by Edelman et al. (1999) achieves lower communication requirements for parallel computing with the help of a fast multipole method. A wavelet-based approximate FFT by Guo and Burrus (1996) takes sparse inputs/outputs (time/frequency localization) into account more efficiently than is possible with an exact FFT. Another algorithm for approximate computation of a subset of the DFT outputs is due to Shentov et al. (1995). Only the Edelman algorithm works equally well for sparse and non-sparse data, however, since it is based on the compressibility (rank deficiency) of the Fourier matrix itself rather than the compressibility (sparsity) of the data.

Even the "exact" FFT algorithms have errors when finite-precision floating-point arithmetic is used, but these errors are typically quite small; most FFT algorithms, e.g. Cooley–Tukey, have excellent numerical properties as a consequence of the pairwise summation structure of the algorithms. The upper bound on the relative error for the Cooley–Tukey algorithm is O(ε log N), compared to O(ε N^(3/2)) for the naïve DFT formula (Gentleman and Sande, 1966), where ε is the machine floating-point relative precision. In fact, the root mean square (rms) errors are much better than these upper bounds, being only O(ε √(log N)) for Cooley–Tukey and O(ε √N) for the naïve DFT (Schatzman, 1996). These results, however, are very sensitive to the accuracy of the twiddle factors used in the FFT (i.e. the trigonometric function values), and it is not unusual for incautious FFT implementations to have much worse accuracy, e.g. if they use inaccurate trigonometric recurrence formulas. Some FFTs other than Cooley–Tukey, such as the Rader-Brenner algorithm, are intrinsically less stable.

In fixed-point arithmetic, the finite-precision errors accumulated by FFT algorithms are worse, with rms errors growing as O(√N) for the Cooley–Tukey algorithm (Welch, 1969). Moreover, even achieving this accuracy requires careful attention to scaling in order to minimize the loss of precision, and fixed-point FFT algorithms involve rescaling at each intermediate stage of decompositions like Cooley–Tukey.

To verify the correctness of an FFT implementation, rigorous guarantees can be obtained in O(N log N) time by a simple procedure checking the linearity, impulse-response, and time-shift properties of the transform on random inputs (Ergün, 1995).
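The three properties named above can be exercised with a simple randomized check such as the sketch below. It is an informal illustration only, not the procedure of Ergün (1995), and it carries no rigorous guarantee; the function name `check_fft`, the tolerance, and the number of trials are arbitrary choices for the example. The forward convention $e^{-2\pi i kn/N}$ is assumed.

```python
import cmath
import random


def check_fft(fft, n, trials=5, tol=1e-9, seed=0):
    """Heuristically test a transform of length n for impulse response, linearity, and time shift."""
    rng = random.Random(seed)

    def rand_c():
        return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

    def close(u, v):
        return len(u) == len(v) and all(abs(p - q) <= tol * n for p, q in zip(u, v))

    # Impulse response: a unit impulse at index 0 must transform to all ones.
    if not close(fft([1 + 0j] + [0j] * (n - 1)), [1 + 0j] * n):
        return False
    for _ in range(trials):
        x = [rand_c() for _ in range(n)]
        y = [rand_c() for _ in range(n)]
        a, b = rand_c(), rand_c()
        X, Y = fft(x), fft(y)
        # Linearity: fft(a*x + b*y) == a*fft(x) + b*fft(y)
        combined = fft([a * p + b * q for p, q in zip(x, y)])
        if not close(combined, [a * P + b * Q for P, Q in zip(X, Y)]):
            return False
        # Time shift: rotating the input left by one multiplies X_k by exp(+2*pi*i*k/n)
        shifted = fft(x[1:] + x[:1])
        if not close(shifted, [P * cmath.exp(2j * cmath.pi * k / n) for k, P in enumerate(X)]):
            return False
    return True


def reference_dft(x):
    """Direct DFT used only to demonstrate the checker."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


ok = check_fft(reference_dft, 8)
```

Each property check costs O(N) beyond the transforms themselves, so the overall cost is dominated by the calls to the transform under test.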

Multidimensional FFTs

As defined in the multidimensional DFT article, the multidimensional DFT

$$X_{\mathbf{k}} = \sum_{\mathbf{n}=0}^{\mathbf{N}-1} e^{-2\pi i\, \mathbf{k} \cdot (\mathbf{n}/\mathbf{N})}\, x_{\mathbf{n}}$$

transforms an array $x_{\mathbf{n}}$ with a d-dimensional vector of indices $\mathbf{n} = (n_1, \ldots, n_d)$ by a set of d nested summations (over $n_j = 0, \ldots, N_j - 1$ for each j), where the division $\mathbf{n}/\mathbf{N}$, defined as $\mathbf{n}/\mathbf{N} = (n_1/N_1, \ldots, n_d/N_d)$, is performed element-wise. Equivalently, it is simply the composition of a sequence of d sets of one-dimensional DFTs, performed along one dimension at a time (in any order).

This compositional viewpoint immediately provides the simplest and most common multidimensional DFT algorithm, known as the row-column algorithm (after the two-dimensional case, below). That is, one simply performs a sequence of d one-dimensional FFTs (by any of the above algorithms): first you transform along the n₁ dimension, then along the n₂ dimension, and so on (or actually, any ordering will work). This method is easily shown to have the usual O(N log N) complexity, where $N = N_1 N_2 \cdots N_d$ is the total number of data points transformed. In particular, there are N/N₁ transforms of size N₁, and so on, so the complexity of the sequence of FFTs is:

$$\frac{N}{N_1} O(N_1 \log N_1) + \cdots + \frac{N}{N_d} O(N_d \log N_d) = O\left(N [\log N_1 + \cdots + \log N_d]\right) = O(N \log N).$$

In two dimensions, the $x_{\mathbf{n}}$ can be viewed as an N₁ × N₂ matrix, and this algorithm corresponds to first performing the FFT of all the rows and then of all the columns (or vice versa), hence the name.
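In two dimensions the row-column idea fits in a few lines: transform every row, then transform every column of the result. The sketch below uses a naive one-dimensional `dft` as a stand-in for any FFT and includes a direct double-sum version for comparison; all function names are illustrative.

```python
import cmath


def dft(x):
    """Direct one-dimensional DFT (stand-in for any FFT)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft2_row_column(a):
    """2D DFT of a list-of-rows matrix: transform rows first, then columns."""
    rows = [dft(row) for row in a]                   # transform along the second index
    cols = [dft(list(col)) for col in zip(*rows)]    # transform along the first index
    return [list(r) for r in zip(*cols)]             # transpose back to row-major order


def dft2_direct(a):
    """2D DFT evaluated from the nested-sum definition, for comparison."""
    n1, n2 = len(a), len(a[0])
    return [
        [
            sum(
                a[j1][j2] * cmath.exp(-2j * cmath.pi * (k1 * j1 / n1 + k2 * j2 / n2))
                for j1 in range(n1)
                for j2 in range(n2)
            )
            for k2 in range(n2)
        ]
        for k1 in range(n1)
    ]


matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
fast = dft2_row_column(matrix)
slow = dft2_direct(matrix)
```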

In more than two dimensions, it is often advantageous for cache locality to group the dimensions recursively. For example, a three-dimensional FFT might first perform two-dimensional FFTs of each planar "slice" for each fixed $n_1$, and then perform the one-dimensional FFTs along the $n_1$ direction. More generally, an asymptotically optimal cache-oblivious algorithm consists of recursively dividing the dimensions into two groups $(n_1, \ldots, n_{d/2})$ and $(n_{d/2+1}, \ldots, n_d)$ that are transformed recursively (rounding if $d$ is not even) (see Frigo and Johnson, 2005). Still, this remains a straightforward variation of the row-column algorithm that ultimately requires only a one-dimensional FFT algorithm as the base case, and still has O(N log N) complexity. Yet another variation is to perform matrix transpositions in between transforming subsequent dimensions, so that the transforms operate on contiguous data; this is especially important for out-of-core and distributed memory situations where accessing non-contiguous data is extremely time-consuming.
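
The transposition variant can be illustrated in two dimensions: every one-dimensional FFT reads a contiguous row, and the strided access is confined to explicit transpose steps. This is a sketch under stated assumptions (list-of-lists storage, power-of-two sizes, hypothetical names `fft`, `transpose`, and `fft2_transposed`); in out-of-core or distributed settings the transpose would be a bulk data-movement step rather than the in-memory `zip` used here.

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def fft2_transposed(a):
    """2-D DFT of an N1 x N2 matrix using only contiguous row transforms."""
    step = [fft(row) for row in a]        # transform along n2 (rows are contiguous)
    step = transpose(step)                # make the n1 dimension contiguous
    step = [fft(row) for row in step]     # transform along n1
    return transpose(step)                # restore the N1 x N2 layout
```

The result is the same as that of the plain row-column algorithm; only the memory access pattern differs.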

There are other multidimensional FFT algorithms that are distinct from the row-column algorithm, although all of them have O(N log N) complexity. Perhaps the simplest non-row-column FFT is the vector-radix FFT algorithm, which is a generalization of the ordinary Cooley–Tukey algorithm where one divides the transform dimensions by a vector $\mathbf{r} = (r_1, r_2, \ldots, r_d)$ of radices at each step. (This may also have cache benefits.) The simplest case of vector-radix is where all of the radices are equal (e.g. vector-radix-2 divides all of the dimensions by two), but this is not necessary. Vector radix with only a single non-unit radix at a time, i.e. $\mathbf{r} = (1, \ldots, 1, r, 1, \ldots, 1)$, is essentially a row-column algorithm. Other, more complicated, methods include polynomial transform algorithms due to Nussbaumer (1977), which view the transform in terms of convolutions and polynomial products. See Duhamel and Vetterli (1990) for more information and references.
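
As an illustration of the equal-radix case, the sketch below applies a vector-radix-2 decimation-in-time step to a square array: both dimensions are halved at once, giving four sub-transforms that are recombined with a 2×2 butterfly. It assumes an n × n list-of-lists input with n a power of two; the function name `fft2_vector_radix` is hypothetical, and the code is written for clarity rather than speed.

```python
import cmath

def fft2_vector_radix(a):
    """2-D DFT of a square n x n list-of-lists (n a power of two), radix 2x2."""
    n = len(a)
    if n == 1:
        return [[a[0][0]]]
    h = n // 2
    # Four h x h sub-transforms, one per parity pair (p, q) of (row, column) index.
    sub = {(p, q): fft2_vector_radix([row[q::2] for row in a[p::2]])
           for p in (0, 1) for q in (0, 1)}
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(h)]  # twiddle factors
    out = [[0j] * n for _ in range(n)]
    for k1 in range(h):
        for k2 in range(h):
            s00 = sub[0, 0][k1][k2]
            s01 = w[k2] * sub[0, 1][k1][k2]
            s10 = w[k1] * sub[1, 0][k1][k2]
            s11 = w[k1] * w[k2] * sub[1, 1][k1][k2]
            # 2x2 butterfly: odd-column terms flip sign for k2 + h,
            # odd-row terms flip sign for k1 + h.
            out[k1][k2] = s00 + s01 + s10 + s11
            out[k1][k2 + h] = s00 - s01 + s10 - s11
            out[k1 + h][k2] = s00 + s01 - s10 - s11
            out[k1 + h][k2 + h] = s00 - s01 - s10 + s11
    return out
```

Each output quadruple needs three twiddle multiplications (one of them by a product of two twiddles), in contrast to the row-column approach, which applies twiddles separately in each dimension.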

Other generalizations

An O(N^(5/2) log N) generalization to spherical harmonics on the sphere S² with N² nodes was described by Mohlenkamp (1999), along with an algorithm conjectured (but not proven) to have O(N² log² N) complexity; Mohlenkamp also provides an implementation in the libftsh library. A spherical-harmonic algorithm with O(N² log N) complexity is described by Rokhlin and Tygert (2006).

Various groups have also published "FFT" algorithms for non-equispaced data, as reviewed in Potts et al. (2001). Such algorithms do not strictly compute the DFT (which is only defined for equispaced data), but rather some approximation thereof (a non-uniform discrete Fourier transform, or NDFT, which itself is often computed only approximately).
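
For orientation, the sum that such fast algorithms approximate can be written out directly. The sketch below evaluates one common form of the NDFT in O(NM) time for N samples and M frequencies; conventions differ between authors, so the sign, the scaling, and the choice of sample positions measured in units of the period are assumptions, as is the function name `ndft`.

```python
import cmath

def ndft(samples, positions, freqs):
    """Direct evaluation of X(f) = sum_n x_n * exp(-2*pi*i * p_n * f).

    samples:   values x_n
    positions: sample locations p_n, not necessarily equally spaced
    freqs:     frequencies f at which to evaluate the sum
    """
    return [
        sum(x * cmath.exp(-2j * cmath.pi * p * f) for x, p in zip(samples, positions))
        for f in freqs
    ]
```

With equispaced positions p_n = n/N and integer frequencies 0, ..., N − 1, this sum reduces to the ordinary DFT, which is the case an FFT computes exactly.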

External links

Fast Fourier Algorithm
Fast Fourier Transforms, Connexions online book edited by C. Sidney Burrus, with chapters by C. Sidney Burrus, Ivan Selesnick, Markus Pueschel, Matteo Frigo, and Steven G. Johnson (2008).
Links to FFT code and information online.
National Taiwan University – FFT
FFT programming in C++ — Cooley–Tukey algorithm.
Online documentation, links, book, and code.
Using FFT to construct aggregate probability distributions
Sri Welaratna, "30 years of FFT Analyzers", Sound and Vibration (January 1997, 30th anniversary issue). A historical review of hardware FFT devices.
FFT Basics and Case Study Using Multi-Instrument
FFT Textbook notes, PPTs, Videos at Holistic Numerical Methods Institute.
ALGLIB FFT Code GPL Licensed multilanguage (VBA, C++, Pascal, etc.) numerical analysis and data processing library.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
