# Shannon–Hartley theorem

In information theory, the Shannon–Hartley theorem gives the maximum rate at which information can be transmitted over a communications channel of a specified bandwidth in the presence of noise. It is an application of the noisy channel coding theorem to the archetypal case of a continuous-time analog communications channel subject to Gaussian noise. The theorem establishes Shannon's channel capacity for such a communication link, a bound on the maximum amount of error-free digital data (that is, information) that can be transmitted per unit time with a specified bandwidth in the presence of the noise interference, assuming that the signal power is bounded, and that the Gaussian noise process is characterized by a known power or power spectral density. The law is named after Claude Shannon and Ralph Hartley.

## Statement of the theorem

Considering all possible multi-level and multi-phase encoding techniques, the Shannon–Hartley theorem states the channel capacity C, meaning the theoretical tightest upper bound on the information rate (excluding error-correcting code overhead) of data that can be sent with arbitrarily low bit error rate, using a given average signal power S, through an analog communication channel subject to additive white Gaussian noise of power N:

$C = B \log_2 \left( 1+\frac{S}{N} \right)$

where

• C is the channel capacity in bits per second;
• B is the bandwidth of the channel in hertz (passband bandwidth in case of a modulated signal);
• S is the total received signal power over the bandwidth (in case of a modulated signal, often denoted C, i.e. modulated carrier), measured in watts or volts squared;
• N is the total noise or interference power over the bandwidth, measured in watts or volts squared; and
• S/N is the signal-to-noise ratio (SNR) or the carrier-to-noise ratio (CNR) of the communication signal to the Gaussian noise interference, expressed as a linear power ratio (not as logarithmic decibels).
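The formula can be evaluated directly. The following is a minimal sketch in Python; the function names `db_to_linear` and `shannon_capacity` and the 3 kHz / 30 dB figures are illustrative assumptions, not part of the theorem. It mainly shows the conversion from decibels to the linear power ratio that the formula expects.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a power ratio given in decibels to a linear power ratio."""
    return 10 ** (snr_db / 10)


def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Return C = B * log2(1 + S/N) in bits per second.

    snr_linear must be the linear power ratio S/N, not a value in dB.
    """
    if bandwidth_hz < 0 or snr_linear < 0:
        raise ValueError("bandwidth and S/N must be non-negative")
    return bandwidth_hz * math.log2(1 + snr_linear)


# Illustrative (assumed) channel: 3 kHz bandwidth, 30 dB signal-to-noise ratio.
capacity = shannon_capacity(3000, db_to_linear(30))
print(f"capacity: {capacity:.0f} bit/s")
```

Passing a decibel value straight into `shannon_capacity` is the most common mistake with this formula, which is why the conversion is kept as a separate, explicit step.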

## Historical development

During the late 1920s, Harry Nyquist and Ralph Hartley developed a handful of fundamental ideas related to the transmission of information, particularly in the context of the telegraph as a communications system. At the time, these concepts were powerful breakthroughs individually, but they were not part of a comprehensive theory. In the 1940s, Claude Shannon developed the concept of channel capacity, based in part on the ideas of Nyquist and Hartley, and then formulated a complete theory of information and its transmission.

### Nyquist rate

Main article: Nyquist rate

In 1927, Nyquist determined that the number of independent pulses that could be put through a telegraph channel per unit time is limited to twice the bandwidth of the channel. In symbols,

$f_p \le 2B$

where $f_p$ is the pulse frequency (in pulses per second) and B is the bandwidth (in hertz). The quantity 2B later came to be called the Nyquist rate, and transmitting at the limiting pulse rate of 2B pulses per second came to be called signalling at the Nyquist rate. Nyquist published his results in 1928 as part of his paper "Certain Topics in Telegraph Transmission Theory".

### Hartley's law

During that same year, Hartley formulated a way to quantify information and its line rate R across a communications channel (the line rate is also known as the data signalling rate, or the gross bitrate inclusive of error-correcting code). This method, later known as Hartley's law, became an important precursor for Shannon's more sophisticated notion of channel capacity.

Hartley argued that the maximum number of distinct pulses that can be transmitted and received reliably over a communications channel is limited by the dynamic range of the signal amplitude and the precision with which the receiver can distinguish amplitude levels. Specifically, if the amplitude of the transmitted signal is restricted to the range of [–A ... +A] volts, and the precision of the receiver is ±ΔV volts, then the maximum number of distinct pulses M is given by

$M = 1 + \frac{A}{\Delta V}.$

By taking information per pulse in bit/pulse to be the base-2 logarithm of the number of distinct messages M that could be sent, Hartley constructed a measure of the line rate R as:

$R = f_p \log_2(M),$

where $f_p$ is the pulse rate, also known as the symbol rate, in symbols/second or baud.

Hartley then combined the above quantification with Nyquist's observation that the number of independent pulses that could be put through a channel of bandwidth B hertz was 2B pulses per second, to arrive at his quantitative measure for achievable line rate.

Hartley's law is sometimes quoted as just a proportionality between the analog bandwidth, B, in hertz and what today is called the digital bandwidth, R, in bit/s. Other times it is quoted in this more quantitative form, as an achievable line rate of R bits per second:

$R \le 2B \log_2(M).$

Hartley did not work out exactly how the number M should depend on the noise statistics of the channel, or how the communication could be made reliable even when individual symbol pulses could not be reliably distinguished to M levels; with Gaussian noise statistics, system designers had to choose a very conservative value of M to achieve a low error rate.

The concept of an error-free capacity awaited Claude Shannon, who built on Hartley's observations about a logarithmic measure of information and Nyquist's observations about the effect of bandwidth limitations.

Hartley's rate result can be viewed as the capacity of an errorless M-ary channel of 2B symbols per second. Some authors refer to it as a capacity. But such an errorless channel is an idealization, and the result is necessarily less than the Shannon capacity of the noisy channel of bandwidth B, which is the Hartley–Shannon result that followed later.
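As a rough illustration of Hartley's law, the sketch below computes the number of distinguishable levels and the resulting line rate when signalling at the Nyquist rate. The function names and the numeric values (a ±1 V signal, ±0.125 V receiver precision, 3 kHz bandwidth) are assumptions chosen only to make the arithmetic easy to follow.

```python
import math


def hartley_levels(amplitude_v: float, precision_v: float) -> float:
    """Maximum number of distinct pulse levels, M = 1 + A / dV."""
    if precision_v <= 0:
        raise ValueError("receiver precision must be positive")
    return 1 + amplitude_v / precision_v


def hartley_line_rate(bandwidth_hz: float, levels: float) -> float:
    """Line rate R = 2B * log2(M) when pulses are sent at the Nyquist rate 2B."""
    pulse_rate = 2 * bandwidth_hz  # Nyquist limit on independent pulses per second
    return pulse_rate * math.log2(levels)


# Assumed example: signal confined to [-1 V, +1 V], receiver precision of 0.125 V.
m = hartley_levels(amplitude_v=1.0, precision_v=0.125)
r = hartley_line_rate(bandwidth_hz=3000, levels=m)
print(f"M = {m:.0f} levels, R = {r:.0f} bit/s")
```

Note that nothing in this calculation involves noise statistics: M is simply taken as given, which is exactly the gap that Shannon's capacity result later filled.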

### Noisy channel coding theorem and capacity

Main article: Noisy-channel coding theorem

Claude Shannon's development of information theory during World War II provided the next big step in understanding how much information could be reliably communicated through noisy channels. Building on Hartley's foundation, Shannon's noisy channel coding theorem (1948) describes the maximum possible efficiency of error-correcting methods versus levels of noise interference and data corruption. The proof of the theorem shows that a randomly constructed error-correcting code is essentially as good as the best possible code; the theorem is proved through the statistics of such random codes.

Shannon's theorem shows how to compute a channel capacity from a statistical description of a channel, and establishes that given a noisy channel with capacity C and information transmitted at a line rate R, then if

$R < C$

there exists a coding technique which allows the probability of error at the receiver to be made arbitrarily small. This means that theoretically, it is possible to transmit information nearly without error up to nearly a limit of C bits per second.

The converse is also important. If

$R > C$

the probability of error at the receiver cannot be made arbitrarily small: it stays above a positive minimum that grows as the rate is increased. So no useful information can be transmitted beyond the channel capacity. The theorem does not address the rare situation in which rate and capacity are equal.

### Shannon–Hartley theorem

The Shannon–Hartley theorem establishes what that channel capacity is for a finite-bandwidth continuous-time channel subject to Gaussian noise. It connects Hartley's result with Shannon's channel capacity theorem in a form that is equivalent to specifying the M in Hartley's line rate formula in terms of a signal-to-noise ratio, but achieving reliability through error-correction coding rather than through reliably distinguishable pulse levels.

If there were such a thing as an infinite-bandwidth, noise-free analog channel, one could transmit unlimited amounts of error-free data over it per unit of time. Real channels, however, are subject to limitations imposed by both finite bandwidth and nonzero noise.

So how do bandwidth and noise affect the rate at which information can be transmitted over an analog channel? Surprisingly, bandwidth limitations alone do not impose a cap on maximum information rate. This is because it is still possible for the signal to take on an indefinitely large number of different voltage levels on each symbol pulse, with each slightly different level being assigned a different meaning or bit sequence. If we combine both noise and bandwidth limitations, however, we do find there is a limit to the amount of information that can be transferred by a signal of a bounded power, even when clever multi-level encoding techniques are used.

In the channel considered by the Shannon–Hartley theorem, noise and signal are combined by addition. That is, the receiver measures a signal that is equal to the sum of the signal encoding the desired information and a continuous random variable that represents the noise. This addition creates uncertainty as to the original signal's value. If the receiver has some information about the random process that generates the noise, one can in principle recover the information in the original signal by considering all possible states of the noise process.

In the case of the Shannon–Hartley theorem, the noise is assumed to be generated by a Gaussian process with a known variance. Since the variance of a Gaussian process is equivalent to its power, it is conventional to call this variance the noise power. Such a channel is called the Additive White Gaussian Noise channel, because Gaussian noise is added to the signal; "white" means equal amounts of noise at all frequencies within the channel bandwidth. Such noise can arise both from random sources of energy and also from coding and measurement error at the sender and receiver respectively. Since sums of independent Gaussian random variables are themselves Gaussian random variables, this conveniently simplifies analysis, if one assumes that such error sources are also Gaussian and independent.
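The additive model can be illustrated with a small simulation. This is only a sketch under assumed values (a ±1 test signal, a noise power of 4, and two hypothetical independent noise sources of power 1 and 3); the helper names are not from any standard library. It checks two of the statements above numerically: the noise power equals the variance of the zero-mean noise, and the powers of independent Gaussian sources add.

```python
import random


def awgn_channel(signal, noise_power, rng):
    """Add independent zero-mean Gaussian noise with the given power (variance)."""
    sigma = noise_power ** 0.5
    return [s + rng.gauss(0.0, sigma) for s in signal]


def mean_power(samples):
    """Average power of a sequence, i.e. the mean of the squared samples."""
    return sum(x * x for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000

# Assumed test signal: random +1/-1 pulses (unit power).
signal = [rng.choice((-1.0, 1.0)) for _ in range(n)]
received = awgn_channel(signal, noise_power=4.0, rng=rng)

# The receiver sees signal + noise; subtracting the known signal isolates the noise.
noise = [r - s for r, s in zip(received, signal)]
measured_noise_power = mean_power(noise)

# Two independent Gaussian sources (powers 1 and 3) behave like one source of power 4.
combined = [rng.gauss(0.0, 1.0) + rng.gauss(0.0, 3.0 ** 0.5) for _ in range(n)]
combined_power = mean_power(combined)

print(f"measured noise power: {measured_noise_power:.2f}")
print(f"combined source power: {combined_power:.2f}")
```

Both printed values should be close to 4, up to sampling error.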

### Comparison of Shannon's capacity to Hartley's law

Comparing the channel capacity to the information rate from Hartley's law, we can find the effective number of distinguishable levels M:

$2B \log_2(M) = B \log_2 \left( 1+\frac{S}{N} \right)$

$M = \sqrt{1+\frac{S}{N}}.$

The square root effectively converts the power ratio back to a voltage ratio, so the number of levels is approximately proportional to the ratio of rms signal amplitude to noise standard deviation.

This similarity in form between Shannon's capacity and Hartley's law should not be interpreted to mean that M pulse levels can be literally sent without any confusion; more levels are needed, to allow for redundant coding and error correction, but the net data rate that can be approached with coding is equivalent to using that M in Hartley's law.
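The equivalence can be checked numerically. In this sketch the function name `effective_levels` and the 20 dB / 4 kHz figures are illustrative assumptions; the point is only that inserting $M = \sqrt{1 + S/N}$ into Hartley's rate formula reproduces the Shannon capacity.

```python
import math


def effective_levels(snr_linear: float) -> float:
    """Effective number of distinguishable levels, M = sqrt(1 + S/N)."""
    return math.sqrt(1 + snr_linear)


bandwidth_hz = 4000           # assumed bandwidth
snr_linear = 10 ** (20 / 10)  # assumed 20 dB, i.e. a linear power ratio of 100

m = effective_levels(snr_linear)
hartley_rate = 2 * bandwidth_hz * math.log2(m)             # R = 2B log2(M)
shannon_rate = bandwidth_hz * math.log2(1 + snr_linear)    # C = B log2(1 + S/N)

print(f"M = {m:.2f}")
print(f"Hartley rate with that M: {hartley_rate:.1f} bit/s")
print(f"Shannon capacity:         {shannon_rate:.1f} bit/s")
```

M comes out as a non-integer (about 10 for this SNR), a reminder that it is an effective figure and not a literal count of pulse levels.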

### Frequency-dependent (colored noise) case

In the simple version above, the signal and noise are fully uncorrelated, in which case S + N is the total power of the received signal and noise together. A generalization of the above equation for the case where the additive noise is not white (or where the S/N is not constant with frequency over the bandwidth) is obtained by treating the channel as many narrow, independent Gaussian channels in parallel:

$C = \int_{0}^{B} \log_2 \left( 1+\frac{S(f)}{N(f)} \right) df$

where

• C is the channel capacity in bits per second;
• B is the bandwidth of the channel in Hz;
• S(f) is the signal power spectrum;
• N(f) is the noise power spectrum;
• f is frequency in Hz.

Note: the theorem only applies to Gaussian stationary process noise. This formula's way of introducing frequency-dependent noise cannot describe all continuous-time noise processes. For example, consider a noise process consisting of adding a random wave whose amplitude is 1 or -1 at any point in time, and a channel that adds such a wave to the source signal. Such a wave's frequency components are highly dependent. Though such a noise may have a high power, it is fairly easy to transmit a continuous signal with much less power than one would need if the underlying noise were a sum of independent noises in each frequency band.
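The integral can be approximated by literally summing the capacities of many narrow sub-channels. The sketch below uses a midpoint rule; the function name `capacity_colored`, the step count, and the two example spectra (a flat signal spectrum and a noise spectrum that rises with frequency) are hypothetical choices made for illustration.

```python
import math


def capacity_colored(signal_psd, noise_psd, bandwidth_hz, steps=10_000):
    """Approximate C = integral over [0, B] of log2(1 + S(f)/N(f)) df.

    The band is split into `steps` narrow sub-channels, each treated as an
    independent flat Gaussian channel evaluated at its centre frequency.
    """
    df = bandwidth_hz / steps
    total = 0.0
    for k in range(steps):
        f = (k + 0.5) * df  # centre of the k-th sub-channel
        total += math.log2(1 + signal_psd(f) / noise_psd(f)) * df
    return total


# Assumed spectra, in watts per hertz.
def flat_signal(f):
    return 1e-6


def white_noise(f):
    return 1e-8


def rising_noise(f):
    return 1e-8 * (1 + f / 1000)  # noise density grows with frequency


bandwidth = 4000
c_white = capacity_colored(flat_signal, white_noise, bandwidth)
c_colored = capacity_colored(flat_signal, rising_noise, bandwidth)

print(f"white noise:   {c_white:.0f} bit/s")
print(f"colored noise: {c_colored:.0f} bit/s")
```

With white noise the sum collapses to the simple formula $B \log_2(1 + S/N)$, which makes a convenient sanity check for the numerical routine.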

### Approximations

For large or small and constant signal-to-noise ratios, the capacity formula can be approximated:

• If S/N >> 1, then $\log_2(1 + S/N) \approx \log_2(S/N)$, so

$C \approx 0.332 \cdot B \cdot \mathrm{SNR\ (in\ dB)}$

where

$\mathrm{SNR\ (in\ dB)} = 10\log_{10}\frac{S}{N}.$

• Similarly, if S/N << 1, then $\log_2(1 + S/N) \approx (S/N)/\ln 2$, so

$C \approx 1.44 \cdot B \cdot \frac{S}{N}.$

In this low-SNR approximation, capacity is independent of bandwidth if the noise is white, of spectral density $N_0$ watts per hertz, in which case the total noise power is $B \cdot N_0$:

$C \approx 1.44 \cdot \frac{S}{N_0}$
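A quick numerical comparison shows how close these approximations are. The function names below and the sample operating points (30 dB for the high-SNR case, a ratio of 0.01 for the low-SNR case) are assumptions for illustration only.

```python
import math


def capacity_exact(bandwidth_hz, snr_linear):
    """Exact Shannon-Hartley capacity, C = B log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def capacity_high_snr(bandwidth_hz, snr_linear):
    """High-SNR approximation, C ~ 0.332 * B * SNR(dB)."""
    snr_db = 10 * math.log10(snr_linear)
    return 0.332 * bandwidth_hz * snr_db


def capacity_low_snr(bandwidth_hz, snr_linear):
    """Low-SNR approximation, C ~ 1.44 * B * S/N."""
    return 1.44 * bandwidth_hz * snr_linear


def capacity_wideband(signal_power_w, noise_density_w_per_hz):
    """Low-SNR white-noise form, C ~ 1.44 * S / N0 (no bandwidth dependence)."""
    return 1.44 * signal_power_w / noise_density_w_per_hz


b = 1e6  # assumed bandwidth of 1 MHz
for label, snr, approx in (
    ("high SNR (30 dB)", 1000.0, capacity_high_snr),
    ("low SNR (0.01)", 0.01, capacity_low_snr),
):
    exact = capacity_exact(b, snr)
    estimate = approx(b, snr)
    error = abs(estimate - exact) / exact
    print(f"{label}: exact {exact:.0f}, approx {estimate:.0f}, rel. error {error:.2%}")
```

The constants 0.332 and 1.44 are rounded values of $\log_2(10)/10$ and $1/\ln 2$, so a small residual error remains even where the approximations are at their best.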

## Examples

1. If the SNR is 20 dB, and the bandwidth available is 4 kHz, which is appropriate for telephone communications, then $C = 4 \log_2(1 + 100) = 4 \log_2(101) = 26.63$ kbit/s (with B in kHz). Note that the value of S/N = 100 is equivalent to the SNR of 20 dB.
2. If the requirement is to transmit at 50 kbit/s, and a bandwidth of 1 MHz is used, then the minimum S/N required is given by $50 = 1000 \log_2(1 + S/N)$, so $S/N = 2^{C/B} - 1 = 0.035$, corresponding to an SNR of -14.5 dB ($10 \log_{10}(0.035)$).
3. Take the example of W-CDMA (Wideband Code Division Multiple Access): the bandwidth is 5 MHz and the goal is to carry 12.2 kbit/s of data (AMR voice). The required S/N is then $2^{12.2/5000} - 1$, corresponding to an SNR of -27.7 dB for a single channel. This shows that it is possible to transmit using signals which are actually much weaker than the background noise level, as in spread-spectrum communications. However, in W-CDMA the required SNR will vary based on design calculations.
4. As stated above, channel capacity is proportional to the bandwidth of the channel and to the logarithm of SNR. This means channel capacity can be increased either linearly, by increasing the channel's bandwidth at a fixed SNR, or, with fixed bandwidth, by using higher-order modulations that need a very high SNR to operate. As the modulation rate increases, the spectral efficiency improves, but at the cost of the SNR requirement. Thus, there is an exponential rise in the SNR requirement if one adopts 16QAM or 64QAM (see: Quadrature amplitude modulation); however, the spectral efficiency improves.
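The arithmetic in the first three examples can be reproduced with a few lines of Python. This is a minimal sketch; the helper names `capacity`, `required_snr` and `to_db` are assumptions introduced here, and `required_snr` is simply the capacity formula solved for S/N.

```python
import math


def capacity(bandwidth_hz, snr_linear):
    """C = B log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def required_snr(rate_bps, bandwidth_hz):
    """Minimum linear S/N for a target rate: S/N = 2^(C/B) - 1."""
    return 2 ** (rate_bps / bandwidth_hz) - 1


def to_db(power_ratio):
    """Express a linear power ratio in decibels."""
    return 10 * math.log10(power_ratio)


# Example 1: 4 kHz telephone channel at 20 dB SNR.
c_phone = capacity(4e3, 10 ** (20 / 10))
print(f"telephone channel: {c_phone / 1e3:.2f} kbit/s")

# Example 2: 50 kbit/s over 1 MHz.
snr_2 = required_snr(50e3, 1e6)
print(f"50 kbit/s in 1 MHz: S/N = {snr_2:.3f} ({to_db(snr_2):.1f} dB)")

# Example 3: 12.2 kbit/s AMR voice over a 5 MHz W-CDMA carrier.
snr_3 = required_snr(12.2e3, 5e6)
print(f"W-CDMA voice: {to_db(snr_3):.1f} dB")
```

A negative result in decibels corresponds to a linear S/N below 1, that is, a signal weaker than the noise within the channel bandwidth.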

## External links

• The relationship between information, bandwidth and noise
• On-line textbook: Information Theory, Inference, and Learning Algorithms, by David MacKay. Gives an entertaining and thorough introduction to Shannon theory, including two proofs of the noisy-channel coding theorem. This text also discusses state-of-the-art methods from coding theory, such as low-density parity-check codes and turbo codes.
• MIT News article on Shannon Limit