History of information theory

The decisive event which established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October of 1948.

In this revolutionary and groundbreaking paper, work on which Shannon had substantially completed at Bell Labs by the end of 1944, he introduced for the first time the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that
"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."


With it came the ideas of
  • the information entropy and redundancy of a source, and its relevance through the source coding theorem;
  • the mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;
  • the practical result of the Shannon–Hartley law for the channel capacity of a Gaussian channel; and of course
  • the bit - a new way of seeing the most fundamental unit of information.

Early telecommunications

Some of the oldest methods of instant telecommunications implicitly use many of the ideas that would later be quantified in information theory. Modern telegraphy, starting in the 1830s, used Morse code, in which more common letters (like "E," which is expressed as one "dot") are transmitted more quickly than less common letters (like "J," which is expressed by one "dot" followed by three "dashes"). The idea of encoding information in this manner is the cornerstone of lossless data compression. A hundred years later, frequency modulation illustrated that bandwidth can be considered merely another degree of freedom. The vocoder, now largely looked at as an audio engineering curiosity, was originally designed in 1939 to use less bandwidth than that of an original message, in much the same way that mobile phones now trade off voice quality with bandwidth.
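
A minimal sketch in Python of why this kind of frequency-aware coding saves transmission effort; the letter frequencies here are assumed for illustration, and the "costs" count signalling elements rather than real Morse timings:

    # Toy example: frequent letters get short codes, rare letters get long ones.
    freqs = {"E": 0.7, "J": 0.3}      # assumed relative frequencies (illustrative only)
    code_len = {"E": 1, "J": 4}       # "E" = one dot; "J" = one dot plus three dashes
    fixed_len = 4                     # a fixed-length code must accommodate the longest symbol

    avg_variable = sum(freqs[s] * code_len[s] for s in freqs)   # 0.7*1 + 0.3*4 = 1.9 elements
    avg_fixed = sum(freqs[s] * fixed_len for s in freqs)        # 4.0 elements
    print(avg_variable, avg_fixed)    # frequency-aware coding transmits fewer elements on average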

Quantitative ideas of information

The most direct antecedents of Shannon's work were two papers published in the 1920s by Harry Nyquist and Ralph Hartley, who were both still research leaders at Bell Labs when Shannon arrived in the early 1940s.

Nyquist's 1924 paper, Certain Factors Affecting Telegraph Speed, is mostly concerned with some detailed engineering aspects of telegraph signals. But a more theoretical section discusses quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation

W = K log m

where W is the speed of transmission of intelligence, m is the number of different voltage levels to choose from at each time step, and K is a constant.
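
As a quick illustration of the relation (the constant and the logarithm base are assumed here purely for the sketch), each doubling of the number of available voltage levels adds the same fixed increment to the line speed:

    import math

    K = 1.0                        # assumed constant of proportionality
    for m in (2, 4, 8):            # number of distinct voltage levels per time step
        W = K * math.log2(m)       # speed of transmission of intelligence (base 2 chosen for the sketch)
        print(m, W)                # 2 -> 1.0, 4 -> 2.0, 8 -> 3.0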

Hartley's 1928 paper, called simply Transmission of Information, went further by using the word information (in a technical sense), and making explicitly clear that information in this context was a measurable quantity, reflecting only the receiver's ability to distinguish that one sequence of symbols had been intended by the sender rather than any other—quite regardless of any associated meaning or other psychological or semantic aspect the symbols might represent. This amount of information he quantified as

H = log S^n = n log S

where S was the number of possible symbols, and n the number of symbols in a transmission. The natural unit of information was therefore the decimal digit, much later renamed the hartley in his honour as a unit of information. The Hartley information, H0, is still used as a quantity for the logarithm of the total number of possibilities.
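
As a rough worked example (not taken from Hartley's paper), the measure can be computed directly and converted between decimal digits (hartleys) and bits; here S and n are simply assumed values:

    import math

    S = 10                                  # number of possible symbols
    n = 3                                   # number of symbols in the transmission
    H_hartleys = n * math.log10(S)          # Hartley information in decimal digits: 3.0
    H_bits = n * math.log2(S)               # the same quantity expressed in bits: about 9.97
    print(H_hartleys, H_bits)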

A similar unit of log10 probability, the ban, and its derived unit the deciban (one tenth of a ban), were introduced by Alan Turing in 1940 as part of the statistical analysis of the breaking of the German Second World War Enigma cyphers. The decibannage represented the reduction in (the logarithm of) the total number of possibilities (similar to the change in the Hartley information); and also the log-likelihood ratio (or change in the weight of evidence) that could be inferred for one hypothesis over another from a set of observations. The expected change in the weight of evidence is equivalent to what was later called the Kullback discrimination information.
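
A small sketch of the deciban as a unit of weight of evidence; the two likelihoods below are assumed values chosen only to illustrate the arithmetic:

    import math

    # Likelihood of the same observation under two competing hypotheses (assumed values).
    p_obs_given_h1 = 0.8
    p_obs_given_h2 = 0.4

    bans = math.log10(p_obs_given_h1 / p_obs_given_h2)   # log-likelihood ratio in bans
    decibans = 10 * bans                                  # about +3.0 decibans in favour of the first hypothesis
    print(bans, decibans)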

But underlying these notions was still the idea of equal a priori probabilities; there was as yet no treatment of the information content of events of unequal probability, nor any underlying picture of the questions raised by the communication of such varied outcomes.

Entropy in statistical mechanics

One area where unequal probabilities were indeed well known was statistical mechanics, where Ludwig Boltzmann had, in the context of his H-theorem of 1872, first introduced the quantity

H = Σ f_i log f_i

as a measure of the breadth of the spread of states available to a single particle in a gas of like particles, where f represented the relative frequency distribution of each possible state. Boltzmann argued mathematically that the effect of collisions between the particles would cause the H-function to decrease inevitably from any initial configuration until equilibrium was reached; and further identified it as an underlying microscopic rationale for the macroscopic thermodynamic entropy of Clausius.

Boltzmann's definition was soon reworked by the American mathematical physicist J. Willard Gibbs into a general formula for statistical-mechanical entropy, no longer requiring identical and non-interacting particles, but instead based on the probability distribution p_i for the complete microstate i of the total system:

S = -k_B Σ p_i ln p_i

This (Gibbs) entropy, from statistical mechanics, can be found to correspond directly to Clausius's classical thermodynamic definition.
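
A brief illustrative computation (not part of the original article) of how the Gibbs entropy and the information-theoretic entropy of the same distribution differ only by the constant k_B and the base of the logarithm; the probability distribution below is assumed:

    import math

    k_B = 1.380649e-23                  # Boltzmann constant, J/K
    p = [0.5, 0.25, 0.25]               # an assumed microstate probability distribution

    S_gibbs = -k_B * sum(pi * math.log(pi) for pi in p)     # statistical-mechanical entropy, J/K
    H_shannon = -sum(pi * math.log2(pi) for pi in p)        # information entropy in bits (here 1.5)
    print(S_gibbs, H_shannon)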

Shannon himself was apparently not particularly aware of the close similarity between his new measure and earlier work in thermodynamics, but John von Neumann was. It is said that, when Shannon was deciding what to call his new measure and fearing the term 'information' was already over-used, von Neumann told him firmly: "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one really knows what entropy really is, so in a debate you will always have the advantage."

(Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by Rolf Landauer in the 1960s, are explored further in the article Entropy in thermodynamics and information theory.)

Development since 1948

The publication of Shannon's 1948 paper, "A Mathematical Theory of Communication", in the Bell System Technical Journal was the founding of information theory as we know it today. Many developments and applications of the theory have taken place since then, which have made many modern devices for data communication and storage such as CD-ROMs and mobile phones possible. Notable developments are listed in a timeline of information theory.

See also

  • Timeline of information theory
  • Shannon, C.E.
  • Hartley, R.V.L.
  • H-theorem
