Information theory

Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Historically, information theory was developed by Claude E. Shannon to find fundamental limits on compressing and reliably communicating data. Since its inception it has broadened to find applications in many other areas, including statistical inference, natural language processing, cryptography generally, networks other than communication networks (as in neurobiology), the evolution and function of molecular codes, model selection in ecology, thermal physics, quantum computing, plagiarism detection and other forms of data analysis.

A key measure of information in the theory is known as entropy, which is usually expressed by the average number of bits needed for storage or communication. Intuitively, entropy quantifies the uncertainty involved when encountering a random variable. For example, a fair coin flip (2 equally likely outcomes) will have less entropy than a roll of a die (6 equally likely outcomes).
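
The coin and die comparison can be checked numerically. The following is a minimal sketch, assuming the standard Shannon entropy formula in bits; the function name entropy_bits is chosen here for illustration only.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    # Outcomes with probability zero are skipped (the usual 0 log 0 = 0 convention).
    return -sum(p * math.log2(p) for p in probs if p > 0)

coin = [1 / 2] * 2   # fair coin: 2 equally likely outcomes
die = [1 / 6] * 6    # fair die: 6 equally likely outcomes

print("coin:", entropy_bits(coin))  # 1 bit
print("die: ", entropy_bits(die))   # log2(6), roughly 2.585 bits
```

For equally likely outcomes the entropy reduces to the base-2 logarithm of the number of outcomes, which is why the die carries more uncertainty than the coin.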

Applications of fundamental topics of information theory include lossless data compression (e.g. ZIP files), lossy data compression (e.g. MP3s), and channel coding (e.g. for DSL lines). The field is at the intersection of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering. Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of mobile phones, the development of the Internet, the study of linguistics and of human perception, the understanding of black holes, and numerous other fields. Important sub-fields of information theory are source coding, channel coding, algorithmic complexity theory, algorithmic information theory, and measures of information.

Overview

The main concepts of information theory can be grasped by considering the most widespread means of human communication: language. Two important aspects of a good language are as follows: First, the most common words (e.g., "a", "the", "I") should be shorter than less common words (e.g., "benefit", "generation", "mediocre"), so that sentences will not be too long. Such a tradeoff in word length is analogous to data compression and is the essential aspect of source coding. Second, if part of a sentence is unheard or misheard due to noise (e.g., a passing car), the listener should still be able to glean the meaning of the underlying message. Such robustness is as essential for an electronic communication system as it is for a language; properly building such robustness into communications is done by channel coding. Source coding and channel coding are the fundamental concerns of information theory.
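
The word-length tradeoff can be illustrated with a Huffman code, one well-known source-coding technique that assigns shorter bit strings to more frequent symbols. This is a minimal sketch rather than a production encoder; the sample sentence and the name huffman_code are made up for the illustration.

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a prefix-free code {symbol: bitstring} from a {symbol: frequency} mapping."""
    # Heap entries are (weight, tie_breaker, partial_code_table).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one bit.
        return {sym: "0" for sym in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical sample text: "the" is common, "mediocre" is rare.
words = "the cat saw the dog and the dog saw the mediocre cat".split()
code = huffman_code(Counter(words))
for word, bits in sorted(code.items(), key=lambda kv: len(kv[1])):
    print(f"{word:10s} {bits}")
```

In this sample the most frequent word receives one of the shortest codewords, mirroring the observation that common words should be short.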

Note that these concerns have nothing to do with the importance of messages. For example, a platitude such as "Thank you; come again" takes about as long to say or write as the urgent plea, "Call an ambulance!" while clearly the latter is more important and more meaningful. Information theory, however, does not consider message importance or meaning, as these are matters of the quality of data rather than the quantity and readability of data, the latter of which is determined solely by probabilities.

Information theory is generally considered to have been founded in 1948 by Claude Shannon in his seminal work, "A Mathematical Theory of Communication." The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel. The most fundamental results of this theory are Shannon's source coding theorem, which establishes that, on average, the number of bits needed to represent the result of an uncertain event is given by its entropy; and Shannon's noisy-channel coding theorem, which states that reliable communication is possible over noisy channels provided that the rate of communication is below a certain threshold called the channel capacity. The channel capacity can be approached in practice by using appropriate encoding and decoding systems.
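
As a concrete instance of a channel capacity, consider the binary symmetric channel, a standard textbook model (not discussed in the text above) in which each transmitted bit is flipped independently with some probability. Its capacity is 1 minus the binary entropy of the flip probability. The sketch below assumes that model; the function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-outcome variable with probabilities p and 1 - p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # convention: 0 log 0 = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(flip_prob):
    """Capacity, in bits per channel use, of a binary symmetric channel (assumed model)."""
    return 1.0 - binary_entropy(flip_prob)

for flip_prob in (0.0, 0.01, 0.1, 0.5):
    print(f"flip probability {flip_prob}: capacity {bsc_capacity(flip_prob):.3f} bits/use")
```

A noiseless channel carries one full bit per use, while a channel that flips bits half the time carries nothing; any rate below the capacity is achievable with suitable coding, according to the noisy-channel coding theorem.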

Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of rubrics throughout the world over the past half century or more: adaptive systems, anticipatory systems, artificial intelligence, complex systems, complexity science, cybernetics, informatics, machine learning, along with systems sciences of many descriptions. Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of coding theory.

Coding theory is concerned with finding explicit methods, called codes, of increasing the efficiency and reducing the net error rate of data communication over a noisy channel to near the limit that Shannon proved is the maximum possible for that channel. These codes can be roughly subdivided into data compression (source coding) and error-correction (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible. A third class of information theory codes are cryptographic algorithms (both codes and ciphers). Concepts, methods and results from coding theory and information theory are widely used in cryptography and cryptanalysis. See the article ban (information) for a historical application.


Information theory is also used in information retrieval, intelligence gathering, gambling, statistics, and even in musical composition.

Historical background


The landmark event that established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October of 1948.

Prior to this paper, limited information theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability. Harry Nyquist's 1924 paper, Certain Factors Affecting Telegraph Speed, contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation $W = K \log m$, where $W$ is the speed of transmission of intelligence, $m$ is the number of different voltage levels to choose from at each time step, and $K$ is a constant. Ralph Hartley's 1928 paper, Transmission of Information, uses the word information as a measurable quantity, reflecting the receiver's ability to distinguish one sequence of symbols from any other, thus quantifying information as $H = \log S^n = n \log S$, where $S$ was the number of possible symbols, and $n$ the number of symbols in a transmission. The natural unit of information was therefore the decimal digit, much later renamed the hartley in his honour as a unit or scale or measure of information. Alan Turing in 1940 used similar ideas as part of the statistical analysis of the breaking of the German Second World War Enigma ciphers.

Much of the mathematics behind information theory with events of different probabilities was developed for the field of thermodynamics by Ludwig Boltzmann and J. Willard Gibbs. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by Rolf Landauer in the 1960s, are explored in Entropy in thermodynamics and information theory.

In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that
"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."


With it came the ideas of
  • the information entropy and redundancy of a source, and its relevance through the source coding theorem;
  • the mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;
  • the practical result of the Shannon–Hartley law for the channel capacity of a Gaussian channel; and of course
  • the bit, a new way of seeing the most fundamental unit of information.
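
The Shannon–Hartley law mentioned in the list above is commonly written as C = B log2(1 + S/N), where B is the bandwidth in hertz and S/N is the signal-to-noise power ratio. The following is a minimal sketch of that formula; the 3000 Hz bandwidth and 30 dB signal-to-noise ratio are arbitrary illustrative values, and the function names are not from the source.

```python
import math

def db_to_linear(snr_db):
    """Convert a signal-to-noise ratio from decibels to a plain power ratio."""
    return 10.0 ** (snr_db / 10.0)

def shannon_hartley_capacity(bandwidth_hz, snr_linear):
    """Capacity in bits per second of a band-limited Gaussian channel."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

# Illustrative values only: a 3000 Hz channel with a 30 dB signal-to-noise ratio.
capacity = shannon_hartley_capacity(3000.0, db_to_linear(30.0))
print(f"{capacity:.0f} bits per second")
```

Doubling the bandwidth doubles the capacity, whereas doubling the signal power adds only about one extra bit per second per hertz at high signal-to-noise ratios.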


Quantities of information


Information theory is based on probability theory and statistics. The most important quantities of information are entropy, the information in a random variable, and mutual information, the amount of information in common between two random variables. The former quantity indicates how easily message data can be compressed while the latter can be used to find the communication rate across a channel.

The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, based on the binary logarithm. Other units include the nat, which is based on the natural logarithm, and the hartley, which is based on the common logarithm.
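
Because the units differ only in the base of the logarithm, converting between them is a matter of multiplying by a constant. The sketch below computes the entropy of a fair die in bits, nats, and hartleys; the base parameter and variable names are illustrative choices.

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution, in the unit set by the logarithm base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

die = [1 / 6] * 6

bits = entropy(die, 2)          # binary logarithm  -> bits
nats = entropy(die, math.e)     # natural logarithm -> nats
hartleys = entropy(die, 10)     # common logarithm  -> hartleys

print(f"{bits:.4f} bits = {nats:.4f} nats = {hartleys:.4f} hartleys")
```

One bit equals ln 2 nats (about 0.693) and log10 2 hartleys (about 0.301), so the three printed values describe the same quantity of information.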

In what follows, an expression of the form $p \log p$ is considered by convention to be equal to zero whenever $p = 0$. This is justified because $\lim_{p \to 0^+} p \log p = 0$ for any logarithmic base.

Entropy

The entropy, $H$, of a discrete random variable $X$ is a measure of the amount of uncertainty associated with the value of $X$.

Suppose one transmits 1000 bits (0s and 1s). If these bits are known ahead of transmission (to be a certain value with absolute probability), logic dictates that no information has been transmitted. If, however, each is equally and independently likely to be 0 or 1, 1000 bits (in the information theoretic sense) have been transmitted. Between these two extremes, information can be quantified as follows. If is the set of all messages that could be, and is the probability of given , then the entropy of is defined:

(Here, is the self-information
Self-information

In information theory , self-information is a measure of the information content associated with the outcome of a random variable. It is expressed in a Units of measurement of information, for example bits,...
, which is the entropy contribution of an individual message, and is the expected value
Expected value

In probability theory and statistics, the expected value of a random variable is the Lebesgue integral of the random variable with respect to its probability measure....
.) An important property of entropy is that it is maximized when all the messages in the message space are equiprobable ,—i.e., most unpredictable—in which case .
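As a minimal sketch of the definition of entropy as $-\sum p(x) \log p(x)$, the following Python function computes it in bits for a distribution given as a list of probabilities. The function name `entropy` and the example distributions are illustrative assumptions, not part of the original text.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution (list of probabilities)."""
    # Outcomes with p == 0 are skipped, following the convention 0 log 0 = 0.
    return sum(-p * math.log2(p) for p in probs if p > 0)

known = entropy([1.0, 0.0])      # a bit whose value is certain in advance
fair = entropy([0.5, 0.5])       # a fair bit
biased = entropy([0.9, 0.1])     # between the two extremes
uniform8 = entropy([1 / 8] * 8)  # eight equiprobable messages, i.e. log2(8)

print(known, fair, biased, uniform8)
```

The certain bit carries no entropy, the fair bit carries one full bit, and the biased bit falls strictly in between, which matches the 1000-bit thought experiment when applied bit by bit.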

The special case of information entropy for a random variable with two outcomes is the binary entropy function:

$$H_b(p) = -p \log_2 p - (1 - p) \log_2 (1 - p).$$
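The binary entropy function, $-p \log_2 p - (1-p) \log_2(1-p)$, can be sketched in a few lines of Python. The name `binary_entropy` is an assumption chosen for this example.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-outcome variable with success probability p."""
    if p in (0.0, 1.0):
        return 0.0  # convention: 0 log 0 = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, binary_entropy(p))
```

The function is symmetric about p = 0.5, where it reaches its maximum of one bit.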

Joint entropy

The joint entropy of two discrete random variables $X$ and $Y$ is merely the entropy of their pairing $(X, Y)$:

$$H(X, Y) = -\sum_{x, y} p(x, y) \log p(x, y).$$

This implies that if $X$ and $Y$ are independent, then their joint entropy is the sum of their individual entropies.

For example, if $(X, Y)$ represents the position of a chess piece, with $X$ the row and $Y$ the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.
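The chess example can be checked numerically. The sketch below assumes, purely for illustration, that the piece is equally likely to stand on any of the 64 squares, so the row and the column are independent and uniform.

```python
import math
from itertools import product

def entropy(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

rows = cols = range(8)
# Assumption: every square is equally likely.
joint = {(r, c): 1 / 64 for r, c in product(rows, cols)}

# Marginal distributions of the row and of the column.
p_row = [sum(joint[(r, c)] for c in cols) for r in rows]
p_col = [sum(joint[(r, c)] for r in rows) for c in cols]

h_joint = entropy(joint.values())
h_row, h_col = entropy(p_row), entropy(p_col)
print(h_joint, h_row, h_col)
```

Under this assumption the position carries 6 bits, which is the 3 bits of the row plus the 3 bits of the column, as expected for independent variables.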

Despite similar notation, joint entropy should not be confused with cross entropy.

Conditional entropy (equivocation)

The conditional entropy or conditional uncertainty of $X$ given random variable $Y$ (also called the equivocation of $X$ about $Y$) is the average conditional entropy over $Y$:

$$H(X|Y) = \mathbb{E}_Y[H(X|y)] = -\sum_{y} p(y) \sum_{x} p(x|y) \log p(x|y) = -\sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(y)}.$$

Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:

$$H(X|Y) = H(X, Y) - H(Y).$$
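The property that conditional entropy equals joint entropy minus the entropy of the conditioning variable can be verified on a small example. The joint distribution below is a hypothetical one chosen only for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y) with x, y in {0, 1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginal distribution of Y.
p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, 0.0) + p

# Definition: average over y of the entropy of the distribution p(x | y).
h_x_given_y = 0.0
for y, py in p_y.items():
    conditional = [p / py for (x, yy), p in joint.items() if yy == y]
    h_x_given_y += py * entropy(conditional)

# Property: H(X|Y) = H(X,Y) - H(Y).
chain = entropy(joint.values()) - entropy(p_y.values())
print(h_x_given_y, chain)
```

Both routes give the same number up to floating-point rounding.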



Mutual information (transinformation)

Mutual information measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of $X$ relative to $Y$ is given by:

$$I(X; Y) = \mathbb{E}_{X, Y}[SI(x, y)] = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$

where $SI$ (Specific mutual Information) is the pointwise mutual information.

A basic property of the mutual information is that

$$I(X; Y) = H(X) - H(X|Y).$$

That is, knowing $Y$, we can save an average of $I(X; Y)$ bits in encoding $X$ compared to not knowing $Y$.

Mutual information is symmetric:

$$I(X; Y) = I(Y; X) = H(X) + H(Y) - H(X, Y).$$


Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) of the posterior probability distribution of $X$ given the value of $Y$ to the prior distribution on $X$:

$$I(X; Y) = \mathbb{E}_{p(y)}\left[D_{\mathrm{KL}}\big(p(X|Y = y) \,\|\, p(X)\big)\right].$$

In other words, this is a measure of how much, on the average, the probability distribution on $X$ will change if we are given the value of $Y$. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:

$$I(X; Y) = D_{\mathrm{KL}}\big(p(X, Y) \,\|\, p(X)\, p(Y)\big).$$
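The following sketch computes mutual information as the divergence of the joint distribution from the product of its marginals, then checks it against the entropy form $H(X) + H(Y) - H(X,Y)$, against symmetry, and against the independent case. The joint distribution and the helper names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def marginals(joint):
    """Marginal distributions of both coordinates of {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def mutual_information(joint):
    """I(X;Y) in bits: divergence of the joint from the product of marginals."""
    px, py = marginals(joint)
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical joint distribution.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
px, py = marginals(joint)

mi = mutual_information(joint)
mi_from_entropies = (entropy(px.values()) + entropy(py.values())
                     - entropy(joint.values()))
swapped = {(y, x): p for (x, y), p in joint.items()}
independent = {(x, y): px[x] * py[y] for x in px for y in py}

print(mi, mi_from_entropies,
      mutual_information(swapped), mutual_information(independent))
```

The first three values agree, and the last one is zero up to rounding, since independent variables share no information.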


Mutual information is closely related to the log-likelihood ratio test in the context of contingency tables and the multinomial distribution and to Pearson's χ² test: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.

Kullback–Leibler divergence (information gain)

The Kullback–Leibler divergence (or information divergence, information gain, or relative entropy) is a way of comparing two distributions: a "true" probability distribution $p(X)$, and an arbitrary probability distribution $q(X)$. If we compress data in a manner that assumes $q(X)$ is the distribution underlying some data, when, in reality, $p(X)$ is the correct distribution, the Kullback–Leibler divergence is the average number of additional bits per datum necessary for compression. It is thus defined

$$D_{\mathrm{KL}}\big(p(X) \,\|\, q(X)\big) = \sum_{x} -p(x) \log q(x) \,-\, \sum_{x} -p(x) \log p(x) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}.$$

Although it is sometimes used as a 'distance metric', it is not a true metric since it is not symmetric and does not satisfy the triangle inequality (making it a semi-quasimetric).
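A short sketch makes both points concrete: the divergence equals the extra bits paid when coding for q while the data follow p (cross entropy minus entropy), and swapping the arguments changes the value. The two distributions are hypothetical.

```python
import math

def cross_entropy(p, q):
    """Average bits per symbol when data follow p but the code is designed for q."""
    return sum(-pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    return cross_entropy(p, p)

def kl_divergence(p, q):
    """D_KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]    # hypothetical "true" distribution
q = [1 / 3, 1 / 3, 1 / 3]  # hypothetical assumed distribution

print(kl_divergence(p, q), cross_entropy(p, q) - entropy(p))
print(kl_divergence(q, p))  # differs from D_KL(p || q): not symmetric
```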

Other quantities

Other important information theoretic quantities include Rényi entropy (a generalization of entropy), differential entropy (a generalization of quantities of information to continuous distributions), and the conditional mutual information.

Coding theory


Coding theory is one of the most important and direct applications of information theory. It can be subdivided into source coding theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.

  • Data compression (source coding): There are two formulations for the compression problem:
  1. lossless data compression: the data must be reconstructed exactly;
  2. lossy data compression: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of information theory is called rate–distortion theory.


  • Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.
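To illustrate how entropy bounds lossless compression, the sketch below builds a binary Huffman code (one standard lossless source code, chosen here as an example and not named in the text above) and compares its average codeword length with the source entropy. The symbol probabilities and the function name are hypothetical.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a binary Huffman code."""
    # Heap entries: (probability, unique tie-breaker, {symbol: depth so far}).
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical source with dyadic probabilities.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_code_lengths(probs)
average_length = sum(probs[s] * lengths[s] for s in probs)
source_entropy = sum(-p * math.log2(p) for p in probs.values())
print(lengths, average_length, source_entropy)
```

Because every probability here is a power of 1/2, the average length meets the entropy exactly; for general distributions the average length of a Huffman code lies between the entropy and the entropy plus one bit.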


This division of coding theory into compression and transmission rests on the information transmission theorems, or source–channel separation theorems, which justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the broadcast channel) or intermediary "helpers" (the relay channel), or more general networks, compression followed by transmission may no longer be optimal. Network information theory refers to these multi-agent communication models.

Source theory


Any process that generates successive messages can be considered a source of information. A memoryless source is one in which each message is an independent identically-distributed random variable, whereas the properties of ergodicity and stationarity impose more general constraints. All such sources are stochastic. These terms are well studied in their own right outside information theory.

Rate
Information rate is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is

$$r = \lim_{n \to \infty} H(X_n | X_{n-1}, X_{n-2}, \ldots, X_1);$$

that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the average rate is

$$r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n);$$

that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.
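The agreement of the two expressions for a stationary source can be observed on a small example. The sketch below assumes a hypothetical two-state Markov source started in its stationary distribution; for such a source, conditioning on the whole past reduces to conditioning on the previous symbol, so the conditional form has a closed expression, while the joint-entropy-per-symbol form is evaluated by brute-force enumeration.

```python
import math
from itertools import product

# Hypothetical transition matrix: P[i][j] = Pr(next = j | current = i).
P = [[0.9, 0.1],
     [0.4, 0.6]]
# Stationary distribution of a two-state chain (solves pi = pi P).
pi = [P[1][0] / (P[0][1] + P[1][0]), P[0][1] / (P[0][1] + P[1][0])]

def h(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Conditional form: entropy of the next symbol given the current one,
# averaged over the stationary distribution.
rate_conditional = sum(pi[i] * h(P[i]) for i in range(2))

def joint_entropy_per_symbol(n):
    """(1/n) H(X_1, ..., X_n), computed by enumerating all 2**n sequences."""
    total = 0.0
    for seq in product(range(2), repeat=n):
        p = pi[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= P[a][b]
        total -= p * math.log2(p)
    return total / n

for n in (1, 2, 4, 8, 12):
    print(n, joint_entropy_per_symbol(n), rate_conditional)
```

The per-symbol joint entropy decreases toward the conditional value as n grows; the gap shrinks like 1/n because only the first symbol contributes more than the rate.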

It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of source coding.

Channel capacity


Communication over a channel, such as an Ethernet wire, is the primary motivation of information theory. As anyone who's ever used a telephone (mobile or landline) knows, however, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality. How much information can one hope to communicate over a noisy (or otherwise imperfect) channel?

Consider the communications process over a discrete channel. A simple model of the process is shown below:

Here $X$ represents the space of messages transmitted, and $Y$ the space of messages received during a unit time over our channel. Let $p(y|x)$ be the conditional probability distribution function of $Y$ given $X$. We will consider $p(y|x)$ to be an inherent fixed property of our communications channel (representing the nature of the noise of our channel). Then the joint distribution of $X$ and $Y$ is completely determined by our channel and by our choice of $f(x)$, the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the signal, we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the channel capacity and is given by:

$$C = \max_{f} I(X; Y).$$

This capacity has the following property related to communicating at information rate $R$ (where $R$ is usually bits per symbol). For any information rate $R < C$ and coding error $\varepsilon > 0$, for large enough $N$, there exists a code of length $N$ and rate $\geq R$ and a decoding algorithm, such that the maximal probability of block error is $\leq \varepsilon$; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate $R > C$, it is impossible to transmit with arbitrarily small block error.

Channel coding is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.

Channel capacity of particular model channels
  • A continuous-time analog communications channel subject to Gaussian noise: see Shannon–Hartley theorem.


  • A binary symmetric channel (BSC) with crossover probability $p$ is a binary input, binary output channel that flips the input bit with probability $p$. The BSC has a capacity of $1 - H_b(p)$ bits per channel use, where $H_b$ is the binary entropy function.


  • A binary erasure channel (BEC) with erasure probability p is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is 1 - p bits per channel use.
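Channel capacity is the maximum of the mutual information over input distributions, so for a two-input channel it can be approximated by a simple search over P(X = 0). The sketch below does this for the binary symmetric and binary erasure channels and compares the results with the closed forms $1 - H_b(p)$ and $1 - p$. The grid search, the function names, and the value p = 0.1 are assumptions made for this illustration; practical computations typically use a dedicated algorithm instead of a grid.

```python
import math

def mutual_information(px, channel):
    """I(X;Y) in bits for input distribution px and channel[x][y] = p(y | x)."""
    n_out = len(channel[0])
    py = [sum(px[x] * channel[x][y] for x in range(len(px)))
          for y in range(n_out)]
    total = 0.0
    for x, p_x in enumerate(px):
        for y in range(n_out):
            p_xy = p_x * channel[x][y]
            if p_xy > 0:
                total += p_xy * math.log2(channel[x][y] / py[y])
    return total

def capacity_binary_input(channel, steps=1000):
    """Crude grid search over P(X = 0) for a channel with two inputs."""
    return max(mutual_information([k / steps, 1 - k / steps], channel)
               for k in range(steps + 1))

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1
bsc = [[1 - p, p], [p, 1 - p]]            # outputs: 0, 1
bec = [[1 - p, 0.0, p], [0.0, 1 - p, p]]  # outputs: 0, 1, erasure

print(capacity_binary_input(bsc), 1 - binary_entropy(p))
print(capacity_binary_input(bec), 1 - p)
```

For both channels the maximum is attained by the uniform input distribution, which the grid contains exactly.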

Applications to other fields


Intelligence uses and secrecy applications


Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the ban, was used in the Ultra project, breaking the German Enigma machine code and hastening the end of WWII in Europe. Shannon himself defined an important concept now called the unicity distance. Based on the redundancy of the plaintext, it attempts to give a minimum amount of ciphertext necessary to ensure unique decipherability.

Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A brute force attack can break systems based on asymmetric key algorithms or on most commonly used methods of symmetric key algorithms (sometimes called secret key algorithms), such as block ciphers. The security of all such methods currently comes from the assumption that no known attack can break them in a practical amount of time.

Information theoretic security refers to methods such as the one-time pad that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the key) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the Venona project was able to crack the one-time pads of the Soviet Union due to their improper reuse of key material.
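The two mutual-information claims about the one-time pad can be checked on the smallest possible case: a single plaintext bit XORed with a single uniformly random key bit. The plaintext distribution below is a hypothetical, deliberately skewed choice; the variable names are assumptions for this sketch.

```python
import math
from itertools import product

def mutual_information(joint):
    """I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

p_msg = {0: 0.8, 1: 0.2}  # hypothetical non-uniform plaintext bit
p_key = {0: 0.5, 1: 0.5}  # uniform key bit, independent of the plaintext

joint_mc = {}                                # (plaintext, ciphertext), key unknown
joint_mc_given_k = {k: {} for k in p_key}    # (plaintext, ciphertext) for each key
for (m, pm), (k, pk) in product(p_msg.items(), p_key.items()):
    c = m ^ k  # one-time pad encryption of a single bit
    joint_mc[(m, c)] = joint_mc.get((m, c), 0.0) + pm * pk
    joint_mc_given_k[k][(m, c)] = pm  # once k is fixed, c is determined by m

unconditional = mutual_information(joint_mc)
conditional = sum(p_key[k] * mutual_information(joint_mc_given_k[k])
                  for k in p_key)
print(unconditional, conditional)
```

Without the key the ciphertext reveals nothing about the plaintext, while with the key it reveals the plaintext completely, so the conditional mutual information equals the full entropy of the plaintext bit.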

Pseudorandom number generation

Pseudorandom number generators are widely available in computer language libraries and application programs. They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software. A class of improved random number generators is termed cryptographically secure pseudorandom number generators, but even they require random seeds external to the software to work as intended. These can be obtained via extractors, if done carefully. The measure of sufficient randomness in extractors is min-entropy, a value related to Shannon entropy through Rényi entropy; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptography uses.
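The gap between Shannon entropy and min-entropy can be seen on a hypothetical source in which one outcome occurs half the time and the remaining probability is spread evenly over 1024 other outcomes. Min-entropy is taken here as the negative base-2 logarithm of the largest outcome probability; the function names are assumptions for this sketch.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """Min-entropy in bits: -log2 of the most likely outcome's probability."""
    return -math.log2(max(probs))

# Hypothetical source: one heavy outcome plus 1024 equally rare ones.
probs = [0.5] + [0.5 / 1024] * 1024

print(shannon_entropy(probs), min_entropy(probs))
```

The Shannon entropy is about 6 bits, yet an adversary who always guesses the heavy outcome is right half the time, which is what the 1-bit min-entropy captures.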

Seismic Exploration

One early commercial application of information theory was in the field of seismic oil exploration. Work in this field made it possible to strip off and separate the unwanted noise from the desired seismic signal. Information theory and digital signal processing offer a major improvement of resolution and image clarity over previous analog methods.

Miscellaneous applications

Information theory also has applications in gambling and investing, black holes, bioinformatics, and music.

Footnotes


The classic work

  • Shannon, C.E.
    Claude Elwood Shannon

    Claude Elwood Shannon , an United States of America electronic engineer and mathematician, is known as "the father of information theory".Shannon is famous for having founded information theory with one landmark paper published in 1948....
     (1948), "A Mathematical Theory of Communication
    A Mathematical Theory of Communication

    "A Mathematical Theory of Communication" is an influential 1948 article by mathematician Claude E. Shannon....
    ", Bell System Technical Journal, 27, pp. 379–423 & 623–656, July & October, 1948.
  • R.V.L. Hartley, , Bell System Technical Journal, July 1928
  • Andrey Kolmogorov
    Andrey Kolmogorov

    Andrey Nikolaevich Kolmogorov was a Soviet Union Russian mathematician, preeminent in the 20th century who advanced various scientific fields ....
    (1968) "Three approaches to the quantitative definition of information" in International Journal of Computer Mathematics.


Other journal articles


  • J. L. Kelly, Jr., "," Bell System Technical Journal, Vol. 35, July 1956, pp. 917-26.


  • R. Landauer, Proc. Workshop on Physics and Computation PhysComp'92 (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp. 1-4.


  • R. Landauer, "" IBM J. Res. Develop. Vol. 5, No. 3, 1961


Textbooks on information theory

  • Claude E. Shannon, Warren Weaver. The Mathematical Theory of Communication. Univ of Illinois Press, 1949. ISBN 0-252-72548-4
  • Robert Gallager. Information Theory and Reliable Communication. New York: John Wiley and Sons, 1968. ISBN 0-471-29048-3
  • Robert B. Ash. Information Theory. New York: Interscience, 1965. ISBN 0-470-03445-9. New York: Dover 1990. ISBN 0-486-66521-6
  • Thomas M. Cover, Joy A. Thomas. Elements of information theory, 1st Edition. New York: Wiley-Interscience, 1991. ISBN 0-471-06259-6. 2nd Edition. New York: Wiley-Interscience, 2006. ISBN 0-471-24195-4.
  • Imre Csiszár, Janos Korner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Akademiai Kiado: 2nd edition, 1997. ISBN 9630574403
  • Raymond W. Yeung. Kluwer Academic/Plenum Publishers, 2002. ISBN 0-306-46791-7
  • David J. C. MacKay. Cambridge: Cambridge University Press, 2003. ISBN 0-521-64298-1
  • Raymond W. Yeung. Springer 2008, 2002. ISBN 978-0-387-79233-0
  • Stanford Goldman. Information Theory. New York: Prentice Hall, 1953. New York: Dover 1968 ISBN 0-486-62209-6, 2005 ISBN 0-486-44271-3
  • Fazlollah Reza. An Introduction to Information Theory. New York: McGraw-Hill 1961. New York: Dover 1994. ISBN 0-486-68210-2
  • Masud Mansuripur. Introduction to Information Theory. New York: Prentice Hall, 1987. ISBN 0-13-484668-0
  • Christoph Arndt: Information Measures, Information and its Description in Science and Engineering (Springer Series: Signals and Communication Technology), 2004, ISBN 978-3-540-40855-0


Other books

  • Leon Brillouin, Science and Information Theory, Mineola, N.Y.: Dover, [1956, 1962] 2004. ISBN 0-486-43918-6
  • A. I. Khinchin, Mathematical Foundations of Information Theory, New York: Dover, 1957. ISBN 0-486-60434-9
  • H. S. Leff and A. F. Rex, Editors, Maxwell's Demon: Entropy, Information, Computing, Princeton University Press, Princeton, NJ (1990). ISBN 0-691-08727-X
  • Tom Siegfried, The Bit and the Pendulum, Wiley, 2000. ISBN 0-471-32174-5
  • Charles Seife, Decoding The Universe, Viking, 2006. ISBN 0-670-03441-X
  • Jeremy Campbell, Grammatical Man, Touchstone/Simon & Schuster, 1982, ISBN 0-671-44062-4
  • Henri Theil, Economics and Information Theory, Rand McNally & Company - Chicago, 1967.


See also

  • Communication theory
  • List of important publications
  • Philosophy of information


Applications

  • Cryptography
  • Cryptanalysis
  • Entropy in thermodynamics and information theory
  • Seismic exploration
  • Intelligence (information gathering)
  • Gambling
  • Cybernetics


History

  • History of information theory
  • Timeline of information theory
  • Shannon, C.E.
  • Hartley, R.V.L.
  • Yockey, H.P.


External links

  • Gibbs, M., "Quantum Information Theory"
  • Schneider, T., "Information Theory Primer"
  • Srinivasa, S. "A Review on Multivariate Mutual Information".
  • Challis, J.
  • , by David MacKay - gives an entertaining and thorough introduction to Shannon theory, including state-of-the-art methods from coding theory, such as arithmetic coding, low-density parity-check codes, and Turbo codes.