There are close parallels between the mathematical expressions for the thermodynamic entropy, usually denoted by S, of a physical system in the statistical thermodynamics established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s, and the information-theoretic entropy, usually expressed as H, of Claude Shannon and Ralph Hartley, developed in the 1940s. Shannon, although not initially aware of this similarity, commented on it upon publicizing information theory in "A Mathematical Theory of Communication".
This article explores what links there are between the two concepts, and how far they can be regarded as connected.
Discrete case
The defining expression for entropy in the theory of statistical mechanics, established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s, is of the form:
$$S = -k \sum_i p_i \ln p_i$$
where p_i is the probability of the microstate i taken from an equilibrium ensemble, and k is Boltzmann's constant.
The defining expression for entropy in the theory of information, established by Claude E. Shannon in 1948, is of the form:
$$H = -\sum_i p_i \log p_i$$
where p_i is the probability of the message m_i taken from the message space M.
Mathematically, H may also be seen as an average information, taken over the message space, because when a certain message occurs with probability p_i, the information −log(p_i) is obtained.
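The reading of H as an average of −log(p_i) can be sketched numerically. The following is a minimal illustration, not part of the original article: the function names and the four-message distribution are hypothetical, and base-2 logarithms are assumed so that the result is in bits.

```python
import math

def surprisal_bits(p):
    """Information, in bits, obtained when an outcome of probability p occurs."""
    return -math.log2(p)

def shannon_entropy_bits(probs):
    """Average surprisal H = -sum_i p_i log2 p_i; zero-probability terms add nothing."""
    return sum(p * surprisal_bits(p) for p in probs if p > 0)

# Hypothetical message space with four messages.
message_probs = [0.5, 0.25, 0.125, 0.125]
H = shannon_entropy_bits(message_probs)
print(f"H = {H} bits")
```

Rare messages carry more information individually, but they are weighted by their small probability when the average is taken.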
If all the microstates are equiprobable (a microcanonical ensemble), the statistical thermodynamic entropy reduces to the form on Boltzmann's tombstone,
$$S = k \ln W$$
where W is the number of microstates.
If all the messages are equiprobable, the information entropy reduces to the Hartley entropy
$$H = \log |M|$$
where |M| is the cardinality of the message space M.
The logarithm in the thermodynamic definition is the natural logarithm. It can be shown that the Gibbs entropy formula, with the natural logarithm, reproduces all of the properties of the macroscopic classical thermodynamics of Clausius. (See article: Entropy (statistical views).)
The logarithm can also be taken to the natural base in the case of information entropy. This is equivalent to choosing to measure information in nats instead of the usual bits. In practice, information entropy is almost always calculated using base 2 logarithms, but this distinction amounts to nothing other than a change in units. One nat is about 1.44 bits.
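That the choice of base is only a change of units can be checked directly. The sketch below is illustrative only: the three-outcome distribution is a made-up example, and the conversion factor 1/ln 2 is the number of bits per nat.

```python
import math

NATS_TO_BITS = 1.0 / math.log(2.0)  # bits per nat, roughly 1.44

def entropy_nats(probs):
    """Entropy computed with natural logarithms (units: nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_bits(probs):
    """Entropy computed with base-2 logarithms (units: bits)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

probs = [0.7, 0.2, 0.1]  # hypothetical distribution
h_nats = entropy_nats(probs)
h_bits = entropy_bits(probs)
print(h_nats, h_bits, h_nats * NATS_TO_BITS)
```

The last two printed values agree up to floating-point rounding, which is all the difference between the two conventions amounts to.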
The presence of Boltzmann's constant k in the thermodynamic definitions is a historical accident, reflecting the conventional units of temperature. It is there to make sure that the statistical definition of thermodynamic entropy matches the classical entropy of Clausius, thermodynamically conjugate to temperature. For a simple compressible system that can only perform volume work, the first law of thermodynamics becomes
$$dE = -p\,dV + T\,dS$$
where E is the internal energy, p the pressure and V the volume. But one can equally well write this equation in terms of what physicists and chemists sometimes call the 'reduced' or dimensionless entropy, σ = S/k, so that
$$dE = -p\,dV + kT\,d\sigma$$
Just as S is conjugate to T, so σ is conjugate to kT (the energy that is characteristic of T on a molecular scale).
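The relation between S and the reduced entropy σ = S/k can be made concrete with a small numerical sketch. The number of microstates and the temperature below are arbitrary illustrative choices, and the value of Boltzmann's constant is the exact SI value in J/K.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K (exact SI value)

def reduced_entropy(probs):
    """Dimensionless entropy sigma = -sum_i p_i ln p_i."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def gibbs_entropy(probs):
    """Thermodynamic entropy S = k * sigma, in J/K."""
    return K_BOLTZMANN * reduced_entropy(probs)

W = 1000                      # hypothetical number of equiprobable microstates
uniform = [1.0 / W] * W
sigma = reduced_entropy(uniform)   # equals ln W for a uniform distribution
S = gibbs_entropy(uniform)         # equals k ln W

T = 300.0                     # kelvin, illustrative
# The same energy appears whether one pairs T with S or kT with sigma.
print(T * S, (K_BOLTZMANN * T) * sigma)
```

The constant k only moves a factor between the entropy and the temperature; the product that enters the first law is unchanged.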
Continuous case
The most obvious extension of the Shannon entropy is the differential entropy,
$$H[f] = -\int f(x) \log f(x)\,dx$$
As long as f(x) is a probability density function (p.d.f.), H represents the entropy (average information, disorder, diversity, etc.) of f(x). For any uniform p.d.f. f(x), the exponential of H is the volume covered by f(x) (in analogy to the cardinality in the discrete case). The volume covered by an n-dimensional multivariate Gaussian distribution with moment matrix M is proportional to the volume of the ellipsoid of concentration and is equal to $(2\pi e)^{n/2} \det(M)^{1/2}$. The volume is always positive.
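The statement that the exponential of H acts as a volume can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions: it uses natural logarithms, a simple midpoint-rule integrator written for this example, a uniform density on an interval of length 4, and a Gaussian with standard deviation 2, for which the standard closed form of exp(H) is sqrt(2πe) times the standard deviation.

```python
import math

def differential_entropy(pdf, lo, hi, steps=200_000):
    """Midpoint-rule estimate of -integral f(x) ln f(x) dx over [lo, hi], in nats."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        fx = pdf(lo + (i + 0.5) * dx)
        if fx > 0.0:
            total -= fx * math.log(fx) * dx
    return total

def uniform_pdf(a, b):
    """Density that is constant on [a, b] and zero elsewhere."""
    return lambda x: 1.0 / (b - a) if a <= x <= b else 0.0

def gaussian_pdf(mu, sigma):
    """One-dimensional normal density with mean mu and standard deviation sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda x: norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

h_uniform = differential_entropy(uniform_pdf(0.0, 4.0), 0.0, 4.0)
h_gauss = differential_entropy(gaussian_pdf(0.0, 2.0), -40.0, 40.0)

volume_uniform = math.exp(h_uniform)   # compare with the interval length
volume_gauss = math.exp(h_gauss)       # compare with sqrt(2*pi*e) * sigma
print(volume_uniform, volume_gauss, math.sqrt(2.0 * math.pi * math.e) * 2.0)
```

For the uniform density the "volume" is simply the length of the interval; for the Gaussian it is an effective width proportional to the standard deviation.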
Entropy may be maximized using Gaussian adaptation, one of the evolutionary algorithms, while keeping the mean fitness (i.e. the probability of becoming a parent to new individuals in the population) constant, and without the need for any knowledge about entropy as a criterion function. This is illustrated by the figure below, showing Gaussian adaptation climbing a mountain crest in a phenotypic landscape. The lines in the figure are part of a contour line enclosing a region of acceptability in the landscape. At the start, the cluster of red points represents a very homogeneous population with small variances in the phenotypes. Evidently, even small environmental changes in the landscape may cause the process to become extinct.
After a sufficiently large number of generations, the increase in entropy may result in the green cluster. The mean fitness is the same for both the red and the green cluster (about 65%). The effect of this adaptation is not very salient in a 2-dimensional case, but in a high-dimensional case the efficiency of the search process may be increased by many orders of magnitude.
Moreover, a Gaussian distribution has the highest entropy among all distributions having the same second-order moment matrix (Middleton 1960).
However, the differential entropy is not in general a good measure of uncertainty or information. For example, it can be negative; it is also not invariant under continuous coordinate transformations. Jaynes showed in fact that the expression above is not the correct limit of the expression for a finite set of probabilities.
The correct expression, appropriate for the continuous case, is the relative entropy of a distribution, defined as the Kullback-Leibler divergence from the distribution to a reference measure m(x),
$$D_{\mathrm{KL}}(f \,\|\, m) = \int f(x) \log \frac{f(x)}{m(x)}\,dx$$
(or sometimes the negative of this).
The relative entropy carries over directly from discrete to continuous distributions, and is invariant under coordinate reparametrisations.
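Both defects of the differential entropy, and the invariance of the relative entropy, can be seen in a small example. This is a sketch under assumptions not made in the text: both the distribution and the reference measure are taken to be one-dimensional Gaussians so that standard closed-form expressions can be used, and the coordinate change is a plain rescaling of x by a factor c (for instance metres to millimetres).

```python
import math

def gaussian_differential_entropy(sigma):
    """Closed form, in nats, for a 1-D Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

def gaussian_kl(mu_f, sigma_f, mu_m, sigma_m):
    """Closed-form Kullback-Leibler divergence from N(mu_f, sigma_f^2) to N(mu_m, sigma_m^2)."""
    return (math.log(sigma_m / sigma_f)
            + (sigma_f ** 2 + (mu_f - mu_m) ** 2) / (2.0 * sigma_m ** 2)
            - 0.5)

c = 1000.0  # rescaling factor applied to the coordinate x

# The differential entropy is negative for a narrow density and shifts by ln(c) on rescaling.
h_before = gaussian_differential_entropy(0.1)
h_after = gaussian_differential_entropy(0.1 * c)

# The relative entropy is unchanged when both densities are rescaled together.
kl_before = gaussian_kl(0.0, 0.1, 0.5, 1.0)
kl_after = gaussian_kl(0.0, 0.1 * c, 0.5 * c, 1.0 * c)

print(h_before, h_after - h_before, kl_before, kl_after)
```

The differential entropy depends on the units chosen for x, whereas the relative entropy compares two densities expressed in the same coordinates and so does not.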
Theoretical relationship
Despite all that, there is an important difference between the two quantities. The information entropy H can be calculated for any probability distribution (if the "message" is taken to be that the event i which had probability p_i occurred, out of the space of the events possible). But the thermodynamic entropy S refers to thermodynamic probabilities p_i specifically.
Furthermore, the thermodynamic entropy S is dominated by the different arrangements of the system, and in particular of its energy, that are possible on a molecular scale. In comparison, the information entropy of any macroscopic event is so small as to be completely irrelevant.
However, a connection can be made between the two, if the probabilities in question are the thermodynamic probabilities p_i: the (reduced) Gibbs entropy σ can then be seen as simply the amount of Shannon information needed to define the detailed microscopic state of the system, given its macroscopic description. Or, in the words of G. N. Lewis writing about chemical entropy in 1930, "Gain in entropy always means loss of information, and nothing more". To be more concrete, in the discrete case using base two logarithms, the reduced Gibbs entropy is equal to the minimum number of yes/no questions that need to be answered, on average, in order to fully specify the microstate, given that we know the macrostate.
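The yes/no-question picture can be illustrated with a binary code, where each bit of a codeword answers one question. The sketch below uses Huffman's construction, which is a choice made for this example rather than something the article prescribes, together with a hypothetical set of four microstate probabilities that are powers of 1/2, the case in which the average number of questions matches the entropy exactly.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Codeword length (number of yes/no questions) for each outcome under a Huffman code."""
    # Heap entries: (probability, unique tie-breaker, indices of outcomes in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    next_id = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for idx in s1 + s2:
            lengths[idx] += 1  # merging adds one more question for these outcomes
        heapq.heappush(heap, (p1 + p2, next_id, s1 + s2))
        next_id += 1
    return lengths

def entropy_bits(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

microstate_probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical
lengths = huffman_code_lengths(microstate_probs)
average_questions = sum(p * n for p, n in zip(microstate_probs, lengths))
print(lengths, average_questions, entropy_bits(microstate_probs))
```

For probabilities that are not powers of 1/2, the average number of questions for a single microstate can exceed the entropy by up to one; the entropy remains the lower bound.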
Furthermore, the prescription to find the equilibrium distributions of statistical mechanics, such as the Boltzmann distribution, by maximising the Gibbs entropy subject to appropriate constraints (the Gibbs algorithm), can now be seen as something not unique to thermodynamics, but as a principle of general relevance in all sorts of statistical inference, if it is desired to find a maximally uninformative probability distribution, subject to certain constraints on the behaviour of its averages. (These perspectives are explored further in the article Maximum entropy thermodynamics.)
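The maximisation can be spot-checked on a toy system. The following sketch makes several assumptions for illustration: three energy levels 0, 1 and 2 in arbitrary units, an inverse temperature of 1, and a perturbation direction (1, −2, 1) chosen because it leaves both the normalisation and the mean energy unchanged. It only tests two neighbouring distributions rather than performing a full constrained optimisation.

```python
import math

def boltzmann(energies, beta):
    """Boltzmann distribution p_i proportional to exp(-beta * E_i)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def entropy_nats(probs):
    """Entropy -sum_i p_i ln p_i in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_energy(probs, energies):
    """Average energy under the given distribution."""
    return sum(p * e for p, e in zip(probs, energies))

energies = [0.0, 1.0, 2.0]          # hypothetical levels, arbitrary units
p_eq = boltzmann(energies, beta=1.0)

def perturbed(eps):
    """Shift probability along (1, -2, 1), which preserves sum and mean energy here."""
    return [p_eq[0] + eps, p_eq[1] - 2.0 * eps, p_eq[2] + eps]

for eps in (-0.02, 0.02):
    q = perturbed(eps)
    print(eps, mean_energy(q, energies) - mean_energy(p_eq, energies),
          entropy_nats(q) - entropy_nats(p_eq))
```

Both perturbed distributions satisfy the same constraints as the Boltzmann distribution but have lower entropy, as expected if the Boltzmann distribution is the constrained maximum.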
Szilard's engine
A neat physical thought-experiment demonstrating how just the possession of information might in principle have thermodynamic consequences was established in 1929 by Leó Szilárd, in a refinement of the famous Maxwell's demon scenario.
Consider Maxwell's set-up, but with only a single gas particle in a box. If the supernatural demon knows which half of the box the particle is in (equivalent to a single bit of information), it can close a shutter between the two halves of the box, close a piston unopposed into the empty half of the box, and then extract kT ln 2 joules of useful work if the shutter is opened again. The particle can then be left to isothermally expand back to its original equilibrium occupied volume. In just the right circumstances therefore, the possession of a single bit of Shannon information (a single bit of negentropy in Brillouin's term) really does correspond to a reduction in physical entropy, which theoretically can indeed be parlayed into useful physical work.
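The amount of work involved can be estimated by treating the single particle as an ideal gas, so that its pressure on the piston is kT/V, and integrating over the isothermal expansion from half the box to the full box. This is a sketch under that ideal-gas assumption; the temperature of 300 K and the unit box volume are arbitrary illustrative values.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K (exact SI value)

def szilard_work(temperature_kelvin):
    """Closed-form work kT ln 2 for a doubling of the accessible volume."""
    return K_BOLTZMANN * temperature_kelvin * math.log(2.0)

def isothermal_expansion_work(temperature_kelvin, v_start, v_end, steps=100_000):
    """Midpoint-rule integral of p dV with p = kT / V for one particle."""
    dv = (v_end - v_start) / steps
    work = 0.0
    for i in range(steps):
        v = v_start + (i + 0.5) * dv
        work += K_BOLTZMANN * temperature_kelvin / v * dv
    return work

T = 300.0        # kelvin, illustrative
V_BOX = 1.0      # arbitrary units; only the ratio of volumes matters
w_closed = szilard_work(T)
w_integrated = isothermal_expansion_work(T, V_BOX / 2.0, V_BOX)
print(w_closed, w_integrated)
```

At room temperature the result is of order 10^-21 joules per bit, which is why the effect is negligible on macroscopic scales.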
Landauer's principle
In fact one can generalise: any information that has a physical representation must somehow be embedded in the statistical mechanical degrees of freedom of a physical system.
Thus, Rolf Landauer argued in 1961, if one were to imagine starting with those degrees of freedom in a thermalised state, there would be a real reduction in thermodynamic entropy if they were then re-set to a known state. This can only be achieved under information-preserving microscopically deterministic dynamics if the uncertainty is somehow dumped somewhere else, i.e. if the entropy of the environment (or the non information-bearing degrees of freedom) is increased by at least an equivalent amount, as required by the Second Law, by gaining an appropriate quantity of heat: specifically kT ln 2 of heat for every 1 bit of randomness erased.
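The bound of kT ln 2 per erased bit is easy to evaluate for a concrete case. In the sketch below the quantity of data (one gigabyte, taken as 8 × 10^9 bits) and the temperature of 300 K are hypothetical inputs, and the function names are invented for this example; the result is a theoretical minimum, not a description of the dissipation of any real device.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K (exact SI value)

def landauer_min_heat(bits_erased, temperature_kelvin):
    """Minimum heat, in joules, released to the environment when erasing the given bits."""
    return bits_erased * K_BOLTZMANN * temperature_kelvin * math.log(2.0)

def min_environment_entropy_increase(bits_erased):
    """Minimum entropy gain of the environment, in J/K: k ln 2 per erased bit."""
    return bits_erased * K_BOLTZMANN * math.log(2.0)

bits = 8e9   # hypothetical: one gigabyte of random data
T = 300.0    # kelvin, illustrative

heat = landauer_min_heat(bits, T)
delta_s = min_environment_entropy_increase(bits)
print(heat, delta_s, T * delta_s)
```

The heat equals the temperature times the entropy transferred to the environment, which is the bookkeeping the Second Law requires.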
On the other hand, Landauer argued, there is no thermodynamic objection to a logically reversible operation potentially being achieved in a physically reversible way in the system. It is only logically irreversible operations — for example, the erasing of a bit to a known state, or the merging of two computation paths — which must be accompanied by a corresponding entropy increase.
Applied to the Maxwell's demon/Szilard engine scenario, this suggests that it might be possible to "read" the state of the particle into a computing apparatus with no entropy cost; but only if the apparatus has already been SET into a known state, rather than being in a thermalised state of uncertainty. To SET (or RESET) the apparatus into this state will cost all the entropy that can be saved by knowing the state of Szilard's particle.
Negentropy
Shannon entropy has been related by physicist Léon Brillouin to a concept sometimes called negentropy. In his 1962 book Science and Information Theory, Brillouin described the Negentropy Principle of Information or NPI, the gist of which is that acquiring information about a system's microstates is associated with a decrease in entropy (work is needed to extract information, erasure leads to increase in thermodynamic entropy).
There is no violation of the second law of thermodynamics, according to Brillouin, since a reduction in any local system's thermodynamic entropy results in an increase in thermodynamic entropy elsewhere. Negentropy was considered controversial because its earlier understanding can yield a Carnot efficiency higher than one.
In 2003, Weiss and Weiss concluded, on the basis of psychometric data and theoretical considerations, that the negentropy principle underlies information processing by the brain. In 2008 this was empirically confirmed by a group of neurobiologists. In 2009, Mahulikar & Herwig redefined thermodynamic negentropy as the specific entropy deficit of the dynamically ordered sub-system relative to its surroundings. This definition enabled the formulation of the Negentropy Principle, which is mathematically shown to follow from the 2nd Law of Thermodynamics, during order existence.
Black holes
Stephen Hawking often speaks of the thermodynamic entropy of black holes in terms of their information content. Do black holes destroy information? See Black hole thermodynamics and Black hole information paradox.
Quantum theory
Hirschman showed in 1957 that Heisenberg's uncertainty principle can be expressed as a particular lower bound on the sum of the entropies of the observable probability distributions of a particle's position and momentum, when they are expressed in Planck units. (One could speak of the "joint entropy" of these distributions by considering them independent, but since they are not jointly observable, they cannot be considered as a joint distribution.)
It is well known that a Shannon-based definition of information entropy leads in the classical case to the Boltzmann entropy. It is tempting to regard the Von Neumann entropy as the corresponding quantum mechanical definition. But the latter is problematic from a quantum information point of view. Consequently Stotland, Pomeransky, Bachmat and Cohen have introduced a new definition of entropy that reflects the inherent uncertainty of quantum mechanical states. This definition allows one to distinguish between the minimum uncertainty entropy of pure states, and the excess statistical entropy of mixtures.
The fluctuation theorem
The fluctuation theorem provides a mathematical justification of the second law of thermodynamics under these principles, and precisely defines the limitations of the applicability of that law to the microscopic realm of individual particle movements.
Is information quantized?
In 1995, Dr Tim Palmer signalled two unwritten assumptions about Shannon's definition of information that may make it inapplicable as such to quantum mechanics:
- The supposition that there is such a thing as an observable state (for instance the upper face of a die or a coin) before the observation begins
- The fact that knowing this state does not depend on the order in which observations are made (commutativity)
The article Conceptual inadequacy of the Shannon information in quantum measurement, published in 2001 by Anton Zeilinger and Caslav Brukner, synthesized and developed these remarks. The so-called Zeilinger's principle suggests that the quantization observed in QM could be bound to information quantization (one cannot observe less than one bit, and what is not observed is by definition "random"). But these claims remain highly controversial.
See also
- Thermodynamic entropy
- Information entropy
- Thermodynamics
- Statistical mechanics
- Information theory
- Physical information
- Fluctuation theorem
- Black hole entropy
- Black hole information paradox
- Entropy (information theory)
- Entropy (statistical thermodynamics)
- Entropy (order and disorder)
External links