Kraft's inequality
Encyclopedia
In coding theory
Coding theory
Coding theory is the study of the properties of codes and their fitness for a specific application. Codes are used for data compression, cryptography, error-correction and more recently also for network coding...

, Kraft's inequality, named after Leon Kraft, gives a necessary and sufficient condition for the existence of a uniquely decodable code for a given set of codeword lengths. Its applications to prefix codes and trees often find use in computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

 and information theory
Information theory
Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and...

.

More specifically, Kraft's inequality limits the lengths of codewords in a prefix code
Prefix code
A prefix code is a type of code system distinguished by its possession of the "prefix property"; which states that there is no valid code word in the system that is a prefix of any other valid code word in the set...

: if one takes an exponential function
Exponential function
In mathematics, the exponential function is the function ex, where e is the number such that the function ex is its own derivative. The exponential function is used to model a relationship in which a constant change in the independent variable gives the same proportional change In mathematics,...

 of each length, the resulting values must look like a probability mass function
Probability mass function
In probability theory and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value...

. Kraft's inequality can be thought of in terms of a constrained budget to be spent on codewords, with shorter codewords being more expensive.
  • If Kraft's inequality holds with strict inequality, the code has some redundancy
    Redundancy (information theory)
    Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. Informally, it is the amount of wasted "space" used to transmit certain data...

    .
  • If Kraft's inequality holds with strict equality, the code in question is a complete code.
  • If Kraft's inequality does not hold, the code is not uniquely decodable.


Kraft's inequality was published by . However, Kraft's paper discusses only prefix codes, and attributes the analysis leading to the inequality to Raymond Redheffer
Raymond Redheffer
Raymond Moos Redheffer was an American mathematician. He was the creator of the first electronic games known by us, the knowledge game called Nim...

. The inequality is sometimes also called the Kraft–McMillan theorem after the independent discovery of the result by ; McMillan proves the result for the general case of uniquely decodable codes, and attributes the version for prefix codes to a spoken observation in 1955 by Joseph Leo Doob
Joseph Leo Doob
Joseph Leo Doob was an American mathematician, specializing in analysis and probability theory.The theory of martingales was developed by Doob.-Early life and education:...

.

Binary trees

Any binary tree
Binary tree
In computer science, a binary tree is a tree data structure in which each node has at most two child nodes, usually distinguished as "left" and "right". Nodes with children are parent nodes, and child nodes may contain references to their parents. Outside the tree, there is often a reference to...

 can be viewed as defining a prefix code for the leaves of the tree. Kraft's inequality states that


Here the sum is taken over the leaves of the tree, i.e. the nodes without any children. The depth is the distance to the root node. In the tree to the right, this sum is

Chaitin's constant

In algorithmic information theory
Algorithmic information theory
Algorithmic information theory is a subfield of information theory and computer science that concerns itself with the relationship between computation and information...

, Chaitin's constant
Chaitin's constant
In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that informally represents the probability that a randomly constructed program will halt...

 is defined as


This is an infinite sum
Series (mathematics)
A series is the sum of the terms of a sequence. Finite sequences and series have defined first and last terms, whereas infinite sequences and series continue indefinitely....

 which has one summand for every syntactically correct program which halts. |p| stands for the length of the bit string of p. The programs are required to be prefix-free in the sense that no summand has a prefix representing a syntactically valid program that halts. Hence the bit strings are prefix codes, and Kraft's inequality gives that .

Formal statement

Let each source symbol from the alphabet

be encoded into a uniquely decodable code over an alphabet of size with codeword lengths


Then


Conversely, for a given set of natural numbers satisfying the above inequality, there exists a uniquely decodable code over an alphabet of size with those codeword lengths.

A commonly occurring special case of a uniquely decodable code
Code
A code is a rule for converting a piece of information into another form or representation , not necessarily of the same type....

 is a prefix code
Prefix code
A prefix code is a type of code system distinguished by its possession of the "prefix property"; which states that there is no valid code word in the system that is a prefix of any other valid code word in the set...

. Kraft's inequality therefore also holds for any prefix code
Prefix code
A prefix code is a type of code system distinguished by its possession of the "prefix property"; which states that there is no valid code word in the system that is a prefix of any other valid code word in the set...

.

Proof for prefix codes

Suppose that . Let be the full -ary tree of depth . Every word of length over an -ary alphabet corresponds to a node in this tree at depth . The th word in the prefix code
Prefix code
A prefix code is a type of code system distinguished by its possession of the "prefix property"; which states that there is no valid code word in the system that is a prefix of any other valid code word in the set...

corresponds to a node ; let be the set of all leaf nodes in the subtree of rooted at . Clearly

Since the code is a prefix code,
.
Thus, given that the total number of nodes at depth is ,


from which the result follows.

Conversely, given any ordered sequence of natural numbers,


satisfying the Kraft's inequality, one can construct a prefix code with codeword lengths equal to by pruning subtrees from a full -ary tree of depth . First choose any node from the full tree at depth and remove all of its descendents. This removes fraction of the nodes from the full tree from being considered for the rest of the remaining codewords. Next iteration removes fraction of the full tree for total of . After iterations,


fraction of the full tree nodes are removed from consideration for any remaining codewords. But, by the assumption, this sum is less than 1 for all , thus prefix code with lengths can be constructed for all source symbols.

Proof of the general case

Consider the generating function in inverse of x for the code S


in which —the coefficient in front of —is the number of distinct codewords of length . Here min is the length of the shortest codeword in S, and max is the length of the longest codeword in S.

For any positive integer m consider the m-fold product Sm,
which consists of all the words of the form , where
are indices between 1 and m. Note that, since S was assumed to uniquely decodable,
if , then .
In other words, every word in comes from a unique sequence of codewords in . Because of this property,
one can compute the generating function for from the generating function as


Here, similarly as before, —the coefficient in front of in —is the number of words of length in . Clearly, can not exceed .
Hence for any positive x


Substituting the value x = r we have

for any positive integer . Left side of the inequality grows exponentially in
and right side only linearly. The only possibility for the inequality to be valid for all
is that .
Looking back on the definition of we finally get the inequality.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK