In
cryptographyCryptography is the practice and study of hiding information. Modern cryptography intersects the disciplines of mathematics, computer science, and engineering...
, a
substitution cipher is a method of
encryptionIn cryptography, encryption is the process of transforming information using an algorithm to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information...
by which units of plaintext are replaced with
ciphertextIn cryptography, ciphertext is the result of the process of transforming information using an algorithm to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. This result is also known as encrypted information...
according to a regular system; the "units" may be single letters (the most common), pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing an inverse substitution.
Substitution ciphers can be compared with
transposition cipherIn cryptography, a transposition cipher is a method of encryption by which the positions held by units of plaintext are shifted according to a regular system, so that the ciphertext constitutes a permutation of the plaintext. That is, the order of the units is changed...
s. In a transposition cipher, the units of the plaintext are rearranged in a different and usually quite complex order, but the units themselves are left unchanged. By contrast, in a substitution cipher, the units of the plaintext are retained in the same sequence in the ciphertext, but the units themselves are altered.
There are a number of different types of substitution cipher. If the cipher operates on single letters, it is termed a
simple substitution cipher; a cipher that operates on larger groups of letters is termed
polygraphic. A
monoalphabetic cipher uses fixed substitution over the entire message, whereas a
polyalphabetic cipher uses a number of substitutions at different times in the message, where a unit from the plaintext is mapped to one of several possibilities in the ciphertext and vice-versa.
Simple substitution
Substitution over a single letter—
simple substitution—can be demonstrated by writing out the alphabet in some order to represent the substitution. This is termed a
substitution alphabet. The cipher alphabet may be shifted or reversed (creating the
CaesarIn cryptography, a Caesar cipher, also known as a Caesar's cipher, the shift cipher, Caesar's code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number...
and
AtbashAtbash is a simple substitution cipher for the Hebrew alphabet. It consists in substituting aleph for tav , beth for shin , and so on, reversing the alphabet. In the Book of Jeremiah, Lev Kamai is Atbash for Kasdim , and Sheshakh is Atbash for Bavel...
ciphers, respectively) or scrambled in a more complex fashion, in which case it is called a
mixed alphabet or
deranged alphabet. Traditionally, mixed alphabets are created by first writing out a keyword, removing repeated letters in it, then writing all the remaining letters in the alphabet.
Examples
Using this system, the keyword "
zebras" gives us the following alphabets:
| Plaintext alphabet: |
abcdefghijklmnopqrstuvwxyz |
| Ciphertext alphabet: |
ZEBRASCDFGHIJKLMNOPQTUVWXY |
A message of
flee at once. we are discovered!
enciphers to
SIAA ZQ LKBA. VA ZOA RFPBLUAOAR!
Traditionally, the ciphertext is written out in blocks of fixed length, omitting punctuation and spaces; this is done to help avoid transmission errors and to disguise word boundaries from the
plaintextIn cryptography, plaintext is information a sender wishes to transmit to a receiver. Cleartext is, sometimes confusingly, often used as a synonym. Before the computer era, plaintext most commonly meant message text in the language of the communicating parties...
. These blocks are called "groups", and sometimes a "group count" (i.e., the number of groups) is given as an additional check. Five letter groups are traditional, dating from when messages used to be transmitted by
telegraphTelegraphy is the long-distance transmission of written messages without physical transport of letters. It is a compound term formed from the Greek words tele = far and graphein = write. Radiotelegraphy or wireless telegraphy transmits messages using radio...
:
SIAAZ QLKBA VAZOA RFPBL UAOAR
If the length of the message happens not to be divisible by five, it may be padded at the end with "
nullNull means 'nothing' or without value or consequence.Null may refer to:* Null , a special value in computer programming* Null , a zero value in several branches of mathematics...
s". These can be any characters that decrypt to obvious nonsense, so the receiver can easily spot them and discard them.
The ciphertext alphabet is sometimes different from the plaintext alphabet; for example, in the
pigpen cipherThe pigpen cipher is a simple substitution cipher exchanging letters for symbols based on a grid. The use of symbols is no impediment to cryptanalysis however, and cryptanalysis is identical to that of other simple substitution schemes...
, the ciphertext consists of a set of symbols derived from a grid. For example:
Such features make little difference to the security of a scheme, however — at the very least, any set of strange symbols can be transcribed back into an A-Z alphabet and dealt with as normal.
In lists and catalogues for sales people sometimes a very simple encryption is used to replace numeric digits by letters.
| Plain digits: |
1234567890 |
| Ciphertext alphabet: |
MAKEPROFIT |
Example: MAT would be used to represent 120.
Security for simple substitution ciphers
A disadvantage of this method of derangement is that the last letters of the alphabet (which are mostly low frequency) tend to stay at the end. A stronger way of constructing a mixed alphabet is to perform a columnar transposition on the ordinary alphabet using the keyword, but this is not often done.
Although the number of possible
keyIn cryptography, a key is a piece of information that determines the functional output of a cryptographic algorithm or cipher. Without a key, the algorithm would have no result. In encryption, a key specifies the particular transformation of plaintext into ciphertext, or vice versa during decryption...
s is very large (26! ≈ 2
88.4, or about
88 bitsIn cryptography, key size or key length is the size of the key used in a cryptographic algorithm . An algorithm's key length is distinct from its cryptographic security, which is a logarithmic measure of the fastest known computational attack on the algorithm, also measured in bits...
), this cipher is not very strong, being easily broken. Provided the message is of reasonable length (see below), the
cryptanalystCryptanalysis is the study of methods for obtaining the meaning of encrypted information, without access to the secret information which is normally required to do so. Typically, this involves knowing how the system works and finding a secret key...
can deduce the probable meaning of the most common symbols by analyzing the
frequency distributionIn statistics, a frequency distribution is a tabulation of the values that one or more variables take in a sample.-Univariate frequency tables:...
of the ciphertext—
frequency analysisIn cryptanalysis, frequency analysis is the study of the frequency of letters or groups of letters in a ciphertext. The method is used as an aid to breaking classical ciphers....
. This allows formation of partial words, which can be tentatively filled in, progressively expanding the (partial) solution (see frequency analysis for a demonstration of this). In some cases, underlying words can also be determined from the pattern of their letters; for example,
attract,
osseous, and words with those two as the root are the only common
EnglishEnglish is a West Germanic language that developed in England during the Anglo-Saxon era. As a result of the military, economic, scientific, political, and cultural influence of the British Empire during the 18th, 19th, and early 20th centuries, and of the United States since the mid 20th century,...
words with the pattern
ABBCADB. Many people solve such ciphers for recreation, as with
cryptogramA cryptogram is a type of puzzle which consists of a short piece of encrypted text. Generally the cipher used to encrypt the text is simple enough that cryptogram can be solved by hand. Frequently used are substitution ciphers where each letter is replaced by a different letter or number. To solve...
puzzles in the newspaper.
According to the
unicity distanceUnicity distance is a term used in cryptography referring to the length of an original ciphertext needed to break the cipher by reducing the number of possible spurious keys to zero in a brute force attack...
of
EnglishEnglish is a West Germanic language that developed in England during the Anglo-Saxon era. As a result of the military, economic, scientific, political, and cultural influence of the British Empire during the 18th, 19th, and early 20th centuries, and of the United States since the mid 20th century,...
, 27.6 letters of ciphertext are required to crack a mixed alphabet simple substitution. In practice, typically about 50 letters are needed, although some messages can be broken with fewer if unusual patterns are found. In other cases, the plaintext can be contrived to have a nearly flat frequency distribution, and much longer plaintexts will then be required by the user.
Homophonic substitution
An early attempt to increase the difficulty of frequency analysis attacks on substitution ciphers was to disguise plaintext letter frequencies by
homophony. In these ciphers, plaintext letters map to more than one ciphertext symbol. Usually, the highest-frequency plaintext symbols are given more equivalents than lower frequency letters. In this way, the frequency distribution is flattened, making analysis more difficult.
Since more than 26 characters will be required in the ciphertext alphabet, various solutions are employed to invent larger alphabets. Perhaps the simplest is to use a numeric substitution 'alphabet'. Another method consists of simple variations on the existing alphabet; uppercase, lowercase, upside down, etc. More artistically, though not necessarily more securely, some homophonic ciphers employed wholly invented alphabets of fanciful symbols. (See
PoeEdgar Allan Poe was an American writer, poet, editor and literary critic, considered part of the American Romantic Movement. Best known for his tales of mystery and the macabre, Poe was one of the earliest American practitioners of the short story and is considered the inventor of the...
's "
The Gold-Bug"The Gold-Bug" is a short story by Edgar Allan Poe. Set on Sullivan's Island, South Carolina, the plot follows William Legrand, who was recently bitten by a gold-colored bug, as well as his servant Jupiter and an unnamed narrator...
" for a literary example; cf. the
Voynich manuscriptThe Voynich manuscript is a mysterious, undeciphered illustrated book. It is thought to have been written in the 15th or 16th century. The author, script, and language of the manuscript remain unknown....
.)
An interesting variant is the
nomenclator. Named after the public official who announced the titles of visiting dignitaries, this cipher combined a small
codebookIn cryptography, a codebook is a document used for implementing a code. A codebook contains a lookup table for coding and decoding; each word or phrase has one or more strings which replace it. To decipher messages written in code, corresponding copies of the codebook must be available at either end...
with large homophonic substitution tables. Originally the
codeIn cryptography, a code is a method used to transform a message into an obscured form, preventing those who do not possess special information, or key, required to apply the transform from understanding what is actually transmitted. The usual method is to use a codebook with a list of common...
was restricted to the names of important people, hence the name of the cipher; in later years it covered many common words and place names as well. The symbols for whole words (
codewords in modern parlance) and letters (
cipherIn cryptography, a cipher is an algorithm for performing encryption or decryption — a series of well-defined steps that can be followed as a procedure. An alternative, less common term is encipherment. In non-technical usage, a “cipher” is the same thing as a “code”; however, the concepts...
in modern parlance) were not distinguished in the ciphertext. The
RossignolsThe Rossignols, a family of French cryptographers and cryptanalysts, included:* Antoine Rossignol * Bonaventure Rossignol* Antoine-Bonaventure RossignolThe family name meant "nightingale" in French...
'
Great CipherIn the history of cryptography, the Great Cipher or Grand Chiffre was a nomenclator cipher developed by the Rossignols, several generations of whom served the French Crown as cryptographers. The Great Cipher was excellent of its class and so was given this name; it was reputed to be unbreakable...
used by
Louis XIV of FranceLouis XIV , popularly known as the Sun King , was King of France and of Navarre His reign, from 1643 to his death in 1715, lasted seventy-two years, three months, and eighteen days, and is the longest documented reign of any European monarch.Louis began personally governing France after the death...
was one; after it went out of use, messages in French
archiveAn archive is a collection of historical records, and the location in which the collection is kept. Archives contain records which have accumulated over the course of an individual or organization's lifetime....
s were unbreakable for several hundred years.
Nomenclators were the standard fare of
diplomaticDiplomacy is the art and practice of conducting negotiations between representatives of groups or states. It usually refers to international diplomacy, the conduct of international relations through the intercession of professional diplomats with regard to issues of peace-making, trade, war,...
correspondence,
espionageEspionage or spying involves an individual obtaining information that is considered secret or confidential without the permission of the holder of the information. Espionage is inherently clandestine, as the legitimate holder of the information may change plans or take other countermeasures once it...
, and advanced political
conspiracyIn a political sense, conspiracy refers to a group of persons united in the goal of usurping or overthrowing an established political power. Typically, the final goal is to gain power through a revolutionary coup d'état or through assassination....
from the early fifteenth century to the late eighteenth century; most conspirators were and have remained less cryptographically sophisticated. Although
governmentA government is the body within a community, political entity or organization which has the authority to make and enforce rules, laws and regulations.....
intelligenceAn intelligence agency is a governmental agency that is devoted to information gathering for purposes of national security and defense. Means of information gathering may include espionage, communication interception, cryptanalysis, cooperation with other institutions, and evaluation of public...
cryptanalysts were systematically breaking nomenclators by the mid-sixteenth century, and superior systems had been available since 1467, the usual response to
cryptanalysisCryptanalysis is the study of methods for obtaining the meaning of encrypted information, without access to the secret information which is normally required to do so. Typically, this involves knowing how the system works and finding a secret key...
was simply to make the tables larger. By the late eighteenth century, when the system was beginning to die out, some nomenclators had 50,000 symbols.
Nevertheless, not all nomenclators were broken; today, cryptanalysis of archived ciphertexts remains a fruitful area of
historical researchHistory is the study of the human past, with special attention to the written record. Scholars who write about history are called historians. It is a field of research which uses a narrative to examine and analyse the sequence of events, and it often attempts to investigate objectively the patterns...
.
The
book cipherA book cipher is a cipher in which the key is some aspect of a book or other piece of text; books being common and widely available in modern times, users of book cyphers take the position that the details of the key is sufficiently well hidden from attackers in practice. This is in some ways an...
and
straddling checkerboardIn cryptography, a straddling checkerboard is a device for converting an alphabetic plaintext into digits whilst simultaneously achieving fractionation and homophony...
are types of homophonic cipher.
Polyalphabetic substitution
Polyalphabetic substitution ciphers were first described in 1467 by
Leone Battista AlbertiLeon Battista Alberti was an Italian author, artist, architect, poet, priest, linguist, philosopher, and cryptographer, and general Renaissance humanist polymath...
in the form of disks.
Johannes TrithemiusJohannes Trithemius was born Johann Heidenberg. He was an abbot, cryptographer and occultist who had an influence on later occultism. The name by which he is more commonly known is derived from his native town of Trittenheim on the Mosel in Germany.-Life:He studied at the University of Heidelberg...
, in his book
Steganographia (
Ancient GreekAncient Greek is the historical stage in the development of the Greek language spanning across the Archaic , Classical , and Hellenistic periods of ancient Greece and the ancient world. It is predated in the 2nd millennium BC by Mycenaean Greek...
for "hidden writing") introduced the now more standard form of a
tableau (see below; ca. 1500 but not published until much later). A more sophisticated version using mixed alphabets was described in 1563 by Giovanni Battista della Porta in his book,
De Furtivis Literarum Notis (
LatinLatin is an Italic language originally spoken in Latium and Ancient Rome. Through the Roman conquest, Latin spread throughout the Mediterranean and a large part of Europe...
for "On concealed characters in writing").
In a polyalphabetic cipher, multiple cipher alphabets are used. To facilitate encryption, all the alphabets are usually written out in a large
tableA table is both a mode of visual communication and a means of arranging data. The use of tables is pervasive throughout all communication, research and data analysis. Tables appear in print media, handwritten notes, computer software, architectural ornamentation, traffic signs and many other places...
, traditionally called a
tableau. The tableau is usually 26×26, so that 26 full ciphertext alphabets are available. The method of filling the tableau, and of choosing which alphabet to use next, defines the particular polyalphabetic cipher. All such ciphers are easier to break than once believed, as substitution alphabets are repeated for sufficiently large plaintexts.
One of the most popular was that of
Blaise de VigenèreBlaise de Vigenère was a French diplomat and cryptographer. The Vigenère cipher is so named due to the cipher being incorrectly attributed to him in the 19th century....
. First published in 1585, it was considered unbreakable until 1863, and indeed was commonly called
le chiffre indéchiffrable (
FrenchFrench is a Romance language globally spoken by about 65 million people as a first language , by 50 million as a second language, and by about another 200 million people as an acquired foreign language, with significant speakers in 57 countries. Most native speakers of the language live in France,...
for "indecipherable cipher").
In the
Vigenère cipherThe Vigenère cipher is a method of encrypting alphabetic text by using a series of different Caesar ciphers based on the letters of a keyword. It is a simple form of polyalphabetic substitution....
, the first row of the tableau is filled out with a copy of the plaintext alphabet, and successive rows are simply shifted one place to the left. (Such a simple tableau is called a
tabula rectaIn cryptography, the tabula recta is a square table of alphabets, each row of which is made by shifting the previous one to the left. The term was invented by Johannes Trithemius in 1518, and used in his cipher....
, and mathematically corresponds to adding the plaintext and key letters,
moduloIn mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" after they reach a certain value—the modulus...
26.) A keyword is then used to choose which ciphertext alphabet to use. Each letter of the keyword is used in turn, and then they are repeated again from the beginning. So if the keyword is 'CAT', the first letter of plaintext is enciphered under alphabet 'C', the second under 'A', the third under 'T', the fourth under 'C' again, and so on. In practice, Vigenère keys were often phrases several words long.
In 1863,
Friedrich KasiskiMajor Friedrich Wilhelm Kasiski was a Prussian infantry officer, cryptographer and archeologist. Kasiski was born in Schlochau, West Prussia .-Military service:...
published a method (probably discovered secretly and independently before the
Crimean WarThe Crimean War was fought between the Russian Empire on one side and an alliance of the British Empire, France, the Ottoman Empire and the Kingdom of Sardinia on the other. The war was part of a long-running contest between the major European powers for influence over territories of the declining...
by
Charles BabbageCharles Babbage, FRS was an English mathematician, philosopher, inventor and mechanical engineer who originated the concept of a programmable computer. Parts of his uncompleted mechanisms are on display in the London Science Museum. In 1991, a perfectly functioning difference engine was...
) which enabled the calculation of the length of the keyword in a Vigenère ciphered message. Once this was done, ciphertext letters that had been enciphered under the same alphabet could be picked out and attacked separately as a number of semi-independent simple substitutions - complicated by the fact that within one alphabet letters were separated and did not form complete words, but simplified by the fact that usually a
tabula recta had been employed.
As such, even today a Vigenère type cipher should theoretically be difficult to break if mixed alphabets are used in the tableau, if the keyword is random, and if the total length of ciphertext is less than 27.6 times the length of the keyword. These requirements are rarely understood in practice, and so Vigenère enciphered message security is usually less than might have been.
Other notable polyalphabetics include:
- The Gronsfeld cipher. This is identical to the Vigenère except that only 10 alphabets are used, and so the "keyword" is numerical.
- The Beaufort cipher
The Beaufort cipher, created by Sir Francis Beaufort, is a substitution cipher that is similar to the Vigenère cipher but uses a slightly modified enciphering mechanism and tableau...
. This is practically the same as the Vigenère, except the tabula recta is replaced by a backwards one, mathematically equivalent to ciphertext = key - plaintext. This operation is self-inverse, whereby the same table is used for both encryption and decryption.
- The autokey cipher
An autokey cipher is a cipher which incorporates the message into the key. There are two forms of autokey cipher: key autokey and text autokey ciphers. A key-autokey cipher uses previous members of the keystream to determine the next element in the keystream...
, which mixes plaintext with a key to avoid periodicIn mathematics, a periodic function is a function that repeats its values in regular intervals or periods. The most important examples are the trigonometric functions, which repeat over intervals of length 2π...
ity.
- The running key cipher
In classical cryptography, the running key cipher is a type of polyalphabetic substitution cipher in which a text, typically from a book, is used to provide a very long keystream...
, where the key is made very long by using a passage from a book or similar text.
Modern
stream cipherIn cryptography, a stream cipher is a symmetric key cipher where plaintext bits are combined with a pseudorandom cipher bit stream , typically by an exclusive-or operation. In a stream cipher the plaintext digits are encrypted one at a time, and the transformation of successive digits varies...
s can also be seen, from a sufficiently abstract perspective, to be a form of polyalphabetic cipher in which all the effort has gone into making the
keystreamIn cryptography, a keystream is a stream of random or pseudorandom characters that are combined with a plaintext message to produce an encrypted message ....
as long and unpredictable as possible.
Polygraphic substitution
In a polygraphic substitution cipher, plaintext letters are substituted in larger groups, instead of substituting letters individually. The first advantage is that the frequency distribution is much flatter than that of individual letters (though not actually flat in real languages; for example, 'TH' is much more common than 'XQ' in English). Second, the larger number of symbols requires correspondingly more ciphertext to productively analyze letter frequencies.
To substitute
pairs of letters would take a substitution alphabet 676 symbols long . In the same
De Furtivis Literarum Notis mentioned above, della Porta actually proposed such a system, with a 20 x 20 tableau (for the 20 letters of the Italian/Latin alphabet he was using) filled with 400 unique
glyphA glyph is an element of writing. It is a slightly vague term, but a more precise definition might be an individual mark on paper or another written medium which contributes to the meaning of what is written there...
s. However the system was impractical and probably never actually used.
The earliest practical
digraphic cipher (pairwise substitution), was the so-called
Playfair cipherThe Playfair cipher or Playfair square is a manual symmetric encryption technique and was the first literal digraph substitution cipher. The scheme was invented in 1854 by Charles Wheatstone, but bears the name of Lord Playfair who promoted the use of the cipher.The technique encrypts pairs of...
, invented by Sir
Charles WheatstoneSir Charles Wheatstone FRS , was a British scientist and inventor of many scientific breakthroughs of the Victorian era, including the English concertina, the stereoscope , and the Playfair cipher...
in 1854. In this cipher, a 5 x 5 grid is filled with the letters of a mixed alphabet (two letters, usually I and J, are combined). A digraphic substitution is then simulated by taking pairs of letters as two corners of a rectangle, and using the other two corners as the ciphertext (see the
Playfair cipherThe Playfair cipher or Playfair square is a manual symmetric encryption technique and was the first literal digraph substitution cipher. The scheme was invented in 1854 by Charles Wheatstone, but bears the name of Lord Playfair who promoted the use of the cipher.The technique encrypts pairs of...
main article for a diagram). Special rules handle double letters and pairs falling in the same row or column. Playfair was in military use from the
Boer WarThe Second Boer War , commonly referred to as The Boer War and also known as the South African War , the Anglo-Boer War and in Afrikaans as the Anglo-Boereoorlog or Tweede Vryheidsoorlog , or the Engelse oorlog was fought...
through
World War IIWorld War II, or the Second World War , was a global military conflict which involved a majority of the world's nations, including all great powers, organized into two opposing military alliances: the Allies and the Axis...
.
Several other practical polygraphics were introduced in 1901 by
Felix DelastelleFélix Marie Delastelle was a Frenchman most famous for his invention of several systems of polygraphic substitution ciphers including the bifid, trifid, and the four-square ciphers....
, including the
bifidIn classical cryptography, the bifid cipher is a cipher which combines the Polybius square with transposition, and uses fractionation to achieve diffusion...
and
four-square cipherThe four-square cipher is a manual symmetric encryption technique. It was invented by famous French cryptographer Felix Delastelle.The technique encrypts pairs of letters , and thus falls into a category of ciphers known as polygraphic substitution ciphers...
s (both digraphic) and the
trifid cipherIn classical cryptography, the trifid cipher is a cipher invented around 1901 by Felix Delastelle, which extends the concept of the bifid cipher to a third dimension, allowing each symbol to be fractionated into 3 elements instead of two...
(probably the first practical trigraphic).
The
Hill cipherIn classical cryptography, the Hill cipher is a polygraphic substitution cipher based on linear algebra. Invented by Lester S. Hill in 1929, it was the first polygraphic cipher in which it was practical to operate on more than three symbols at once...
, invented in 1929 by Lester S. Hill, is a polygraphic substitution which can combine much larger groups of letters simultaneously using
linear algebraLinear algebra is a branch of mathematics concerned with the study of vectors, vector spaces , linear maps , and systems of linear equations. Vector spaces are a central theme in modern mathematics; thus, linear algebra is widely used in both abstract algebra and functional analysis...
. Each letter is treated as a digit in
base 26A numeral system is a writing system for expressing numbers, that is a mathematical notation for representing numbers of a given set, using graphemes or symbols in a consistent manner....
: A = 0, B =1, and so on. (In a variation, 3 extra symbols are added to make the
basisIn linear algebra, a basis is a set of vectors that, in a linear combination, can represent every vector in a given vector space or free module, and such that no element of the set can be represented as a linear combination of the others...
primeIn mathematics, a prime number is a natural number which has exactly two distinct natural number divisors: 1 and itself. The first twenty-six prime numbers are:An infinitude of prime numbers exists, as demonstrated by Euclid around 300 BC. The number 1 is by definition not a prime number...
.) A block of n letters is then considered as a
vectorA vector space is a mathematical structure formed by a collection of vectors: objects that may be added together and multiplied by numbers, called scalars in this context. Scalars are often taken to be real numbers, but one may also consider vector spaces with scalar multiplication by complex...
of n
dimensionIn mathematics and physics, the dimension of a space or object is informally defined as the minimum number of coordinates needed to specify each point within it. Thus a line has a dimension of one because only one coordinate is needed to specify a point on it...
s, and multiplied by a n x n
matrixIn mathematics, a matrix is a rectangular array of numbers, such asEntries of a matrix are often denoted by a variable with two subscripts, as shown on the right. Matrices of the same size can be added and subtracted entrywise and matrices of compatible size can be multiplied...
,
moduloIn mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" after they reach a certain value—the modulus...
26. The components of the matrix are the key, and should be random provided that the matrix is invertible in (to ensure decryption is possible). A Hill cipher of dimension 6 was once implemented mechanically.
The Hill cipher is vulnerable to a
known-plaintext attackThe known-plaintext attack is an attack model for cryptanalysis where the attacker has samples of both the plaintext and its encrypted version and is at liberty to make use of them to reveal further secret information such as secret keys and code books.-History:At Bletchley Park in World War II,...
because it is completely
linearThe word linear comes from the Latin word linearis, which means created by lines.In mathematics, a linear map or function f is a function which satisfies the following two properties......
, so it must be combined with some non-linear step to defeat this attack. The combination of wider and wider weak, linear
diffusiveIn cryptography, confusion and diffusion are two properties of the operation of a secure cipher which were identified by Claude Shannon in his paper Communication Theory of Secrecy Systems, published in 1949....
steps like a Hill cipher, with non-linear substitution steps, ultimately leads to a
substitution-permutation networkIn cryptography, an SP-network, or substitution-permutation network , is a series of linked mathematical operations used in block cipher algorithms such as AES ....
(e.g. a
Feistel cipherIn cryptography, a Feistel cipher is a symmetric structure used in the construction of block ciphers, named after the German-born physicist and cryptographer Horst Feistel who did pioneering research while working for IBM ; it is also commonly known as a Feistel network. A large proportion of block...
), so it is possible — from this extreme perspective — to consider modern
block cipherIn cryptography, a block cipher is a symmetric key cipher operating on fixed-length groups of bits, termed blocks, with an unvarying transformation. A block cipher encryption algorithm might take a 128-bit block of plaintext as input, and output a corresponding 128-bit block of ciphertext...
s as a type of polygraphic substitution.
Mechanical substitution ciphers
Between circa
World War IWorld War I , also known as the First World War, the Great War, and the War to End All Wars, was a global military conflict which involved most of the world's great powers, assembled in two opposing alliances: the Triple Entente and the Triple Alliance...
and the widespread availability of
computerA computer is a machine that manipulates data according to a set of instructions.Although mechanical examples of computers have existed through much of recorded human history, the first electronic computers were developed in the mid-20th century . These were the size of a large room, consuming as...
s (for some governments this was approximately the 1950s or 1960s; for other organizations it was a decade or more later; for individuals it was no earlier than 1975), mechanical implementations of polyalphabetic substitution ciphers were widely used. Several inventors had similar ideas about the same time, and
rotor cipher machineIn cryptography, a rotor machine is an electro-mechanical device used for encrypting and decrypting secret messages. Rotor machines were the cryptographic state-of-the-art for a brief but prominent period of history; they were in widespread use in the 1930s–1950s...
s were patented four times in 1919. The most important of the resulting machines was the
EnigmaAn Enigma machine is any of a family of related electro-mechanical rotor machines used for the encryption and decryption of secret messages. The first Enigma was invented by German engineer Arthur Scherbius at the end of World War I...
, especially in the versions used by the
German militaryWehrmacht was the name of the unified armed forces of Germany from 1935 to 1945. It consisted of the Heer , the Kriegsmarine and the Luftwaffe ....
from approximately 1930. The
AlliesIn general, allies are people, groups, or nations that have joined together in an association for mutual benefit or to achieve some common purpose. In English usage, those who share a common goal and whose work toward that goal is complementary may be viewed as allies for various purposes even when...
also developed and used rotor machines (eg,
SIGABAIn the history of cryptography, the ECM Mark II was a rotor machine used by the United States from World War II until the 1950s. The machine was also known as the SIGABA or Converter M-134 by the Army, or CSP-888/889 by the Navy, and a modified Navy version was termed the CSP-2900.Like many...
and
TypexIn the history of cryptography, Typex machines were British cipher machines used from 1937. It was an adaptation of the commercial German Enigma with a number of enhancements that greatly increased its security....
).
All of these were similar in that the substituted letter was chosen electrically from amongst the huge number of possible combinations resulting from the rotation of several letter disks. Since one or more of the disks rotated mechanically with each plaintext letter enciphered, the number of alphabets used was substantially more than astronomical. Early versions of these machine were, nevertheless, breakable.
William F. FriedmanWilliam Frederick Friedman was a US Army cryptographer . He ran the research division of the Army's Signals Intelligence Service in the 1930s, and parts of its follow-on services into the 1950s...
of the US Army's
SISThe Signals Intelligence Service was the United States Army codebreaking division, headquartered at Arlington Hall. It was a part of the Signal Corps so secret that outside the office of the Chief Signal officer, it did not officially exist. William Friedman began the division with three "junior...
early found vulnerabilities in
Hebern's rotor machineThe Hebern Rotor Machine was an electro-mechanical encryption machine built by combining the mechanical parts of a standard typewriter with the electrical parts of an electric typewriter, connecting the two through a scrambler...
, and GC&CS's Dillwyn Knox solved versions of the Enigma machine (those without the "plugboard") well before
WWIIWorld War II, or the Second World War , was a global military conflict which involved a majority of the world's nations, including all great powers, organized into two opposing military alliances: the Allies and the Axis...
began. Traffic protected by essentially all of the German military Enigmas was broken by Allied cryptanalysts, most notably those at
Bletchley ParkBletchley Park, also known as Station X, is an estate located in the town of Bletchley, in Buckinghamshire, England. Since 1967, Bletchley has been part of Milton Keynes....
, beginning with the German Army variant used in the early 1930s. This version was broken by inspired mathematical insight by
Marian RejewskiMarian Adam Rejewski was a Polish mathematician and cryptologist who in 1932 solved the plugboard-equipped Enigma machine, the main cipher device used by Germany...
in
PolandPoland , officially the Republic of Poland , is a country in Central Europe . Poland is bordered by Germany to the west; the Czech Republic and Slovakia to the south; Ukraine, Belarus and Lithuania to the east; and the Baltic Sea and Kaliningrad Oblast, a Russian exclave, to the north...
.
No messages protected by the
SIGABAIn the history of cryptography, the ECM Mark II was a rotor machine used by the United States from World War II until the 1950s. The machine was also known as the SIGABA or Converter M-134 by the Army, or CSP-888/889 by the Navy, and a modified Navy version was termed the CSP-2900.Like many...
and
TypexIn the history of cryptography, Typex machines were British cipher machines used from 1937. It was an adaptation of the commercial German Enigma with a number of enhancements that greatly increased its security....
machines were ever, so far as is publicly known, broken.
The one-time pad
One type of substitution cipher, the
one-time padIn cryptography, the one-time pad is an encryption algorithm in which the plaintext is combined with a secret random key or pad, which is used only once. A modular addition is typically used to combine plaintext elements with pad elements. It was invented in 1917 and patented a couple of years...
, is quite special. It was invented near the end of WWI by
Gilbert VernamGilbert Sandford Vernam was a AT&T Bell Labs engineer who, in 1917, invented the stream cipher and later co-invented the one-time pad cipher. Vernam proposed a teletype cipher in which a previously-prepared key, kept on paper tape, is combined character by character with the plaintext message to...
and
Joseph MauborgneIn the history of cryptography, Joseph Oswald Mauborgne co-invented the one-time pad with Gilbert Vernam of Bell Labs. In 1914 he published the first recorded solution of the Playfair cipher....
in the US. It was mathematically proved unbreakable by
Claude ShannonClaude Elwood Shannon , an American electronic engineer and mathematician, is known as "the father of information theory".Shannon is famous for having founded information theory with one landmark paper published in 1948...
, probably during WWII; his work was first published in the late 1940s. In its most common implementation, the one-time pad can be called a substitution cipher only from an unusual perspective; typically, the plaintext letter is combined (not substituted) in some manner (eg, XOR) with the key material character at that position.
The one-time pad is, in most cases, impractical as it requires that the key material be as long as the plaintext,
actually random, used once and
only once, and kept entirely secret from all except the sender and intended receiver. When these conditions are violated, even marginally, the one-time pad is no longer unbreakable.
SovietThe Union of Soviet Socialist Republics was a constitutionally socialist state that existed in Eurasia from 1922 to 1991. The name is a translation of the , tr. Soyuz Sovetskikh Sotsialisticheskikh Respublik, abbreviated СССР, SSSR. The common short name is Soviet Union, from , Sovetskiy Soyuz...
one-time pad messages sent from the US for a brief time during WWII used non-random key material. US cryptanalysts, beginning in the late 40s, were able to, entirely or partially, break a few thousand messages out of several hundred thousand. (See VENONA)
In a mechanical implementation, rather like the
ROCKEXRockex, or Telekrypton, was an offline one-time tape cipher machine known to have been used by Britain and Canada from 1943. It was developed by Benjamin deForest Bayly, working during the war for British Security Coordination....
equipment, the one-time pad was used for messages sent on the
MoscowMoscow is the capital and the largest city of Russia. It is also the largest metropolitan area in Europe, and ranks among the largest urban areas in the world. Moscow is a major political, economic, cultural, religious, financial, educational, and transportation centre of Russia and the world, a...
-
WashingtonWashington, D.C. , formally the District of Columbia and commonly referred to as Washington, the District, or simply D.C., is the capital of the United States, founded on July 16, 1790...
hot line established after the
Cuban missile crisisThe Cuban Missile Crisis was a confrontation between the United States, the Soviet Union, and Cuba in October 1962, during the Cold War. In Russia, former Eastern Bloc, and communist countries , it is termed the "Caribbean Crisis" , while in Cuba it is called the "October Crisis"...
.
Substitution in modern cryptography
Substitution ciphers as discussed above, especially the older pencil-and-paper hand ciphers, are no longer in serious use. However, the cryptographic concept of substitution carries on even today. From a sufficiently abstract perspective, modern bit-oriented
block cipherIn cryptography, a block cipher is a symmetric key cipher operating on fixed-length groups of bits, termed blocks, with an unvarying transformation. A block cipher encryption algorithm might take a 128-bit block of plaintext as input, and output a corresponding 128-bit block of ciphertext...
s (eg,
DESThe Data Encryption Standard is a block cipher that was selected by the National Bureau of Standards as an official Federal Information Processing Standard for the United States in 1976 and which has subsequently enjoyed widespread use internationally. It is based on a symmetric-key algorithm...
, or
AESIn cryptography, the Advanced Encryption Standard is an encryption standard adopted by the U.S. government. The standard comprises three block ciphers, AES-128, AES-192 and AES-256, adopted from a larger collection originally published as Rijndael. Each AES cipher has a 128-bit block size, with...
) can be viewed as substitution ciphers on an enormously large
binaryThe binary numeral system, or base-2 number system represents numeric values using two symbols, 0 and 1. More specifically, the usual base-2 system is a positional notation with a radix of 2...
alphabet. In addition, block ciphers often include smaller substitution tables called S-boxes. See also
substitution-permutation networkIn cryptography, an SP-network, or substitution-permutation network , is a series of linked mathematical operations used in block cipher algorithms such as AES ....
.
Substitution ciphers in popular culture
- Sherlock Holmes
Sherlock Holmes is a fictional character of the late nineteenth and early twentieth centuries who first appeared in publication in 1887. He is the creation of British author and physician Sir Arthur Conan Doyle...
breaks a substitution cipher in "The Adventure of the Dancing MenThe Adventure of the Dancing Men, one of the 56 Sherlock Holmes short stories written by British author Sir Arthur Conan Doyle, is one of 13 stories in the cycle collected as The Return of Sherlock Holmes.-Summary:...
."
- The Al Bhed language in Final Fantasy X
is a console role-playing game developed and published by Square as the tenth title in the Final Fantasy series. It was released in 2001 for Sony's PlayStation 2. The game marks the Final Fantasy series' transition from entirely pre-rendered backdrops to fully three-dimensional areas, and is also...
is actually a substitution cipher, although it is pronounced phonetically (i.e. "you" in English is translated to "oui" in Al Bhed, but is pronounced the same way that "oui" is pronounced in FrenchFrench is a Romance language globally spoken by about 65 million people as a first language , by 50 million as a second language, and by about another 200 million people as an acquired foreign language, with significant speakers in 57 countries. Most native speakers of the language live in France,...
).
- Uryuomoco, the fictional language from El Goonish Shive
El Goonish Shive is a contemporary fantasy webcomic, written and drawn by Dan Shive. It debuted on 2002-01-21 and was hosted on Keenspot from mid-2003 to early-2009, when it changed hosting to 910CMX. Shive rates EGS as being "13+"....
is a substitution cipher from English.
- The Minbari
The Minbari are a fictional alien race featured in the television show Babylon 5. The Minbari characters of Delenn, Lennier, Neroon, Draal, and Dukhat figure prominently in the series....
's alphabet from the Babylon 5Babylon 5 is an American science fiction television series created, produced and largely written by J. Michael Straczynski. The show centers on the Babylon 5 space station: a focal point for politics, diplomacy, and conflict during the years 2257–2262...
series is a substitution cipher from English.
- The language in Starfox Adventures: Dinosaur Planet spoken by native Saurians and Krystal
Krystal may refer to:*Krystal , one of the oldest fast-food chains in the United States, founded in 1932*Krystal, a video game character from the Star Fox series*Krystal, a character from a video game, Dark Wizard...
is also a substitution cipher of the English alphabetThe modern English alphabet is a Latin-based alphabet consisting of 26 letters – the same letters that are found in the Basic modern Latin alphabet:
The exact shape of printed letters varies depending on the typeface...
.
- The television program Futurama
Futurama is an animated American sci-fi sitcom created by Matt Groening and developed by Groening and David X. Cohen for the Fox network. The series follows the adventures of a late 20th-century New York City pizza delivery boy, Philip J...
contained a substitution cipher in which all 26 letters were replaced by symbols and called "Alien Language". This was deciphered rather quickly by the die hard viewers by showing a "Slurm" ad with the word "Drink" in both plain English and the Alien language thus giving the key. Later, the producers created a second alien language that used a combination of replacement and mathematical Ciphers. Once the English letter of the alien language is deciphered, then the numerical value of that letter (1 through 26 respectively) is then added to the value of the previous letter showing the actual intended letter. These messages can be seen throughout every episode of the series and it's subsequent movies.
See also
- Vigenère cipher
The Vigenère cipher is a method of encrypting alphabetic text by using a series of different Caesar ciphers based on the letters of a keyword. It is a simple form of polyalphabetic substitution....
- Topics in cryptography
- Ban (information)
A ban, sometimes called a hartley or a dit , is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit. As a bit corresponds to a binary digit, so a ban is a decimal digit...
with Centiban Table
External links