Check digit
A check digit is a form of redundancy check used for error detection, the decimal equivalent of a binary checksum. It consists of a single digit computed from the other digits in the message.

With a check digit, one can detect simple errors in the input of a series of digits, such as a single mistyped digit or some permutations of two successive digits.

Design

Check digit algorithms are generally designed to capture human transcription errors. In order of complexity, these include the following:
  • single digit errors, such as 1 → 2
  • transposition errors, such as 12 → 21
  • twin errors, such as 11 → 22
  • jump transposition errors, such as 132 → 231
  • jump twin errors, such as 131 → 232
  • phonetic errors, such as 60 → 16 ("sixty" to "sixteen")


In choosing a system, a high probability of catching errors is traded off against implementation difficulty; simple check digit systems are easily understood and implemented by humans but do not catch as many errors as complex ones, which require sophisticated programs to implement.

A desirable feature is that left-padding with zeros should not change the check digit. This allows variable-length numbers to be used and the length to be changed.

If there is a single check digit added to the original number, the system will not always capture multiple errors, such as two replacement errors (12 → 34), though typically double errors will be caught 90% of the time (for a double error to go undetected, the two changes would need to alter the output by offsetting amounts).

A very simple check digit method would be to take the sum of all digits (digital sum) modulo 10. This would catch any single-digit error, as such an error would always change the sum, but does not catch any transposition errors (switching two digits) as re-ordering does not change the sum.
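The digit-sum method can be sketched in a few lines of Python. The function name digit_sum_check is made up for this illustration, and the sample number 4871 is borrowed from the weighted example later in this section; this is a minimal sketch, not a reference implementation.

```python
def digit_sum_check(payload: str) -> int:
    """Return the check digit: the sum of the payload digits, modulo 10."""
    return sum(int(ch) for ch in payload) % 10


original = digit_sum_check("4871")      # 4 + 8 + 7 + 1 = 20, so the check digit is 0
single_error = digit_sum_check("4971")  # one digit mistyped: the sum changes
transposed = digit_sum_check("8471")    # two digits swapped: the sum is unchanged

print(original, single_error, transposed)
```

The single-digit error yields a different check digit, while the transposition yields the same one and so goes unnoticed. Left-padding with zeros also leaves the result unchanged.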

A slightly more complex method is to take the weighted sum of the digits, modulo 10, with different weights for each number position.

To illustrate, if the weights for a four-digit number were 5, 3, 2, 7 and the number to be coded was 4871, then one would take 5×4 + 3×8 + 2×7 + 7×1 = 65, i.e. 5 modulo 10, and the check digit would be 5, giving 48715.
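The weighted-sum calculation above can be reproduced as follows. The function name weighted_check and its signature are assumptions chosen for this sketch; the weights and the number are the ones from the example.

```python
from typing import Sequence


def weighted_check(payload: str, weights: Sequence[int]) -> int:
    """Return the weighted sum of the payload digits, modulo 10."""
    if len(payload) != len(weights):
        raise ValueError("payload and weights must have the same length")
    return sum(w * int(d) for w, d in zip(weights, payload)) % 10


weights = [5, 3, 2, 7]
check = weighted_check("4871", weights)  # 5*4 + 3*8 + 2*7 + 7*1 = 65, so 5
print("4871" + str(check))               # the coded number, 48715

# Swapping the first two digits now changes the result (5*8 + 3*4 + 14 + 7 = 73).
print(weighted_check("8471", weights))
```

Unlike the plain digit sum, this scheme notices the transposition 4871 → 8471, because the two swapped digits carry different weights.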

Systems with weights of 1, 3, 7, or 9, with the weights on neighboring numbers being different, are widely used: for example, 31 31 weights in UPC codes, 13 13 weights in EAN numbers (GS1 algorithm), and the 371 371 371 weights used in United States bank routing transit numbers. This system detects all single-digit errors and around 90% of transposition errors. 1, 3, 7, and 9 are used because they are coprime to 10, so changing any digit changes the check digit; using a coefficient that is divisible by 2 or 5 would lose information (because, for example, 5×0 ≡ 5×2 ≡ 5×4 ≡ 0 modulo 10) and thus not catch some single-digit errors. Using different weights on neighboring numbers means that most transpositions change the check digit; however, because all weights differ by an even number, this does not catch transpositions of two digits that differ by 5 (0 and 5, 1 and 6, 2 and 7, 3 and 8, 4 and 9), since the 2 and 5 multiply to yield 10.
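Both claims in the preceding paragraph can be checked by brute force. The sketch below assumes a 3-1 weighting on two neighboring positions; the variable names are illustrative only.

```python
# Weights w for which d -> (w * d) mod 10 maps the ten digits to ten distinct
# values, so that no single-digit change can go unnoticed.
lossless = [w for w in range(1, 10)
            if len({(w * d) % 10 for d in range(10)}) == 10]

# Pairs of distinct neighboring digits (a, b), a < b, whose swap leaves a
# 3-1 weighted sum unchanged modulo 10.
missed = [(a, b) for a in range(10) for b in range(a + 1, 10)
          if (3 * a + b) % 10 == (a + 3 * b) % 10]

# There are 45 unordered pairs of distinct digits in total.
detected_share = 1 - len(missed) / 45

print(lossless)
print(missed)
print(round(detected_share, 3))
```

The first list contains exactly the weights coprime to 10, and the second contains exactly the five pairs of digits that differ by 5, which is where the figure of roughly 90% comes from (40 of 45 pairs).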

The ISBN-10 code instead uses modulo 11, which is prime, and all the number positions have different weights (1, 2, ..., 10). This system thus detects all single digit substitution and transposition errors (including jump transpositions), but at the cost of the check digit possibly being 10, represented by "X". (An alternative is simply to avoid using the serial numbers which result in an "X" check digit.) ISBN-13 instead uses the GS1 algorithm used in EAN numbers.
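The detection claim for a modulo 11 scheme with ten distinct weights can be verified exhaustively. The sketch below assumes the weights 1 to 10 and allows symbol values 0 to 10 (10 standing for "X"); the function names are made up for this illustration.

```python
MOD = 11
WEIGHTS = range(1, 11)  # assumed weights 1..10, one per position
VALUES = range(11)      # digit values 0..9, plus 10 for "X"


def missed_substitutions():
    """Substitutions a -> b at weight w that leave the sum unchanged mod 11."""
    return [(w, a, b) for w in WEIGHTS for a in VALUES for b in VALUES
            if a != b and (w * (a - b)) % MOD == 0]


def missed_transpositions():
    """Swaps of values a, b between any two positions that go unnoticed.

    Swapping a (weight wi) with b (weight wj) changes the sum by
    (wi - wj) * (a - b); the swap is missed only if that is 0 mod 11.
    """
    return [(wi, wj, a, b)
            for wi in WEIGHTS for wj in WEIGHTS if wi < wj
            for a in VALUES for b in VALUES
            if a != b and ((wi - wj) * (a - b)) % MOD == 0]


print(len(missed_substitutions()), len(missed_transpositions()))
```

Both lists come out empty: because 11 is prime, a product of two nonzero factors smaller than 11 is never a multiple of 11. Since any two positions are covered, jump transpositions are included.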

More complicated algorithms include the Luhn algorithm (1954), which captures 98% of adjacent-digit transposition errors (it does not detect 90 ↔ 09), while more sophisticated is the Verhoeff algorithm (1969), which catches all single digit substitution and transposition errors, and many (but not all) more complex errors. Both these methods use a single check digit and will therefore fail to capture around 10% of more complex errors. To reduce this failure rate, it is necessary to use more than one check digit (for example, the modulo 97 check referred to below, which uses two check digits; for the algorithm, see International Bank Account Number) and/or to use a wider range of characters in the check digit, for example letters plus numbers.
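The article does not spell out the Luhn algorithm, so the following is a sketch of its commonly described form: starting from the rightmost digit, every second digit is doubled (subtracting 9 if the result exceeds 9), and the total must be a multiple of 10. The function names are chosen for this illustration.

```python
def luhn_sum(digits: str) -> int:
    """Luhn total of a digit string whose last digit is the check digit."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:   # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9   # same as adding the two digits of the product
        total += d
    return total


def luhn_is_valid(number: str) -> bool:
    """True if the number, including its check digit, passes the Luhn test."""
    return luhn_sum(number) % 10 == 0


def luhn_check_digit(payload: str) -> int:
    """Check digit that makes payload + digit pass the Luhn test."""
    return (10 - luhn_sum(payload + "0") % 10) % 10


print(luhn_is_valid("79927398713"))
print(luhn_check_digit("12"), luhn_check_digit("21"))  # transposition detected
print(luhn_check_digit("09"), luhn_check_digit("90"))  # transposition missed
```

The payloads 09 and 90 receive the same check digit, which is the one adjacent transposition the scheme cannot detect.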

UPC

The final digit of a Universal Product Code is a check digit computed as follows:
  1. Add the digits (up to but not including the check digit) in the odd-numbered positions (first, third, fifth, etc.) together and multiply by three.
  2. Add the digits (up to but not including the check digit) in the even-numbered positions (second, fourth, sixth, etc.) to the result.
  3. Take the remainder of the result divided by 10 (modulo operation) and subtract this from 10 to derive the check digit.

For instance, the UPC-A barcode for a box of tissues is "036000241457". The last digit is the check digit "7", and if the other numbers are correct then the check digit calculation must produce 7.
  1. Add the digits in the odd-numbered positions: 0+6+0+2+1+5 = 14
  2. Multiply the result by 3: 14 × 3 = 42
  3. Add the digits in the even-numbered positions: 3+0+0+4+4 = 11
  4. Add the two results together: 42 + 11 = 53
  5. To calculate the check digit, take the remainder of 53 divided by 10 (53 modulo 10), which is 3, and subtract it from 10. Therefore, the check digit value is 7.

As another example, consider calculating the check digit for a food item whose first eleven digits are "01010101010".
  1. Add the digits in the odd-numbered positions: 0+0+0+0+0+0 = 0
  2. Multiply the result by 3: 0 × 3 = 0
  3. Add the digits in the even-numbered positions: 1+1+1+1+1 = 5
  4. Add the two results together: 0 + 5 = 5
  5. To calculate the check digit, take the remainder of 5 divided by 10 (5 modulo 10), which is 5, and subtract it from 10. Therefore, the check digit value is 5.

If the remainder is 0, subtracting from 10 would give 10. In that case, use 0 as the check digit.
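The UPC-A steps can be sketched as follows. The function name upc_a_check_digit is an assumption made for this example; it takes the first eleven digits and returns the twelfth.

```python
def upc_a_check_digit(payload: str) -> int:
    """Check digit for the first eleven digits of a UPC-A code."""
    if len(payload) != 11 or any(ch not in "0123456789" for ch in payload):
        raise ValueError("expected exactly 11 decimal digits")
    odd = sum(int(d) for d in payload[0::2])   # positions 1, 3, ..., 11
    even = sum(int(d) for d in payload[1::2])  # positions 2, 4, ..., 10
    remainder = (3 * odd + even) % 10
    # The outer modulo turns a result of 10 (remainder 0) into 0.
    return (10 - remainder) % 10


print(upc_a_check_digit("03600024145"))  # the tissue box example
print(upc_a_check_digit("01010101010"))  # the food item example
```

The final `% 10` covers the case where the remainder is 0, so no separate branch is needed.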

ISBN 10

The final character of a ten digit International Standard Book Number is a check digit computed so that multiplying each digit by its position in the number (counting from the right) and taking the sum of these products modulo 11 is 0. The digit the farthest to the right (which is multiplied by 1) is the check digit, chosen to make the sum correct. It may need to have the value 10, which is represented as the letter X. For example, take the ISBN 0-201-53082-1. The sum of products is 0×10 + 2×9 + 0×8 + 1×7 + 5×6 + 3×5 + 0×4 + 8×3 + 2×2 + 1×1 = 99 ≡ 0 modulo 11. So the ISBN is valid.

While this may seem more complicated than the first scheme, it can be validated simply by adding all the products together and checking that the total is divisible by 11. The sum can be computed without any multiplications by initializing two variables, t and sum, to 0 and repeatedly performing t = t + digit; sum = sum + t; for each digit from left to right (which can be expressed in C as sum += t += digit;). If the final sum is a multiple of 11, the ISBN is valid.
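Both ways of validating an ISBN-10 can be sketched side by side. The helper and function names below are illustrative; the parser assumes hyphens and spaces may appear as separators and that "X" is allowed only in the last position.

```python
def isbn10_values(isbn: str) -> list:
    """Convert an ISBN-10 string to ten integer values (X counts as 10)."""
    chars = [c for c in isbn if c not in "- "]
    if len(chars) != 10:
        raise ValueError("an ISBN-10 has exactly 10 characters")
    values = []
    for i, c in enumerate(chars):
        if c in "0123456789":
            values.append(int(c))
        elif c in "Xx" and i == 9:
            values.append(10)
        else:
            raise ValueError("unexpected character: " + c)
    return values


def isbn10_valid_weighted(isbn: str) -> bool:
    """Multiply each value by its position counted from the right."""
    values = isbn10_values(isbn)
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def isbn10_valid_running(isbn: str) -> bool:
    """Same test without multiplication, using the running-sum trick."""
    t = total = 0
    for v in isbn10_values(isbn):
        t += v       # t = t + digit
        total += t   # sum = sum + t
    return total % 11 == 0


print(isbn10_valid_weighted("0-201-53082-1"), isbn10_valid_running("0-201-53082-1"))
```

The running-sum version works because the first digit is added into every one of the ten partial sums, the second into nine of them, and so on, which reproduces the weights 10 down to 1.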

ISBN 13

ISBN 13 (in use since January 2007) is equal to the EAN-13 code found underneath a book's barcode. Its check digit is generated the same way as the UPC except that the digits in the even-numbered positions are multiplied by 3 instead of those in the odd-numbered positions.

EAN (GLN, GTIN, EAN numbers administered by GS1)

EAN (European Article Number) check digits (administered by GS1) are calculated by summing the even position numbers and multiplying by 3 and then by adding the sum of the odd position numbers. The final digit of the result is subtracted from 10 to calculate the check digit (or left as is if already zero).
A GS1 check digit calculator and detailed documentation are available online at GS1's website.
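The UPC rule (odd positions times 3, eleven data digits) and the EAN-13 rule (even positions times 3, twelve data digits) both give weight 3 to the data digit nearest the check digit. The sketch below relies on that observation and assigns the weights from the right, so one function covers both lengths; the name gs1_check_digit is an assumption made for this example.

```python
def gs1_check_digit(payload: str) -> int:
    """Check digit for the data digits of a UPC/EAN style number.

    Weights alternate 3, 1, 3, 1, ... starting from the rightmost data digit.
    """
    if not payload or any(ch not in "0123456789" for ch in payload):
        raise ValueError("expected a non-empty string of decimal digits")
    total = 0
    for i, ch in enumerate(reversed(payload)):
        weight = 3 if i % 2 == 0 else 1
        total += weight * int(ch)
    return (10 - total % 10) % 10


print(gs1_check_digit("03600024145"))   # UPC-A example from this article
print(gs1_check_digit("978030640615"))  # twelve data digits of an ISBN-13
```

For the UPC-A example this reproduces the check digit 7 computed step by step earlier.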

Other examples of check digits

  • The tenth digit of the National Provider Identifier for the US healthcare industry.
  • The Australian Tax File Number (based on modulo 11).
  • The Guatemalan Tax Number (NIT - Número de Identificación Tributaria), based on modulo 11.
  • The North American CUSIP number.
  • The final (ninth) digit of the routing transit number, a bank code used in the United States.
  • The International SEDOL number.
  • The International Securities Identification Number (ISIN).
  • The final digit of the International CAS registry number.
  • Modulo 10 check digits in credit card account numbers, calculated with the Luhn algorithm.
    • Also used in the Norwegian KID (customer identification number) numbers used in bank giros (credit transfer).
  • The final character encoded in a magnetic stripe card, which is a computed longitudinal redundancy check.
  • The final digit of a POSTNET code.
  • The final digit of an ISSN code.
  • The final digit of a DUNS number (though this is scheduled to change, so that the final digit will be chosen freely in new allocations, rather than being a check digit).
  • The Spanish fiscal identification number (número de identificación fiscal, NIF), based on modulo 23.
  • The ninth digit of a Vehicle Identification Number (VIN).
  • The ninth digit of an Israeli Teudat Zehut (Identity Card) number.
  • The 13th digit of the Serbian and Former Yugoslav Unique Master Citizen Number (JMBG).
  • The last digit in the EAN/UPC serialisation of the Global Trade Item Number (GTIN). It applies to GTIN-8, GTIN-12, GTIN-13 and GTIN-14.
  • The seventh character of a New Zealand NHI Number.
  • The last digit on a New Zealand locomotive's Traffic Monitoring System (TMS) number.
  • The last two digits of the 11-digit Turkish Identification Number.
  • The third and fourth digits in an International Bank Account Number (Modulo 97 check).
  • The ninth character in the 14-character EU cattle passport number (cycles from 1 to 7: see British Cattle Movement Service).
  • The ninth digit in an Icelandic Kennitala (national ID number).
  • Modulo 97 check digits in Belgian and Serbian bank account numbers.
  • The trailing check digit in Mayo Clinic patient identification numbers used in Arizona and Florida.

Algorithms

Notable algorithms include:
  • Luhn algorithm (1954)
  • Verhoeff algorithm (1969)

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.