Run-length encoding
Encyclopedia
Run-length encoding is a very simple form of data compression
Data compression
In computer science and information theory, data compression, source coding or bit-rate reduction is the process of encoding information using fewer bits than the original representation would use....

 in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs: for example, simple graphic images such as icons, line drawings, and animations. It is not useful with files that don't have many runs as it could greatly increase the file size.

RLE also refers to a little-used image format in Windows 3.x
Windows 3.x
Windows 3.x can refer to either an individual or all of the following versions of Microsoft Windows:*Windows 3.0*Windows 3.1x*Windows 3.2...

, with the extension rle, which is a Run Length Encoded Bitmap, used to compress the Windows 3.x startup screen.

Example

For example, consider a screen containing plain black text on a solid white background. There will be many long runs of white pixel
Pixel
In digital imaging, a pixel, or pel, is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled....

s in the blank space, and many short runs of black pixels within the text. Let us take a hypothetical single scan line
Scan line
A scan line or scanline is one line, or row, in a raster scanning pattern, such as a line of video on a cathode ray tube display of a television set or computer monitor....

, with B representing a black pixel and W representing white:
WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW


If we apply the run-length encoding (RLE) data
compression algorithm to the above hypothetical scan line, we get the following:
12W1B12W3B24W1B14W


This is to be interpreted as twelve Ws, one B, twelve Ws, three Bs, etc.

The run-length code represents the original 67 characters in only 18. Of course, the actual format used for the storage of images is generally binary rather than ASCII
ASCII
The American Standard Code for Information Interchange is a character-encoding scheme based on the ordering of the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that use text...

 characters like this, but the principle remains the same. Even binary data files can be compressed with this method; file format specifications often dictate repeated bytes in files as padding space. However, newer compression methods such as DEFLATE
DEFLATE
Deflate is a lossless data compression algorithm that uses a combination of the LZ77 algorithm and Huffman coding. It was originally defined by Phil Katz for version 2 of his PKZIP archiving tool and was later specified in RFC 1951....

 often use LZ77-based algorithms, a generalization of run-length encoding that can take advantage of runs of strings of characters (such as BWWBWWBWWBWW).

Applications

Run-length encoding performs lossless data compression
Lossless data compression
Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange...

 and is well suited to palette
Palette (computing)
In computer graphics, a palette is either a given, finite set of colors for the management of digital images , or a small on-screen graphical element for choosing from a limited set of choices, not necessarily colors .Depending on the context In computer graphics, a palette is either a given,...

-based iconic images. It does not work well at all on continuous-tone images such as photographs, although JPEG
JPEG
In computing, JPEG . The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality....

 uses it quite effectively on the coefficients that remain after transforming and quantizing
Quantization (image processing)
Quantization, involved in image processing, is a lossy compression technique achieved by compressing a range of values to a single quantum value. When the number of discrete symbols in a given stream is reduced, the stream becomes more compressible. For example, reducing the number of colors...

 image blocks.

Common formats for run-length encoded data include Truevision TGA
Truevision TGA
Truevision TGA, often referred to as TARGA, is a raster graphics file format created by Truevision Inc. . It was the native format of TARGA and VISTA boards, which were the first graphic cards for IBM-compatible PCs to support Highcolor/truecolor display...

, PackBits
PackBits
PackBits is a fast, simple lossless compression scheme for run-length encoding of data.Apple introduced the PackBits format with the release of MacPaint on the Macintosh computer. This compression scheme is one of the types of compression that can be used in TIFF-files...

, PCX
PCX
PCX is an image file format developed by the now-defunct ZSoft Corporation of Marietta, Georgia. It was the native file format for PC Paintbrush and became one of the first widely accepted DOS imaging standards, although it has since been succeeded by more sophisticated image formats, such as GIF,...

 and ILBM
ILBM
ILBM is a subtype of the Interchange File Format used for storing picture data. ILBM stands for InterLeaved BitMap which refers to the way the pictures are stored. The image data is stored as a varying number of bitplanes, each storing one bit of data for each pixel in the image...

.

Run-length encoding is used in fax
Fax
Fax , sometimes called telecopying, is the telephonic transmission of scanned printed material , normally to a telephone number connected to a printer or other output device...

 machines (combined with other techniques into Modified Huffman coding
Modified Huffman coding
Modified Huffman coding is used in fax machines to encode black on white images . It combines the variable length codes of Huffman coding with the coding of repetitive data in run-length encoding....

). It is relatively efficient because most faxed documents are mostly white space, with occasional interruptions of black.

See also

  • Look-and-say sequence
    Look-and-say sequence
    In mathematics, the look-and-say sequence is the sequence of integers beginning as follows:To generate a member of the sequence from the previous member, read off the digits of the previous member, counting the number of digits in groups of the same digit...

  • Comparison of graphics file formats
    Comparison of graphics file formats
    -General:Ownership of the format and related information.-Technical details:...

  • Golomb coding
    Golomb coding
    Golomb coding is a lossless data compression method using a family of data compression codes invented by Solomon W. Golomb in the 1960s. Alphabets following a geometric distribution will have a Golomb code as an optimal prefix code, making Golomb coding highly suitable for situations in which the...

  • Modified Huffman coding
    Modified Huffman coding
    Modified Huffman coding is used in fax machines to encode black on white images . It combines the variable length codes of Huffman coding with the coding of repetitive data in run-length encoding....

  • Burrows-Wheeler transform
    Burrows-Wheeler transform
    The Burrows–Wheeler transform , is an algorithm used in data compression techniques such as bzip2. It was invented by Michael Burrows and David Wheeler in 1994 while working at DEC Systems Research Center in Palo Alto, California...

  • Run Length Limited
    Run Length Limited
    Run length limited or RLL coding is a line coding technique that is used to send arbitrary data over a communications channel with bandwidth limits. This is used in both telecommunication and storage systems which move a medium past a fixed head. Specifically, RLL bounds the length of stretches ...

  • Bitmap index
    Bitmap Index
    A bitmap index is a special kind of database index that uses bitmaps.Bitmap indexes have traditionally been considered to work well for data such as gender, which has a small number of distinct values, for example male and female, but many occurrences of those values. This would happen if, for...

  • Forsyth–Edwards Notation, which uses run-length-encoding for empty spaces in chess positions.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK