Lossless JPEG
Lossless JPEG refers to a 1993 addition to the JPEG standard by the Joint Photographic Experts Group to enable lossless compression. However, the term is also used as an umbrella term for all lossless compression schemes developed by the Joint Photographic Experts Group, including JPEG 2000 and JPEG-LS.

Lossless JPEG

Lossless JPEG was developed as a late addition to JPEG in 1993, using a completely different technique from the lossy JPEG standard. It uses a predictive scheme based on the three nearest (causal) neighbors (upper, left, and upper-left), and entropy coding is used on the prediction error. It is not supported by the standard Independent JPEG Group libraries (libjpeg), although Ken Murchison of Oceana Matrix Ltd. wrote a patch that extends the IJG library to support Lossless JPEG. Lossless JPEG has some popularity in medical imaging, and is used in DNG and some digital cameras to compress raw images, but otherwise was never widely adopted.

Lossless mode of operation

Lossless JPEG is actually a mode of operation of JPEG. This mode exists because the discrete cosine transform (DCT) based form cannot guarantee that encoder input would exactly match decoder output, since the inverse DCT is not rigorously defined. Unlike the lossy mode, which is based on the DCT, the lossless coding process employs a simple predictive coding model called differential pulse-code modulation (DPCM). This is a model in which predictions of the sample values are estimated from the neighboring samples that are already coded in the image. Most predictors take the average of the samples immediately above and to the left of the target sample. DPCM encodes the differences between the actual samples and their predicted values instead of encoding each sample independently. The differences from one sample to the next are usually close to zero. A typical DPCM encoder is displayed in Fig.1. The block in the figure acts as storage for the current sample, which will later become a previous sample.
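
As a minimal sketch of the DPCM idea (not part of the JPEG standard itself; the function names are illustrative), the following Python fragment transmits differences from the previous sample rather than the samples themselves:

    def dpcm_encode(samples):
        """Encode a 1-D sequence by transmitting differences from a prediction
        (here simply the previous sample)."""
        prev = 0                     # plays the role of the storage block in Fig.1
        diffs = []
        for s in samples:
            diffs.append(s - prev)   # prediction error, usually close to zero
            prev = s                 # the current sample becomes the previous sample
        return diffs

    def dpcm_decode(diffs):
        """Invert the encoder by accumulating the differences."""
        prev = 0
        out = []
        for d in diffs:
            prev += d
            out.append(prev)
        return out

    # The round trip is exact, so the scheme is lossless:
    assert dpcm_decode(dpcm_encode([10, 12, 13, 13, 11])) == [10, 12, 13, 13, 11]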

The main steps of lossless operation mode are depicted in Fig.2. In the process, the predictor combines up to three neighboring samples at A, B, and C shown in Fig.3 in order to produce a prediction of the sample value at the position labeled by X. The three neighboring samples must already be coded samples. Any one of the eight predictors listed in the table below can be used to estimate the sample located at X. Note that selections 1, 2, and 3 are one-dimensional predictors and selections 4, 5, 6, and 7 are two-dimensional predictors. The first selection value in the table, zero, is only used for differential coding in the hierarchical mode of operation. A small code sketch of these predictors follows the table.
Once all the samples are predicted, the differences between the samples can be obtained and entropy-coded in a lossless fashion using Huffman coding or arithmetic coding.
Selection-value Prediction
0 No prediction
1 A
2 B
3 C
4 A + B – C
5 A + (B – C)/2
6 B + (A – C)/2
7 (A + B)/2
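
As an illustration only (the function name and the integer-division convention are assumptions, not taken from the standard text), the selection table can be written as a small Python function:

    def predict(a, b, c, selection):
        """Prediction of sample X from neighbors A (left), B (above) and
        C (upper-left), for JPEG lossless selection values 0-7."""
        if selection == 0:
            return 0                      # "no prediction" (hierarchical mode only)
        predictors = {
            1: a,
            2: b,
            3: c,
            4: a + b - c,
            5: a + (b - c) // 2,          # integer arithmetic assumed here
            6: b + (a - c) // 2,
            7: (a + b) // 2,
        }
        return predictors[selection]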


Typically, compression using the lossless mode of operation can achieve around a 2:1 compression ratio for color images. This mode is quite popular in the medical imaging field, but otherwise it is not very widely used.

JPEG-LS

JPEG-LS is a lossless/near-lossless compression standard for continuous-tone images. Its official designation is ISO-14495-1/ITU-T.87. It is a simple and efficient baseline algorithm which consists of two independent and distinct stages called modeling and encoding. JPEG-LS was developed with the aim of providing a low-complexity lossless and near-lossless image compression standard that could offer better compression efficiency than lossless JPEG. It was developed because at the time, the Huffman coding-based JPEG lossless standard and other standards were limited in their compression performance. Total decorrelation cannot be achieved by the first-order entropy coding of prediction residuals employed by these earlier standards, whereas JPEG-LS can obtain good decorrelation. Part 1 of this standard was finalized in 1999; Part 2, when released, will introduce extensions such as arithmetic coding. The core of JPEG-LS is based on the LOCO-I algorithm, which relies on prediction, residual modeling and context-based coding of the residuals. Most of the low complexity of this technique comes from the assumption that prediction residuals follow a two-sided geometric distribution (also called a discrete Laplace distribution) and from the use of Golomb-like codes, which are known to be approximately optimal for geometric distributions. Besides lossless compression, JPEG-LS also provides a lossy mode ("near-lossless") where the maximum absolute error can be controlled by the encoder. Compression with JPEG-LS is generally much faster than JPEG 2000 and yields much better compression than the original lossless JPEG standard.

LOCO-I algorithm

Prior to encoding, there are two essential steps to be done in the modeling stage: decorrelation (prediction) and error modeling.

Decorrelation/prediction

In the LOCO-I algorithm, primitive edge detection of horizontal or vertical edges is achieved by examining the neighboring pixels of the current pixel X as illustrated in Fig.3. The pixel labeled B is used in the case of a vertical edge while the pixel located at A is used in the case of a horizontal edge. This simple predictor is called the Median Edge Detection (MED) predictor or LOCO-I predictor. The pixel X is predicted by the LOCO-I predictor according to the following guesses:

 X̂ = min(A, B)   if C ≥ max(A, B)
 X̂ = max(A, B)   if C ≤ min(A, B)
 X̂ = A + B − C   otherwise

where X̂ denotes the predicted value of X.


The three simple predictors are selected according to the following conditions: (1) the predictor tends to pick B in cases where a vertical edge exists to the left of X, (2) A in cases of a horizontal edge above X, or (3) A + B − C if no edge is detected.
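
A minimal Python sketch of the MED predictor described above (the function name is illustrative; the three branches follow the guesses listed earlier):

    def med_predict(a, b, c):
        """Median Edge Detection (LOCO-I) prediction of X from
        A (left), B (above) and C (upper-left)."""
        if c >= max(a, b):
            return min(a, b)     # edge suspected: predict from the smaller neighbor
        if c <= min(a, b):
            return max(a, b)     # edge suspected: predict from the larger neighbor
        return a + b - c         # no edge: planar prediction

Equivalently, the result is the median of the three values A, B and A + B − C, which is where the predictor's name comes from.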

Context modeling

The JPEG-LS algorithm estimates the conditional expectations of the prediction errors using corresponding sample means within each context Ctx. The purpose of context modeling is to exploit higher-order structures, such as texture patterns and local activity of the image, when modeling the prediction error. Contexts are determined from the differences of neighboring samples, which represent the local gradient:

 g1 = D − B
 g2 = B − C
 g3 = C − A

where D denotes the sample diagonally above and to the right of X.
The local gradient reflects the level of activity, such as smoothness and edginess, of the neighboring samples. Notice that these differences are closely related to the statistical behavior of the prediction errors. Each of the differences found in the above equations is then quantized into roughly equiprobable and connected regions. For JPEG-LS, the differences g1, g2, and g3 are each quantized into 9 regions, and the regions are indexed from −4 to 4. The purpose of the quantization is to maximize the mutual information between the current sample value and its context so that high-order dependencies can be captured. One can obtain the contexts based on the assumption that

 P(e | Ctx = [q1, q2, q3]) = P(−e | Ctx = [−q1, −q2, −q3]),

where e is the prediction error and q1, q2, q3 are the quantized gradients.



After merging contexts of both positive and negative signs, the total number of contexts is (9 × 9 × 9 + 1) / 2 = 365 contexts. A bias estimate can be obtained by dividing the cumulative prediction errors within each context by the count of context occurrences. In the LOCO-I algorithm, this procedure is modified and improved such that the number of subtractions and additions is reduced. The division-free bias computation procedure is demonstrated in http://www.hpl.hp.com/loco/. Prediction refinement can then be done by applying these estimates in a feedback mechanism which eliminates prediction biases in different contexts.
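
The following Python sketch illustrates the gradient quantization and sign merging; the thresholds T1 = 3, T2 = 7, T3 = 21 are the default values for 8-bit lossless coding, and the helper names are illustrative rather than taken from the standard:

    T1, T2, T3 = 3, 7, 21            # default thresholds for 8-bit lossless coding

    def quantize_gradient(g):
        """Map a local gradient into one of 9 regions, indexed -4..4."""
        if g <= -T3: return -4
        if g <= -T2: return -3
        if g <= -T1: return -2
        if g <   0: return -1
        if g ==  0: return  0
        if g <  T1: return  1
        if g <  T2: return  2
        if g <  T3: return  3
        return 4

    def context(a, b, c, d):
        """Quantized context for neighbors A, B, C, D, merged with its sign-flipped
        twin so that 365 distinct contexts remain in total."""
        q = (quantize_gradient(d - b), quantize_gradient(b - c), quantize_gradient(c - a))
        sign = -1 if q < (0, 0, 0) else 1    # flip if the first nonzero element is negative
        return sign, tuple(sign * qi for qi in q)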

Coding corrected prediction residuals

In the regular mode of JPEG-LS, the standard uses Golomb-Rice codes, a family of codes for non-negative integers; the signed, bias-corrected prediction residuals are first mapped to non-negative values before coding. The special case of Golomb codes whose parameter is a power of two (2^k) allows particularly simple encoding procedures.
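
As an illustration of such a power-of-two (Rice) code, here is a hedged Python sketch; the function name and the bit-string output are assumptions of the sketch, not the standard's bitstream syntax:

    def rice_encode(n, k):
        """Encode a non-negative integer n with Rice parameter 2**k:
        the quotient in unary, then k remainder bits in plain binary."""
        q = n >> k                          # quotient: q ones followed by a zero
        r = n & ((1 << k) - 1)              # remainder: k low-order bits
        rem_bits = format(r, "b").zfill(k) if k > 0 else ""
        return "1" * q + "0" + rem_bits

    # Example: rice_encode(9, 2) == "11001"  (quotient 2 -> "110", remainder 1 -> "01")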

Run length coding in uniform areas

Golomb-Rice codes are quite inefficient for encoding low-entropy distributions because their coding rate is at least one bit per symbol, whereas smooth regions in an image could be represented with less than one bit per symbol, so coding them symbol by symbol would produce significant redundancy. To avoid having excess code length over the entropy, one can use alphabet extension, which codes blocks of symbols instead of individual symbols and thereby spreads the excess coding length over many symbols. This is the "run" mode of JPEG-LS, and it is entered once a flat or smooth region, characterized by zero gradients, is detected. A run of the west symbol "a" (the sample to the left) is then expected, and the run ends when a different symbol occurs or the end of the line is reached. The total run length is encoded and the encoder returns to the "regular" mode. A simplified sketch of this run detection follows.
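
This sketch only illustrates detecting runs of the west symbol and reporting their lengths; the real JPEG-LS run mode codes the lengths with an adaptive block code rather than emitting them directly:

    def run_lengths(row):
        """Report (value, length) for every run in which the west sample repeats."""
        i, runs = 1, []
        while i < len(row):
            if row[i] == row[i - 1]:                 # flat region: west symbol repeats
                start = i
                while i < len(row) and row[i] == row[i - 1]:
                    i += 1                           # run ends at a new symbol or end of line
                runs.append((row[start - 1], i - start))
            else:
                i += 1
        return runs

    # Example: run_lengths([5, 5, 5, 7, 7, 9]) == [(5, 2), (7, 1)]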

JPEG 2000

JPEG 2000 includes a lossless mode based on a special integer wavelet filter (biorthogonal 3/5, also known as the reversible LeGall 5/3 filter). JPEG 2000's lossless mode runs more slowly and often has worse compression ratios than JPEG-LS on artificial and compound images, but it fares better than the UBC implementation of JPEG-LS on digital camera pictures. JPEG 2000 is also scalable, progressive, and more widely supported.
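
To illustrate how an integer wavelet filter can be exactly reversible, here is a hedged Python sketch of one level of 5/3-style integer lifting on a 1-D, even-length signal; the function names and the simplified boundary handling are assumptions of this sketch rather than the full JPEG 2000 procedure:

    def lift53_forward(x):
        """One level of reversible 5/3-style integer lifting on an even-length list,
        with a simple mirrored extension at the edges."""
        n = len(x)
        d = []                                        # high-pass (detail) coefficients
        for i in range(1, n, 2):
            right = x[i + 1] if i + 1 < n else x[i - 1]
            d.append(x[i] - (x[i - 1] + right) // 2)
        s = []                                        # low-pass (approximation) coefficients
        for i in range(0, n, 2):
            left = d[i // 2 - 1] if i > 0 else d[0]
            s.append(x[i] + (left + d[i // 2] + 2) // 4)
        return s, d

    def lift53_inverse(s, d):
        """Exactly undo lift53_forward, recovering the original integers."""
        n = 2 * len(s)
        x = [0] * n
        for i in range(0, n, 2):
            left = d[i // 2 - 1] if i > 0 else d[0]
            x[i] = s[i // 2] - (left + d[i // 2] + 2) // 4
        for i in range(1, n, 2):
            right = x[i + 1] if i + 1 < n else x[i - 1]
            x[i] = d[i // 2] + (x[i - 1] + right) // 2
        return x

    # The round trip is exact, which is what makes the wavelet mode lossless:
    sig = [12, 15, 14, 13, 90, 91, 92, 95]
    assert lift53_inverse(*lift53_forward(sig)) == sig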
