Motion compensation
Motion compensation is an algorithmic technique employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.

How it works

Motion compensation exploits the fact that, often, for many frames of a movie, the only difference between one frame and another is the result of either the camera moving or an object in the frame moving. In reference to a video file, this means much of the information that represents one frame will be the same as the information used in the next frame.

Using motion compensation, a video stream will contain some full (reference) frames; then the only information stored for the frames in between would be the information needed to transform the previous frame into the next frame.

Illustrated example

The following is a simplistic illustrated explanation of how motion compensation works. Two successive frames were captured from the movie Elephants Dream. As can be seen from the images, the bottom (motion compensated) difference between two frames contains significantly less detail than the prior images, and thus compresses much better than the rest.
  • Original: the full original frame, as shown on screen.
  • Difference: the differences between the original frame and the next frame.
  • Motion compensated difference: the differences between the original frame and the next frame, shifted right by 2 pixels. Shifting the frame compensates for the panning of the camera, thus there is greater overlap between the two frames.
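
The effect of shifting before differencing can be reproduced on synthetic data. The following is a minimal Python sketch, not taken from any codec: the helper names `make_frame` and `abs_diff_sum`, the frame size, and the texture formula are all illustrative assumptions, and the camera pan is assumed to be exactly 2 pixels to the right.

```python
# Toy frames: each frame is a list of rows of integer luma values (0..255).
def make_frame(width, height, offset):
    """Synthetic texture as seen after the content has moved right by `offset` pixels."""
    return [[((x - offset) * 37 + y * 11) % 256 for x in range(width)]
            for y in range(height)]

def abs_diff_sum(a, b, shift=0):
    """Sum of absolute differences between frame b and frame a shifted right
    by `shift` pixels. Only the columns where both frames overlap are compared."""
    total = 0
    for row_a, row_b in zip(a, b):
        for x in range(shift, len(row_a)):
            total += abs(row_b[x] - row_a[x - shift])
    return total

original = make_frame(32, 8, offset=0)
next_frame = make_frame(32, 8, offset=2)   # content moved right by 2 pixels

plain = abs_diff_sum(original, next_frame)              # no compensation
compensated = abs_diff_sum(original, next_frame, 2)     # shift by 2 pixels first
print(plain, compensated)
```

In this idealized case the compensated difference is exactly zero, while the plain difference is not. Real footage also contains noise, lighting changes and independently moving objects, so the compensated difference is usually small rather than zero.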

Motion compensation in MPEG

In MPEG, images are predicted from previous frames (P frames) or bidirectionally from previous and future frames (B frames). B frames are more complex because the image sequence must be transmitted/stored out of order so that the future frame is available to generate the B frames.

After predicting frames using motion compensation, the coder finds the error (residual), which is then compressed using the DCT and transmitted.
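
As an illustration of the residual step, the following Python sketch subtracts a prediction from one row of samples and applies an orthonormal one-dimensional DCT-II to the result. It is a simplification under stated assumptions: the sample values are invented, MPEG codecs apply a two-dimensional 8×8 DCT followed by quantization (omitted here), and `dct_ii` is a hypothetical helper name.

```python
import math

def dct_ii(values):
    """Orthonormal 1-D DCT-II of a list of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        s = sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                for i, v in enumerate(values))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

# Invented sample values for one row of a block.
current   = [52, 55, 61, 66, 70, 61, 64, 73]   # row of the frame being coded
predicted = [50, 54, 60, 65, 69, 60, 63, 72]   # motion-compensated prediction

residual = [c - p for c, p in zip(current, predicted)]   # prediction error
coeffs = dct_ii(residual)                                # transform of the error
print(residual)
print([round(c, 3) for c in coeffs])
```

Because the prediction is close to the current row, the residual values are small, which is what makes the transformed data cheap to encode.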

Global motion compensation

In global motion compensation, the motion model basically reflects camera motions such as:
  • Dolly - moving the camera forward or backwards
  • Track - moving the camera left or right
  • Boom - moving the camera up or down
  • Pan - rotating the camera around its Y axis, moving the view left or right
  • Tilt - rotating the camera around its X axis, moving the view up or down
  • Roll - rotating the camera around the view axis


It works best for still scenes without moving objects.

There are several advantages of global motion compensation:
  • It models the dominant motion usually found in video sequences with just a few parameters. The share in bit-rate of these parameters is negligible.
  • It does not partition the frames. This avoids artifacts at partition borders.
  • A straight line (in the time direction) of pixels with equal spatial positions in the frame corresponds to a continuously moving point in the real scene. Other MC schemes introduce discontinuities in the time direction.


MPEG-4 ASP supports GMC with three reference points, although some implementations can only make use of one. A single reference point only allows for translational motion, which, for its relatively large performance cost, provides little advantage over block-based motion compensation.

Moving objects within a frame are not sufficiently represented by global motion compensation.
Thus, local motion estimation is also needed.
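
A global motion model with just a few parameters can be sketched as a single coordinate mapping applied to every pixel. The Python example below is only an illustration and does not follow the parameterization of any standard: the function name `global_motion` and its parameters (`zoom` as a rough stand-in for a dolly, `roll` for rotation around the view axis, `tx` and `ty` for pan, track, tilt or boom seen as image translation, `cx` and `cy` for the image centre) are assumptions made for this sketch.

```python
import math

def global_motion(x, y, zoom=1.0, roll=0.0, tx=0.0, ty=0.0, cx=0.0, cy=0.0):
    """Map pixel (x, y) of the current frame to reference-frame coordinates
    using one similarity transform (scale, rotation, translation) about (cx, cy)."""
    c, s = math.cos(roll), math.sin(roll)
    px, py = x - cx, y - cy                     # coordinates relative to the centre
    rx = zoom * (c * px - s * py) + cx + tx
    ry = zoom * (s * px + c * py) + cy + ty
    return rx, ry

print(global_motion(10, 5))                        # identity: (10.0, 5.0)
print(global_motion(10, 5, tx=2))                  # pure translation: (12.0, 5.0)
print(global_motion(10, 5, zoom=2, cx=8, cy=8))    # zoom about (8, 8): (12.0, 2.0)
```

The same handful of parameters is used for the whole frame, which is why their share of the bit-rate is small, and also why independently moving objects are not captured.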

Block motion compensation

In block motion compensation (BMC), the frames are partitioned into blocks of pixels (e.g. macroblocks of 16×16 pixels in MPEG).
Each block is predicted from a block of equal size in the reference frame.
The blocks are not transformed in any way apart from being shifted to the position of the predicted block.
This shift is represented by a motion vector.
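
The block prediction described above can be sketched in Python as follows. This is a minimal illustration, not the method of any particular encoder: the exhaustive search, the sum of absolute differences (SAD) as matching cost, the 4×4 block size, the search range and the synthetic frames are all assumptions, and the function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by the candidate vector (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))

def find_motion_vector(cur, ref, bx, by, n, search):
    """Exhaustive search over displacements in [-search, search] on both axes,
    skipping candidates that would read outside the reference frame."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1]

def predict(ref, vectors, n):
    """Build the prediction: each block of the current frame is copied from
    the reference frame at the position its motion vector points to."""
    out = [[0] * len(ref[0]) for _ in range(len(ref))]
    for (bx, by), (dx, dy) in vectors.items():
        for j in range(n):
            for i in range(n):
                out[by + j][bx + i] = ref[by + dy + j][bx + dx + i]
    return out

N, SEARCH = 4, 2   # block size and search range, both arbitrary choices
ref = [[x * 16 + y for x in range(16)] for y in range(16)]              # synthetic reference
cur = [[ref[y][max(x - 1, 0)] for x in range(16)] for y in range(16)]   # content moved right by 1

vectors = {(bx, by): find_motion_vector(cur, ref, bx, by, N, SEARCH)
           for by in range(0, 16, N) for bx in range(0, 16, N)}
prediction = predict(ref, vectors, N)
print(vectors[(4, 4)])   # (-1, 0): the block comes from one pixel to the left
```

Note that the vector points from a block of the current frame to its source in the reference frame, so content that moved right by one pixel yields the vector (-1, 0).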

To exploit the redundancy between neighboring block vectors (e.g. for a single moving object covered by multiple blocks), it is common to encode only the difference between the current and previous motion vector in the bit-stream. The result of this differencing process is mathematically equivalent to a global motion compensation capable of panning.
Further down the encoding pipeline, an entropy coder will take advantage of the resulting statistical distribution of the motion vectors around the zero vector to reduce the output size.
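
The differencing of motion vectors can be sketched in a few lines of Python. The example assumes the simplest possible predictor, namely the previously coded vector, with (0, 0) before the first block; actual codecs may predict from several neighbouring blocks instead. The vector values and function names are invented for illustration.

```python
def encode_vectors(vectors):
    """Replace each motion vector by its difference from the previous one."""
    prev = (0, 0)
    diffs = []
    for dx, dy in vectors:
        diffs.append((dx - prev[0], dy - prev[1]))
        prev = (dx, dy)
    return diffs

def decode_vectors(diffs):
    """Invert encode_vectors by accumulating the differences."""
    prev = (0, 0)
    vectors = []
    for ddx, ddy in diffs:
        prev = (prev[0] + ddx, prev[1] + ddy)
        vectors.append(prev)
    return vectors

# Invented vectors for five neighbouring blocks covering one moving object.
vectors = [(3, -1), (3, -1), (3, -1), (4, -1), (4, 0)]
diffs = encode_vectors(vectors)
print(diffs)   # [(3, -1), (0, 0), (0, 0), (1, 0), (0, 1)]
```

Most of the differences cluster around the zero vector, which is the distribution the entropy coder benefits from.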

It is possible to shift a block by a non-integer number of pixels, which is called sub-pixel precision.
The in-between pixels are generated by interpolating neighboring pixels. Commonly, half-pixel or quarter-pixel precision (Qpel, used by H.264 and MPEG-4/ASP) is used. The computational expense of sub-pixel precision is much higher due to the extra processing required for interpolation and, on the encoder side, a much greater number of potential source blocks to be evaluated.
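
A half-pixel sample can be generated by averaging its integer-position neighbours. The Python sketch below uses plain bilinear averaging with rounding. It is meant only to show the idea; the function name `sample_half_pel` and the convention of passing coordinates in half-pixel units are assumptions of this example, and standards may prescribe longer filters.

```python
def sample_half_pel(frame, x2, y2):
    """Sample `frame` at position (x2 / 2, y2 / 2), where x2 and y2 are
    non-negative coordinates in half-pixel units. Integer positions return the
    stored pixel; in-between positions return a rounded average of neighbours."""
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)      # right neighbour only when x2 is odd
    y1 = y0 + (y2 % 2)      # lower neighbour only when y2 is odd
    total = frame[y0][x0] + frame[y0][x1] + frame[y1][x0] + frame[y1][x1]
    return (total + 2) // 4

frame = [[10, 20],
         [30, 40]]
print(sample_half_pel(frame, 0, 0))   # 10: integer position
print(sample_half_pel(frame, 1, 0))   # 15: halfway between 10 and 20
print(sample_half_pel(frame, 0, 1))   # 20: halfway between 10 and 30
print(sample_half_pel(frame, 1, 1))   # 25: centre of all four pixels
```

With half-pixel precision every integer search position gains additional candidates between its neighbours, which is the source of the higher encoder cost mentioned above.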

The main disadvantage of block motion compensation is that it introduces discontinuities at the block borders (blocking artifacts).
These artifacts appear in the form of sharp horizontal and vertical edges which are easily spotted by the human eye and produce ringing effects (large coefficients in high frequency sub-bands) in the Fourier-related transform used for transform coding of the residual frames.

Block motion compensation divides up the current frame into non-overlapping blocks, and the motion compensation vector tells where those blocks come from
(a common misconception is that the previous frame is divided up into non-overlapping blocks, and the motion compensation vectors tell where those blocks move to).
The source blocks typically overlap in the source frame.
Some video compression algorithms assemble the current frame out of pieces of several different previously-transmitted frames.

Frames can also be predicted from future frames.
The future frames then need to be encoded before the predicted frames and thus, the encoding order does not necessarily match the real frame order.
Such frames are usually predicted from two directions, i.e. from the I- or P-frames that immediately precede or follow the predicted frame.
These bidirectionally predicted frames are called B-frames.
A coding scheme could, for instance, be IBBPBBPBBPBB.
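
The reordering implied by B-frames can be sketched as follows. This Python example assumes that every B-frame depends on the nearest I- or P-frame on each side, and it simply appends trailing B-frames at the end, whereas a real stream would normally wait for the next anchor frame. The function name `coding_order` is hypothetical.

```python
def coding_order(display_types):
    """Return display indices in an order in which each B-frame comes after
    both of its anchors (the nearest I- or P-frame on either side)."""
    order = []
    pending_b = []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)       # must wait for the next anchor
        else:
            order.append(index)           # anchor (I or P) goes first
            order.extend(pending_b)       # then the B-frames that waited for it
            pending_b = []
    order.extend(pending_b)               # simplification: trailing B-frames
    return order

types = "IBBPBBPBBPBB"
order = coding_order(types)
print(order)                                 # [0, 3, 1, 2, 6, 4, 5, 9, 7, 8, 10, 11]
print("".join(types[i] for i in order))      # IPBBPBBPBBBB
```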

Variable block-size motion compensation

Variable block-size motion compensation (VBSMC) is the use of BMC with the ability for the encoder to dynamically select the size of the blocks. When coding video, the use of larger blocks can reduce the number of bits needed to represent the motion vectors, while the use of smaller blocks can result in a smaller amount of prediction residual information to encode. Older designs such as H.261 and MPEG-1 video typically use a fixed block size, while newer ones such as H.263, MPEG-4 Part 2, H.264/MPEG-4 AVC, and VC-1 give the encoder the ability to dynamically choose what block size will be used to represent the motion.
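
The trade-off between vector bits and residual size can be illustrated with a toy decision rule. The Python sketch below is purely hypothetical: the cost model (residual SAD plus a weight times the number of vector bits), the default values of `bits_per_vector` and `lam`, and all the numbers are assumptions made for illustration, and no standard prescribes how an encoder makes this choice.

```python
def choose_partition(sad_16x16, sads_8x8, bits_per_vector=12, lam=4):
    """Pick between one 16x16 block and four 8x8 blocks by comparing
    cost = residual SAD + lam * bits spent on motion vectors."""
    cost_large = sad_16x16 + lam * bits_per_vector
    cost_small = sum(sads_8x8) + lam * bits_per_vector * len(sads_8x8)
    if cost_small < cost_large:
        return "8x8", cost_small
    return "16x16", cost_large

# Uniform motion: one vector already predicts the macroblock well.
print(choose_partition(400, [100, 100, 95, 100]))    # ('16x16', 448)

# Two objects moving differently: a single vector leaves a large residual.
print(choose_partition(2000, [100, 90, 110, 95]))    # ('8x8', 587)
```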

Overlapped block motion compensation

Overlapped block motion compensation (OBMC) is a good solution to the blocking artifacts of block motion compensation because it not only increases prediction accuracy but also avoids the discontinuities at block borders. When using OBMC, blocks are typically twice as big in each dimension and overlap quadrant-wise with all 8 neighbouring blocks.
Thus, each pixel belongs to 4 blocks. In such a scheme, there are 4 predictions for each pixel, which are combined into a weighted mean.
For this purpose, blocks are associated with a window function that has the property that the sum of 4 overlapped windows is equal to 1 everywhere.
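
One window with this property is a separable triangular window. The Python sketch below builds it for blocks on a grid of pitch `n` (window size 2n × 2n) and checks that the four windows covering any pixel sum to exactly 1. The choice of a triangular window and the names `window_1d` and `window_2d` are assumptions of this example; other windows with the same property exist.

```python
from fractions import Fraction

def window_1d(n):
    """Triangular window of length 2n: rises over the first n samples and
    falls over the last n, so that w[i] + w[i + n] == 1 for 0 <= i < n."""
    return [Fraction(2 * i + 1, 2 * n) if i < n
            else Fraction(4 * n - 2 * i - 1, 2 * n)
            for i in range(2 * n)]

def window_2d(n):
    """Separable 2-D window: outer product of the 1-D window with itself."""
    w = window_1d(n)
    return [[wy * wx for wx in w] for wy in w]

n = 4
W = window_2d(n)

# A pixel at offset (i, j) inside a grid cell is covered by four windows and
# sees them at offsets (i, j), (i + n, j), (i, j + n) and (i + n, j + n).
sums = [[W[j][i] + W[j][i + n] + W[j + n][i] + W[j + n][i + n]
         for i in range(n)] for j in range(n)]
print(all(s == 1 for row in sums for s in row))   # True
```

Exact fractions are used so that the check does not depend on floating-point rounding.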

Studies of methods for reducing the complexity of OBMC have shown that the contribution to the window function is smallest for the diagonally-adjacent block. Reducing the weight for this contribution to zero and increasing the other weights by an equal amount leads to a substantial reduction in complexity without a large penalty in quality. In such a scheme, each pixel then belongs to 3 blocks rather than 4, and rather than using 8 neighboring blocks, only 4 are used for each block to be compensated. Such a scheme is found in the H.263 Annex F Advanced Prediction mode.

Quarter pixel (QPel) and half pixel motion compensation

In motion compensation, quarter or half samples are sub-samples interpolated from the full samples; they are needed when motion vectors are fractional. Based on the vectors and full samples, the sub-samples can be calculated using 2-D interpolation filters such as bicubic or bilinear filtering. See subclause 8.4.2.2 "Fractional sample interpolation process" of the H.264 standard.
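
As one concrete filter choice, H.264 derives luma half samples with a 6-tap filter whose taps are (1, −5, 20, 20, −5, 1) and obtains quarter samples by averaging neighbouring full and half samples. The Python sketch below applies this along a single row only. It is a one-dimensional simplification, not a conforming implementation: the function names are hypothetical, the clamping at the row ends is an assumption of this example, and 8-bit samples are assumed.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # taps sum to 32

def half_sample(row, x):
    """Half sample between row[x] and row[x + 1]. Positions outside the row
    are clamped to the nearest existing sample (an assumption of this sketch)."""
    acc = 0
    for k, tap in enumerate(TAPS):
        pos = min(max(x - 2 + k, 0), len(row) - 1)
        acc += tap * row[pos]
    # Round, divide by 32, and clip to the 8-bit sample range.
    return min(max((acc + 16) >> 5, 0), 255)

def quarter_sample(row, x):
    """Quarter sample between row[x] and the half sample to its right,
    computed as a rounded average of the two."""
    return (row[x] + half_sample(row, x) + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50]
print(half_sample(ramp, 2))      # 25: midway between 20 and 30
print(quarter_sample(ramp, 2))   # 23: rounded average of 20 and 25
```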

3D image coding techniques

Motion compensation is utilized in stereoscopic video coding.

In video, time is often considered as the third dimension. Still image coding techniques can be expanded to an extra dimension.

JPEG2000 uses wavelets, and these can also be used to encode motion without gaps between blocks in an adaptive way. Fractional pixel affine transformations lead to bleeding between adjacent pixels. If no higher internal resolution is used, the delta images mostly fight against the image smearing out. The delta image can also be encoded as wavelets, so that the borders of the adaptive blocks match.

2D+Delta encoding techniques utilize H.264 and MPEG-2 compatible coding and can use motion compensation to compress between stereoscopic images.

Expanding the 8x8 JPEG blocks into the third dimension, that is, into 8x8x8 cubes, and modifying the DCT more into a DFT enables compression of linear translations with speeds below and around one pixel per frame (sub-pixel precision).

See also

  • HDTV blur
  • Television standards conversion
  • VidFIRE
  • X-Video Motion Compensation


Applications

  • video compression
  • change of framerate for playback of 24 frames per second movies on 60 Hz LCDs or 100 Hz interlaced cathode ray tubes

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.