Solid compression
In computing, solid compression is a method for data compression of multiple files, wherein all the uncompressed files are concatenated and treated as a single data block. Such an archive is called a solid archive. It is used natively in the 7z and RAR formats, as well as indirectly in tar-based formats such as .tar.gz and .tar.bz2. By contrast, the ZIP format is not solid because it compresses and stores each file separately (although solid compression can be emulated, for example by placing an uncompressed archive inside a compressed one).

The name presumably reflects the fact that the data is compressed as a single solid block, rather than as individual files.

Explanation

Compressed file formats often feature both compression (storing the data in a small space) and archiving (storing multiple files and metadata in a single file). One can combine these in two natural ways:
  • compress the individual files, and then archive into a single file;
  • archive into a single data block, and then compress.


The order matters (these operations do not commute), and the latter is solid compression.

In Unix, compression and archiving are traditionally separate operations, which allows one to understand this distinction:
  • compressing individual files and then archiving them yields a tar of gzipped files, which is very uncommon, while
  • archiving via tar and then compressing yields a compressed archive, a .tar.gz, and this is solid compression.
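
The two orderings can be compared with a minimal sketch using Python's standard tarfile and gzip modules. The helper make_tar and the sample files are hypothetical and exist only for illustration. Note that the tar of gzipped files also pays for tar's uncompressed per-member headers and padding, so the gap shown here is larger than the compression effect alone.

```python
import gzip
import io
import tarfile


def make_tar(members, compress_members=False, mode="w"):
    """Build a tar archive in memory from a {name: bytes} mapping.

    mode="w" writes a plain tar; mode="w:gz" gzips the whole archive.
    compress_members=True gzips each file before it is archived.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for name, data in members.items():
            if compress_members:
                data = gzip.compress(data)
                name += ".gz"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical sample data: 50 small, similar text files.
files = {
    f"note{i:02d}.txt": (
        f"Report number {i}\n" + "status: ok, nothing to report\n" * 5
    ).encode()
    for i in range(50)
}

non_solid = make_tar(files, compress_members=True)  # compress, then archive
solid = make_tar(files, mode="w:gz")                # archive, then compress

print("tar of gzipped files:", len(non_solid), "bytes")
print("gzipped tar (solid): ", len(solid), "bytes")
```

On data like this, the gzipped tar should come out far smaller, because the compressor sees the tar headers and all file contents as one stream.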

Benefits

Solid compression allows for much better compression ratios when all the files are similar, which is often the case if they are of the same file format, because the compressor can exploit redundancy across file boundaries. It is also very efficient when archiving a large number of rather small files.
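
The effect on many small, similar files can be illustrated without any archive container, using only zlib from the Python standard library. This is a rough sketch with made-up sample data; actual ratios depend on the data and on the compressor.

```python
import zlib

# Hypothetical sample data: 200 small files sharing most of their content.
files = [
    (f'{{"id": {i}, "type": "event", "source": "sensor", "status": "ok"}}\n').encode()
    for i in range(200)
]

original = sum(len(f) for f in files)

# Non-solid: every file is compressed on its own, with no shared context.
separate = sum(len(zlib.compress(f)) for f in files)

# Solid: the files are concatenated and compressed as one block.
together = len(zlib.compress(b"".join(files)))

print(f"original:            {original} bytes")
print(f"compressed per file: {separate} bytes")
print(f"compressed as block: {together} bytes")
```

Compressed one by one, each file is too short for the compressor to find much repetition, and every stream carries its own fixed overhead. Compressed as one block, the repeated structure is encoded once and referenced afterwards.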

Costs

On the other hand, extracting a single file from a solid archive originally required decompressing all the files before it, so extracting from or modifying solid archives could be slow and inconvenient.
Later versions of 7-Zip use a variable solid block size, so that only a limited amount of data must be processed in order to extract one file.
Parameters control the maximum solid block size, the number of files in a block, and whether blocks are separated by file extension.
http://www.7-zip.org/history.txt
http://docs.bugaco.com/7zip/MANUAL/switches/method.htm
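
The idea of limiting the solid block size can be sketched as follows. This is a simplified illustration, not the 7z container format: the functions pack_solid_blocks and extract, the index layout, and the use of zlib are all assumptions made for the example.

```python
import zlib


def pack_solid_blocks(files, max_block_size):
    """Group (name, data) pairs into solid blocks and compress each block.

    Returns (blocks, index): blocks is a list of compressed byte strings,
    index maps name -> (block number, offset within block, length).
    """
    blocks, index = [], {}
    current, current_size = [], 0

    def flush():
        nonlocal current, current_size
        if current:
            blocks.append(zlib.compress(b"".join(current)))
            current, current_size = [], 0

    for name, data in files:
        # Start a new block when this file would overflow the current one.
        if current and current_size + len(data) > max_block_size:
            flush()
        index[name] = (len(blocks), current_size, len(data))
        current.append(data)
        current_size += len(data)
    flush()
    return blocks, index


def extract(blocks, index, name):
    """Decompress only the block that holds the requested file."""
    block_no, offset, length = index[name]
    block = zlib.decompress(blocks[block_no])
    return block[offset:offset + length]


# Hypothetical sample data: ten files of 700 bytes each.
files = [(f"f{i}.txt", f"file {i} ".encode() * 100) for i in range(10)]
blocks, index = pack_solid_blocks(files, max_block_size=2000)

print("number of solid blocks:", len(blocks))
print("f7.txt recovered:", extract(blocks, index, "f7.txt") == files[7][1])
```

A smaller block size makes single-file extraction cheaper but gives the compressor less shared context, so the choice is a trade-off between access time and compression ratio.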

Additionally, if the archive becomes even slightly damaged, some or even all of the data after the damaged part can become unusable, depending on the compression and archiving format. In a non-solid archive format, usually only one file is unusable and the subsequent files can still be extracted.
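
The difference in damage tolerance can be illustrated with a small experiment using zlib. This is a sketch under simplifying assumptions: the sample data and the helpers flip_byte and survives are hypothetical, and a one-shot decompression is used, which rejects a damaged stream as a whole. A streaming decoder might still recover the data that precedes the damage.

```python
import zlib

# Hypothetical sample data: 20 small files.
files = [f"record {i}: ".encode() + bytes([65 + i % 26]) * 200 for i in range(20)]


def flip_byte(blob, position):
    """Return a copy of blob with every bit of one byte inverted."""
    corrupted = bytearray(blob)
    corrupted[position] ^= 0xFF
    return bytes(corrupted)


def survives(blob, expected):
    """True if the blob still decompresses to exactly the expected bytes."""
    try:
        return zlib.decompress(blob) == expected
    except zlib.error:
        return False


# Non-solid: one compressed stream per file; the damage hits file 3 only.
separate = [zlib.compress(f) for f in files]
separate[3] = flip_byte(separate[3], len(separate[3]) // 2)
intact_files = sum(survives(blob, f) for blob, f in zip(separate, files))

# Solid: one compressed stream for everything; the damage is near the start.
joined = b"".join(files)
solid = flip_byte(zlib.compress(joined), 10)
solid_ok = survives(solid, joined)

print("non-solid: files still readable:", intact_files, "of", len(files))
print("solid: archive still readable:", solid_ok)
```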
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 