File Formats Wiki
Register
Advertisement
7z
Blank

Filename extension .7z
Internet media type application/x-7z-compressed
Developed by Igor Pavlov
Type Data compression
archive format
File formats category - v  e   edit

7z is a compressed archive file format that supports several different data compression, encryption and pre-processing filters. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest version of 7-Zip and LZMA SDK is version 4.65.

Features and enhancements[]

The 7z format provides the following main features:

  • Open, modular architecture which allows any compression, conversion, or encryption method to be stacked.
  • High compression ratios (depending on the compression method used)
  • Strong Rijndael/AES-256 encryption.
  • Large file support (up to approximately 16 exabytes).
  • Unicode file names
  • Support for solid compression, where multiple files of like type are compressed within a single stream, in order to exploit the combined redundancy inherent in similar files.
  • Compression and encryption of archive headers.

The format's open architecture allows additional future compression methods to be added to the standard.

Compression method filters[]

The following compression methods are currently defined:

  • LZMA – A variation of the LZ77 algorithm, using a sliding dictionary up to 4 GB in length for duplicate string elimination. The LZ stage is followed by entropy coding using a Markov chain based range coder and binary trees.
  • LZMA2 - modified version of LZMA. it provides the following advantages: better compression ratio for data than can't be compressed, better multithreading support.
  • Bzip2 – The standard Burrows-Wheeler transform algorithm. Bzip2 uses two reversible transformations; BWT, then Move to front with Huffman coding for symbol reduction (the actual compression element).
  • PPMd – Dmitry Shkarin's 2002 PPMdH (PPMII/cPPMII) with small changes: PPMII is an improved version of the 1984 PPM compression algorithm (prediction by partial matching).
  • DEFLATE

A suite of recompression tools called AdvanceCOMP contains a copy of the DEFLATE encoder from the 7-Zip implementation; these utilities can often be used to further compress the size of existing gzip, ZIP, PNG, or MNG files.

Pre-processing filters[]

The LZMA SDK comes with the BCJ / BCJ2 preprocessor included, so that later stages are able to achieve greater compression: For x86, ARM, PowerPC (PPC), IA64 and ARM Thumb processors, jump targets are normalized before compression by changing relative position into absolute values. For x86, this means that near jumps, calls and conditional jumps (but not short jumps and conditional jumps) are converted from the machine language "jump 1655 bytes backwards" style notation to normalized "jump to address 5554" style notation.

  • BCJ - Converter for 32-bit x86 executables. Normalise target addresses of near jumps and calls from relative distances to absolute destinations.
  • BCJ2 - Pre-processor for 32-bit x86 executables. BCJ2 is an improvement on BCJ, adding additional x86 jump/call instruction processing. Near jump, near call, conditional near jump targets are split out and compressed separately in another stream.
  • Delta - delta filter, basic preprocessor for multimedia data.

Similar executable pre-processing technology is included in other software; the RAR compressor features displacement compression for 32-bit x86 executables and IA64 Itanium executables, and the UPX runtime executable file compressor includes support for working with 16 bit values within DOS binary files.

Encryption[]

The 7z format supports encryption with the AES algorithm with a 256-bit key. The key is generated from a user-supplied passphrase using an algorithm based on the SHA-256 hash algorithm. The SHA-256 is executed 218 (256K) times[1] which causes a significant delay on slow PCs before compression or extraction starts. This technique is called key strengthening and is used to make a brute-force search for the passphrase more difficult.

Limitations[]

The 7z format does not store UNIX owner/group permissions, and hence can be inappropriate for backup/archival purposes. A workaround is to convert data to a tar bitstream before compressing with 7z.

Unlike WinRAR, 7z cannot extract "broken files" - that is (for example) if one has the first segment of a series of 7z files, 7z cannot give the start of the files within the archive - it must wait until all segments are downloaded. WinZip has the same limitation.

References[]

External links[]

  • 7z Format — General description about the 7z archive format.
This page uses CC-BY-SA content from Wikipedia (authors). Smallwikipedialogo.png
Advertisement