MP3

MPEG-1 Audio Layer 3 (MP3)
Filename extension: .mp3
Internet media type: audio/mpeg
Type: Audio

MPEG-1 Audio Layer 3, more commonly referred to as MP3, is a digital audio encoding format using a form of lossy data compression. It is a common audio format for consumer audio storage, as well as a de facto standard encoding for the transfer and playback of music on digital audio players. MP3 is an audio-specific format that was designed by the Moving Picture Experts Group. It was approved as an ISO/IEC standard in 1991.

MP3's lossy compression algorithm is designed to greatly reduce the amount of data required to represent an audio recording while still sounding like a faithful reproduction of the original uncompressed audio to most listeners, although audiophiles do not consider it high-fidelity audio. An MP3 file created at the mid-range bit rate of 128 kbit/s is typically about one-eleventh the size of the corresponding uncompressed CD audio (1,411.2 kbit/s). MP3 files can also be created at higher or lower bit rates, with correspondingly higher or lower quality. The compression works by reducing the accuracy of those parts of the sound that are deemed beyond the auditory resolution of most people, a method commonly referred to as perceptual coding.[1] Internally, the encoder represents the sound within a short-term time/frequency analysis window, uses psychoacoustic models to discard or reduce the precision of components that are less audible to human hearing, and records the remaining information efficiently. The principle is broadly similar to that used by JPEG, an image compression format.

Encoding audio

The MPEG-1 standard does not include a precise specification for an MP3 encoder, but the non-normative part of the original standard does provide example psychoacoustic models, a rate loop, and the like. These suggested implementations are now quite dated. Implementers of the standard were expected to devise their own algorithms for removing parts of the information in the raw audio (or rather in its MDCT representation in the frequency domain). During encoding, 576 time-domain samples are taken and transformed into 576 frequency-domain samples. If there is a transient, 192 samples are taken instead of 576. This limits the temporal spread of the quantization noise that accompanies the transient. (See psychoacoustics.)
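To give a sense of scale for the two block lengths mentioned above, the following sketch converts them to durations. It is illustrative arithmetic only: the function name `block_duration_ms` is made up for this example, the 44.1 kHz default is an assumption, and the figures ignore the overlap between adjacent transform windows.

```python
def block_duration_ms(samples: int, sample_rate_hz: int = 44100) -> float:
    """Return how long `samples` audio samples last, in milliseconds."""
    return samples / sample_rate_hz * 1000.0


# Block lengths quoted in the text: 576 normally, 192 around a transient.
long_block_ms = block_duration_ms(576)
short_block_ms = block_duration_ms(192)

print(f"long block:  {long_block_ms:.2f} ms")
print(f"short block: {short_block_ms:.2f} ms")
```

At 44.1 kHz the short block covers roughly a third of the time of the long block, which is why quantization noise around a transient is confined to a shorter interval.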

As a result, there are many different MP3 encoders available, each producing files of differing quality. Comparisons are widely available, so it is easy for a prospective user of an encoder to research the best choice. It must be kept in mind that an encoder that is proficient at encoding at higher bit rates (such as LAME) is not necessarily as good at lower bit rates.

Decoding audio

Decoding, on the other hand, is carefully defined in the standard. Most decoders are "bitstream compliant", which means that the decompressed output they produce from a given MP3 file will be the same (within a specified degree of rounding tolerance) as the output specified mathematically in the ISO/IEC standard document (ISO/IEC 11172-3).

An MP3 file has a standard format built from frames, each of which holds 384, 576, or 1152 samples (depending on the MPEG version and layer). Every frame carries header information (32 bits) and side information (9, 17, or 32 bytes, depending on the MPEG version and on whether the audio is stereo or mono). The header and side information help the decoder to decode the associated Huffman-encoded data correctly.
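The frame sizes and side-information sizes quoted above can be laid out as lookup tables. The sketch below is an illustration, not part of any decoder: the table names and the `frame_duration_ms` helper are invented here, and the mapping of each value to a specific MPEG version and layer reflects the commonly documented assignments rather than anything stated in this article.

```python
# Samples per frame, keyed by (MPEG version, layer).
SAMPLES_PER_FRAME = {
    ("MPEG-1", 1): 384,
    ("MPEG-1", 2): 1152,
    ("MPEG-1", 3): 1152,
    ("MPEG-2", 1): 384,
    ("MPEG-2", 2): 1152,
    ("MPEG-2", 3): 576,
}

# Layer 3 side-information size in bytes, keyed by (MPEG version, is_mono).
SIDE_INFO_BYTES = {
    ("MPEG-1", False): 32,
    ("MPEG-1", True): 17,
    ("MPEG-2", False): 17,
    ("MPEG-2", True): 9,
}


def frame_duration_ms(version: str, layer: int, sample_rate_hz: int) -> float:
    """Duration of one frame in milliseconds."""
    return SAMPLES_PER_FRAME[(version, layer)] / sample_rate_hz * 1000.0


mp3_frame_ms = frame_duration_ms("MPEG-1", 3, 44100)
print(f"MPEG-1 Layer 3 frame at 44.1 kHz: {mp3_frame_ms:.2f} ms")
```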

Because compliant decoders all produce essentially the same output, decoders are usually compared on how computationally efficient they are (i.e., how much memory or CPU time they use in the decoding process).

Audio quality

When performing lossy audio encoding, such as creating an MP3 file, there is a trade-off between the amount of space used and the sound quality of the result. Typically, the creator is allowed to set a bit rate, which specifies how many kilobits the file may use per second of audio, for example when ripping a compact disc to MP3 format. A lower bit rate gives lower audio quality and a smaller file. Likewise, a higher bit rate gives higher audio quality and a larger file.
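Since the bit rate is a number of kilobits per second of audio, the size of the audio data follows directly from the bit rate and the duration. A minimal sketch, assuming a constant bit rate and ignoring tags and other overhead (the function name and the four-minute track are hypothetical):

```python
def estimated_size_bytes(bit_rate_kbps: int, duration_s: int) -> int:
    """Approximate audio payload size for a constant-bit-rate stream."""
    total_bits = bit_rate_kbps * 1000 * duration_s
    return total_bits // 8


# A hypothetical four-minute (240 s) track at three bit rates.
for rate in (64, 128, 320):
    size_mb = estimated_size_bytes(rate, 240) / 1_000_000
    print(f"{rate:>3} kbit/s -> {size_mb:.2f} MB")
```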

Files encoded with a lower bit rate will generally play back at a lower quality. With too low a bit rate, "compression artifacts" (i.e., sounds that were not present in the original recording) may be audible in the reproduction. Some audio is hard to compress because of its randomness and sharp attacks. When this type of audio is compressed, artifacts such as ringing or pre-echo are usually heard. A sample of applause compressed with a relatively low bit rate provides a good example of compression artifacts.

Besides the bit rate of an encoded piece of audio, the quality of MP3 files also depends on the quality of the encoder itself and on the difficulty of the signal being encoded. As the MP3 standard allows quite a bit of freedom in encoding algorithms, different encoders may deliver markedly different quality, even at identical bit rates.

The simplest type of MP3 file uses one bit rate for the entire file; this is known as constant bitrate (CBR) encoding. Using a constant bit rate makes encoding simpler and faster. However, it is also possible to create files in which the bit rate changes throughout the file. These are known as variable bitrate (VBR) files. The idea is that, in any piece of audio, some parts (such as silence or music containing only a few instruments) are much easier to compress than others. The overall quality of the file may therefore be increased by using a lower bit rate for the less complex passages and a higher one for the more complex parts. With some encoders, it is possible to specify a given quality, and the encoder will vary the bit rate accordingly. Users who know a particular "quality setting" that is transparent to their ears can use this value when encoding all of their music, without needing to perform personal listening tests on each piece of music to determine the correct bit rate.
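The difference between CBR and VBR can be illustrated by comparing per-frame bit rates. The sketch below uses made-up frame sequences and assumes every frame covers the same duration, so the average bit rate is a plain mean; it does not model how a real encoder chooses the rate for each frame.

```python
def average_bit_rate_kbps(frame_bit_rates_kbps: list[int]) -> float:
    """Mean bit rate of a stream whose frames all cover the same duration."""
    if not frame_bit_rates_kbps:
        raise ValueError("need at least one frame")
    return sum(frame_bit_rates_kbps) / len(frame_bit_rates_kbps)


# Hypothetical six-frame excerpts: one CBR, one VBR.
cbr_frames = [128] * 6
vbr_frames = [32, 64, 192, 256, 160, 64]  # quiet start, busy middle, calmer end

cbr_average = average_bit_rate_kbps(cbr_frames)
vbr_average = average_bit_rate_kbps(vbr_frames)
print(f"CBR average: {cbr_average:.0f} kbit/s")
print(f"VBR average: {vbr_average:.0f} kbit/s")
```

In this contrived case both excerpts use the same total number of bits, but the VBR excerpt spends them where the audio is assumed to be harder to compress.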

In a listening test, MP3 encoders at low bit rates performed significantly worse than those using more modern compression methods (such as AAC).

Perceived quality can be influenced by the listening environment (ambient noise), by the listener's attention and training, and in most cases by the listener's audio equipment (such as sound cards, speakers and headphones).

Bit rate

Several bit rates are specified in the MPEG-1 Layer 3 standard: 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320 kbit/s, and the available sampling frequencies are 32, 44.1 and 48 kHz. A sample rate of 44.1 kHz is almost always used, because it is also the rate used for CD audio, the main source for creating MP3 files. A wide variety of bit rates is used on the Internet. 128 kbit/s is the most common, because it typically offers adequate audio quality in a relatively small space. 192 kbit/s is often used by those who notice artifacts at lower bit rates. As Internet bandwidth and hard drive sizes have increased, 128 kbit/s files are slowly being replaced by files at higher bit rates such as 192 kbit/s, with some being encoded at MP3's maximum of 320 kbit/s. It is unlikely that much higher bit rates will become popular with any lossy audio codec, because file sizes at such bit rates approach those of lossless codecs such as FLAC.

By contrast, uncompressed audio as stored on a compact disc has a bit rate of 1,411.2 kbit/s (16 bit/sample × 44100 samples/second × 2 channels / 1000 bits/kilobit).
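The arithmetic above is easy to reproduce, and the same figure gives the compression ratio relative to a given MP3 bit rate. A small sketch (the function name is invented for this example):

```python
def pcm_bit_rate_kbps(bits_per_sample: int, sample_rate_hz: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kbit/s (1 kbit = 1000 bits)."""
    return bits_per_sample * sample_rate_hz * channels / 1000


cd_kbps = pcm_bit_rate_kbps(16, 44100, 2)
ratio_at_128 = cd_kbps / 128

print(f"CD audio: {cd_kbps} kbit/s")
print(f"Ratio to a 128 kbit/s MP3: about {ratio_at_128:.1f} to 1")
```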

Some additional bit rates and sample rates were made available in the MPEG-2 and the (unofficial) MPEG-2.5 standards: bit rates of 8, 16, 24, and 144 kbit/s and sample rates of 8, 11.025, 12, 16, 22.05 and 24 kHz.

Non-standard bit rates up to 640 kbit/s can be achieved with the LAME encoder and the freeformat option, although few MP3 players can play those files. According to the ISO standard, decoders are only required to be able to decode streams up to 320 kbit/s.

File structure

An MP3 file is made up of multiple MP3 frames, each consisting of a header and a data block. This sequence of frames is called an elementary stream. Frames are not independent of one another: because of the "byte reservoir", a frame's audio data may begin in the unused space of preceding frames, so frames cannot be extracted at arbitrary frame boundaries. The MP3 data blocks contain the (compressed) audio information in terms of frequencies and amplitudes. As the diagram shows, the MP3 header begins with a sync word, which is used to identify the beginning of a valid frame. This is followed by a bit indicating that this is the MPEG standard and two bits indicating that Layer 3 is used; hence MPEG-1 Audio Layer 3, or MP3. The values after this differ depending on the MP3 file. ISO/IEC 11172-3 defines the header layout and the range of values for each of its fields. Most MP3 files today contain ID3 metadata, which precedes or follows the MP3 frames, as noted in the diagram.
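The header fields described above can be pulled out of the 32-bit header with a few shifts and masks. The sketch below is a minimal illustration that handles MPEG-1 Layer 3 only. It assumes the widely used reading of the header as an 11-bit sync word followed by a 2-bit version field (which covers the same bits as the 12-bit sync word plus 1-bit ID for MPEG-1), and the function and table names are invented for this example.

```python
import struct

# MPEG-1 Layer 3 lookup tables; None marks "free format" or reserved values.
BIT_RATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                  128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_mpeg1_layer3_header(header: bytes) -> dict:
    """Decode the main fields of a 4-byte MPEG-1 Layer 3 frame header."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    (word,) = struct.unpack(">I", header)      # big-endian 32-bit integer
    if word >> 21 != 0x7FF:                    # top 11 bits must all be set
        raise ValueError("sync word not found")
    version_bits = (word >> 19) & 0b11         # 0b11 means MPEG-1
    layer_bits = (word >> 17) & 0b11           # 0b01 means Layer 3
    if version_bits != 0b11 or layer_bits != 0b01:
        raise ValueError("not an MPEG-1 Layer 3 header")
    bit_rate = BIT_RATES_KBPS[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    if bit_rate is None or sample_rate is None:
        raise ValueError("free-format or reserved value not handled here")
    padding = (word >> 9) & 1
    return {
        "bit_rate_kbps": bit_rate,
        "sample_rate_hz": sample_rate,
        "padding": padding,
        "channel_mode": CHANNEL_MODES[(word >> 6) & 0b11],
        # Frame length in bytes, header included.
        "frame_bytes": 144 * bit_rate * 1000 // sample_rate + padding,
    }


example = parse_mpeg1_layer3_header(bytes.fromhex("fffb9000"))
print(example)
```

A real parser would also need to skip any ID3 data, handle the other MPEG versions and layers, and guard against byte patterns in the audio data that merely look like a sync word.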

ID3 and other tags

Main articles: ID3 and APEv2 tag

A "tag" in an audio file is a section of the file that contains metadata such as the title, artist, album, track number or other information about the file's contents.

As of 2006, the most widespread standard tag formats are ID3v1 and ID3v2, and the more recently introduced APEv2.

APEv2 was originally developed for the MPC file format. APEv2 can coexist with ID3 tags in the same file or it can be used by itself.

Tag editing functionality is often built into MP3 players and editors, but dedicated tag editors also exist.
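As an illustration of what a tag looks like on disk, the sketch below decodes an ID3v1 tag, the simplest of the formats mentioned above. It relies on the commonly documented ID3v1 layout (a fixed 128-byte block at the end of the file, starting with the bytes "TAG"), which is not described in this article; the function names are invented, and Latin-1 text is an assumption, since ID3v1 does not declare an encoding.

```python
import struct

# "TAG", title, artist, album, year, comment, genre index: 128 bytes in total.
ID3V1_LAYOUT = "<3s30s30s30s4s30sB"


def parse_id3v1(tag: bytes):
    """Decode a 128-byte ID3v1 block, or return None if it is not one."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_LAYOUT, tag)

    def clean(raw: bytes) -> str:
        # Fields are padded with NUL bytes (or sometimes spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": clean(title),
        "artist": clean(artist),
        "album": clean(album),
        "year": clean(year),
        "comment": clean(comment),
        "genre_index": genre,
    }


def read_id3v1(path: str):
    """Read the last 128 bytes of a file and try to decode them as ID3v1."""
    with open(path, "rb") as f:
        f.seek(0, 2)                 # jump to the end to learn the file size
        if f.tell() < 128:
            return None
        f.seek(-128, 2)
        return parse_id3v1(f.read(128))


# A hand-built tag with hypothetical values, for demonstration only.
demo_tag = (b"TAG" + b"Title".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
            + b"Album".ljust(30, b"\x00") + b"2006" + bytes(30) + bytes([17]))
demo = parse_id3v1(demo_tag)
print(demo)
```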

Volume normalization

Since volume levels of different audio sources can vary greatly, it is sometimes desirable to adjust the playback volume of audio files such that a consistent average volume is perceived. The idea is to control the average volume across multiple files, not the volume peaks in a single file. This gain normalization, while similar in purpose, is distinct from dynamic range compression (DRC), which is a form of normalization used in audio mastering. Gain normalization may defeat the intent of recording artists and audio engineers who deliberately set the volume levels of the audio they recorded.
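The idea of bringing several files to a common average level can be sketched with a simple RMS measurement. This is a deliberately simplified illustration: it treats plain RMS as the "average volume", whereas real normalization tools typically use perceptual loudness measures, and all names, sample values and the target level are hypothetical.

```python
import math


def rms(samples: list[float]) -> float:
    """Root-mean-square level of a list of samples in the range -1.0 to 1.0."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_db_to_target(samples: list[float], target_rms: float) -> float:
    """Gain in decibels that would bring the track's RMS level to the target."""
    level = rms(samples)
    if level == 0:
        raise ValueError("cannot normalize a silent track")
    return 20 * math.log10(target_rms / level)


def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale every sample by a single gain factor; dynamics are unchanged."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


# Two hypothetical tracks with very different average levels.
quiet_track = [0.1, -0.1, 0.1, -0.1]
loud_track = [0.5, -0.5, 0.5, -0.5]
target = 0.25

quiet_gain = gain_db_to_target(quiet_track, target)
loud_gain = gain_db_to_target(loud_track, target)
print(f"quiet track: {quiet_gain:+.2f} dB, loud track: {loud_gain:+.2f} dB")
```

Because one gain value is applied to the whole track, the relative levels inside each track are preserved, which is what distinguishes this approach from dynamic range compression.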

Alternative technologies

Main article: List of codecs

Many other lossy and lossless audio codecs exist. Among these, mp3PRO, AAC, and MP2 are all members of the same technological family as MP3 and depend on roughly similar psychoacoustic models. The Fraunhofer Gesellschaft owns many of the basic patents underlying these codecs as well, with others held by Dolby Labs, Sony, Thomson Consumer Electronics, and AT&T. In addition, there is the open source format Ogg Vorbis, which has been available free of charge and free of legal threat thanks to the Xiph.Org open source community.

References

  1. Jayant, Nikil; Johnston, James; Safranek, Robert (October 1993). "Signal Compression Based on Models of Human Perception". Proceedings of the IEEE 81 (10): 1385–1422. doi:10.1109/5.241504. 

This page uses CC-BY-SA content from Wikipedia (authors).