MPEG Surround
MPEG Surround (ISO/IEC 23003-1 or MPEG-D) is a lossy compression format for surround sound that provides a method for extending mono or stereo audio services to multi-channel audio in a backwards compatible fashion. The total bit rates used for the (mono or stereo) core and the MPEG Surround data are typically only slightly higher than the bit rates used for coding of the (mono or stereo) core. MPEG Surround adds a side-information stream to the (mono or stereo) core bit stream, containing spatial image data. Legacy stereo playback systems will ignore this side-information while players supporting MPEG Surround decoding will output the reconstructed multi-channel audio.
The (mono or stereo) core could be coded with any (lossy or lossless) audio codec. Particularly low bitrates (64-96 kbit/s for 5.1 channels) are possible when using HE-AAC v2 as the core codec.
Perception of sounds in space
MPEG Surround coding exploits the human capacity to perceive sound in three dimensions and captures that perception in a compact set of parameters. Spatial perception is primarily attributed to three parameters, or cues, that describe how humans localize sound in the horizontal plane: interaural level difference (ILD), interaural time difference (ITD) and interaural coherence (IC). These three concepts are illustrated in the accompanying image. Direct, or first-arrival, waveforms from the source reach the left ear first, while the direct sound received by the right ear is diffracted around the head, with an associated time delay and level attenuation. These two effects produce the ITD and ILD associated with the main source. Finally, in a reverberant environment, reflected sound from the source, sound from diffuse sources, or uncorrelated sound can reach both ears; all of these are related to IC.
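The three cues can be made concrete with a small numerical sketch. The function below is only an illustration, not part of the standard: the name `interaural_cues`, its parameters, and the choice of a brute-force cross-correlation search are all assumptions made for this example. It estimates ILD as an energy ratio in decibels, ITD as the lag that maximizes the normalized cross-correlation between the two ear signals, and IC as the value of that maximum.

```python
import math

def interaural_cues(left, right, sample_rate, max_lag):
    """Estimate (ILD in dB, ITD in seconds, IC) from two equal-length ear signals.

    Illustrative only: assumes both signals are non-silent and that the true
    delay lies within +/- max_lag samples.
    """
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    # ILD: energy ratio between the ears in decibels (positive = left louder).
    ild_db = 10.0 * math.log10(energy_l / energy_r)

    norm = math.sqrt(energy_l * energy_r)
    n = len(left)
    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag]; a positive lag means the
        # right ear receives the sound later than the left ear.
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        corr = acc / norm
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    itd_seconds = best_lag / sample_rate   # ITD: lag of the correlation peak
    return ild_db, itd_seconds, best_corr  # IC: height of the correlation peak
```

For a right-ear signal that is simply a delayed, attenuated copy of the left-ear signal, the sketch reports a positive ILD, a positive ITD and an IC close to 1; adding independent reverberation or noise to each ear would lower the IC.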
Description
MPEG Surround uses interchannel differences in level, phase and coherence, which correspond to the ILD, ITD and IC cues. These parameters capture the spatial image of a multichannel audio signal relative to a transmitted downmix signal. They are encoded in a very compact form, so that a decoder can combine the decoded parameters with the transmitted signal to synthesize a high-quality multichannel representation.
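The idea of describing channels relative to a downmix can be sketched for a single channel pair. The code below is a simplified illustration under stated assumptions, not the standardized algorithm: the function names are hypothetical, the analysis runs over the whole signal rather than per time and frequency band, and the synthesis step only restores the level difference (a real decoder would also use the coherence parameter to restore the width of the image).

```python
import math

def analyze_pair(ch_a, ch_b):
    """Encoder side: return (downmix, level_diff_db, coherence) for two channels.

    Assumes equal-length, non-silent inputs.
    """
    energy_a = sum(x * x for x in ch_a)
    energy_b = sum(x * x for x in ch_b)
    cross = sum(a * b for a, b in zip(ch_a, ch_b))
    level_diff_db = 10.0 * math.log10(energy_a / energy_b)
    coherence = cross / math.sqrt(energy_a * energy_b)
    downmix = [a + b for a, b in zip(ch_a, ch_b)]   # plain sum, an assumption
    return downmix, level_diff_db, coherence

def synthesize_pair(downmix, level_diff_db):
    """Decoder side: split the downmix back into two channels by level only."""
    amp_ratio = 10.0 ** (level_diff_db / 20.0)      # amplitude ratio a / b
    gain_a = amp_ratio / (1.0 + amp_ratio)
    gain_b = 1.0 / (1.0 + amp_ratio)                # gain_a + gain_b == 1
    return [gain_a * m for m in downmix], [gain_b * m for m in downmix]
```

When the two channels are fully coherent and in phase, the level difference alone is enough to recover them from the downmix; the less coherent they are, the more this simplified synthesis departs from the original pair.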
The MPEG Surround encoder receives a multichannel audio signal, x1 to xN, where N is the number of input channels. The most important aspect of the encoding process is that a downmix signal, xt1 and xt2, which is typically stereo, is derived from the multichannel input, and it is this downmix, rather than the multichannel signal, that is compressed for transmission over the channel. The encoder can exploit the downmix process to its advantage: it creates not only a faithful mono or stereo equivalent of the multichannel signal, but also the best possible basis for multichannel decoding from the downmix and the encoded spatial cues. Alternatively, the downmix can be supplied externally (Artistic Downmix in the preceding block diagram). The MPEG Surround encoding process is independent of the compression algorithm used for the transmitted channels (Audio Encoder and Audio Decoder in the preceding block diagram). That algorithm can be any high-performance codec, such as MPEG-1 Layer III, MPEG-4 AAC or MPEG-4 High Efficiency AAC, or the channels can even be carried as PCM.
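As a rough illustration of deriving a stereo downmix from a multichannel input, the sketch below folds five main channels into two. The weights are an assumption: the article does not state which coefficients the encoder uses, so the example borrows the common convention of attenuating the centre and surround channels by about 3 dB, and it leaves out the LFE channel. The function name is hypothetical.

```python
import math

# Assumed weight (about -3 dB) for the centre and surround channels.
ATTENUATION = 1.0 / math.sqrt(2.0)

def downmix_to_stereo(front_l, front_r, centre, surround_l, surround_r):
    """Fold five equal-length channels into a (left, right) stereo pair."""
    left = [
        fl + ATTENUATION * c + ATTENUATION * sl
        for fl, c, sl in zip(front_l, centre, surround_l)
    ]
    right = [
        fr + ATTENUATION * c + ATTENUATION * sr
        for fr, c, sr in zip(front_r, centre, surround_r)
    ]
    return left, right
```

The resulting pair plays the role of xt1 and xt2 in the description above: it is what the core audio codec compresses, while the spatial parameters travel alongside it as side information.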
Legacy compatibility
The most important feature of the MPEG Surround technique is that the transmitted downmix (e.g. stereo) is an excellent stereo version of the multichannel signal. If an artistic downmix is available, it is used instead. As shown in Figure 2, this downmix is available at the output of the Audio Decoder. Hence, stereo-only decoders are at no disadvantage compared with MPEG Surround decoders. This is necessary because stereo presentation will remain pervasive, given the number of applications in which listening is primarily via headphones, such as portable music players. Additionally, MPEG Surround supports a mode in which the downmix is compatible with popular matrix surround decoders, e.g. Dolby Surround.
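The compatibility principle can be pictured with a toy model in which each frame carries the decoded stereo core plus optional side information. Everything here is hypothetical and chosen for illustration: the `Frame` class, the two player functions, and the `centre_gain` and `surround_gain` fields are not taken from the MPEG Surround bitstream, and the upmix is deliberately trivial.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Frame:
    stereo_core: Tuple[List[float], List[float]]          # decoded downmix (left, right)
    spatial_side_info: Optional[Dict[str, float]] = None  # hypothetical parameters

def legacy_play(frame: Frame) -> Dict[str, List[float]]:
    """A stereo-only player: uses the core and never looks at the side info."""
    left, right = frame.stereo_core
    return {"L": left, "R": right}

def surround_play(frame: Frame) -> Dict[str, List[float]]:
    """A surround-aware player: falls back to stereo if no side info is present."""
    info = frame.spatial_side_info
    if info is None:
        return legacy_play(frame)
    left, right = frame.stereo_core
    # Toy upmix: derive extra channels from the core using the side info gains.
    centre = [info["centre_gain"] * 0.5 * (l + r) for l, r in zip(left, right)]
    return {
        "L": left,
        "R": right,
        "C": centre,
        "Ls": [info["surround_gain"] * l for l in left],
        "Rs": [info["surround_gain"] * r for r in right],
    }
```

The point of the sketch is the control flow rather than the signal processing: the same frame yields an unchanged stereo signal on the legacy path and a multichannel signal on the surround path.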
Applications
Digital Audio Broadcasting: Due to the relatively small channel bandwidth, the relatively large cost of transmission equipment and transmission licenses and the desire to maximize user choices by providing many programs, the majority of existing or planned digital broadcasting systems cannot provide multichannel sound to the users. Adding this feature could be a strong motivation for users to make the transition from their traditional FM receivers to new digital receivers. MPEG Surround technology could be a key factor in increasing the attractiveness of digital radio systems since it provides functionality not obtainable in other ways.
The backwards compatibility of MPEG Surround Sound Coding to existing stereo digital radio receivers is one of the key factors for existing digital audio broadcasting systems. This compatibility approach has the following advantages:
- Simulcast of a stereo and a 5.1 multichannel audio stream can be avoided
- Smooth transition from a pure stereo to a 5.1 multichannel audio service
- Minimum amount of additional bit-rate for spatial information required
- Compatibility provided for stereo audio and Programme Associated Data
- Same error robustness for stereo as well as for the 5.1 multichannel audio service
- No reconfiguration of multiplexer necessary
- Minimum additional operational complexity for service providers to offer both a stereo and a 5.1 multichannel audio service
Digital TV Broadcasting: Currently, the majority of digital TV broadcasts use stereo audio coding. MPEG Surround constitutes an excellent opportunity to extend these established services to surround sound. For a small additional overhead in bitrate, MPEG Surround enables stereo audio presentations in existing and new services to be smoothly upgraded to multichannel audio, while maintaining stereo backwards compatibility without any restriction or quality degradation for the installed receiver base. New receivers incorporating MPEG Surround will be able to deliver multichannel audio using the same transmitted signal. Hence MPEG Surround is of interest to any digital TV broadcasting system, either because it allows a backward-compatible upgrade of the receivers or because of the high efficiency of the multichannel transmission.
Music download service: Currently, a number of commercial music download services are available and working with considerable commercial success. Such services could be seamlessly extended to provide multichannel presentations while remaining compatible with stereo players: on computers with 5.1 channel playback systems the compressed sound files are presented in surround sound while on portable players the same files are reproduced in stereo.
Streaming music service / Internet radio: Many Internet radio stations operate with severely constrained transmission bandwidth, such that they can offer only mono or stereo content. MPEG Surround Coding technology could extend this to a multichannel service while still remaining within the permissible operating range of bitrates. Since efficiency is of paramount importance in this application, compression of the transmitted audio signal is vital. Using recent MPEG compression technology (MPEG-4 High Efficiency Profile coding), full MPEG Surround systems have been demonstrated with bitrates as low as 48 kbit/s.
Conclusions
MPEG Surround is a recent technology for bitrate-efficient and backward-compatible presentation of multichannel audio. Developed as an MPEG Audio work item, it has been demonstrated to deliver high-quality multichannel audio at bitrates as low as 48 kbit/s. The applications that benefit most are those currently based on the presentation of mono or stereo signals, since the same transmission channels can now carry multichannel presentations for new decoder equipment while retaining high-quality mono or stereo presentation for reproduction by legacy equipment.



