DAB vs DAB+ technology
Introduction
The DAB system was designed in the 1980s, and because the technologies it uses
are unchanged to this day, DAB is a very
inefficient system by today's standards.
In 2003, new systems emerged that had been designed to enable mobile TV,
such as DVB-H and T-DMB, and these systems could also carry digital radio.
However, crucially, these systems used the AAC+ audio codec and Reed-Solomon
error correction coding, and the combination of these two technologies made
DVB-H six times as efficient as DAB and T-DMB four times as efficient as DAB.
Because of its inherent problems, broadcasters and governments from
numerous countries became opposed to using the old DAB system, so WorldDAB
(now called WorldDMB) was forced to upgrade the old DAB system or risk seeing
the UK, Denmark and Norway stranded using the old DAB system while all other
countries would have chosen a more modern system instead. The upgrade they came up with is called 'DAB+', and this page
compares the technologies used on DAB with those used on the new DAB+ system.
1.1.1 MP2 Codec Technology
Codec Type
MP2 is what is called a 'sub-band codec', which means that the linear PCM
input audio signal is first split by a filterbank (a poly-phase quadrature
mirror filter (PQMF)) into 32 equal-bandwidth 'sub-bands'. The input signal is
then analysed by the psychoacoustic model (a model of the human hearing
system), which determines which of the 32 sub-bands will be perceptible by
humans and which won't be. Those that are deemed to be
imperceptible are not transmitted, whereas the subbands that are deemed to be
perceptible are all encoded.
As the input signal is simply split into 32 subbands and the signal is not
transformed into the frequency domain, MP2 compression takes place in the time
domain, as is always the case with subband codecs.
Frequency Resolution
Referring to Nyquist's Sampling Theorem:
Fs >= 2 B
where Fs is the sampling frequency and B is the bandwidth of the
signal. For a sampling frequency of 48 kHz, which the DAB system uses, MP2's
frequency resolution (the bandwidth of each subband) is:
MP2 frequency resolution = (48 kHz / 2) / 32
MP2 frequency resolution = 750 Hz
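The calculation above can be sketched in a few lines of Python. The function name `subband_width_hz` is purely illustrative and is not taken from any standard or library.

```python
def subband_width_hz(sampling_rate_hz: float, num_subbands: int) -> float:
    """Width of each equal-bandwidth subband between 0 Hz and the Nyquist frequency."""
    nyquist_hz = sampling_rate_hz / 2  # highest representable frequency, B = Fs / 2
    return nyquist_hz / num_subbands


# DAB figures from the text: 48 kHz sampling, 32 subbands
print(subband_width_hz(48_000, 32), "Hz per subband")
```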
Stereo Coding
MP2 can encode stereo signals either using discrete stereo coding or joint
stereo coding.
Discrete stereo coding consists simply of encoding the left and right channels separately.
Discrete stereo coding is only suitable for high bit rates such as 192 kbps and above.
Intensity joint stereo coding consists of encoding the left and
right channels together above a certain frequency threshold that is set on
a frame-by-frame basis by the encoder. Intensity stereo is the only type of joint stereo that MP2 can
use (other codecs can use the superior mid/side joint stereo method) and
intensity stereo can be used from a frequency of 2.25 kHz and upwards -- i.e.
it can be used over the top 90% of the audio band, and on 128 kbps MP2
services on DAB it tends to be used over the whole of the available range,
thus 128 kbps radio stations are actually panned mono rather than stereo.
For each subband, the
amplitudes of the left and right channels are added together then encoded
along with a vector for each subband that equates to the ratio between the
average energy (the intensity) of the left and right channels. Joint stereo
is invariably used on MP2 at
bit rates of 160 kbps and below, because at such low bit rates it is
preferable to accept some degradation of the stereo image in order to save
some bits so that more bits can be used to more accurately represent the audio
samples.
Intensity stereo is a lossy coding method, because the relative
phase information between the left and right channels is lost when the left
and right channel samples are added together.
The ear uses the relative time-difference between sounds in the left and
right ears to determine the direction from which the sound came, and it is
therefore this relative phase information that creates the stereo image.
Because the relative phase information above a certain frequency threshold is
destroyed, the 128 kbps DAB radio stations
either have a very poor stereo image or the stereo image totally collapses.
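The loss of the inter-channel time difference can be illustrated with a small Python sketch. It is a deliberately simplified model, not MP2's actual bitstream processing: a single 3 kHz tone stands in for one subband, and the decoder gains are an assumption chosen so that the decoded channels keep the transmitted energy ratio. All function names are hypothetical.

```python
import math

SAMPLES = 480   # 10 ms of audio at 48 kHz
PERIOD = 16     # a 3 kHz tone at 48 kHz, i.e. above the 2.25 kHz threshold
DELAY = 4       # the right channel lags the left channel by 4 samples

left = [math.sin(2 * math.pi * n / PERIOD) for n in range(SAMPLES)]
right = [math.sin(2 * math.pi * (n - DELAY) / PERIOD) for n in range(SAMPLES)]


def intensity_encode(l, r):
    """Keep only the summed signal and the left/right energy ratio."""
    summed = [a + b for a, b in zip(l, r)]
    energy_ratio = sum(a * a for a in l) / sum(b * b for b in r)
    return summed, energy_ratio


def intensity_decode(summed, energy_ratio):
    """Rebuild both channels as scaled copies of the summed signal (assumed decoder)."""
    amplitude_ratio = math.sqrt(energy_ratio)
    gain_left = amplitude_ratio / (1 + amplitude_ratio)
    gain_right = 1 / (1 + amplitude_ratio)
    return [gain_left * s for s in summed], [gain_right * s for s in summed]


def best_lag(a, b, max_lag=7):
    """Lag in samples at which b lines up best with a (circular correlation)."""
    def corr(lag):
        return sum(a[n] * b[(n + lag) % len(b)] for n in range(len(a)))
    return max(range(-max_lag, max_lag + 1), key=corr)


decoded_left, decoded_right = intensity_decode(*intensity_encode(left, right))
print("inter-channel lag before coding:", best_lag(left, right))
print("inter-channel lag after coding: ", best_lag(decoded_left, decoded_right))
```

Because both decoded channels are scaled copies of the same summed signal, the delay between them is gone, and only the level difference (the panning) survives.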
The encoder can choose on an audio frame-by-audio frame basis whether to
use joint stereo or discrete stereo coding. At bit rates as low as 128 kbps
the encoder forces joint stereo for all frames, and at 160 kbps either all or
virtually all frames use joint stereo coding. Although most radio stations that use 192 kbps
-- for example on digital satellite -- use discrete stereo coding (which disallows the use of joint stereo for any audio
frames), it is still
better to use joint stereo at a bit rate of 192 kbps so that the encoder can
have the option to choose joint stereo coding for audio frames where it would be beneficial. Unfortunately, the BBC
misguidedly changed Radios 1-4 on Freeview (which all use 192 kbps) from being
joint stereo to discrete stereo in 2004, and the audio quality of these
stations on Freeview has never been as good since.
Sweet Spot
Karlheinz Brandenburg, the inventor of MP3, describes the sweet spot of an
audio codec as follows:
"Lower bit-rates will lead to higher compression factors, but lower quality of the compressed
audio. Higher bit-rates lead to a lower probability of signals with any audible artifacts. However, different
encoding algorithms do have "sweet spots" where
they work best. At bit-rates much larger than this target bit-rate the audio quality improves only very slowly
with bit-rate, at much lower bit-rates the quality decreases very fast."
MP2's sweet spot is at 192 kbps.
Just for comparison, Karlheinz Brandenburg describes the sweet spots of MP3
and AAC as follows:
"For Layer-3 [MP3] this target bit-rate is around 1.33 bit/sample (i.e. 128
kbit/s for a stereo signal at 48 kHz), for AAC it is around 1 bit/sample (i.e. 96 kbit/s for a stereo signal at 48 kHz)."
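The bit/sample figures in the quote convert to bit rates as sketched below. The helper names are illustrative only, and 4/3 is used as the exact value behind the quoted "around 1.33".

```python
def bit_rate_kbps(bits_per_sample, sampling_rate_hz=48_000, channels=2):
    """Bit rate implied by an average number of bits spent per audio sample."""
    return bits_per_sample * sampling_rate_hz * channels / 1000


def bits_per_sample(bit_rate_in_kbps, sampling_rate_hz=48_000, channels=2):
    """Average bits per audio sample for a given bit rate."""
    return bit_rate_in_kbps * 1000 / (sampling_rate_hz * channels)


print("MP3 sweet spot:", bit_rate_kbps(4 / 3), "kbps")
print("AAC sweet spot:", bit_rate_kbps(1.0), "kbps")
print("MP2 at 192 kbps:", bits_per_sample(192), "bit/sample")
```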
1.1.2 AAC+ Codec Technology
Codec Type
AAC+ is the combination of the standard AAC (LC AAC -- low complexity AAC)
codec with Coding Technologies' Spectral Band Replication (SBR) technology --
the AAC audio codec encodes the bottom half of the audio spectrum, and SBR
encodes the top half of the audio spectrum.
AAC is a transform codec, which means that blocks of input audio samples
are first transformed into the frequency domain by means of a modified
discrete cosine transform (MDCT), and compression takes place in the frequency
domain, not the time domain.
SBR is based on the fact that there is a strong correlation between the top
and bottom halves of the audio spectrum due to the presence of harmonics of
the lower frequencies. SBR works by transposing the bottom half of the audio
spectrum to the top half, and then modifying this top half of the spectrum so
that it resembles the actual top half of the audio spectrum more closely. The
SBR data consists only of the modification information, and this requires only an
ultra-low bit rate channel of between 1 and 3 kbps, which is far lower than the
bit rate that would be required if the top half of the audio spectrum were encoded by any
other audio coding method. This is why AAC+'s official name is HE AAC -- High
Efficiency AAC.
Frequency Resolution
AAC's frequency resolution varies depending on the statistical properties
of the input signal: typically the signal has stationary statistics and a
2,048-point MDCT is performed, which gives a frequency resolution of 23 Hz for
a 48 kHz sampling frequency input signal. For transient signals the AAC
encoder reverts to a 256-point MDCT in order to improve the time resolution to
avoid pre-echo.
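The trade-off between frequency resolution and time resolution for the two block lengths can be worked through as follows. This sketch follows the text's convention of dividing the sampling frequency by the block length. The function names are illustrative.

```python
def mdct_resolution_hz(sampling_rate_hz, block_length):
    """Frequency resolution, using the text's convention Fs / block length."""
    return sampling_rate_hz / block_length


def block_duration_ms(sampling_rate_hz, block_length):
    """Time span covered by one block of input samples."""
    return 1000 * block_length / sampling_rate_hz


for block_length in (2048, 256):
    print(
        f"{block_length}-point block: "
        f"{mdct_resolution_hz(48_000, block_length):.1f} Hz resolution, "
        f"{block_duration_ms(48_000, block_length):.1f} ms per block"
    )
```

The long block gives the fine frequency resolution quoted above, while the short block covers an eighth of the time, which is what limits pre-echo on transients.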
Stereo Coding
AAC can use the following stereo coding types:
- Discrete stereo coding: as explained in the MP2 section above.
- Mid/side joint stereo coding: the encoder forms the sum of the left
and right channels (L+R) and the difference between the left and right
channels (L-R), and encodes the sum and difference signals separately:
M = L + R
S = L - R
The decoder
then carries out the following equations to recover the L and R signals:
Left = 0.5 x (M + S) = 0.5 x ((L+R) + (L-R)) = 0.5 x 2L = L
Right = 0.5 x (M - S) = 0.5 x ((L+R) - (L-R)) = 0.5 x 2R = R
Unlike intensity stereo coding,
which is a lossy form of joint stereo coding, mid/side joint stereo coding is lossless,
which means that the decoder recovers the original left and right signals
exactly. Therefore, mid/side joint stereo coding does not
have the collapsing and non-existent stereo image problems that MP2 has at low
bit rates.
- Intensity stereo coding: as explained in the MP2 section above.
AAC allows the encoder to choose between the stereo coding modes listed
above in a more flexible fashion than MP2 allows.
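The mid/side equations can be checked with a short round trip in Python. This sketch covers only the sum/difference transform itself, on made-up integer sample values, and ignores the quantisation that any real encoder applies afterwards. The function names are illustrative.

```python
def ms_encode(left, right):
    """Form the mid (sum) and side (difference) signals."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left and right; M + S = 2L and M - S = 2R are always even."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


# Arbitrary 16-bit sample values, chosen only for illustration
left = [1200, -340, 15, 0, -32768]
right = [1100, -300, -15, 7, 32767]

mid, side = ms_encode(left, right)
print("round trip exact:", ms_decode(mid, side) == (left, right))
```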
Sweet Spot
AAC+'s sweet spot is at approximately 64 kbps.
MP2 vs AAC/AAC+ performance
To compare audio codecs, listening tests are carried out that follow the
ITU BS.1116 standard so that objective comparisons can be made between audio
codecs and between different bit rate levels.
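The BS.1116 standard defines that testers should grade the audio quality
according to the following five-grade impairment scale:
5.0 - Imperceptible
4.0 - Perceptible, but not annoying
3.0 - Slightly annoying
2.0 - Annoying
1.0 - Very annoying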

The following table shows the scores achieved in
listening tests for MP2, AAC and AAC+:
| Bit rate (kbps) | MP2 | AAC | AAC+ |
|---|---|---|---|
| 192 | 3.33 | - | - |
| 160 | 2.65 | - | - |
| 128 | 2.40 | 4.74 | - |
| 64 | - | 4.59 | 3.74 |
| 48 | - | - | 3.30 |
These results are shown graphically below:

The performance of AAC/AAC+ is therefore vastly superior to that of MP2
both in terms of the absolute level of audio quality that can be achieved (at
reasonable bit rate levels) and in particular in terms of the coding efficiency.
1.1 Why is AAC+ so much more efficient than MP2?
Put very simply, MP2 was not designed to be efficient, whereas AAC+ is the
culmination of over a decade's worth of advances in audio coding since MP2 was
chosen to be used on DAB, and AAC+ is
designed to be as efficient as possible.
1.1.3 Comparison of MP2 & AAC+ Codecs
The most obvious and largest difference in efficiency between the two codecs is
due to AAC+'s use
of SBR, which consumes a bit rate of only 1 to 3 kbps to encode the
entire top half of the audio spectrum, whereas MP2 (and all other non-SBR
codecs) consume a relatively large proportion (not 50%, but still a very sizeable
percentage) of the overall bit rate on encoding the upper half of the
spectrum.
An inherent major problem with MP2 is its 750 Hz frequency resolution,
which stems from the fact that the input signal is only split into 32
subbands. What the frequency resolution determines is how finely redundancy
can be removed -- which is the key to reduced bit rate audio coding. With
MP2's 32 subbands, if there is a frequency component that is deemed to be
perceptible in a subband then that whole subband must be encoded. In
comparison, the very fine frequency resolution of the transform codecs --
AAC's frequency resolution is just 23 Hz for signals with stationary
statistics -- allows just those frequency components that are perceptible to
be encoded, and the rest discarded, which is far more efficient than the way
MP2 works.
The effect of too many subbands having to be encoded because the
psychoacoustic model deems there to be at least something perceptible in that
subband is that the available bit rate is spread too thinly, so there is an
insufficient number of bits available to encode the perceptible subbands, which leads to
an increase in the quantisation noise (coding noise) level, and in turn audio
artefacts become perceptible.
This can most readily be perceived on pop and especially rock music, which
tend to have a wideband spectrum and whose dynamic range has already been
compressed when the CD was mastered, and the radio station simply flattens the
audio spectrum further by its use of audio processing. This results in a large
number of subbands being deemed to be perceptible, which requires them to be
encoded. The result is dreadful definition and a non-existent stereo image,
and the audio degenerates into a ridiculously low quality wall of sound.
Transform codecs, with their much finer frequency resolution and their
ability to remove more redundancy, unsurprisingly perform much better with
these more challenging-to-encode types of music.
Finally, MP2 being limited to using intensity stereo is another major
Achilles heel compared to AAC+ and all the other audio codecs that do allow
mid/side joint stereo coding. As mentioned above, only at a bit rate of 192
kbps and above does the encoder really begin to choose when intensity stereo
is actually beneficial rather than having it forced upon it due to there being
insufficient bits to use discrete stereo even when the signal demands it.
This isn't a problem with mid/side joint stereo, because it doesn't destroy
the phase information in the way that intensity stereo does.
Overall, MP2 should not really ever be used at bit rates below 192 kbps,
and 128 kbps is simply far too low a bit rate to provide audio quality that
should be expected on a modern digital radio system.
Error Correction Coding
Error correction coding is the "heart" of a wireless digital
communication system, and without it applications such as digital terrestrial
TV, digital radio and mobile TV wouldn't be feasible. The error correction
coding scheme used on a digital radio system is important for the following
two main reasons:
- it determines how robust reception will be, because reception problems
occur when the error correction coding fails to correct a sufficient
proportion of the bit errors that inevitably occur with transmission over
the mobile channel
- it affects the spectral efficiency of the system, because a stronger
error correction coding scheme can correct more errors than a weaker one,
so stronger error correction coding schemes enable the capacity of a
multiplex to increase
An error correction coding scheme for a digital radio system must take into
account the error performance of the audio bitstream it is protecting.
DAB's UEP error correction coding
DAB uses UEP convolutional error correction coding, where UEP stands for unequal error
protection.
Audio data is grouped into audio frames, and some parts of the audio frame
are more sensitive to errors than other parts, and UEP protects more strongly the parts of
the audio frame that are more sensitive to errors and vice
versa.
The strength of a particular type of error correction coding can be varied
by changing its "code rate", Rc, and a lower code rate
will result in stronger protection and vice versa.
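As a worked illustration of what a code rate means, the sketch below treats Rc as the ratio of information bits to transmitted bits, which is the usual definition. The 8/18 figure is the scale factor code rate quoted later on this page. The 1/2 rate is an arbitrary comparison value, not a figure from the text, and the function names are illustrative.

```python
from fractions import Fraction


def code_rate(info_bits, coded_bits):
    """Code rate Rc = information bits / transmitted (coded) bits."""
    return Fraction(info_bits, coded_bits)


def coded_length(info_bits, rate):
    """Number of transmitted bits needed to carry info_bits at the given code rate."""
    return info_bits / rate


scale_factor_rate = code_rate(8, 18)   # figure quoted on this page for the scale factors
comparison_rate = code_rate(1, 2)      # arbitrary weaker (higher) code rate for comparison

print("Rc =", scale_factor_rate, "=", round(float(scale_factor_rate), 2))
print("bits sent for 800 info bits at 8/18:", coded_length(800, scale_factor_rate))
print("bits sent for 800 info bits at 1/2: ", coded_length(800, comparison_rate))
```

The lower code rate sends more redundancy for the same information, which is why it protects more strongly and also why it costs more multiplex capacity.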
The figure below is adapted from the "Digital Audio Broadcasting: Principles
& Applications" book, and it shows how DAB's UEP applies stronger
error protection to the header, scale factors, PAD and so on, because these
parts of the audio frame are important to the correct playback, and lower
protection is applied to the sub-band samples (the actual encoded audio
samples). The height of each block in the figure denotes how strongly that part
is protected.

Group 1 contains important information for things like synchronisation and
audio stream information; group 2 contains the scale factors, which scale the
subband samples (these form the exponent of a crude floating point
number system); group 3 contains the subband samples (these form the
mantissa of a crude floating point number system to go with the scale
factors); and group 4 consists of the
PAD (programme associated data) and scale factors CRC (cyclic redundancy
check).
The Rc values quoted in the figure are those used for 128 kbps
using Protection Level 3 (PL3), which is by far the most widely used
Protection Level.
Although from the figure you might at first glance think that the main
problem would be with the sub-band samples, because they have the
weakest error protection, the main problem is the insufficient protection of the scale factors.
The scale factors form the exponent of the crude floating point system used
to encode the subband audio samples, and any errors in these scale factors
should be detected by the scale factors' CRC check. When such errors are
detected, this leads to either muting or crude error concealment techniques
being used for the affected subbands, which produces the "bubbling mud"
sound that accompanies poor DAB reception quality.
The problem with the error protection of the scale factors is that DAB's
error correction coding scheme uses convolutional coding (which is not by any
means a strong form of error correction when used on its own), and the code rate used to protect
the scale factors is only 8/18, or 0.44. A convolutional code on its own at a
code rate of 0.44 is far too weak to protect something as crucial to the correct playback of
digital audio as the scale factors, and it is unsurprising
that reception problems are rife on DAB.
One thing that proponents of the old DAB system have consistently claimed
is that MP2 is somehow more robust for use on digital radio systems than
other audio codecs are. This view is typified in a comment made by Quentin
Howard, the chief executive of Digital One and the current President of WorldDMB, when he said:
"... AAC+ and WM9 [are used] in other applications and an
enhanced Reed-Solomon layer of error correction [is] available for these more fragile encoding
algorithms."
The argument put forward by the proponents of the old DAB system goes as
follows: Audio codecs such as MP3, AAC and AAC+ must use extra error
correction coding to protect them whereas MP2 doesn't need any extra error
correction coding to protect it on DAB, therefore MP2 must be more
error-robust than the other audio codecs.
This is simply completely false.
As discussed above, DAB uses UEP to protect MP2, and the reason it uses UEP
is because both the length of an MP2 audio frame and the groups within each
audio frame are fixed, so UEP can easily be applied. And it is only the
use of UEP on DAB that makes it appear as though MP2 is more robust than other
audio codecs, when in fact it is no more robust.
The length of audio frames for MP3, AAC, AAC+ etc is not fixed, therefore
it is not as easy to use UEP with these other audio codecs -- although it is not
impossible, because DRM uses UEP to protect AAC+.
The proponents of the old DAB system are simply failing to understand
what I mentioned at the beginning of the section on Error Correction Coding,
which is that it is better to use stronger error correction coding because
this allows the capacity of a multiplex to increase. And indeed, DAB+ is using
EEP (equal error protection) convolutional coding along with an outer layer of
Reed-Solomon coding, which is far stronger than the error correction coding
scheme used on DAB, and this will allow the multiplex capacity on DAB+ to
increase by about 30-40% compared to the capacity of a multiplex using the old
DAB system -- unless the broadcasters decide to greatly extend the coverage
area rather than take advantage of the increase in capacity.
Demonstration of why MP2 is no more robust than other codecs
For the DAB proponents' claim to be true, MP2 must be more robust than other codecs
when both audio codecs are using the same error correction coding scheme.
So in order to demonstrate that they're wrong, I've written a program that simply adds bit errors
to files at random. You can download an executable of the program I've written
here,
and the C++
program file is here. Here are three files you can test the program with:
- 128 kbps MP2 file with no errors added
- 128 kbps MP3 file with no errors added
- 64 kbps AAC+ file with no errors added
And here are the same files with errors added where the BER (bit error rate) is
10^-4:
- Same 128 kbps MP2 file with errors added with a BER of 10^-4
- Same 128 kbps MP3 file with errors added with a BER of 10^-4
- Same 64 kbps AAC+ file with errors added with a BER of 10^-4
None of them are acceptable to listen to, but they're about as bad as each
other, and I would say that the MP2 file is actually arguably worse than the MP3
and AAC+ files.
To run the program, copy the above files with no errors added to them or some
of your own files (see note below though) to the same directory in
which you've put the aac_mp2_errors.exe program; run the program, enter the
audio file's filename when requested, choose a bit error rate (BER) value (e.g.
enter 1e-4), and say no to using RS coding (enter the letter n).
A suitable BER value is 10^-4, because this is the typical BER
figure quoted for digital radio systems, such as DAB and DRM. A BER of 10^-4
means that there is, on average, one bit error every 10,000 bits.
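To put that BER figure into perspective for the test files, the following sketch works out the average number of bit errors per second of audio. It assumes the errors are independent and uniformly spread, and the function names are illustrative.

```python
def mean_bits_between_errors(ber):
    """Average spacing between bit errors, in bits."""
    return 1 / ber


def expected_bit_errors(bit_rate_bps, ber, seconds=1.0):
    """Average number of bit errors in a stretch of audio at the given bit rate."""
    return bit_rate_bps * seconds * ber


print("mean spacing at BER 1e-4:", mean_bits_between_errors(1e-4), "bits")
print("128 kbps stream:", expected_bit_errors(128_000, 1e-4), "errors per second")
print("64 kbps stream: ", expected_bit_errors(64_000, 1e-4), "errors per second")
```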
Note: You might run into problems playing back AAC/AAC+ files that
you've encoded yourself after you've added errors to them, because audio you
encode at home isn't expected to have any errors, so things like Nero's AAC/AAC+
encoder must use different settings with respect to error detection and Winamp
sometimes won't play it back, especially when the BER is high -- this is why
I've used an MPEG-2 AAC+ file that uses ADTS headers (which are similar to the
headers used for MP2 and MP3) above rather than an MPEG-4 AAC+ file encoded
using Nero.
Theory for the program
The above program adds bit errors at random to the audio files, and this
simulates EEP (equal error protection) convolutional coding with an infinite
length (i.e. ideal) time-interleaver. The program acts identically for the
different audio formats, so it is a fair comparison.
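The idea of flipping bits at random can be sketched in Python as follows. This is not the C++ program referred to above, only a minimal illustration of the same principle. The function names, the seed and the all-zero test payload are assumptions made for the example.

```python
import random


def add_bit_errors(data: bytes, ber: float, seed=None) -> bytes:
    """Flip each bit independently with probability ber (uniformly random errors)."""
    rng = random.Random(seed)
    corrupted = bytearray(data)
    for byte_index in range(len(corrupted)):
        for bit in range(8):
            if rng.random() < ber:
                corrupted[byte_index] ^= 1 << bit  # invert this one bit
    return bytes(corrupted)


def count_bit_differences(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Stand-in payload of 1,000,000 bits; a real test would read an audio file instead
original = bytes(125_000)
damaged = add_bit_errors(original, ber=1e-4, seed=1)
flipped = count_bit_differences(original, damaged)
print("bits flipped:", flipped, "measured BER:", flipped / (8 * len(original)))
```

Because the function never looks at the content of the data, every audio format is treated identically, which is the property the comparison relies on.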
The reason why MP2 is no more robust than the other codecs can be
ascertained by looking at the figure of the MP2 audio frame above. All audio
formats that are used on broadcasting systems or live Internet streams are
split up into audio frames like MP2 is, and these audio frames consist of a
small percentage of data that is very important to the correct playback of the
audio, and if an error lands within this important part of the audio frame
then there's a high, or at least higher, probability that the error will be
perceivable to the listener than if the error lands in the less important part
of the audio frame.
Looking at this mathematically: The audio frame header is the most important
part of the audio frame, because errors that land in the audio frame header are
the most likely to lead to an audible disturbance. The audio frame header
accounts for approximately 6% of the length of an audio frame for a 128 kbps MP2
stream, so the probability that an error will land in the audio frame header is:
Probability of error landing in audio frame header = header length / audio frame length = approximately 0.06
So if you consider the case of MP2 and AAC being protected by identical EEP
convolutional coding along with an interleaver of infinite depth (the job of
an interleaver is to make the errors uniformly randomly distributed), the
areas of the audio frame for both MP2 and AAC will have identical protection,
so the BER (bit error rate, which is the proportion of bits that are in
error; it's not really a rate, but that's the name for it) will be identical
for both audio codecs.
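The header argument can be made concrete with a short calculation. This is a sketch under stated assumptions: errors are independent with a BER of 10^-4, the header share is the approximate 6% quoted above, and the frame length follows from an MPEG Layer II frame holding 1,152 samples per channel at 48 kHz. The function name is illustrative.

```python
def prob_region_hit(region_bits, ber):
    """Probability that at least one of region_bits independent bits is in error."""
    return 1 - (1 - ber) ** region_bits


BER = 1e-4
# 128 kbps MP2 at 48 kHz: 1152 samples per frame, so 24 ms and 3072 bits per frame
frame_bits = 128_000 * 1152 // 48_000
header_bits = 0.06 * frame_bits  # approximate 6% header share quoted in the text

print("frame length in bits:", frame_bits)
print("P(at least one error in header):", round(prob_region_hit(header_bits, BER), 4))
print("P(at least one error in frame): ", round(prob_region_hit(frame_bits, BER), 4))
```

Nothing in the calculation depends on the codec. Under equal error protection, any format with the same number of sensitive bits per frame gets the same probability.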
One thing that proponents of the old DAB system have consistently claimed
is that MP2 is somehow more robust than other audio codecs. For example,
one classic quote made by the current President of WorldDAB, Quentin Howard,
who is also the chief executive of the UK national commercial DAB multiplex
operator, Digital One, is as follows:
"Spurious claims from some quarters that MPEG-1 Layer2 audio is outdated or inefficient is a failure to understand the beauty of the way the frame length of MPEG and COFDM co-exist and the benefit of UEP which together deliver a very robust audio experience.
Eureka-147 allows for other audio coding, of course, with BSAC being used in Korea, AAC+ and WM9 in other applications and an
enhanced Reed-Solomon layer of error correction available for these more fragile encoding
algorithms."
I find it hard to put into words just how ridiculously contradictory I
think that statement is, because on the one hand it recognises that UEP makes
reception quality more robust, but it then goes on to completely ignore the
benefit that UEP brings to MP2 and accuses the more modern codecs of being
"fragile".
What seemingly all the DAB supporters get wrong is that it is ONLY the UEP
coding that makes them think that MP2 is more robust than other codecs.
Without UEP you have EEP — Equal Error Protection, where the whole
audio frame is protected with the same code rate, so all sections of the audio
frame are protected with equal strength — and if you protected MP2 using EEP
then it would be no more robust to errors than the more modern audio codecs.
For example, say MP2 and AAC streams were both being protected by the same
EEP error correction coding, and the error correction coding failed to correct
a bit error in the header part of the audio frames for both MP2 and AAC. The
audio would be disturbed on both MP2 and AAC and the listener would likely
notice the disturbance.
It is ONLY the use of UEP that makes the DAB supporters think that MP2
is more robust than other codecs, and if MP2 were protected by EEP it would be
no more robust to errors than any other audio codec.