Quantization of the sound. Dither

ATTENTION!!

As part of the book “Digital Sound – Myths and Solutions”, this text is protected by copyright. Any use of the text or part of it should be discussed with the author!!!

HOW DO WE ACHIEVE THE DISCRETIZATION OF SOUND?

Through a process called quantization.

The analog wave, which is continuous in time, is converted in the quantization (or discretization) process into discrete pulses whose level is measured at regular intervals. These measurements are made on a scale based on binary code. The scale can report only a finite number of levels: exactly as many as there are possible combinations of the bits in the code word.

These levels are called quantization levels.

Analog-to-digital conversion has great advantages: when we record binary numbers, even if the medium or the reading device has slight deviations, the system is not disturbed and reads the information correctly. The reason for this resistance to errors is that the system reads only two kinds of signals, 1 and 0, i.e. pulse or no pulse. Even if the zero is accompanied by a very weak signal because of some deviation, we can program the system not to report it until it reaches a certain level, so that for the system this signal still remains zero. Conversely, if the signal for 1 is weaker or stronger than the nominal value for 1, this is not a problem within the limits we have set. In analog recording, by contrast, exact reproduction of the levels as voltage is absolutely necessary, because even a small deviation leads to an undesirable change in the sound.
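The threshold idea described above can be sketched in a few lines of Python. This is only an illustration: the nominal levels of 0.0 and 1.0, the threshold of 0.5, the deviation of up to ±0.3, and the function name `read_bits` are all assumptions chosen for the example, not values from any real device.

```python
import random

def read_bits(voltages, threshold=0.5):
    """Report 1 when the voltage reaches the threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in voltages]

random.seed(0)
bits = [1, 0, 1, 1, 0, 0, 1, 0]

# Nominal levels are 0.0 and 1.0 (assumed); each one deviates by up to +/-0.3.
noisy = [b + random.uniform(-0.3, 0.3) for b in bits]

recovered = read_bits(noisy)
print(recovered == bits)
```

As long as the deviation stays below the distance to the threshold, the recovered bits match the recorded ones exactly, which is the resistance to errors the paragraph describes.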

And what are the disadvantages of this method?

The main drawback is the discrete representation of a continuous sound as a series of samples. It is like in the movies: consecutive still images create the impression of a continuous process, of movement.

But the process is discrete not only in time, but also in intensity, that is, along the vertical axis of our virtual coordinate system. Since the sound level can be read only at specified levels (the quantization levels), we cannot report differences in level that fall between two discrete levels. If the level at a given moment lies somewhere between two quantization levels, it is rounded to the nearest one. The result is a change, although small, in the shape of the subsequently reconstructed sound wave.

This is called quantization error, and the distortions that occur as a result of this error are called quantization losses or quantization distortions.
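The rounding to the nearest level can be illustrated with a minimal Python sketch. The function name `quantize`, the signal range of -1.0 to 1.0, the 4-bit resolution, and the test sine wave are assumptions made for the example.

```python
import math

def quantize(x, bits):
    """Round x (expected in [-1.0, 1.0)) to the nearest of 2**bits levels."""
    half = 2 ** (bits - 1)
    step = 1.0 / half                 # distance between adjacent levels
    k = math.floor(x / step + 0.5)    # index of the nearest level
    k = max(-half, min(half - 1, k))  # stay inside the available levels
    return k * step

bits = 4
step = 1.0 / 2 ** (bits - 1)

# One period of a sine wave that stays inside the representable range.
samples = [0.9 * math.sin(2 * math.pi * n / 64) for n in range(64)]
errors = [quantize(s, bits) - s for s in samples]
max_error = max(abs(e) for e in errors)

print(max_error <= step / 2)
```

For samples inside the representable range, the quantization error never exceeds half of one quantization step.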

If we record levels with digital words consisting of 8 bits, we can describe 256 states of the level. If we use 16-bit words, given that the number of possible combinations doubles with every added bit, we have 65,536 possible values; at 24 bits there are more than 16 million.

Logically, the relative size of the quantization losses decreases progressively as the bit resolution increases, because every added bit halves the distance between adjacent levels.
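The arithmetic behind these numbers is easy to check. The short Python sketch below computes the number of levels for each word length; the decibel figure is the standard rule of thumb of roughly 6 dB per bit, which is not stated in the text above and is added here only for orientation. The function names are hypothetical.

```python
import math

def quantization_levels(bits):
    """Number of distinct levels an n-bit word can describe."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(quantization_levels(bits))

for bits in (8, 16, 24):
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1))
```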

What is the most effective way to deal with the problem? Working at a higher bit resolution.

Why, then, is it necessary to apply dither?

The reason is that, despite the good results of working in 24-bit and even 32-bit mode, this is only an intermediate working format. In most cases the final mix requires conversion to 16 bits at a 44.1 kHz sampling rate for CD Audio, or 16 bits at a 48 kHz sampling rate for audio for video. The conversion to a lower bit resolution is made at the final stage. It is a complex process in which the digital words that define the sound level of every sample are shortened to fit the lower bit resolution.

But how do we convert the higher resolution into a lower one?

There are a few methods to “go down” from high to low resolution:

1. Truncating

With this crudest method we simply discard the values after the decimal point (that is, the lowest bits of the word), thus reducing the number of combinations;

2. Rounding

Here the values after the decimal point are taken into account, and the last value before the decimal point is rounded up or down depending on them.

Despite the relatively high precision, in both cases the quantization error is of the same order of magnitude. There is not much improvement in audible terms either. What is the solution then?

3. Dithering

In general, this is the addition of a small amount of noise to the input signal. Since the problem with the quantization error comes from its correlation with the sound, we look for a way to avoid this correlation.

For this purpose a signal is generated whose values are random and which is audible as so-called white noise (it sounds like running water), with a very small amplitude: just between two or three adjacent quantization levels. Figuratively speaking, this noise “swings” the sound levels in a random way, so that the levels do not stay on one quantization level and then suddenly jump to the next, but fluctuate continuously instead. This fluctuation carries the necessary information about where the waveform lies between two adjacent quantization levels, and thereby effectively increases the resolution beyond the nominal bit depth.
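The three methods above can be compared in a short Python sketch that reduces a 24-bit integer sample to 16 bits. Everything here is an illustrative assumption: the function names, the use of signed integers, and the choice of rectangular noise exactly one 16-bit LSB wide. The test sample sits a quarter of the way between two 16-bit levels.

```python
import random

def clip16(value):
    """Keep the result inside the signed 16-bit range."""
    return max(-32768, min(32767, value))

def truncate_to_16(sample24):
    """Truncating: drop the 8 lowest bits."""
    return sample24 >> 8

def round_to_16(sample24):
    """Rounding: add half of the dropped range, then drop the 8 lowest bits."""
    return clip16((sample24 + 128) >> 8)

def dither_to_16(sample24, rng):
    """Dithering: add random noise about 1 LSB (16-bit) wide before rounding."""
    noise = rng.randint(-128, 127)
    return clip16((sample24 + noise + 128) >> 8)

rng = random.Random(1)
sample24 = 1000 * 256 + 64  # corresponds to 1000.25 on the 16-bit scale

trials = 20000
average = sum(dither_to_16(sample24, rng) for _ in range(trials)) / trials

print(truncate_to_16(sample24), round_to_16(sample24), average)
```

Truncating and rounding both return 1000 every time, so the quarter step is lost. The dithered result jumps between 1000 and 1001, but its average stays close to 1000.25, which is the information about the position between two levels that the noise preserves.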

ARE THERE DIFFERENT TYPES OF DITHER, WHAT IS THE DIFFERENCE BETWEEN THEM, AND HOW DO THEY AFFECT THE FINAL IMPRESSION OF THE SOUND?

How can we distinguish the types of dither?

1. By amplitude

2. By probability distribution

3. By frequency response.

By amplitude

The amplitude of the noise can be within one or two quantization levels, i.e. the “shaking” that this noise causes will be in this range. The difference between the “zero” in the sound and the lowest level, as shown in the 16-bit quantizer in Chart 4, is:

0000000000000000

0000000000000001

This bit at the end of the digital word, which describes the smallest change in level, is usually called the Least Significant Bit, or LSB.

If the amplitude of the noise is 1 LSB, the signal will deviate up and down by ±0.5 LSB. Logical, right? If it is 2 LSB, the movement is ±1 LSB. These are the two most frequent sizes of the amplitude of the dither noise.

Naturally, if we use dither noise with a greater amplitude, we also add more noise to the final signal. In many cases, however, the bigger “shaking” caused by the stronger dither noise leads to a cleaner sound with less distortion. Choosing one option or the other means seeking a reasonable compromise between the two, taking into account the specific musical material.

By probability distribution

Because dither is a noise signal, its level at any moment is a random value within the range defined earlier (an amplitude of 1 or 2 LSB). It may differ, however, in how probable each level within this range is. If every level, from zero to the maximum of the range, is equally likely, the graph of the probability distribution has a rectangular shape.

If the level is more likely to be zero than to be at the maximum, the probability distribution has a triangular form.

There is also so-called “Gaussian” noise, a kind of noise with a “bell” shape, once again with a non-uniform probability distribution. It is often considered more acceptable to the ear because analog preamps have their own noise with a similar shape.
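The three distributions can be generated with the Python standard library, as in the sketch below. The function names are hypothetical. The triangular noise is built here as the sum of two independent rectangular values, which is a common way to obtain it, and the standard deviation of half an LSB for the Gaussian noise is an arbitrary choice for the example.

```python
import random

def rpdf(rng, lsb=1.0):
    """Rectangular: every value in [-lsb/2, +lsb/2] is equally likely."""
    return rng.uniform(-lsb / 2, lsb / 2)

def tpdf(rng, lsb=1.0):
    """Triangular: sum of two rectangular values, range [-lsb, +lsb]."""
    return rpdf(rng, lsb) + rpdf(rng, lsb)

def gaussian(rng, lsb=1.0):
    """Bell-shaped: normal distribution, sigma of lsb/2 chosen for illustration."""
    return rng.gauss(0.0, lsb / 2)

rng = random.Random(2)
samples = [tpdf(rng) for _ in range(10000)]

# For triangular noise, values near zero occur far more often than values near the edges.
near_zero = sum(1 for v in samples if abs(v) < 0.25)
near_edge = sum(1 for v in samples if abs(v) > 0.75)
print(near_zero, near_edge)
```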

By frequency response

And with respect to the spectrum?

Since the improvement of the sound is accompanied by the introduction of noise into the recording, it is reasonable to look for a way to minimize the noise acoustically while keeping its positive function.

Here we work, generally speaking, in two directions: on the one hand, we try to move part of the noise into inaudible areas, for example above 18 kHz; on the other, we reduce the presence of noise in the most audible frequency areas, according to the Fletcher-Munson curves.

This is so-called NOISE SHAPING.

The standard noise used in dithering is so-called “white noise”. Its amplitude is distributed equally across the audible frequency range. When noise shaping is applied, this noise is reduced in the areas of greatest auditory sensitivity and increased in the inaudible areas. The aggressiveness of the shaping and the form of the resulting noise spectrum define different noise shaping algorithms, which, combined with different types of dither, lead to different audible results.
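The principle of moving noise away from one part of the spectrum can be sketched with the simplest possible shaper: first-order error feedback, where the error made on one sample is subtracted from the next. This is a generic textbook technique used here only as an illustration. It pushes noise toward high frequencies in general and is not one of the psychoacoustically weighted algorithms described above, which use more elaborate filters. The function name and the test signal are assumptions.

```python
import math
import random

def shaped_requantize(samples, step, rng):
    """Requantize to multiples of `step` with triangular dither and error feedback."""
    out = []
    error = 0.0  # total error made on the previous sample
    for x in samples:
        target = x - error  # compensate the previous error in this sample
        dither = (rng.random() - rng.random()) * step  # triangular, within +/-1 step
        y = math.floor((target + dither) / step + 0.5) * step
        error = y - target  # error made on this sample
        out.append(y)
    return out

rng = random.Random(4)
step = 1.0
constant = [0.3] * 1000  # a level sitting between two quantization levels
shaped = shaped_requantize(constant, step, rng)
mean_level = sum(shaped) / len(shaped)
print(mean_level)
```

Each output sample equals the input plus the difference between the current and the previous error. Over many samples these differences cancel, so the low-frequency content (here the constant level 0.3) is preserved very accurately while the noise is concentrated at high frequencies.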

WHEN TO APPLY DITHER?

1. In the final phase of mixing and mastering, at the moment of “descending” from a higher to a lower bit resolution.

2. In a digital transfer in the studio when the bit resolution is reduced. A typical case: we record through a digital mixing console and transfer the signal digitally to an audio interface or a computer audio system. If the recording is 16-bit, it must be borne in mind that the digital stream inside the console is likely to be 24-bit, depending on its specifications. In this case, use the console's system menu to set the output resolution of the signal sent to the console's digital outputs, and to enable dither.

3. When editing or processing already finished sound. The reason is that the internal resolution of most professional audio processing software is 24-bit, so even 16-bit sound is processed at a higher resolution.

In principle, it is inappropriate to apply dither more than once. The reason is the increased level of noise, as well as the artifacts that occur when this noise is processed in a subsequent step.

That is why dithering is applied only as the final process, and no further processing should be done after it. This is the reason why, in the Wavelab software, the last part of the master section is separated in the bottom field, and any plugin containing dither can be invoked from the menu only there, as the last process in the plugin chain.

A common mistake in mixing is to insert effects processors that contain dither on the input channels or on the group tracks. This leads to increased background noise and other side effects.
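The increase in noise from applying dither more than once can be checked numerically. The Python sketch below is a simplified model with assumed names and values: the signal is measured in LSB units, triangular dither of ±1 LSB is added before rounding, and the same step is then applied a second time to the already quantized result.

```python
import math
import random

def tpdf_dither_round(x, rng):
    """Add triangular dither within +/-1 LSB, then round to the nearest integer level."""
    d = rng.random() - rng.random()
    return math.floor(x + d + 0.5)

def rms_error(original, processed):
    """Root-mean-square difference between two equally long sequences."""
    total = sum((p - o) ** 2 for o, p in zip(original, processed))
    return math.sqrt(total / len(original))

rng = random.Random(3)
source = [rng.uniform(-1000.0, 1000.0) for _ in range(20000)]

once = [tpdf_dither_round(x, rng) for x in source]
twice = [tpdf_dither_round(y, rng) for y in once]

rms_once = rms_error(source, once)
rms_twice = rms_error(source, twice)
print(rms_once, rms_twice)
```

The second pass brings no benefit, since the signal is already at the target resolution, but it still adds its own noise, so the total error after two passes is higher than after one.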
