
Lecture 3 part 2 – Digital representation of the sound

In the 1980s a significant step was made in improving the recording and reproduction of sound: digital audio technology was born.

Sound can be represented digitally in different ways, but the most common is PCM (Pulse Code Modulation). In it the sound wave, which is continuous, is divided into frames or, in the digital audio world, samples.

The value of the wave at each sample is measured and stored as a number. This process is called quantisation.

The number of samples per second can vary, but it is usually 22.05, 32, 44.1, 48, 88.2, 96, 176.4 or 192 kHz. For example, with a 44.1 kHz sampling frequency we have 44,100 samples per second: the sound wave is measured 44,100 times each second and the corresponding values are stored in memory.

This is called the sampling frequency or sampling rate. It is measured in Hz.
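To make the idea of sampling concrete, here is a minimal Python sketch. The function name sample_sine and the 440 Hz test tone are assumptions chosen only for illustration; a real audio interface does this measuring in hardware.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Measure a sine wave sample_rate_hz times per second."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

# One second of a 440 Hz tone at a 44.1 kHz sampling frequency.
samples = sample_sine(440.0, 44100, 1.0)
print(len(samples))  # one stored value per sample: 44100
```

Changing the second argument to 48000 gives 48,000 stored values for the same second of sound.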

The number of different values the signal can take depends on the number of bits with which the momentary value of the signal is described. For example:



Here we can see that when the signal is described with more bits, there are more possible values, so the resulting waveform is closer to the original analog waveform.

Every time you add one more bit to the bit word, the number of numeric combinations doubles. So an arithmetic increase in the number of bits gives a geometric increase in the number of levels, and with it in the quality of the sound.

These fixed values are called quantisation levels. We cannot describe anything between them; we have to round to the nearest quantisation level.

When we round to the nearest level, we inevitably introduce an error into the signal (as in the graph above). This error is called quantisation error.
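The rounding and the resulting error can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the signal lies between -1.0 and 1.0, the levels are evenly spaced, and the names quantise and largest_error are made up for this example.

```python
def quantise(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)              # distance between two levels
    index = round((value + 1.0) / step)    # number of the nearest level
    return -1.0 + index * step

def largest_error(bits, n_points=10001):
    """Scan the whole range and return the worst quantisation error."""
    worst = 0.0
    for i in range(n_points):
        value = -1.0 + 2.0 * i / (n_points - 1)
        worst = max(worst, abs(quantise(value, bits) - value))
    return worst

for bits in (4, 8, 16):
    print(bits, largest_error(bits))
```

The error never exceeds half the distance between two levels, so it shrinks as the number of bits grows.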

If we have more quantisation levels, the error is smaller. The number of quantisation levels depends on the number of bits with which the signal is described. The bits form a digital word, for example:

1001 – a 4-bit word;

01001011 – an 8-bit word;

1011010011001111 – a 16-bit word

and so on.

As we can see, every new bit doubles the number of possible combinations. For example, 4 bits give us 16 combinations, 16 bits give 65,536 and 24 bits give about 16.7 million (16,777,216).
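These numbers are easy to check. The sketch below also prints the common rule of thumb of about 6 dB of dynamic range per bit, which comes from 20·log10(2) ≈ 6.02; the function names are assumptions made for this example.

```python
import math

def quantisation_levels(bits):
    """Each bit doubles the number of combinations."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Ratio between the full range and one level, in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (4, 8, 16, 24):
    print(bits, quantisation_levels(bits), round(dynamic_range_db(bits), 1))
```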

The number of bits in the digital representation of the signal is called the bit resolution, or bit depth.

The most commonly used bit resolutions are:

8 bit – only for non-professional use or in telephone systems. The quality is very poor;

12 bit – not used very often. Used in some digital cameras to reduce the amount of data;

16 bit – the most commonly used format. It is usually the final delivery format;

24 bit – ideal for recording and post-production. Usually we work in it until we make the final render;

32 bit – a specific “floating point” format. In theory it has more dynamic range than 24 bit.

As we can see, we usually work in 16- and 24-bit resolution. When working on a project, it is better to use the 24-bit option. When finalising the project, it is better to render it in 16 bit.

So, we have two major recording parameters in the digital domain: sample frequency and bit resolution.



Every time we create a new file, the software asks us for three major parameters of the file:

1. Sample frequency;

2. Bit resolution;

3. Number of channels.
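The same three parameters appear when a file is created in code. Here is a minimal sketch using the wave module from the Python standard library; the helper name make_wav is an assumption for this example, and the file is written into memory and filled with zero-valued samples (silence in 16-bit audio).

```python
import io
import wave

def make_wav(sample_rate_hz, bits, channels, duration_s):
    """Build an uncompressed WAV file in memory and return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setframerate(sample_rate_hz)   # 1. sample frequency
        wav.setsampwidth(bits // 8)        # 2. bit resolution, in bytes
        wav.setnchannels(channels)         # 3. number of channels
        n_frames = int(sample_rate_hz * duration_s)
        wav.writeframes(bytes(n_frames * channels * (bits // 8)))
    return buffer.getvalue()

data = make_wav(44100, 16, 2, 1.0)

# Read the header back to confirm the parameters were stored.
with wave.open(io.BytesIO(data), "rb") as wav:
    print(wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels())
```

The audio data alone takes sample frequency × bytes per sample × channels bytes for every second, which is 176,400 bytes per second for the settings above.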

That’s why it is very important to know the parameters of the digital signal; in everyday work it is vital. If the sample frequency of the files you use does not match that of the project, strange things can happen to your sound, for example a different pitch and a different file length. This means loss of synchronisation and many other problems, so we have to be very careful about these parameters and how we use them.
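The size of the pitch and length change is simple arithmetic. The sketch below assumes the software plays the stored samples at the project rate without converting them first; the function name play_at_project_rate is made up for this example.

```python
def play_at_project_rate(duration_s, pitch_hz, file_rate_hz, project_rate_hz):
    """Return (new duration, new pitch) when a file is played at the wrong rate."""
    ratio = project_rate_hz / file_rate_hz
    # The same samples are read faster (or slower), so the file gets
    # shorter (or longer) and every frequency is scaled by the same ratio.
    return duration_s / ratio, pitch_hz * ratio

# A 10-second, 440 Hz tone recorded at 44.1 kHz, placed in a 48 kHz project.
new_duration, new_pitch = play_at_project_rate(10.0, 440.0, 44100, 48000)
print(new_duration, new_pitch)
```

In this case the file plays a little under a second short and noticeably sharp, which is enough to lose sync with picture or with other tracks.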

A low sample frequency means a limited frequency range of the sound, because:

A sound with a given frequency cannot be described with a sample frequency lower than twice that frequency. With a 44.1 kHz sample frequency, for example, the highest frequency that can be described is about 22 kHz.

This is called the Nyquist theorem (the Nyquist–Shannon sampling theorem), if you are fond of mathematics 🙂

If not, you can simply believe me 🙂
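If you would rather check than believe, here is a small Python illustration. It samples a 30 kHz cosine at 44.1 kHz, which is above the 22.05 kHz limit, and compares it with a 14.1 kHz cosine (44.1 minus 30). The tone frequencies and function names are assumptions chosen for this example.

```python
import math

def highest_frequency(sample_rate_hz):
    """Half the sample frequency is the highest frequency we can describe."""
    return sample_rate_hz / 2

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

rate = 44100
too_high = sample_cosine(30000, rate, 50)          # above the limit
lower = sample_cosine(rate - 30000, rate, 50)      # a 14,100 Hz tone

# The two lists of samples are the same (up to rounding in the arithmetic).
same = all(abs(a - b) < 1e-9 for a, b in zip(too_high, lower))
print(highest_frequency(rate), same)
```

Once sampled, the two tones cannot be told apart, so the 30 kHz tone is not really described at all.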

A low bit resolution means less dynamic range, because the sound is described with fewer quantisation levels. This is especially true for low-level signals, because at the same bit depth they use fewer of the quantisation levels.
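A short Python sketch can count how many levels a loud and a quiet signal actually use. It assumes evenly spaced levels between -1.0 and 1.0 and one cycle of a sine wave as the test signal; the function name levels_used is made up for this example.

```python
import math

def levels_used(amplitude, bits, n_samples=1000):
    """Count the distinct quantisation levels hit by one sine cycle."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)              # distance between two levels
    used = set()
    for n in range(n_samples):
        value = amplitude * math.sin(2 * math.pi * n / n_samples)
        used.add(round((value + 1.0) / step))   # number of the nearest level
    return len(used)

print(levels_used(1.0, 8))    # full-level signal at 8 bit
print(levels_used(0.01, 8))   # very quiet signal at the same bit depth
print(levels_used(0.01, 16))  # the same quiet signal at 16 bit
```

The quiet signal touches only a handful of levels at 8 bit, so its shape is described very coarsely; at 16 bit the same signal has many more levels to use.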



That’s all, folks 🙂

