Chapter 8. Audio Parameters

This chapter describes the dmSDK audio parameters and buffers.

Audio Buffer Layout

The digital representation of an audio signal is generated by periodically sampling the amplitude (voltage) of the audio signal. The samples represent periodic "snapshots" of the signal amplitude. The sampling rate specifies the number of samples per second. The audio buffer pointer points to the source or destination data in an audio buffer for processing a fragment of a media stream. For audio signals, a fragment typically corresponds to between 10 milliseconds and 1 second of audio data. An audio buffer is a collection of sample frames. A sample frame is a set of audio samples that are coincident in time. A sample frame for mono data is a single sample. A sample frame for stereo data consists of a left-right sample pair.

Stereo samples are interleaved: left-channel samples alternate with right-channel samples. Four-channel samples are also interleaved, with each frame usually holding two left/right sample pairs, although other arrangements are possible.

Figure 8-1. Different Audio Sample Frames

Figure 8-1 shows the relationship between the number of channels and the frame size of audio sample data.

Figure 8-2. Layout of an Audio Buffer with 4 Channels

Figure 8-2 shows the layout of an audio buffer in memory.

Audio Parameters

The parameters discussed in the following sections are as follows:

DM_AUDIO_BUFFER_POINTER

Pointer to the audio buffer

DM_AUDIO_FRAMESIZE_INT32

Size of an audio sample frame in bytes

DM_AUDIO_SAMPLE_RATE_REAL64

Sample rate in Hz

DM_AUDIO_PRECISION_INT32

Precision at audio jack

DM_AUDIO_FORMAT_INT32

Format of the data in the audio buffer

DM_AUDIO_GAINS_REAL64_ARRAY

Audio gain controls

DM_AUDIO_COMPANDING_INT32

Sample quantization method

DM_AUDIO_CHANNELS_INT32

Number of audio channels

DM_AUDIO_COMPRESSION_INT32

Audio compression format

DM_AUDIO_BUFFER_POINTER

A pointer to the first byte of an in-memory audio buffer. The buffer address must comply with the alignment constraints for buffers on the particular path to which it is being sent. (See dmGetCapabilities(3dm) for details on determining alignment requirements.)

DM_AUDIO_FRAMESIZE_INT32

The size of an audio sample frame in bytes. This is a read-only parameter and is computed in the device using the current path control settings.
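As a rough illustration of how the frame size relates to the other path controls, the sketch below multiplies the channel count by the in-memory size of one sample. The function name expected_frame_size is hypothetical and not part of the dmSDK, and the relationship is an assumption that matches Example 8-1 (S16 stereo gives 4 bytes). The value reported by the device remains authoritative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a dmSDK call): expected size in bytes of one
 * uncompressed sample frame. storage_bits is the in-memory size of a single
 * sample, e.g. 16 for DM_FORMAT_S16 or 32 for DM_FORMAT_S24in32R. */
static size_t expected_frame_size(size_t channels, size_t storage_bits)
{
    assert(channels > 0);
    assert(storage_bits > 0 && storage_bits % 8 == 0);
    return channels * (storage_bits / 8);
}
```

For instance, expected_frame_size(2, 16) gives 4 bytes for S16 stereo, and expected_frame_size(4, 32) gives 16 bytes for four channels of S24in32R.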

DM_AUDIO_SAMPLE_RATE_REAL64

The sample rate of the audio data in hertz (Hz). The sample rate is the frequency at which samples are taken from the analog signal; a sample rate of 1 Hz is equal to one sample per second. For example, when a mono analog audio signal is digitized at a 44.1 kilohertz (kHz) sample rate, 44,100 digital samples are generated for every second of the signal. Supported values depend on the hardware, but are usually between 8,000.0 and 96,000.0. The default is hardware-specific. Common sample rates are:

8,000.0
16,000.0
32,000.0
44,100.0
48,000.0
96,000.0

The Nyquist theorem defines the minimum sampling frequency required to accurately represent the information in an analog signal of a given bandwidth. According to Nyquist, audio must be sampled at a frequency that is at least double the highest analog audio frequency of interest. The sample rate used for music-quality audio, such as the digital data stored on audio CDs, is 44.1 kHz. A 44.1 kHz digital signal can theoretically represent audio frequencies from 0 kHz to 22.05 kHz, which adequately covers the range of normal human hearing. Higher sample rates result in higher-quality digital signals; however, the higher the sample rate, the greater the signal storage requirement.
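The trade-off between sample rate, representable bandwidth, and storage can be made concrete with a few lines of C. This is a minimal sketch; the function names are hypothetical and are not dmSDK calls.

```c
#include <assert.h>

/* Highest frequency (in Hz) that a given sample rate can represent. */
static double nyquist_limit_hz(double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    return sample_rate_hz / 2.0;
}

/* Minimum sample rate (in Hz) needed to capture content up to max_freq_hz. */
static double min_sample_rate_hz(double max_freq_hz)
{
    assert(max_freq_hz > 0.0);
    return 2.0 * max_freq_hz;
}

/* Storage needed for one second of uncompressed audio, in bytes. */
static double bytes_per_second(double sample_rate_hz, double frame_size_bytes)
{
    assert(sample_rate_hz > 0.0 && frame_size_bytes > 0.0);
    return sample_rate_hz * frame_size_bytes;
}
```

With these helpers, a 44,100 Hz rate has a Nyquist limit of 22,050 Hz, and 16-bit stereo at that rate (4 bytes per frame) needs 176,400 bytes for each second of audio.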

DM_AUDIO_PRECISION_INT32

The maximum width in bits of an audio sample at the input or output jack. For example, a value of 16 indicates a 16-bit audio signal. This parameter is query only. DM_AUDIO_PRECISION_INT32 specifies the precision at the audio I/O jack, whereas DM_AUDIO_FORMAT_INT32 specifies the packing of the audio samples in the audio buffer. If the sample width given by DM_AUDIO_FORMAT_INT32 differs from DM_AUDIO_PRECISION_INT32, the system converts between the two. Such a conversion might include padding, truncation, or both.

DM_AUDIO_FORMAT_INT32

Specifies the format in which audio samples are stored in memory. The interpretation of format values is:

DM_FORMAT_[type][bits]

  • [type] is U for unsigned integer samples, S for signed (two's complement) integer samples, or R for real (floating-point) samples.

  • [bits] is the number of significant bits per sample.

For sample formats in which the number of significant bits is less than the number of bits in which the sample is stored, the format of the values is:

DM_FORMAT_[type][bits]in[size][alignment]

  • [size] is the total size used for the sample in memory, in bits.

  • [alignment] is either R or L, depending on whether the significant bits are right- or left-justified within the sample.

For example, here are three of the most common audio buffer formats:

    DM_FORMAT_U8 

    7 char 0
    +------+
    iiiiiiii

    DM_FORMAT_S16 

    15  short int  0
    +--------------+
    iiiiiiiiiiiiiiii

    DM_FORMAT_S24in32R 

    31            int              0
    +------------------------------+
    ssssssssiiiiiiiiiiiiiiiiiiiiiiii

where s indicates sign-extension, and i indicates the actual sample information. The bit locations refer to the locations when the 8-, 16-, or 32-bit sample has been loaded into a register as an integer quantity.

If the audio data compression parameter DM_AUDIO_COMPRESSION_INT32 indicates that the audio data is in compressed form, DM_AUDIO_FORMAT_INT32 indicates the data type of the samples after decoding. Common formats are:

DM_FORMAT_U8
DM_FORMAT_S16
DM_FORMAT_S24in32R
DM_FORMAT_R32

Default is hardware-specific.
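The sign-extension shown for DM_FORMAT_S24in32R can be reproduced in portable C. The sketch below is illustrative only: sign_extend_24 and is_valid_s24in32r are hypothetical helper names, not dmSDK functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: treat the low 24 bits of a 32-bit word as a signed
 * sample and sign-extend it into the upper 8 bits, giving the layout
 * ssssssss iiiiiiii iiiiiiii iiiiiiii. */
static int32_t sign_extend_24(uint32_t word)
{
    uint32_t low24 = word & 0x00FFFFFFu;   /* keep the 24 significant bits */
    /* Flip the sign bit, then subtract its weight. This avoids relying on
     * implementation-defined shifts or casts of negative values. */
    return (int32_t)(low24 ^ 0x00800000u) - 0x00800000;
}

/* Returns 1 if value fits in 24 signed bits, 0 otherwise. */
static int is_valid_s24in32r(int32_t value)
{
    return value >= -8388608 && value <= 8388607;
}
```

For example, the 24-bit pattern 0xFFFFFF sign-extends to -1, and 0x800000 sign-extends to -8388608, the most negative 24-bit value.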

DM_AUDIO_GAINS_REAL64_ARRAY

The gain factor in decibels (dB) on the given path. There will be a value for each audio channel. Negative values represent attenuation. Zero represents no change of the signal. Positive values amplify the signal. A gain of negative infinity indicates infinite attenuation (mute).

DM_AUDIO_COMPANDING_INT32

Describes the quantization method of the audio sample values. For DM_COMPANDING_MU_LAW and DM_COMPANDING_A_LAW, the output voltage changes exponentially with linear changes in the sample value. The purpose of this method is to cover a wider dynamic range with the same number of sample bits. Companding is a neologism that combines “compressing” and “expanding”. It differs from audio compression, in which a set of audio samples is compressed to reduce the size of the data.

Common values are:

DM_COMPANDING_NONE (default, if supported by the hardware)
DM_COMPANDING_MU_LAW
DM_COMPANDING_A_LAW

DM_AUDIO_CHANNELS_INT32

The number of channels of audio data in the buffer. Multi-channel audio data is always stored interleaved, with the samples for each consecutive audio channel following one another in sequence. For example, a 4-channel audio stream will have the form:

123412341234...

where 1 is the sample for the first audio channel, 2 is for the second, and so on.

Common values are:

DM_CHANNELS_MONO
DM_CHANNELS_STEREO
DM_CHANNELS_4
DM_CHANNELS_8
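Because channels are interleaved, the sample for a given channel in a given frame sits at index frame * channels + channel. The following sketch copies one channel out of an interleaved buffer. It assumes 16-bit samples (DM_FORMAT_S16) and uses a hypothetical helper name, extract_channel, which is not a dmSDK function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy the samples of one channel out of an
 * interleaved 16-bit buffer. The channel index is zero-based, so the
 * channel labeled 1 in the text above is channel 0 here. */
static void extract_channel(const int16_t *interleaved, size_t frames,
                            size_t channels, size_t channel, int16_t *out)
{
    assert(interleaved != NULL && out != NULL);
    assert(channels > 0 && channel < channels);
    for (size_t f = 0; f < frames; f++)
        out[f] = interleaved[f * channels + channel];
}
```

The caller supplies an output array with room for at least frames samples.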

DM_AUDIO_COMPRESSION_INT32

If the audio data is in compressed form, this parameter specifies the compression format. The compression format may be an industry standard such as MPEG-1 audio, or it may be no compression at all.

Common values include the following:

DM_COMPRESSION_UNCOMPRESSED
DM_COMPRESSION_MU_LAW
DM_COMPRESSION_A_LAW
DM_COMPRESSION_IMA_ADPCM
DM_COMPRESSION_MPEG1
DM_COMPRESSION_MPEG2
DM_COMPRESSION_AC3

When the data is uncompressed, the value of this parameter is DM_COMPRESSION_UNCOMPRESSED.

Uncompressed Audio Buffer Size Computation

The following equation shows how to calculate the number of bytes for an uncompressed audio buffer given the sample frame size, sampling rate and the time period representing the audio buffer:

N = F × R × T

where:

N 

audio buffer size in bytes

F 

the number of bytes per audio sample frame (DM_AUDIO_FRAMESIZE_INT32)

R 

the sample rate in Hz (DM_AUDIO_SAMPLE_RATE_REAL64)

T 

the time period the audio buffer represents in seconds

Example 8-1. Buffer Size Computation

If:

  • F is 4 bytes (if packing is S16 and there are two channels)

  • R (sample rate) is 44,100 Hz

  • T = 40 ms = 0.04 s.

then the resulting buffer size (N) is 7056 bytes.
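The same computation can be written as a small C function. This is a sketch under stated assumptions: audio_buffer_bytes is a hypothetical helper, not a dmSDK call, and rounding the frame count to the nearest whole frame is a design choice made here so that the buffer always holds complete sample frames.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper implementing N = F x R x T for uncompressed audio.
 * frame_size_bytes is F, sample_rate_hz is R, and seconds is T. */
static size_t audio_buffer_bytes(size_t frame_size_bytes,
                                 double sample_rate_hz, double seconds)
{
    assert(frame_size_bytes > 0);
    assert(sample_rate_hz > 0.0 && seconds >= 0.0);
    /* R x T is the number of sample frames; round to the nearest frame. */
    size_t frames = (size_t)(sample_rate_hz * seconds + 0.5);
    return frame_size_bytes * frames;
}
```

Calling audio_buffer_bytes(4, 44100.0, 0.04) reproduces the 7056 bytes of Example 8-1 (1764 frames of 4 bytes each).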