This chapter describes the dmSDK audio parameters and buffers.
The digital representation of an audio signal is generated by periodically sampling the amplitude (voltage) of the audio signal. The samples represent periodic "snapshots" of the signal amplitude. The sampling rate specifies the number of samples per second.
The audio buffer pointer points to the source or destination data in an audio buffer for processing a fragment of a media stream. For audio signals, a fragment typically corresponds to between 10 milliseconds and 1 second of audio data.
An audio buffer is a collection of sample frames. A sample frame is a set of audio samples that are coincident in time. A sample frame for mono data is a single sample. A sample frame for stereo data consists of a left-right sample pair.
Stereo samples are interleaved: left-channel samples alternate with right-channel samples. Four-channel samples are also interleaved. Each four-channel frame usually holds two left/right sample pairs, but other arrangements are possible.
Figure 8-1 shows the relationship between the number of channels and the frame size of audio sample data.
Figure 8-2 shows the layout of an audio buffer in memory.
The parameters discussed in the following sections are as follows:
| DM_AUDIO_BUFFER_POINTER | Pointer to the audio buffer |
| DM_AUDIO_FRAMESIZE_INT32 | Size of an audio sample frame in bytes |
| DM_AUDIO_SAMPLE_RATE_REAL64 | Sample rate in Hz |
| DM_AUDIO_PRECISION_INT32 | Precision at audio jack |
| DM_AUDIO_FORMAT_INT32 | Format of the data in the audio buffer |
| DM_AUDIO_GAINS_REAL64_ARRAY | Audio gain controls |
| DM_AUDIO_COMPANDING_INT32 | Sample quantization method |
| DM_AUDIO_CHANNELS_INT32 | Number of audio channels |
| DM_AUDIO_COMPRESSION_INT32 | Audio compression format |
A pointer to the first byte of an in-memory audio buffer. The buffer address must comply with the alignment constraints for buffers on the particular path to which it is being sent. See dmGetCapabilities(3dm) for details on determining alignment requirements.
The size of an audio sample frame in bytes. This is a read-only parameter and is computed in the device using the current path control settings.
The sample rate of the audio data in hertz (Hz), that is, the frequency at which samples are taken from the analog signal. A sample rate of 1 Hz is equal to one sample per second. For example, when a mono analog audio signal is digitized at a 44.1 kilohertz (kHz) sample rate, 44,100 digital samples are generated for every second of the signal. Supported values depend on the hardware but are usually between 8,000.0 and 96,000.0. The default is hardware-specific. Common sample rates are:
| 8,000.0 |
| 16,000.0 |
| 32,000.0 |
| 44,100.0 |
| 48,000.0 |
| 96,000.0 |
The Nyquist theorem defines the minimum sampling frequency required to accurately represent the information in an analog signal of a given bandwidth. According to Nyquist, an audio signal must be sampled at a frequency that is at least double the highest analog audio frequency of interest. The sample rate used for music-quality audio, such as the digital data stored on audio CDs, is 44.1 kHz. A 44.1 kHz digital signal can theoretically represent audio frequencies from 0 kHz to 22.05 kHz, which adequately covers the range of normal human hearing. Higher sample rates result in higher-quality digital signals; however, the higher the sample rate, the greater the signal storage requirement.
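The trade-off between bandwidth and storage can be made concrete with a little arithmetic. The following Python sketch is illustrative only and is not part of dmSDK; the function names are hypothetical, and the frame size of 4 bytes assumes 16-bit stereo data.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency that can be represented at the given sample rate."""
    return sample_rate_hz / 2.0


def bytes_per_second(sample_rate_hz, bytes_per_frame):
    """Storage needed for one second of uncompressed audio."""
    return sample_rate_hz * bytes_per_frame


# Compare a few common sample rates, assuming 4 bytes per sample frame.
for rate in (8000.0, 44100.0, 96000.0):
    print(rate, nyquist_limit_hz(rate), bytes_per_second(rate, 4))
```

Doubling the sample rate doubles both the representable bandwidth and the storage required per second.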
The maximum width in bits for an audio sample at the input or output jack. For example, a value of 16 indicates a 16-bit audio signal. Query only. DM_AUDIO_PRECISION_INT32 specifies the precision at the audio I/O jack, whereas DM_AUDIO_FORMAT_INT32 specifies the packing of the audio samples in the audio buffer. If the format specified by DM_AUDIO_FORMAT_INT32 differs from the precision reported by DM_AUDIO_PRECISION_INT32, the system converts between the two. Such a conversion might include padding, truncation, or both.
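This chapter does not specify how the system performs the conversion. As a rough illustration of what padding and truncation mean for integer samples, the following Python sketch assumes that padding appends zero bits below the least significant bit and that truncation discards the least significant bits. Both function names are hypothetical, and the actual device behavior may differ.

```python
def truncate_sample(sample, from_bits, to_bits):
    """Reduce precision by discarding the least significant bits (assumed behavior)."""
    return sample >> (from_bits - to_bits)


def pad_sample(sample, from_bits, to_bits):
    """Increase width by appending zero bits below the sample (assumed behavior)."""
    return sample << (to_bits - from_bits)


# A 24-bit sample reduced to 16 bits, and a 16-bit sample widened to 24 bits.
print(hex(truncate_sample(0x123456, 24, 16)))
print(hex(pad_sample(0x1234, 16, 24)))
```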
Specifies the format in which audio samples are stored in memory. The interpretation of format values is:
DM_FORMAT_[type][bits]
[type] is U for unsigned integer samples, S for signed (two's complement) integer samples, R for real (floating point) samples.
[bits] is the number of significant bits per sample.
For sample formats in which the number of significant bits is less than the number of bits in which the sample is stored, the format of the values is:
DM_FORMAT_[type][bits]in[size][alignment]
[size] is the total size used for the sample in memory, in bits.
[alignment] is either R or L depending on whether the significant bits are right- or left-shifted within the sample. For example, here are three of the most common audio buffer formats:
| DM_FORMAT_U8 |
| DM_FORMAT_S16 |
| DM_FORMAT_S24in32R |
Common values are:
| DM_FORMAT_U8 |
| DM_FORMAT_S16 |
| DM_FORMAT_S24in32R |
| DM_FORMAT_R32 |
Default is hardware-specific.
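The naming convention for format values can be illustrated with a small parser. The following Python sketch is not part of dmSDK: parse_format_name and SampleFormat are hypothetical helpers, and format names are handled as plain strings only to show how the type, bits, size, and alignment fields are read from a name. When a name has no size field, the sketch assumes the storage size equals the number of significant bits.

```python
import re
from typing import NamedTuple


class SampleFormat(NamedTuple):
    kind: str        # "U", "S", or "R"
    bits: int        # number of significant bits
    size: int        # total bits used for the sample in memory
    alignment: str   # "R", "L", or "" when the sample fills its storage


_PATTERN = re.compile(r"^DM_FORMAT_([USR])(\d+)(?:in(\d+)([RL]))?$")


def parse_format_name(name):
    """Split a format name such as DM_FORMAT_S24in32R into its fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError("unrecognized format name: " + name)
    kind, bits, size, alignment = match.groups()
    bits = int(bits)
    size = int(size) if size else bits
    if size < bits:
        raise ValueError("storage size is smaller than significant bits")
    return SampleFormat(kind, bits, size, alignment or "")


for name in ("DM_FORMAT_U8", "DM_FORMAT_S16", "DM_FORMAT_S24in32R", "DM_FORMAT_R32"):
    print(name, parse_format_name(name))
```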
The gain factor in decibels (dB) on the given path. There will be a value for each audio channel. Negative values represent attenuation. Zero represents no change of the signal. Positive values amplify the signal. A gain of negative infinity indicates infinite attenuation (mute).
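To relate decibel values to signal scaling, the following Python sketch converts a gain in dB to a linear amplitude factor. It assumes the usual amplitude convention, in which the factor is 10 raised to the power dB/20; this chapter does not state which convention the hardware uses, and db_to_linear is a hypothetical helper, not a dmSDK function.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor (assumed 20*log10 convention)."""
    if gain_db == float("-inf"):
        return 0.0  # infinite attenuation (mute)
    return 10.0 ** (gain_db / 20.0)


# One gain value per channel, as in a stereo path.
gains_db = [0.0, -6.0]
print([db_to_linear(g) for g in gains_db])
```

Under this convention, 0 dB leaves the signal unchanged, about -6 dB halves the amplitude, and negative infinity maps to a factor of zero.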
Describes the quantization method of the audio sample value. For DM_COMPANDING_MU_LAW and DM_COMPANDING_A_LAW, the output voltage changes exponentially as the sample value changes linearly. The purpose of this method is to cover a wider dynamic range with the same number of sample bits. Companding is a neologism that combines “compressing” and “expanding”. It is different from audio compression, in which a set of audio samples is compressed to obtain a smaller file size.
Common values are:
| DM_COMPANDING_NONE (default, if supported by the hardware) |
| DM_COMPANDING_MU_LAW |
| DM_COMPANDING_A_LAW |
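The effect of companding can be seen in the continuous mu-law curve. The following Python sketch uses the textbook formula with mu = 255 on samples normalized to the range -1 to 1. It is an illustration of the principle only: it is not the segmented 8-bit encoding used by telephony codecs, and it is not claimed to match what any dmSDK device does. The function names are hypothetical.

```python
import math

MU = 255.0  # value commonly used for mu-law


def mu_law_compress(x):
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Quiet samples receive a much larger share of the companded range.
for x in (0.01, 0.1, 1.0):
    print(x, mu_law_compress(x))
```

A sample at 1 percent of full scale maps to more than 20 percent of the companded range, which is how companding preserves detail in quiet passages.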
The number of channels of audio data in the buffer. Multi-channel audio data is always stored interleaved, with the samples for each consecutive audio channel following one another in sequence. For example, a 4-channel audio stream will have the form:
123412341234...
Common values are:
| DM_CHANNELS_MONO |
| DM_CHANNELS_STEREO |
| DM_CHANNELS_4 |
| DM_CHANNELS_8 |
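Interleaving can be sketched in a few lines of Python. The helpers below are hypothetical and operate on plain lists of sample values rather than on dmSDK buffers; they show only the ordering of samples within sample frames.

```python
def interleave(channels):
    """Merge equal-length per-channel sample lists into one interleaved list."""
    if len({len(c) for c in channels}) != 1:
        raise ValueError("all channels must have the same number of samples")
    return [sample for frame in zip(*channels) for sample in frame]


def deinterleave(samples, num_channels):
    """Split an interleaved list back into one list per channel."""
    if len(samples) % num_channels != 0:
        raise ValueError("sample count is not a whole number of frames")
    return [samples[c::num_channels] for c in range(num_channels)]


# Four channels, three sample frames; each sample is labeled with its channel number.
channels = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(interleave(channels))
```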
If the audio data is compressed, this parameter specifies the compression format. The format may be an industry standard such as MPEG-1 audio, or the data may not be compressed at all.
Common values include the following:
| DM_COMPRESSION_UNCOMPRESSED |
| DM_COMPRESSION_MU_LAW |
| DM_COMPRESSION_A_LAW |
| DM_COMPRESSION_IMA_ADPCM |
| DM_COMPRESSION_MPEG1 |
| DM_COMPRESSION_MPEG2 |
| DM_COMPRESSION_AC3 |
When the data is uncompressed, the value of this parameter is DM_COMPRESSION_UNCOMPRESSED.
The following equation shows how to calculate the number of bytes for an uncompressed audio buffer, given the sample frame size, the sampling rate, and the time period that the audio buffer represents:
N = F × R × T
where:
| N | the audio buffer size in bytes |
| F | the number of bytes per audio sample frame (DM_AUDIO_FRAMESIZE_INT32) |
| R | the sample rate in Hz (DM_AUDIO_SAMPLE_RATE_REAL64) |
| T | the time period the audio buffer represents in seconds |
Example 8-1. Buffer Size Computation
If:
F = 4 bytes (S16 packing with two channels)
R = 44,100 Hz
T = 40 ms = 0.04 s
then the resulting buffer size is N = 4 × 44,100 × 0.04 = 7,056 bytes.
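The same computation can be written as a short Python sketch. Both helpers are hypothetical and not part of dmSDK; in a real application the frame size would be read from DM_AUDIO_FRAMESIZE_INT32 rather than computed. The sketch assumes each sample occupies a whole number of bytes, and it rounds the duration to a whole number of sample frames, which is a design choice of this example rather than something the equation requires.

```python
def frame_size_bytes(bits_per_sample_in_memory, channels):
    """Bytes per sample frame, assuming byte-aligned samples (hypothetical helper)."""
    return (bits_per_sample_in_memory // 8) * channels


def buffer_size_bytes(frame_size, sample_rate_hz, seconds):
    """N = F x R x T, rounded to a whole number of sample frames."""
    frames = round(sample_rate_hz * seconds)
    return frame_size * frames


frame_size = frame_size_bytes(16, 2)  # S16 packing, two channels
print(buffer_size_bytes(frame_size, 44100.0, 0.04))
```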