This chapter describes the ML audio parameters and buffers.
The digital representation of an audio signal is generated by periodically sampling the amplitude (voltage) of the audio signal. The samples represent periodic snapshots of the signal amplitude. The sampling rate specifies the number of samples per second.
The audio buffer pointer points to the source or destination data in an audio buffer for processing a fragment of a media stream. For audio signals, a fragment typically corresponds to between 10 milliseconds and 1 second of audio data.
An audio buffer is a collection of sample frames. A sample frame is a set of audio samples that are coincident in time. A sample frame for mono data is a single sample. A sample frame for stereo data consists of a left-right sample pair.
Stereo samples are interleaved; left-channel samples alternate with right-channel samples. Four-channel samples are also interleaved, with each frame usually having two left/right sample pairs, but there can be other arrangements.
Figure 8-1 shows the relationship between the number of channels and the frame size of audio sample data. Figure 8-2 shows the layout of an audio buffer in memory.
This section discusses the audio parameters.
Sets a pointer to the first byte of an in-memory audio buffer. The buffer address must comply with the alignment constraints for buffers on the particular path to which it is being sent. See the mlGetCapabilities(3dm) man page for details of determining alignment requirements.
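The alignment constraint above amounts to a simple address check. The following is a minimal sketch, assuming the required alignment is a power of two expressed in bytes. The helper names are hypothetical and are not part of ML; the actual alignment value must come from the capabilities of the path.

```python
def _check_alignment(alignment: int) -> None:
    # Assumption: alignment requirements are positive powers of two.
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")


def is_aligned(address: int, alignment: int) -> bool:
    """Return True if address is a multiple of alignment."""
    _check_alignment(alignment)
    return address & (alignment - 1) == 0


def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of alignment."""
    _check_alignment(alignment)
    return (address + alignment - 1) & ~(alignment - 1)
```

A common way to use such helpers is to over-allocate a buffer by one alignment unit and start the audio data at the rounded-up address.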
Sets the number of channels of audio data in the buffer. Multichannel audio data is always stored interleaved, with the samples for each consecutive audio channel following one another in sequence. For example, a 4-channel audio stream will have the form:
123412341234...
where 1 is the sample for the first audio channel, 2 is the sample for the second, and so on.
ML_CHANNELS_MONO
ML_CHANNELS_STEREO
ML_CHANNELS_4
ML_CHANNELS_8
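The interleaved layout described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and samples are modeled as plain Python lists rather than as an in-memory ML buffer.

```python
def interleave(channels):
    """Merge per-channel sample lists into one interleaved list of frames."""
    if len({len(c) for c in channels}) > 1:
        raise ValueError("all channels must hold the same number of samples")
    # zip(*channels) yields one sample frame (one sample per channel) at a time.
    return [sample for frame in zip(*channels) for sample in frame]


def deinterleave(samples, num_channels):
    """Split an interleaved sample list back into per-channel lists."""
    if num_channels <= 0 or len(samples) % num_channels:
        raise ValueError("buffer must contain a whole number of sample frames")
    # Channel ch occupies every num_channels-th position, starting at index ch.
    return [samples[ch::num_channels] for ch in range(num_channels)]
```

With four channels whose samples are all 1, all 2, all 3, and all 4 respectively, `interleave` produces the 123412341234... pattern shown above.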
Specifies the compression format if the audio data is in compressed form. The compression format may be an industry standard such as MPEG-1 audio, or it may be no compression at all.
ML_COMPRESSION_A_LAW
ML_COMPRESSION_AC3
ML_COMPRESSION_IMA_ADPCM
ML_COMPRESSION_MPEG1
ML_COMPRESSION_MPEG2
ML_COMPRESSION_MU_LAW
ML_COMPRESSION_UNCOMPRESSED
Specifies the format in which audio samples are stored in memory. The interpretation of format values is as follows:
ML_FORMAT_TypeBits
Type is U for unsigned integer samples, S for signed (two's complement) integer samples, or R for real (floating-point) samples.
Bits is the number of significant bits per sample.
For sample formats in which the number of significant bits is less than the number of bits in which the sample is stored, the format of the values is:
ML_FORMAT_TypeBitsinSizeAlignment
Size is the total size used for the sample in memory, in bits.
Alignment is either R or L, depending on whether the significant bits are right- or left-shifted within the sample.
For example, the following are three of the most common audio buffer formats:
ML_AUDIO_FORMAT_U8 | Bits 7-0 | iiiiiiii |
ML_AUDIO_FORMAT_S16 | Bits 15-0 | iiiiiiii iiiiiiii |
ML_AUDIO_FORMAT_S24in32R | Bits 31-0 | ssssssss iiiiiiii iiiiiiii iiiiiiii |
where:
s indicates sign-extension
i indicates the actual component information
The bit locations refer to the locations when the 8-, 16-, or 32-bit sample has been loaded into a register as an integer quantity.
If ML_AUDIO_COMPRESSION_INT32 indicates that the audio data is in compressed form, ML_AUDIO_FORMAT_INT32 indicates the data type of the samples after decoding. Common formats are:
ML_FORMAT_U8
ML_FORMAT_S16
ML_FORMAT_S24in32R
ML_FORMAT_R32
Default is hardware-specific.
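The naming convention and the S24in32R layout described above can be illustrated as follows. This is a sketch under stated assumptions: `parse_format`, `int_to_s24in32r`, and `s24in32r_to_int` are hypothetical helpers that operate on format names and plain integers, not ML functions, and the sign-extended upper byte follows the description of right-aligned samples given above.

```python
import re

# Matches names such as ML_FORMAT_S16 or ML_AUDIO_FORMAT_S24in32R.
_FORMAT_RE = re.compile(r"^ML_(?:AUDIO_)?FORMAT_([USR])(\d+)(?:in(\d+)([RL]))?$")


def parse_format(name):
    """Split a format name into type, significant bits, size, and alignment."""
    match = _FORMAT_RE.match(name)
    if match is None:
        raise ValueError("unrecognized format name: " + name)
    kind, bits, size, alignment = match.groups()
    bits = int(bits)
    # Without an "inSize" suffix, the storage size equals the significant bits.
    size = int(size) if size else bits
    return {"type": kind, "bits": bits, "size": size, "alignment": alignment}


def int_to_s24in32r(value):
    """Store a signed 24-bit value right-aligned and sign-extended in 32 bits."""
    if not -0x800000 <= value <= 0x7FFFFF:
        raise ValueError("value does not fit in 24 signed bits")
    return value & 0xFFFFFFFF


def s24in32r_to_int(word):
    """Recover the signed 24-bit value from a 32-bit S24in32R word."""
    value = word & 0xFFFFFF
    return value - 0x1000000 if value & 0x800000 else value
```

For a negative sample, the upper eight bits of the 32-bit word are all ones, which corresponds to the sign-extension bits of the layout.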
Reports the size of an audio sample frame in bytes (ML_AUDIO_FRAMESIZE_INT32). This is a read-only parameter and is computed in the device using the current path control settings.
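Although the device computes this value, the arithmetic for uncompressed data can be sketched as the number of channels multiplied by the storage size of one sample. The function below is a hypothetical helper, and it assumes samples are packed back to back with no padding between them.

```python
def frame_size_bytes(num_channels, storage_bits):
    """Size in bytes of one sample frame of uncompressed audio."""
    if num_channels <= 0:
        raise ValueError("num_channels must be positive")
    if storage_bits <= 0 or storage_bits % 8:
        raise ValueError("storage_bits must be a positive multiple of 8")
    # One frame holds one sample per channel.
    return num_channels * (storage_bits // 8)
```

For example, stereo S16 data gives a frame size of 4 bytes, and four-channel S24in32R data gives 16 bytes, because each 24-bit sample occupies 32 bits in memory.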
Sets the gain factor in decibels (dB) on the given path. There is one value for each audio channel. Negative values represent attenuation. Zero represents no change of the signal. Positive values amplify the signal. A gain of negative infinity indicates infinite attenuation (mute).
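To relate decibel values to sample amplitudes, the following sketch converts a gain in dB to a linear scale factor. It assumes the usual amplitude convention of 20 dB per factor of ten; how a given device applies the gain is not specified here, and both helper names are hypothetical.

```python
import math


def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    if gain_db == -math.inf:
        return 0.0  # negative infinity means mute
    return 10.0 ** (gain_db / 20.0)


def apply_gains(frame, gains_db):
    """Scale each sample of one frame by the gain for its channel."""
    if len(frame) != len(gains_db):
        raise ValueError("one gain value is required per channel")
    return [sample * db_to_linear(g) for sample, g in zip(frame, gains_db)]
```

Under this convention, 0 dB leaves a sample unchanged, about -6 dB halves its amplitude, and negative infinity reduces it to zero.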
Queries the maximum width in bits for an audio sample at the input or output jack. For example, a value of 16 indicates a 16-bit audio signal. ML_AUDIO_PRECISION_INT32 specifies the precision at the audio I/O jack, whereas ML_AUDIO_FORMAT_INT32 specifies the packing of the audio samples in the audio buffer. If ML_AUDIO_FORMAT_INT32 differs from ML_AUDIO_PRECISION_INT32, the system converts between the two formats. Such a conversion might include padding, truncation, or both.
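The padding and truncation mentioned above can be pictured with bit shifts on signed integer samples. This is only one common approach, shown as an assumption; the conversion actually performed by the system is not documented here, and `convert_precision` is a hypothetical helper.

```python
def convert_precision(sample, from_bits, to_bits):
    """Convert a signed integer sample between bit widths by shifting."""
    if from_bits <= 0 or to_bits <= 0:
        raise ValueError("bit widths must be positive")
    if to_bits >= from_bits:
        # Padding: append zero low-order bits.
        return sample << (to_bits - from_bits)
    # Truncation: discard low-order bits (arithmetic shift keeps the sign).
    return sample >> (from_bits - to_bits)
```

Widening a 16-bit sample to 24 bits and narrowing it again returns the original value, whereas narrowing a 24-bit sample to 16 bits discards its eight least significant bits.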
Sets the sample rate of the audio data in hertz (Hz). The sample rate is the frequency at which samples are taken from the analog signal; a sample rate of 1 Hz is equal to one sample per second. For example, when a mono analog audio signal is digitized at a 44.1-kilohertz (kHz) sample rate, 44,100 digital samples are generated for every second of the signal. Values depend on the hardware, but are usually between 8,000.0 and 96,000.0. Default is hardware-specific. Common sample rates are:
8,000.0
16,000.0
32,000.0
44,100.0
48,000.0
96,000.0
The Nyquist theorem defines the minimum sampling frequency required to accurately represent the information of an analog signal with a given bandwidth. According to Nyquist, an analog audio signal must be sampled at a frequency that is at least double the highest audio frequency of interest. The sample rate used for music-quality audio, such as the digital data stored on audio CDs, is 44.1 kHz. A 44.1-kHz digital signal can theoretically represent audio frequencies from 0 kHz to 22.05 kHz, which adequately covers the range of normal human hearing. Higher sample rates can represent a wider range of frequencies; however, the higher the sample rate, the greater the signal storage requirement.
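The relationships in this paragraph reduce to simple arithmetic, sketched below. The function names are hypothetical, and the storage figure assumes uncompressed audio with a known frame size in bytes.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2.0


def min_sample_rate(highest_frequency_hz):
    """Lowest sample rate that can represent the given frequency."""
    return 2.0 * highest_frequency_hz


def bytes_per_second(frame_size_bytes, sample_rate_hz):
    """Storage consumed by one second of uncompressed audio."""
    return frame_size_bytes * sample_rate_hz
```

For stereo 16-bit data (4 bytes per frame), moving from 44.1 kHz to 96 kHz raises the representable bandwidth from 22.05 kHz to 48 kHz and raises the storage requirement from 176,400 to 384,000 bytes per second.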
The following equation shows how to calculate the number of bytes for an uncompressed audio buffer given the sample frame size, sampling rate, and the time period representing the audio buffer:
N = F × R × T
where:
N | Audio buffer size in bytes |
F | The number of bytes per audio sample frame (ML_AUDIO_FRAMESIZE_INT32) |
R | The sample rate in Hz (ML_AUDIO_SAMPLE_RATE_REAL64) |
T | The time period the audio buffer represents in seconds |
Example 8-1. Buffer Size Computation
If:
F is 4 bytes (packing is S16 and there are two channels)
R (sample rate) is 44,100 Hz
T is 40 ms, or 0.04 s
then the resulting buffer size (N) is 4 × 44,100 × 0.04 = 7056 bytes.
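The computation in Example 8-1 can be written as a short function. This is a minimal sketch: `audio_buffer_size` is a hypothetical helper, and rounding the duration to a whole number of sample frames is an assumption made so that the buffer never ends in the middle of a frame.

```python
def audio_buffer_size(frame_size_bytes, sample_rate_hz, duration_s):
    """Bytes needed for an uncompressed audio buffer of the given duration."""
    if frame_size_bytes <= 0 or sample_rate_hz <= 0 or duration_s < 0:
        raise ValueError("arguments must be positive (duration may be zero)")
    # Assumption: round to a whole number of sample frames.
    frames = round(sample_rate_hz * duration_s)
    return frame_size_bytes * frames
```

Calling `audio_buffer_size(4, 44100.0, 0.04)` reproduces the 7056 bytes of the example, which corresponds to 1764 stereo S16 sample frames.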