This chapter describes ML support for synchronizing digital media streams. The described techniques are designed to enable accurate synchronization even when there are large (and possibly unpredictable) processing delays.
This chapter contains the following sections:
To time-stamp each media stream, some convenient representation for time is needed. In ML, time is represented by the value of the unadjusted system time (UST) counter. That counter starts at 0 when the system is reset and increases continuously (without any adjustment) while the system is running.
Each process and/or piece of hardware may have its own view of the UST counter. That view is an approximation to the real UST counter. The difference between any two views is bounded for any implementation.
Each UST time stamp is a signed 64-bit integer value with units of nanoseconds representing a recent view of the UST counter. To obtain a current view of the UST, use the mlGetSystemUST function call:
MLstatus mlGetSystemUST(MLint64 systemId, MLint64* ust);
where:
systemId is ML_SYSTEM_LOCALHOST (any other value results in the return of ML_STATUS_INVALID_ID)
ust is a pointer to an MLint64 that will hold the resulting UST value
The status return is one of the following:

ML_STATUS_INVALID_ARGUMENT
The UST was not returned successfully (the ust pointer may be invalid)

ML_STATUS_INVALID_ID
The specified system ID is invalid (any value other than ML_SYSTEM_LOCALHOST)

ML_STATUS_NO_ERROR
The system UST was obtained successfully
This section discusses the following:
Basic support for synchronization requires that the application know exactly when video or audio buffers are passed through a jack. In ML, this is achieved with the UST and MSC buffer parameters.
The UST parameters are as follows:
ML_AUDIO_UST_INT64, ML_VIDEO_UST_INT64
The UST is the time stamp for the most recently processed slot in the audio/video stream:
For video devices, the UST corresponds to the time at which the field/frame starts to pass through the jack
For audio devices, the UST corresponds to the time at which the first sample in the buffer passed through the jack
The MSC parameters are as follows:
ML_AUDIO_MSC_INT64, ML_VIDEO_MSC_INT64
The MSC is the most recently processed slot in the audio/video stream. This is snapped at the same instant as the UST described in “Unadjusted System Time (UST) Parameters”.
MSC increases by 1 for each potential slot in the media stream through the jack. For interlaced video timings, each slot contains one video field; for progressive timings, each slot contains one video frame. This means that when two fields are interlaced into one frame and sent as one buffer, then the MSC will increment by 2 (one for each field). Furthermore, the system guarantees that the least significant bit of the MSC will reflect the state of the field bit: 0 for Field 1 and 1 for Field 2. For audio, each slot contains one audio frame.
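For instance, given a videoMSC value returned by the device (an illustrative variable name), the field parity for interlaced timings can be read directly from the low bit:

/* 0 means Field 1, 1 means Field 2 (interlaced timings only). */
int isField2 = (int)(videoMSC & 1);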
The ASC parameters are as follows:
ML_AUDIO_ASC_INT64, ML_VIDEO_ASC_INT64
The ASC is provided to aid the developer in predicting when the audio or video data will pass through an output jack. See “UST/MSC Example” for further information on the use of the ASC parameter.
Typically, an application will pass mlSendBuffers one of the following:
A video message containing values for the following:
ML_IMAGE_BUFFER_POINTER
ML_VIDEO_MSC_INT64
ML_VIDEO_UST_INT64 (and possibly ML_VIDEO_ASC_INT64)
An audio message containing values for the following:
ML_AUDIO_BUFFER_POINTER
ML_AUDIO_UST_INT64
ML_AUDIO_MSC_INT64
In some cases, a message can contain both audio and video parameters.
Each message is processed as a single unit and a reply is returned to the application via mlReceiveMessage. The reply will contain the completed buffer and the UST/MSC or UST/ASC corresponding to the time at which the data in the buffers passed in or out of the jack.
Note: Due to hardware buffering on some cards, it is possible to receive a reply message before the data has finished flowing through an output jack.
The following example sends an audio buffer and video buffer to an I/O path and requests both UST and MSC stamps:
MLpv message[7];
message[0].param = ML_IMAGE_BUFFER_POINTER;
message[0].value.pByte = someImageBuffer;
message[0].length = sizeof(someImageBuffer);
message[0].maxLength = sizeof(someImageBuffer);
message[1].param = ML_VIDEO_UST_INT64;
message[2].param = ML_VIDEO_MSC_INT64;
message[3].param = ML_AUDIO_BUFFER_POINTER;
message[3].value.pByte = someAudioBuffer;
message[3].length = sizeof(someAudioBuffer);
message[3].maxLength = sizeof(someAudioBuffer);
message[4].param = ML_AUDIO_UST_INT64;
message[5].param = ML_AUDIO_MSC_INT64;
message[6].param = ML_END;
mlSendBuffers(device, message);
After the device has processed the buffers, it will enqueue a reply message back to the application. That reply will be an exact copy of the message sent in, with the exception that the MSC and UST values will be filled in. (For input, the buffer parameter length will also be set to the number of bytes written into it).
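For instance, the following sketch reads the stamps back out of the reply. It assumes device is the open path used above and that the reply type ML_BUFFERS_COMPLETE indicates a successfully processed buffer message; printing requires <stdio.h>.

MLint32 messageType;
MLpv* reply;

/* Dequeue the next reply from the device and scan it for the stamps. */
if (mlReceiveMessage(device, &messageType, &reply) == ML_STATUS_NO_ERROR
    && messageType == ML_BUFFERS_COMPLETE) {
    MLpv* pv;
    for (pv = reply; pv->param != ML_END; pv++) {
        if (pv->param == ML_VIDEO_UST_INT64)
            printf("video UST: %lld ns\n", (long long)pv->value.int64);
        else if (pv->param == ML_VIDEO_MSC_INT64)
            printf("video MSC: %lld\n", (long long)pv->value.int64);
    }
}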
Note: An mlSendBuffers call can only have one ML_IMAGE_BUFFER_POINTER.
This section discusses the following:
On input, the application can detect if any data is missing by looking for breaks in the MSC sequence. This could happen if an application did not provide buffers fast enough to capture all of the signal that arrived at the jack. (An alternative to looking at the MSC numbers is to turn on the events ML_AUDIO_SEQUENCE_LOST or ML_VIDEO_SEQUENCE_LOST . Those will fire whenever the queue from application to device overflows.)
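A minimal sketch of such a check follows. The function name is illustrative; expectedDelta is the number of slots each buffer carries (1 per video field or audio buffer, 2 for an interlaced frame captured as a single buffer).

/* Returns the number of slots lost between two consecutive input
   buffers; 0 means no data was missed. */
MLint64 slotsLost(MLint64 previousMSC, MLint64 currentMSC,
                  MLint64 expectedDelta)
{
    return currentMSC - previousMSC - expectedDelta;
}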
Given the UST/MSC stamps for two different buffers, (UST1, MSC1) and (UST2, MSC2), the input sample rate in samples per nanosecond can be computed as follows:

inputSampleRate = (MSC2 - MSC1) / (UST2 - UST1)
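A one-line helper expressing this computation (an illustrative sketch; the same formula applies to the output case below):

/* Measured rate in slots per nanosecond; multiply by 1e9 for
   slots per second. */
double measuredRate(MLint64 ust1, MLint64 msc1, MLint64 ust2, MLint64 msc2)
{
    return (double)(msc2 - msc1) / (double)(ust2 - ust1);
}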
One common technique for synchronizing different input streams is to start recording early, stop recording late, and then use the UST/MSC stamps in the recorded data to find exact points for trimming the input data.
An alternative way to start recording several streams simultaneously is to use predicate controls (see “Predicate Controls”).
On output, the actual output sample rate can be computed in exactly the same way as the input sample rate:

outputSampleRate = (MSC2 - MSC1) / (UST2 - UST1)
Some applications must determine exactly when the next buffer sent to the device will actually go out the jack. Doing this requires the following steps:
Maintain the field/frame count for the application. This parameter is called the ASC. The ASC may start at any desired value and should increase by one for every audio frame or video field enqueued. (For convenience, the application may wish to associate the ASC with the buffer by embedding it in the same message. The parameters ML_AUDIO_ASC_INT64 and ML_VIDEO_ASC_INT64 are provided for this use.)
Detect if there was any underflow by comparing the number of slots the application thought it had output with the number of slots that the system actually output. (This assumes that the application knows the UST/MSC/ASC for two previously-output buffers.) If the following equation is true, then all is well:

(ASC2 - ASC1) == (MSC2 - MSC1)
Predict that the next data the application enqueues has the following system sequence count:

MSCfuture = MSC2 + (ASCnow - ASC2)
(This assumes that the application knows the current ASC.)
Predict when the data will hit the output jack:

USTfuture = UST2 + (MSCfuture - MSC2) * (UST2 - UST1) / (MSC2 - MSC1)
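A sketch pulling these steps together follows. It assumes the application recorded (UST1, MSC1, ASC1) and (UST2, MSC2, ASC2) from two previously completed output buffers and that ascNow is the ASC it will assign to the next buffer; the function name and the -1 error convention are illustrative.

MLint64 predictNextUST(MLint64 ust1, MLint64 msc1, MLint64 asc1,
                       MLint64 ust2, MLint64 msc2, MLint64 asc2,
                       MLint64 ascNow)
{
    /* Underflow check: the device must have output exactly as many
       slots as the application enqueued between the two stamps. */
    if ((asc2 - asc1) != (msc2 - msc1))
        return -1;  /* underflow occurred; caller must resynchronize */

    /* The slot that the next enqueued buffer will occupy. */
    MLint64 mscFuture = msc2 + (ascNow - asc2);

    /* When that slot will pass through the jack, using the measured
       slot period rather than a nominal rate. */
    return ust2 + (mscFuture - msc2) * (ust2 - ust1) / (msc2 - msc1);
}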
The application should periodically recompute the actual sample rate based on measured MSC/UST values. It is not sufficient to rely on a nominal sample rate because the actual rate may drift over time.
In summary: given the above mechanism, the application knows the UST/MSC pair for every processed buffer. Using the UST/MSC pairs for several processed buffers, the application can compute the frame rate. Given a UST/MSC pair in the past, a prediction of the current MSC, and the frame rate, the application can predict the UST at which the next buffer to be enqueued will hit the jack.
Predicate controls allow an application to insert conditional commands into the queue to the device. Using these, you can preprogram actions, allowing the device to respond immediately, without needing to wait for a round-trip through the application.
Unlike the UST/MSC time stamps, predicate controls are not required to be supported on all audio/video devices. To see if they are supported on any particular device, look for the desired parameter in the list of supported parameters on each path; see the mlGetCapabilities(3dm) man page. The simplest predicate controls are as follows:
ML_WAIT_FOR_AUDIO_MSC_INT64
ML_WAIT_FOR_VIDEO_MSC_INT64
When the message containing these controls reaches the head of the queue, it causes the queue to stall until the specified MSC value has passed. Then that message, and subsequent messages, are processed as normal.
For example, the following code uses WAIT_FOR_AUDIO_MSC to send a particular buffer out after a specified stream count:
MLpv message[3];
message[0].param = ML_WAIT_FOR_AUDIO_MSC_INT64;
message[0].value.int64 = someMSCInTheFuture;
message[1].param = ML_AUDIO_BUFFER_POINTER;
message[1].value.pByte = someBuffer;
message[1].length = sizeof(someBuffer);
message[2].param = ML_END;
mlSendBuffers(someOpenPath, message);
This places a message on the queue to the path and then immediately returns control to the application. As the device processes that message, it will pause until the specified media MSC value has passed before allowing the buffer to flow through the jack.
Using this technique, an application can program several media streams to start in synchronization by simply choosing some MSC count in the future and programming each stream to start at that count.
Note: If both ML_IMAGE_DOMINANCE and ML_WAIT_FOR_VIDEO_MSC controls are set and do not correspond to the same starting output field order, the ML_WAIT_FOR_VIDEO_MSC_INT64 control will override the ML_IMAGE_DOMINANCE_INT32 control setting.
Another set of synchronization predicate controls is as follows:
ML_WAIT_FOR_AUDIO_UST_INT64
ML_WAIT_FOR_VIDEO_UST_INT64
When the message containing these controls reaches the head of the queue, it causes the queue to stall until the specified UST value has passed. Then that message, and subsequent messages, are processed as normal.
Note: The accuracy with which the system is able to implement the WAIT_FOR_UST command is device-dependent. See the device-specific documentation for limitations.
For example, the following code uses WAIT_FOR_AUDIO_UST to send a particular buffer out after a specified time:
MLpv message[3];
message[0].param = ML_WAIT_FOR_AUDIO_UST_INT64;
message[0].value.int64 = someUSTtimeInTheFuture;
message[1].param = ML_AUDIO_BUFFER_POINTER;
message[1].value.pByte = someBuffer;
message[1].length = sizeof(someBuffer);
message[2].param = ML_END;
mlSendBuffers(someOpenPath, message);
This places a message on the queue to the path and then immediately returns control to the application. As the device processes that message, it will pause until the specified audio UST time has passed before allowing the buffer to flow through the jack.
Using this technique, an application can program several media streams to start in synchronization by simply choosing some UST time in the future and programming each stream to start at that time, as sketched below.
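This sketch assumes audioPath and videoPath are open, started paths, and that a half-second margin is enough for both wait messages to reach the heads of their queues; all names are illustrative and error checking is omitted.

MLint64 now, startTime;
MLpv audioMsg[3], videoMsg[3];

mlGetSystemUST(ML_SYSTEM_LOCALHOST, &now);
startTime = now + 500000000LL;  /* 0.5 seconds from now, in nanoseconds */

/* Both paths stall until the same UST deadline, then start together. */
audioMsg[0].param = ML_WAIT_FOR_AUDIO_UST_INT64;
audioMsg[0].value.int64 = startTime;
audioMsg[1].param = ML_AUDIO_BUFFER_POINTER;
audioMsg[1].value.pByte = someAudioBuffer;
audioMsg[1].length = sizeof(someAudioBuffer);
audioMsg[2].param = ML_END;

videoMsg[0].param = ML_WAIT_FOR_VIDEO_UST_INT64;
videoMsg[0].value.int64 = startTime;
videoMsg[1].param = ML_IMAGE_BUFFER_POINTER;
videoMsg[1].value.pByte = someImageBuffer;
videoMsg[1].length = sizeof(someImageBuffer);
videoMsg[2].param = ML_END;

mlSendBuffers(audioPath, audioMsg);
mlSendBuffers(videoPath, videoMsg);

The following predicates control processing up to a specified time: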
ML_IF_VIDEO_UST_LT_INT64
ML_IF_AUDIO_UST_LT_INT64
When included in a message, these controls will cause the following logical test: if the UST is less than the specified time, then the entire message is processed as normal; otherwise, the entire message is simply skipped.
Regardless of the outcome, any following messages are processed as normal. Skipping over a message takes time, so there is a limit to how many messages a device can skip before the delay starts to become noticeable. All media devices will support skipping at least one message without noticeable delay.
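As an illustration, the following sketch guards an image buffer with ML_IF_VIDEO_UST_LT_INT64 so that a frame that would go out too late is dropped rather than displayed late; latestAcceptableUST and the other names are illustrative.

MLpv message[3];
message[0].param = ML_IF_VIDEO_UST_LT_INT64;
message[0].value.int64 = latestAcceptableUST;
message[1].param = ML_IMAGE_BUFFER_POINTER;
message[1].value.pByte = someImageBuffer;
message[1].length = sizeof(someImageBuffer);
message[2].param = ML_END;
mlSendBuffers(someOpenPath, message);

/* If latestAcceptableUST has already passed when this message reaches
   the head of the queue, the whole message (including the buffer) is
   skipped and the device moves on to the next message. */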