Chapter 7. Image Buffer Parameters

This chapter describes the ML image buffer parameters in detail and gives examples of the resulting in-memory pixel formats.


Note: This chapter assumes a working knowledge of digital video concepts. Readers unfamiliar with terms such as video timing, 422, or CbYCr should consult a text devoted to this subject. A good resource is A Technical Introduction to Digital Video by Charles Poynton, published by John Wiley & Sons, 1996 (ISBN 0-471-12253-X, hardcover).


Image Buffer Layouts

An image buffer is memory allocated for a frame or field of pixels. ML itself does not allocate memory for buffers, so the application must do the allocation, and each buffer requires its own memory allocation call (malloc, for example).

Buffers must be in contiguous virtual memory and should be pinned in memory for optimum performance. Once a buffer has been created, the pointer to the buffer is passed to ML with the parameter ML_IMAGE_BUFFER_POINTER. The buffer pointer points to the first byte of the image in memory and must comply with the alignment constraints for buffers on the particular path or transcoder to which it is being sent. See the mlGetCapabilities(3dm) man page for details on determining alignment requirements with ML_PATH_BUFFER_ALIGNMENT_INT32.

For example, if ML_PATH_BUFFER_ALIGNMENT_INT32 is 8, the value of the buffer pointer must be a multiple of 8. The same applies to ML_PATH_COMPONENT_ALIGNMENT_INT32: the address of the beginning of each line (the first pixel of each line) must be a multiple of the value of the ML_PATH_COMPONENT_ALIGNMENT_INT32 parameter.
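The alignment rule can be checked in application code before a buffer is handed to ML. The following is a minimal sketch rather than part of ML: is_aligned and align_up are hypothetical helper names, and the alignment value is assumed to have been read already from the path capabilities.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if ptr is a multiple of `alignment` bytes. */
int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment > 0);
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical helper: returns the first address at or after `base` that is
 * a multiple of `alignment`. The caller is assumed to have over-allocated by
 * at least alignment - 1 bytes so the result stays inside the allocation. */
void *align_up(void *base, size_t alignment)
{
    assert(alignment > 0);
    uintptr_t remainder = (uintptr_t)base % alignment;
    if (remainder == 0)
        return base;
    return (unsigned char *)base + (alignment - remainder);
}
```

An application might allocate size + alignment - 1 bytes with malloc, pass the result through align_up, and keep the original pointer for the later call to free.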

Figure 7-1 shows an image that is mapped into an image buffer in a very general form. Figure 7-2 shows the more common simple image buffer layout.

Figure 7-1. General Image Buffer Layout


Figure 7-2. Simple Image Buffer Layout


Image Buffer Parameters Summary

This section describes the image buffer parameters.

ML_IMAGE_BUFFER_POINTER

Sets the pointer to the first byte of an image buffer in memory. The buffer address must comply with the alignment constraints for buffers on the particular path or transcoder to which it is being sent. See the mlGetCapabilities(3dm) man page for details on determining alignment requirements with ML_PATH_BUFFER_ALIGNMENT_INT32.

For example, if ML_PATH_BUFFER_ALIGNMENT_INT32 is 8, the value of the buffer pointer must be a multiple of 8. The same applies to ML_PATH_COMPONENT_ALIGNMENT_INT32: the address of the beginning of each line (the first pixel of each line) must be a multiple of the value of the ML_PATH_COMPONENT_ALIGNMENT_INT32 parameter.

ML_IMAGE_BUFFER_SIZE_INT32

Reports the size of the image buffer in bytes. This parameter is read-only; the device computes it from the current path control settings. The value represents the worst-case buffer size.

ML_IMAGE_COLORSPACE_INT32

Describes how to interpret each component. The full colorspace parameter is:

ML_COLORSPACE_representation_standard_range

where:

  • representation is either ML_REPRESENTATION_RGB or ML_REPRESENTATION_CbYCr.

    This controls how to interpret each component. Table 7-1 shows this mapping (assuming that every component is sampled once per pixel).

    Table 7-1. Mapping Colorspace representation Parameters

    Colorspace Representation  Component 1  Component 2  Component 3  Component 4
    -------------------------  -----------  -----------  -----------  -----------
    RGB                        Red          Green        Blue         Alpha
    CbYCr                      Cb           Y            Cr           Alpha

    The packing dictates the size and order of the components in memory, while the colorspace describes what each component represents. For example, the following shows the effect of colorspace and packing combined (assuming a 4444 sampling, see “ML_IMAGE_SAMPLING_INT32”).

    Color                    31            int              0
    Standard  Packing        +------------------------------+
    RGB       10_10_10_2     RRRRRRRRRRGGGGGGGGGGBBBBBBBBBBAA
    RGB       10_10_10_2_R   AABBBBBBBBBBGGGGGGGGGGRRRRRRRRRR
    CbYCr     10_10_10_2     bbbbbbbbbbYYYYYYYYYYrrrrrrrrrrAA
    CbYCr     10_10_10_2_R   AArrrrrrrrrrYYYYYYYYYYbbbbbbbbbb

  • standard indicates how to interpret particular values as actual colors. Choosing a different standard alters the way the system converts between different color representations. The current standards supported are Rec. 601, Rec. 709, and SMPTE 240M.

  • range is one of the following:

    • FULL, where the smallest and largest values are limited only by the available packing size. This is common in computer graphics.

    • HEAD, where the smallest and largest values are somewhat less than the theoretical min/max values to allow some "headroom". This is common in video, particularly when sending video signals over a wire. For example, values outside the legal component range may be used to mark the start or end of a video frame.

In Rec. 601 video, the black level (blackest black) is 16 for 8-bit video and 64 for 10-bit video. In computer graphics, 0 is blackest black. If a picture with 16 for blackest black is displayed by a system that uses 0 as blackest black, the image colors are all grayed-out as a result of shifting the colors to this new scale. Similarly, the brightest level is 235 for 8-bit video and 940 for 10-bit video. The best results are obtained by choosing the correct colorspace.
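The graying-out effect described above disappears once headroom-range values are rescaled to full range. The following is a minimal sketch for a single 8-bit component, using the black level of 16 and the brightest level of 235 given above. The function name head_to_full_8bit is hypothetical, and a plain linear rescale with rounding and clamping is assumed; ML devices may convert differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maps an 8-bit headroom-range value (black = 16,
 * brightest = 235) onto the full 0..255 range. Values outside the legal
 * range are clamped. */
uint8_t head_to_full_8bit(uint8_t v)
{
    const int black = 16;
    const int white = 235;

    if (v <= black)
        return 0;
    if (v >= white)
        return 255;

    /* Linear rescale, rounded to the nearest integer. */
    int scaled = ((v - black) * 255 + (white - black) / 2) / (white - black);
    assert(scaled >= 0 && scaled <= 255);
    return (uint8_t)scaled;
}
```

With this mapping, 16 becomes 0, 235 becomes 255, and out-of-range values such as 0 or 255 are clamped to the nearest end.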

Example 7-1. ML_COLORSPACE_RGB_709_FULL

ML_COLORSPACE_RGB_709_FULL is shorthand for the following:

ML_REPRESENTATION_RGB
+
ML_STANDARD_709
+
ML_RANGE_FULL

where:

  • Representation is RGB

  • The standard is 709

  • Full-range data is used


ML_IMAGE_COMPRESSION_FACTOR_REAL32

Describes the desired compression factor (for compressed images only). The values are as follows:

1

Disables the compressor or puts the compressor in pass-through mode.

x

Indicates that approximately x compressed buffers require the same space as 1 uncompressed buffer.


Note: The size of the uncompressed buffer depends on image width, height, packing, and sampling. The default value is implementation-dependent, but should represent a reasonable trade-off between compression time, quality, and bandwidth. x is a number larger than 1.
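As a rough illustration of what a compression factor means in bytes, the sketch below estimates the uncompressed size of an image and divides it by the factor. Both function names are hypothetical, row padding is ignored, and bits_per_pixel is assumed to be the average number of bits stored per pixel for the chosen packing and sampling. The value reported by ML_IMAGE_BUFFER_SIZE_INT32 remains the authoritative buffer size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size in bytes of a tightly packed uncompressed image. */
size_t uncompressed_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    return (width * height * bits_per_pixel + 7) / 8;
}

/* Hypothetical helper: approximate size of one compressed buffer when
 * `factor` compressed buffers occupy the space of one uncompressed buffer. */
size_t approx_compressed_bytes(size_t uncompressed, double factor)
{
    assert(factor >= 1.0);
    return (size_t)((double)uncompressed / factor);
}
```

For example, a 720 x 486 image at 16 bits per pixel occupies 699840 bytes uncompressed, so a factor of 5 corresponds to roughly 139968 bytes per compressed buffer.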


ML_IMAGE_COMPRESSION_INT32

Sets the input buffer or desired output buffer format on an image compressor/decompressor (codec). Possible values are as follows:

ML_COMPRESSION_UNCOMPRESSED
ML_COMPRESSION_BASELINE_JPEG
ML_COMPRESSION_DV_625
ML_COMPRESSION_DV_525
ML_COMPRESSION_MPEG2I
ML_COMPRESSION_DVCPRO_625
ML_COMPRESSION_DVCPRO_525
ML_COMPRESSION_DVCPRO50_625
ML_COMPRESSION_DVCPRO50_525
ML_COMPRESSION_MPEG2


Note: For a compressed bit stream, the parameters that describe the image data (such as height, width, and colorspace) might not all be known. The only parameters that might be known are the compression type (ML_IMAGE_COMPRESSION_INT32) and the size of the bit stream (ML_IMAGE_BUFFER_SIZE_INT32). The image buffer layout parameters (ML_IMAGE_SKIP_ROWS_INT32, ML_IMAGE_SKIP_PIXELS_INT32, and ML_IMAGE_ROW_BYTES_INT32) do not apply to compressed images.

For more information, see the following:

  • JPEG: W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard, New York, NY: Van Nostrand Reinhold, 1993.

  • IEC 61834-1 (1997). Recording - Helical-Scan Digital Video Cassette Recording System Using 6.35 mm Magnetic Tape for Consumer Use (525-60, 625-50, 1125-60, and 1250-50 Systems) - Part 1: General Specifications

  • IEC 61834-2 (1997). Recording - Helical-Scan Digital Video Cassette Recording System Using 6.35 mm Magnetic Tape for Consumer Use (525-60, 625-50, 1125-60, and 1250-50 Systems) - Part 2: SD Format for 525-60 and 625-50 Systems

  • SMPTE 314M. Television - Data Structure for DV-Based Audio, Data and Compressed Video - 25 and 50 Mb/s. Society of Motion Picture and Television Engineers, 1997.

  • MPEG2: ISO/IEC 13818-2. Generic Coding of Moving Pictures and Associated Audio Information: Video.

ML_IMAGE_DOMINANCE_INT32

Sets the field dominance, which defines the order of fields in a frame. For the same sequence of fields, there are two valid interpretations of which two fields belong together in a frame:

  • F1-dominant is an F1 field followed by an F2 field

  • F2-dominant is an F2 field followed by an F1 field

Changing the field dominance is most significant when external devices (for example, a tape deck) can only operate on frame boundaries. Figure 7-3 describes field dominance.

Figure 7-3. Field Dominance


The parameter can be one of the following:

ML_DOMINANCE_F1

Specifies that the video signal is F1-dominant. This is the default.

ML_DOMINANCE_F2

Specifies that the video signal is F2-dominant.

This parameter is ignored for progressive signals.
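The effect of dominance on frame boundaries can be sketched as a search for the first field pair that forms a frame. This is an illustration only: the enum and function names are hypothetical and are unrelated to the ML_DOMINANCE values, and the field sequence is assumed to be available as an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field labels for this sketch only. */
enum field_type { FIELD_F1 = 1, FIELD_F2 = 2 };

/* Returns the index of the first field that begins a complete frame under
 * the given dominance, or -1 if the sequence holds no complete frame. */
int first_frame_start(const enum field_type *fields, size_t count,
                      enum field_type dominant)
{
    assert(dominant == FIELD_F1 || dominant == FIELD_F2);
    for (size_t i = 0; i + 1 < count; i++) {
        if (fields[i] == dominant && fields[i + 1] != dominant)
            return (int)i;
    }
    return -1;
}
```

For the sequence F2, F1, F2, F1, F2, an F2-dominant interpretation starts its first frame at index 0, while an F1-dominant interpretation starts at index 1.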

ML_IMAGE_HEIGHT_1_INT32

Represents one of the following:

  • The height of each frame for progressive or interleaved buffers (depending on parameter ML_IMAGE_INTERLEAVE_MODE_INT32)

  • The height of each F1 field, measured in pixels, for interlaced and non-interleaved signals.


Note: Interlaced pertains to video signals while interleaved pertains to images in memory. Progressive video signals are noninterlaced and noninterleaved. An interlaced signal could be either interleaved (a frame will have 2 fields in a buffer) or noninterleaved (a frame will have 2 buffers, with each field in a buffer).

For more information, see A Technical Introduction to Digital Video by Charles Poynton, published by John Wiley & Sons, 1996 (ISBN 0-471-12253-X, hardcover).


ML_IMAGE_HEIGHT_2_INT32

Sets the height of each F2 field in an interlaced non-interleaved signal. For progressive video signals and interlaced interleaved buffers, it must be set to 0.

ML_IMAGE_INTERLEAVE_MODE_INT32

Specifies whether the two fields have been interleaved into a single image (and reside in a single buffer) or are stored as two separate fields (and hence in two separate buffers). This parameter applies only to interlaced images.

Possible values are:

ML_INTERLEAVE_MODE_INTERLEAVED

Each pair of fields is interleaved into a single buffer. In this case, the parameter ML_IMAGE_HEIGHT_2_INT32 is set to 0.

ML_INTERLEAVE_MODE_SINGLE_FIELD

The two fields are stored separately. This means that each field has its own image buffer. Use ML_IMAGE_HEIGHT_1_INT32 for the F1 buffer and ML_IMAGE_HEIGHT_2_INT32 for the F2 buffer.

This parameter is ignored for signals with progressive timing. Default is interleaved.

ML_IMAGE_ORIENTATION_INT32

Sets the orientation of the image:

ML_ORIENTATION_TOP_TO_BOTTOM

Natural video order: pixel [0,0] is at the top left of the image.

ML_ORIENTATION_BOTTOM_TO_TOP

Natural graphics order: pixel [0,0] is at the bottom left of the image.
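An application that addresses rows from the top of the displayed picture can translate between the two orientations with a one-line mapping. The sketch below assumes that row 0 of the buffer is the row containing pixel [0,0]; the function name buffer_row and its flag argument are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the buffer row that holds the image row
 * counted from the top of the displayed picture. `bottom_to_top` is nonzero
 * for a bottom-to-top buffer and zero for a top-to-bottom buffer. */
size_t buffer_row(size_t row_from_top, size_t height, int bottom_to_top)
{
    assert(row_from_top < height);
    return bottom_to_top ? height - 1 - row_from_top : row_from_top;
}
```

For a 480-row image, the top display row is buffer row 0 in a top-to-bottom buffer and buffer row 479 in a bottom-to-top buffer.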

ML_IMAGE_PACKING_INT32

Sets the image packing. The image packing parameter describes the pixel storage in detail as follows:

ML_PACKING_type_size_order

where:

  • type is the base type of each component. Leave blank for an unsigned integer, use S for a signed integer.

  • size defines the number of bits per component. The size may refer to simple, padded, or complex packings.

    For the simplest formats, every component is the same size and there is no additional space between components. A single numeric value specifies the number of bits per component. The first component consumes the first size bits, the next consumes the next size bits, and so on. Within each component, the most significant bits always precede the least-significant bits. For example, a size of 12 means that the first byte in memory has the most significant 8 bits of the first component, the second byte holds the remainder of the first component and the most significant 4 bits of the second component, and so on.

    Space is only allocated for components that are in use (this depends on the sampling mode, see “ML_IMAGE_SAMPLING_INT32”). For these formats, the data must always be interpreted as a sequence of bytes. For example, ML_PACKING_8 describes a packing in which each component is an unsigned 8-bit quantity. ML_PACKING_S8 describes the same packing except that each component is a signed 8-bit quantity.

    For padded formats, each component is padded and may be treated as a 2-byte short integer. When this occurs, the size takes the form:

    BitsinSpaceAlignment

    where:

    Bits

    Specifies the number of bits of information per component

    Space

    Specifies the total size of each component, in bits

    Alignment

    Indicates whether the information is left-shifted (L) or right-shifted (R) in that space

    In this case, each component in use consumes Space bits, and those bits must be interpreted as a short integer. (Unused components consume no space.)

    For example, following are some common packings:

              15    short    0
    Packing   +--------------+
    12in16R   0000iiiiiiiiiiii
    S12in16R  ssssiiiiiiiiiiii
    12in16L   iiiiiiiiiiiipppp
    S12in16L  iiiiiiiiiiiipppp
    S12in16L0 iiiiiiiiiiii0000

    where:

    • s indicates sign-extension.

    • i indicates the actual component information.

    • p indicates padding (replicated from the most significant bits of information).

    • S indicates a signed number (for 12 bits: -2048 ... 2047). The range is: -2**(nbits-1) .. 2**(nbits-1) - 1.

    • 0 indicates that unused bits are padded with zeros.


    Note: These bit locations refer to the locations when the 16-bit component has been loaded into a register as a 16-bit integer quantity.

    For the most complex formats, the size of every component is specified explicitly and the entire pixel must be treated as a single 4-byte integer. The size takes the form size1_size2_size3_size4, where size1 is the size of component 1, size2 is the size of component 2, and so on. In this case, the entire pixel is a single 4-byte integer of length equal to the sum of the component sizes. Any space allocated to unused components must be zero-filled. The most common complex packing occurs when 4 components are packed within a 4-byte integer. For example, ML_PACKING_10_10_10_2 is:

                31             int             0
    Packing     +------------------------------+
    10_10_10_2  11111111112222222222333333333344

    where 1 is the first component, 2 is the second component, and so on. The bit locations refer to the locations when this 32-bit pixel is loaded into a register as a 32-bit integer quantity. If only three components were in use (determined from the sampling), then the space for the fourth component would be zero-filled.

  • order is the order of the components in memory. Leave blank for natural ordering (1,2,3,4), use R for reversed ordering (4,3,2,1). For all other orderings, specify the component order explicitly. For example, 4123 indicates that the fourth component is stored first in memory, followed by the remaining three components. Here, we compare a normal, a reversed, and a 4123 packing:

                    31             int             0
    Packing         +------------------------------+
    10_10_10_2      11111111112222222222333333333344
    10_10_10_2_R    44333333333322222222221111111111
    10_10_10_2_4123 44111111111122222222223333333333

    where 1 is the first component, 2 is the second component, and so on. Because this is a complex packing, the bit locations refer to the locations when this entire pixel is loaded into a register as a single integer.
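The three component orderings can be expressed as shifts on a 32-bit integer. The sketch below follows the bit positions described above; the function names are hypothetical and are not part of ML.

```c
#include <assert.h>
#include <stdint.h>

/* Natural ordering (1,2,3,4): component 1 occupies bits 31..22. */
uint32_t pack_10_10_10_2(uint32_t c1, uint32_t c2, uint32_t c3, uint32_t c4)
{
    assert(c1 < 1024 && c2 < 1024 && c3 < 1024 && c4 < 4);
    return (c1 << 22) | (c2 << 12) | (c3 << 2) | c4;
}

/* Reversed ordering (4,3,2,1): component 4 occupies bits 31..30. */
uint32_t pack_10_10_10_2_R(uint32_t c1, uint32_t c2, uint32_t c3, uint32_t c4)
{
    assert(c1 < 1024 && c2 < 1024 && c3 < 1024 && c4 < 4);
    return (c4 << 30) | (c3 << 20) | (c2 << 10) | c1;
}

/* Explicit ordering 4123: component 4 first, then components 1, 2, and 3. */
uint32_t pack_10_10_10_2_4123(uint32_t c1, uint32_t c2, uint32_t c3, uint32_t c4)
{
    assert(c1 < 1024 && c2 < 1024 && c3 < 1024 && c4 < 4);
    return (c4 << 30) | (c1 << 20) | (c2 << 10) | c3;
}
```

Setting only component 1 to its maximum value (0x3FF) yields 0xFFC00000 in the natural ordering, 0x000003FF in the reversed ordering, and 0x3FF00000 in the 4123 ordering.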

For recommendations on packing and component ordering, see Appendix A, “Pixels in Memory”.
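The simple and padded packings described in this section can be sketched in the same way. The helpers below are hypothetical and cover a simple packing of size 12 (two components written as three bytes, most significant bits first) and the padded 12-bit forms; the padded values are shown as they appear when loaded into a register as a 16-bit quantity.

```c
#include <assert.h>
#include <stdint.h>

/* Simple packing, size 12: two 12-bit components fill three bytes. */
void pack_12_pair(uint8_t out[3], uint16_t a, uint16_t b)
{
    assert(a < 4096 && b < 4096);
    out[0] = (uint8_t)(a >> 4);                       /* top 8 bits of a */
    out[1] = (uint8_t)(((a & 0xF) << 4) | (b >> 8));  /* rest of a, top 4 of b */
    out[2] = (uint8_t)(b & 0xFF);                     /* low 8 bits of b */
}

/* 12in16R: right-shifted, upper four bits are zero. */
uint16_t pad_12in16R(uint16_t v)
{
    assert(v < 4096);
    return v;
}

/* S12in16R: right-shifted, upper four bits are sign-extension. */
uint16_t pad_S12in16R(int16_t v)
{
    assert(v >= -2048 && v <= 2047);
    return (uint16_t)v;
}

/* 12in16L: left-shifted, low four bits replicate the top four bits. */
uint16_t pad_12in16L(uint16_t v)
{
    assert(v < 4096);
    return (uint16_t)((v << 4) | (v >> 8));
}

/* S12in16L0: left-shifted, low four bits are zero. */
uint16_t pad_S12in16L0(int16_t v)
{
    assert(v >= -2048 && v <= 2047);
    return (uint16_t)(((uint16_t)v & 0x0FFF) << 4);
}
```

For example, the components 0xABC and 0xDEF in a simple 12-bit packing occupy the bytes 0xAB, 0xCD, 0xEF, and the value 0xABC in 12in16L becomes 0xABCA.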

ML_IMAGE_ROW_BYTES_INT32

Specifies the number of bytes along one row of the image buffer. If this value is 0, each row is exactly ML_IMAGE_WIDTH_INT32 pixels wide. Default is 0.


Note: In physical memory, there is no notion of two dimensions; the end of the first row continues directly at the start of the second row. An image buffer contains either one frame or one field. For interlaced image data, the two fields can be stored in two separate image buffers or they can be stored in interleaved form in one image buffer.


ML_IMAGE_SAMPLING_INT32

Specifies the sampling rate. The sampling parameters take their names from common terminology in the video industry. They describe how often each component is sampled for each pixel. In computer graphics, it is normal for every component to be sampled once per pixel, but in video that need not be the case.

For RGB colorspaces, the only legal values are:

RGB Value

Description

ML_SAMPLING_444

Indicates that the R, G, and B components are each sampled once per pixel, and only the first 3 channels are used. If used with an image packing that provides space for a fourth component, then those bits should have value 0 on an input path and will be ignored on an output path.

ML_SAMPLING_4444

Indicates that the R, G, B, and A components are sampled once per pixel.

For all CbYCr colorspaces, the legal values include the following:

CbYCr Value

Description

ML_SAMPLING_444

Indicates that Cb, Y, and Cr are each sampled once per pixel and only the first 3 channels are used. If a packing provides space for a fourth channel then those bits should have value 0.

ML_SAMPLING_4444

Indicates that Cb, Y, Cr, and Alpha are each sampled once per pixel.

ML_SAMPLING_422

Indicates that Y is sampled once per pixel and Cb/Cr are sampled once per pair of pixels. In this case, Cb and Cr are interleaved on component 1 (Cb is first, Cr is second) and the Y occupies component 2. If used with an image packing that provides space for a third or fourth component, those bits should have value 0 on an input path and will be ignored on an output path.

ML_SAMPLING_4224

Indicates that Y and Alpha are sampled once per pixel and Cb/Cr are sampled once per pair of pixels. In this case, Cb and Cr are interleaved on component 1, Y is on component 2, component 3 contains the Alpha channel, and component 4 is not used (and will have value 0 if space is allocated for it in the packing).

ML_SAMPLING_411

Indicates that Y is sampled once per pixel and Cb and Cr are sampled once per 4 pixels. In this case, Cb is component 1, Y is component 2, and Cr is component 3. If used with an image packing that provides space for a fourth component, those bits should have value 0 on an input path and will be ignored on an output path.

ML_SAMPLING_420

Indicates that Y is sampled once per pixel and Cb or Cr is sampled once per pair of pixels on alternate lines. In this case, Cb or Cr is interleaved on component 1 and the Y occupies component 2. If used with an image packing that provides space for a third or fourth component, those bits should have value 0 on an input path and will be ignored on an output path.

ML_SAMPLING_400

Indicates that only Y is sampled per pixel (a greyscale image). Y is stored on component 1, all other components are unused. If used with an image packing that provides space for additional components, those bits should have value 0 on an input path and will be ignored on an output path.

ML_SAMPLING_0004

Indicates that only Alpha is sampled per pixel. If used with an image packing that provides space for additional components, those bits should have value 0 on an input path and will be ignored on an output path.

Table 7-2 shows the combined effect of sampling and colorspace on the component definitions.

Table 7-2. Effect of Sampling and Colorspace on Component Definitions

Sampling  Colorspace Representation  Component 1  Component 2  Component 3  Component 4
--------  -------------------------  -----------  -----------  -----------  -----------
4444      RGB                        Red          Green        Blue         Alpha
444       RGB                        Red          Green        Blue
0004      RGB                        Alpha
4444      CbYCr                      Cb           Y            Cr           Alpha
444       CbYCr                      Cb           Y            Cr           0
4224      CbYCr                      Cb/Cr        Y            Alpha        0
422       CbYCr                      Cb/Cr        Y
400       CbYCr                      Y
420       CbYCr                      Cb/Cr [a]    Y
411       CbYCr                      Cb           Y            Cr
0004      CbYCr                      Alpha

[a] Cb and Cr components are multiplexed with Y on alternate lines (not pixels).
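For 422 sampling with a simple 8-bit packing, the component definitions imply a byte stream of Cb, Y, Cr, Y for each pair of pixels. The sketch below assumes ML_PACKING_8 with natural component ordering, so that component 1 precedes component 2 for each pixel; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes one pair of horizontally adjacent pixels. Component 1 carries Cb
 * for the first pixel and Cr for the second; component 2 carries Y. */
void write_422_pair(uint8_t out[4], uint8_t cb, uint8_t y0,
                    uint8_t cr, uint8_t y1)
{
    out[0] = cb;  /* pixel 0, component 1 */
    out[1] = y0;  /* pixel 0, component 2 */
    out[2] = cr;  /* pixel 1, component 1 */
    out[3] = y1;  /* pixel 1, component 2 */
}

/* Bytes needed for one tightly packed row: two components per pixel. */
size_t row_bytes_422_8(size_t width)
{
    assert(width % 2 == 0);
    return width * 2;
}
```

Under these assumptions, a 720-pixel row occupies 1440 bytes.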


ML_IMAGE_SKIP_PIXELS_INT32

Specifies the number of pixels to skip at the start of each line in the image buffer. Must be 0 if ML_IMAGE_ROW_BYTES_INT32 is 0. Default is 0.

ML_IMAGE_SKIP_ROWS_INT32

Specifies the number of rows to skip at the start of each image buffer. Default is 0.
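Taken together, the row-bytes and skip parameters determine where a given pixel sits in the buffer. The sketch below shows one plausible reading, assuming pixels that occupy a whole number of bytes and a top-to-bottom buffer; pixel_byte_offset is a hypothetical helper, and Figure 7-1 remains the definitive description of the layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of pixel (x, y) from the buffer pointer.
 * A row_bytes of 0 means rows are exactly `width` pixels wide, in which
 * case skip_pixels must also be 0. */
size_t pixel_byte_offset(size_t x, size_t y, size_t width,
                         size_t bytes_per_pixel, size_t row_bytes,
                         size_t skip_rows, size_t skip_pixels)
{
    assert(x < width);
    if (row_bytes == 0) {
        assert(skip_pixels == 0);
        row_bytes = width * bytes_per_pixel;
    }
    return (skip_rows + y) * row_bytes + (skip_pixels + x) * bytes_per_pixel;
}
```

With a 4-pixel-wide image at 2 bytes per pixel and all layout parameters at 0, pixel (1, 1) lies at byte offset 10. With 16 row bytes, 2 skipped rows, and 1 skipped pixel, pixel (0, 0) lies at byte offset 34.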

ML_IMAGE_TEMPORAL_SAMPLING_INT32

Specifies whether the image source is progressive or interlaced. Set to one of the following:

ML_TEMPORAL_SAMPLING_FIELD_BASED
ML_TEMPORAL_SAMPLING_PROGRESSIVE

Default is device-dependent.

If the image data is field based, the parameter ML_IMAGE_INTERLEAVE_MODE_INT32 defines how the two fields are stored in an image buffer.

ML_IMAGE_WIDTH_INT32

Sets the width of the image in pixels.

ML_SWAP_BYTES_INT32


Note: Not available on all devices.

Sets whether or not byte reordering occurs:

1

Reorders bytes as a first step when reading data from memory and as a final step when writing data to memory. The exact reordering depends on the packing element size.

0

Does not reorder (default).

For simple and padded packing formats, the element size is the size of each component. For complex packing formats, the element size is the sum of the four component sizes. Table 7-3 describes how this parameter reorders bits.

Table 7-3. Bit Reordering

Element Size  Default Ordering  Modified Ordering
------------  ----------------  -----------------------------
16-bit        [15..0]           [7..0][15..8]
32-bit        [31..0]           [7..0][15..8][23..16][31..24]
Other         [n..0]            [n..0] (no change)
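The reordering in Table 7-3 is a conventional byte swap. The sketch below reproduces it in software for reference; the function names are hypothetical, and on devices that support ML_SWAP_BYTES_INT32 the reordering is performed by the device rather than by application code.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit element: [15..0] becomes [7..0][15..8]. */
uint16_t swap_bytes_16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* 32-bit element: [31..0] becomes [7..0][15..8][23..16][31..24]. */
uint32_t swap_bytes_32(uint32_t v)
{
    return (v << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | (v >> 24);
}

/* Applies the table: only 16-bit and 32-bit elements are reordered. */
uint32_t swap_element(uint32_t v, unsigned element_bits)
{
    assert(element_bits >= 1 && element_bits <= 32);
    if (element_bits == 16)
        return swap_bytes_16((uint16_t)v);
    if (element_bits == 32)
        return swap_bytes_32(v);
    return v;
}
```

For example, the 16-bit element 0x1234 becomes 0x3412, the 32-bit element 0x11223344 becomes 0x44332211, and an 8-bit element is left unchanged.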