It’s a bit of a challenge staying on top of all the new
formats and codecs used in our industry. In the past, there have been a few
mainstay codecs, such as MPEG-2, DV, HDV and DVCPRO HD, but now several new
codecs are emerging, including variants on MPEG-4, Motion JPEG 2000, RedCode, DNxHD,
ProRes and others. Here’s what you need to know about these codecs, especially
the ones used in high definition.
INTRAFRAME VS. LONG GOP
Intraframe (spatial) compression is a form of compression
in which an entire image can be reconstructed from the information contained
within that image alone (in effect, single-image compression). Frames compressed
this way are generally referred to as I-frames.
Long GOP (temporal) compression has I-frames as well, but
also has partial frames using temporal compression, in which frames store only
the differences between themselves and their neighboring frames. These partial
frames are known as B-frames and P-frames. Long GOP compression is used on DVDs
and in HDV. DVDs generally use a 15-frame GOP (Group of Pictures), which means
that at roughly 30 frames per second there is one I-frame every 15 frames, or
every half second. The other frames in the GOP structure have less bandwidth
allocated and reference their nearest I- or P-frames in order to be
reconstructed.
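To make the GOP arithmetic concrete, here is a minimal Python sketch. It assumes one common layout (an I-frame followed by a P-frame every third frame, with B-frames in between); the function names and the `p_spacing` parameter are illustrative, and real encoders are free to arrange frames differently.

```python
def gop_pattern(gop_length=15, p_spacing=3):
    """Return the frame types for one GOP, in display order.

    Assumed layout: frame 0 is the I-frame, every `p_spacing`-th frame
    after it is a P-frame, and everything else is a B-frame.
    """
    frames = []
    for i in range(gop_length):
        if i == 0:
            frames.append("I")
        elif i % p_spacing == 0:
            frames.append("P")
        else:
            frames.append("B")
    return "".join(frames)


def i_frame_interval_seconds(gop_length, frame_rate):
    """Time between consecutive I-frames, in seconds."""
    return gop_length / frame_rate


pattern = gop_pattern()
print(pattern)                            # IBBPBBPBBPBBPBB
print(i_frame_interval_seconds(15, 30))   # 0.5
```

Only one frame in 15 is a complete picture in this layout, which is where the bandwidth saving comes from, and also why an editor cannot cut on an arbitrary frame without rebuilding it first.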
While Long GOP offers up to three times more efficient
compression, it requires more processing power, and it cannot be edited without
first being converted to full frames. That conversion can be done by conforming
or transcoding to an intraframe codec. In rare instances, an editing
application may do the conversion in real time while storing the Long GOP
files in their native format.
The other potential drawback is blocky compression artifacts.
These appear when too much of the image changes from one frame to the next for
the limited bandwidth allocated to the B- and P-frames to reconstruct the frame
accurately. This, however, is less of a problem with higher data rates, shorter
GOP structures and variable bit rate (VBR) encoding, which allows more
bandwidth to be allocated when needed to counter these types of problems.
DCT & WAVELET
There are basically two types of compression used in intraframe
compression: wavelet and Discrete Cosine Transform (DCT). Most of the codecs
employed for professional video use DCT compression. There is no simple
explanation for how this works, but in general DCT-based codecs divide the image
into 8x8 pixel blocks (often grouped into 16x16 macroblocks) and run each block
through a matrix-based transform. Wavelet compression, by comparison, uses a
wavelet transform to convert the pixels into coefficients, which are then
quantized and coded. Because the wavelet transform describes the image at
several resolutions at once, it is an ideal solution for scalability.
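For readers who want to see the DCT step in action, the following Python sketch transforms a single 8x8 block and then quantizes the result. It is a textbook DCT-II written for clarity, not the implementation used by any particular codec; the function names and the quantization `step` of 16 are assumptions chosen for illustration.

```python
import math

N = 8  # block size assumed for this sketch


def dct_2d(block):
    """Forward 2D DCT-II of an N x N block, with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step=16):
    """Uniform quantization, the lossy part. `step` is an illustrative value."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat gray block: every pixel has the same value.
flat = [[128] * N for _ in range(N)]
q = quantize(dct_2d(flat))
nonzero = sum(1 for row in q for c in row if c != 0)
print(q[0][0], nonzero)  # 64 1
```

The 64 pixel values of the flat block collapse into a single non-zero coefficient, which is why smooth areas compress so well. Because each block is processed on its own, heavy quantization shows up as visible block boundaries.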
The biggest difference in compression artifacts using
wavelet compression versus DCT-based compression is that instead of seeing
blocky artifacts, there is a softening of edges.
OTHER TECHNIQUES
Many of these codecs also employ chroma subsampling,
run-length encoding (RLE) and entropy encoding, of which Huffman coding is a
common example.
Chroma subsampling means that the luma (brightness)
information is stored for every pixel, but the chroma (color) information is
stored at one-half (4:2:2) or one-quarter (4:1:1 and 4:2:0) resolution. For the
pixels without color sampling, the image assumes the same color information as
the last color-sampled pixel, with the stored luminance value applied. This is
one of the reasons 4:1:1 sampling creates difficulties for composites: it
causes stair-step artifacts along the edges between the subject and the
greenscreen.
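The effect on a greenscreen edge can be sketched in a few lines of Python. This is a simplified model that only covers horizontal subsampling on a single scan line and rebuilds missing samples by repeating the last stored one, as described above; it does not model the vertical subsampling of 4:2:0, and the function names are illustrative.

```python
def subsample_row(chroma_row, factor):
    """Keep one chroma sample out of every `factor` pixels."""
    return chroma_row[::factor]


def reconstruct_row(samples, factor, width):
    """Rebuild a full-width row by repeating each stored sample."""
    row = []
    for sample in samples:
        row.extend([sample] * factor)
    return row[:width]


# One scan line crossing a key edge: 'G' = greenscreen, 'S' = subject.
line = list("GGGGGGSSSSSS")

for name, factor in [("4:2:2", 2), ("4:1:1", 4)]:
    rebuilt = reconstruct_row(subsample_row(line, factor), factor, len(line))
    print(name, "".join(rebuilt))
# 4:2:2 GGGGGGSSSSSS
# 4:1:1 GGGGGGGGSSSS
```

In this example the 4:1:1 line puts the color edge two pixels away from where the luma edge sits. Repeat that error with a different offset on each line and you get the stair-step fringe that makes keying difficult.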
RLE is a quick, lossless compression, which works
by turning areas where several pixels in a row have the same value into a
compressed version. To visualize this, assume 15 pixels in a row have the value
of A, then the next five have a value of B, and the next 10 have a value of C.
Rather than have 30 individual values, it would reduce the info to 15A5B10C. In
many situations this has little effect on reducing the size. However, if you are
exporting an image with alpha information, a great deal of that image might
have the same value because it is completely transparent. In these situations,
RLE compression can have a dramatic effect on file size.
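The 15A5B10C example above can be reproduced with a short Python sketch. The function names are illustrative, and real file formats store the counts and values as binary data rather than text.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of identical values into a (count, value) pair."""
    return [(len(list(group)), value) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    return "".join(value * count for count, value in pairs)


pixels = "A" * 15 + "B" * 5 + "C" * 10
pairs = rle_encode(pixels)
encoded = "".join(f"{count}{value}" for count, value in pairs)

print(encoded)                      # 15A5B10C
print(rle_decode(pairs) == pixels)  # True
```

Decoding returns exactly the 30 values that went in, which is what makes RLE lossless. On a row where every pixel differs, each run has a length of one and the encoded form is actually larger than the original.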
Entropy encoding is a lossless step that assigns shorter codes
to the values that occur most often and longer codes to rare ones, so the same
data takes fewer bits. It is often confused with palettization, a separate,
lossy technique that reduces image size by reducing the number of colors in the
image. This is done by creating a customized palette, which represents the most
commonly used colors in the image. An example of this would be a 24-bit image
reduced to an 18-bit image by creating a palette of the 262,144 most commonly
used colors. This works because most frames don’t really need every color to
create an image. For example, a mostly red image probably doesn’t need to
reference too many shades of blue, though it may need every shade of red to
avoid posterizing artifacts, in other words, to maintain a smooth red gradient.
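The palette reduction described above can be sketched as follows. This is a deliberately tiny example with a two-color palette rather than 262,144 entries, and it simply keeps the most frequently used colors; the function names are illustrative, and production tools generally use more sophisticated palette selection.

```python
from collections import Counter


def build_palette(pixels, size):
    """Return the `size` most commonly used colors in the image."""
    return [color for color, _ in Counter(pixels).most_common(size)]


def nearest(color, palette):
    """Pick the palette entry closest to `color` (squared RGB distance)."""
    return min(palette,
               key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)))


def palettize(pixels, size):
    """Replace every pixel with an index into a reduced palette."""
    palette = build_palette(pixels, size)
    indices = [palette.index(nearest(px, palette)) for px in pixels]
    return palette, indices


# A mostly red image: two shades of red and a single blue pixel.
pixels = [(255, 0, 0)] * 6 + [(200, 0, 0)] * 3 + [(0, 0, 255)]
palette, indices = palettize(pixels, 2)

print(palette)  # [(255, 0, 0), (200, 0, 0)]
print(indices)  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
```

Both shades of red survive, so the red gradient stays intact, while the lone blue pixel is forced onto the nearest red. That is the trade being made: rare colors are sacrificed so that each pixel can be stored as a short palette index.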
MORE TO COME
Next month, Post will discuss which common codecs are DCT-
and wavelet-based, and what the advantages and disadvantages of each codec are.
We’ll also explain what this means when it comes to production and post quality
and workflow.
Heath Firestone is a Producer/Director with Firestone
Studios in Denver, CO. He can be reached at: heath@firestonestudios.com.