Yong Qiu Liu's Web Page--H.261 video

Introduction of multimedia codecs

3.1 H.261 video

H.261 is an ITU recommendation for video codecs. It describes the video source coder, the video multiplex coder and the transmission coder for moving pictures. It was designed to be carried over ISDN at a rate of p*64 kbit/s, where p is an integer between 1 and 30.

H.261 is intended for delivering video over ISDN to support face-to-face videophone applications and videoconferencing. For videophone, p can be 1 or 2 because a lower image quality is acceptable. Videoconferencing requires a higher picture quality, and p must be at least 6.^[32]
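To make the p*64 kbit/s rule concrete, the short sketch below computes the channel rate for the values of p mentioned above. The function name is hypothetical and only illustrates the arithmetic.

```python
def h261_channel_rate_kbps(p: int) -> int:
    """Return the p*64 kbit/s channel rate for a multiplier p in 1..30."""
    if not 1 <= p <= 30:
        raise ValueError("p must be an integer between 1 and 30")
    return p * 64


# Rates for the cases mentioned in the text, plus the largest allowed p.
for label, p in [("videophone", 1), ("videophone", 2),
                 ("videoconferencing (minimum)", 6), ("maximum", 30)]:
    print(f"{label}: p={p} -> {h261_channel_rate_kbps(p)} kbit/s")
```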

Image format

H.261 defines two picture formats (resolutions are given as lines * pixels per line). CIF (Common Intermediate Format) has a resolution of 288 * 352 in luminance and 144 * 176 in chrominance. QCIF (Quarter Common Intermediate Format) has a resolution of 144 * 176 in luminance and 72 * 88 in chrominance. In both cases, the chrominance components have half the horizontal and vertical resolution of the luminance component. The source coder operates only on non-interlaced pictures.^[32] The compression ratio of H.261 ranges from 1:1 up to 113:1, depending on the picture quality required.^[87]

The picture formats and their uncompressed bit rates are listed in Table 2-1.

Table 2-1 Picture Formats Supported ^[33]

Uncompressed bit rates are in Mbit/s.

Picture format	Luminance pixels per line	Luminance lines	H.261 support	Grey, 10 frames/s	Colour, 10 frames/s	Grey, 30 frames/s	Colour, 30 frames/s
QCIF	176	144	Yes	2.0	3.0	6.1	9.1
CIF	352	288	Optional	8.1	12.2	24.3	36.5
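The uncompressed bit rates in the table can be reproduced with a few lines of arithmetic. The sketch below assumes 8 bits per sample and 1 Mbit = 10^6 bits; neither is stated in the text, but both are consistent with the tabulated values. Colour adds two chrominance planes, each with half the horizontal and vertical resolution of the luminance plane.

```python
FORMATS = {"QCIF": (176, 144), "CIF": (352, 288)}  # luminance pixels x lines
BITS_PER_SAMPLE = 8  # assumption: 8 bits per luminance or chrominance sample


def uncompressed_mbps(fmt, fps, colour):
    """Raw bit rate in Mbit/s (10**6 bits) for one picture format."""
    width, height = FORMATS[fmt]
    samples = width * height
    if colour:
        # Two chrominance planes, each a quarter of the luminance plane.
        samples += 2 * (width // 2) * (height // 2)
    return samples * BITS_PER_SAMPLE * fps / 1e6


for fmt in FORMATS:
    row = [round(uncompressed_mbps(fmt, fps, colour), 1)
           for fps in (10, 30) for colour in (False, True)]
    print(fmt, row)  # order: grey@10, colour@10, grey@30, colour@30
```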

Codecs

The H.261 coding algorithm combines image compression based on the DCT (Discrete Cosine Transform) with motion-compensated inter-frame prediction.^[34]

H.261 has two types of coding: intra coding (I-frame) and inter coding (P-frame) (Figure 2-1). Each frame is processed in units of macro-blocks. A macro-block consists of six 8 * 8 pixel blocks: four luminance blocks and two chrominance blocks (Figure 2-2).^[33]

Figure 2-1 Coding Sequence

Figure 2-2 H.261 Macro-block^[34]
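The macro-block layout can be illustrated with a small sketch that cuts one macro-block out of a QCIF picture. It relies on the fact that four 8 * 8 luminance blocks cover a 16 * 16 pixel area, while the half-resolution chrominance planes contribute one 8 * 8 block each. The function names and the synthetic picture content are hypothetical.

```python
MB_SIZE = 16  # a macro-block covers 16 x 16 luminance pixels
BLOCK = 8     # every block is 8 x 8 samples


def macroblock_count(width, height):
    """Number of macro-blocks in a picture of the given luminance size."""
    return (width // MB_SIZE) * (height // MB_SIZE)


def crop(plane, top, left, size=BLOCK):
    """Cut a size x size block out of a plane stored as a list of rows."""
    return [row[left:left + size] for row in plane[top:top + size]]


def macroblock_blocks(y, cb, cr, mb_row, mb_col):
    """Return the six 8x8 blocks of one macro-block: Y0..Y3, Cb, Cr."""
    top, left = mb_row * MB_SIZE, mb_col * MB_SIZE
    luma = [crop(y, top + dy, left + dx)
            for dy in (0, BLOCK) for dx in (0, BLOCK)]
    # Chrominance planes have half the resolution in both directions.
    c_top, c_left = mb_row * BLOCK, mb_col * BLOCK
    return luma + [crop(cb, c_top, c_left), crop(cr, c_top, c_left)]


width, height = 176, 144  # QCIF luminance size
y = [[(r + c) % 256 for c in range(width)] for r in range(height)]
cb = [[128] * (width // 2) for _ in range(height // 2)]
cr = [[128] * (width // 2) for _ in range(height // 2)]

blocks = macroblock_blocks(y, cb, cr, mb_row=0, mb_col=1)
print(len(blocks), "blocks per macro-block")
print("macro-blocks per QCIF picture:", macroblock_count(176, 144))
print("macro-blocks per CIF picture:", macroblock_count(352, 288))
```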

Intra coding encodes each 8 * 8 pixel block with reference only to itself: the block is sent directly to DCT-based coding with no motion-compensated prediction (Figure 2-3). Inter coding, by contrast, encodes each block with respect to the preceding reference frame, and only the difference is encoded, using motion-compensated prediction and DCT coding (Figure 2-4).^[33]^[34] In the figures, RLE stands for Run Length Encoding.

Figure 2-3 Intra Frame (I-frame) Coding
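As a rough illustration of the intra path (DCT, quantisation, then run-length encoding), the sketch below processes a single 8 * 8 block. It is a simplification under stated assumptions: one uniform quantiser step is applied to every coefficient, the coefficients are read in zig-zag order before run-length encoding, and the final variable-length coding stage is left out. All function names are hypothetical.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an 8x8 block."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = sum(block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                        for x in range(N) for y in range(N))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantise(coeffs, step):
    """Uniform quantiser (assumption: the same step for all coefficients)."""
    return [[round(value / step) for value in row] for row in coeffs]


def zigzag(block):
    """Read the block along anti-diagonals, low frequencies first."""
    order = sorted(((r, c) for r in range(N) for c in range(N)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [block[r][c] for r, c in order]


def run_length(levels):
    """Encode as (zero_run, level) pairs; trailing zeros become 'EOB'."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")
    return pairs


flat = [[128] * N for _ in range(N)]                  # uniform grey block
ramp = [[16 * c for c in range(N)] for _ in range(N)]  # horizontal gradient
for name, block in (("flat", flat), ("ramp", ramp)):
    print(name, run_length(zigzag(quantise(dct_2d(block), step=16))))
```

A flat block collapses to a single DC coefficient followed by the end-of-block marker, which is why smooth image areas compress so well.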

H.261 also supports motion compensation in the encoder as an option. It uses both the prediction errors and the motion vectors, which specify the magnitude and direction of the displacement between the macro-block being encoded (the target) and the chosen reference (Figure 2-4).^[34]

Figure 2-4 Inter-frame (P-frame) Coding
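One common way to find a motion vector is exhaustive block matching, sketched below. H.261 does not prescribe how the encoder searches, so the full search, the sum-of-absolute-differences (SAD) error measure and the search range of 7 pixels are all assumptions chosen for illustration. The vector returned here points from the target block to its best match in the reference frame.

```python
import random


def sad(target, t_top, t_left, reference, r_top, r_left, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(target[t_top + r][t_left + c]
                   - reference[r_top + r][r_left + c])
               for r in range(size) for c in range(size))


def best_motion_vector(target, reference, top, left, size=16, search=7):
    """Full search over +/- `search` pixels; returns ((dy, dx), error)."""
    height, width = len(reference), len(reference[0])
    best = ((0, 0), sad(target, top, left, reference, top, left, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r_top, r_left = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if not (0 <= r_top <= height - size
                    and 0 <= r_left <= width - size):
                continue
            error = sad(target, top, left, reference, r_top, r_left, size)
            if error < best[1]:
                best = ((dy, dx), error)
    return best


# Synthetic test data: the target is the reference shifted
# 2 lines down and 3 pixels to the right (with wrap-around).
rng = random.Random(0)
SIZE = 48
reference = [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]
target = [[reference[(r - 2) % SIZE][(c - 3) % SIZE] for c in range(SIZE)]
          for r in range(SIZE)]

vector, error = best_motion_vector(target, reference, top=16, left=16)
print("motion vector (dy, dx):", vector, "prediction error:", error)
```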

Encoding procedure (Figure 2-5):^[35]

Read in a frame of video.

For each P-frame, estimate motion relative to the previously transmitted frame, calculate the motion vectors, and subtract the motion-compensated reference frame from the current frame to create a residual frame.

Transform the residual frame using the Discrete Cosine Transform (DCT).

Quantise the results of the DCT.

Encode the quantised DCT coefficients with variable-length codes and transmit them.

Reconstruct a reference frame and place it in the frame store.

Figure 2-5 Encoding procedures^[34]
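The encoding steps above can be sketched as a loop over frames. To stay short, this sketch makes several assumptions: each "frame" is a single 8 * 8 block, the first frame is intra coded, motion estimation is replaced by a zero motion vector, a single uniform quantiser step is used, and the variable-length coding stage is omitted. The important point it illustrates is the last step: the encoder rebuilds the frame exactly as the decoder will, so both sides predict from the same reference.

```python
import math

N = 8
# Orthonormal DCT-II basis matrix: BASIS[u][x].
BASIS = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
          for x in range(N)] for u in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct(block):      # C * X * C^T
    return matmul(matmul(BASIS, block), transpose(BASIS))


def idct(coeffs):    # C^T * Y * C
    return matmul(matmul(transpose(BASIS), coeffs), BASIS)


def encode_sequence(frames, step=8):
    """Return the quantised levels per frame and the final frame store."""
    reference = None   # frame store, empty before the first frame
    transmitted = []
    for frame in frames:                                # read in a frame
        if reference is None:
            prediction = [[0] * N for _ in range(N)]    # I-frame
        else:
            prediction = reference                      # P-frame, zero motion
        residual = [[frame[r][c] - prediction[r][c] for c in range(N)]
                    for r in range(N)]
        coeffs = dct(residual)                          # transform
        levels = [[round(v / step) for v in row] for row in coeffs]  # quantise
        transmitted.append(levels)                      # "transmit"
        # Reconstruct what the decoder will see and store it as reference.
        decoded = idct([[level * step for level in row] for row in levels])
        reference = [[prediction[r][c] + decoded[r][c] for c in range(N)]
                     for r in range(N)]
    return transmitted, reference


frames = [[[128] * N for _ in range(N)],   # first frame, intra coded
          [[136] * N for _ in range(N)]]   # slightly brighter second frame
transmitted, reference = encode_sequence(frames)
for index, levels in enumerate(transmitted):
    nonzero = sum(1 for row in levels for v in row if v != 0)
    print(f"frame {index}: DC level {levels[0][0]}, "
          f"non-zero coefficients {nonzero}")
```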

Decoding procedure (Figure 2-6):

Decode variable-length codes and extract quantised DCT coefficients.

Re-scale ("inverse quantise") these coefficients.

Perform an inverse DCT to recreate the residual frame.

Add a motion-compensated reference frame to the residual frame.

The result is a reduced-quality version of the original frame.

Figure 2-6 Decoding procedures^[35]
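The decoding steps can be sketched for a single 8 * 8 block as follows. The sketch assumes that the variable-length codes have already been decoded into quantised levels, that a single uniform quantiser step was used, and that the caller supplies the motion-compensated prediction block. The function names are hypothetical.

```python
import math

N = 8


def inverse_dct(coeffs):
    """Orthonormal 8x8 inverse DCT (inverse of the two-dimensional DCT-II)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [[sum(scale(u) * scale(v) * coeffs[u][v]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def decode_block(levels, step, prediction):
    """Re-scale, inverse transform, and add the prediction block."""
    coeffs = [[level * step for level in row] for row in levels]  # re-scale
    residual = inverse_dct(coeffs)                                # inverse DCT
    # Add the prediction and clip to the 8-bit sample range.
    return [[min(255, max(0, round(prediction[r][c] + residual[r][c])))
             for c in range(N)] for r in range(N)]


# A block whose only non-zero level is the DC coefficient.
levels = [[0] * N for _ in range(N)]
levels[0][0] = 4
prediction = [[100] * N for _ in range(N)]

decoded = decode_block(levels, step=8, prediction=prediction)
print(decoded[0])
```

With a DC level of 4 and a step of 8, the re-scaled coefficient is 32, which the orthonormal inverse transform spreads into a flat residual of 4 added to every predicted sample.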

Rate Control: An H.261 stream is typically sent over a constant bit rate channel, such as ISDN (e.g. 128 kbit/s).^[34] However, the output bit rate of an H.261 encoder varies depending on the amount of movement in the scene (Figure 2-7). Rate control is required to fit this varying bit rate onto a constant bit rate channel.

Figure 2-7 Bit rate of the H.261 stream^[34]

The rate-control strategy^[34] ensures that the encoded bit stream is buffered and the buffer is emptied at the constant bit rate of the channel, with the buffer compensating for the variable bit rate by filling and emptying as required.

If scene activity increases, the buffer fills up, and the quantisation step size in the encoder is increased, which increases the compression factor and reduces the output bit rate. If the buffer drains towards empty, the quantisation step size is reduced, which reduces compression and increases the output bit rate.

This algorithm optimises bandwidth usage by trading picture quality against motion, so that a quickly-changing picture will have a lower quality than a relatively static picture in order to guarantee the constant bandwidth.

Figure 2-8 Rate Control
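The feedback between buffer fullness and quantisation step size can be illustrated with a toy simulation. Everything numeric here is an assumption made for illustration: the model that a frame produces activity / step bits, the 25% and 75% buffer thresholds, the step change of 2, the step range of 1 to 31, and the clamping of the buffer on overflow (a real encoder would more likely skip frames).

```python
def simulate_rate_control(activity, channel_bits, buffer_size, step=8):
    """Toy model of buffer-driven rate control.

    activity     -- per-frame scene complexity (arbitrary units)
    channel_bits -- bits drained from the buffer per frame interval
    Returns a list of (bits_produced, buffer_fullness, next_step) tuples.
    """
    fullness = 0
    history = []
    for complexity in activity:
        produced = complexity // step          # coarser step -> fewer bits
        # The channel empties the buffer at a constant rate.
        fullness = max(0, fullness + produced - channel_bits)
        fullness = min(fullness, buffer_size)  # simplification: clamp overflow
        if fullness > 0.75 * buffer_size:
            step = min(31, step + 2)           # buffer filling: compress harder
        elif fullness < 0.25 * buffer_size:
            step = max(1, step - 2)            # buffer draining: raise quality
        history.append((produced, fullness, step))
    return history


# Static scene, then a burst of motion, then static again.
activity = [8000] * 5 + [40000] * 10 + [8000] * 10
history = simulate_rate_control(activity, channel_bits=1000, buffer_size=8000)
for frame, (produced, fullness, step) in enumerate(history):
    print(f"frame {frame:2d}: produced {produced:5d} bits, "
          f"buffer {fullness:4d}, next step {step}")
```

During the burst the step size climbs, so the fast-moving frames are coded more coarsely; once the scene settles and the buffer drains, the step size falls again.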

Last update April 1, 2002