
IEEE SIGNAL PROCESSING MAGAZINE [177] MARCH 2015

backward, biprediction, and symmetric

prediction, using two reference frames.

In a B frame, in addition to the

conventional forward, backward, bi-

directional, and skip/direct prediction

modes, symmetric prediction is defined as a

special biprediction mode, wherein only

one forward motion vector (MV) is coded

and the backward MV is derived from the

forward MV. For an F frame, besides the

conventional single hypothesis prediction

mode in a P frame, multihypothesis tech-

niques are added for more efficient predic-

tion, including the advanced skip/direct

mode [8], temporal multihypothesis predic-

tion mode [9], and spatial directional multi-

hypothesis (DMH) prediction mode [10].

In an F frame, an advanced skip/direct

mode is defined using a competitive

motion derivation mechanism. Two deri-

vation methods are used: one is temporal

and the other is spatial. Temporal multihy-

pothesis mode combines two predictors

along the predefined temporal direction,

while spatial multihypothesis mode com-

bines two predictors along the predefined

spatial direction. For temporal derivation,

the prediction block is obtained by an aver-

age of the prediction blocks indicated by

the MV prediction (MVP) and the scaled

MV in a second reference. The second ref-

erence is specified by the reference index

transmitted in the bit stream. For temporal multihypothesis prediction, as shown
in Figure 4, one predictor, ref_blk1, is generated from the best motion vector MV
and the reference frame ref1 found by motion estimation; this MV is then linearly
scaled to a second reference to generate another predictor, ref_blk2. The second
reference, ref2, is specified by the reference index transmitted in the bit stream. In

DMH mode, as specified in Figure 4, the

seed predictors are located on the line

crossing the initial predictor obtained

from motion estimation. The number of

seed predictors is restricted to eight. If one

seed predictor is selected for combined

prediction, for example “Mode 1,” then the

index of the seed predictor “1” will be sig-

naled in the bit stream.
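
The temporal multihypothesis prediction described above can be sketched as follows. This is a minimal illustration rather than the normative AVS2 process: the function names, the integer-pel MVs, the use of temporal distances (dist1, dist2) for the linear scaling, and the rounding of the average are all assumptions made for the example.

```python
def scale_mv(mv, dist1, dist2):
    """Linearly scale an MV that points to ref1 (temporal distance dist1)
    so that it points to ref2 (temporal distance dist2).
    Assumption: proportional scaling, rounding half up."""
    return tuple((2 * c * dist2 + dist1) // (2 * dist1) for c in mv)


def fetch_block(frame, x, y, mv, size):
    """Copy the size x size block at (x + mv_x, y + mv_y) from a frame
    stored as a list of rows. No boundary handling (assumption)."""
    mx, my = mv
    rows = frame[y + my : y + my + size]
    return [row[x + mx : x + mx + size] for row in rows]


def average_blocks(a, b):
    """Sample-wise average of two predictors with rounding."""
    return [[(p + q + 1) >> 1 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def temporal_multihypothesis(ref1, ref2, x, y, mv, dist1, dist2, size):
    """Combine ref_blk1 (from mv in ref1) with ref_blk2 (scaled mv in ref2)."""
    ref_blk1 = fetch_block(ref1, x, y, mv, size)
    mv2 = scale_mv(mv, dist1, dist2)
    ref_blk2 = fetch_block(ref2, x, y, mv2, size)
    return average_blocks(ref_blk1, ref_blk2)


# Hypothetical 8 x 8 reference frames used only for illustration.
ref1 = [[r * 8 + c for c in range(8)] for r in range(8)]
ref2 = [[100] * 8 for _ in range(8)]
pred = temporal_multihypothesis(ref1, ref2, x=2, y=2, mv=(1, 0),
                                dist1=1, dist2=2, size=2)
```

Only the MV for ref1 and the index of ref2 need to be coded in this scheme; the second MV is derived, which is where the bit saving comes from.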

For spatial derivation, the prediction block may be obtained from one or two
prediction blocks specified by the motion information copied from its spatial
neighboring blocks. The neighboring blocks are illustrated in Figure 5. They are searched in a

predefined order F, G, C, A, B, D, and the

selected neighboring block is signaled in

the bit stream.
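
The neighbor search for spatial derivation can be sketched as below. Only the search order F, G, C, A, B, D comes from the text; the dictionary layout, the rule that a neighbor without motion information is skipped, and the function name are assumptions for illustration.

```python
# Search order stated in the text for the neighbors shown in Figure 5.
SEARCH_ORDER = ("F", "G", "C", "A", "B", "D")


def spatial_candidates(neighbors):
    """Return (name, motion) pairs in the predefined search order.

    `neighbors` maps a neighbor name to its motion information, or to None
    when that neighbor is unavailable (assumption: e.g. intra coded or
    outside the picture)."""
    return [(name, neighbors[name])
            for name in SEARCH_ORDER
            if neighbors.get(name) is not None]


# Hypothetical motion data: each entry is an (mv_x, mv_y) pair.
neighbors = {"A": (1, 2), "C": None, "G": (0, -1), "D": (3, 3)}
candidates = spatial_candidates(neighbors)
```

An encoder would then pick one candidate and signal which neighbor was selected, as the text describes.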

MOTION VECTOR PREDICTION

AND CODING

MVP plays an important role in interprediction: it reduces the redundancy
among the MVs of neighboring blocks and thus saves a large number of MV
coding bits. In AVS2, four different prediction methods are adopted, as
tabulated in Table 2, each with its own use. Spatial MVP is used for the spatial

derivation of Skip/Direct mode in F frames

and B frames. Temporal MVP is used for

temporal derivation of Skip/Direct mode

in P frames and F frames. Spatial-tempo-

ral-combined MVP is used for the joint

temporal and spatial derivation of Skip/

Direct mode in B frames. For other cases,

median prediction is used.
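
Two of the methods in Table 2 are simple enough to sketch directly. The example below is illustrative only: it assumes a component-wise median over three neighboring MVs and represents an unavailable temporal MVP as None. Neither detail is specified in the text, and the function names are hypothetical.

```python
def median_mvp(mvs):
    """Component-wise median of an odd number of neighboring MVs
    (assumption: three neighbors, each an (mv_x, mv_y) pair)."""
    xs = sorted(mv[0] for mv in mvs)
    ys = sorted(mv[1] for mv in mvs)
    mid = len(mvs) // 2
    return (xs[mid], ys[mid])


def combined_mvp(temporal_mvp, spatial_mvp):
    """Spatial-temporal combined MVP: use the temporal MVP when it is
    available, otherwise fall back to the spatial MVP."""
    return temporal_mvp if temporal_mvp is not None else spatial_mvp


mvp_median = median_mvp([(4, -2), (1, 7), (3, 0)])
mvp_with_temporal = combined_mvp((2, 2), (1, 1))
mvp_fallback = combined_mvp(None, (1, 1))
```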

In AVS2, the MV is in quarter-pixel

precision for the luminance component,

and the subpixel is interpolated with an

eight-tap DCT interpolation filter (DCT-IF) [11]. For the chrominance
component, the MV is derived from the luminance MV with 1/8-pixel precision,
and a four-tap DCT-IF is used for subpixel interpolation [12]. After MV
prediction, the MV difference

(MVD) is coded in the bit stream. How-

ever, redundancy may still exist in MVD,

and to further save coding bits of MVs, a

progressive MV resolution adaptation

method is adopted in AVS2 [13]. In this

scheme, the MVP is first rounded to the nearest integer sample position, and
then the MV is rounded to half-pixel precision if its distance from the MVP
is larger than a threshold. Finally, the resolution of the MVD is decreased
to half-pixel precision if it is larger than a threshold.
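
The progressive MV resolution adaptation can be sketched per MV component as follows. This is a hedged illustration, not the method of [13]: the quarter-pel integer units, the single shared threshold, the round-half-up rule, and the function names are assumptions, since the text gives neither the threshold value nor the exact rounding.

```python
QPEL = 4  # quarter-pel units per integer sample
HPEL = 2  # quarter-pel units per half sample


def round_to_multiple(v, step):
    """Round v to the nearest multiple of step (half up), valid for
    negative values because // is floor division."""
    return ((2 * v + step) // (2 * step)) * step


def adapt_mv_component(mv, mvp, threshold):
    """Return (mv, mvd, mvd_step) for one component in quarter-pel units.

    1. Round the MVP to the nearest integer sample position.
    2. Round the MV to half-pel if it lies farther than `threshold`
       from the rounded MVP.
    3. Report the MVD resolution: half-pel (step 2) if the MVD is larger
       than the threshold, otherwise quarter-pel (step 1)."""
    mvp_int = round_to_multiple(mvp, QPEL)
    if abs(mv - mvp_int) > threshold:
        mv = round_to_multiple(mv, HPEL)
    mvd = mv - mvp_int
    mvd_step = HPEL if abs(mvd) > threshold else 1
    return mv, mvd, mvd_step


far = adapt_mv_component(mv=9, mvp=5, threshold=4)
near = adapt_mv_component(mv=6, mvp=5, threshold=4)
```

Because the rounded MVP is a multiple of four and a far MV is a multiple of two, every MVD beyond the threshold is even in this sketch, so it can be represented in half-pel steps without loss.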

TRANSFORM

Two-level transform coding is utilized to

further compress the predicted residual.

For a CU with symmetric prediction unit

partition, the TU size can be 2N × 2N or N × N, signaled by a transform split
flag. Thus, the maximum transform size is 64 × 64, and the minimum transform
size is 4 × 4. For TU sizes from 4 × 4 to 32 × 32, an integer transform (IT)
that closely approximates the performance of the discrete cosine transform
(DCT) is used, while for the 64 × 64 transform, a logical transform (LOT) [14]
is applied to the residual. A five-three-tap integer wavelet transform is
first performed on the 64 × 64 block, the low-high (LH), high-low (HL), and
high-high (HH) bands are discarded, and then a normal 32 × 32 IT is applied
to the low-low (LL) band. For a CU that has an asymmetric PU partition, a
2N × 2N IT is used in the first level and a nonsquare transform [15] is used
in the second level, as shown in Figure 6. Moreover,

in the latest AVS2 standard, a secondary

transform was adopted for intraprediction

residual (for more details see the latest AVS

specification document N2120 on the AVS

FTP Web site [21]).
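
The first stage of the LOT, reducing a 64 × 64 residual block to its 32 × 32 LL band, can be sketched with a 5/3 integer lifting wavelet. The lifting steps and symmetric boundary extension below are those of the well-known reversible 5/3 filter. Whether AVS2 uses exactly these steps, rounding offsets, and boundary rules is an assumption, and the function names are hypothetical.

```python
def lowpass_53(x):
    """Low-pass output of a one-level 5/3 integer lifting transform
    for an even-length list of integers."""
    n = len(x)
    half = n // 2
    # Predict step: detail = odd sample minus floor-average of even neighbors.
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]  # symmetric extension
        d.append(x[2 * i + 1] - ((left + right) >> 1))
    # Update step: smooth = even sample plus rounded quarter of adjacent details.
    s = []
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]  # symmetric extension
        s.append(x[2 * i] + ((d_prev + d[i] + 2) >> 2))
    return s


def ll_band(block):
    """Keep only the LL band: low-pass along rows, then along columns.
    The LH, HL, and HH bands are never computed, i.e. they are discarded."""
    rows = [lowpass_53(r) for r in block]
    cols = [lowpass_53(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


residual = [[5] * 64 for _ in range(64)]  # hypothetical flat residual block
ll = ll_band(residual)                    # 32 x 32 input for the normal IT
```

The resulting 32 × 32 block would then be passed to the ordinary 32 × 32 IT, so no separate 64-point transform kernel is needed.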

ENTROPY CODING

After transform and quantization, a two-

level coding scheme is applied to the

[FIG5] An illustration of neighboring blocks A, B, C, D, F, and G for MVP.

[TABLE 2] MV PREDICTION METHODS IN AVS2.

METHOD                     DETAILS
MEDIAN                     USING THE MEDIAN MV VALUES OF THE NEIGHBORING BLOCKS.
SPATIAL                    USING THE MVs OF SPATIAL NEIGHBORING BLOCKS.
TEMPORAL                   USING THE MVs OF TEMPORAL COLLOCATED BLOCKS.
SPATIAL-TEMPORAL COMBINED  USING THE TEMPORAL MVP FIRST IF IT IS AVAILABLE, AND SPATIAL MVP IS USED INSTEAD IF THE TEMPORAL MVP IS NOT AVAILABLE.
