
[standards IN A NUTSHELL] continued

IEEE SIGNAL PROCESSING MAGAZINE [180] MARCH 2015

of a Gaussian mixture model, etc. In this way, the selected or generated G-picture can represent the background of a scene well, with few occluding foreground objects and little noise. Once a G-picture is obtained, it is encoded, and the reconstructed picture is stored in the background memory of the encoder/decoder and updated only when a new G-picture is selected or generated. After that, S-pictures can be introduced into the encoding process through an S-picture decision.

Except that it uses a G-picture as a reference, the S-picture has properties similar to those of the traditional I-picture, such as error resilience and random access (RA). Therefore, pictures that would otherwise be coded as traditional I-pictures, such as the first picture of a group of pictures or a picture at a scene change, are candidates for S-pictures.
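The S-picture decision described above can be illustrated with a minimal sketch. It assumes a simple rule: any picture that would otherwise be intra coded becomes an S-picture once a G-picture is available in the background memory. The names `BackgroundMemory`, `choose_picture_type`, and the returned labels are hypothetical and are not taken from the AVS2 specification or its reference software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackgroundMemory:
    """Holds the most recent reconstructed G-picture (hypothetical structure)."""
    g_picture: Optional[List[List[int]]] = None

    def update(self, reconstructed_g_picture: List[List[int]]) -> None:
        # The memory changes only when a new G-picture is selected or generated.
        self.g_picture = reconstructed_g_picture


def choose_picture_type(is_gop_start: bool, is_scene_change: bool,
                        memory: BackgroundMemory) -> str:
    """Return 'S', 'I', or 'inter' for the next picture (illustrative rule only)."""
    would_be_intra = is_gop_start or is_scene_change
    if not would_be_intra:
        return "inter"
    # An S-picture needs a G-picture to reference; without one, fall back to I.
    return "S" if memory.g_picture is not None else "I"
```

A real encoder would likely weigh further criteria, such as rate-distortion cost, before committing to an S-picture; the sketch only captures the candidate rule stated in the text.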

Besides offering more prediction opportunities for the background blocks that normally dominate a picture, the background picture brings an additional benefit: a new prediction mode called background difference prediction, shown in Figure 10, which can improve foreground prediction performance by excluding the influence of the background. It can be seen that, after background difference prediction, the background redundancy is effectively removed. Furthermore, according to the prediction modes in the AVS2 compressed bit stream, the blocks of an AVS2 picture can be classified as background blocks, foreground blocks, or blocks on the edge area. This information is helpful for possible subsequent vision tasks such as object detection and tracking. Object-based coding was already proposed in MPEG-4; however, object segmentation remains a challenging problem, which constrains the application of object-based coding. AVS2 therefore uses simple background modeling instead of accurate object segmentation, which is easier and provides a good tradeoff between coding efficiency and complexity.
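To make the idea of background difference prediction concrete, the following sketch subtracts the co-located background from both the current block and its reference block and then predicts one difference from the other. It is an illustration under simplifying assumptions: blocks are small lists of integer samples, transform and quantization are omitted, and the function names are hypothetical rather than part of AVS2.

```python
def subtract(a, b):
    """Element-wise a - b for two equally sized 2-D blocks."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def add(a, b):
    """Element-wise a + b for two equally sized 2-D blocks."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def encode_block(current, reference, bg_current, bg_reference):
    """Residual after background difference prediction (no transform/quantization)."""
    current_diff = subtract(current, bg_current)        # foreground part of current block
    reference_diff = subtract(reference, bg_reference)  # foreground part of reference block
    return subtract(current_diff, reference_diff)


def decode_block(residual, reference, bg_current, bg_reference):
    """Invert encode_block: rebuild the current block from the residual."""
    reference_diff = subtract(reference, bg_reference)
    return add(add(residual, reference_diff), bg_current)


# A foreground sample (+100) moves from a bright background area to a dark one.
bg_current = [[10, 10], [10, 10]]
bg_reference = [[50, 50], [50, 50]]
current = [[110, 10], [10, 10]]
reference = [[150, 50], [50, 50]]

plain_residual = subtract(current, reference)
bg_diff_residual = encode_block(current, reference, bg_current, bg_reference)
```

In this toy case the plain residual still carries the background change in every sample, while the background difference residual is all zeros, which is the kind of redundancy removal the text describes.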

To support applications such as event detection and searching, AVS2 adds novel high-level syntax to describe the region of interest (ROI). The region extension includes the region number, event ID, and coordinates of the top-left and bottom-right corners, which indicate which ROI it is, what event happened, and where it lies.
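The fields carried by the region extension can be pictured as a small record. The sketch below is only a possible in-memory representation after parsing; the field names, the inclusive coordinate convention, and the helper `regions_for_event` are assumptions, and the actual bit-stream syntax and field widths are not given here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RegionOfInterest:
    region_number: int            # which ROI this is
    event_id: int                 # what event happened
    top_left: Tuple[int, int]     # (x, y) of the top-left corner
    bottom_right: Tuple[int, int] # (x, y) of the bottom-right corner

    def contains(self, x: int, y: int) -> bool:
        # Assumes both corners are inclusive; the text does not say.
        return (self.top_left[0] <= x <= self.bottom_right[0]
                and self.top_left[1] <= y <= self.bottom_right[1])


def regions_for_event(regions: List[RegionOfInterest],
                      event_id: int) -> List[RegionOfInterest]:
    """Return the ROIs tagged with a given event ID, e.g., for event search."""
    return [r for r in regions if r.event_id == event_id]
```

A search application could call `regions_for_event` on the ROIs parsed from each picture and then use `contains` to relate detected objects to the signaled regions.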

PERFORMANCE COMPARISON

The major target applications of AVS2 are high-quality TV broadcasting and scene videos. For high-quality broadcasting, RA is necessary and may be achieved by inserting intraframes at a fixed interval, e.g., 0.5 s. For high-quality video capture and editing, all-intra coding (AI) is required. For scene video applications, e.g., video surveillance or videoconferencing, low delay (LD) needs to be guaranteed. According to the applications, we tested

[FIG11] A performance comparison between AVS2 and HEVC for surveillance videos: (a) main road and (b) over a bridge. Each panel plots PSNR (dB) against bit rate (kb/s) for AVS2 and HEVC.

[TABLE 3] BIT RATE SAVING OF AVS2 PERFORMANCE COMPARISON WITH AVS1 AND HEVC.

           CONFIGURATION
SEQUENCES  AVS2 VERSUS AVS1  AVS2 VERSUS HEVC  AVS2 VERSUS AVS1  AVS2 VERSUS HEVC  AVS2 VERSUS HEVC
UHD        31.2%             2.4%              50.3%             −0.4%
1080P      33%               0.8%              50.3%             0.3%
1200P                                                                              37.9%
SD                                                                                 26.2%
OVERALL    32.1%             1.6%              50.3%             −0.1%             32.1%

AVS2 HAS BEEN DEVELOPED IN ACCORDANCE WITH AVS AND IEEE IPR POLICIES TO ENSURE RAPID LICENSING OF ESSENTIAL PATENTS AT COMPETITIVE ROYALTY RATES.
