Furthermore, adding the reverberation of the sources (or of the loudspeaker signals, in virtualization of multichannel loudspeaker signals) can also improve the realism of the reproduced sound scene [10]. Therefore, in virtualization it is quite common to use BRIRs [1], [5], which encapsulate both HRTFs and reverberation. Accordingly, selecting the correct amount of early reflections as well as late reverberation is critical to recreating a faithful sound environment [1]. In general, a BRIR that matches the sound environment of the scene, or the BRIR of a mixing studio, is considered more suitable [4]. As discussed in the section "Challenges," natural sound rendering requires the accurate reproduction of both the sound sources and the sound environment. Compared to the virtualization of multichannel loudspeaker signals [Figure 2(a)], virtualizing the source and environment signals separately [Figure 2(b)] is more desirable, as it is closer to natural listening [6], [8], [9]. These virtualization techniques can also be incorporated into spatial audio coding systems such as binaural cue coding [11], spatial audio scene coding [5], and directional audio coding [3].
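As a rough sketch of this processing chain, the Python snippet below convolves each source with a left- and right-ear HRIR/BRIR and adds the (diffuse) environment signals directly, as in Figure 2(b); the function name, argument layout, and use of SciPy are illustrative assumptions, not part of the article or of any particular toolbox.

```python
import numpy as np
from scipy.signal import fftconvolve

def virtualize(sources, hrirs_left, hrirs_right, env_left=None, env_right=None):
    """Render K mono sources binaurally: each source is convolved with its
    left-/right-ear HRIR (or BRIR) and the results are summed; diffuse
    environment signals, if given, are added directly to each ear."""
    # output length: longest source-HRIR convolution
    n_out = max(len(s) + max(len(hl), len(hr)) - 1
                for s, hl, hr in zip(sources, hrirs_left, hrirs_right))
    y_left, y_right = np.zeros(n_out), np.zeros(n_out)
    for s, h_l, h_r in zip(sources, hrirs_left, hrirs_right):
        bl = fftconvolve(s, h_l)   # source filtered with left-ear HRIR/BRIR
        br = fftconvolve(s, h_r)   # source filtered with right-ear HRIR/BRIR
        y_left[:len(bl)] += bl
        y_right[:len(br)] += br
    for env, y in ((env_left, y_left), (env_right, y_right)):
        if env is not None:        # environment signals bypass the HRTFs
            m = min(len(env), n_out)
            y[:m] += env[:m]
    return y_left, y_right
```

For the multichannel case of Figure 2(a), the same routine applies with the loudspeaker signals x_m(n) in place of the sources and no separate environment signals.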
In virtualization, the directions of the sources [or of the loudspeakers, in virtualization of multichannel loudspeaker signals as in Figure 2(a)] have to be adjusted according to the listener's head movements, as in natural listening. To fulfill this need, the HRTFs/BRIRs used in the virtualization are updated on the fly based on these head movements, which are typically tracked by a sensor (e.g., an accelerometer, gyroscope, or camera). The latency between head tracking and sound rendering should be low enough that localization accuracy is not affected [12]. When incorporated in the virtualization process, such a head-tracking system provides useful dynamic cues that help resolve localization conflicts [1] and enhance natural sound rendering [10], [12]. Note that head tracking is more critical for directional sources and less important for diffuse signals, such as the environment signals and late reverberation, because the perception of diffuse signals is less affected by head movements [12].
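A minimal sketch of this update step is given below, assuming an HRTF set measured on a horizontal grid of azimuths and a tracker that reports head yaw; the dictionary layout and azimuth-only convention are illustrative assumptions, not a prescription from the article.

```python
def select_hrtf_pair(hrtf_set, source_azimuth_deg, head_yaw_deg):
    """Update the rendered direction of one source from the tracked head yaw.
    hrtf_set: dict mapping measured azimuth (degrees) -> (hrir_left, hrir_right).
    Turning the head by +yaw makes a source appear at (source azimuth - yaw)
    relative to the head, so the HRTF pair is reselected accordingly."""
    relative_az = (source_azimuth_deg - head_yaw_deg) % 360.0

    # nearest measured direction; a real renderer would interpolate between
    # neighboring HRTFs rather than snap to the closest one
    def angular_dist(az):
        d = abs(az - relative_az) % 360.0
        return min(d, 360.0 - d)

    nearest = min(hrtf_set, key=angular_dist)
    return hrtf_set[nearest]
```

In practice, such filter updates are typically crossfaded between processing blocks to avoid audible discontinuities, and the tracker-to-audio latency is kept low enough that localization accuracy is not affected [12].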
Recreating the perception of source distance close to that of natural listening is another critical aspect of virtualization for natural sound rendering. However, the challenges in simulating accurate distance perception are numerous. Human listeners' ability to estimate distances has long been known to be poorer than their ability to estimate directions, even in a physical listening space [1]. Virtual listening through headphones
[FIG2] Virtualization of (a) multichannel loudspeaker signals $x_m(n)$ [5] and (b) multiple sources $s_k(n)$ and environment signals $a_L(n)$, $a_R(n)$; $y_L(n)$ and $y_R(n)$ are the signals sent to the left and right ear, respectively. Note that head tracking can be used to update the selected directions of HRTFs/binaural room impulse responses (BRIRs).

[Figure 2 shows two block diagrams. In (a), the M multichannel signals $x_1(n), \ldots, x_M(n)$ are filtered with the HRTF/BRIR pairs $h_{x_m L}(n)$, $h_{x_m R}(n)$ of the virtual loudspeaker positions and summed to form $y_L(n)$ and $y_R(n)$. In (b), the K sources $s_1(n), \ldots, s_K(n)$ are filtered with the HRTF/BRIR pairs $h_{s_k L}(n)$, $h_{s_k R}(n)$ of the virtual source positions, the environment signals $a_L(n)$, $a_R(n)$ are added, and the result forms $y_L(n)$ and $y_R(n)$. Head tracking drives the HRTF/BRIR selection in both diagrams.]
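The block diagrams amount to the following mixing relations (inferred from the figure labels rather than stated explicitly in the text, with $*$ denoting convolution): for Figure 2(a), $y_L(n) = \sum_{m=1}^{M} h_{x_m L}(n) * x_m(n)$ and $y_R(n) = \sum_{m=1}^{M} h_{x_m R}(n) * x_m(n)$; for Figure 2(b), $y_L(n) = \sum_{k=1}^{K} h_{s_k L}(n) * s_k(n) + a_L(n)$ and $y_R(n) = \sum_{k=1}^{K} h_{s_k R}(n) * s_k(n) + a_R(n)$.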
A personalized listening experience and incorporating information about the listening environment have also been trends in the headphone industry. These trends share one common objective: to render natural sound in headphones.