IEEE SIGNAL PROCESSING MAGAZINE [39] MARCH 2015

depicted in Figure 6(a). To reproduce the diffuse sound, the signals $Y_{\mathrm{d},i}(k,n)$ are decorrelated such that $Y_{\mathrm{d},i}(k,n)$ and $Y_{\mathrm{d},j}(k,n)$ for $i \neq j$ are uncorrelated [29]. Note that the lower the correlation between the loudspeaker channels, the more enveloping the perceived sound is. The described processing for synthesizing the loudspeaker signals is depicted in Figure 5(b).
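As a rough illustration of what decorrelating the diffuse channels means, the sketch below applies an independent random phase per frequency bin and per loudspeaker channel to a toy diffuse STFT signal and then measures the residual inter-channel correlation. The random-phase decorrelator, the function names `decorrelate` and `correlation`, and the toy signal are assumptions made for this example; this is not the decorrelation method of [29].

```python
import numpy as np

def decorrelate(Y_d, num_channels, seed=0):
    """Return num_channels decorrelated copies of a diffuse STFT signal.

    Y_d has shape (K, N): K frequency bins, N time frames.
    ASSUMPTION: a fixed random phase per bin and channel stands in for
    the decorrelator; the source does not specify the actual filters.
    """
    rng = np.random.default_rng(seed)
    K = Y_d.shape[0]
    # One phase per (channel, bin), constant over time frames.
    phases = rng.uniform(-np.pi, np.pi, size=(num_channels, K, 1))
    return Y_d[np.newaxis, :, :] * np.exp(1j * phases)

def correlation(a, b):
    """Magnitude of the normalized cross-correlation of two STFT signals."""
    num = np.abs(np.vdot(a, b))  # vdot flattens and conjugates its first argument
    return num / np.sqrt(np.vdot(a, a).real * np.vdot(b, b).real)

# Toy diffuse signal: complex Gaussian noise, 512 bins x 64 frames.
rng = np.random.default_rng(1)
Y_d = rng.standard_normal((512, 64)) + 1j * rng.standard_normal((512, 64))

Y = decorrelate(Y_d, num_channels=2)
rho = correlation(Y[0], Y[1])
print(f"inter-channel correlation magnitude: {rho:.3f}")
```

Each channel keeps the magnitude spectrum of the diffuse signal, so the diffuse power is unchanged, while the correlation between channels drops close to zero.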

When sound scene manipulation, such as directional filtering [10] and dereverberation [11], is also desired, an additional gain $B(k,n,\theta)$ can be applied to modify the direct signal. In this case, the $i$th loudspeaker channel gain for the direct sound can be expressed as

$$G_i(k,n) = P_i(k,n,\theta)\,B(k,n,\theta), \tag{10}$$

where $B(k,n,\theta)$ is the desired gain for the sound arriving from $\theta(k,n)$. In principle, $B(k,n,\theta)$ can be defined freely to provide any desired directivity pattern; an example directivity gain function is shown in Figure 6(a). In addition, the diffuse sound gain $Q(k)$ can be adjusted to control the level of reproduced ambient sound. For instance, dereverberation is achieved by selecting $Q(k) < 1$.
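The product in (10) can be sketched per time-frequency bin as follows. Both gain functions here are hypothetical stand-ins: `panning_gain_right` is a simple constant-power pan that assumes loudspeakers at ±30° with positive angles pointing right, and `directivity_gain` is a raised-cosine window around a chosen target direction. Neither is taken from the article, and both are frequency-independent for brevity.

```python
import numpy as np

def panning_gain_right(theta_deg, half_span_deg=30.0):
    """HYPOTHETICAL constant-power panning gain for the right loudspeaker.

    Assumes loudspeakers at +/- half_span_deg and positive angles to the right.
    """
    p = np.clip((theta_deg + half_span_deg) / (2.0 * half_span_deg), 0.0, 1.0)
    return np.sin(0.5 * np.pi * p)

def directivity_gain(theta_deg, target_deg, width_deg=20.0, floor=0.1):
    """HYPOTHETICAL B: unity at target_deg, raised-cosine decay down to `floor`."""
    x = np.clip(np.abs(theta_deg - target_deg) / width_deg, 0.0, 1.0)
    return floor + (1.0 - floor) * 0.5 * (1.0 + np.cos(np.pi * x))

def direct_gain(theta_deg, target_deg):
    """Eq. (10): G_i = P_i * B, evaluated element-wise over the DOA map."""
    return panning_gain_right(theta_deg) * directivity_gain(theta_deg, target_deg)

# Toy DOA estimates theta(k, n) in degrees, shape (K, N) = (2, 2).
theta = np.array([[-20.0, 10.0],
                  [10.0, -20.0]])

G_right = direct_gain(theta, target_deg=10.0)
print(G_right)
```

Bins whose estimated DOA matches the target direction keep the plain panning gain, while bins dominated by sound from other directions are attenuated toward `floor` times the panning gain.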

The results for the considered teleconferencing scenario are illustrated in Figure 6. Depicted in Figure 6(a)–(c) are the gain functions, the spectrogram of an input signal, and the DOAs estimated using ESPRIT. Figure 6(d) and (e) illustrate the spatial reproduction and depict the panning gains $P_{\mathrm{right}}(k,n,\theta)$ used for the right loudspeaker and the spectrogram of the resulting signal. Lower weights can be observed when the source on the left side is active than when the source on the right is active, which is expected from the panning curve $P_{\mathrm{right}}(k,n,\theta)$ depicted in Figure 6(a). Note that the exact values for the respective DOAs should be $P_{\mathrm{right}} = 0.26$ for -20° and $P_{\mathrm{right}} = 0.86$ for 10°. Next we illustrate an example of sound scene manipulation. If the listener prefers to extract the signal of the talker sitting on a sofa, while reducing the other talker, a suitable gain function $B(k,n,\theta)$ can be designed to preserve the sounds coming from the sofa and attenuate sounds arriving from other directions; an example of such a gain function is shown in Figure 6(a). Additionally, setting the diffuse gain to a low value, for example $Q(k) = 0.25$, reduces the power level of the diffuse sound, thereby increasing the SDR during reproduction. The spectrogram of the manipulated output signal is shown in Figure 6(f), where the power of both the interfering talker and the reverberation is significantly reduced.
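To put the quoted gains on a decibel scale, the short calculation below converts them, assuming that the panning gains and $Q(k)$ act on signal amplitudes (that is, they multiply the STFT coefficients). The variable names are chosen for this example only.

```python
import math

def gain_to_db(gain):
    """Convert an amplitude gain to decibels (ASSUMES the gain scales amplitude)."""
    return 20.0 * math.log10(gain)

p_right_left_talker = 0.26   # quoted panning gain for the DOA of -20 degrees
p_right_right_talker = 0.86  # quoted panning gain for the DOA of 10 degrees
q = 0.25                     # quoted diffuse gain

# Level difference between the two talkers in the right loudspeaker channel.
level_difference_db = gain_to_db(p_right_right_talker / p_right_left_talker)

# Change in diffuse sound level caused by the diffuse gain.
diffuse_change_db = gain_to_db(q)

print(f"talker level difference: {level_difference_db:.2f} dB")
print(f"diffuse level change:    {diffuse_change_db:.2f} dB")
```

Under this assumption, the talker at 10° is reproduced roughly 10.4 dB louder than the talker at -20° in the right channel, and the diffuse sound is lowered by about 12 dB. If the direct gain is left unchanged, the SDR rises by the same 12 dB.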

VIRTUAL CLASSROOM

The geometric model with IPLS positions as parametric information can facilitate assisted listening by creating binaural signals for any desired position in the acquired sound scene, regardless of where the microphone arrays are located. Let us consider the virtual classroom scenario in Figure 7 as an example, although the same concept also applies to other applications such as teleconference systems in dedicated rooms, assisted listening in museums, augmented reality, and many others. A teacher tutors in a typical classroom environment, where only some students are physically present, while the rest participate in the class remotely, for example, from home. As illustrated in Figure 7, the sound scene is captured using several distributed microphone arrays with known positions. The goal is to assist a remote student to virtually participate in a class from his preferred position, for instance close to the teacher, in between the teacher and another student involved in the discussion, or at his favorite desk, by synthesizing the binaural signals for the desired virtual listener (VL) location $\mathbf{d}_{\mathrm{VL}}$. These binaural signals are generated at the reproduction side based on the received audio and position information, such that the student can listen to the synthesized sound over headphones on a laptop or any mobile device that can play multimedia content.

The processing to achieve this goal is in essence similar to that utilized in the virtual microphone (VM) technique [12], [27], [30], where the goal was to generate the signal of a VM that sounds perceptually similar to the signal that would be recorded with a physical microphone located at the same position. The technique has been shown to synthesize VM signals successfully at arbitrary positions in a room [27], [30]. However, in the virtual classroom application, instead of generating the signals of nonexisting microphones with physical characteristics, we directly aim to generate the binaural signals for headphone reproduction. The overall gain for the direct sound in the $i$th channel can be divided into three components:

$$G_i(k,n) = D_s(k,n)\,H_{\mathrm{HRTF},i}(k,n)\,B(k,n,\mathbf{d}_{\mathrm{IPLS}}). \tag{11}$$
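The factorization in (11) can be sketched for a single time-frequency bin and one ear as follows. The distance model is an assumption made for this example: the compensation factor is taken as the ratio of the IPLS-to-reference distance to the IPLS-to-VL distance, as for a 1/r amplitude decay of a spherical wave, and propagation delay is ignored. The HRTF value, the positions, and the function names are hypothetical placeholders.

```python
import numpy as np

def distance_compensation(d_ipls, d_vl, d_ref):
    """ASSUMED 1/r spreading model for the propagation compensation factor.

    Returns (distance IPLS -> reference) / (distance IPLS -> virtual listener).
    """
    d_ipls = np.asarray(d_ipls, dtype=float)
    r_ref = np.linalg.norm(d_ipls - np.asarray(d_ref, dtype=float))
    r_vl = np.linalg.norm(d_ipls - np.asarray(d_vl, dtype=float))
    return r_ref / r_vl

def direct_gain(d_s, h_hrtf, b):
    """Eq. (11): product of the three gain components for one ear and one bin."""
    return d_s * h_hrtf * b

# Hypothetical 2-D positions in meters.
d_ipls = [2.0, 0.0]   # estimated IPLS position for this bin
d_ref = [0.0, 0.0]    # position where the direct signal was estimated
d_vl = [1.0, 0.0]     # desired virtual listener position

D_s = distance_compensation(d_ipls, d_vl, d_ref)
h_left = 0.8 * np.exp(-1j * 0.3)  # placeholder HRTF value for the left ear
B = 1.0                           # no additional spatial weighting

G_left = direct_gain(D_s, h_left, B)
print(D_s, abs(G_left))
```

In this toy setup the virtual listener is half as far from the source as the reference position, so the direct sound is amplified by a factor of two before the HRTF and the spatial gain are applied.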

The first gain $D_s(k,n)$ is a factor compensating for the wave propagation from $\mathbf{d}_{\mathrm{IPLS}}$ to the VL position $\mathbf{d}_{\mathrm{VL}}$, and from $\mathbf{d}_{\mathrm{IPLS}}$ to $\mathbf{d}$ for the direct signal estimated at the reference

[FIG7] A virtual classroom scenario. (Figure labels: Virtual Classroom; Virtual User A; Virtual User B; Virtual User C.)
