IEEE SIGNAL PROCESSING MAGAZINE [39] MARCH 2015
depicted in Figure 6(a). To reproduce the diffuse sound, the signals
$Y_{\mathrm{d},i}(k,n)$ are decorrelated such that $Y_{\mathrm{d},i}(k,n)$ and $Y_{\mathrm{d},j}(k,n)$ for
$i \neq j$ are uncorrelated [29]. Note that the lower the correlation between
the loudspeaker channels, the more enveloping the perceived
sound is. The described processing for synthesizing the loud-
speaker signals is depicted in Figure 5(b).
When sound scene manipulation, such as directional filtering
[10] and dereverberation [11], is also desired, an additional gain
$B(k,n,\theta)$ can be applied to modify the direct signal. In this case,
the $i$th loudspeaker channel gain for the direct sound can be
expressed as
$$G_i(k,n) = P_i(k,n,\theta)\,B(k,n,\theta), \tag{10}$$
where $B(k,n,\theta)$ is the desired gain for the sound arriving from
$\theta(k,n)$. In principle, $B(k,n,\theta)$ can be defined freely to provide
any desired directivity pattern; an example directivity gain function
is shown in Figure 6(a). In addition, the diffuse sound gain
$Q_i(k)$
can be adjusted to control the level of reproduced ambient sound.
For instance, dereverberation is achieved by selecting
$Q_i(k) < 1$.
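As a minimal sketch of how the direct-sound gain in (10) could be evaluated per time-frequency bin, the following Python snippet multiplies a panning gain by a directivity gain. The function names, the sine-shaped panning curve, the rectangular directivity window, and all parameter values are illustrative assumptions; they are not the curves shown in Figure 6(a).

```python
import math

def panning_gain_right(theta_deg, spread_deg=30.0):
    """Hypothetical panning curve P_right for the right loudspeaker.

    Maps a DOA in [-spread_deg, +spread_deg] (negative = left) to a gain
    between 0 and 1. Illustrative stand-in, not the article's curve.
    """
    clipped = max(-spread_deg, min(spread_deg, theta_deg))
    # 0 at the far left, 1 at the far right, smooth in between.
    return math.sin(0.25 * math.pi * (clipped / spread_deg + 1.0))

def directivity_gain(theta_deg, look_deg, width_deg=15.0, floor=0.1):
    """Hypothetical B(k, n, theta): unity near look_deg, `floor` elsewhere."""
    return 1.0 if abs(theta_deg - look_deg) <= width_deg else floor

def direct_gain(theta_deg, look_deg):
    """Eq. (10) for one bin: G_i = P_i * B, both evaluated at the bin's DOA."""
    return panning_gain_right(theta_deg) * directivity_gain(theta_deg, look_deg)

# Toy DOA map theta(k, n) in degrees: rows are frequency bins k, columns are frames n.
doa = [[-20.0, 10.0],
       [10.0, -20.0]]
gains = [[direct_gain(theta, look_deg=10.0) for theta in row] for row in doa]
```

Bins whose DOA falls inside the assumed window around 10° keep their panning gain, while bins from other directions are scaled down by the assumed floor of 0.1.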
The results for the considered teleconferencing scenario are
illustrated in Figure 6. Depicted in Figure 6(a)–(c) are the gain func-
tions, the spectrogram of an input signal, and the DOAs estimated
using ESPRIT. Figure 6(d) and (e) illustrate the spatial reproduction
and depict the panning gains $P_{\mathrm{right}}(k,n,\theta)$ used for the right
loudspeaker and the spectrogram of the resulting signal. Lower weights
can be observed when the source on the left side is active than when
the source on the right is active, which is expected from the panning curve
$P_{\mathrm{right}}(k,n,\theta)$ depicted in Figure 6(a). Note that the exact values for
the respective DOAs should be $P_{\mathrm{right}} = 0.26$ for -20° and
$P_{\mathrm{right}} = 0.86$ for 10°. Next we illustrate an example of sound scene
manipulation. If the listener prefers to extract the signal of the
talker sitting on a sofa, while reducing the other talker, a suitable
gain function
$B(k,n,\theta)$ can be designed to preserve the sounds
coming from the sofa and attenuate sounds arriving from other
directions; an example of such a gain function is shown in
Figure 6(a). Additionally, setting the diffuse gain to a low value, for
example $Q_i(k) = 0.25$, reduces the power level of the diffuse
sound, thereby increasing the SDR during reproduction. The spec-
trogram of the manipulated output signal is shown in Figure 6(f),
where the power of both the interfering talker and the reverberation is
significantly reduced.
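To make the effect of the diffuse gain concrete, the short Python sketch below computes the change in signal-to-diffuse ratio (SDR) when the diffuse sound is scaled by 0.25. It assumes that the diffuse gain acts as an amplitude factor (so the diffuse power scales with its square) and that the direct sound of the preserved talker is left unchanged; the function names and the unit input powers are hypothetical.

```python
import math

def sdr_db(direct_power, diffuse_power):
    """Signal-to-diffuse ratio in dB for given direct and diffuse powers."""
    return 10.0 * math.log10(direct_power / diffuse_power)

def sdr_after_diffuse_gain(direct_power, diffuse_power, q):
    """SDR after scaling the diffuse signal by an amplitude gain q.

    Assumption: an amplitude gain q scales the diffuse power by q**2,
    and the direct power is untouched.
    """
    return sdr_db(direct_power, (q ** 2) * diffuse_power)

before = sdr_db(1.0, 1.0)                        # 0 dB for equal powers
after = sdr_after_diffuse_gain(1.0, 1.0, 0.25)   # diffuse power scaled by 1/16
improvement = after - before                     # about 12.04 dB
```

Under these assumptions the improvement depends only on the gain: it equals $-20\log_{10}(0.25) \approx 12$ dB regardless of the initial SDR.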
VIRTUAL CLASSROOM
The geometric model with IPLS positions as parametric informa-
tion can facilitate assisted listening by creating binaural signals for
any desired position in the acquired sound scene, regardless of
where the microphone arrays are located. Let us consider the vir-
tual classroom scenario in Figure 7 as an example, although the
same concept also applies to other applications such as teleconfer-
ence systems in dedicated rooms, assisted listening in museums,
augmented reality, and many others. A teacher tutors in a typical
classroom environment, where only some students are physically
present, while the rest participate in the class remotely, for example,
from home. As illustrated in Figure 7, the sound scene is
captured using several distributed microphone arrays, with known
positions. The goal is to assist a remote student to virtually partici-
pate in a class from his preferred position, for instance close to the
teacher, in between the teacher and another student involved in
the discussion, or at his favorite desk, by synthesizing the binaural
signals for the desired virtual listener (VL) location
$\mathbf{d}_{\mathrm{VL}}$. These binaural
signals are generated at the reproduction side based on the
received audio and position information, such that the student
could listen to the synthesized sound over headphones on a laptop
or any mobile device that can play multimedia content.
The processing to achieve this goal is in essence similar to that
utilized in the virtual microphone (VM) technique [12], [27], [30],
where the goal was to generate the signal of a VM that sounds per-
ceptually similar to the signal that would be recorded with a
physical microphone located at the same position. The technique
has been shown to successfully synthesize the VM signals
at arbitrary positions in a room [27], [30]. However, in the virtual
classroom application, instead of generating the signals of
nonexisting microphones with physical characteristics, we
directly aim to generate the binaural signals for headphone
reproduction. The overall gain for the direct sound in the $i$th
channel can be divided into three components:
$$G_i(k,n) = D_{\mathrm{s}}(k,n)\,H_{\mathrm{HRTF},i}(k,n)\,B(k,n,\mathbf{d}_{\mathrm{IPLS}}). \tag{11}$$
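The product in (11) can be sketched in a few lines of Python. The closed form of the propagation-compensation factor is not given in this passage, so the sketch assumes an amplitude-only model based on 1/r spherical spreading (the ratio of the distance from the IPLS to the reference microphone over the distance from the IPLS to the virtual listener). The function names, the minimum-distance guard, and the placeholder HRTF value are likewise hypothetical.

```python
import math

def propagation_compensation(d_ipls, d_vl, d_ref, min_distance=0.1):
    """Assumed amplitude-only D_s based on 1/r spherical spreading.

    Returns r_ref / r_vl, so a virtual listener closer to the source than
    the reference microphone receives a louder direct sound. The guard
    `min_distance` (metres) avoids division by very small distances.
    """
    r_vl = max(math.dist(d_ipls, d_vl), min_distance)
    r_ref = max(math.dist(d_ipls, d_ref), min_distance)
    return r_ref / r_vl

def binaural_direct_gain(d_ipls, d_vl, d_ref, hrtf_i, b=1.0):
    """Eq. (11) for one bin: G_i = D_s * H_HRTF,i * B.

    `hrtf_i` is a complex HRTF value for ear i (placeholder here);
    `b` is the optional manipulation gain B(k, n, d_IPLS).
    """
    return propagation_compensation(d_ipls, d_vl, d_ref) * hrtf_i * b

# Toy geometry in metres: source at the origin, listener 1 m away,
# reference microphone 2 m away; the HRTF value is a made-up placeholder.
g_left = binaural_direct_gain((0.0, 0.0), (1.0, 0.0), (2.0, 0.0), hrtf_i=0.5 + 0.5j)
```

In this toy geometry the listener is half as far from the source as the reference microphone, so the assumed compensation factor doubles the amplitude before the HRTF is applied.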
The first gain $D_{\mathrm{s}}(k,n)$ is a factor compensating for the wave
propagation from $\mathbf{d}_{\mathrm{IPLS}}$ to the VL position $\mathbf{d}_{\mathrm{VL}}$,
and from $\mathbf{d}_{\mathrm{IPLS}}$ to $\mathbf{d}_1$ for the direct signal estimated at the reference
[FIG7] A virtual classroom scenario.