IEEE SIGNAL PROCESSING MAGAZINE [41] MARCH 2015
estimates and degree of diffuseness. The evaluation of different
setups with one desired talker and one interfering talker dem-
onstrated that an improvement in the speech reception thresh-
old (SRT) between 4 and 24 dB could be obtained.
In Figure 8, a general parametric spatial sound processing
scheme is illustrated, where spatial analysis provides the DOA esti-
mates, and the direct and diffuse sound estimates for the left and
right ear are obtained using different (left or right) reference micro-
phones. The left (and right) output signal can then be computed
using (3) with
$$G_i(k, n) = B_i(k, n, \cdot)\, H_{\mathrm{ex}}(k) \quad \text{for } i \in \{\text{left}, \text{right}\},$$
where $B_i(k, n, \cdot)$ defines the desired spatial response that depends
on the listening mode, $H_{\mathrm{ex}}(k)$ helps to externalize sounds, and
$Q_i(k) = c(k)\, H_{\mathrm{ex}}(k)$, where $0 \le c(k) < 1$ is a constant used to
reduce the diffuse sound and hence increase the SDR at the output.
At the cost of an increase in computational complexity and memory
use, the proposed scheme can fully exploit all microphones.
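The gain rules above can be sketched for a single time-frequency bin as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the range check assumes that c(k) lies in [0, 1), and (3) is assumed to be a weighted sum of the direct and diffuse sound estimates, since that equation is not reproduced in this section.

```python
def binaural_gains(b_left, b_right, h_ex, c):
    """Return {ear: (G_i, Q_i)} for one frequency bin k and frame n.

    b_left, b_right: desired spatial responses B_i for the two ears
    h_ex:            externalization filter value H_ex(k)
    c:               diffuse attenuation constant, assumed 0 <= c < 1
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must satisfy 0 <= c < 1")
    gains = {}
    for ear, b in (("left", b_left), ("right", b_right)):
        g = b * h_ex  # direct-sound gain G_i(k, n)
        q = c * h_ex  # diffuse-sound gain Q_i(k)
        gains[ear] = (g, q)
    return gains


def synthesize_bin(x_direct, x_diffuse, g, q):
    """Assumed form of (3): Y_i = G_i * X_direct,i + Q_i * X_diffuse,i."""
    return g * x_direct + q * x_diffuse


# Hypothetical values: pass the left-ear response, attenuate the right-ear one.
gains = binaural_gains(b_left=1.0, b_right=0.25, h_ex=1.0, c=0.5)
g_left, q_left = gains["left"]
# Direct and diffuse estimates referenced to the left microphone.
y_left = synthesize_bin(2 + 0j, 1 + 0j, g_left, q_left)
```

Because the direct and diffuse estimates are taken with respect to different reference microphones, `synthesize_bin` would be called once per ear with that ear's own estimates. A smaller `c` attenuates the diffuse sound more strongly relative to the direct sound.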
While many more examples can be found in the literature, it
can readily be seen that parametric spatial sound processing,
using either geometrically or psychoacoustically motivated
parametric models, provides a flexible and efficient way to achieve
directional filtering. The limited improvement in terms of the
SRT reported in [14] could be related to the inherent tradeoff
between interference reduction and speech distortion found in
most single-channel processing techniques. Further research is
required to develop robust and efficient parameter estimators for
this application and to study the impact on the SRT. More
advanced schemes to modify the spatial response and the DOAs
based on the listening mode and the listening situation could be
realized using the processing scheme depicted in Figure 8.
CONCLUSIONS
Parametric models have been shown to provide an efficient way to
describe sound scenes. While in earlier work multiple microphones
were only used to estimate the geometric model parameters, in
more recent work it has been shown that they can also be used to
estimate the direct and diffuse sound components. As the latter esti-
mates are more accurate than single-channel estimates, the sound
quality of the overall system is increased, for instance, by avoiding
decorrelating the direct sound that may partially leak into the dif-
fuse sound estimate in single-channel extraction. Depending on the
application, the estimated components and parameters can be
manipulated before computing one or more output signals by mix-
ing the components together based on the parametric side informa-
tion. In a spatial audio communication scenario in which the direct
and diffuse signals as well as the parameters are transmitted to the
far-end side, it is possible to determine at the receiver side which
sounds to extract and how to accurately reproduce the recorded spa-
tial sounds over loudspeakers or headphones. By using the position-
based model, we have shown how binaural signals can be
synthesized at the receiver side that correspond to a desired listen-
ing position on the recording side. Finally, we have described how
parametric spatial sound processing can be applied to binaural hear-
ing aids to achieve both directional filtering and dereverberation.
To date, the majority of the geometric models assume that at
most one direct sound is active per time-frequency instance. Extensions
of these models are currently under development where multi-
ple direct sound components plus diffuse sound components
coexist in a single time-frequency instance [23]. Preliminary
results have shown that this model can help to further improve
the spatial selectivity and sound quality.
We hope that this unified perspective on parametric spatial
sound processing helps readers to approach other problems
encountered in assisted listening from the same perspective
and highlights relations between a family of approaches that
may initially seem divergent.
ACKNOWLEDGMENTS
This work has received funding from the European Community’s
Seventh Framework Programme (FP7/2007-2013) under grant agreement
ICT-2011-287760, from the European Research Council
under the European Community’s Seventh Framework Programme
(FP7/2007-2013)/ERC grant agreement 240453, and from the
Academy of Finland.
AUTHORS
Konrad Kowalczyk (konrad.kowalczyk@iis.fraunhofer.de)
received the B.Eng. and M.Sc. degrees in telecommunications
from AGH University of Science and Technology, Krakow,
Poland, in 2005 and the Ph.D. degree in electronics and electrical
engineering from Queen’s University Belfast, United Kingdom,
in 2009. From 2009 until 2011, he was a postdoctoral
research fellow at the Chair of Multimedia Communications and
Signal Processing, Friedrich-Alexander-University Erlangen-Nürnberg,
Germany. In 2012, he joined the Fraunhofer Institute for
Integrated Circuits IIS as an associate researcher for communi-
cation acoustics and spatial audio processing. His main research
interests include virtual acoustics, sound field analysis, spatial
audio, signal enhancement, and array signal processing.
Oliver Thiergart (oliver.thiergart@iis.fraunhofer.de) studied
media technology at Ilmenau University of Technology (TUI),
Germany, and received his Dipl.-Ing. (M.Sc.) degree in 2008. In
2008, he was with the Fraunhofer Institute for Digital Media
Technology IDMT in Ilmenau where he worked on sound field anal-
ysis with microphone arrays. He then joined the Audio Department
of the Fraunhofer Institute for Integrated Circuits IIS in Erlangen,
Germany, where he worked on spatial audio analysis and reproduc-
tion. In 2011, he became a member of the International Audio
Laboratories Erlangen where he is currently pursuing a Ph.D.
degree in the field of parametric spatial sound processing.
Maja Taseska (maja.taseska@audiolabs-erlangen.de) received
her B.Sc. degree in electrical engineering at Jacobs University,
Bremen, Germany, in 2010, and her M.Sc. degree at the
Friedrich-Alexander-University Erlangen-Nürnberg, Germany, in
2012. She then joined the International Audio Laboratories
Erlangen, where she is currently pursuing a Ph.D. degree in the
field of informed spatial filtering. Her current research interests
include informed spatial filtering, source localization and tracking,
blind source separation, and noise reduction.
Giovanni Del Galdo (giovanni.delgaldo@iis.fraunhofer.de)
studied telecommunications engineering at Politecnico di