IEEE SIGNAL PROCESSING MAGAZINE [41] MARCH 2015

estimates and degree of diffuseness. The evaluation of different

setups with one desired talker and one interfering talker dem-

onstrated that an improvement in the speech reception thresh-

old (SRT) between 4 and 24 dB could be obtained.

In Figure 8, a general parametric spatial sound processing

scheme is illustrated, where spatial analysis provides the DOA esti-

mates, and the direct and diffuse sound estimates for the left and

right ear are obtained using different (left or right) reference micro-

phones. The left (and right) output signal can then be computed

using (3) with

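$$G_i(k,n) = B_i(k,n,\varphi)\, H_{\mathrm{ex}}^{i}(k) \quad \text{for } i \in \{\text{left}, \text{right}\},$$

where $B_i(k,n,\varphi)$ defines the desired spatial response that depends

on the listening mode, $H_{\mathrm{ex}}^{i}(k)$ helps to externalize sounds, and

$Q_i(k) = c(k)\, H_{\mathrm{ex}}^{i}(k)$, where $c(k)$ with $0 < c(k) \le 1$ is a constant used to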

reduce the diffuse sound and hence increase the SDR at the output.

At the cost of an increase in computational complexity and memory

use, the proposed scheme can fully exploit all microphones.
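As a rough illustration of the gain computation described above, the following minimal sketch builds the per-ear gains and mixes the direct and diffuse estimates for one time frame. It rests on several assumptions that are not stated in this section: that (3) is a weighted sum of the direct and diffuse sound estimates, that c(k) lies between 0 and 1, and that all signals are given as per-bin complex values. The function names `binaural_gains` and `mix_output` and all numerical values are hypothetical.

```python
def binaural_gains(b, h_ex, c):
    """Per-bin gains for one ear: G(k) = B(k) * H_ex(k), Q(k) = c(k) * H_ex(k)."""
    # Assumption: c(k) is a real attenuation factor between 0 and 1.
    if not all(0.0 <= ck <= 1.0 for ck in c):
        raise ValueError("c(k) is assumed to lie between 0 and 1")
    g = [bk * hk for bk, hk in zip(b, h_ex)]
    q = [ck * hk for ck, hk in zip(c, h_ex)]
    return g, q


def mix_output(x_direct, x_diffuse, g, q):
    """Assumed form of (3): Y(k) = G(k) * X_direct(k) + Q(k) * X_diffuse(k)."""
    return [gk * xs + qk * xd
            for gk, xs, qk, xd in zip(g, x_direct, q, x_diffuse)]


# Toy values for two frequency bins in one time frame (all hypothetical).
b_left = [1.0, 0.5]                     # desired spatial response for the left ear
h_ex_left = [1.0 + 0.0j, 0.0 + 1.0j]    # externalization filter for the left ear
c = [0.5, 0.5]                          # diffuse sound attenuation
x_direct = [2.0 + 0.0j, 1.0 + 0.0j]     # direct sound estimate (left reference)
x_diffuse = [1.0 + 0.0j, 1.0 + 0.0j]    # diffuse sound estimate (left reference)

g_left, q_left = binaural_gains(b_left, h_ex_left, c)
y_left = mix_output(x_direct, x_diffuse, g_left, q_left)
```

The right-ear output would follow the same pattern with the right-ear response, externalization filter, and reference-microphone estimates. Lowering c(k) scales down only the diffuse term, which is how the scheme raises the SDR at the output.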

While many more examples can be found in the literature, it

can readily be seen that parametric spatial sound processing,

using either geometrically or psychoacoustically motivated

parametric models, provides a flexible and efficient way to achieve

directional filtering. The limited improvement in terms of the

SRT reported in [14] could be related to the inherent tradeoff

between interference reduction and speech distortion found in

most single-channel processing techniques. Further research is

required to develop robust and efficient parameter estimators for

this application and to study the impact on the SRT. More

advanced schemes to modify the spatial response and the DOAs

based on the listening mode and the listening situation could be

realized using the processing scheme depicted in Figure 8.

CONCLUSIONS

Parametric models have been shown to provide an efficient way to

describe sound scenes. While in earlier work multiple microphones

were only used to estimate the geometric model parameters, in

more recent work it has been shown that they can also be used to

estimate the direct and diffuse sound components. As the latter esti-

mates are more accurate than single-channel estimates, the sound

quality of the overall system is increased, for instance, by avoiding

decorrelating the direct sound that may partially leak into the dif-

fuse sound estimate in single-channel extraction. Depending on the

application, the estimated components and parameters can be

manipulated before computing one or more output signals by mix-

ing the components together based on the parametric side informa-

tion. In a spatial audio communication scenario in which the direct

and diffuse signals as well as the parameters are transmitted to the

far-end side, it is possible to determine at the receiver side which

sounds to extract and how to accurately reproduce the recorded spa-

tial sounds over loudspeakers or headphones. By using the position-

based model, we have shown how binaural signals can be

synthesized at the receiver side that correspond to a desired listen-

ing position on the recording side. Finally, we have described how

parametric spatial sound processing can be applied to binaural hear-

ing aids to achieve both directional filtering and dereverberation.

To date, the majority of the geometric models assume that at

most one direct sound is active per time-frequency instance. Extensions

of these models are currently under development where multi-

ple direct sound components plus diffuse sound components

coexist in a single time-frequency instance [23]. Preliminary

results have shown that this model can help to further improve

the spatial selectivity and sound quality.

We hope that this unified perspective on parametric spatial

sound processing helps readers to approach other problems

encountered in assisted listening from the same viewpoint, and

that it highlights relations between a family of approaches

that may initially seem divergent.

ACKNOWLEDGMENTS

This work has received funding from the European Community’s

Seventh Framework Programme (FP7/2007-2013) under grant agreement

ICT-2011-287760, from the European Research Council

under the European Community’s Seventh Framework Programme

(FP7/2007-2013)/ERC grant agreement 240453, and from the

Academy of Finland.

AUTHORS

Konrad Kowalczyk (konrad.kowalczyk@iis.fraunhofer.de)

received the B.Eng. and M.Sc. degrees in telecommunications

from AGH University of Science and Technology, Krakow,

Poland, in 2005 and the Ph.D. degree in electronics and electrical

engineering from Queen’s University Belfast, United Kingdom,

in 2009. From 2009 until 2011, he was a postdoctoral

research fellow at the Chair of Multimedia Communications and

Signal Processing, Friedrich-Alexander-University Erlangen-Nürnberg,

Germany. In 2012, he joined Fraunhofer Institute for

Integrated Circuits IIS as an associate researcher for communi-

cation acoustics and spatial audio processing. His main research

interests include virtual acoustics, sound field analysis, spatial

audio, signal enhancement, and array signal processing.

Oliver Thiergart (oliver.thiergart@iis.fraunhofer.de) studied

media technology at Ilmenau University of Technology (TUI),

Germany, and received his Dipl.-Ing. (M.Sc.) degree in 2008. In

2008, he was with the Fraunhofer Institute for Digital Media

Technology IDMT in Ilmenau where he worked on sound field anal-

ysis with microphone arrays. He then joined the Audio Department

of the Fraunhofer Institute for Integrated Circuits IIS in Erlangen,

Germany, where he worked on spatial audio analysis and reproduc-

tion. In 2011, he became a member of the International Audio

Laboratories Erlangen where he is currently pursuing a Ph.D.

degree in the field of parametric spatial sound processing.

Maja Taseska (maja.taseska@audiolabs-erlangen.de) received

her B.Sc. degree in electrical engineering at Jacobs University,

Bremen, Germany, in 2010, and her M.Sc. degree at the

Friedrich-Alexander-University Erlangen-Nürnberg, Germany, in

2012. She then joined the International Audio Laboratories

Erlangen, where she is currently pursuing a Ph.D. degree in the

field of informed spatial filtering. Her current research interests

include informed spatial filtering, source localization and tracking,

blind source separation, and noise reduction.

Giovanni Del Galdo (giovanni.delgaldo@iis.fraunhofer.de)

studied telecommunications engineering at Politecnico di
