Digital Object Identifier 10.1109/MSP.2014.2369251
Date of publication: 12 February 2015
With the advancement of technology, both assisted listening
devices and speech communication devices are becoming more
portable and also more frequently used. As a consequence, users
of devices such as hearing aids, cochlear implants, and mobile
telephones expect their devices to work robustly anywhere and at
any time. This holds in particular for challenging noisy
environments like a cafeteria, a restaurant, a subway, a factory,
or in traffic. One way to make assisted listening devices robust
to noise is to apply speech enhancement algorithms. To improve
the corrupted speech, one can exploit spatial diversity through a
constructive combination of microphone signals (so-called
beamforming) as well as the different spectrotemporal properties
of speech and noise. Here, we focus on single-channel speech
enhancement algorithms that rely on spectrotemporal properties.
On the one hand, these algorithms can be employed when the
miniaturization of devices only allows for using a single
microphone. On the other hand, when multiple microphones are
available, single-channel algorithms can be employed as a
postprocessor at the output of a beamformer. To exploit the
short-term stationary properties of natural sounds, many of these
approaches process the signal in a time-frequency representation,
most frequently the short-time discrete Fourier transform (STFT)
domain. In this domain, the coefficients of the signal are
complex-valued and can therefore be represented by their absolute
value (referred to in the literature both as STFT magnitude and
STFT amplitude) and their phase. While the modeling and
processing of the STFT magnitude has been the center of interest
in the past three decades, phase has been largely ignored.
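To make the magnitude and phase representation concrete, the following
is a minimal sketch, not taken from the article, of how a noisy signal
can be transformed to the STFT domain and split into its two parts. The
helper stft, the Hann window, the frame length of 512 samples, the hop
of 128 samples, and the synthetic test signal are all illustrative
assumptions rather than settings used by the authors.

```python
import numpy as np

def stft(x, frame_len=512, hop=128):
    """Short-time DFT of a 1-D signal; returns shape (frames, bins).

    Hypothetical helper: Hann-windowed frames followed by a real FFT.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

# Assumed test signal: a 440 Hz tone in white noise, 1 s at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(fs)

Y = stft(noisy)            # complex-valued STFT coefficients
magnitude = np.abs(Y)      # STFT magnitude (also called amplitude)
phase = np.angle(Y)        # STFT phase in radians, within [-pi, pi]

# Magnitude and phase together carry the full coefficient.
rebuilt = magnitude * np.exp(1j * phase)
```

In this sketch, a magnitude-only enhancement scheme would modify
magnitude and reuse phase unchanged when forming the enhanced
coefficients, which is the practice whose limits the article goes on to
examine.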
In this article, we review the role of phase processing for
speech enhancement in the context of assisted listening and
speech communication devices. We explain why most of the
research conducted in this field used to focus on estimating
spectral magnitudes in the STFT domain, and why phase
processing has recently been attracting increasing interest in the speech
Phase Processing for Single-Channel Speech Enhancement
History and recent advances
Timo Gerkmann, Martin Krawczyk-Becker, and Jonathan Le Roux
1053-5888/15 ©2015 IEEE. IEEE SIGNAL PROCESSING MAGAZINE [55] MARCH 2015