Digital Object Identifier 10.1109/MSP.2014.2365594
Date of publication: 12 February 2015
Modern communication technology facilitates
communication from anywhere to anywhere. As
a result, low speech intelligibility has become a
common problem, which is exacerbated by the
lack of feedback to the talker about the rendering
environment. In recent years, a range of algorithms has been
developed to enhance the intelligibility of speech rendered in a
noisy environment. We describe methods for intelligibility
enhancement from a unified vantage point. Before one defines a
measure of intelligibility, the level of abstraction of the representation
must be selected. For example, intelligibility can be measured
on the message, the sequence of words spoken, the sequence of
sounds, or a sequence of states of the auditory system. Natural
measures of intelligibility defined at the message level are mutual
information and the hit-or-miss criterion. The direct evaluation of
high-level measures requires quantitative knowledge of human
cognitive processing. Lower-level measures can be derived from
higher-level measures by making restrictive assumptions. We discuss
the implementation and performance of some specific
enhancement systems in detail, including speech intelligibility
index (SII)-based systems and systems aimed at enhancing the
sound-field where it is perceived by the listener. We conclude with a
discussion of the current state of the field and open problems.
INTRODUCTION
Humans adapt their speech to the physical environment. Based on
the facial expression of a listener, a talker may repeat or reformulate
the message. A noisy environment gives rise to the Lombard
effect, e.g., [1], an involuntary change in the speech characteristics
that makes speech more intelligible.
In modern communication systems, the speaker often has little
or no awareness of the physical environment in which the
speech is rendered. This is perhaps most obvious for
current-generation speech synthesis, which produces speech without
Optimizing Speech Intelligibility in a Noisy Environment
[A unified view]
[W. Bastiaan Kleijn, João B. Crespo, Richard C. Hendriks, Petko N. Petkov, Bastian Sauert, and Peter Vary]
1053-5888/15©2015 European Union IEEE SIGNAL PROCESSING MAGAZINE [43] MARCH 2015