IEEE SIGNAL PROCESSING MAGAZINE [54] MARCH 2015

in 2003 and 2008, respectively. He was a Ph.D. researcher

(2003–2007) and a postdoctoral researcher (2007–2010) at Delft

University of Technology. In 2005, he was a visiting researcher

at the Institute of Communication Acoustics, Ruhr-University

Bochum, Germany, and in 2008–2009 he was a visiting

researcher at Oticon A/S, Denmark. He is an assistant professor

at Delft University of Technology. His main research interests

include intelligibility improvement and digital speech process-

ing in general.

Petko N. Petkov (petkov@kth.se) received the B.Sc. degree in

communication engineering from the Technical University of

Sofia, Bulgaria, and the M.Sc. and Ph.D. degrees in electrical

engineering from the Royal Institute of Technology (KTH),

Stockholm, Sweden. He was a research and development engineer

with Global IP Solutions from 2006 to 2007. He is currently with

the Speech Technology Group, Cambridge Research Laboratory,

Toshiba, working on speech intelligibility enhancement.

Bastian Sauert (bastian.sauert@head-acoustics.de) obtained

both the Dipl.-Ing. and Dr.-Ing. degrees from RWTH Aachen

University, Germany. In 2014, he joined HEAD acoustics,

Herzogenrath, Germany. He was a researcher at the Institute of

Communication Systems and Data Processing of RWTH Aachen

University, Germany, where he studied the enhancement of

speech intelligibility for listeners in a noisy environment. His

focus was on optimizing objective speech intelligibility mea-

sures in noise with special consideration of the application in

mobile phones. His main research interests are speech/audio

processing, including noise suppression and near-end listening

enhancement, as well as speech quality estimation.

Peter Vary (vary@rwth-aachen.de) received the Dipl.-Ing.

degree in electrical engineering from the University of

Darmstadt, Germany, in 1972 and the Dr.-Ing. degree from the

University of Erlangen-Nuremberg, Germany, in 1978. In 1980,

he joined Philips Communication Industries, Nuremberg,

Germany, where he became the head of the Digital Signal

Processing Group. Since 1988, he has been a professor at RWTH

Aachen University, Germany, and head of the Institute of

Communication Systems and Data Processing. His main

research interests are speech coding, joint source-channel cod-

ing, error concealment, and speech enhancement including

noise suppression, acoustic echo cancellation, and artificial

wideband extension. He is a Fellow of the IEEE.

REFERENCES

[1] M. Cooke and Y. Lu, “Spectral and temporal changes to speech produced in the

presence of energetic and informational maskers,” J. Acoust. Soc. Am., vol. 128,

no. 4, pp. 2059–2069, 2010.

[2] J. D. Griffiths, “Optimum linear filter for speech transmission,” J. Acoust. Soc.

Am., vol. 43, no. 1, pp. 81–86, 1968.

[3] R. Niederjohn and J. Grotelueschen, “The enhancement of speech intelligibility

in high noise levels by high-pass filtering followed by rapid amplitude compres-

sion,” IEEE Trans. Acoust. Speech Signal Processing, vol. 24, no. 4, pp. 277–282,

1976.

[4] B. Sauert and P. Vary, “Recursive closed-form optimization of spectral audio

power allocation for near end listening enhancement,” in

ITG-Fachbericht-Sprachkommunikation, 2010.

[5] T. C. Zorilă, V. Kandia, and Y. Stylianou, “Speech-in-noise intelligibility

improvement based on spectral shaping and dynamic range compression,” in Proc.

Interspeech, Portland, OR, 2012, pp. 635–638.

[6] Y. Tang and M. Cooke, “Optimised spectral weightings for noise-dependent

speech intelligibility enhancement,” in Proc. Interspeech, Portland, OR, 2012, pp.

955–958.

[7] M. Cooke, S. King, M. Garnier, and V. Aubanel, “The listening talker: A re-

view of human and algorithmic context-induced modifications of speech,” Elsevier

Comput. Speech Lang., vol. 28, no. 2, pp. 543–571, 2014.

[8] C. H. Taal, J. Jensen, and A. Leijon, “On optimal linear filtering of speech for

near-end listening enhancement,” IEEE Signal Processing Lett., vol. 20, no. 3,

pp. 225–228, 2013.

[9] M. Cooke, C. Mayo, and C. Valentini-Botinhao, “Intelligibility-enhancing

speech modifications: the Hurricane challenge,” in Proc. Interspeech, Lyon,

France, 2013, pp. 3552–3556.

[10] M. Cooke, C. Mayo, C. Valentini-Botinhao, Y. Stylianou, B. Sauert, and Y.

Tang, “Evaluating the intelligibility benefit of speech modifications in known noise

conditions,” Speech Commun., vol. 55, no. 4, pp. 572–585, 2013.

[11] P. N. Petkov, G. E. Henter, and W. B. Kleijn, “Maximizing phoneme recognition

accuracy for enhanced speech intelligibility in noise,” IEEE Trans. Audio, Speech,

Lang. Processing, vol. 21, no. 5, pp. 1035–1045, 2013.

[12] B. Sauert and P. Vary, “Near end listening enhancement optimized with re-

spect to speech intelligibility index,” in EURASIP European Signal Processing

Conf. (EUSIPCO), 2009, vol. 17, pp. 1844–1848.

[13] M. Cooke, “A glimpsing model of speech perception in noise,” J. Acoust. Soc.

Am., vol. 119, no. 3, pp. 1562–1573, 2006.

[14] C. H. Taal, R. C. Hendriks, and R. Heusdens, “Speech energy redistribution

for intelligibility improvement in noise based on a perceptual distortion mea-

sure,” Comput. Speech Lang., vol. 28, no. 4, pp. 858–872, 2014.

[15] W. B. Kleijn and R. C. Hendriks, “A simple model of speech communication

and its application to intelligibility enhancement,” IEEE Signal Process. Lett., vol.

22, no. 3, pp. 303–307, Mar. 2015.

[16] J. B. Crespo and R. C. Hendriks, “Multizone speech reinforcement,” IEEE/

ACM Trans. Audio, Speech, Lang. Processing, vol. 22, no. 1, pp. 54–66, 2014.

[17] J. Allen, “How do humans process and recognize speech?” IEEE Trans. Speech

Audio Processing, vol. 2, no. 4, pp. 567–577, Oct. 1994.

[18] Methods for the Calculation of the Speech Intelligibility Index, ANSI S3.5-

1997.

[19] M. Zhang, P. N. Petkov, and W. B. Kleijn, “Rephrasing-based speech

intelligibility enhancement,” in Proc. Interspeech, Aug. 2013, pp. 3587–3591.

[20] T. Dau, D. Püschel, and A. Kohlrausch, “A quantitative model of the effective

signal processing in the auditory system. I. Model structure,” J. Acoust. Soc. Am.,

vol. 99, no. 6, pp. 3615–3622, 1996.

[21] C. H. Taal, R. C. Hendriks, and R. Heusdens, “A low-complexity spectro-tem-

poral distortion measure for audio processing applications,” IEEE Trans. Audio,

Speech, Lang. Process., vol. 20, no. 5, pp. 1553–1564, 2012.

[22] C. Valentini-Botinhao, J. Yamagishi, and S. King, “Can objective measures

predict the intelligibility of modified HMM-based synthetic speech in noise?” in

Proc. Interspeech, Aug. 2011, pp. 1837–1840.

[23] C. Valentini-Botinhao, J. Yamagishi, S. King, and R. Maia, “Intelligibility en-

hancement of HMM-generated speech in additive noise by modifying Mel cepstral

coefficients to increase the Glimpse Proportion,” Comput. Speech Lang., vol. 28,

no. 2, pp. 665–686, 2014.

[24] N. R. French and J. C. Steinberg, “Factors governing the intelligibility of

speech sounds,” J. Acoust. Soc. Am., vol. 19, no. 1, pp. 90–119, Jan. 1947.

[25] K. D. Kryter, “Methods for the calculation and use of the Articulation Index,” J.

Acoust. Soc. Am., vol. 34, no. 11, pp. 1689–1697, Nov. 1962.

[26] G. A. Studebaker, C. V. Pavlovic, and R. L. Sherbecoe, “A frequency

importance function for continuous discourse,” J. Acoust. Soc. Am., vol. 81, no. 4,

pp. 1130–1138, 1987.

[27] K. S. Rhebergen and N. J. Versfeld, “A speech intelligibility index-based

approach to predict the speech reception threshold for sentences in fluctuat-

ing noise for normal-hearing listeners,” J. Acoust. Soc. Am., vol. 117, no. 4, pp.

2181–2192, 2005.

[28] H. Schepker, J. Rennies, and S. Doclo, “Improving speech intelligibility in

noise by SII-dependent preprocessing using frequency-dependent amplification and

dynamic range compression,” in Proc. Interspeech, 2013, pp. 3577–3581.

[29] S. Elliott, J. Cheer, J.-W. Choi, and Y. Kim, “Robustness and regularization of

personal audio systems,” IEEE Trans. Audio, Speech, Lang. Processing, vol. 20,

pp. 2123–2133, Sept. 2012.

[30] D. B. Ward and G. W. Elko, “Virtual sound using loudspeakers: robust acoustic

crosstalk cancellation,” in Acoustics Signal Processing for Telecom, S. L. Gay and

J. Benesty, Eds. Boston, MA: Kluwer Academic, 2000, ch. 14.

[SP]