IEEE SIGNAL PROCESSING MAGAZINE [123] MARCH 2015

latter case). In the scenario where nonlinear speech enhancement (noise suppression and dereverberation) was activated, three measures stood out: HASQI, PEMO-Q, and PEMO-Q-HI. Interestingly, for the nonenhanced and enhanced cases, HASPI, a metric tailored for intelligibility prediction, outperformed HASQI (its quality predictor counterpart) and all other metrics in terms of $\rho_{\text{sig}}$. Such findings resonate with the observation in the section “HA: NFC Conditions” that alternate mappings of HASQI’s internal parameters could be devised to reduce $\varepsilon$-RMSE. For nonintrusive measures, in turn, it was found that all tested metrics achieved $\rho_{\text{sig}}$ values that were not significantly different in the noisy condition, with ModA achieving the highest $\rho_{\text{sig}}$ and lowest $\varepsilon$-RMSE. In the enhanced conditions, on the other hand, only ModA achieved levels above ITU-T’s “acceptability threshold.” Interestingly, in the nonenhanced conditions (i.e., noise-alone, reverberation-alone, and noise-plus-reverberation) ITU-T P.563 achieved reliable results in line with those obtained with SRMR-HA and ModA. With speech enhancement enabled, however, the performance of both P.563 and SRMR-HA decreased to unacceptable levels, suggesting that these two metrics are not capable of detecting and quantifying the effects of speech enhancement artefacts on perceived quality. These findings motivate the need for more research on the development of innovative nonintrusive quality measures for HA devices with nonlinear speech enhancement.
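
As a rough illustration of how figures of merit of this kind can be computed, the sketch below assumes that the correlation quoted above is a Pearson correlation taken after a sigmoidal mapping of the objective scores, and that the RMSE-type error is an epsilon-insensitive RMSE in which errors smaller than a per-condition tolerance are ignored. The function names, the three-parameter logistic form, and the `dof` correction are illustrative assumptions rather than the procedure used in this article; in practice the mapping parameters would be fitted to the data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def sigmoid_map(score, a, b, c):
    """Hypothetical three-parameter logistic mapping of a metric score
    onto the subjective rating scale (a: ceiling, b: slope, c: midpoint)."""
    return a / (1.0 + math.exp(-b * (score - c)))


def epsilon_rmse(subjective, predicted, epsilon, dof=0):
    """Epsilon-insensitive RMSE (assumed definition): only the part of each
    absolute error that exceeds its tolerance epsilon[i] is counted.
    dof is an optional degrees-of-freedom correction for the mapping."""
    n = len(subjective)
    total = 0.0
    for s, p, e in zip(subjective, predicted, epsilon):
        excess = max(0.0, abs(s - p) - e)
        total += excess * excess
    return math.sqrt(total / (n - dof))


# Toy data, invented purely for illustration.
ratings = [1.0, 2.0, 3.0, 4.0]          # subjective scores per condition
raw_scores = [0.1, 0.4, 0.6, 0.9]       # objective metric outputs
tolerance = [0.3, 0.3, 0.3, 0.3]        # e.g., a confidence interval per condition

mapped = [sigmoid_map(s, a=5.0, b=4.0, c=0.5) for s in raw_scores]
print("correlation after mapping:", pearson(ratings, mapped))
print("epsilon-insensitive RMSE:", epsilon_rmse(ratings, mapped, tolerance))
```

A higher correlation and a lower error both indicate better agreement with the subjective ratings, which is the sense in which the measures are ranked in the discussion above.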

CONCLUSIONS

This article has provided a comprehensive review of 12 existing objective quality and intelligibility prediction algorithms that have been developed for NH and HI listeners who are users of assistive listening devices, such as HAs and CIs. The algorithms were tested on three common subjectively rated speech data sets: one with subjective ratings collected from CI users in noisy and reverberant environments, one from HA users in noisy and reverberant environments with and without speech enhancement, and one from HA users with NFC. The recommended metrics to be used under each condition (nonenhanced, enhanced, NFC) were tabulated for the two different assistive devices. In summary, for CI devices, two measures stood out: STOI (intrusive) and SRMR-CI (nonintrusive). For HA with NFC, several intrusive measures attained comparable results, including the recently proposed PEMO-Q-HI. None of the tested nonintrusive measures, on the other hand, achieved acceptable results, thus leading us to explore the development of a new metric called SRMR-HA$_{\text{comp}}$. Finally, for HA with speech enhancement enabled, the HASQI and PEMO-Q-HI intrusive measures stood out alongside ModA, a recently proposed nonintrusive measure. It is hoped that these insights will be useful not only for those in the assistive listening device research and development community but also for clinicians, audiologists, and patients who wish to quickly gauge the performance of different devices across different practical environmental conditions.

ACKNOWLEDGMENTS

Tiago H. Falk and João F. Santos acknowledge funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Fonds de Recherche du Quebec–Nature et Technologies. Vijay Parsa and Susan Scollie acknowledge funding from NSERC, the Oticon Foundation, and Phonak AG; James M. Kates and Kathryn Arehart received funding from GN ReSound and the National Institutes of Health R01 DC012289 (KA); Oldooz Harati received funding from the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health R01 DC 007527 (PI: Philipos C. Loizou); and Rainer Huber acknowledges funding from the German Research Foundation (DFG) FOR-1732.

AUTHORS

Tiago H. Falk (falk@emt.inrs.ca) received the Ph.D. degree from Queen’s University, Kingston, Canada, in 2009. From 2009 to 2010, he was a postdoctoral fellow at the University of Toronto. In December 2010, he joined INRS-EMT (Montreal) as an assistant professor. His research interests include multimedia quality measurement and enhancement and human–machine interaction. He has published over 130 papers in top-tiered journals and conferences and has won four Best Paper Awards. He is a member of the IEEE Signal Processing Society’s Speech and Language Technical Committee, the Sigma Xi Society, and the editorial board of Journal of the Canadian Acoustical Association and Canadian Journal of Electrical and Computer Engineering. He is a Senior Member of the IEEE.

Vijay Parsa (parsa@nca.uwo.ca) received the Ph.D. degree in biomedical engineering from the University of New Brunswick, Canada, in 1996. He then joined the Hearing Health Care Research Unit at the University of Western Ontario, where he worked on developing speech processing algorithms for audiology and speech language pathology applications. Between 2002 and 2007, he served as the Oticon Foundation chair in acoustic signal processing. He is currently an associate professor jointly appointed across the Faculties of Health Sciences and Engineering. His research interests are in speech signal processing with applications to hearing aids, assistive listening devices, and augmentative communication devices.

João F. Santos (joao.eel@gmail.com) received his B.S. degree in electrical engineering from the Federal University of Santa Catarina, Brazil, in 2011 and his M.Sc. degree in telecommunications from INRS in 2014, where he placed on the dean’s list and was awarded the Best M.Sc. Thesis Award. He is currently pursuing his Ph.D. degree in telecommunications at the same institute. His main research area is speech signal processing with an emphasis on speech quality assessment and enhancement for hearing aids and cochlear implants. He is also interested in applications of bioinspired algorithms and sparse representations to audio and speech processing.

Kathryn Arehart (kathryn.arehart@colorado.edu) is a professor in the Speech, Language, and Hearing Sciences Department at the University of Colorado at Boulder. Her laboratory’s research focuses on understanding auditory perception and the impact hearing loss has on listening in complex auditory environments. Current projects include the study of individual factors (cognition, hearing loss, auditory processing) that affect the ability of older adults to successfully use advanced hearing-aid signal processing
