IEEE SIGNAL PROCESSING MAGAZINE [78] MARCH 2015
running spectral analysis and the distribution of current levels
across electrodes is determined such that the loudness experi-
enced by the CI user is similar to that experienced by an average
listener with NH. Preliminary perceptual studies with CI recipi-
ents using SpeL confirmed that the relation between loudness
and the level and bandwidth of sounds was closer to normal [18].
More recently, SCORE (“Stimulus Control to Optimize Recipi-
ent Experience”), a simplified version of SpeL, was developed that
uses the estimated specific loudness function to calculate the
total loudness of sound signals in real time. Tests of speech recog-
nition with SCORE showed small but significant improvements
over ACE. Tests with an extended version for CI recipients who
use an acoustic hearing aid in the nonimplanted ear (SCORE
bimodal) suggested that it may improve speech recognition and
the ability of users to localize sounds, presumably because the
loudness differences between ears that carry information about
the direction of a sound source are conveyed more consistently
(cf. [19] and the references therein).
ENVELOPE ENHANCEMENT
In a CI, the electrical stimulation directly generates action poten-
tials in auditory neurons, predominantly bypassing any remain-
ing hair cells and synapse function. The synapse normally
demonstrates neural short-term adaptation [20], i.e., an increased
firing rate at the onsets of sounds. This short-term adaptation
acts as an across-channel phonological timing cue [20]. With
conventional schemes such as CIS, it is not present in the
electrically stimulated auditory nerve as it is in the normal auditory system.
Furthermore, recent studies have demonstrated that the tran-
sient parts of the speech envelope carry information that is
important for speech intelligibility in NH listeners.
Based on this rationale and former investigations, the
enhanced envelope (EE) strategy was developed and its feasibility
studied for applications in auditory prostheses. In this approach,
an additional processing stage is introduced after the envelope
detection stage (see Figure 2, block 11) wherein peaks, as a model
for the short-term adaptation and dependent on the onset rise
time, are added at the onsets in the envelope. This scheme is com-
plementary to the main structure of ACE or CIS.
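The idea of adding onset peaks to a channel envelope can be illustrated with a minimal sketch. The function name `enhance_onsets`, the parameters `gain` and `decay`, and the rule that ties the peak height to the positive envelope slope are illustrative assumptions, not the published EE algorithm [21].

```python
def enhance_onsets(envelope, gain=2.0, decay=0.6):
    """Add a decaying peak wherever the envelope rises (hypothetical rule).

    envelope: list of non-negative envelope samples for one channel.
    gain:     assumed scale factor applied to the rise between samples.
    decay:    assumed per-sample decay of the added peak (0 < decay < 1).
    """
    enhanced = []
    peak = 0.0       # current value of the added onset peak
    previous = 0.0   # envelope value of the previous sample
    for value in envelope:
        # Only rising edges count as onsets.
        rise = max(value - previous, 0.0)
        # A faster rise (shorter rise time) yields a larger added peak;
        # otherwise the previous peak simply decays.
        peak = max(peak * decay, gain * rise)
        enhanced.append(value + peak)
        previous = value
    return enhanced


# A step onset followed by a steady portion.
steady = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
result = enhance_onsets(steady)
```

In this sketch the steady portion of the envelope is left in place and the added peak decays back toward it, which loosely mimics the increased firing rate at sound onsets described above.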
The EE algorithm was evaluated with CI users and all lis-
teners demonstrated an immediate benefit with EE relative to
ACE [21]. The advantage of this enhanced envelope coding is
due to its emphasis on across-channel temporal coherence in
the coded speech signal. This temporal marker is an important
attribute for speech understanding in adverse listening situa-
tions and for sound source segregation; see also the electrodo-
grams in Figures 4–6. The onset enhancement is particularly
noticeable for the “b” sound in the word “boy” in Figure 5.
PERIODICITY MODULATION ENHANCEMENT
From psychophysical studies it is known that periodicity cues are
better perceived when modulation depth is high and modulations
are synchronized across channels to some extent [15]. This is
probably due to spread of excitation: electrodes close together
stimulate overlapping populations of neurons, which therefore
receive the aggregate stimulation pattern of multiple electrodes.
So if modulations are not synchronized across electrodes, the
modulation depth at the neural level may be severely reduced.
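A small numerical sketch can make this effect concrete. It assumes a deliberately simplified model in which a neural population receives the equal-weight sum of two fully modulated electrode envelopes; the function names, the equal weighting, and the sinusoidal modulation shape are assumptions made for illustration only.

```python
import math


def modulation_depth(signal):
    """Modulation depth (max - min) / (max + min) of a non-negative signal."""
    high, low = max(signal), min(signal)
    return (high - low) / (high + low)


def aggregate_depth(phase_offset, samples=1000):
    """Depth of the sum of two fully modulated envelopes with a phase offset."""
    total = []
    for n in range(samples):
        t = 2.0 * math.pi * n / samples
        first = 1.0 + math.cos(t)                  # envelope on electrode 1
        second = 1.0 + math.cos(t + phase_offset)  # envelope on electrode 2
        total.append(first + second)               # assumed equal-weight overlap
    return modulation_depth(total)


in_phase = aggregate_depth(0.0)            # synchronized modulations
quarter = aggregate_depth(math.pi / 2.0)   # quarter-period offset
anti_phase = aggregate_depth(math.pi)      # half-period offset
```

Under these assumptions the aggregate depth stays at 1 when the two modulations are synchronized, drops to about 0.71 for a quarter-period offset, and essentially vanishes for a half-period offset.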
From the electrodograms in Figures 4–6, it is clear that
with most commercial strategies temporal modulations are not
well coded. In some channels, modulation depth is quite shal-
low and the desynchronization across channels combined with
spread of excitation leads to reduced modulation depth or even
spurious modulations in the aggregate pattern that will be
received by the auditory nerve fibers.
To improve this, several strategies have been developed
(e.g., [22]–[25] and the references therein). While the signal
processing to achieve it may differ, these strategies either
expand modulation depth or remove existing modulations and
explicitly modulate the envelope at the rate of F0. As examples,
the F0 modulation (F0mod) and eTone strategies are briefly
described below.
The F0mod strategy is a simple example of a periodicity
enhancement strategy based on the ACE strategy. For each
frame of samples it estimates F0 and voicing probability using
an autocorrelation approach. If a frame is unvoiced, ACE pro-
cessing is applied. If a frame is voiced, all channels are modu-
lated synchronously using a sinusoidal modulator constructed
based on the F0 estimate. The block diagram and the output of
F0mod are shown in Figure 2 and Figures 4–6, respectively.
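The per-frame logic described above can be sketched as follows. This is not the published F0mod implementation: the function names, the F0 search range, the voicing threshold of 0.5, and the use of the normalized autocorrelation peak as a stand-in for voicing probability are all assumptions. The channel envelopes passed in are taken to be the output of ACE-style processing.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=80.0, f0_max=400.0):
    """Return (f0_hz, voicing) from the normalized autocorrelation peak."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0, 0.0
    min_lag = int(sample_rate / f0_max)
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        value = sum(frame[n] * frame[n + lag]
                    for n in range(len(frame) - lag)) / energy
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag == 0:
        return 0.0, 0.0
    return sample_rate / best_lag, best_value


def f0mod_frame(envelopes, frame, sample_rate, time_s, voicing_threshold=0.5):
    """Modulate all channel envelopes synchronously if the frame is voiced.

    time_s is the (assumed) time of the stimulation pulse being computed.
    """
    f0, voicing = estimate_f0(frame, sample_rate)
    if voicing < voicing_threshold:
        return list(envelopes)   # unvoiced: envelopes pass through unchanged
    # One sinusoidal modulator shared by every channel, ranging from 0 to 1.
    modulator = 0.5 * (1.0 + math.cos(2.0 * math.pi * f0 * time_s))
    return [e * modulator for e in envelopes]


# A 200-Hz tone sampled at 8 kHz stands in for a voiced frame.
rate = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * n / rate) for n in range(400)]
f0_hz, voicing = estimate_f0(tone, rate)
at_peak = f0mod_frame([1.0, 0.5], tone, rate, 0.0)
at_trough = f0mod_frame([1.0, 0.5], tone, rate, 0.0025)
silent = f0mod_frame([1.0, 0.5], [0.0] * 400, rate, 0.0025)
```

Because a single modulator value is applied to every channel, the modulations are synchronized across electrodes by construction.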
The eTone strategy [23] is based on the same principles but
includes an F0 estimator based on harmonic sieves, which is
very precise and robust to noise, and the modulated envelope is
mixed with the original envelope with a ratio depending on an
estimate of harmonicity of that particular channel. Modula-
tions are synchronous across channels and an exponential
decay modulation shape is used.
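The mixing step can be sketched in a few lines. The harmonic-sieve F0 estimator and the harmonicity estimator of eTone [23] are not reproduced here; the function names, the `decay_rate` value, the linear blend, and the clamping of the harmonicity estimate to the range 0 to 1 are assumptions chosen only to illustrate the principle.

```python
import math


def decay_modulator(time_s, f0_hz, decay_rate=4.0):
    """Exponentially decaying modulator that restarts every F0 period."""
    phase = (time_s * f0_hz) % 1.0   # position within the current period, 0..1
    return math.exp(-decay_rate * phase)


def mix_envelope(envelope, harmonicity, time_s, f0_hz):
    """Blend the original and the F0-modulated envelope for one channel."""
    weight = min(max(harmonicity, 0.0), 1.0)   # assumed mixing ratio in 0..1
    modulated = envelope * decay_modulator(time_s, f0_hz)
    return (1.0 - weight) * envelope + weight * modulated


# Hypothetical values: F0 of 200 Hz, unit envelope.
inharmonic = mix_envelope(1.0, 0.0, 0.002, 200.0)     # original envelope only
period_start = mix_envelope(1.0, 1.0, 0.0, 200.0)     # modulator at its maximum
mid_period = mix_envelope(1.0, 1.0, 0.0025, 200.0)    # halfway through a period
```

In this sketch a channel judged inharmonic keeps its original envelope, while a fully harmonic channel follows the decaying modulator, which restarts at the beginning of each F0 period.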
The F0mod and eTone strategies were evaluated for music
perception and speech recognition, and with tonal languages in
which pitch determines the lexical meaning of certain pho-
nemes (see [22]–[25] and the references therein). While period-
icity enhancement strategies can clearly improve periodicity
pitch perception, performance is still well below that of NH lis-
teners. For good pitch perception, listeners need access to all
three physical cues (see the “Introduction” section) and spectral
(place) and temporal cues need to be consistent. There are no
current CI strategies that make this possible, and we hypothe-
size that with the current electrode design and stimulation par-
adigm it is not possible to provide sufficiently place-specific
stimulation to achieve performance similar to NH. Note that for
a good representation of temporal information, good place spec-
ificity is required as well: when a population of neurons is stim-
ulated by information from several channels due to spread of
excitation, the aggregate pattern will be coded.
BILATERAL STRATEGIES
In various studies with controlled stimulation in laboratory
conditions, it has been found that bilateral CI users can be sen-
sitive to ITDs [26]. ITD thresholds, i.e., the smallest ITD that
can be detected, vary widely across subjects, with the best