IEEE SIGNAL PROCESSING MAGAZINE [77] MARCH 2015
partially overlap, then temporal information from closely spaced
electrodes will generally be combined at the neural level. Psycho-
physical studies have reported evidence that temporal patterns
from nearby electrodes cannot be completely resolved by most CI
recipients. This suggests that sound processing schemes like
HiRes120 and FSP, which use very different approaches but rely on
providing independent channels of information across adjacent
electrodes, may result in only limited benefits [15]. More carefully
controlled studies of CI recipients’ listening experiences using
schemes such as HiRes120 and FSP over an extended time are
needed to determine specifically whether provision of fine-struc-
ture information by these schemes is perceptually beneficial.
MP3000
The MP3000 strategy is based on the ACE scheme but uses a psy-
choacoustic masking model with the aim of improving sound per-
ception for CI users based on more perceptually relevant channel
selection. The masking model attempts to select the perceptually
most important spectral components in the coding of any given
input audio signal. The rationale for this development was that it
should not be necessary to code sounds in parts of the spectrum
that are masked. This approach reduces the spread of excitation
and can lead to a more precise representation of the spectrum,
which in turn could lead to improved speech intelligibility. Process-
ing techniques based on auditory masking are widely used in com-
mon audio and music data-compression algorithms. These
techniques also compress the audio signals by selecting only a sub-
set of the frequency bands at a time. A well-known example is the
MP3 compression algorithm. In principle, the n-of-m speech cod-
ing strategies such as ACE are similar to these data reduction or
compression algorithms.
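As a rough illustration of the n-of-m idea, the following Python sketch keeps only the n channels with the largest envelope magnitudes in a single analysis frame. The function name select_n_of_m and the envelope values are hypothetical and are not taken from any commercial implementation.

```python
import numpy as np

def select_n_of_m(envelopes, n):
    """Return the indices of the n channels with the largest envelopes."""
    envelopes = np.asarray(envelopes, dtype=float)
    if not 0 < n <= envelopes.size:
        raise ValueError("n must be between 1 and the number of channels")
    # Sort channel indices by ascending envelope and keep the last n.
    order = np.argsort(envelopes, kind="stable")
    # Return the kept indices in channel order for readability.
    return np.sort(order[-n:])

# Hypothetical envelope magnitudes for one frame of an m = 8 channel filter bank.
frame = [0.1, 0.7, 0.3, 0.9, 0.05, 0.6, 0.2, 0.8]
selected = select_n_of_m(frame, n=4)
print(selected)  # channels 1, 3, 5, and 7 are kept
```

The channels that are not selected are simply not stimulated in that cycle, which is the sense in which such a strategy resembles a data-reduction scheme.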
In MP3000, an additional processing stage is introduced
between the envelope estimation and the channel selection mod-
ules (see Figure 2, block 8). The psychoacoustic masking model
used is derived from a body of data from psychoacoustic mea-
surements in human auditory perception, such as studies on
absolute thresholds of hearing and simultaneous masking [5].
For each sound, the envelopes of each channel of the filter bank
are inputs to the psychoacoustic model, and masking spread
functions with three parameters (peak amplitude or attenuation,
high- and low-frequency slope) are calcu-
lated. The masked threshold is calculated
for each channel selected. The overall
masked threshold from all channels is
approximated by a nonlinear superposi-
tion of the separate masked thresholds
[16]. Subsequently, the n channels with
highest levels relative to an estimate of the
spread of masking are selected in each
stimulation cycle. This selection of stimu-
lation channels can be significantly differ-
ent from the ACE standard scheme where
only the n channels (typically n = 8) with
the highest envelope magnitudes are
selected. This is clearly visible in Figure 3,
where in channel 14 a formant is coded with MP3000 that is not
coded by ACE.
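The difference between the two selection rules can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the MP3000 implementation: the triangular spread function in dB, the power-law form of the nonlinear superposition, and all parameter values (atten_db, low_slope_db, high_slope_db, floor_db, alpha) are hypothetical choices made only to show the principle. In each step, the channel with the highest level relative to the current masked threshold is selected, and its masking spread is then added to the threshold.

```python
import numpy as np

def spread_db(level_db, masker, m, atten_db, low_slope_db, high_slope_db):
    """Masking spread (dB) of one masker channel across all m channels."""
    dist = np.arange(m) - masker           # signed distance in channels
    peak = level_db - atten_db             # peak sits atten_db below the masker level
    # Linear decay in dB: low_slope_db toward lower channels, high_slope_db toward higher ones.
    return np.where(dist < 0, peak + low_slope_db * dist, peak - high_slope_db * dist)

def select_with_masking(levels_db, n, atten_db=10.0, low_slope_db=25.0,
                        high_slope_db=10.0, floor_db=0.0, alpha=0.25):
    """Pick n channels by level relative to a running masked threshold."""
    levels = np.asarray(levels_db, dtype=float)
    m = levels.size
    # Running sum of intensity**alpha (assumed power-law superposition),
    # initialised with a flat floor standing in for the absolute threshold.
    acc = np.full(m, (10.0 ** (floor_db / 10.0)) ** alpha)
    available = np.ones(m, dtype=bool)
    selected = []
    for _ in range(n):
        threshold_db = (10.0 / alpha) * np.log10(acc)
        margin = np.where(available, levels - threshold_db, -np.inf)
        k = int(np.argmax(margin))         # highest level relative to threshold
        selected.append(k)
        available[k] = False
        s = spread_db(levels[k], k, m, atten_db, low_slope_db, high_slope_db)
        acc += (10.0 ** (s / 10.0)) ** alpha
    return sorted(selected)

# Hypothetical channel levels (dB): a broad peak on channels 0-2, a weaker one on channel 5.
levels = [60, 58, 57, 20, 20, 45, 20, 20]
by_magnitude = sorted(np.argsort(levels)[-3:].tolist())   # plain n-of-m selection
by_masking = select_with_masking(levels, n=3)
print(by_magnitude)  # [0, 1, 2]
print(by_masking)    # [0, 2, 5]
```

With these illustrative values, the magnitude-only rule spends all three selections on the broad peak, whereas the masking-based rule drops channel 1, which lies close to the masked threshold of channel 0, and codes the weaker but unmasked peak on channel 5 instead.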
MP3000 has been implemented and evaluated in a within-subject,
repeated-measures study with 221 subjects using an ABABA design,
with “A” for ACE and “B” for MP3000. With a fixed pulse
rate per channel, no significant difference was found for speech
intelligibility and strategy preference between MP3000 (four to six
spectral maxima selected) and ACE (eight to ten spectral maxima
selected). The best results were found for MP3000 with six spectral
maxima, leading to an increase in battery life of about 24% relative
to ACE [17]. Thus when a lower number of stimulation channels
is selected in each cycle, resulting in a lower overall stimulation
rate, MP3000 has advantages. However, overall subject preferences
were equally distributed between the two strategies, and additional
parameters have to be fitted in the MP3000 mapping sessions.
EXPERIMENTAL PROCESSING STRATEGIES
In this section, some experimental stimulation strategies are
briefly discussed to demonstrate the current limitations and
opportunities with CI stimulation. Most of these strategies have
been or are being considered for implementation in commercial
speech processors for CIs. The following sections concern loud-
ness-based strategies, envelope enhancement based on a neural
model, enhancement of periodicity modulation, and bilateral
stimulation strategies. The loudness-based strategies are not
shown in Figure 2. They can be added onto any strategy by add-
ing an extra block before the mapping block (5). The bilateral
strategies are not shown for reasons of clarity.
LOUDNESS-BASED STRATEGIES
A distinctive approach to sound processing for CIs has been
explored in a range of experimental schemes with the broad aim
of improving the experience of loudness for CI recipients when
listening to sounds with widely varying acoustic characteristics.
Psychophysical studies have shown that CI users generally do
not experience the loudness of sounds in the same way as listen-
ers with NH, particularly when the spectral content and level of
sound signals change over time.
In one such scheme, known as SpeL (for “Specific Loud-
ness”), the initial stages of sound processing are based on a
[FIG7] A virtual channel plot for the sentence “A boy fell from the window” processed
by HiRes120. Color intensity indicates current. Integer numbers indicate “real” channels.
Axes: time (s), 0.2 to 1.6; channel, 1 to 16. Color scale: current (µA), 0 to 900.