IEEE SIGNAL PROCESSING MAGAZINE [71] MARCH 2015

the normal anatomical or neurophysiological place because

generally electrode arrays do not allow insertion beyond the

anatomical position corresponding to acoustic frequencies

lower than 500–1,000 Hz. However, studies have shown that

with time of use of the CI, cortical plasticity can partly compen-

sate for this mismatch. Also, manufacturers have recently

introduced CI systems with electrode arrays that allow deeper

insertion depths to facilitate more apical stimulation.

After the filter bank, the magnitude of the envelope in each

channel is determined (block 4 in Figure 2), for instance with an

envelope detector using rectification or using a Hilbert transformation

followed by low-pass filtering. The low-pass cut-off frequency

should be high enough to pass the modulation frequencies below 20 Hz,

which carry the speech envelope information. Typical cut-off frequencies

are between 125 and 300 Hz. When spectral estimates are

obtained via an FFT, magnitudes corresponding to each of the elec-

trodes are obtained from the allocated FFT bins, summing the pow-

ers across adjacent FFT bins depending on the filter bandwidths.
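
As an illustration of the envelope stage described above, the following is a minimal sketch rather than any manufacturer's implementation. It assumes full-wave rectification followed by a simple one-pole low-pass filter, and, for the FFT route, it assumes that the channel magnitude is the square root of the summed bin powers. The function names, the 200-Hz default cut-off, and the bin allocation format are hypothetical choices made for the example.

```python
import math

def envelope_rectify_lowpass(signal, fs, cutoff_hz=200.0):
    """Full-wave rectify a band-limited signal, then smooth it.

    Assumption: a one-pole low-pass filter stands in for the low-pass
    stage; real processors may use a different filter design.
    """
    # Smoothing coefficient of the one-pole filter for the given cut-off.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    envelope = []
    state = 0.0
    for x in signal:
        state += alpha * (abs(x) - state)  # abs() is the rectifier
        envelope.append(state)
    return envelope

def band_magnitudes(bin_powers, bin_ranges):
    """Sum FFT bin powers over each channel's allocated bins.

    bin_ranges holds hypothetical (first_bin, end_bin) pairs, end exclusive.
    Assumption: the magnitude is the square root of the summed power.
    """
    return [math.sqrt(sum(bin_powers[lo:hi])) for lo, hi in bin_ranges]
```

For a steady tone, the smoothed output settles near the mean of the rectified waveform, with a small residual ripple that shrinks as the cut-off frequency is lowered.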

The stimulation levels are related to the magnitudes of the

band-limited input signals by user-specific functions. The output

of the envelope detector is transformed to a value between the

min and max levels according to a nonlinear compression function

because the electrical stimulation dynamic range (approximately 10 dB)

is much smaller than the input dynamic range of the preprocessor

(block 5 in Figure 2). This mapping is patient specific because

min and max can vary widely across patients, stimulation chan-

nels, and electrode configurations (due to the status of the neural

structures at the electrode-neuron interface and higher-level neu-

ral structures). Next, these transformed magnitudes modulate

carrier waves of electrical pulses. Commonly, symmetric biphasic

pulse trains are used in commercial CIs, and magnitude is coded

by varying the pulse amplitude and/or the pulse width.
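
The mapping from envelope magnitude to stimulation level can be sketched as follows. This is only an illustration: the text specifies a nonlinear compression between the min and max levels but not its exact form, so a logarithmic function is assumed here, and the input range limits (in_floor, in_ceil) and the function name are hypothetical.

```python
import math

def map_to_level(magnitude, min_level, max_level, in_floor=0.01, in_ceil=1.0):
    """Compress an envelope magnitude into [min_level, max_level].

    Assumptions: logarithmic compression and a 40-dB input range
    (in_floor to in_ceil); both are illustrative, not from the article.
    """
    # Clip the input to the assumed input dynamic range.
    m = min(max(magnitude, in_floor), in_ceil)
    # Fraction of the output range: 0 at in_floor, 1 at in_ceil.
    frac = math.log(m / in_floor) / math.log(in_ceil / in_floor)
    return min_level + frac * (max_level - min_level)
```

Because min_level and max_level are arguments, the same function can be called with a different pair of values for each channel and each recipient, which is what makes the map patient specific.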

For practical reasons (many CIs have only one current source)

but also for limiting across-channel interactions, pulsatile stimuli

are used in an interleaved stimulation scheme (i.e., only one pulse

is delivered at any time). Furthermore, all channels are activated in

a temporally nonoverlapping sequence, and a fixed stimulation car-

rier rate is used [typically 500–2,000 pulses per second (pps)], with

the total pulse rate equal to the number of active channels times

the channel rate. A fixed carrier rate has no relationship with auditory

neurophysiology, as neural fibers do not fire at fixed rates and stimulation

rates are generally far higher than neural spike rates. However,

it is simple from a signal processing point of view and provides

most CI recipients with adequate perception of sounds.
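
The timing of an interleaved scheme follows directly from the rates given above. The sketch below assumes that channels are visited in a fixed sequential order and that each pulse fits within its time slot; the function name and the ordering are illustrative assumptions.

```python
def interleaved_schedule(n_channels, channel_rate_pps, n_cycles=1):
    """Return the total pulse rate and a list of (onset_time_s, channel).

    Assumption: channels are stimulated in index order, one per slot.
    """
    # Total pulse rate = number of active channels x channel rate.
    total_rate = n_channels * channel_rate_pps
    slot = 1.0 / total_rate  # time available to each pulse, in seconds
    pulses = [((cycle * n_channels + ch) * slot, ch)
              for cycle in range(n_cycles)
              for ch in range(n_channels)]
    return total_rate, pulses
```

For example, eight active channels at 900 pps per channel give a total rate of 7,200 pps, so each pulse has a slot of about 139 microseconds and no two pulses coincide.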

This strategy can faithfully represent the temporal speech

envelope in the electrical stimulation patterns, leading to effec-

tive transmission of envelope information, which is a necessary

condition for speech perception. CIS was described by Wilson et

al. in 1991 [7]. Essentially the same sound processing scheme,

albeit with a relatively low stimulation rate (around 300 pps),

was previously used in an earlier French CI system [8].

In general, the evaluation (and comparison) of strategies is

mainly based on behavioral performance measures on identifica-

tion and discrimination tasks related to speech understanding,

music and tone perception, directional hearing, sound quality,

and preference measures. At present, no validated model of these

measures and no objective neurophysiological marker exists for

electrical stimulation, so behavioral tests remain the reference

evaluation approach.

In the following sections, a range of stimulation strategies for

CI sound coding is described. Along with a description of the

technical features of each strategy, we highlight the rationale

behind the strategy, where one can be identified. We also review

selected published outcomes for speech understanding and, if

relevant and available, also for music or tone perception. Some

of these schemes are widely used in commercial processors while

others are experimental and still in development.

SOUND PROCESSING STRATEGIES IMPLEMENTED IN

COMMERCIAL SOUND PROCESSORS

Since the introduction of the first stimulation strategies in com-

mercial multichannel CIs over 30 years ago, a number of diverse

sound processing strategies have been devised and evaluated.

These strategies focus on better spectral representation, better

distribution of stimulation across channels, and better temporal

representation of the input signal. The four most commonly

used strategies are described: 1) advanced combination encoder

(ACE) with channel selection based on spectral features; 2)

MP3000 (named after the MP3 digital audio format) with channel

selection and stimulation based on spectral masking; 3) fine

structure processing (FSP) based on enhancement of temporal

features; and 4) HiRes120 (high resolution) with temporal fea-

ture enhancement and current steering to improve the spatial

precision of stimulus delivery.

An overall outline of the sound processing steps for the dif-

ferent stimulation strategies, with common and differentiating

parts, is shown in Figure 2. The outputs of the strategies are

shown as electrodograms. An electrodogram is similar to a

spectrogram, but the vertical axis indicates channel number

rather than frequency, and biphasic current pulses are repre-

sented as vertical lines with amplitudes between 0 (min level

of map) and 1 (max level of map). Electrodograms are shown

for the synthesized vowel "ah" (Figure 3), a naturally spoken sentence

in quiet taken from the HINT corpus (Figure 4), a

selected word from the same sentence (Figure 5), and the

same sentence in steady noise with a speech-weighted spec-

trum at a signal-to-noise ratio of 10 dB (Figure 6). The base

stimulation rate per channel for ACE/CIS was 900 pps, for FSP

1,500 pps, and for HiRes120 1,856 pps.
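
The amplitude scale of an electrodogram can be made concrete with a short sketch. It assumes that each pulse is stored as a (time, channel, level) tuple and that each channel has its own (min, max) map; the data layout and function name are hypothetical.

```python
def to_electrodogram(pulses, channel_maps):
    """Normalize pulse levels to the 0-to-1 scale used in an electrodogram.

    pulses: iterable of (time_s, channel, level).
    channel_maps: {channel: (min_level, max_level)}, an assumed layout.
    """
    rows = []
    for time_s, channel, level in pulses:
        lo, hi = channel_maps[channel]
        amplitude = (level - lo) / (hi - lo)  # 0 at min of map, 1 at max
        # Clip levels that fall outside the map.
        rows.append((time_s, channel, min(max(amplitude, 0.0), 1.0)))
    return rows
```

Plotting each row as a vertical line of height amplitude at (time_s, channel) then gives the kind of display described above.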

Four manufacturers of CI systems are on the international

market (with implementations of strategies described in this

review): Cochlear (ACE, MP3000), Advanced Bionics (HiRes120),

Med-El (FSP), and Oticon Medical (formerly Neurelec).

ACE

ACE is the sound processing scheme currently used by most recip-

ients of CI systems manufactured by Cochlear. It is functionally

very similar to the spectral maxima sound processor (SMSP) and

the Speak scheme [9] used with previous models of Cochlear CIs.

The original development of the SMSP arose from the observation
