IEEE SIGNAL PROCESSING MAGAZINE [77] MARCH 2015

partially overlap, then temporal information from closely spaced electrodes will generally be combined at the neural level. Psychophysical studies have reported evidence that temporal patterns from nearby electrodes cannot be completely resolved by most CI recipients. This suggests that sound processing schemes like HiRes120 and FSP, which use very different approaches but rely on providing independent channels of information across adjacent electrodes, may result in only limited benefits [15]. More carefully controlled studies of CI recipients’ listening experiences using schemes such as HiRes120 and FSP over an extended time are needed to determine specifically whether provision of fine-structure information by these schemes is perceptually beneficial.

MP3000

The MP3000 strategy is based on the ACE scheme but uses a psychoacoustic masking model with the aim of improving sound perception for CI users based on more perceptually relevant channel selection. The masking model attempts to select the perceptually most important spectral components in the coding of any given input audio signal. The rationale for this development was that it should not be necessary to code sounds in parts of the spectrum that are masked. This approach reduces the spread of excitation and can lead to a more precise representation of the spectrum, which in turn could lead to improved speech intelligibility. Processing techniques based on auditory masking are widely used in common audio and music data-compression algorithms. These techniques also compress the audio signals by selecting only a subset of the frequency bands at a time. A well-known example is the MP3 compression algorithm. In principle, the n-of-m speech coding strategies such as ACE are similar to these data reduction or compression algorithms.
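
The n-of-m idea mentioned above can be illustrated with a minimal sketch. It assumes only what the text states for ACE-style selection, namely that the n channels with the largest envelope magnitudes are kept in each cycle. The function name `select_n_of_m`, the six-channel example, and the envelope values are hypothetical and chosen purely for illustration; they are not taken from any commercial processor.

```python
import numpy as np


def select_n_of_m(envelopes, n):
    """Return the indices of the n largest envelope magnitudes, in channel order.

    Illustrative only: `envelopes` holds one magnitude per filter-bank channel
    for a single stimulation cycle (m = len(envelopes)).
    """
    envelopes = np.asarray(envelopes, dtype=float)
    if not 0 < n <= envelopes.size:
        raise ValueError("n must satisfy 0 < n <= m")
    # argsort is ascending, so the last n entries are the n largest magnitudes.
    largest = np.argsort(envelopes, kind="stable")[-n:]
    return sorted(largest.tolist())


if __name__ == "__main__":
    # Hypothetical envelope magnitudes for m = 6 channels.
    frame = [0.1, 0.9, 0.8, 0.05, 0.4, 0.7]
    print(select_n_of_m(frame, n=3))
```

As in perceptual audio coders, only a subset of bands is transmitted per frame; the difference lies in the criterion used to choose that subset.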

In MP3000, an additional processing stage is introduced between the envelope estimation and the channel selection modules (see Figure 2, block 8). The psychoacoustic masking model used is derived from a body of data from psychoacoustic measurements in human auditory perception, such as studies on absolute thresholds of hearing and simultaneous masking [5]. For each sound, the envelopes of all filter bank channels are inputs to the psychoacoustic model, and masking spread functions with three parameters (peak amplitude or attenuation, high- and low-frequency slope) are calculated. The masked threshold is calculated for each channel selected. The overall masked threshold from all channels is approximated by a nonlinear superposition of the separate masked thresholds [16]. Subsequently, the n channels with the highest levels relative to an estimate of the spread of masking are selected in each stimulation cycle. This selection of stimulation channels can be significantly different from the standard ACE scheme, where only the n channels (typically n = 8) with the highest envelope magnitudes are selected. This is clearly visible in Figure 3, where in channel 14 a formant is coded with MP3000 that is not coded by ACE.
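
The selection procedure described above can be sketched roughly as follows. This is not the MP3000 implementation: it is a simplified illustration under several assumptions that the text does not specify. Channels are assumed to be ordered from low to high frequency, levels are in dB, the spread function is a triangle in dB defined by the three named parameters, the nonlinear superposition is modeled as a power-law sum with a hypothetical exponent `alpha`, and a flat hypothetical floor `floor_db` stands in for the absolute threshold. All function names, default values, and example levels are invented for illustration.

```python
import numpy as np


def ace_select(levels_db, n):
    """ACE-style selection: the n channels with the highest levels."""
    levels_db = np.asarray(levels_db, dtype=float)
    return sorted(np.argsort(levels_db, kind="stable")[-n:].tolist())


def masking_spread_db(masker, level_db, num_channels,
                      attenuation_db, low_slope_db, high_slope_db):
    """Triangular spread function (dB) of one selected channel.

    Three parameters, as in the text: peak attenuation plus a slope
    (dB per channel) toward lower and toward higher channels.
    """
    dist = np.arange(num_channels) - masker
    drop = np.where(dist < 0, -dist * low_slope_db, dist * high_slope_db)
    return level_db - attenuation_db - drop


def masked_select(levels_db, n, attenuation_db=10.0, low_slope_db=12.0,
                  high_slope_db=6.0, floor_db=0.0, alpha=0.25):
    """Pick n channels by level relative to a running masked threshold."""
    levels_db = np.asarray(levels_db, dtype=float)
    m = levels_db.size
    if not 0 < n <= m:
        raise ValueError("n must satisfy 0 < n <= m")
    # Power-law accumulator; starts at the (hypothetical) flat floor.
    acc = np.full(m, 10.0 ** (alpha * floor_db / 10.0))
    available = np.ones(m, dtype=bool)
    selected = []
    for _ in range(n):
        threshold_db = 10.0 * np.log10(acc) / alpha
        margin = np.where(available, levels_db - threshold_db, -np.inf)
        k = int(np.argmax(margin))  # channel that stands out most above threshold
        selected.append(k)
        available[k] = False
        spread = masking_spread_db(k, levels_db[k], m, attenuation_db,
                                   low_slope_db, high_slope_db)
        acc += 10.0 ** (alpha * spread / 10.0)  # nonlinear superposition (assumed form)
    return sorted(selected)


if __name__ == "__main__":
    # Hypothetical levels: a strong cluster (channels 1-3) and a weaker,
    # isolated peak (channel 6).
    levels = [40, 58, 60, 57, 30, 35, 50, 32]
    print("ACE:   ", ace_select(levels, 3))
    print("masked:", masked_select(levels, 3))
```

With these made-up numbers, the level-only rule spends all three slots on the strong cluster (channels 1, 2, and 3), whereas the masking-based rule keeps channels 1 and 2 and gives the third slot to the isolated peak at channel 6. This is the kind of difference the text points to in Figure 3.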

MP3000 has been implemented and evaluated with 221 subjects in a within-subject repeated-measures ABABA design, with “A” for ACE and “B” for MP3000. With a fixed pulse rate per channel, no significant difference was found for speech intelligibility and strategy preference between MP3000 (four to six spectral maxima selected) and ACE (eight to ten spectral maxima selected). The best results were found for MP3000 with six spectral maxima, leading to an increase in battery life of about 24% relative to ACE [17]. Thus, when a lower number of stimulation channels is selected in each cycle, resulting in a lower overall stimulation rate, MP3000 has advantages. However, overall subject preferences were equally distributed between the two strategies, and additional parameters have to be fitted in the MP3000 mapping sessions.
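
The link between the number of selected maxima and the overall stimulation rate is simple arithmetic, sketched below. It assumes that the overall rate is the number of maxima per cycle times the per-channel pulse rate. The value of 900 pulses per second is a hypothetical placeholder, since the text gives no per-channel rate. The sketch does not model power consumption; the battery-life figure quoted above is a measured result from [17], not something derived here.

```python
def total_stimulation_rate(n_maxima, rate_per_channel_pps):
    """Overall pulses per second when n_maxima channels are stimulated per cycle."""
    return n_maxima * rate_per_channel_pps


def relative_pulse_reduction(n_reference, n_reduced):
    """Fraction of pulses saved at a fixed per-channel rate."""
    return 1.0 - n_reduced / n_reference


if __name__ == "__main__":
    rate = 900  # hypothetical per-channel rate in pulses per second
    print(total_stimulation_rate(8, rate))  # eight maxima per cycle
    print(total_stimulation_rate(6, rate))  # six maxima per cycle
    print(relative_pulse_reduction(8, 6))
```

Because the per-channel rate is fixed, the relative reduction depends only on the two maxima counts: going from eight to six maxima removes a quarter of the pulses, whatever rate is assumed.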

EXPERIMENTAL PROCESSING STRATEGIES

In this section, some experimental stimulation strategies are briefly discussed to demonstrate the current limitations and opportunities with CI stimulation. Most of these strategies have been or are being considered for implementation in commercial speech processors for CIs. The following sections concern loudness-based strategies, envelope enhancement based on a neural model, enhancement of periodicity modulation, and bilateral stimulation strategies. The loudness-based strategies are not shown in Figure 2. They can be added onto any strategy by adding an extra block before the mapping block (5). The bilateral strategies are not shown for reasons of clarity.

LOUDNESS-BASED STRATEGIES

A distinctive approach to sound processing for CIs has been explored in a range of experimental schemes with the broad aim of improving the experience of loudness by CI recipients when listening to sounds with widely varying acoustic characteristics. Psychophysical studies have shown that CI users generally do not experience the loudness of sounds in the same way as listeners with NH, particularly when the spectral content and level of sound signals change over time.

In one such scheme, known as SpeL (for “Specific Loudness”), the initial stages of sound processing are based on a

[FIG7] A virtual channel plot for the sentence “A boy fell from the window” processed by HiRes120. Color intensity indicates current. Integer numbers indicate “real” channels. Axes: time (s), 0.2 to 1.6, versus channel; color scale: current (µA), 100 to 900.
