IEEE SIGNAL PROCESSING MAGAZINE [69] MARCH 2015
impedances of the stimulation channels can be measured (which
may lead to deactivation of some electrodes if faults are detected) and
parameters of the preprocessing stage can be adjusted.
Today’s CIs have a high power consumption compared to hear-
ing aids, which means that the batteries largely determine the size
of the BTE sound processor, making it cumbersome and unsightly
for users. This also means that users need to replace batteries
often, typically every day with rechargeable cells and every two
days for primary cells, which may be expensive and inconvenient.
Therefore, extensive research and development is currently
devoted to reducing power consumption. Another major comfort
improvement would be a totally implantable CI. The major chal-
lenge of a totally implantable system is the capture of airborne tar-
get sound with microphones and accelerometers, while
suppressing the high levels of unwanted noise emanating from
inside the human body.
A major technical and basic scientific challenge, and the sub-
ject of this article, is the translation of the captured sounds, partic-
ularly speech or music, to electrical stimulation patterns across
the intracochlear channels to optimize auditory perception and
interpretation. Historically, the objective of CIs has mainly been to
improve speech intelligibility. Speech intelligibility is determined
by spectral and temporal characteristics of the acoustic signal. The
spectral information is coarsely coded through multichannel rep-
resentation following the auditory system’s natural tonotopic
organization; i.e., acoustic spectral information is normally repre-
sented from low to high frequency in a corresponding spatial pro-
gression within the cochlea. Temporal speech information is
commonly classified into three categories:
1) the speech envelope, defined as the fluctuations in overall
amplitude at rates between 2 and 20 Hz
2) the periodicity from around 50 to 500 Hz, usually due to the
fundamental frequency (F0)
3) temporal fine structure (TFS).
TFS can be defined as the variations in wave shape within single
periods of periodic sounds, or over short time intervals of
aperiodic ones. It has dominant fluctuation rates from around 500 Hz
to 10 kHz. Alternatively, from a perceptual point of view, TFS can
be defined as the fast fluctuations in a signal that can be used by
NH listeners to perceive pitch, to localize sounds, and to binau-
rally segregate different sound sources. The fine structure is mod-
ulated in amplitude by the temporal envelope and periodicity. For
speech sounds, F0 is the frequency at which the vocal cords
vibrate. Recently the transmission of F0 information, related to
pitch perception, has attracted a lot of interest because of the need
to improve perception of music and tonal languages with CIs.
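The three temporal categories above can be summarized as a small lookup. The following is only an illustrative sketch: the function name, the inclusive boundaries, and the "unclassified" label for rates outside the quoted ranges (for example, between 20 and 50 Hz) are assumptions made here, and the ranges themselves are approximate.

```python
# Approximate rate ranges quoted in the text, in Hz.
# Inclusive boundaries and the "unclassified" label are assumptions of this sketch.
TEMPORAL_RANGES = [
    ("envelope", 2.0, 20.0),
    ("periodicity", 50.0, 500.0),
    ("temporal fine structure", 500.0, 10000.0),
]


def classify_fluctuation_rate(rate_hz):
    """Return the temporal category for a fluctuation rate given in Hz.

    Ranges are checked in order, so a rate of exactly 500 Hz is
    labeled "periodicity" rather than "temporal fine structure".
    """
    for name, low_hz, high_hz in TEMPORAL_RANGES:
        if low_hz <= rate_hz <= high_hz:
            return name
    return "unclassified"


if __name__ == "__main__":
    for rate in (4.0, 30.0, 120.0, 2000.0):
        print(rate, "Hz:", classify_fluctuation_rate(rate))
```

Under these assumptions, a 4-Hz syllabic modulation falls in the envelope range, a 120-Hz voice F0 in the periodicity range, and a 2-kHz oscillation in the TFS range.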
It is not easy to define pitch. It is defined by the American
National Standards Institute (1994) as “that attribute of auditory
sensation in terms of which sounds may be ordered on a scale
extending from high to low.” From a musical point of view, it can
be defined as “that attribute of sensation whose variation is associ-
ated with musical melodies.” For periodic sounds, pitch is the per-
ceptual counterpart of the fundamental frequency (F0), leading to
the alternative definition that “a sound has a certain pitch if it can
be reliably matched by adjusting the frequency of a pure tone of
arbitrary amplitude” [6]. While F0 is a purely physical signal attri-
bute, i.e., the frequency of the first harmonic of a complex tone,
pitch is a perceptual attribute that arises after processing in the
brain and cannot always be easily linked to physical signal attributes.
Typical relevant signals that elicit a pitch percept are spoken
vowels and sustained sounds produced by musical instruments.
Aperiodic sounds can also elicit a pitch percept, but it is less
well-defined.
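Because F0 is a physical attribute of the signal, it can be estimated directly from the waveform. The sketch below synthesizes a harmonic complex tone and picks the autocorrelation peak within a 50 to 500 Hz search range. It is a minimal illustration, not a model of pitch perception: the function names, the sampling rate, the equal-amplitude harmonics, and the unnormalized autocorrelation are all assumptions chosen for simplicity.

```python
import math


def harmonic_complex(f0_hz, harmonics, fs_hz, n_samples):
    """Return a sum of equal-amplitude sine harmonics of f0_hz (assumed test signal)."""
    return [
        sum(math.sin(2.0 * math.pi * h * f0_hz * n / fs_hz) for h in harmonics)
        for n in range(n_samples)
    ]


def estimate_f0(signal, fs_hz, f_min_hz=50.0, f_max_hz=500.0):
    """Estimate F0 as fs divided by the lag with the largest autocorrelation.

    The autocorrelation is left unnormalized, so shorter lags (which
    overlap more samples) are slightly favored over their multiples.
    """
    lag_min = int(fs_hz / f_max_hz)
    lag_max = int(fs_hz / f_min_hz)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs_hz / best_lag


if __name__ == "__main__":
    fs = 8000.0
    tone = harmonic_complex(200.0, [1, 2, 3, 4, 5], fs, 800)
    print("Estimated F0 (Hz):", estimate_f0(tone, fs))
```

The estimate describes only the physical periodicity of the waveform; whether a listener perceives a corresponding pitch depends on the subsequent auditory processing.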
In the normal auditory system, pitch is determined by three
different physical cues: 1) place of stimulation in the cochlea,
2) TFS, and 3) periodicity. The cochlea is tonotopically orga-
nized, so sounds with different spectral content will activate dis-
tinct neural populations, leading to different percepts. In the
case of a simple sinusoid, there is a one-to-one relation between
frequency and place of stimulation. For harmonic sounds, the
situation is more complicated: the place of stimulation of the
lowest harmonic still has a one-to-one relationship with F0, but
the higher harmonics do not by themselves directly code F0.
The spectral pitch mechanism is not very sensitive to small
changes in F0, and the change in percept associated with a pure
change in spectral pitch has been reported to correspond more
to a change in timbre than a change in pitch [6]. Timbre, also
called tone color, tone quality, or brightness, is the quality of a
sound that distinguishes different types of sound production,
such as voices or musical instruments. The American Standards
Association (1960) defines timbre by exclusion as “that attribute
of sensation in terms of which a listener can judge that two
sounds having the same loudness and pitch are dissimilar.”
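The one-to-one relation between the frequency of a sinusoid and its place of stimulation can be illustrated with a monotonic, invertible frequency-to-place map. The sketch below uses the Greenwood frequency-position function with commonly quoted human constants. That function and its constants are not taken from this article and are assumed here purely for illustration, with position expressed as a proportion of cochlear length measured from the apex.

```python
import math

# Commonly quoted human constants for the Greenwood function (assumption,
# not from this article): f = A * (10 ** (a * x) - k), with x in [0, 1]
# measured from the apex.
A_HZ = 165.4
SLOPE = 2.1
K = 0.88


def place_to_frequency(x):
    """Map relative cochlear position x (0 = apex, 1 = base) to frequency in Hz."""
    return A_HZ * (10.0 ** (SLOPE * x) - K)


def frequency_to_place(frequency_hz):
    """Inverse map: frequency in Hz to relative cochlear position."""
    return math.log10(frequency_hz / A_HZ + K) / SLOPE


if __name__ == "__main__":
    for f in (250.0, 1000.0, 4000.0):
        x = frequency_to_place(f)
        print(f, "Hz -> position", round(x, 3))
```

Because the map is strictly increasing, each pure-tone frequency corresponds to exactly one position and vice versa. For a harmonic complex, each harmonic lands at its own place, which is why the higher harmonics do not by themselves directly code F0.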
The second pitch-related cue, TFS, can yield a strong and tonal
pitch percept when individual harmonics are coded by discrete
neural populations and their frequency is lower than the maximal
frequency to which neurons can phase-lock (around 1,500 Hz);
i.e., the neural action potentials tend to occur during a particular
phase of the oscillation. When multiple harmonics excite the same
hair cells and therefore neurons, information is carried mainly by
the aggregate stimulation pattern. This is likely to happen at
higher frequencies because harmonics of a given F0 are spaced
linearly in frequency whereas the auditory periphery is organized
logarithmically. This leads to unavailability of the TFS of individ-
ual harmonics. However, the auditory system can still make use of
a third physical cue: the periodicity of the combined harmonics,
which corresponds to the F0. Perception of periodicity is limited
to around 300–500 Hz. Periodicity pitch is weak compared to TFS
pitch. For good pitch perception across a wide variety of types of
sound, all three cues are needed.
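The argument that harmonics are spaced linearly while the auditory periphery is organized roughly logarithmically can be made concrete with a rough calculation. The sketch below compares the constant harmonic spacing (equal to F0) with the equivalent rectangular bandwidth (ERB) of the auditory filter at each harmonic frequency, using the Glasberg and Moore approximation. Both the ERB formula and the criterion "resolved if the spacing exceeds one ERB" are assumptions introduced for illustration. They are not taken from this article, and the criterion is only a crude heuristic.

```python
def erb_hz(frequency_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * frequency_hz / 1000.0 + 1.0)


def resolved_harmonics(f0_hz, n_harmonics):
    """Return harmonic numbers whose spacing (F0) exceeds the local ERB.

    Heuristic assumption: a harmonic counts as resolved when its
    neighbors are more than one ERB away.
    """
    return [
        h for h in range(1, n_harmonics + 1)
        if f0_hz > erb_hz(h * f0_hz)
    ]


if __name__ == "__main__":
    for f0 in (100.0, 200.0):
        print("F0 =", f0, "Hz, resolved harmonics:", resolved_harmonics(f0, 20))
```

Under these assumptions only the lowest several harmonics are resolved. Above them, several harmonics fall within one auditory filter, so the TFS of individual harmonics is unavailable and only the periodicity of the combined waveform remains as a cue.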
Pitch perception with CIs is extremely poor. This is due both to
limitations at the interface with electrical stimulation (spread of
excitation) and to imprecise coding of temporal cues. The large
spread of excitation in the cochlea and the small number of chan-
nels to code the low frequencies with electrical stimulation
reduce the spectral resolution and therefore the precision of spectral
pitch. Another limitation with electrical stimulation is the
inability of CI users to perceive TFS. Therefore the only remain-
ing mechanism is periodicity pitch perception, which is much
weaker than TFS pitch and limited by the maximum frequency