
IEEE SIGNAL PROCESSING MAGAZINE [96] MARCH 2015
several advantages. First, it lowers
the required filter orders of the
HRTFs, reducing computational and
memory requirements. Second, the
phase differences between neighbor-
ing HRTFs are significantly reduced,
which alleviates the comb-filtering
artifacts during interpolation. Finally,
the perceptual limits for the ITD necessitate variable time delays
with subsample accuracy, which are best implemented using frac-
tional-delay filtering techniques, e.g., [20].
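As an illustration of subsample delays, the following minimal sketch splits an ITD into an integer delay and a short Lagrange fractional-delay FIR filter. Lagrange interpolation is only one of several fractional-delay designs and is not necessarily the one used in [20]; the function names, the filter order, and the example ITD are assumptions made for this sketch.

```python
import math


def lagrange_fractional_delay(delay, order=3):
    """FIR coefficients h[0..order] approximating a delay of `delay` samples.

    Lagrange interpolation; most accurate when `delay` is close to order / 2.
    """
    coeffs = []
    for n in range(order + 1):
        h = 1.0
        for k in range(order + 1):
            if k != n:
                h *= (delay - k) / (n - k)
        coeffs.append(h)
    return coeffs


def itd_delay_filter(itd_seconds, sample_rate, order=3):
    """Split an ITD into an integer delay plus a fractional-delay FIR filter.

    Hypothetical helper: the integer part is realized as a plain sample delay,
    and the remainder is kept near the centre of the Lagrange filter.
    """
    total = itd_seconds * sample_rate          # delay in (fractional) samples
    centre = (order - 1) // 2
    integer = max(int(math.floor(total)) - centre, 0)
    fractional = total - integer
    return integer, lagrange_fractional_delay(fractional, order)


# Example with assumed values: a 260 microsecond ITD at 48 kHz (12.48 samples).
integer_delay, fir = itd_delay_filter(260e-6, 48000)
```

The total delay applied to the lagging ear is the integer delay plus the delay of the FIR filter, so the ITD can be varied smoothly without being quantized to whole samples.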
Several strategies exist to interpolate the HRTF responses
(without delays). Interpolating the magnitude and phase responses
separately preserves the complex-valued responses of HRTF filters
[6]. Other approaches make use of the physical properties of (delay-
compensated) HRTFs, which closely resemble minimum-phase
systems [21], or the limited perceptual relevance of the phase [19],
[22]. Interpolation of the magnitude responses followed by mini-
mum-phase reconstruction is proposed in [6] and [16]. Another
method is to interpolate only the HRTF magnitudes [19].
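Magnitude interpolation followed by minimum-phase reconstruction can be sketched as follows. This is a minimal example, assuming linear interpolation between two neighboring (delay-compensated) responses and the standard real-cepstrum folding method for the reconstruction; it is not claimed to be the exact procedure of [6] or [16], and the function names and FFT size are illustrative.

```python
import numpy as np


def minimum_phase_from_magnitude(mag, n_fft):
    """Minimum-phase impulse response with the given magnitude response.

    `mag` holds n_fft // 2 + 1 magnitude samples (rfft layout); n_fft is even.
    """
    log_mag = np.log(np.maximum(mag, 1e-8))      # guard against log(0)
    cepstrum = np.fft.irfft(log_mag, n_fft)      # real, even cepstrum
    fold = np.zeros(n_fft)                       # make the cepstrum causal
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    spectrum = np.exp(np.fft.rfft(cepstrum * fold))
    return np.fft.irfft(spectrum, n_fft)


def interpolate_min_phase(h_a, h_b, weight, n_fft=256):
    """Interpolate the magnitudes of two responses, then rebuild the phase."""
    mag_a = np.abs(np.fft.rfft(h_a, n_fft))
    mag_b = np.abs(np.fft.rfft(h_b, n_fft))
    mag = (1.0 - weight) * mag_a + weight * mag_b
    return minimum_phase_from_magnitude(mag, n_fft)


# Two toy minimum-phase responses standing in for neighboring HRTFs.
h_a = np.array([1.0, 0.5])
h_b = np.array([1.0, -0.3])
h_mid = interpolate_min_phase(h_a, h_b, weight=0.5)
```

The interaural delay removed before interpolation would be reinserted separately, for example as a fractional delay.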
Filtering of HRTFs can be performed either by linear convolu-
tion in the time domain, or by frequency-domain fast convolution
techniques. While the latter is significantly more efficient than lin-
ear convolution for all but the lowest filter orders, it introduces an
additional blocking latency in the order of the HRTF filter length,
which can be critical for assisted listening, e.g., hear-through
applications. Partitioned convolution techniques [23], [24] enable
advantageous tradeoffs between the efficiency of fast convolution
and system latency.
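To make the latency tradeoff concrete, the sketch below implements a uniformly partitioned overlap-save convolution, in which the filter is split into partitions of one block each so that the blocking latency is tied to the block size rather than to the full filter length. It is a simplified illustration under that assumption and not the specific algorithms of [23] or [24]; the function name and the block size are hypothetical.

```python
import numpy as np


def partitioned_convolve(x, h, block):
    """Uniformly partitioned overlap-save convolution (offline sketch)."""
    n_fft = 2 * block
    n_parts = -(-len(h) // block)                 # ceil(len(h) / block)
    h_pad = np.zeros(n_parts * block)
    h_pad[:len(h)] = h
    # One spectrum per filter partition, each zero-padded to 2 * block.
    H = np.fft.rfft(h_pad.reshape(n_parts, block), n_fft, axis=1)

    n_out = len(x) + len(h) - 1
    n_blocks = -(-n_out // block)
    x_pad = np.zeros(n_blocks * block)
    x_pad[:len(x)] = x

    fdl = np.zeros((n_parts, block + 1), dtype=complex)  # frequency-domain delay line
    prev = np.zeros(block)
    y = np.zeros(n_blocks * block)
    for b in range(n_blocks):
        cur = x_pad[b * block:(b + 1) * block]
        fdl = np.roll(fdl, 1, axis=0)             # age the stored input spectra
        fdl[0] = np.fft.rfft(np.concatenate((prev, cur)))
        out = np.fft.irfft((fdl * H).sum(axis=0), n_fft)
        y[b * block:(b + 1) * block] = out[block:]  # keep the alias-free half
        prev = cur
    return y[:n_out]
```

Each loop iteration needs only the current input block, so a real-time version could emit one output block per input block; smaller blocks lower the latency at the cost of more partitions per block.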
HRTF crossfading, which is also denoted as commutation [6],
refers to the gradual transition between interpolated HRTFs. It
reduces audible artifacts that are caused by the exchange of filter
coefficients. Thus, crossfading is typically performed at a much
higher time resolution than HRTF interpolation. The choice of
the crossfading algorithm tightly depends on the convolution
method used for HRTF filtering. In case of linear convolution, it
can be efficiently implemented by a linear interpolation of the
finite impulse response (FIR) filter coefficients. In contrast, integrating
crossfading with frequency-domain convolution is more
difficult due to block-based operation. A typical solution is to per-
form two convolution processes in parallel and to crossfade the
filtered signals in the time domain. A technique that combines
crossfading with frequency-domain and partitioned convolution
to avoid the complexity of two separate filtering processes is pro-
posed in [24].
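The two basic crossfading strategies can be compared in a short sketch. Assuming a linear fade over one signal segment and two FIR filters of equal length (both assumptions of this example, as are the function name and test signals), interpolating the coefficients sample by sample gives the same output as running both filters in parallel and crossfading their outputs, because filtering is linear in the coefficients.

```python
import numpy as np


def crossfade_fir(x, h_old, h_new):
    """Filter x while fading linearly from h_old to h_new over len(x) samples.

    Written for clarity rather than speed: the coefficients are interpolated
    for every output sample.
    """
    h_old = np.asarray(h_old, dtype=float)
    h_new = np.asarray(h_new, dtype=float)
    taps = len(h_old)
    x_pad = np.concatenate((np.zeros(taps - 1), x))
    y = np.empty(len(x))
    for n in range(len(x)):
        a = n / (len(x) - 1) if len(x) > 1 else 1.0   # fade position in [0, 1]
        h = (1.0 - a) * h_old + a * h_new
        frame = x_pad[n:n + taps][::-1]               # x[n], x[n-1], ...
        y[n] = np.dot(h, frame)
    return y


# Reference: two parallel convolutions crossfaded in the time domain.
rng = np.random.default_rng(1)
x = rng.standard_normal(200)
h_old = rng.standard_normal(16)
h_new = rng.standard_normal(16)
ramp = np.linspace(0.0, 1.0, len(x))
y_parallel = ((1.0 - ramp) * np.convolve(x, h_old)[:len(x)]
              + ramp * np.convolve(x, h_new)[:len(x)])
y_interp = crossfade_fir(x, h_old, h_new)
```

The parallel form is the one that carries over to block-based frequency-domain convolution, at the cost of a second filtering process during the transition.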
AUDIO-AUGMENTED REALITY
Audio-augmented reality refers to a system in which the user
simultaneously hears both synthetic sounds and the ambient sounds
around her/him. In addition to the requirements of regular
headphone or virtual-reality listening, a hear-through mode is now
needed [25], [26].
The hear-through mode is trivial in open and bone-conduction
headphones, which do not block the ear canal [27]. Then the user
will always hear the ambient sounds without extra attenuation.
However, other types of headphones,
such as closed and IE headphones,
block the ear canal and suppress out-
side sounds. The hear-through mode
must compensate for this attenuation so that environmental
sounds can be heard naturally. As seen in Figure 2, in closed-back
and IE headphones, the attenuation at low frequencies is modest,
but above 1 kHz it can be substantial, exceeding 20 dB. This
corresponds to severe acoustic isolation of the headphone user,
similar to that observed with hearing protectors.
A hear-through system is usually based on an external micro-
phone [25]. The ambient sound signal captured by the micro-
phone is filtered and sent to the earpiece with an appropriate gain.
The aim of the filtering and the amplification is to cancel the
attenuation caused by the headphone itself. Thus, the filter is usu-
ally of high-pass type, because low frequencies leak to the ear
without being much damped.
An additional constraint in a hear-through system is its
latency, or the time delay between the leaked and processed sound
[25]. It is inevitable that some delay is caused by the analog-to-dig-
ital and digital-to-analog conversions and the processing itself,
which the microphone signal undergoes. This delay can be, e.g., 1 ms.
When the delayed, processed sound is added to the leaked
sound at the ear, a comb-filtering effect can color ambient sounds,
which is disturbing. The disturbance is strongest when a notch of
the comb filter occurs at the frequency range where the leaked
and processed sound are equally loud [25]. This corresponds to a 6
dB attenuation in both direct sound and the processed sound. For
this reason, somewhat surprisingly, a colorless hear-through system is
easiest to implement for headphones that attenuate outside
sounds well, because then most of the ambient sound can come
through microphones and processing.
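The comb-filtering effect can be illustrated with a simple two-path model: the sound at the ear is the leaked sound plus a delayed copy from the processing chain. The sketch below assumes frequency-independent gains for both paths, which is a simplification (the real leakage is frequency dependent), and the gain values and function names are illustrative. Under this model the notches fall at odd multiples of 1/(2τ), so a delay of τ = 1 ms places the first notch at 500 Hz.

```python
import cmath
import math


def combined_gain_db(freq_hz, delay_s, g_leak, g_proc):
    """Level of leaked sound plus delayed processed sound, in dB."""
    h = g_leak + g_proc * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20.0 * math.log10(max(abs(h), 1e-12))   # floor avoids log10(0)


def notch_frequencies(delay_s, f_max):
    """Notch frequencies (2k + 1) / (2 * delay) up to f_max."""
    freqs = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= f_max:
        freqs.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return freqs


def ripple_db(g_leak, g_proc):
    """Peak-to-notch ripple of the comb filter; infinite for equal gains."""
    if g_leak == g_proc:
        return math.inf
    return 20.0 * math.log10((g_proc + g_leak) / abs(g_proc - g_leak))


# Equal levels: deep notches. Leak 20 dB below the processed sound: mild ripple.
deep = combined_gain_db(500.0, 1e-3, 0.5, 0.5)
mild = ripple_db(0.1, 1.0)
```

In this model the ripple shrinks as the leaked sound becomes weaker relative to the processed sound, which matches the observation that well-isolating headphones make a colorless hear-through system easier to achieve.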
ALL-PASS HEAR-THROUGH DESIGN
We describe briefly a method to design a hear-through system
based on the all-pass principle [28]. The method takes as its input
the impulse response corresponding to the isolation transfer func-
tion of the headset. It can be measured using a dummy head with
headphones and by playing a sinusoidal sweep signal from a loud-
speaker. Additionally, it is necessary to know the latency of the
acoustic signal processing system from the microphone input to
the earpiece output, which is easy to measure. Furthermore, it is
important to account for the magnitude and group-delay of the
earpiece response, but here we assume it to be flat and delay-free.
The beginning of the impulse response is given as the input to
the all-pass filter design method, which completes it so that the
overall system is all-pass [28]. Figure 6(a) shows an example where
the given sequence is the beginning of the isolation impulse
response, which corresponds to the low-pass filter response in Fig-
ure 6(b). When a truncated impulse response of an all-pass filter is
combined with it, the overall magnitude response becomes flat, as
shown in Figure 6(b). In practice, the headphone itself produces