
IEEE SIGNAL PROCESSING MAGAZINE [108] MARCH 2015
parameters [30]. One of the major challenges in numerically
modeling the HRTF today is the very high imaging resolution
required to predict HRTFs accurately at high frequencies. The
required mesh resolution depends on the shortest wavelength,
which is around 17 mm at 20 kHz [13]. Moreover, obtaining these
optical descriptors demands extremely expensive laser or MRI
scanners and highly skilled operators.
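The 17 mm figure follows from the relation wavelength = c / f. The short sketch below reproduces that arithmetic, assuming a speed of sound of 343 m/s (a typical room-temperature value that the text does not state); the function name `wavelength_mm` is illustrative only.

```python
def wavelength_mm(frequency_hz, speed_of_sound_m_s=343.0):
    """Return the acoustic wavelength in millimetres (lambda = c / f)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # c / f gives metres; multiply by 1000 for millimetres.
    return speed_of_sound_m_s / frequency_hz * 1000.0


# Shortest wavelength of interest: the upper edge of the audio band.
shortest = wavelength_mm(20_000.0)
print(f"{shortest:.2f} mm")
```

Under that assumed speed of sound, the result is consistent with the roughly 17 mm quoted above; the mesh must resolve features well below this length.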
PERCEPTUAL FEEDBACK
Several attempts have been carried out to personalize HRTF from
a generic HRTF database using perceptual feedback. Subjects
select the HRTFs through listening tests, where they choose the
HRTFs based on the correct perception of frontal sources and
reduced front–back reversals [13]. Listeners can also adapt the
nonindividualized HRTFs by modifying them to suit their own
perception. Middlebrooks observed that the peaks and notches
of HRTFs are frequency shifted for different individuals and that
the extent of the shift is related to the size of the pinna [31]. Listeners
often tune the spectrum until they achieve a satisfactory and natu-
ral spatialization [13]. Other techniques involve active sensory
tuning [26] and tuning the PCA weights [32] to individualize the
HRTFs. These perceptual-based methods are much simpler in
terms of the required resources and effort compared to the indi-
vidualization methods using acoustical measurements or anthro-
pometric data. However, these listening sessions can sometimes be
quite long and result in listener fatigue.
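The frequency shift observed by Middlebrooks suggests one simple way such tuning could work: stretch or compress the frequency axis of a generic HRTF magnitude response by a single scale factor. The sketch below is a hypothetical illustration of that idea, not the procedure of [31] or [13]; the function names, the linear interpolation, and the example notch are all assumptions.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Linearly interpolate ys over ascending xs, holding the end values."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


def scale_spectrum(freqs_hz, mags_db, scale):
    """Shift spectral features by `scale` (> 1 moves them up in frequency).

    The output at frequency f takes the input value at f / scale.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [interpolate(f / scale, freqs_hz, mags_db) for f in freqs_hz]


# Hypothetical generic response: flat, with one 20 dB notch at 8 kHz.
freqs = [1000.0 * k for k in range(1, 21)]  # 1 kHz to 20 kHz
generic = [-20.0 if f == 8000.0 else 0.0 for f in freqs]

# A listener-chosen scale factor of 1.25 moves the notch upward.
tuned = scale_spectrum(freqs, generic, 1.25)
notch_hz = freqs[tuned.index(min(tuned))]
print(notch_hz)
```

In a listening session, the scale factor would be the single parameter the listener adjusts, which is part of why such methods need far fewer resources than acoustical measurement.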
FRONTAL PROJECTION PLAYBACK
More recently, a study by Sunder et al. [33] customized the non-
individualized HRTFs using a frontal projection headphone.
Unlike side projection of sound in conventional headphones, a
frontal projection headphone projects the sound from the front to
emulate the playback from a physical set of loudspeakers. By pro-
jecting the sound from the front, the idiosyncratic frontal pinna
spectral cues of the listener are captured inherently during the
playback [33]. The idiosyncratic high-frequency pinna cues
captured in the frontal projection headphone response were found
to match the frontal HRTF cues well, resulting in better frontal
perception (as shown in Figure 4). The authors of [33] reported
that front–back reversals were reduced by almost 50% using the
frontal projection headphone, thus improving the veracity of the
3-D audio. The advantage of this
technique is that it does not require any measurements, train-
ing, or the anthropometric data of the listener. However, the
frontal projection individualization technique has been limited
to only the horizontal plane and also requires a special kind of
headphone equalization (Type-2).
As discussed previously, head tracking is important in the
virtualization process. It was found that head tracking, when
used with nonindividualized HRTFs, can improve the localiza-
tion [10]. However, head tracking primarily helps in reducing
the front–back confusions and has minimal effect in reducing
the elevation localization errors, IHL [10], and coloration
caused by nonindividualized HRTFs. Since individualization of
HRTFs can alleviate some of these limitations, it is suggested
that head tracking be used with individualized rendering.
In summary, there is a noticeable trend toward increasingly
accurate individualization with less data, complexity,
and effort. However, the effect of individualization of HRTFs can
be hindered by the presence of the headphones. Hence, the
headphones have to be compensated to ensure that the spec-
trum at the eardrum has only the individualized HRTF features.
Additionally, equalization of the binaural recording itself may be
necessary in certain applications (e.g., musical recordings). The
challenges and methods of equalization for both binaural and
stereo recordings are explained in the next section.
EQUALIZATION
Headphones are not acoustically transparent as they not only
color the sound that is played from the headphone but also affect
the free-air characteristics at the ear. Typically, the HPTF
comprises the headphone transducer response and the acoustic
coupling between the headphones and the listener’s ears. To
compensate for the headphone response, the HPTF is first
measured at the same point where the recording was carried out,
either at the blocked ear canal or at the eardrum [35]. The binaural
recording is then deconvolved with the HPTF to eliminate the
effect of the recording microphones and the headphone. This type
of direct equalization is also known as the nondecoupled mode of
equalization (Table 4) [36]. This method is often used when the
HPTF is measured with the same measurement setup as the
recording and works particularly well when the HPTF measurement
and recording are carried out on the same dummy head.
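In the frequency domain, the deconvolution described above amounts to dividing the recording spectrum by the HPTF in each frequency bin. A plain division is unstable wherever the HPTF has deep notches, so the sketch below uses a regularized inverse instead. The regularization constant `beta`, the function names, and the example values are assumptions for illustration and do not come from [35] or [36].

```python
def regularized_inverse(hptf_bins, beta=1e-3):
    """Return the per-bin inverse filter conj(H) / (|H|^2 + beta).

    beta = 0 gives the exact inverse 1 / H (and fails on a zero bin);
    beta > 0 limits the gain where |H| is small, e.g. at deep notches.
    """
    if beta < 0:
        raise ValueError("beta must be non-negative")
    return [h.conjugate() / (abs(h) ** 2 + beta) for h in hptf_bins]


def equalize(recording_bins, hptf_bins, beta=1e-3):
    """Apply the regularized inverse HPTF to a recording spectrum."""
    if len(recording_bins) != len(hptf_bins):
        raise ValueError("spectra must have the same number of bins")
    inverse = regularized_inverse(hptf_bins, beta)
    return [x * g for x, g in zip(recording_bins, inverse)]


# Hypothetical three-bin spectra (complex frequency responses).
hptf = [2.0 + 0.0j, 0.5 + 0.5j, 0.001 + 0.0j]  # last bin: deep notch
recording = [1.0 + 0.0j, 1.0 + 0.0j, 1.0 + 0.0j]

equalized = equalize(recording, hptf, beta=1e-3)
```

With these example values, an exact inverse would boost the notch bin by a factor of 1000, whereas the regularized version keeps its gain below unity. How much regularization is acceptable is a design choice that trades equalization accuracy against noise amplification.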
It is observed that, in the absence of headphone equalization,
the front–back reversals are increased and the elevation localiza-
tion is distorted [1], [13], [26]. Thus, headphone equalization is
critical to create a convincing perception of virtual sound sources.
However, headphone equalization is challenging since the HPTF
depends on individual morphology (headphone–ear coupling).
Researchers have also reported that the use of nonindividualized
equalization can reduce the externalization and the effect can be
[FIG4] A comparison of the frontal projection headphone
response and the frontal directional HRTFs measured on a
dummy head. Axes: frequency (Hz, 10^2 to 10^4) versus magnitude
(dB/20 μPa, 50 to 120); the high-frequency cues are marked
between 5 kHz and 16 kHz. (Figure used courtesy of [33].)