
IEEE SIGNAL PROCESSING MAGAZINE [34] MARCH 2015
spherical wave is emitted by an isotropic point-like source (IPLS)
located at a time-frequency-dependent position $\mathbf{d}_{\mathrm{IPLS}}(k,n)$. The
magnitude of the pressure of the spherical wave is inversely pro-
portional to the distance traveled, which is known in physics as
the inverse distance law. The diffuse sound is assumed to be spa-
tially isotropic and homogeneous, which means that diffuse sound
arrives from all directions with equal power and that its power is
position independent. Finally, it is assumed that the direct sound
and diffuse sound are uncorrelated.
The direct and diffuse sounds are captured with one or
more microphone arrays (depending on the application) that
are located in the far field of the sound sources. Therefore, at
the microphone array(s), the spherical wave can be approxi-
mated by a plane wave arriving from direction $\varphi(k,n)$. In the
following, we will differentiate between two related geometrical
models: the DOA-based model and the position-based model. In
the DOA-based model, the DOA and direct sound are estimated
with a single microphone array, while in the position-based
model, the position of the IPLS is estimated using at least two
spatially distributed arrays, and the sound is captured using
one or more microphones.
Under the aforementioned assumptions, the signals received
at the omnidirectional microphones of an M-element micro-
phone array can be written as
$$\mathbf{x}(k,n) = \mathbf{x}_s(k,n) + \mathbf{x}_d(k,n) + \mathbf{x}_n(k,n), \tag{1}$$
where the vector $\mathbf{x}(k,n) = [X(k,n,\mathbf{d}_1), \ldots, X(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ contains the $M$ microphone signals in the time-frequency domain, and $\mathbf{d}_1, \ldots, \mathbf{d}_M$ are the microphone positions. Without loss of generality, the first microphone, located at $\mathbf{d}_1$, is used as a reference microphone. The vector $\mathbf{x}_s(k,n) = [X_s(k,n,\mathbf{d}_1), \ldots, X_s(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ is
the captured direct sound at the different microphones, and
$\mathbf{x}_d(k,n) = [X_d(k,n,\mathbf{d}_1), \ldots, X_d(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ is the captured diffuse
sound. Furthermore, $\mathbf{x}_n(k,n)$ contains the slowly time-varying
noise signals (for example, the microphone self-noise). The direct
sound at the different microphones can be related to the direct
sound at the reference microphone via the array propagation vector
$\mathbf{g}(k,\varphi)$, which can be expressed as
$$\mathbf{x}_s(k,n) = \mathbf{g}(k,\varphi)\, X_s(k,n,\mathbf{d}_1). \tag{2}$$
The $m$th element of the array propagation vector
$\mathbf{g}(k,\varphi) = [g(k,n,\mathbf{d}_1), \ldots, g(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ is the relative transfer
function of the direct sound from the $m$th to the first micro-
phone, which depends on the DOA $\varphi(k,n)$ of the direct sound
from the point of view of the array. For instance, for a uniform
linear array of omnidirectional microphones,
$g(k,n,\mathbf{d}_m) = \exp\{\, j\kappa \sin\varphi \, \|\mathbf{d}_m - \mathbf{d}_1\| \,\},$
where $j$ denotes the imaginary unit, $\kappa$ is the wavenumber, and $\|\mathbf{d}_m - \mathbf{d}_1\|$ is the distance between
positions $\mathbf{d}_m$ and $\mathbf{d}_1$.
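The propagation-vector model above can be sketched numerically. The following snippet builds the uniform-linear-array relative transfer functions $g_m = \exp\{j\kappa\sin\varphi\,\|\mathbf{d}_m-\mathbf{d}_1\|\}$ and applies (2) to obtain the direct sound at all microphones; the array geometry (four microphones, 3 cm spacing), the frequency, the DOA, and the reference-signal value are illustrative assumptions, not values from the article.

```python
import numpy as np

def propagation_vector(freq_hz, doa_rad, mic_positions, c=343.0):
    """Relative transfer functions g(k, phi) for omnidirectional mics on a
    uniform linear array: g_m = exp(j * kappa * sin(phi) * ||d_m - d_1||)."""
    kappa = 2.0 * np.pi * freq_hz / c                       # wavenumber
    distances = np.linalg.norm(mic_positions - mic_positions[0], axis=1)
    return np.exp(1j * kappa * np.sin(doa_rad) * distances)

# Hypothetical setup: four mics spaced 3 cm apart along the x-axis.
mics = np.array([[0.03 * m, 0.0, 0.0] for m in range(4)])
g = propagation_vector(freq_hz=1000.0, doa_rad=np.deg2rad(30.0),
                       mic_positions=mics)

# Direct sound at all microphones from the reference signal, as in (2).
X_s_ref = 1.0 + 0.5j            # assumed direct sound at the reference mic
x_s = g * X_s_ref               # vector of direct-sound components
```

Note that the first element of `g` is 1 by construction, since the reference microphone is related to itself by a unit transfer function, and all elements have unit magnitude because the model only encodes phase shifts.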
In this article, we will demonstrate how this geometric model
can be effectively utilized to support a number of assisted listening
applications. In the considered applications, the desired output
signal of a loudspeaker (or headphone) channel $Y_i(k,n)$ is given
as a weighted sum of the direct and diffuse sound at the reference
microphone, i.e.,
$$Y_i(k,n) = G_i(k,n)\, X_s(k,n,\mathbf{d}_1) + Q_i(k)\, X_d(k,n,\mathbf{d}_1) \tag{3a}$$
$$\phantom{Y_i(k,n)} = Y_{s,i}(k,n) + Y_{d,i}(k,n), \tag{3b}$$
where $i$ is the index of the output channel, and $G_i(k,n)$ and
$Q_i(k)$ are the application-dependent weights. It is important to
note that $G_i(k,n)$ depends on the DOA $\varphi(k,n)$ of the direct sound
or on the position $\mathbf{d}_{\mathrm{IPLS}}(k,n)$. To synthesize a desired output sig-
nal, two steps are required: 1) extract the direct and diffuse sound
components and estimate the parameters (i.e., DOAs or positions),
and 2) determine the weights $G_i(k,n)$ and $Q_i(k)$ using the esti-
mated parameters and application-specific requirements. The first
step is commonly referred to as the spatial analysis and is dis-
cussed next. In this article, the second step is referred to as the
application-specific synthesis.
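The weighted sum in (3a)/(3b) can be sketched for a single output channel. The particular weight choices below (a first-order, cardioid-like DOA-dependent gain for the direct sound and a fixed diffuse gain) are illustrative assumptions; in practice $G_i(k,n)$ and $Q_i(k)$ are set by the application, as the article describes.

```python
import numpy as np

def synthesize_channel(X_s_ref, X_d_ref, doa_rad, channel_doa_rad, q=0.5):
    """Y_i = G_i * X_s + Q_i * X_d at the reference microphone, as in (3a)."""
    # Hypothetical DOA-dependent direct-sound weight (first-order pattern).
    G_i = 0.5 * (1.0 + np.cos(doa_rad - channel_doa_rad))
    Q_i = q                                # constant here; frequency-dependent in general
    Y_s = G_i * X_s_ref                    # direct part  Y_{s,i} in (3b)
    Y_d = Q_i * X_d_ref                    # diffuse part Y_{d,i} in (3b)
    return Y_s + Y_d

# Direct sound arriving exactly from the channel's look direction
# passes with full gain; the diffuse sound is attenuated by q.
Y = synthesize_channel(X_s_ref=1.0 + 0.0j, X_d_ref=0.2 + 0.1j,
                       doa_rad=0.0, channel_doa_rad=0.0)
```

Because the weights multiply the already separated components, the same extracted direct and diffuse signals can drive any number of output channels with different $G_i$ and $Q_i$, which is what makes the two-step analysis/synthesis split flexible.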
SPATIAL ANALYSIS
To facilitate flexible sound field manipulation with high-quality
audio signals, it is crucial to accurately estimate the components
describing the sound field, specifically the direct and diffuse sound
components, as well as the DOAs or positions. Such spatial analysis
based on the microphone signals is depicted in Figure 3. The direct
and diffuse sound components can be estimated using single-
channel or multichannel filters. To compute these filters, we may
exploit knowledge about the DOA estimate of the direct sound or
compute additional parameters as discussed in the following.
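As a concrete illustration of the DOA-estimation block, the narrowband phase difference between two microphones can be inverted under the ULA model above. This is a simplified sketch, not the estimator used in the article: the spacing and frequency are assumed values, real estimators average over time and frequency, and the inversion is only unambiguous below the spatial aliasing frequency.

```python
import numpy as np

def estimate_doa(X1, X2, freq_hz, spacing, c=343.0):
    """Invert X2/X1 = exp(j * kappa * sin(phi) * d) for the DOA phi."""
    kappa = 2.0 * np.pi * freq_hz / c
    phase = np.angle(X2 * np.conj(X1))     # phase of the relative transfer fn
    return np.arcsin(np.clip(phase / (kappa * spacing), -1.0, 1.0))

# Simulate a plane wave at 25 degrees on two mics 3 cm apart and recover it.
true_doa = np.deg2rad(25.0)
kappa = 2.0 * np.pi * 1000.0 / 343.0
X1 = 1.0 + 0.0j
X2 = X1 * np.exp(1j * kappa * np.sin(true_doa) * 0.03)
doa_est = estimate_doa(X1, X2, freq_hz=1000.0, spacing=0.03)
```

The `np.clip` guards against magnitude ratios slightly outside $[-1, 1]$ caused by noise, which would otherwise make `arcsin` return NaN.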
[FIG2] A geometric sound field model: the direct sound emitted
by a point source arrives at the array with a certain DOA, and
the point-source position can be estimated when the DOA
estimates from at least two arrays are available.
[FIG3] A block diagram for spatial analysis.