
IEEE SIGNAL PROCESSING MAGAZINE [34] MARCH 2015
spherical wave is emitted by an isotropic point-like source (IPLS)
located at a time-frequency-dependent position $\mathbf{d}_{\mathrm{IPLS}}(k,n)$. The
magnitude of the pressure of the spherical wave is inversely pro-
portional to the distance traveled, which is known in physics as
the inverse distance law. The diffuse sound is assumed to be spa-
tially isotropic and homogeneous, which means that diffuse sound
arrives from all directions with equal power and that its power is
position independent. Finally, it is assumed that the direct sound
and diffuse sound are uncorrelated.
The direct and diffuse sounds are captured with one or
more microphone arrays (depending on the application) that
are located in the far field of the sound sources. Therefore, at
the microphone array(s), the spherical wave can be approxi-
mated by a plane wave arriving from direction $\varphi(k,n)$. In the
following, we will differentiate between two related geometrical
models: the DOA-based model and the position-based model. In
the DOA-based model, the DOA and direct sound are estimated
with a single microphone array, while in the position-based
model, the position of the IPLS is estimated using at least two
spatially distributed arrays, and the sound is captured using
one or more microphones.
Under the aforementioned assumptions, the signals received
at the omnidirectional microphones of an M-element micro-
phone array can be written as
$$\mathbf{x}(k,n) = \mathbf{x}_s(k,n) + \mathbf{x}_d(k,n) + \mathbf{x}_n(k,n), \tag{1}$$
where the vector $\mathbf{x}(k,n) = [X(k,n,\mathbf{d}_1), \ldots, X(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ contains the $M$ microphone signals in the time-frequency domain, and $\mathbf{d}_1, \ldots, \mathbf{d}_M$ are the microphone positions. Without loss of generality, the first microphone, located at $\mathbf{d}_1$, is used as a reference microphone. The vector $\mathbf{x}_s(k,n) = [X_s(k,n,\mathbf{d}_1), \ldots, X_s(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ is
the captured direct sound at the different microphones, and
$\mathbf{x}_d(k,n) = [X_d(k,n,\mathbf{d}_1), \ldots, X_d(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ is the captured diffuse
sound. Furthermore, $\mathbf{x}_n(k,n)$ contains the slowly time-varying
noise signals (for example, the microphone self-noise). The direct
sound at the different microphones can be related to the direct
sound at the reference microphone via the array propagation vector
$\mathbf{g}(k,\varphi)$, which can be expressed as
$$\mathbf{x}_s(k,n) = \mathbf{g}(k,\varphi)\, X_s(k,n,\mathbf{d}_1). \tag{2}$$
The $m$th element of the array propagation vector
$\mathbf{g}(k,\varphi) = [g(k,n,\mathbf{d}_1), \ldots, g(k,n,\mathbf{d}_M)]^{\mathrm{T}}$ is the relative transfer
function of the direct sound from the $m$th to the first micro-
phone, which depends on the DOA $\varphi(k,n)$ of the direct sound
from the point of view of the array. For instance, for a uniform
linear array of omnidirectional microphones,
$g(k,n,\mathbf{d}_m) = \exp\{\, j\kappa \sin\varphi \, \|\mathbf{d}_m - \mathbf{d}_1\| \,\},$
where $j$ denotes the imaginary unit, $\kappa$ is the wavenumber, and $\|\mathbf{d}_m - \mathbf{d}_1\|$ is the distance between
positions $\mathbf{d}_m$ and $\mathbf{d}_1$.
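The propagation-vector model above can be sketched numerically. The following snippet builds the uniform-linear-array relative transfer functions $g_m = \exp\{j\kappa\sin\varphi\,\|\mathbf{d}_m-\mathbf{d}_1\|\}$ and applies (2) to obtain the direct sound at all microphones; the array geometry (four microphones, 3 cm spacing), the frequency, the DOA, and the reference-signal value are illustrative assumptions, not values from the article.

```python
import numpy as np

def propagation_vector(freq_hz, doa_rad, mic_positions, c=343.0):
    """Relative transfer functions g(k, phi) for omnidirectional mics on a
    uniform linear array: g_m = exp(j * kappa * sin(phi) * ||d_m - d_1||)."""
    kappa = 2.0 * np.pi * freq_hz / c                       # wavenumber
    distances = np.linalg.norm(mic_positions - mic_positions[0], axis=1)
    return np.exp(1j * kappa * np.sin(doa_rad) * distances)

# Hypothetical setup: four mics spaced 3 cm apart along the x-axis.
mics = np.array([[0.03 * m, 0.0, 0.0] for m in range(4)])
g = propagation_vector(freq_hz=1000.0, doa_rad=np.deg2rad(30.0),
                       mic_positions=mics)

# Direct sound at all microphones from the reference signal, as in (2).
X_s_ref = 1.0 + 0.5j            # assumed direct sound at the reference mic
x_s = g * X_s_ref               # vector of direct-sound components
```

Note that the first element of `g` is 1 by construction, since the reference microphone is related to itself by a unit transfer function, and all elements have unit magnitude because the model only encodes phase shifts.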
In this article, we will demonstrate how this geometric model
can be effectively utilized to support a number of assisted listening
applications. In the considered applications, the desired output
signal of a loudspeaker (or headphone) channel $Y_i(k,n)$ is given
as a weighted sum of the direct and diffuse sound at the reference
microphone, i.e.,
$$Y_i(k,n) = G_i(k,n)\, X_s(k,n,\mathbf{d}_1) + Q_i(k)\, X_d(k,n,\mathbf{d}_1) \tag{3a}$$
$$\phantom{Y_i(k,n)} = Y_{s,i}(k,n) + Y_{d,i}(k,n), \tag{3b}$$
where $i$ is the index of the output channel, and $G_i(k,n)$ and
$Q_i(k)$ are the application-dependent weights. It is important to
note that $G_i(k,n)$ depends on the DOA $\varphi(k,n)$ of the direct sound
or on the position $\mathbf{d}_{\mathrm{IPLS}}(k,n)$. To synthesize a desired output sig-
nal, two steps are required: 1) extract the direct and diffuse sound
components and estimate the parameters (i.e., DOAs or positions),
and 2) determine the weights $G_i(k,n)$ and $Q_i(k)$ using the esti-
mated parameters and application-specific requirements. The first
step is commonly referred to as the spatial analysis and is dis-
cussed next. In this article, the second step is referred to as the
application-specific synthesis.
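The weighted sum in (3a)/(3b) can be sketched for a single output channel. The particular weight choices below (a first-order, cardioid-like DOA-dependent gain for the direct sound and a fixed diffuse gain) are illustrative assumptions; in practice $G_i(k,n)$ and $Q_i(k)$ are set by the application, as the article describes.

```python
import numpy as np

def synthesize_channel(X_s_ref, X_d_ref, doa_rad, channel_doa_rad, q=0.5):
    """Y_i = G_i * X_s + Q_i * X_d at the reference microphone, as in (3a)."""
    # Hypothetical DOA-dependent direct-sound weight (first-order pattern).
    G_i = 0.5 * (1.0 + np.cos(doa_rad - channel_doa_rad))
    Q_i = q                                # constant here; frequency-dependent in general
    Y_s = G_i * X_s_ref                    # direct part  Y_{s,i} in (3b)
    Y_d = Q_i * X_d_ref                    # diffuse part Y_{d,i} in (3b)
    return Y_s + Y_d

# Direct sound arriving exactly from the channel's look direction
# passes with full gain; the diffuse sound is attenuated by q.
Y = synthesize_channel(X_s_ref=1.0 + 0.0j, X_d_ref=0.2 + 0.1j,
                       doa_rad=0.0, channel_doa_rad=0.0)
```

Because the weights multiply the already separated components, the same extracted direct and diffuse signals can drive any number of output channels with different $G_i$ and $Q_i$, which is what makes the two-step analysis/synthesis split flexible.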
SPATIAL ANALYSIS
To facilitate flexible sound field manipulation with high-quality
audio signals, it is crucial to accurately estimate the components
describing the sound field, specifically the direct and diffuse sound
components, as well as the DOAs or positions. Such spatial analysis
based on the microphone signals is depicted in Figure 3. The direct
and diffuse sound components can be estimated using single-
channel or multichannel filters. To compute these filters, we may
exploit knowledge about the DOA estimate of the direct sound or
compute additional parameters as discussed in the following.
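As a concrete illustration of the DOA-estimation block, the narrowband phase difference between two microphones can be inverted under the ULA model above. This is a simplified sketch, not the estimator used in the article: the spacing and frequency are assumed values, real estimators average over time and frequency, and the inversion is only unambiguous below the spatial aliasing frequency.

```python
import numpy as np

def estimate_doa(X1, X2, freq_hz, spacing, c=343.0):
    """Invert X2/X1 = exp(j * kappa * sin(phi) * d) for the DOA phi."""
    kappa = 2.0 * np.pi * freq_hz / c
    phase = np.angle(X2 * np.conj(X1))     # phase of the relative transfer fn
    return np.arcsin(np.clip(phase / (kappa * spacing), -1.0, 1.0))

# Simulate a plane wave at 25 degrees on two mics 3 cm apart and recover it.
true_doa = np.deg2rad(25.0)
kappa = 2.0 * np.pi * 1000.0 / 343.0
X1 = 1.0 + 0.0j
X2 = X1 * np.exp(1j * kappa * np.sin(true_doa) * 0.03)
doa_est = estimate_doa(X1, X2, freq_hz=1000.0, spacing=0.03)
```

The `np.clip` guards against magnitude ratios slightly outside $[-1, 1]$ caused by noise, which would otherwise make `arcsin` return NaN.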
[FIG2] A geometric sound field model: the direct sound emitted
by a point source arrives at the array with a certain DOA, and
the point-source position can be estimated when the DOA
estimates from at least two arrays are available.
[FIG3] A block diagram for spatial analysis.