
IEEE SIGNAL PROCESSING MAGAZINE [36] MARCH 2015

the microphone spacing and frequency [22]. Therefore, $\boldsymbol{\Phi}_\mathrm{d}(k,n)$ in (7) can be computed with (8b) when the diffuse sound power $\phi_\mathrm{d}(k,n)$ is known. The PSD matrix of the noise $\boldsymbol{\Phi}_\mathrm{n}(k)$ in (7) is commonly estimated during silence, i.e., when the sources are inactive, assuming that the noise is stationary. The estimation of $\phi_\mathrm{d}(k,n)$ and $\boldsymbol{\Phi}_\mathrm{n}(k)$ is explained in more detail in the next section. Note that the filter $\mathbf{w}_\mathrm{s}(k,n)$ is recomputed for each time-frequency bin with the geometric parameters estimated for that bin. The solution is computationally feasible since there exists a closed-form solution to the optimization problem in (7) [21].
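As a rough illustration of the two PSD matrices discussed above, the following Python sketch estimates a noise PSD matrix for one frequency bin by averaging outer products of noise-only frames, and forms a diffuse PSD matrix by scaling a spatial coherence matrix with the diffuse power. Equation (8b) is not reproduced in this section, so the sinc-shaped coherence of a spherically isotropic field is an assumption here; the function names, the array layout, the microphone positions, and the speed of sound are likewise illustrative.

```python
import numpy as np

def estimate_noise_psd(noise_frames):
    """Average x x^H over frames recorded while the sources are inactive.

    noise_frames: complex array of shape (num_frames, num_mics) holding the
    microphone signals of one frequency bin (hypothetical layout).
    """
    x = np.asarray(noise_frames)
    # Entry (i, j) is the frame average of x_i * conj(x_j).
    return x.T @ x.conj() / x.shape[0]

def diffuse_coherence(mic_positions, freq_hz, c=343.0):
    """Spatial coherence of a spherically isotropic field (assumed model)."""
    pos = np.asarray(mic_positions, dtype=float)
    # Pairwise microphone distances in metres.
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this is sin(kr) / (kr) with k = 2 pi f / c.
    return np.sinc(2.0 * freq_hz * dist / c)

def diffuse_psd(diffuse_power, coherence):
    """Diffuse PSD matrix as diffuse power times coherence matrix (assumed form)."""
    return diffuse_power * coherence

# Synthetic example: unit-variance, spatially white, stationary noise.
rng = np.random.default_rng(0)
num_frames, num_mics = 2000, 4
noise = np.sqrt(0.5) * (rng.standard_normal((num_frames, num_mics))
                        + 1j * rng.standard_normal((num_frames, num_mics)))
phi_n = estimate_noise_psd(noise)

# Four microphones on a line with 3 cm spacing (illustrative geometry).
mics = np.stack([0.03 * np.arange(num_mics),
                 np.zeros(num_mics),
                 np.zeros(num_mics)], axis=1)
gamma_d = diffuse_coherence(mics, freq_hz=1000.0)
phi_d = diffuse_psd(0.2, gamma_d)
```

Averaging over many frames is only meaningful under the stationarity assumption stated above; for slowly varying noise, a recursive average would be a common alternative.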

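To estimate the diffuse sound $X_\mathrm{d}(k,n,\mathbf{d})$, a multichannel filter that suppresses the direct sound and minimizes the noise while capturing the diffuse sound can be applied. Such a filter can be obtained by solving

$$\mathbf{w}_\mathrm{d}(k,n) = \arg\min_{\mathbf{w}} \; \mathbf{w}^\mathrm{H} \boldsymbol{\Phi}_\mathrm{n}(k)\, \mathbf{w} \quad \text{subject to} \quad \mathbf{w}^\mathrm{H} \mathbf{g}(k,\theta) = 0 \;\; \text{and} \;\; \mathbf{w}^\mathrm{H} \mathbf{a}(k,n) = 1. \qquad (9)$$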

The first linear constraint ensures that the direct sound is strongly suppressed by the filter. The second linear constraint ensures that we capture the diffuse sound as desired. Note that there exist different definitions for the vector $\mathbf{a}(k,n)$. In [23], $\mathbf{a}(k,n)$ corresponds to the propagation vector of a notional plane wave arriving from a direction $\theta_0(k,n)$, which is far away from the DOA $\theta(k,n)$ of the direct sound. With this definition, $\mathbf{w}_\mathrm{d}(k,n)$ represents a multichannel filter that captures the diffuse sound mainly from direction $\theta_0(k,n)$, while attenuating the direct sound from direction $\theta(k,n)$. In [24], $\mathbf{a}(k,n)$ corresponds to the mean relative transfer function of the diffuse sound between the array microphones. With this approach, $\mathbf{w}_\mathrm{d}(k,n)$ represents a multichannel filter that captures the diffuse sound from all directions except for the direction $\theta(k,n)$ from which the direct sound arrives. Note that the optimization problem (9) has a closed-form solution [21], which can be computed when the DOA $\theta(k,n)$ of the direct sound is known.
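The closed-form solution referred to above is not reproduced in this section. A minimal sketch, assuming the standard LCMV solution $\mathbf{w} = \boldsymbol{\Phi}^{-1}\mathbf{C}\,(\mathbf{C}^\mathrm{H}\boldsymbol{\Phi}^{-1}\mathbf{C})^{-1}\mathbf{f}$ with constraint matrix $\mathbf{C} = [\mathbf{g}, \mathbf{a}]$ and response vector $\mathbf{f} = [0, 1]^\mathrm{T}$, is given below in Python. The uniform linear array, the far-field steering vector, the spatially white noise, the notional direction (called `theta_0` here, following the plane-wave definition attributed to [23]), and all function names are illustrative assumptions rather than the setup used by the authors.

```python
import numpy as np

def steering_vector(num_mics, spacing_m, freq_hz, doa_rad, c=343.0):
    """Far-field plane-wave propagation vector of a uniform linear array,
    relative to the first microphone (assumed geometry)."""
    delays = np.arange(num_mics) * spacing_m * np.cos(doa_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def lcmv_diffuse_filter(phi_n, g, a):
    """Minimize w^H phi_n w subject to w^H g = 0 and w^H a = 1."""
    C = np.column_stack([g, a])   # constraint matrix [g, a]
    f = np.array([0.0, 1.0])      # desired responses: null on g, unit gain on a
    phi_inv_C = np.linalg.solve(phi_n, C)
    # w = phi_n^{-1} C (C^H phi_n^{-1} C)^{-1} f
    return phi_inv_C @ np.linalg.solve(C.conj().T @ phi_inv_C, f)

# One time-frequency bin with illustrative parameters.
num_mics, spacing, freq = 4, 0.03, 2000.0
theta = np.deg2rad(60.0)      # DOA of the direct sound estimated for this bin
theta_0 = np.deg2rad(150.0)   # notional direction far from the DOA (assumption)

g = steering_vector(num_mics, spacing, freq, theta)
a = steering_vector(num_mics, spacing, freq, theta_0)
phi_n = 0.01 * np.eye(num_mics)   # spatially white noise PSD matrix (assumption)

w_d = lcmv_diffuse_filter(phi_n, g, a)

# np.vdot conjugates its first argument, so these are w^H g and w^H a.
print(abs(np.vdot(w_d, g)))
print(abs(np.vdot(w_d, a)))
```

Up to rounding error, the two printed magnitudes reflect the constraints in (9): a null toward the direct sound and unit gain for the vector `a`. In a complete system the function would be called once per time-frequency bin with the DOA estimated for that bin, and `a` could instead be taken as the mean relative transfer function of the diffuse sound, as in [24].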

Figure 4(c) and (e) depict the spectrograms of the direct sound and diffuse sound that were extracted using the multichannel LCMV filters for the example scenario consisting of noise, castanets, and speech. As can be observed, the direct sound extracted using the multichannel filter is less noisy and contains less diffuse sound compared to the direct sound extracted using the single-channel filter. Moreover, the diffuse sound extracted using the multichannel filter contains no onsets of the direct sound (clearly visible for the onsets of the castanets in time frames 75–150) and has a significantly reduced noise level. As expected, the multichannel filters provide a more accurate decomposition of the sound field into a direct and a diffuse signal component. The estimation accuracy strongly influences the performance of the discussed parametric processing approaches.

[FIG4] Spectrograms of (a) the input signal, (b) the direct signal estimated using a single-channel filter, (c) the direct signal estimated using a multichannel filter, (d) the diffuse signal estimated using a single-channel filter, and (e) the diffuse signal estimated using a multichannel filter.

[Figure 4 panels: (a) Input Signal; (b) Direct Sound, Single-Channel Extraction; (c) Direct Sound, Multichannel Extraction; (d) Diffuse Sound, Single-Channel Extraction; (e) Diffuse Sound, Multichannel Extraction. Axes: Time Frame Index (horizontal), Frequency [kHz] (vertical); color scale from −60 to −10.]
