IEEE SIGNAL PROCESSING MAGAZINE [50] MARCH 2015

we denote it for a band $i$ as $A(E_i, D_i)$.

Let us define the piecewise linear sigmoid $S(x; b_1, b_2) = (\max(\min(x, b_2), b_1) - b_1)/(b_2 - b_1)$, which has a range $[0, 1]$. The band audibility function of the SII is factorized into two factors: the first factor accounts for the instantaneous masking and the second factor accounts for high presentation levels:

$$A(E_i, D_i) = S(E_i; D_i - 15, D_i + 15)\, S(-E_i; -U_i - 170, -U_i - 10), \tag{11}$$

where $U_i$ is the standard speech level at normal voicing effort (provided in a table in the standard). The heuristic factor $S(E_i; D_i - 15, D_i + 15)$ assumes that speech signals 15 dB below the disturbance level are fully masked, and speech signals 15 dB above the disturbance level are not masked, which leads to a curve similar to the result derived in (9).
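As a minimal sketch of the sigmoid and of the band audibility function (11), the following Python evaluates both factors for a single band. The function names are illustrative, and the value used for $U_i$ is a placeholder assumption, not an entry from the standard's table.

```python
def sigmoid(x, b1, b2):
    """Piecewise-linear sigmoid S(x; b1, b2), clipped to the range [0, 1]."""
    return (max(min(x, b2), b1) - b1) / (b2 - b1)


def band_audibility(E, D, U):
    """Band audibility A(E_i, D_i) of (11) for one band; all levels in dB."""
    masking = sigmoid(E, D - 15.0, D + 15.0)        # first factor: instantaneous masking
    level = sigmoid(-E, -U - 170.0, -U - 10.0)      # second factor: high presentation levels
    return masking * level


U = 30.0  # hypothetical standard speech level (dB), NOT taken from the standard
D = 10.0  # disturbance spectrum level (dB)
for E in (-10.0, 10.0, 25.0, 120.0):
    print(E, band_audibility(E, D, U))
```

With these placeholder values, a speech level 15 dB or more below the disturbance gives zero audibility, a level equal to the disturbance gives one half, and a very high level is attenuated by the second factor.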

The SII is a refined and normalized version of (7) that accounts for decreased intelligibility at high presentation levels

$$\mathrm{SII} = \sum_i I_i\, A(E_i, D_i). \tag{12}$$

The band-importance function $I_i$ in the SII is specified by a table that is based on fitting to a database. Figure 2 illustrates the computation of the SII. The suppression of the audibility function at high presentation levels is clearly shown in the panel showing the audibility function (11).
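The weighted sum in (12) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the three bands, the importances `I`, and the standard speech levels `U` are hypothetical placeholders, since the actual values are tabulated in the standard and are not reproduced here.

```python
def sigmoid(x, b1, b2):
    """Piecewise-linear sigmoid S(x; b1, b2), clipped to the range [0, 1]."""
    return (max(min(x, b2), b1) - b1) / (b2 - b1)


def sii(E, D, U, I):
    """Evaluate (12): sum over bands of I_i * A(E_i, D_i), with A from (11)."""
    total = 0.0
    for E_i, D_i, U_i, I_i in zip(E, D, U, I):
        masking = sigmoid(E_i, D_i - 15.0, D_i + 15.0)
        level = sigmoid(-E_i, -U_i - 170.0, -U_i - 10.0)
        total += I_i * masking * level
    return total


# Hypothetical three-band setup; the real I_i and U_i come from the standard's tables.
I = [0.25, 0.5, 0.25]    # placeholder importances, chosen here to sum to 1
U = [30.0, 30.0, 30.0]   # placeholder standard speech levels (dB)
D = [10.0, 10.0, 10.0]   # disturbance spectrum levels (dB)

moderate = sii([25.0, 10.0, -5.0], D, U, I)     # unmasked, half masked, fully masked band
too_loud = sii([200.0, 200.0, 200.0], D, U, I)  # second factor of (11) drops to zero
print(moderate, too_loud)
```

The second call illustrates the suppression at high presentation levels: every band is far above the disturbance, yet the second factor in (11) drives the index to zero.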

The measure (12) can be used to optimize a modification operator that shapes the spectrum. As the intelligibility decreases both at high and low presentation levels, the SII criterion can, in principle, be optimized without constraint. It is seen from (12) that if there is no global constraint, each frequency band can be optimized independently. The resulting solutions are not necessarily unique because of the form of $S$. It is natural to select the solution that has the lowest power but does not reduce the speech power in any band. For low absolute noise levels, where the solution is not limited by the second factor in (11), the solution for the gain is [4]

$$g_i = \max(D_i + 15, E_i) - E_i, \tag{13}$$

where the shaping gain $g_i$ for band $i$ is given in dB. In (13) the original equivalent speech spectrum level is $E_i$ and the modified speech has equivalent speech spectrum level $g_i + E_i$.
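The per-band gain rule (13) can be illustrated with a short Python sketch. The band levels below are made-up example values and the function name is hypothetical; the sketch only covers the low-noise case in which the second factor of (11) is inactive.

```python
def shaping_gain_db(E, D):
    """Gain (13) in dB per band: raise each band to D_i + 15 dB, never attenuate."""
    return [max(D_i + 15.0, E_i) - E_i for E_i, D_i in zip(E, D)]


# Made-up example levels in dB for four bands.
E = [25.0, 10.0, -5.0, 40.0]   # original equivalent speech spectrum levels
D = [10.0, 10.0, 10.0, 10.0]   # disturbance spectrum levels

g = shaping_gain_db(E, D)
modified = [E_i + g_i for E_i, g_i in zip(E, g)]  # levels g_i + E_i after shaping
print(g)         # gains in dB
print(modified)  # every band now sits at least 15 dB above the disturbance
```

Bands that already lie 15 dB or more above the disturbance receive a gain of 0 dB, which matches the requirement that the speech power is not reduced in any band.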

As was discussed in the section “Constraints on Optimization,” it is common to constrain the overall loudspeaker signal power in practical applications. The optimization of (12) subject to a power constraint was studied in [4] and [8]. To facilitate analysis, the two approaches use approximations of (12). Although the approximations are different, both neglect the second factor in (11) and start from $A(E_i, D_i) \approx S(E_i; D_i - 15, D_i + 15)$. Reference [4] simplifies $A(E_i, D_i)$ further by removing the lower bound on the sigmoid and writing $A(E_i, D_i) \approx (1/30) \min(E_i - D_i + 15, 30)$. Reference [8], on the other hand, makes the approximation $A(E_i, D_i) \approx 10^{E_i/10} / (10^{E_i/10} + 10^{D_i/10})$, which is a differentiable function. When writing the above expressions for the modified speech, the audibility-function approximations are concave functions of the (linear) spectral gain $10^{g_i/10}$. Optimizing the approximations subject to linear constraints on $10^{g_i/10}$ form

[FIG2] The computation of the SII. [Figure not reproduced. Block diagram labels: Filterbank, Power, Masking, SII, with bands indexed 1, 2, ..., 21. Inset plot: audibility function versus speech spectrum level $E_i$ (dB), axis from −20 to 80, for a disturbance spectrum level of, e.g., $D_i = 10$ dB; annotated regions: full masking, partial masking, full intelligibility, and limitation by human ear.]
