IEEE SIGNAL PROCESSING MAGAZINE [50] MARCH 2015
we denote it for a band $i$ as $A_i(E_i, D_i)$.
Let us define the piecewise linear sigmoid
$$S(x; b_1, b_2) = \frac{\max(\min(x, b_2), b_1) - b_1}{b_2 - b_1},$$
which has a range $[0, 1]$. The band audibility function of the SII is
factorized into two factors: the first factor accounts for the instan-
taneous masking and the second factor accounts for high presen-
tation levels:
$$A_i(E_i, D_i) = S(E_i; D_i - 15, D_i + 15)\, S(-E_i; -U_i - 170, -U_i - 10), \tag{11}$$
where $U_i$ is the standard speech level at normal voicing effort
(provided in a table in the standard). The heuristic factor
$S(E_i; D_i - 15, D_i + 15)$ assumes that speech signals 15 dB below
the disturbance level are fully masked, and speech signals 15 dB
above the disturbance level are not masked, which leads to a curve
similar to the result derived in (9).
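As a minimal sketch of the definitions above, the sigmoid $S$ and the band audibility (11) can be written in a few lines of Python. The function names `sigmoid_pl` and `band_audibility` are illustrative choices, not names from the standard, and the value used for $U_i$ in the usage note is a placeholder rather than an entry from the standard's table.

```python
def sigmoid_pl(x, b1, b2):
    """Piecewise linear sigmoid S(x; b1, b2), clipped to the range [0, 1]."""
    return (max(min(x, b2), b1) - b1) / (b2 - b1)


def band_audibility(E, D, U):
    """Band audibility A_i(E_i, D_i) as in (11); all levels in dB.

    E: equivalent speech spectrum level of the band
    D: disturbance spectrum level of the band
    U: standard speech level at normal voicing effort (placeholder input here)
    """
    # First factor: instantaneous masking, a ramp from D - 15 dB to D + 15 dB.
    masking = sigmoid_pl(E, D - 15.0, D + 15.0)
    # Second factor: penalty at high presentation levels, equal to 1 up to
    # U + 10 dB and falling linearly to 0 at U + 170 dB.
    level = sigmoid_pl(-E, -U - 170.0, -U - 10.0)
    return masking * level
```

With $D_i = 10$ dB and an assumed $U_i = 30$ dB, `band_audibility(-5, 10, 30)` evaluates to 0 (fully masked), `band_audibility(10, 10, 30)` to 0.5, and `band_audibility(25, 10, 30)` to 1, while a very high level such as `band_audibility(120, 10, 30)` is reduced to 0.5 by the second factor.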
The SII is a refined and normalized version of (7) that accounts
for decreased intelligibility at high presentation levels:
$$\mathrm{SII} = \sum_i I_i A_i(E_i, D_i). \tag{12}$$
The band-importance function $I_i$ in the SII is specified by a table
that is based on fitting to a database. Figure 2 illustrates the com-
putation of the SII. The suppression of the audibility function at
high presentation levels is clearly shown in the panel showing the
audibility function (11).
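The weighted sum in (12) can be illustrated with a short Python sketch. The three-band setup and all numeric values below (band importances, levels, and $U_i$) are made-up placeholders chosen for easy arithmetic; they are not the tabulated values of the standard, and the function names are hypothetical.

```python
def sigmoid_pl(x, b1, b2):
    """Piecewise linear sigmoid S(x; b1, b2) with range [0, 1]."""
    return (max(min(x, b2), b1) - b1) / (b2 - b1)


def band_audibility(E, D, U):
    """Band audibility A_i(E_i, D_i) of (11); all levels in dB."""
    return (sigmoid_pl(E, D - 15.0, D + 15.0)
            * sigmoid_pl(-E, -U - 170.0, -U - 10.0))


def sii(E, D, U, I):
    """Importance-weighted sum of band audibilities, following (12)."""
    return sum(w * band_audibility(e, d, u) for e, d, u, w in zip(E, D, U, I))


# Placeholder example: three bands, importances summing to one.
E = [-5.0, 10.0, 25.0]   # speech spectrum levels (dB), illustrative
D = [10.0, 10.0, 10.0]   # disturbance spectrum levels (dB), illustrative
U = [30.0, 30.0, 30.0]   # assumed standard speech levels (dB), not from the table
I = [0.25, 0.5, 0.25]    # assumed band importances, not from the table

score = sii(E, D, U, I)  # 0.25 * 0 + 0.5 * 0.5 + 0.25 * 1 = 0.5
```

Because each band enters (12) only through its own term, changing one band's level affects only that band's contribution to the sum.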
The measure (12) can be used to optimize a modification oper-
ator that shapes the spectrum. As the intelligibility decreases both
at high and low presentation levels, the SII criterion can, in princi-
ple, be optimized without constraint. It is seen from (12) that if
there is no global constraint, each frequency band can be opti-
mized independently. The resulting solutions are not necessarily
unique because of the form of $S$. It is natural to select the
solution that has the lowest power but does not reduce the speech
power in any band. For low absolute noise levels, where the solu-
tion is not limited by the second factor in (11), the solution for the
gain is [4]
$$g_i = \max(D_i + 15, E_i) - E_i, \tag{13}$$
where the shaping gain $g_i$ for band $i$ is given in dB. In (13) the
original equivalent speech spectrum level is $E_i$ and the modified
speech has equivalent speech spectrum level $g_i + E_i$.
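The gain rule (13) is simple enough to check numerically. The following Python sketch assumes the low-noise regime described above, where the second factor in (11) is not active; the function name `shaping_gain_db` is an illustrative choice.

```python
def shaping_gain_db(E, D):
    """Per-band shaping gain g_i in dB, following (13).

    Raises the speech spectrum level E to D + 15 dB when it lies below that
    point and leaves it unchanged otherwise, so the gain is never negative.
    """
    return max(D + 15.0, E) - E


# Speech 10 dB below the disturbance needs a 25 dB boost to reach D + 15 dB.
gain_low = shaping_gain_db(0.0, 10.0)    # 25.0
# Speech already above D + 15 dB is left untouched.
gain_high = shaping_gain_db(40.0, 10.0)  # 0.0
```

After applying the gain, the modified level $g_i + E_i$ is at least $D_i + 15$ dB, which is the point where the masking factor of (11) reaches one.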
As was discussed in the section “Constraints on Optimization,”
it is common to constrain the overall loudspeaker signal power in
practical applications. The optimization of (12) subject to a power
constraint was studied in [4] and [8]. To facilitate analysis, the two
approaches use approximations of (12). Although the approxima-
tions are different, both neglect the second factor in (11) and start
from
$A_i(E_i, D_i) \approx S(E_i; D_i - 15, D_i + 15)$. Reference [4]
simplifies $A_i(E_i, D_i)$ further by removing the lower bound on the
sigmoid and writing $A_i(E_i, D_i) \approx \min(E_i - D_i + 15, 30)/30$.
Reference [8], on the other hand, makes the approximation
$A_i(E_i, D_i) \approx 10^{E_i/10} / (10^{E_i/10} + 10^{D_i/10})$, which is a
differentiable function. When the above expressions are written for the
modified speech (that is, with $E_i$ replaced by $g_i + E_i$), the
audibility-function approximations are concave functions of the (linear)
spectral gain $10^{g_i/10}$. Optimizing the approximations subject to
linear constraints on $10^{g_i/10}$ form
[FIG2] The computation of the SII. (Block diagram: filterbank, masking, band audibility $A_i(E_i, D_i)$, and band importance $I_i$ for bands $i = 1, \ldots, 21$, combined into the SII. Inset: $A_i(E_i, D_i)$ versus speech spectrum level $E_i$ (dB) for a disturbance spectrum level of, e.g., $D_i = 10$ dB, ranging from no intelligibility to full intelligibility, with regions labeled full masking, partial masking, no masking, and limitation by human ear.)