IEEE SIGNAL PROCESSING MAGAZINE [59] MARCH 2015
faster phase estimation from magnitude, modeling of the signal
phase, group delay and transient processing, and joint estimation
of phase and magnitude.
ITERATIVE ALGORITHMS FOR PHASE ESTIMATION
Among the first proposals for phase estimation are iterative
approaches, which aim at estimating a time-domain signal
whose STFT magnitude is as close as possible to a target one
[1], [8]. Indeed, if the STFT magnitudes
of two signals are close, the
signals will in general be perceptu-
ally close as well. Thus, finding a
signal whose STFT magnitude is
close to a target one is considered a
valid goal when looking to obtain a
signal that “sounds” like that target
magnitude. This motivated intense
research on algorithms to estimate
signals (or equivalently a corresponding phase) given target
magnitudes, with applications such as speech enhancement or
timescale modification. In the case of speech enhancement, the
magnitude is typically obtained through one of the many mag-
nitude estimation algorithms mentioned earlier, while some
estimate of the phase, such as that of the noisy mixture, may
further be exploited for initialization or as side information.
The most well known and fundamental of these approaches
is that of Griffin and Lim [1], which consists in applying STFT
synthesis and analysis iteratively while retaining information
about the updated phases and replacing the updated magni-
tudes by the given ones. This exploits correlations between
neighboring STFT frames to lead to an estimate of the spectral
phases and the time-domain signal.
Given a target magnitude spectrogram $A$, Griffin and Lim
formulated the problem as that of estimating a real-valued time-domain
signal $x$ such that the magnitude of its STFT $X$ is closest
to $A$ in the least-squares sense, i.e., estimating a signal $x$ which
minimizes the squared distance
$$d(x, A) = \sum_{k,\ell} \big|\, |X_{k,\ell}| - A_{k,\ell} \,\big|^2. \tag{2}$$
They proposed an iterative procedure which can be proven to minimize,
at least locally, this distance. Starting from an initial signal
estimate $x^{(0)}$ such as random noise, iterate the following computations:
compute the STFT $X^{(i)}$ of the signal estimate $x^{(i)}$ at step $i$;
compute the phase estimate $\phi^{(i)}$ as the phase of $X^{(i)}$,
$\phi^{(i)} = \angle X^{(i)}$; compute the signal estimate $x^{(i+1)}$ at step $i+1$ as
the iSTFT of $A e^{j\phi^{(i)}}$. Using the operator $G$ defined in (1), this can
be reformulated as
$$\phi^{(i+1)} = \angle G\big(A e^{j\phi^{(i)}}\big). \tag{3}$$
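The iteration described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the reference implementation of [1]: the frame length, hop size, and periodic Hann window are arbitrary choices, the helper names `stft`, `istft`, `distance`, and `griffin_lim` are hypothetical, and the test signal is synthetic. The inverse transform is the least-squares one (windowed overlap-add divided by the summed squared window), and the distance of (2) is accumulated over the full two-sided spectrum, so interior bins of the one-sided transform are counted twice.

```python
import numpy as np

N_FFT, HOP = 256, 64  # assumed frame length and hop (75% overlap); not from the article
WINDOW = np.hanning(N_FFT + 1)[:-1]  # periodic Hann window, used for analysis and synthesis


def stft(x):
    """STFT of a real signal; returns an array of shape (frames, bins)."""
    x = np.pad(x, N_FFT)  # zero-pad both ends so every sample is covered by several frames
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] for i in range(n_frames)])
    return np.fft.rfft(frames * WINDOW, axis=1)


def istft(X, n_samples):
    """Least-squares inverse STFT: windowed overlap-add over summed squared window."""
    frames = np.fft.irfft(X, n=N_FFT, axis=1) * WINDOW
    length = N_FFT + (len(X) - 1) * HOP
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * HOP:i * HOP + N_FFT] += frame
        norm[i * HOP:i * HOP + N_FFT] += WINDOW ** 2
    x /= np.maximum(norm, 1e-12)  # guard against division by zero at the padded edges
    return x[N_FFT:N_FFT + n_samples]  # drop the padding added in stft()


def distance(X, A):
    """Squared distance of (2) between |X| and the target magnitude A."""
    w = np.full(X.shape[1], 2.0)  # interior bins occur twice in the two-sided spectrum
    w[0] = w[-1] = 1.0            # DC and Nyquist bins occur once
    return float(np.sum(w * (np.abs(X) - A) ** 2))


def griffin_lim(A, n_samples, n_iter=50, seed=0):
    """Run n_iter iterations of (3); returns the signal and the distance at each step."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)        # x^(0): random noise
    dists = []
    for _ in range(n_iter):
        X = stft(x)                           # X^(i)
        dists.append(distance(X, A))          # d(x^(i), A)
        phi = np.angle(X)                     # phi^(i) = angle of X^(i)
        x = istft(A * np.exp(1j * phi), n_samples)  # x^(i+1)
    return x, dists


# Usage: reconstruct a signal from the STFT magnitude of a synthetic swept tone.
n = 4096  # multiple of HOP, so the frames tile the padded signal exactly
t = np.arange(n) / 8000.0
target = np.sin(2 * np.pi * (300.0 + 400.0 * t) * t)
A = np.abs(stft(target))
x_hat, dists = griffin_lim(A, n_samples=n)
print(f"d(x, A): {dists[0]:.3e} -> {dists[-1]:.3e}")
```

The list `dists` records the distance of (2) at each step, which makes it easy to observe how slowly the procedure converges when started from noise.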
This procedure can be proven to be nonincreasing as well for a
measure of inconsistency of the spectrogram $A e^{j\phi^{(i)}}$ defined
directly in the time-frequency domain:
$$\mathcal{I}(\phi) = \big\| G(A e^{j\phi}) - A e^{j\phi} \big\|_2^2. \tag{4}$$
Indeed, one can easily show that
$d(x^{(i+1)}, A) \le \mathcal{I}(\phi^{(i)}) \le d(x^{(i)}, A)$.
Interestingly, if only parts of the phase are updated according to (3),
the nonincreasing property still holds for $\mathcal{I}(\phi)$, but whether it still
does for $d(x, A)$ has not been established.
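The chain of inequalities can be reasoned as follows. This is a sketch that assumes $G$ is the orthogonal projection onto the set of consistent spectrograms (i.e., the iSTFT is the least-squares one) and writes $X^{(i+1)} = G(A e^{j\phi^{(i)}})$ for the STFT of $x^{(i+1)}$, with $\phi^{(i+1)} = \angle X^{(i+1)}$:

```latex
\begin{align*}
d(x^{(i+1)}, A)
  &= \big\| X^{(i+1)} - A e^{j\phi^{(i+1)}} \big\|_2^2
   \le \big\| X^{(i+1)} - A e^{j\phi^{(i)}} \big\|_2^2
   = \mathcal{I}(\phi^{(i)}), \\
\mathcal{I}(\phi^{(i)})
  &= \big\| G(A e^{j\phi^{(i)}}) - A e^{j\phi^{(i)}} \big\|_2^2
   \le \big\| X^{(i)} - A e^{j\phi^{(i)}} \big\|_2^2
   = d(x^{(i)}, A).
\end{align*}
```

The first inequality holds bin by bin: among all spectrograms with magnitude $A$, the one that shares the phase of $X^{(i+1)}$ is the closest to $X^{(i+1)}$. The second holds because $G(A e^{j\phi^{(i)}})$ is the consistent spectrogram closest to $A e^{j\phi^{(i)}}$, while $X^{(i)}$ is merely one consistent spectrogram.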
Due to the extreme simplicity of its implementation and to its
perceptually relatively good results, GL was used as the standard
benchmark and a starting point for multiple extensions in the
three decades that have followed, even after better and only mar-
ginally more involved algorithms
had been devised. Most of the algo-
rithms that have been developed
since attempted to fix GL’s issues, of
which there are several: first, conver-
gence typically requires many itera-
tions; second, GL does not provide a
good initial estimate, starting from
random phases with no considera-
tions for cross-frame dependencies;
third, the updates rely on computing STFTs, which are computa-
tionally costly even when implemented using fast Fourier trans-
forms (FFTs); fourth, the updates are typically performed on whole
frames, without emphasis on local regularities; and finally, the
original version of GL processes signals in batch mode.
On this last point, it is interesting to note that Griffin and Lim
did actually hint at how to modify their algorithm to use it for
online applications. They described briefly in [1] and with more
details in [16] how to sequentially update the phase using “cas-
caded processors” that each take care of one iteration; their partic-
ular proposal however still incurs an algorithmic delay of
I times
the window length if performing I iterations. In [16], Griffin also
presented several methods that he referred to as “sequential esti-
mation methods”: these only incur a single frame delay and could
thus be used for online application, the best performing one being
reported as on par with batch GL.
While one can already see in Griffin’s account [16] several ele-
ments to modify GL into an algorithm that can lead to high qual-
ity reconstruction in a real-time setting, such as sliding-block
analysis across the signal and the use of windows that compensate
for partially reconstructed frames, these ideas seem to have gone
largely unnoticed and it is not until much later that they were
incorporated into more refined methods. Beauregard, Zhu, and
Wyse successively proposed two algorithms for real-time signal
reconstruction from STFT magnitude, the real-time iterative spec-
trogram inversion (RTISI) algorithm and RTISI with look ahead
(RTISI-LA) [17]. RTISI aims at improving the original batch GL in
two respects: allowing for online implementation, and generating
better initial phase estimates. The algorithm considers the frames
sequentially in order, and at frame $\ell$, it only uses information
from the current frame’s magnitude and the previous overlapping
frames. The initial phase estimate $\phi^{(0)}_{\ell}$ for frame $\ell$ is obtained as
the phase of the partial reconstruction from the previous frames,
windowed by an analysis window, which already ensures some
consistency between the phases of the current and previous
frames. An iterative procedure similar to GL is then applied, lim-
ited to the current frame’s phase: at each iteration, frame
$\ell$’s