IEEE SIGNAL PROCESSING MAGAZINE [150] MARCH 2015
Cramér–Rao induced bounds for the assessment of CPD performance were derived in [52] and [53].
Since the computation of CPD is intrinsically multilinear, we can arrive at the solution through a sequence of linear subproblems as in the alternating least squares (ALS) framework, whereby the least squares (LS) cost function is optimized for one component matrix at a time, while keeping the other component matrices fixed [6]. As seen from (5), such a conditional update scheme boils down to solving overdetermined sets of linear equations.
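The conditional update scheme can be sketched in a few lines of NumPy (an illustrative toy implementation, not the article's code): each sweep solves one linear LS subproblem per factor matrix, pairing a mode unfolding of the tensor with the Khatri–Rao product of the two fixed factors.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product of B (J x R) and C (K x R)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def cpd_als(T, R, n_iter=500, seed=0):
    """Fit a rank-R CPD of a third-order tensor T (I x J x K) by ALS.

    Each sweep solves three linear least-squares subproblems, one per
    factor matrix, with the other two factors held fixed.
    """
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, R))
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    T1 = T.reshape(I, J * K)                     # mode-1 unfolding
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)  # mode-2 unfolding
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        A = T1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = T2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = T3 @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C
```

In practice, one would add a stopping test on the relative fit and a column normalization between sweeps to avoid scaling drift.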
While the ALS is attractive for its simplicity and satisfactory performance for a few well-separated components and at sufficiently high signal-to-noise ratio (SNR), it also inherits the problems of alternating algorithms and is not guaranteed to converge to a stationary point. This can be rectified by only updating the factor matrix for which the cost function has most decreased at a given step [54], but this results in an N-times increase in computational cost per iteration. The convergence of ALS is not yet completely understood: it is quasilinear close to the stationary point [55], while it becomes rather slow for ill-conditioned cases; for more details, we refer to [56] and [57].
The conventional all-at-once algorithms for numerical optimization, such as nonlinear conjugate gradients, quasi-Newton, or nonlinear least squares (NLS) [58], [59], have been shown to often outperform ALS for ill-conditioned cases and to be typically more robust to overfactoring. However, these come at the cost of a much higher computational load per iteration. More sophisticated versions use the rank-1 structure of the terms within CPD to perform efficient computation and storage of the Jacobian and (approximate) Hessian; their complexity is on par with ALS while, for ill-conditioned cases, the performance is often superior [60], [61].
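An all-at-once NLS fit can be sketched as follows (an illustrative toy, using SciPy's generic Levenberg–Marquardt solver rather than the structure-exploiting algorithms of [60] and [61]): all three factor matrices are stacked into a single parameter vector and updated simultaneously from the full residual.

```python
import numpy as np
from scipy.optimize import least_squares

def cpd_nls(T, R, seed=0):
    """Fit a rank-R CPD of a third-order tensor T (I x J x K) by
    all-at-once nonlinear least squares on the stacked residual
    vec(T - [[A, B, C]])."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(R * (I + J + K))  # random initial factors

    def unpack(x):
        A = x[:I * R].reshape(I, R)
        B = x[I * R:I * R + J * R].reshape(J, R)
        C = x[I * R + J * R:].reshape(K, R)
        return A, B, C

    def residual(x):
        A, B, C = unpack(x)
        return (np.einsum('ir,jr,kr->ijk', A, B, C) - T).ravel()

    sol = least_squares(residual, x0, method='lm')  # Levenberg-Marquardt
    return unpack(sol.x)
```

Here the Jacobian is approximated by finite differences; the efficient algorithms cited above instead exploit the rank-1 structure of the terms to form Jacobian and Gramian blocks cheaply.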
An important difference between matrices and tensors is that the existence of a best rank-R approximation of a tensor of rank greater than R is not guaranteed [22], [62] since the set of tensors whose rank is at most R is not closed. As a result, the cost functions for computing factor matrices may only have an infimum (instead of a minimum) so that their minimization will approach the boundary of that set without ever reaching the boundary point. This will cause two or more rank-1 terms to go to infinity upon convergence of an algorithm; however, numerically, the diverging terms will almost completely cancel one another while the overall cost function will still decrease along the iterations [63]. These diverging terms indicate an inappropriate data model: the mismatch between the CPD and the original data tensor may arise because of an underestimated number of components, not all tensor components having a rank-1 structure, or data being too noisy.
CONSTRAINTS
As mentioned earlier, under quite mild conditions, the CPD is unique by itself, without requiring additional constraints. However, to enhance the accuracy and robustness with respect to noise, prior knowledge of data properties (e.g., statistical independence, sparsity) may be incorporated into the constraints on factors so as to facilitate their physical interpretation, relax the uniqueness conditions, and even simplify computation [64]–[66]. Moreover, the orthogonality and nonnegativity constraints ensure the existence of the minimum of the optimization criterion used [63], [64], [67].
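To illustrate how such a constraint enters the computation, one ALS subproblem can be replaced by a nonnegativity-constrained LS solve (a toy sketch using SciPy's active-set NNLS solver, not one of the cited algorithms):

```python
import numpy as np
from scipy.optimize import nnls

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product of B (J x R) and C (K x R)."""
    return (B[:, None, :] * C[None, :, :]).reshape(-1, B.shape[1])

def nonneg_factor_update(T1, B, C):
    """Nonnegativity-constrained ALS update of the first factor:
    min ||T1 - A @ khatri_rao(B, C).T||_F subject to A >= 0,
    solved row by row with the active-set NNLS solver.
    T1 is the mode-1 unfolding (I x JK) of the tensor."""
    M = khatri_rao(B, C)  # (J*K) x R coefficient matrix
    return np.array([nnls(M, row)[0] for row in T1])
```

Cycling such updates over all three factors gives a basic nonnegative CPD; because the feasible set is closed, the constrained criterion attains its minimum, in line with the existence result above.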
APPLICATIONS
The CPD has already been established as an advanced tool for signal separation in vastly diverse branches of signal processing and data analysis, such as in audio and speech processing, biomedical engineering, chemometrics, and machine learning [7], [24], [25], [28]. Note that algebraic ICA algorithms are effectively based on the CPD of a tensor of the statistics of recordings; the statistical independence of the sources is reflected in the diagonality of the core tensor in Figure 3, i.e., in vanishing cross-statistics [11], [12]. The CPD is also heavily used in exploratory data analysis, where the rank-1 terms capture the essential properties of dynamically complex signals [8]. Another example is in wireless communication, where the signals transmitted by different users correspond to rank-1 terms in the case of line-of-sight propagation [19]. Also, in harmonic retrieval and direction of arrival type applications, real or complex exponentials have a rank-1 structure, for which the use of CPD is natural [36], [65].
EXAMPLE 1
Consider a sensor array consisting of K displaced but otherwise identical subarrays of I sensors, with $\tilde{I} = KI$ sensors in total. For R narrowband sources in the far field, the baseband equivalent model of the array output becomes $\mathbf{X} = \mathbf{A}\mathbf{S}^T + \mathbf{E}$, where $\mathbf{A} \in \mathbb{C}^{\tilde{I} \times R}$ is the global array response, $\mathbf{S} \in \mathbb{C}^{J \times R}$ contains J snapshots of the sources, and $\mathbf{E}$ is the noise. A single source $(R = 1)$ can be obtained from the best rank-1 approximation of the matrix $\mathbf{X}$; however, for $R > 1$, the decomposition of $\mathbf{X}$ is not unique, and, hence, the separation of sources is not possible without incorporating additional information. The constraints on the sources that may yield a unique solution are, for instance, constant modulus and statistical independence [12], [68].
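The single-source case can be sketched directly with an SVD (a toy simulation with made-up dimensions and noise level; `a` and `s` are a hypothetical array response and source):

```python
import numpy as np

# Hypothetical setup: Itot sensors, J snapshots, one source (R = 1).
rng = np.random.default_rng(0)
Itot, J = 8, 100
a = rng.standard_normal(Itot) + 1j * rng.standard_normal(Itot)  # array response
s = rng.standard_normal(J) + 1j * rng.standard_normal(J)        # source snapshots
E = 0.01 * (rng.standard_normal((Itot, J))
            + 1j * rng.standard_normal((Itot, J)))              # sensor noise
X = np.outer(a, s) + E                                          # X = a s^T + E

# Best rank-1 approximation via the dominant singular triplet:
# the top right singular vector spans the source's row space.
U, sv, Vh = np.linalg.svd(X)
s_hat = sv[0] * Vh[0]  # source estimate, up to a complex scale factor

# Normalized correlation with the true source (close to 1 for low noise).
corr = abs(np.vdot(s_hat, s)) / (np.linalg.norm(s_hat) * np.linalg.norm(s))
```

The estimate is only defined up to a complex scaling, which is the usual indeterminacy of blind source separation.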
Consider a row-selection matrix $\mathbf{J}_k \in \mathbb{C}^{I \times \tilde{I}}$ that extracts the rows of $\mathbf{X}$ corresponding to the kth subarray, $k = 1, \ldots, K$. For two identical subarrays, the generalized EVD of the matrices $\mathbf{J}_1\mathbf{X}$ and $\mathbf{J}_2\mathbf{X}$ corresponds to the well-known estimation of signal parameters via rotational invariance techniques (ESPRIT) [69]. For the case $K > 2$, we shall consider $\mathbf{J}_k\mathbf{X}$ as slices of the tensor $\boldsymbol{\mathcal{X}} \in \mathbb{C}^{I \times J \times K}$ (see the section “Tensorization—Blessing of Dimensionality”). It can be shown that the signal part of $\boldsymbol{\mathcal{X}}$ admits a CPD as in (3) and (4), with $\lambda_1 = \cdots = \lambda_R = 1$, $\mathbf{J}_k\mathbf{A} = \mathbf{B}^{(1)}\,\mathrm{diag}(b_{k1}^{(3)}, \ldots, b_{kR}^{(3)})$, and $\mathbf{B}^{(2)} = \mathbf{S}$ [19], and the consequent source separation holds under rather mild conditions; its uniqueness does not require constraints such as statistical independence or constant modulus. Moreover, the decomposition is unique even in cases when the number of sources, R, exceeds the number of subarray sensors, I, or even the total number of sensors, $\tilde{I}$. Note that particular array geometries, such as linearly and uniformly displaced subarrays, can be converted into a constraint on CPD, yielding a further relaxation of the uniqueness conditions, reduced sensitivity to noise, and often faster computation [65].
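The two-subarray generalized EVD can be sketched as follows (an illustrative toy model: `A1` is a hypothetical subarray response, and the displacement between the two subarrays appears as a diagonal phase scaling `D` between the slices):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, R = 3, 6, 2  # subarray sensors, snapshots, sources

# Toy model: subarray 2 sees the same sources as subarray 1 up to
# per-source phase shifts d_r induced by the subarray displacement.
A1 = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
S = rng.standard_normal((J, R)) + 1j * rng.standard_normal((J, R))
d = np.exp(1j * np.array([0.5, 1.2]))  # inter-subarray phase shifts

X1 = A1 @ S.T               # J_1 X: first subarray slice
X2 = A1 @ np.diag(d) @ S.T  # J_2 X: second subarray slice

# Generalized EVD in disguise: the nonzero eigenvalues of pinv(X1) X2
# are exactly the phase shifts d_r.
M = np.linalg.pinv(X1) @ X2
ev = np.linalg.eigvals(M)
```

Sorting `ev` by magnitude and keeping the R largest recovers `d` up to ordering; for $K > 2$ subarrays, the slices $\mathbf{J}_k\mathbf{X}$ stack into the tensor $\boldsymbol{\mathcal{X}}$ and the CPD generalizes this two-slice construction.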