IEEE SIGNAL PROCESSING MAGAZINE [150] MARCH 2015
Cramér–Rao induced bounds for the assessment of CPD performance were derived in [52] and [53].
Since the computation of CPD is intrinsically multilinear, we can arrive at the solution through a sequence of linear subproblems as in the alternating least squares (ALS) framework, whereby the least squares (LS) cost function is optimized for one component matrix at a time, while keeping the other component matrices fixed [6]. As seen from (5), such a conditional update scheme boils down to solving overdetermined sets of linear equations.
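The conditional update scheme can be sketched in a few lines of NumPy (an illustrative toy implementation, not the article's code): each sweep solves one linear LS subproblem per factor matrix, pairing a mode unfolding of the tensor with the Khatri–Rao product of the two fixed factors.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product of B (J x R) and C (K x R)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def cpd_als(T, R, n_iter=500, seed=0):
    """Fit a rank-R CPD of a third-order tensor T (I x J x K) by ALS.

    Each sweep solves three linear least-squares subproblems, one per
    factor matrix, with the other two factors held fixed.
    """
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, R))
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    T1 = T.reshape(I, J * K)                     # mode-1 unfolding
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)  # mode-2 unfolding
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        A = T1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = T2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = T3 @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C
```

In practice, one would add a stopping test on the relative fit and a column normalization between sweeps to avoid scaling drift.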
While the ALS is attractive for its simplicity and satisfactory performance for a few well-separated components and at sufficiently high signal-to-noise ratio (SNR), it also inherits the problems of alternating algorithms and is not guaranteed to converge to a stationary point. This can be rectified by only updating the factor matrix for which the cost function has most decreased at a given step [54], but this results in an N-times increase in computational cost per iteration. The convergence of ALS is not yet completely understood: it is quasilinear close to the stationary point [55], while it becomes rather slow for ill-conditioned cases; for more details, we refer to [56] and [57].
The conventional all-at-once algorithms for numerical optimization, such as nonlinear conjugate gradients, quasi-Newton, or nonlinear least squares (NLS) [58], [59], have been shown to often outperform ALS for ill-conditioned cases and to be typically more robust to overfactoring. However, these come at the cost of a much higher computational load per iteration. More sophisticated versions use the rank-1 structure of the terms within CPD to perform efficient computation and storage of the Jacobian and (approximate) Hessian; their complexity is on par with ALS while, for ill-conditioned cases, the performance is often superior [60], [61].
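An all-at-once NLS fit can be sketched as follows (an illustrative toy, using SciPy's generic Levenberg–Marquardt solver rather than the structure-exploiting algorithms of [60] and [61]): all three factor matrices are stacked into a single parameter vector and updated simultaneously from the full residual.

```python
import numpy as np
from scipy.optimize import least_squares

def cpd_nls(T, R, seed=0):
    """Fit a rank-R CPD of a third-order tensor T (I x J x K) by
    all-at-once nonlinear least squares on the stacked residual
    vec(T - [[A, B, C]])."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(R * (I + J + K))  # random initial factors

    def unpack(x):
        A = x[:I * R].reshape(I, R)
        B = x[I * R:I * R + J * R].reshape(J, R)
        C = x[I * R + J * R:].reshape(K, R)
        return A, B, C

    def residual(x):
        A, B, C = unpack(x)
        return (np.einsum('ir,jr,kr->ijk', A, B, C) - T).ravel()

    sol = least_squares(residual, x0, method='lm')  # Levenberg-Marquardt
    return unpack(sol.x)
```

Here the Jacobian is approximated by finite differences; the efficient algorithms cited above instead exploit the rank-1 structure of the terms to form Jacobian and Gramian blocks cheaply.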
An important difference between matrices and tensors is that the existence of a best rank-R approximation of a tensor of rank greater than R is not guaranteed [22], [62] since the set of tensors whose rank is at most R is not closed. As a result, the cost functions for computing factor matrices may only have an infimum (instead of a minimum) so that their minimization will approach the boundary of that set without ever reaching the boundary point. This will cause two or more rank-1 terms to go to infinity upon convergence of an algorithm; however, numerically, the diverging terms will almost completely cancel one another while the overall cost function will still decrease along the iterations [63]. These diverging terms indicate an inappropriate data model: the mismatch between the CPD and the original data tensor may arise because of an underestimated number of components, not all tensor components having a rank-1 structure, or data being too noisy.
CONSTRAINTS
As mentioned earlier, under quite mild conditions, the CPD is unique by itself, without requiring additional constraints. However, to enhance the accuracy and robustness with respect to noise, prior knowledge of data properties (e.g., statistical independence, sparsity) may be incorporated into the constraints on factors so as to facilitate their physical interpretation, relax the uniqueness conditions, and even simplify computation [64]–[66]. Moreover, the orthogonality and nonnegativity constraints ensure the existence of the minimum of the optimization criterion used [63], [64], [67].
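To illustrate how such a constraint enters the computation, one ALS subproblem can be replaced by a nonnegativity-constrained LS solve (a toy sketch using SciPy's active-set NNLS solver, not one of the cited algorithms):

```python
import numpy as np
from scipy.optimize import nnls

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product of B (J x R) and C (K x R)."""
    return (B[:, None, :] * C[None, :, :]).reshape(-1, B.shape[1])

def nonneg_factor_update(T1, B, C):
    """Nonnegativity-constrained ALS update of the first factor:
    min ||T1 - A @ khatri_rao(B, C).T||_F subject to A >= 0,
    solved row by row with the active-set NNLS solver.
    T1 is the mode-1 unfolding (I x JK) of the tensor."""
    M = khatri_rao(B, C)  # (J*K) x R coefficient matrix
    return np.array([nnls(M, row)[0] for row in T1])
```

Cycling such updates over all three factors gives a basic nonnegative CPD; because the feasible set is closed, the constrained criterion attains its minimum, in line with the existence result above.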
APPLICATIONS
The CPD has already been established as an advanced tool for signal separation in vastly diverse branches of signal processing and data analysis, such as in audio and speech processing, biomedical engineering, chemometrics, and machine learning [7], [24], [25], [28]. Note that algebraic ICA algorithms are effectively based on the CPD of a tensor of the statistics of recordings; the statistical independence of the sources is reflected in the diagonality of the core tensor in Figure 3, i.e., in vanishing cross-statistics [11], [12]. The CPD is also heavily used in exploratory data analysis, where the rank-1 terms capture the essential properties of dynamically complex signals [8]. Another example is in wireless communication, where the signals transmitted by different users correspond to rank-1 terms in the case of line-of-sight propagation [19]. Also, in harmonic retrieval and direction of arrival type applications, real or complex exponentials have a rank-1 structure, for which the use of CPD is natural [36], [65].
EXAMPLE 1
Consider a sensor array consisting of K displaced but otherwise identical subarrays of I sensors, with $\tilde{I} = KI$ sensors in total. For R narrowband sources in the far field, the baseband equivalent model of the array output becomes $\mathbf{X} = \mathbf{A}\mathbf{S}^T + \mathbf{E}$, where $\mathbf{A} \in \mathbb{C}^{\tilde{I} \times R}$ is the global array response, $\mathbf{S} \in \mathbb{C}^{J \times R}$ contains J snapshots of the sources, and $\mathbf{E}$ is the noise. A single source $(R = 1)$ can be obtained from the best rank-1 approximation of the matrix $\mathbf{X}$; however, for $R > 1$, the decomposition of $\mathbf{X}$ is not unique, and, hence, the separation of sources is not possible without incorporating additional information. The constraints on the sources that may yield a unique solution are, for instance, constant modulus and statistical independence [12], [68].
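The single-source case can be sketched directly with an SVD (a toy simulation with made-up dimensions and noise level; `a` and `s` are a hypothetical array response and source):

```python
import numpy as np

# Hypothetical setup: Itot sensors, J snapshots, one source (R = 1).
rng = np.random.default_rng(0)
Itot, J = 8, 100
a = rng.standard_normal(Itot) + 1j * rng.standard_normal(Itot)  # array response
s = rng.standard_normal(J) + 1j * rng.standard_normal(J)        # source snapshots
E = 0.01 * (rng.standard_normal((Itot, J))
            + 1j * rng.standard_normal((Itot, J)))              # sensor noise
X = np.outer(a, s) + E                                          # X = a s^T + E

# Best rank-1 approximation via the dominant singular triplet:
# the top right singular vector spans the source's row space.
U, sv, Vh = np.linalg.svd(X)
s_hat = sv[0] * Vh[0]  # source estimate, up to a complex scale factor

# Normalized correlation with the true source (close to 1 for low noise).
corr = abs(np.vdot(s_hat, s)) / (np.linalg.norm(s_hat) * np.linalg.norm(s))
```

The estimate is only defined up to a complex scaling, which is the usual indeterminacy of blind source separation.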
Consider a row-selection matrix $\mathbf{J}_k \in \mathbb{C}^{I \times \tilde{I}}$ that extracts the rows of $\mathbf{X}$ corresponding to the kth subarray, $k = 1, \ldots, K$. For two identical subarrays, the generalized EVD of the matrices $\mathbf{J}_1\mathbf{X}$ and $\mathbf{J}_2\mathbf{X}$ corresponds to the well-known estimation of signal parameters via rotational invariance techniques (ESPRIT) [69]. For the case $K > 2$, we shall consider $\mathbf{J}_k\mathbf{X}$ as slices of the tensor $\boldsymbol{\mathcal{X}} \in \mathbb{C}^{I \times J \times K}$ (see the section “Tensorization—Blessing of Dimensionality”). It can be shown that the signal part of $\boldsymbol{\mathcal{X}}$ admits a CPD as in (3) and (4), with $\lambda_1 = \cdots = \lambda_R = 1$, $\mathbf{J}_k\mathbf{A} = \mathbf{B}^{(1)}\,\mathrm{diag}(b_{k1}^{(3)}, \ldots, b_{kR}^{(3)})$, and $\mathbf{B}^{(2)} = \mathbf{S}$ [19], and the consequent source separation holds under rather mild conditions; its uniqueness does not require constraints such as statistical independence or constant modulus. Moreover, the decomposition is unique even in cases when the number of sources, R, exceeds the number of subarray sensors, I, or even the total number of sensors, $\tilde{I}$. Note that particular array geometries, such as linearly and uniformly displaced subarrays, can be converted into a constraint on CPD, yielding a further relaxation of the uniqueness conditions, reduced sensitivity to noise, and often faster computation [65].
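The two-subarray generalized EVD can be sketched as follows (an illustrative toy model: `A1` is a hypothetical subarray response, and the displacement between the two subarrays appears as a diagonal phase scaling `D` between the slices):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, R = 3, 6, 2  # subarray sensors, snapshots, sources

# Toy model: subarray 2 sees the same sources as subarray 1 up to
# per-source phase shifts d_r induced by the subarray displacement.
A1 = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
S = rng.standard_normal((J, R)) + 1j * rng.standard_normal((J, R))
d = np.exp(1j * np.array([0.5, 1.2]))  # inter-subarray phase shifts

X1 = A1 @ S.T               # J_1 X: first subarray slice
X2 = A1 @ np.diag(d) @ S.T  # J_2 X: second subarray slice

# Generalized EVD in disguise: the nonzero eigenvalues of pinv(X1) X2
# are exactly the phase shifts d_r.
M = np.linalg.pinv(X1) @ X2
ev = np.linalg.eigvals(M)
```

Sorting `ev` by magnitude and keeping the R largest recovers `d` up to ordering; for $K > 2$ subarrays, the slices $\mathbf{J}_k\mathbf{X}$ stack into the tensor $\boldsymbol{\mathcal{X}}$ and the CPD generalizes this two-slice construction.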