only require a limited, partial exploration of the data matrix.
Tucker variants of this approach have been derived in [99]–[101]
and are illustrated in Figure 11, while a cross-approximation for
the TT format has been derived in [102]. Following a somewhat
different idea, a tensor generalization of the CUR decomposition
of matrices samples fibers on the basis of statistics derived from
the data [103].
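To make the fiber-sampling idea concrete, the sketch below builds a Tucker-like approximation of a third-order tensor by sampling mode-$n$ fibers as the columns of each factor matrix and fitting the core by projection. The uniform random fiber selection and the function names (fiber_sampling_tucker, tucker_reconstruct) are our own illustrative choices; the methods of [99]–[103] select fibers adaptively, e.g., based on data statistics or on the maximum-magnitude entries of the residual tensor.

```python
import numpy as np

def fiber_sampling_tucker(X, ranks, seed=0):
    """Sketch of a Tucker-like cross-approximation: sample mode-n fibers of X
    as the columns of each factor matrix, then fit the core by projecting X
    onto the pseudo-inverses of the sampled factors. Uniform random sampling
    stands in for the adaptive selection rules used in the cited methods."""
    rng = np.random.default_rng(seed)
    factors = []
    for n, r in enumerate(ranks):
        Xn = np.moveaxis(X, n, 0).reshape(X.shape[n], -1)   # mode-n unfolding
        cols = rng.choice(Xn.shape[1], size=r, replace=False)
        factors.append(Xn[:, cols])                          # sampled mode-n fibers
    G = X
    for n, C in enumerate(factors):                          # core: X x_n pinv(C_n)
        G = np.moveaxis(np.tensordot(np.linalg.pinv(C), G, axes=(1, n)), 0, n)
    return G, factors

def tucker_reconstruct(G, factors):
    Xhat = G
    for n, C in enumerate(factors):                          # X_hat = G x_n C_n
        Xhat = np.moveaxis(np.tensordot(C, Xhat, axes=(1, n)), 0, n)
    return Xhat

# Demo on a tensor with exact multilinear rank (3, 3, 3): sampled fibers then
# span the mode-n subspaces and the approximation is exact up to round-off.
rng = np.random.default_rng(1)
core = rng.standard_normal((3, 3, 3))
A = [rng.standard_normal((d, 3)) for d in (10, 12, 14)]
X = tucker_reconstruct(core, A)
G, factors = fiber_sampling_tucker(X, ranks=(3, 3, 3))
err = np.linalg.norm(X - tucker_reconstruct(G, factors)) / np.linalg.norm(X)
print(f"relative approximation error: {err:.2e}")
```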
MULTIWAY REGRESSION—HIGHER-ORDER PARTIAL LS
MULTIVARIATE REGRESSION
Regression refers to the modeling of one or more dependent variables (responses), $\mathbf{Y}$, by a set of independent data (predictors), $\mathbf{X}$. In the simplest case of conditional mean square estimation (MSE), whereby $\hat{y} = E(y\,|\,\mathbf{x})$, the response $y$ is a linear combination of the elements of the vector of predictors $\mathbf{x}$; for multivariate data, the multivariate linear regression (MLR) uses a matrix model, $\mathbf{Y} = \mathbf{X}\mathbf{P} + \mathbf{E}$, where $\mathbf{P}$ is the matrix of coefficients (loadings) and $\mathbf{E}$ is the residual matrix. The MLR solution gives $\mathbf{P} = (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{Y}$ and involves inversion of the moment matrix $\mathbf{X}^{T}\mathbf{X}$. A common technique to stabilize the inverse of the moment matrix $\mathbf{X}^{T}\mathbf{X}$ is the principal component regression (PCR), which employs a low-rank approximation of $\mathbf{X}$.
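As a minimal illustration (not from the original article), the numpy sketch below contrasts the plain MLR solution with its PCR-stabilized variant; the rank parameter and the use of a pseudo-inverse are illustrative choices.

```python
import numpy as np

def mlr(X, Y):
    """Multivariate linear regression: P = (X^T X)^{-1} X^T Y.
    A pseudo-inverse is used in case X^T X is ill conditioned."""
    return np.linalg.pinv(X.T @ X) @ X.T @ Y

def pcr(X, Y, rank):
    """Principal component regression: replace X by its rank-`rank`
    truncated SVD before solving the LS problem, which stabilizes
    the inversion of the moment matrix X^T X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Ur, sr, Vr = U[:, :rank], s[:rank], Vt[:rank]
    # P = V_r diag(1/s_r) U_r^T Y, i.e., the LS solution on the low-rank model
    return Vr.T @ np.diag(1.0 / sr) @ Ur.T @ Y

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
Y = X @ rng.standard_normal((10, 3)) + 0.1 * rng.standard_normal((100, 3))
for fit in (X @ mlr(X, Y), X @ pcr(X, Y, rank=5)):
    print(np.linalg.norm(Y - fit) / np.linalg.norm(Y))   # relative residuals
```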
MODELING STRUCTURE IN DATA—THE PARTIAL LS
Note that in stabilizing multivariate regression, PCR uses only information in the $\mathbf{X}$ variables, with no feedback from the $\mathbf{Y}$ variables. The idea behind the partial LS (PLS) method is to account for structure in the data by assuming that the underlying system is governed by a small number, $R$, of specifically constructed latent variables, called scores, that are shared between the $\mathbf{X}$ and $\mathbf{Y}$ variables; in estimating the number $R$, PLS compromises between fitting $\mathbf{X}$ and predicting $\mathbf{Y}$. Figure 12 illustrates that the PLS procedure 1) uses eigenanalysis to perform a contraction of the data matrix $\mathbf{X}$ to the principal eigenvector score matrix $\mathbf{T} = [\mathbf{t}_1, \ldots, \mathbf{t}_R]$ of rank $R$, and 2) ensures that the $\mathbf{t}_r$ components are maximally correlated with the $\mathbf{u}_r$ components in the approximation of the responses $\mathbf{Y}$; this is achieved when the $\mathbf{u}_r$ are scaled versions of the $\mathbf{t}_r$. The $\mathbf{Y}$-variables are then regressed on the matrix $\mathbf{U} = [\mathbf{u}_1, \ldots, \mathbf{u}_R]$.
Therefore, PLS is a multivariate model with inferential ability that aims to find a representation of $\mathbf{X}$ (or a part of $\mathbf{X}$) that is relevant for predicting $\mathbf{Y}$, using the model

$$\mathbf{X} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E} = \sum_{r=1}^{R} \mathbf{t}_r \mathbf{p}_r^{T} + \mathbf{E}, \qquad (15)$$

$$\mathbf{Y} = \mathbf{U}\mathbf{Q}^{T} + \mathbf{F} = \sum_{r=1}^{R} \mathbf{u}_r \mathbf{q}_r^{T} + \mathbf{F}. \qquad (16)$$
The score vectors $\mathbf{t}_r$ provide an LS fit of the $\mathbf{X}$-data, while at the same time the maximum correlation between the $\mathbf{t}$ and $\mathbf{u}$ scores ensures a good predictive model for the $\mathbf{Y}$ variables. The predicted responses $\mathbf{Y}_{\text{new}}$ are then obtained from new data $\mathbf{X}_{\text{new}}$ and the loadings $\mathbf{P}$ and $\mathbf{Q}$.
In practice, the score vectors $\mathbf{t}_r$ are extracted sequentially, by a series of orthogonal projections followed by the deflation of $\mathbf{X}$. Since the rank of $\mathbf{Y}$ is not necessarily decreased with each new $\mathbf{t}_r$, we may continue deflating until the rank of the $\mathbf{X}$-block is exhausted, so as to balance between prediction accuracy and model order.
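To make the sequential extraction and deflation concrete, here is a minimal NIPALS-style sketch of the model (15)–(16); the initialization, tolerance, and the choice to deflate $\mathbf{Y}$ with respect to $\mathbf{t}_r$ are common conventions rather than the only option, and details differ between PLS variants (e.g., SIMPLS).

```python
import numpy as np

def pls_nipals(X, Y, R, n_iter=500, tol=1e-10):
    """NIPALS-style PLS sketch: extract R pairs of score vectors (t_r, u_r)
    with maximal covariance, deflating X and Y after each component,
    cf. the models (15)-(16)."""
    X, Y = X.copy(), Y.copy()
    T, U, P, Q, W = [], [], [], [], []
    for _ in range(R):
        u = Y[:, [0]]                              # initialize the Y-score u_r
        for _ in range(n_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)                 # X-weights w_r
            t = X @ w                              # X-score t_r
            q = Y.T @ t
            q /= np.linalg.norm(q)                 # Y-loadings q_r
            u_new = Y @ q                          # Y-score u_r
            converged = np.linalg.norm(u_new - u) < tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = X.T @ t / (t.T @ t)                    # X-loadings p_r
        X = X - t @ p.T                            # deflate X, cf. (15)
        Y = Y - t @ (Y.T @ t / (t.T @ t)).T        # deflate Y with respect to t_r
        for lst, v in zip((T, U, P, Q, W), (t, u, p, q, w)):
            lst.append(v)
    return [np.hstack(M) for M in (T, U, P, Q, W)]

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
Y = X[:, :3] @ rng.standard_normal((3, 2)) + 0.1 * rng.standard_normal((50, 2))
T, U, P, Q, W = pls_nipals(X - X.mean(0), Y - Y.mean(0), R=3)
```

Prediction for new data $\mathbf{X}_{\text{new}}$ then proceeds by applying the same sequence of projections and deflations to obtain its scores and mapping them through the loadings $\mathbf{Q}$, consistent with the loadings-based prediction described above; the sketch should be read as one common formulation, not as the definitive algorithm.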
The PLS concept can be generalized to tensors in the follow-
ing ways:
1) Unfolding multiway data. For example, tensors $\underline{\mathbf{X}}\,(I \times J \times K)$ and $\underline{\mathbf{Y}}\,(I \times M \times N)$ can be flattened into long matrices $\mathbf{X}\,(I \times JK)$ and $\mathbf{Y}\,(I \times MN)$ so as to admit matrix-PLS (see Figure 12, and the unfolding sketch after this list). However, such flattening prior to standard bilinear PLS obscures the structure in multiway data and compromises the interpretation of latent components.
2) Low-rank tensor approximation. The so-called N-PLS
attempts to find score vectors having maximal covariance
with response variables, under the constraints that tensors
X
and Y are decomposed as a sum of rank-1 tensors [104].
3) A BTD-type approximation. As in the higher-order PLS
(HOPLS) model shown in Figure 13 [105], the use of block
terms within HOPLS equips it with additional flexibility,
together with a more physically meaningful analysis than
unfolding-PLS and N-PLS.
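As a complement, a minimal sketch of option 1), i.e., unfolding followed by standard bilinear PLS, is given below; it reuses the illustrative pls_nipals routine from the earlier sketch, and the dimensions are arbitrary.

```python
import numpy as np

# Option 1) sketch: flatten the multiway data along the shared sample mode I,
# then apply the bilinear PLS routine (pls_nipals) sketched above.
rng = np.random.default_rng(0)
I, J, K, M, N = 20, 4, 5, 3, 2
X3 = rng.standard_normal((I, J, K))                 # predictors, I x J x K
Y3 = rng.standard_normal((I, M, N))                 # responses,  I x M x N

X = X3.reshape(I, J * K)                            # long matrix, I x JK
Y = Y3.reshape(I, M * N)                            # long matrix, I x MN
T, U, P, Q, W = pls_nipals(X - X.mean(0), Y - Y.mean(0), R=2)

# Loadings can be reshaped back for inspection, e.g., P[:, 0].reshape(J, K),
# but nothing in the bilinear model enforces the original trilinear structure.
p0 = P[:, 0].reshape(J, K)
```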
The principle of HOPLS can be formalized as a set of sequential approximate decompositions of the independent tensor $\underline{\mathbf{X}} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ and the dependent tensor $\underline{\mathbf{Y}} \in \mathbb{R}^{J_1 \times J_2 \times \cdots \times J_M}$ (with $I_1 = J_1$), so as to ensure maximum similarity (correlation) between the scores $\mathbf{t}_r$ and $\mathbf{u}_r$ within the matrices $\mathbf{T}$ and $\mathbf{U}$, based on
[FIG11] The Tucker representation through fiber sampling and cross-approximation: the columns of the factor matrices $\mathbf{C}^{(1)}$, $\mathbf{C}^{(2)}$, $\mathbf{C}^{(3)}$ are sampled from the fibers of the original data tensor $\underline{\mathbf{X}}$, selected at the entries of maximum absolute value within fibers of the residual tensor. Within MWCA, the selected fibers may be further processed using BSS algorithms (two-way CA: PCA, ICA, NMF, etc.).
[FIG12] The basic PLS model performs joint sequential low-rank approximation of the matrix of predictors $\mathbf{X}$ and the matrix of responses $\mathbf{Y}$ so as to share (up to the scaling ambiguity) the latent components, that is, the columns of the score matrices $\mathbf{T}$ and $\mathbf{U}$. The matrices $\mathbf{P}$ and $\mathbf{Q}$ are the loading matrices for predictors and responses, and $\mathbf{E}$ and $\mathbf{F}$ are the corresponding residual matrices.