We collect in this chapter some of the most useful operations on tensors, in view of the applications to Algebraic Statistics.
8.1 Marginalization
The concept of marginalization goes back to the beginnings of the statistical analysis of discrete random variables, when, for the most part, only pairs of variables were compared and correlated. In this case, distributions of the total correlation corresponded to matrices, and it was natural to annotate the sums of rows and columns at the edges (margins) of the matrices.
Definition 8.1.1
Let A be a matrix of type $n\times m$. The marginalization of A is the pair of vectors $(v,w)$, where $v$ is the vector of the sums of the rows of A and $w$ is the vector of the sums of the columns of A.
Notice that we can define as well the marginalization of A by .


Definition 8.1.2
Let T be a tensor of type $a_1\times \dots \times a_n$. The marginalization of T is the n-tuple of vectors whose ith element is the vector of length $a_i$ whose kth entry is the sum of all the entries of T whose ith index equals k.
Example 8.1.3
The marginalization of the tensor of Example 6.1.6
is ((18, 36), (9, 18, 27), (18, 36)).
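For readers who want to experiment, here is a minimal NumPy sketch of the marginalization as the tuple of sums along each index; the sample $2\times 3\times 2$ array below is arbitrary (it is not the tensor of Example 6.1.6).

```python
import numpy as np

def marginalization(T):
    """Return the tuple of i-contractions of T: for each index i,
    sum the entries of T over all the remaining indices."""
    d = T.ndim
    return tuple(T.sum(axis=tuple(j for j in range(d) if j != i)) for i in range(d))

# An arbitrary 2 x 3 x 2 tensor, just to show the shapes of the three margins.
T = np.arange(1, 13, dtype=float).reshape(2, 3, 2)
for i, Ti in enumerate(marginalization(T), start=1):
    print(f"T_{i} =", Ti)
```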
Since the marginalization is a linear map, we can analyze its linear properties. It is immediate to realize that, except for trivial cases, the marginalization is not injective.
Example 8.1.4




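As an illustration of the noninjectivity (with a sample matrix of our own choosing, not necessarily the matrix M of Example 8.1.4), any matrix whose rows and columns all sum to zero is sent to the zero pair by the marginalization.

```python
import numpy as np

# A 2 x 2 matrix with zero row sums and zero column sums: it is sent to
# ((0, 0), (0, 0)) by the marginalization, so the map is not injective.
M = np.array([[1., -1.],
              [-1., 1.]])
print(M.sum(axis=1))   # row sums:    [0. 0.]
print(M.sum(axis=0))   # column sums: [0. 0.]
```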
Even the surjectivity of the marginalization fails in general. This is obvious already for $2\times 2$ matrices, since in this case the marginalization is a noninjective linear map between spaces of the same dimension.
Indeed, if $(v,w)$ is the marginalization of the matrix A, then clearly the sum of the entries of $v$ is equal to the sum of all the entries of A, thus it is also equal to the sum of the entries of $w$. We prove that this is the only restriction for a pair of vectors to belong to the image of the marginalization.
Proposition 8.1.5
A pair of vectors $(v,w)$, with $v$ of length n and $w$ of length m, belongs to the image of the marginalization of $n\times m$ matrices if and only if the sum of the entries of $v$ is equal to the sum of the entries of $w$.
Proof














Corollary 8.1.6
The image of the marginalization of $n\times m$ matrices has dimension $n+m-1$. Thus the kernel of the marginalization has dimension $nm-n-m+1=(n-1)(m-1)$.
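As a concrete check of the dimension count, here is a small NumPy sketch; the explicit matrix of the marginalization in the standard bases is our own construction, not notation from the text.

```python
import numpy as np

def marginalization_matrix(n, m):
    """Matrix of the marginalization of n x m matrices in the standard bases:
    the first n target coordinates are the row sums, the last m the column sums."""
    A = np.zeros((n + m, n * m))
    for i in range(n):
        for j in range(m):
            col = i * m + j        # coordinate of the entry (i, j)
            A[i, col] = 1.0        # entry (i, j) contributes to the i-th row sum
            A[n + j, col] = 1.0    # and to the j-th column sum
    return A

n, m = 3, 4
A = marginalization_matrix(n, m)
rank = np.linalg.matrix_rank(A)
print(rank == n + m - 1)                   # image dimension: n + m - 1
print(n * m - rank == (n - 1) * (m - 1))   # kernel dimension: (n - 1)(m - 1)
```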
We can extend the previous computation to tensors of arbitrary dimension.
Proposition 8.1.7
The image of the marginalization of tensors of type $a_1\times \dots \times a_n$ is the set of the n-tuples of vectors $(v_1,\dots ,v_n)$, with $v_i$ of length $a_i$, such that the sums of the entries of the $v_i$'s are all equal. In particular, the image has dimension $a_1+\dots +a_n-n+1$.
Definition 8.1.8
If the marginalization of a tensor T is the element $(T_1,\dots ,T_n)$, then the vector $T_i$ is called the i-contraction of T.
Next, we will study the connections between the marginalization of a tensor T and the tensor product of its contractions.
Proposition 8.1.9
Let T be a tensor of rank 1. Let $T_i$ be the i-contraction of T. Then T is a scaling of $T_1\otimes \dots \otimes T_n$.
Proof









Remark 8.1.10






Example 8.1.11
Let be the rank 1 tensor, product of
. Then,








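A quick numerical check of Proposition 8.1.9 can be carried out as follows; the normalizing factor used below, the $(n-1)$-st power of the sum of all the entries of T, comes from a direct computation and is offered as a sketch, not as the book's notation.

```python
import numpy as np

# A rank 1 tensor of dimension 3, built from three arbitrary vectors.
v1, v2, v3 = np.array([1., 2.]), np.array([3., 1., 2.]), np.array([1., 4.])
T = np.einsum('i,j,k->ijk', v1, v2, v3)

# Its three contractions (the components of the marginalization).
T1, T2, T3 = T.sum(axis=(1, 2)), T.sum(axis=(0, 2)), T.sum(axis=(0, 1))

# The tensor product of the contractions equals s^(n-1) * T, where s is the
# sum of all the entries of T and n = 3.
P = np.einsum('i,j,k->ijk', T1, T2, T3)
s = T.sum()
print(np.allclose(P, s**2 * T))   # True
```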
When T has rank greater than 1, then clearly it cannot coincide with a multiple of the tensor product $T_1\otimes \dots \otimes T_n$ of its contractions. For general T, the product of its contractions determines a good rank 1 approximation of T.


Example 8.1.12
Consider the tensor T obtained from the rank 1 tensor of Example 8.1.11 by changing only one entry. Its marginalization, i.e., the triple of its contractions, can be computed as before; the product of the contractions, divided by the square of the sum of all the entries of T, gives an approximate rank 1 tensor which is not really far from T.
Indeed, for some purposes, the product of the contractions can be considered as a good rank 1 approximation of a given tensor. We warn the reader that, on the other hand, there are other methods for the rank 1 approximation of tensors which, in many cases, produce a result much closer to the best possible rank 1 approximation. See, for instance, Remark 8.3.12.
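The following sketch reproduces the spirit of Example 8.1.12 with made-up numbers (the book's tensor is different): a rank 1 tensor is perturbed in a single entry, and the normalized product of the contractions gives a rank 1 approximation which, by construction, has the same marginalization as the perturbed tensor.

```python
import numpy as np

# Perturb a rank 1 tensor in one entry.
T = np.einsum('i,j,k->ijk', [1., 2.], [2., 1.], [1., 1.])
T[1, 1, 1] += 1.0

# Rank 1 approximation: product of the contractions, divided by the square of
# the total sum of the entries (an exact reconstruction when T has rank 1).
c1, c2, c3 = T.sum(axis=(1, 2)), T.sum(axis=(0, 2)), T.sum(axis=(0, 1))
approx = np.einsum('i,j,k->ijk', c1, c2, c3) / T.sum() ** 2

print(np.linalg.norm(T - approx))                # distance of the approximation from T
print(np.allclose(approx.sum(axis=(1, 2)), c1))  # True: the marginalization is preserved
```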
8.2 Contractions
The concept of i-contraction of a tensor can be extended to any subset of indices. In this case, we will talk about a partial contraction. The definition is straightforward, though it can look odd at first sight, since it requires many levels of indices.
Definition 8.2.1
Set $[n]=\{1,\dots ,n\}$ and let T be a tensor of type $a_1\times \dots \times a_n$. For any subset $J=\{j_1<\dots <j_m\}\subset [n]$, the contraction of T along J is the tensor $T_J$ of type $a_{j_1}\times \dots \times a_{j_m}$ whose entries are obtained by summing the entries of T over all the indices that do not belong to J.
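For readers who want to experiment, here is a minimal NumPy sketch of the partial contraction; the axes are 0-based in the code, while the text numbers the indices from 1, and the function below is our own illustration rather than notation from the book.

```python
import numpy as np

def contraction(T, J):
    """Contraction of T along the set J of (0-based) indices:
    keep the indices in J and sum over all the others."""
    others = tuple(i for i in range(T.ndim) if i not in set(J))
    return T.sum(axis=others)

T = np.arange(24, dtype=float).reshape(2, 3, 4)
print(contraction(T, [0]).shape)      # (2,): the first contraction
print(contraction(T, [0, 2]).shape)   # (2, 4): contraction along two indices
```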
Example 8.2.2
Consider the tensor:











The last observation of the previous example generalizes.
Proposition 8.2.3
If $J=\{i\}$, then $T_J$ is the ith contraction of T.
Proof
Just compare the two definitions.
Remark 8.2.4
The contraction along a subset $J\subset [n]$ of the indices determines a linear map between spaces of tensors.
If $J\subset J'$, then the contraction $T_J$ is a contraction of $T_{J'}$.
The relation between the rank of a tensor and the ranks of its contractions is expressed as follows:
Proposition 8.2.5
If a tensor T has rank 1, then every contraction of T has rank at most 1. More generally, if T has rank r, then every contraction of T has rank at most r.
Proof
Assume first that T has rank 1, say $T=v_1\otimes \dots \otimes v_n$. The contraction of T along $J=\{j_1<\dots <j_m\}$ is $\big(\prod _{i\notin J}s_i\big)\, v_{j_1}\otimes \dots \otimes v_{j_m}$, where $s_i$ denotes the sum of the entries of $v_i$; hence it has rank at most 1.
The second statement follows immediately from the first one and from the linearity of the contraction. Indeed, if $T=U_1+\dots +U_r$, where each $U_j$ has rank 1, then $T_J=(U_1)_J+\dots +(U_r)_J$, and each summand has rank at most 1.
The following proposition is indeed an extension of Proposition 8.1.9.
Proposition 8.2.6
Let be a tensor of rank 1. Let
be a partition of
, with the property that every element of
is smaller than every element of
, for all i. Define
.
Then the tensor product is a scalar multiple of T.
Proof
The proof is straightforward when we take the (maximal) partition where $m=n$ and $J_i=\{i\}$ for all i. Indeed, in this case $T_{J_i}$ is the ith contraction of T, and we can apply Proposition 8.1.9.
For the general case, we can use Proposition 8.1.9 again and induction on n. Indeed, by assumption, if k is the hth element of $J_i$, then the hth contraction of $T_{J_i}$ is equal to the kth contraction of T.


Example 8.2.7
Consider the rank 1 tensor:





hence .
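A numerical check of Proposition 8.2.6 can be run on an arbitrary rank 1 tensor of dimension 4 with the partition $\{1,2\},\{3,4\}$; the numbers below are our own and are not those of Example 8.2.7.

```python
import numpy as np

# A rank 1 tensor of dimension 4.
v = [np.array([1., 2.]), np.array([1., 3.]), np.array([2., 1.]), np.array([1., 1.])]
T = np.einsum('i,j,k,l->ijkl', *v)

# Contractions along the two blocks of the partition (0-based axes in the code).
T12 = T.sum(axis=(2, 3))              # contraction along {1, 2}
T34 = T.sum(axis=(0, 1))              # contraction along {3, 4}

# Their tensor product is a scalar multiple of T.
P = np.einsum('ij,kl->ijkl', T12, T34)
ratio = P / T                         # all entries of T are nonzero here
print(np.allclose(ratio, ratio.flat[0]))   # True: P is a scalar multiple of T
```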
8.3 Scan and Flattening
The following operations are natural for tensors and often allow a direct computation of the rank.
Definition 8.3.1
Let T be a tensor of type $a_1\times \dots \times a_n$. For any index j we call the scan of T along j the set of the $a_j$ tensors of dimension $n-1$ obtained by fixing the index j in all the possible ways.




By applying the definition recursively, we obtain the definition of scan of a tensor T along a subset of two, three, etc., indices.

Example 8.3.2
Consider the tensor:




Remark 8.3.3
There is an obvious relation between the scan and the contraction of a tensor T. If J is any subset of the set of indices and $J'$ is its complement, then the contraction of T along J equals the sum of the tensors of the scan of the tensor T along $J'$.
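Here is a small NumPy sketch of the scan along one index and of the relation with the contraction stated in the remark; the axes are 0-based in the code and the sample tensor is arbitrary.

```python
import numpy as np

def scan(T, j):
    """The scan of T along the (0-based) index j: the list of the
    (n-1)-dimensional slices obtained by fixing that index in all possible ways."""
    return [np.take(T, k, axis=j) for k in range(T.shape[j])]

T = np.arange(24, dtype=float).reshape(2, 3, 4)
slices = scan(T, 1)
print(len(slices), slices[0].shape)               # 3 slices of shape (2, 4)

# Summing the slices of the scan along index 1 gives the contraction of T
# along the remaining indices.
print(np.allclose(sum(slices), T.sum(axis=1)))    # True
```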
We define the flattening of a tensor by taking the scan along one index, and arranging the resulting tensors in one big tensor.
Definition 8.3.4










Example 8.3.5
Consider the tensor:




Remark 8.3.6
One can apply the flattening procedure after a permutation of the indices. In this way, in fact, one can define the flattening along any of the indices. We leave the details to the reader.
Moreover, one can take an ordered series of indices and perform a sequence of flattening procedures, in order to reduce the dimension of the tensor.
The final target, which is usually the most useful for applications, is the flattening reduction of a tensor to a (usually rather huge) matrix, by performing $n-2$ flattenings of an n-dimensional tensor. If we do not use permutations, the final output is a matrix of size $a_1\times (a_2a_3\cdots a_n)$.
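In coordinates, the repeated flattening amounts to grouping all the indices except the first one, which in NumPy is a single reshape; the sketch below assumes no permutation of the indices and a row-major grouping of the merged indices.

```python
import numpy as np

def full_flattening(T):
    """Reduce an n-dimensional tensor of type a_1 x ... x a_n to a matrix of
    size a_1 x (a_2 * ... * a_n) by merging all the indices after the first."""
    return T.reshape(T.shape[0], -1)

T = np.arange(24, dtype=float).reshape(2, 3, 4)
print(full_flattening(T).shape)   # (2, 12)
```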
The reason why the flattenings are useful, for the analysis of tensors, is based on the following property.
Proposition 8.3.7
A tensor T has rank 1 if and only if all of its flattenings have rank 1.
Proof












The second tensor in Example 8.3.5 shows the flattening of a tensor of rank 1.
Notice that one implication of the Proposition works indeed for any rank. Namely, if T has rank r, then all of its flattenings have rank at most r.
On the other hand, the converse does not hold, at least when the rank is big. For instance, there are tensors of type $2\times 2\times 2$ and rank 3 (see Example 6.4.15) for which, of course, the flattenings, being $2\times 4$ matrices, cannot have rank 3.
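Both phenomena can be checked numerically: all flattenings of a rank 1 tensor have rank 1, while a standard $2\times 2\times 2$ tensor of rank 3 (written below as $e_1\otimes e_1\otimes e_2+e_1\otimes e_2\otimes e_1+e_2\otimes e_1\otimes e_1$; possibly the tensor of Example 6.4.15, which we do not reproduce here) has all flattenings of rank 2.

```python
import numpy as np

def flattening_ranks(T):
    """Ranks of the matrices obtained by flattening T along each index."""
    return [np.linalg.matrix_rank(np.moveaxis(T, i, 0).reshape(T.shape[i], -1))
            for i in range(T.ndim)]

# A rank 1 tensor: every flattening has rank 1.
R = np.einsum('i,j,k->ijk', [1., 2.], [3., 1.], [1., 4.])
print(flattening_ranks(R))        # [1, 1, 1]

# A 2 x 2 x 2 tensor of rank 3: its flattenings only have rank 2.
W = np.zeros((2, 2, 2))
W[0, 0, 1] = W[0, 1, 0] = W[1, 0, 0] = 1.0
print(flattening_ranks(W))        # [2, 2, 2]
```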

Example 8.3.8
The tensor



We have the following straightforward statement:
Proposition 8.3.9








As a corollary of Proposition 8.3.7, we get:
Proposition 8.3.10
If a tensor T has rank r, then its flattening has rank at most r.
Proof
The flattening is a linear operation on the space of tensors. Thus, if $T=U_1+\dots +U_r$ with each $U_i$ of rank 1, then also $FT=FU_1+\dots +FU_r$, and each $FU_i$ has rank 1. The claim follows.
Of course, the rank of the flattening FT can be strictly smaller than the rank of T. For instance, we know from Example 6.4.15 that there are tensors T of type $2\times 2\times 2$ and rank 3. The flattening FT, which is a $2\times 4$ matrix, cannot have rank bigger than 2.
Remark 8.3.11
Here is one application of the flattening procedure to the computation of the rank.
Assume we are given a tensor T of type $a_1\times \dots \times a_n$ and assume we would like to know if the rank of T is r. If $r\le a_1$, then we can perform a series of flattenings along the last indices, obtaining a matrix of size $a_1\times (a_2\cdots a_n)$. Then, we can compute the rank of this matrix (and we have plenty of fast procedures to do this). If the matrix has rank smaller than r, then there is no chance that the original tensor T has rank r. If the matrix has rank r, then this can be considered as a cue toward the fact that T has rank r.
Of course, a similar process is possible, by using permutations on the indices, when $r>a_1$ but $r\le a_i$ for some i.
The flattening process is clearly invertible, so that one can reconstruct the original tensor T from the flattening FT, thus also from the matrix resulting from a process of $n-2$ flattenings.
On the other hand, since a matrix of rank $r\ge 2$ has infinitely many decompositions as a sum of r matrices of rank 1, by taking one decomposition of the resulting matrix as a sum of r matrices of rank 1 one cannot hope to reconstruct automatically from it a decomposition of T as a sum of r tensors of rank 1.
Indeed, the existence of a decomposition for T is subject to the existence of a decomposition for the matrix in which every summand satisfies the condition of Proposition 8.3.9.
Remark 8.3.12
One can try to find an approximation of a given tensor T with a tensor of prescribed, small rank r by taking the matrix resulting from a process of $n-2$ flattenings, and considering the rank r approximation of this matrix obtained by the standard SVD approximation process for matrices (see [1]).
For instance, one can find in this way a rank 1 approximation for a tensor, which in principle is not equal to the rank 1 approximation obtained by the marginalization (see Example 8.1.12).
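The following is a sketch of this procedure, assuming that the flattening is the $a_1\times (a_2\cdots a_n)$ matrix described above; note that, as observed before, the tensor obtained by reshaping the truncated SVD back need not itself have rank r, so this is only a heuristic approximation.

```python
import numpy as np

def svd_flattening_approx(T, r=1):
    """Truncate the SVD of the a_1 x (a_2*...*a_n) flattening of T to rank r
    and reshape the result back to the shape of T."""
    M = T.reshape(T.shape[0], -1)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    M_r = (U[:, :r] * S[:r]) @ Vt[:r, :]     # best rank r approximation of M
    return M_r.reshape(T.shape)

# The same kind of perturbed rank 1 tensor used above (made-up numbers).
T = np.einsum('i,j,k->ijk', [1., 2.], [2., 1.], [1., 1.])
T[1, 1, 1] += 1.0
A = svd_flattening_approx(T, r=1)
print(np.linalg.norm(T - A))   # distance of the SVD-based approximation from T
```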



Example 8.3.13
Consider the tensor:
of Example 8.1.12. The contractions of T are (1, 1), (1, 1), (1, 1); their tensor product, divided by the square of the sum of the entries of T, determines a rank 1 approximation:


which is another rank 1 approximation of T.
If we consider the tensors as vectors in , then the natural distance
is
, while the natural distance
is 1. So
is “closer” to T than
. On the other hand, for some purposes, one could consider
as a better rank 1 approximation of T than
(e.g., it preserves marginalization).
8.4 Exercises
Exercise 15
Prove the assertion in Example 8.1.4: the matrix M defined there generates the kernel of the marginalization.
Exercise 16
Find generators for the kernel of the marginalization map of matrices.
Exercise 17
Find generators for the image of the marginalization map of tensors and prove Proposition 8.1.7.
Exercise 18
Prove the statements of Remark 8.2.4.