# More algebra fun

I recently posted on algebraic rules for rewriting linear and quadratic forms and showed how Kronecker products come into play when expanding products of matrices.

Likewise, we might come into a situation where parameters of interest are inside a Kronecker product which we want to “pull out”. As the operation is still linear this should be an easy step. Strangely at the time I needed it I could not find the algebraic rules for it anywhere. I found one by trial and error for a special case and my colleague Martin Weis came up with the much more general case. Here it is for your convenience:

Let $\ma{A} \in \compl^{M \times N}$ and $\ma{B} \in \compl^{P \times Q}$. Then, $\ma{A} \otimes \ma{B}$ is a matrix with depends on the elements of $\ma{A}$ as well as the elements of $\ma{B}$ linearly. Consequently, ${\rm vec}(\ma{A} \otimes \ma{B})$ must be linear in ${\rm vec}(\ma{A})$ and ${\rm vec}(\ma{B})$. Indeed it is and the coefficient matrices are found as:

$${\rm vec}(\ma{A} \otimes \ma{B}) = \left( \ma{I}_N \otimes \left[ \begin{array}{c} \ma{I}_M \otimes \ma{b}_1 \\ \ma{I}_M \otimes \ma{b}_2 \\ \vdots \\ \ma{I}_M \otimes \ma{b}_Q \end{array} \right] \right) \cdot {\rm vec}(\ma{A}) = \left[ \begin{array}{c} \ma{I}_Q \otimes \ma{a}_1 \otimes \ma{I}_P \\ \ma{I}_Q \otimes \ma{a}_2 \otimes \ma{I}_P \\ \vdots \\ \ma{I}_Q \otimes \ma{a}_N \otimes \ma{I}_P \end{array} \right] \cdot {\rm vec}(\ma{B}).$$

Special cases where either of $M, N, P, Q$ are equal to one (i.e., one or two matrices are row or column vectors) are easily found from this, e.g.,
${\rm vec}(\ma{a} \otimes \ma{B}) = \left[\ma{I}_Q \otimes \ma{a} \otimes \ma{I}_P\right] \cdot {\rm vec}(\ma{B})$ or ${\rm vec}(\ma{A} \otimes \ma{b}) = \left[\ma{I}_{MN} \otimes \ma{b} \right] \cdot {\rm vec}(\ma{A})$.

A related important observation is that the matrix $\ma{A} \otimes \ma{B}$ and the matrix ${\rm vec}(\ma{B}) \cdot {\rm vec}(\ma{A})^{\rm T}$ contain exactly the same elements. Therefore, you can permute a given Kronecker product so that it becomes a rank one matrix, where each column is a scaled version of ${\rm vec}(\ma{B})$ and each row a scaled version of ${\rm vec}(\ma{A})^{\rm T}$. This is particularly relevant when we have an estimate of a matrix that should be a Kronecker product and want to find a “Least-Squares Kronecker factorization”. By performing the above mentioned rearrangement, this problem is transformed into finding a Least-Squared rank-one approximation which is readily solved by a truncated Singular Value Decomposition (SVD) via the Eckart-Young theorem.

Starting from this I wondered what happens if we consider a rank higher than one. This allows us to decompose a given matrix into the sum of Kronecker products and find exact as well as approximate low-rank decompositions with a Least-Squares optimality guarantee. Unfortunately, I found out later that others had this idea before. A decomposition like that is known as Kronecker Product SVD. Here is a slide set by C. Van Loan on it.

# Fun with algebra

Quite often, the key to finding an efficient solution to a given problem is to rewrite it in simpler terms. This requires a fair amount of experience and a good toolbox of rules one can apply to transform things.

For me, problems are often algebraic and I have to deal with products of (tensors,) matrices and vectors and with their determinants, traces, and derivatives thereof.

A very good collection of rules for these kinds of problems has always been the matrix cookbook by K. B. Petersen and M. S. Pedersen. It is available freely in PDF form. The official website seems to be down at the day of writing this post but a google search for matrix cookbook reveals several alternative websites where it can be downloaded.

Over the years I have used many of the rules in the book, but I have also found some that are not listed. I’ve tried to contact the authors and after one initial positive response never heard anything again. Therefore, I will start collecting rules I find here as well, hoping that others may find them useful.

Here is a rule I use very, very frequently. Let $\ma{A} \in \compl^{M \times N}$, $\ma{X} \in \compl^{N \times P}$, and $\ma{B} \in \compl^{P \times Q}$. Then
$${\rm vec}\left(\ma{A} \cdot \ma{X} \cdot \ma{B} \right) = \left(\ma{B}^{\rm T} \otimes \ma{A} \right)\cdot {\rm vec}(\ma{X}),$$
where $\otimes$ represents the Kronecker product and ${\rm vec}\{.\}$ is an operator that stacks all elements of a matrix into a vector, proceeding column by column. This rule is extremely handy for rewriting linear forms. The philosophy is that the left-hand side represents a vector where each element is a linear combination of the elements of $\ma{X}$. Therefore, it is a linear mapping from ${\rm vec}(\ma{X})$ to a new vector. As any linear mapping can be expressed by a matrix multiplication, there must exist such a matrix. We only need to find it. This is what the Kronecker product is good for, it gives us a handy way of expressing such a mapping directly in terms of $\ma{A}$ and $\ma{B}$.

It is often needed to rewrite linear forms like that, “pulling out” all parameters of interest, i.e., when solving least squares problems or finding expectations.

So why not think further? What about quadratic forms? Here are two rules I found for quadratic forms. Albeit not quite the same as for linear forms, both have proven very useful:

$${\rm trace}\left(\ma{X} \cdot \ma{R} \cdot \ma{X}^{\rm H}\right) = {\rm vec}\left(\ma{X}\right)^{\rm H} \cdot \left( \ma{R}^{\rm T} \otimes \ma{I} \right) \cdot {\rm vec}\left(\ma{X}\right)$$

$$\ma{a}^{\rm H}\cdot \ma{X} \cdot \ma{B} \cdot \ma{X}^{\rm H} \cdot \ma{c} = {\rm vec}(\ma{X})^{\rm H}\cdot \left(\ma{B}^{\rm T} \otimes \ma{c}\cdot \ma{a}^{\rm H}\right)\cdot{\rm vec}(\ma{X})$$

where $\ma{a}$ and $\ma{c}$ are column vectors and all other quantities are matrices of arbitrary dimensions as long as they match for the matrix products to be well-defined. The second rule is more general, it can be used to prove the first. To prove the second, use the rule from linear forms twice. Try it, it’s easy.

By the way, a more general version of the first, which is not a quadratic form anymore, would be

$${\rm trace}\left(\ma{X} \cdot \ma{R} \cdot \ma{Y}^{\rm H}\right) = {\rm vec}\left(\ma{Y}\right)^{\rm H} \cdot \left( \ma{R}^{\rm T} \otimes \ma{I} \right) \cdot {\rm vec}\left(\ma{X}\right)$$

Have fun and let me know in case you know related rules!