Fun with algebra

Quite often, the key to finding an efficient solution to a given problem is to rewrite it in simpler terms. This requires a fair amount of experience and a good toolbox of rules for transforming expressions.

For me, these problems are often algebraic: I deal with products of matrices and vectors (and sometimes tensors), and with their determinants, traces, and derivatives thereof.

A very good collection of rules for these kinds of problems has always been The Matrix Cookbook by K. B. Petersen and M. S. Pedersen, which is freely available as a PDF. The official website seems to be down as of this writing, but a Google search for "matrix cookbook" reveals several alternative websites where it can be downloaded.

Over the years I have used many of the rules in the book, but I have also found some that are not listed. I tried to contact the authors and, after one initial positive response, never heard anything again. Therefore, I will start collecting the rules I find here as well, hoping that others may find them useful.

Here is a rule I use very, very frequently. Let $\ma{A} \in \compl^{M \times N}$, $\ma{X} \in \compl^{N \times P}$, and $\ma{B} \in \compl^{P \times Q}$. Then
$${\rm vec}\left(\ma{A} \cdot \ma{X} \cdot \ma{B} \right) = \left(\ma{B}^{\rm T} \otimes \ma{A} \right)\cdot {\rm vec}(\ma{X}),$$
where $\otimes$ denotes the Kronecker product and ${\rm vec}(\cdot)$ is the operator that stacks all elements of a matrix into a vector, proceeding column by column. This rule is extremely handy for rewriting linear forms. The idea is that each element of the left-hand side is a linear combination of the elements of $\ma{X}$, so the whole expression is a linear mapping from ${\rm vec}(\ma{X})$ to a new vector. Since any linear mapping can be expressed as a matrix multiplication, such a matrix must exist; we only need to find it. This is exactly what the Kronecker product is good for: it gives us a handy way of expressing this mapping directly in terms of $\ma{A}$ and $\ma{B}$.
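If you want to convince yourself numerically, here is a quick NumPy sanity check. One caveat: NumPy flattens matrices row by row by default, so we need `order="F"` to get the column-stacking ${\rm vec}$ operator from above.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, P, Q = 3, 4, 5, 2

# Random complex test matrices with the dimensions from the text
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
X = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))
B = rng.standard_normal((P, Q)) + 1j * rng.standard_normal((P, Q))

def vec(M_):
    """Stack the columns of a matrix into one long vector (column-major)."""
    return M_.flatten(order="F")

# vec(A X B) == (B^T kron A) vec(X)
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
print(np.allclose(lhs, rhs))  # True
```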

Rewriting linear forms like this, "pulling out" all parameters of interest, is often needed, e.g., when solving least squares problems or computing expectations.
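For instance, a least squares problem of the form $\min_{\ma{X}} \left\|\ma{A}\cdot\ma{X}\cdot\ma{B} - \ma{C}\right\|_{\rm F}$ can be handed to an ordinary linear solver once ${\rm vec}(\ma{X})$ is pulled out. A minimal sketch (with made-up dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((6, 5))

# min_X ||A X B - C||_F  becomes  min_x ||(B^T kron A) x - vec(C)||_2
K = np.kron(B.T, A)
x, *_ = np.linalg.lstsq(K, C.flatten(order="F"), rcond=None)
X_hat = x.reshape(3, 4, order="F")  # undo the vec operation
```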

So why not take this further? What about quadratic forms? Here are two rules I found for quadratic forms. While not quite as universal as the rule for linear forms, both have proven very useful:

$${\rm trace}\left(\ma{X} \cdot \ma{R} \cdot \ma{X}^{\rm H}\right) = {\rm vec}\left(\ma{X}\right)^{\rm H} \cdot \left( \ma{R}^{\rm T} \otimes \ma{I} \right) \cdot {\rm vec}\left(\ma{X}\right) $$

$$ \ma{a}^{\rm H}\cdot \ma{X} \cdot \ma{B} \cdot \ma{X}^{\rm H} \cdot \ma{c} = {\rm vec}(\ma{X})^{\rm H}\cdot \left(\ma{B}^{\rm T} \otimes \ma{c}\cdot \ma{a}^{\rm H}\right)\cdot{\rm vec}(\ma{X}) $$

where $\ma{a}$ and $\ma{c}$ are column vectors and all other quantities are matrices of arbitrary dimensions, as long as they match so that the matrix products are well-defined. The second rule is the more general one; it can be used to prove the first. To prove the second, apply the rule for linear forms twice. Try it; it's easy.
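Both rules are just as easy to check numerically (again, NumPy needs `order="F"` for the column-stacking ${\rm vec}$):

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 4, 3

X = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
R = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
a = rng.standard_normal((M, 1)) + 1j * rng.standard_normal((M, 1))
c = rng.standard_normal((M, 1)) + 1j * rng.standard_normal((M, 1))

vec = lambda M_: M_.flatten(order="F")
x = vec(X)

# First rule: trace(X R X^H) == vec(X)^H (R^T kron I) vec(X)
lhs1 = np.trace(X @ R @ X.conj().T)
rhs1 = x.conj() @ np.kron(R.T, np.eye(M)) @ x
print(np.allclose(lhs1, rhs1))  # True

# Second rule: a^H X B X^H c == vec(X)^H (B^T kron c a^H) vec(X)
lhs2 = (a.conj().T @ X @ B @ X.conj().T @ c).item()
rhs2 = x.conj() @ np.kron(B.T, c @ a.conj().T) @ x
print(np.allclose(lhs2, rhs2))  # True
```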

By the way, a more general version of the first, which is not a quadratic form anymore, would be

$${\rm trace}\left(\ma{X} \cdot \ma{R} \cdot \ma{Y}^{\rm H}\right) = {\rm vec}\left(\ma{Y}\right)^{\rm H} \cdot \left( \ma{R}^{\rm T} \otimes \ma{I} \right) \cdot {\rm vec}\left(\ma{X}\right) $$
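And a quick check of this one too, now with two distinct matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 4, 3
vec = lambda M_: M_.flatten(order="F")

X = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
Y = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
R = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

# trace(X R Y^H) == vec(Y)^H (R^T kron I) vec(X)
lhs = np.trace(X @ R @ Y.conj().T)
rhs = vec(Y).conj() @ np.kron(R.T, np.eye(M)) @ vec(X)
print(np.allclose(lhs, rhs))  # True
```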

Have fun and let me know in case you know related rules!
