Fun with statistics – transformations of random variables part 2

I recently posted on how to find the distribution of functions of random variables, i.e., the distribution of $Y=g(X)$, where $X$ is a random variable with known distribution and $y=g(x)$ is some function.


As a natural extension of this concept we may ask ourselves what happens if we have two random variables involved. Let us start with one function of two random variables, i.e., given $X$ and $Y$ and knowing their joint PDF $f_{X,Y}(x,y)$ or their joint CDF $F_{X,Y}(x,y) = {\rm Pr}[X \leq x, Y \leq y]$ we would like to calculate the distribution of $Z = g(X,Y)$ where $z=g(x,y)$ is a function with two arguments, e.g., $z=x+y$.

Again, there are multiple ways of addressing this problem. A natural way would be to calculate the CDF of $Z$ directly, i.e., $F_Z(z) = {\rm Pr}[Z \leq z] = {\rm Pr}[g(X,Y) \leq z]$. In other words, we need to compute the probability of the event that relates to all realization of $X$ and $Y$ which satisfy $g(X,Y) \leq z$. This is easily done by integrating the joint PDF $f_{X,Y}(x,y)$ over all points in the set ${\mathcal D}_z$ which contains all points $(x,y)$ for which $g(x,y) \leq z$. Written out, we have

$$F_Z(z) = {\rm Pr}[Z \leq z] = {\rm Pr}[g(X,Y) \leq z] = \iint_{{\mathcal D}_z} f_{X,Y}(x,y) {\rm d}x {\rm d} y$$

Whether or not this approach is easy to follow depends on two things: (1) how easy it is to parametrize the set ${\mathcal D}_z$ and (2) how easy it is to integrate the joint PDF over ${\mathcal D}_z$.

Let us make an example considering the simple function $g(x,y) = x+y$. Then ${\mathcal D}_z$ contains all points $(x,y)$ for which $x+y \leq z$, i.e., $y\leq z-x$ or $x \leq z-y$. Geometrically, this is the set of points that are on the lower-left of a line with slope -1 and offset $z$, i.e., a line passing through $(z,0)$ and $(0,z)$. The integral over this set is relatively simple, as we can directly write it as

$$\displaystyle F_Z(z) = \int_{-\infty}^{+\infty} \int_{-\infty}^{z-y} f_{X,Y}(x,y) {\rm d}x {\rm d} y = \int_{-\infty}^{+\infty} \int_{-\infty}^{z-x} f_{X,Y}(x,y) {\rm d}y {\rm d} x$$.

Another example is $g(x,y) = \max(x,y)$. Since $\max(x,y) \leq z \Leftrightarrow ((x \leq  z) \;\mbox{and} \; (y \leq z))$ we can argue

$$F_Z(z) = {\rm Pr}[\max(X,Y) \leq z] = \int_{-\infty}^z \int_{-\infty}^z f_{X,Y}(x,y) {\rm d}x {\rm d} y$$.

Geometrically, ${\mathcal D}_z$ contains all points on the “lower left” of the point $(z,z)$, i.e., the intersection of the half-planes below $y=z$ and left of $x=z$.


The second extension is to consider two functions of two random variables. Say we are given the distribution of $X$ and $Y$ via their joint PDF, we would like to find the joint PDF of $Z=g(X,Y)$ and $W=h(X,Y)$. There is a closed-form expresion for it as a direct extension of the closed-form expression for the PDF of one function of one random variable. It reads as

$$f_{Z,W}(z,w) = \sum_{i=1}^N \frac{1}{|{\rm det} \ma{J}(x_i,y_i)|} f_{X,Y}(x_i,y_i)$$,

where $(x_i,y_i)$ are all solutions to the system of equations $z=g(x,y)$, $w=h(x,y)$ in $x$ and $y$. Here, $\ma{J}$ is the Jacobian matrix given by

$$ \ma{J} = \left[ \begin{array}{cc}
\frac{\partial g}{x} & \frac{\partial g}{y} \\
\frac{\partial h}{x} & \frac{\partial h}{y}

Moreover, the term ${\rm det} \ma{J}(x_i,y_i)$ means that we first compute the determinant of the Jacobian matrix (in terms of $x$ and $y$) and then insert $x_i(z,w)$ and $y_i(z,w)$.


Example? How about the joint distribution of $X+Y$ and $X-Y$? In this case, solving for $z=x+y$ and $w=x-y$ for $x$ and $y$ is simple, we have one solution given by $x_1 = (z+w)/2$ and $y_1 = (z-w)/2$. The Jacobian matrix is given by

$$ \ma{J} = \left[ \begin{array}{cc}
1& 1 \\ 1 & -1
and hence its determinant is $-2$ everywhere. This gives the solution for $f_{Z,W}(z,w)$ in the form

$f_{Z,W}(z,w) = \frac{1}{2} f_{X,Y}((z+w)/2,(z-w)/2)$.

As in the 1-D case, this direct solution depends heavily on our ability to solve the given functions for $x$ and $y$, which may be tedious for complicated functions.


Interestingly, the first case where we considered one function of one random variable can be solved also via this approach, simply by creating another “auxiliary” variable, and then marginalizing over it. So once we have $Z=g(X,Y)$ we make up another $W=h(X,Y)$, choosing it such that the remaining calculations are simple. For instance, for $g(x,y) = x+y$ we may choose $h(x,y) = y$. Then, the Jacobian matrix becomes

$$ \ma{J} = \left[ \begin{array}{cc}
1& 1 \\ 0 & 1

with determinant one. Moreover, we have $x_1 = z-w$ and $y_1 = w$. Therefore, we get

$$f_{Z,W}(z,w) = f_{X,Y}(z-w,w)$$.

The final step is marginalizing out the auxiliary $W$ which gives

$$f_Z(z) = \int_{-\infty}^{+\infty} f_{X,Y}(z-w,w) {\rm d}w.$$

Looks much like a convolution integral, doesn’t it? In fact, if $X$ and $Y$ are statistically independent, we can write $f_{X,Y}(x,y) = f_X(x) \cdot f_Y(y)$ and hence we obtain

$$f_Z(z) = \int_{-\infty}^{+\infty} f_{X}(z-w)\cdot f_Y(w) {\rm d}w = f_X(x) * f_Y(y),$$

where $*$ denotes convolution. This shows very easily that the PDF of the sum of two random variables is the convolution of their PDFs, if they are statistically independent.