Here is another tool I learned about and have found very useful ever since then: transformations of random variables. The basic idea is this: Given a random variable $X$ from which we know how it is distributed, i.e., either its probability density function (PDF) $f_X(x)$ or its cumulative distribution function (CDF) $F_X(x)$, find the distribution of a new random variable $Y$ formed as $Y=g(X)$, where $y = g(x)$ is a function from $\real$ to $\real$. For example, we may wonder what is the distribution of a squared normal random variable or of the logarithm of a uniform random variable. Tasks like this appear suprisingly often in engineering. Communications are a good example: Your signal of interest is a random process containing random desired information, as well as random undesired perturbations like noise. To find an optimal detector for your information of interest you need to know exactly how it is distributed. Now what happens if your receiver ist best described by some nonlinear function? How do the known (or assumed) distributions of your desired signal and your perturbation change?

Fortunately, the basic procedure of answering the above problem is strikingly simple. There are two basic approaches: one based on the CDF and one based on the PDF. Let us start with the CDF. To find the CDF of $Y$ we need to think what it is: the probability that our random variable $Y$ is less than or equal to some given $y$. Inserting $Y=g(X)$ we can try to solve this for the CDF of $X$:

$$F_Y(y) = {\rm Pr}[Y \leq y] = {\rm Pr}[g(X) \leq y] = {\rm Pr}[X \leq g^{-1}(y)] = F_X(g^{-1}(y))$$.

Really? So simple? Well, not quite. The above step only works if $g(x) = y$ is bijective, i.e., uniquely invertible, and monotonically increasing. The tricky step if of course rewriting the condition $g(X) \leq y$ in terms of a condition on $X$. Here, care needs to be taken. For example, say $g(x) = a\cdot x + b$. Then $g(X) \leq y$ becomes $X \leq (y-b)/a$ only for $a>0$, for $a<0$ we have $X \geq (y-b)/a$. Of course this still admits a simple solution in terms of the CDF since ${\rm Pr}[X \geq (y-b)/a] = 1-{\rm Pr}[X \leq (y-b)/a] + {\rm Pr}[X=(y-b)/a]$ which becomes $1-F_X((y-b)/a)$ if $X$ is continuous ($F_X(x)$ is differentiable).

For other functions there may be multiple solutions. For instance, consider $g(x) = y^2$. This gives $F_Y(y) = {\rm Pr}[X^2 \leq y] = {\rm Pr}[-\sqrt{y} \leq X \leq +\sqrt{y}] = F_X(\sqrt{y}) – F_X(-\sqrt{y})$.

What if we want to work with the PDF instead of the CDF? Sometimes the CDF is not available or very complicated to work with. In this case, we can find a similar process for PDFs. It is easily derived from the previous approach on the CDF by taking its derivative and using the chain rule twice. We find that the PDF of $Y$ can be directly computed as

$$f_Y(y) = \sum_{i=1}^n \frac{1}{|g'(x_i(y))|} f_X(x_i(y)),$$

where $x_i(y)$ are all solutions of $y=g(x)$ in terms of $y$ for $i=1, 2, \ldots, N$. Furthermore, $g'(x)$ is the first derivative of $g(x)$ with respect to $x$. Example. Consider $y=g(x) = x^2$. It has two solutions in $x$: $x_1(y) =+\sqrt{y}$ and $x_2(y) = -\sqrt{y}$. We find $g'(x) = 2x$ and inserting $x_i(y)$ we have $g'(x_i(y)) = \pm 2 \sqrt{y}$. Therefore, the final solution for $f_Y(y)$ becomes

$$f_Y(y) = \frac{1}{|2\sqrt{y}|}f_X(\sqrt{y}) + \frac{1}{|-2\sqrt{y}|} f_X(-\sqrt{y})= \frac{1}{2\sqrt{y}}\left(f_X(\sqrt{y})+f_X(-\sqrt{y})\right).$$

Of course, the entire process has some limitations, the main one being that we need to be able to solve $g(x) = y$ for $x$. This becomes tedious if there are infinitely many solutions, as for instance for $g(x) = \sin(x)$. Even more so, it becomes impossible for many functions. Try $g(x) = {\rm sinc}(x) = \sin(x)/x$ for instance. Even the number of solutions $N$ depends on $y$ and finding $x_i(y)$ explicitly is not really possible.

Still for a wide range of problems this is a quite handy tool.