What is a random variable, $X$?

We need a way to express that an experiment has been conducted and its outcome has been recorded, even though we don’t yet know what that outcome will be. We use the notation $X$ (or sometimes $Y$ or $T$ or $U$ or $Z$ or other capital letters) to denote the outcome of an experiment that has random outcomes.

An event is a set of possible outcomes.

Example: Let $X$ be the outcome of the roll of a six-sided die. Let $\mathcal{O}$ be the event that the roll is odd. We write $\mathcal{O} = (X \in \{1, 3, 5\})$. Naturally, if the roll of the die is a fair one, the probability of this event occurring is written $P(\mathcal{O}) = 1/2$, or $P(X \in \{1, 3, 5\}) = 1/2.$
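We can check this probability empirically. The sketch below (using only the standard library; the sample size is an arbitrary choice) simulates many fair die rolls and estimates $P(X \in \{1, 3, 5\})$ as the fraction of rolls landing in the event:

```python
import random

random.seed(0)  # fixed seed so the experiment is reproducible

n = 100_000
rolls = [random.randint(1, 6) for _ in range(n)]  # fair six-sided die

# Estimate P(X in {1, 3, 5}) by the relative frequency of the event
p_odd = sum(r in {1, 3, 5} for r in rolls) / n
print(round(p_odd, 2))  # close to 0.5
```

With a large number of rolls, the estimate settles near the true value $1/2$.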

We say that a random variable is discrete if the set of values it can attain is finite or “countable.” One way to think about it is that if you marked the values $X$ can achieve on a number line, there would be gaps between every possible value. So the outcome of a roll of a fair die, or the number of field mice populating a farm, would be discrete random variables. By contrast, a continuous random variable can take on all possible values in some interval.

Probability Mass Functions

If we wish to abstractly represent specific values that a random variable might take, we use the lower-case letter that corresponds to the name of the random variable. This comes up when we are writing probability mass functions, or pmfs.

<aside> <img src="/icons/fleur-de-lis_purple.svg" alt="/icons/fleur-de-lis_purple.svg" width="40px" />

Definition (Probability Mass Function)

Let $X$ be a random variable whose range is the discrete (countable) set $\mathcal{X}$. The probability mass function (pmf) of $X$ is a function $p(x)$ satisfying

$$ \sum_{x \in \mathcal{X}} p(x) = 1, $$

and for each $x \in \mathcal{X}$, we have

$$ P(X = x) = p(x). $$

</aside>
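As a quick sanity check of the definition, here is a minimal sketch (names are illustrative) that stores the pmf of a fair six-sided die, $p(x) = 1/6$ for $x \in \{1, \dots, 6\}$, and verifies that it sums to exactly 1 using exact rational arithmetic:

```python
from fractions import Fraction

# pmf of a fair six-sided die: p(x) = 1/6 on the range {1, ..., 6}
support = range(1, 7)
p = {x: Fraction(1, 6) for x in support}

total = sum(p.values())
assert total == 1  # the defining property: the pmf sums to one
print(total)  # 1
```

Using `Fraction` avoids floating-point rounding, so the sum is exactly 1 rather than approximately 1.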

Finer Detail: Random variables versus deterministic functions

In pre-calculus or calculus, you were probably introduced to functions, which are defined to be maps from a domain to a range. That is to say, for every value $x$ in the domain, we assign a value $y$ in the range. For example, if we define $f$ to be the function $f(x) = x^2$, then the domain is the whole real line $\mathbb{R}$ and the range is the non-negative real line $\mathbb{R}_+$ (all the real numbers greater than or equal to zero). In probability theory, we call such a function deterministic. We are aware of the values in the domain, so we know which value in the range will come up.

In probability theory, random variables are functions as well, but we are not aware of the values in the domain. We only see the values in the range and we only know how likely different values are to be observed.
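One loose computational analogy (not the rigorous measure-theoretic picture): a deterministic function returns the same output every time we evaluate it at the same input, while a random variable behaves like a function whose input we never get to see, so repeated observations can differ:

```python
import random

def f(x):
    # deterministic: the same input always yields the same output
    return x ** 2

random.seed(1)

def observe_X():
    # "random": the underlying input is hidden from us;
    # we only see outcomes and their frequencies
    return random.randint(1, 6)

assert f(3) == f(3) == 9           # deterministic evaluations repeat
samples = [observe_X() for _ in range(5)]
print(samples)                     # outcomes vary from call to call
```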

Rigorous sketches of probability mass functions

A key skill for building probabilistic intuition is to be able to sketch pmfs carefully. Throughout the guided numerical experiments you will be asked to report rigorous sketches of pmfs and of histograms from experiments. Here are the essential values to mark:

$$ \text{Approximate Gaussianity:} \quad \begin{array}{rl} P(X \in [\mu - \sigma, \mu + \sigma]) &\approx \,\, 0.7;\\ P(X \in [\mu - 2\sigma, \mu + 2\sigma]) &\approx \,\, 0.95;\\ P(X \in [\mu - 3\sigma, \mu + 3\sigma]) &\approx \,\, 0.99. \end{array} $$
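These interval probabilities can themselves be checked by a numerical experiment. The sketch below draws standard normal samples ($\mu = 0$, $\sigma = 1$, chosen for illustration) and reports the fraction landing within $k$ standard deviations of the mean, which should come out near 0.68, 0.95, and 0.997, consistent with the rounded values above:

```python
import random

random.seed(0)  # reproducible experiment

n = 100_000
# standard normal samples: mu = 0, sigma = 1
xs = [random.gauss(0.0, 1.0) for _ in range(n)]

fracs = {}
for k in (1, 2, 3):
    # fraction of samples in [mu - k*sigma, mu + k*sigma]
    fracs[k] = sum(-k <= x <= k for x in xs) / n
    print(f"P(X in [-{k}, {k}]) ~ {fracs[k]:.3f}")
```

The three printed fractions give an empirical version of the approximate-Gaussianity table.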