The first task in evaluating the implications of experimental observations is to formulate a hypothesis test before looking at the experimental outcome.
<aside> <img src="/icons/fleur-de-lis_purple.svg" alt="/icons/fleur-de-lis_purple.svg" width="40px" />
Evaluation: If the observed value of the test statistic satisfies the rejection criterion, then we reject the null hypothesis $H_0$. Otherwise we fail to reject $H_0$. Note that in this framing, we can never “prove the alternative hypothesis”.
</aside>
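To make the rejection-criterion language concrete, here is a minimal sketch in Python of how a two-sided rejection criterion could be set up before looking at the data. The coin-flip setting, the sample size, and the significance level are hypothetical choices for illustration, not examples from the course.

```python
from scipy.stats import binom

# Hypothetical setup: H_0: p = 0.5 (fair coin) vs. H_A: p != 0.5,
# based on the number of heads X in n = 100 flips, at level alpha = 0.05.
n, p0, alpha = 100, 0.5, 0.05

# Two-sided rejection criterion, chosen BEFORE seeing the data:
# reject H_0 when X is far from n*p0 = 50 in either direction, with the
# cutoffs set so the probability of rejecting under H_0 is at most alpha.
lower = int(binom.ppf(alpha / 2, n, p0)) - 1       # reject if X <= lower
upper = int(binom.ppf(1 - alpha / 2, n, p0)) + 1   # reject if X >= upper

def reject_H0(observed_heads: int) -> bool:
    """Apply the rejection criterion to the experimental outcome."""
    return observed_heads <= lower or observed_heads >= upper

print(lower, upper)      # the rejection region: X <= 39 or X >= 61
print(reject_H0(61))     # an outcome of 61 heads leads us to reject H_0
```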
Example 1.1: Framing an investigation in terms of a two-sided hypothesis test.
Example 1.2: Framing an investigation in terms of a one-sided hypothesis test.
Example 1.3: Polling surveys, revisited.
The null hypothesis posits a single value for the parameter of interest. But the alternative hypothesis consists of a range of values, and those values can be extremely close to the null hypothesis. For example, suppose that $H_0$ is that $p = 0.5$ and $H_A$ is $p \neq 0.5$. Then $p = 0.5000001$ is technically a version of the alternative hypothesis, but for all practical purposes this version of the alternative would be indistinguishable from the null. So how do we deal with this?
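One way to see the issue numerically: under a Binomial model, the probability that a level-$0.05$ test of $H_0: p = 0.5$ rejects is essentially identical whether the truth is $p = 0.5$ or $p = 0.5000001$. The sketch below computes this rejection probability exactly; the sample size and cutoffs are hypothetical illustrations.

```python
from scipy.stats import binom

# Hypothetical level-0.05 two-sided test of H_0: p = 0.5 based on n = 1000 flips.
n, p0, alpha = 1000, 0.5, 0.05
lower = int(binom.ppf(alpha / 2, n, p0)) - 1       # reject if X <= lower
upper = int(binom.ppf(1 - alpha / 2, n, p0)) + 1   # reject if X >= upper

def rejection_probability(true_p: float) -> float:
    """Exact probability that the test rejects H_0 when the true proportion is true_p."""
    return binom.cdf(lower, n, true_p) + binom.sf(upper - 1, n, true_p)

print(rejection_probability(0.5))        # at most 0.05 by construction
print(rejection_probability(0.5000001))  # indistinguishable from the null: still about 0.05
print(rejection_probability(0.6))        # a practically different p is detected almost always
```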
<aside> <img src="/icons/fleur-de-lis_purple.svg" alt="/icons/fleur-de-lis_purple.svg" width="40px" />
There are two main reasons we want to develop good probability models: (1) to design good rejection criteria for hypothesis tests, and (2) to quantify how surprising an observed outcome is under the null hypothesis.

The first purpose translates into using probability models to construct good rejection criteria for hypothesis tests. Without a probability model, we would have no idea how to set up a good test! The second purpose translates into what we will later call $p$-values. On the positive side, $p$-values provide a universal way to communicate scientific results. On the negative side, $p$-values also provide a way to hide scientific malpractice and to exaggerate scientific findings.
</aside>
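As a concrete, entirely hypothetical illustration of the second purpose, the sketch below uses a Binomial null model to turn an observed outcome into a two-sided $p$-value. The sample size and observed count are made up, and the symmetric "at least as far from the expected count" convention is one common way of defining a two-sided $p$-value.

```python
from scipy.stats import binom

# Hypothetical data: 61 heads in n = 100 flips, with null model Binomial(100, 0.5).
n, p0, observed = 100, 0.5, 61
expected = n * p0                      # 50 heads expected under H_0

# Two-sided p-value: the probability, under H_0, of an outcome at least as far
# from the expected count as the one actually observed (in either direction).
distance = abs(observed - expected)
p_value = binom.cdf(expected - distance, n, p0) + binom.sf(expected + distance - 1, n, p0)
print(p_value)   # a small p-value means the observed outcome is surprising under H_0
```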
To think about the outcomes of experiments, we must propose a mathematical model for uncertainty. There are three primary methods: the first two are purely theoretical, while the third uses the data itself.
<aside> <img src="/icons/fleur-de-lis_purple.svg" alt="/icons/fleur-de-lis_purple.svg" width="40px" />
Methods for modeling the intrinsic uncertainty of experiments.
For part one of this course, we will focus on simulation studies, Hypergeometric models, Binomial models, and Poisson models. In the second and third parts of the course, we will introduce several more probability models for uncertainty.

</aside>
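To preview how a simulation study relates to a theoretical model, the sketch below compares the two approaches on the same question: the chance of seeing 61 or more heads in 100 flips of a fair coin. The numbers are a hypothetical illustration; the point is that a well-designed simulation approximates the answer the Binomial model gives exactly.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)

# Question (hypothetical): how likely is it to see 61 or more heads in 100 fair-coin flips?
n, p = 100, 0.5

# Theoretical Binomial model: exact probability.
exact = binom.sf(60, n, p)             # P(X >= 61)

# Simulation study: repeat the experiment many times and count how often it happens.
reps = 100_000
heads = rng.binomial(n, p, size=reps)
simulated = np.mean(heads >= 61)

print(exact, simulated)                # the simulation estimate should be close to the exact value
```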
For the statistical decisions we are studying here, we start with a testable hypothesis and conclude with a statement that we REJECT the null hypothesis $H_0$ or FAIL TO REJECT $H_0$. To quantify the quality of a given test, we need to evaluate its probability of leading to an incorrect conclusion, or error. Since either of these two conclusions can be wrong, depending on whether $H_0$ is actually true or false, there are two types of error.
<aside> <img src="/icons/fleur-de-lis_purple.svg" alt="/icons/fleur-de-lis_purple.svg" width="40px" />
Definition (Type I and Type II Errors). A Type I error occurs when we reject $H_0$ even though $H_0$ is true. A Type II error occurs when we fail to reject $H_0$ even though $H_0$ is false.
These errors can be summarized by the following chart.

|                      | $H_0$ is true    | $H_0$ is false   |
| -------------------- | ---------------- | ---------------- |
| Reject $H_0$         | Type I error     | Correct decision |
| Fail to reject $H_0$ | Correct decision | Type II error    |
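To connect these definitions back to probability models, here is a minimal simulation sketch estimating both error rates for the hypothetical fair-coin test used above (rejection region $X \le 39$ or $X \ge 61$ out of $n = 100$ flips, chosen for level $0.05$); the alternative value $p = 0.6$ is likewise an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical level-0.05 two-sided test of H_0: p = 0.5 with n = 100 flips,
# using the rejection region X <= 39 or X >= 61.
n, reps = 100, 100_000

def test_rejects(true_p: float) -> np.ndarray:
    """Simulate the experiment many times and record whether the test rejects H_0."""
    heads = rng.binomial(n, true_p, size=reps)
    return (heads <= 39) | (heads >= 61)

type_I_rate = np.mean(test_rejects(0.5))       # H_0 true, but we reject:          Type I error
type_II_rate = 1 - np.mean(test_rejects(0.6))  # H_0 false, but we fail to reject: Type II error
print(type_I_rate, type_II_rate)
```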