Now that we have a language for probability distributions, and a few probability distributions we can use for modeling, we can address the probability calculations that arise in hypothesis testing.

Recall that significance, power, and $p$-values are all probabilities.

Assessing the quality of critical value tests

In Chapter 1, we learned how to make statistical decisions using the framework of critical value hypothesis tests, but we never discussed how to pick a “good” rejection region. There are several ways to do this.

2σ-tests

A $2\sigma$-test is a quick “back of the envelope” technique for evaluating whether the outcome of an experiment is sufficient evidence to reject a null hypothesis.

<aside> <img src="/icons/fleur-de-lis_purple.svg" alt="/icons/fleur-de-lis_purple.svg" width="40px" />

Definition ($2\sigma$-rule for Rejection Criteria)

Assumptions

  1. The null hypothesis $H_0$, alternative hypothesis $H_1$, and test statistic $X$ are defined, but you are seeking a good first guess for a rejection criterion.
  2. Your probability model for $X$ is approximately Gaussian when the null hypothesis is true.
  3. The mean, $\mu = E_{H_0}(X)$, and standard deviation, $\sigma =\mathrm{SD}_{H_0}(X)$, are known for your test statistic when the null hypothesis $H_0$ is true.

Then the $2\sigma$-rule for constructing a rejection region is

$$ \mathcal{R} = \{X < \mu - 2 \sigma\} \, \cup \, \{X > \mu + 2\sigma\}. $$

</aside>
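As a rough illustration of the definition, the sketch below computes the $2\sigma$ cutoffs and checks whether an observed value falls in the rejection region. The function names (`two_sigma_region`, `in_rejection_region`, `gaussian_cdf`) and the coin-flip example (100 flips of a fair coin, with $X$ the number of heads) are assumptions made for illustration and are not part of the course template. The last lines also compute the significance level the rule would have if the null model were exactly Gaussian, which is about 0.0455.

```python
import math

def two_sigma_region(mu, sigma):
    """Return the (lower, upper) cutoffs of the 2-sigma rejection region."""
    return mu - 2 * sigma, mu + 2 * sigma

def in_rejection_region(x, mu, sigma):
    """True if x lies in {X < mu - 2*sigma} or {X > mu + 2*sigma}."""
    lower, upper = two_sigma_region(mu, sigma)
    return x < lower or x > upper

def gaussian_cdf(z):
    """Standard Gaussian CDF, written with the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical example: X counts heads in n = 100 flips, H0 says p = 0.5.
n, p = 100, 0.5
mu = n * p                          # binomial mean under H0
sigma = math.sqrt(n * p * (1 - p))  # binomial standard deviation under H0

lower, upper = two_sigma_region(mu, sigma)
print(lower, upper)                        # 40.0 60.0
print(in_rejection_region(62, mu, sigma))  # True: 62 > 60
print(in_rejection_region(55, mu, sigma))  # False: 40 <= 55 <= 60

# Significance of the rule if the null model were exactly Gaussian.
alpha = 2 * (1 - gaussian_cdf(2.0))
print(round(alpha, 4))                     # 0.0455
```

Because the rule uses only $\mu$ and $\sigma$, the same two functions apply to any test statistic that satisfies the three assumptions above.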

Template:

26Spring_Template_HT_2sigma.pdf

Frequently Asked Questions

What are some circumstances when this rule does not work?

The $2\sigma$-rule most often leads to errors in your statistical decisions when your probability model is skewed, or when the right and left tails of the distribution look substantially different from those of a Gaussian distribution. Skewness can occur when a binomial distribution has a very low or very high success probability. The tails of a HyperGeometric distribution can look non-Gaussian if the total population size $N$ is not large compared to the size of the target population $K$.
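To make the skewness issue concrete, here is a minimal sketch that compares the exact probability in each tail of the $2\sigma$ rejection region for two binomial models. The parameter choices ($n = 100$ with $p = 0.02$ and $p = 0.5$) and the helper names `binomial_pmf` and `two_sigma_tail_probs` are hypothetical, chosen only for illustration. For an exactly Gaussian model, each tail would carry a probability of about 0.0228.

```python
import math

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def two_sigma_tail_probs(n, p):
    """Exact (left, right) tail probabilities of the 2-sigma rejection
    region when X is Binomial(n, p)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    lower, upper = mu - 2 * sigma, mu + 2 * sigma
    left = sum(binomial_pmf(k, n, p) for k in range(n + 1) if k < lower)
    right = sum(binomial_pmf(k, n, p) for k in range(n + 1) if k > upper)
    return left, right

# Probability in one tail beyond 2 sigma for an exactly Gaussian model.
gaussian_tail = 0.5 * (1 - math.erf(2 / math.sqrt(2)))

# Skewed case: very low success probability.
left_skewed, right_skewed = two_sigma_tail_probs(100, 0.02)

# Symmetric case: success probability of one half.
left_sym, right_sym = two_sigma_tail_probs(100, 0.5)

print("Gaussian, each tail:", gaussian_tail)
print("p = 0.02:", left_skewed, right_skewed)
print("p = 0.5: ", left_sym, right_sym)
```

With $p = 0.02$ the lower cutoff $\mu - 2\sigma$ is negative, so the left tail has probability exactly zero and the entire rejection region sits in the right tail, which ends up carrying more than the Gaussian per-tail value. With $p = 0.5$ the two tails are equal by symmetry.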

The $p$-value method for hypothesis testing