The Bayesian method consists of three steps:
$$ \mathrm{Prior} \times \mathrm{Likelihood} \rightarrow \mathrm{Posterior} $$
In the general setting there is a parameter $\theta$ that we want to learn about from an experiment. We have a prior assumption about it that can be quantified by the function $\mathrm{Prior}(\theta)$. The function $P(\mathrm{Data} \, | \, \theta)$ is read “the likelihood of observing the data given that the parameter value is $\theta$.” The function $P(\theta \, | \, \mathrm{Data})$ is read “the probability that the parameter’s true value is $\theta$ given the observed data.” These are related through the general formula
$$ P(\theta \, | \, \mathrm{Data}) = \frac{P(\mathrm{Data} \, | \, \theta) \, \mathrm{Prior}(\theta)}{P_\mathrm{Prior}(\mathrm{Data})} $$
where $P_\mathrm{Prior}(\mathrm{Data})$ is “the likelihood of observing the data weighted by the prior distribution of $\theta$,” that is,
$$ P_\mathrm{Prior}(\mathrm{Data}) = \sum_\theta P(\mathrm{Data} \, | \, \theta) \, \mathrm{Prior}(\theta), $$
with the sum replaced by an integral when $\theta$ is continuous.
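To make the formula concrete, here is a minimal numerical sketch. The setup (a coin flip with unknown bias $\theta$, a flat prior, and a discrete grid of candidate values) is an assumption chosen for illustration, not part of the development above:

```python
import numpy as np

# Candidate parameter values: a discrete grid 0.01, 0.02, ..., 0.99
theta = np.linspace(0.01, 0.99, 99)

# Flat prior: Prior(theta) assigns equal weight to every candidate value
prior = np.ones_like(theta) / theta.size

# Hypothetical data (assumed for the sketch): 7 heads in 10 coin flips
heads, flips = 7, 10

# Likelihood of the data given each theta: P(Data | theta)
likelihood = theta**heads * (1 - theta)**(flips - heads)

# Evidence: likelihood of the data weighted by the prior, P_Prior(Data)
evidence = np.sum(likelihood * prior)

# Posterior via the general formula: P(theta | Data)
posterior = likelihood * prior / evidence

# The value of theta with the highest posterior probability (here 0.7)
print(theta[np.argmax(posterior)])
```

With a flat prior the posterior simply follows the likelihood, so the posterior mode sits at the observed frequency of heads; a non-uniform `prior` array would pull it toward the prior assumption.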
The mathematical details of how to interpret this formula and actually calculate the posterior depend on what we are trying to infer. We will study two important cases: