Prior distributions and uncertainty

Peter Ralph

Advanced Biological Statistics

Biased coins

a motivating example

Suppose I have two trick coins:

one (coin A) comes up heads 75% of the time, and
the other (coin B) only 25% of the time.

But, I lost one and I don’t know which! So, I flip it 10 times and get 6 heads. Which is it, and how sure are you?

Possible answers:

Er, probably coin (A)?
Well, \[\begin{aligned} \P\{ \text{6 H in 10 flips} \given \text{coin A} \} &= \binom{10}{6} (.75)^6 (.25)^4 \\ &= 0.146 \end{aligned}\] and \[\begin{aligned} \P\{ \text{6 H in 10 flips} \given \text{coin B} \} &= \binom{10}{6} (.25)^6 (.75)^4 \\ &= 0.016 \end{aligned}\] … so, probably coin (A)?

For a precise answer…

Before flipping, each coin seems equally likely. Then

\[\begin{aligned} \P\{ \text{coin A} \given \text{6 H in 10 flips} \} &= \frac{ \frac{1}{2} \times 0.146 }{ \frac{1}{2} \times 0.146 + \frac{1}{2} \times 0.016 } \\ &= 0.9 \end{aligned}\]

Probability

Bayes’ rule

\[\begin{aligned} \P\{B \given A\} = \frac{\P\{B\} \P\{A \given B\}}{ \P\{A\} } , \end{aligned}\]

where

$B$: possible model
$A$: data
$\P\{B\}$: prior weight on model $B$
$\P\{A \given B\}$: likelihood of data under $B$
$\P\{B\} \P\{A \given B\}$: posterior weight on $B$
$\P\{A\}$: total sum of posterior weights

Breaking it down with more coins

Suppose instead I had 9 coins, with probabilities 10%, 20%, …, 90%; as before I flipped one 10 times and got 6 heads. For each $\theta$ in $0.1, 0.2, \ldots, 0.8, 0.9,$ find \[\begin{aligned} \P\{\text{ coin has prob $\theta$ } \given \text{ 6 H in 10 flips } \} . \end{aligned}\]

Question: which coin(s) is it, and how sure are we? (And, what does it mean when we say how sure we are?)

Uniform prior

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r the_prior

Weak prior

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r weak_prior

Strong prior

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r strong_prior

The likelihood: 6 H in 10 flips

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r ten_flips

The likelihood: 30 H in 50 flips

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r fifty_flips

The likelihood: 60 H in 100 flips

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r 100_flips

The likelihood: 6,000 H in 10,000 flips

prior

$\times$

likelihood

$\propto$

posterior

plot of chunk r ten_thou_flips

A question

What is the right answer to the “coin question”?

Recall: there were nine possible values of $\theta$.

Which coin is it, and how sure are you?

Possible types of answer:

“best guess”
“range of values”
“don’t know”

theta <- (1:9)/10
prior <- rep(1/9, 9)
likelihood <- dbinom(6, size=10, prob=theta)
posterior <- prior*likelihood/sum(prior*likelihood)
names(posterior) <- theta
barplot(posterior, xlab='true prob of heads', main='posterior probability')

plot of chunk r plot_pos

Unknown coins

Motivating example

Now suppose we want to estimate the probability of heads for a coin without knowing the possible values. (or, a disease incidence, or error rate in an experiment, …)

We flip it $n$ times and get $z$ Heads.

The likelihood of this, given the prob-of-heads $\theta$, is: \[p(z \given \theta) = \binom{n}{z}\theta^z (1-\theta)^{n-z} . \]

How to weight the possible $\theta$? Need a flexible set of weighting functions, i.e., prior distributions on $[0,1]$.

Beta distributions.

What $\alpha$ and $\beta$ would we use for a $\Beta(\alpha, \beta)$ prior if:

the coin is probably close to fair.
the disease is probably quite rare.
no idea whatsoever.

plot of chunk r beta_stuff

Beta-Binomial Bayesian analysis

If \[\begin{aligned} P &\sim \text{Beta}(a,b) \\ Z &\sim \text{Binom}(n,P) , \end{aligned}\]

then “miraculously”,

\[\begin{aligned} (P \given Z = z) \sim \text{Beta}(a+z, b+n-z) . \end{aligned}\]

Discuss:

We flip an odd-looking coin 100 times, and get 65 heads. What is it’s true* probability of heads?

What prior to use?
Plot the prior and the posterior.
Is it reasonable that $\theta = 1/2$?
Best guess at $\theta$?
How far off are we, probably?

Tools include: rbeta( )

IN CLASS

prior_alpha <- prior_beta <- 1
post_alpha <- prior_alpha + 65
post_beta <- prior_beta + 35
post_mean <- post_alpha / (post_alpha + post_beta)

xvals <- seq(0, 1, length.out=101)
plot(xvals, dbeta(xvals, prior_alpha, prior_beta),
     type='l', ylim=c(0,10), ylab='density', xlab='theta')
lines(xvals, dbeta(xvals, post_alpha, post_beta),
    col='red')
abline(v=post_mean, lty=1, col='green')
abline(v= qbeta(c(.025, .975), post_alpha, post_beta),
       lty=3, col='blue')
legend("topleft", lty=c(1,1,1,3), col=c('black', 'red', 'green', 'blue'),
        legend=c("prior", "posterior", "posterior mean", "95% CI"))
abline(v=1/2, lty=3)

plot of chunk r in_calss It doesn’t look likely that $\theta = 1/2$: in fact, the posterior probability that $\theta \le 1/2$ is pbeta(1/2, post_alpha, post_beta). The posterior mean is 0.6470588. Our guess is probably acurate to within about 10% or so; a 95% credible interval is from 0.7363926 to 0.5522616.

If we were being frequentist, a 95% confidence interval for the mean (i.e., the proportion of heads) would be almost exactly the same thing.

phat <- 65/100
se <- sqrt(phat * (1 - phat) / 100)
conf_int <- phat + c(+1, -1) * 1.96 * se