Peter Ralph
Advanced Biological Statistics
Suppose I have two trick coins:
But, I lost one and I don’t know which! So, I flip it 10 times and get 6 heads. Which is it, and how sure are you?
Er, probably coin (A)?
Well,
For a precise answer…
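Before flipping, each coin seems equally likely. Then, by Bayes' rule,
$$ P(A \mid 6\,\mathrm{H}) = \frac{\tfrac{1}{2} P(6\,\mathrm{H} \mid A)}{\tfrac{1}{2} P(6\,\mathrm{H} \mid A) + \tfrac{1}{2} P(6\,\mathrm{H} \mid B)} , $$
where $P(6\,\mathrm{H} \mid A) = \binom{10}{6} p_A^6 (1 - p_A)^4$ is the probability of 6 heads in 10 flips of coin A, whose probability of heads is $p_A$, and likewise for coin B.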
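As a minimal sketch of this two-coin calculation in R: the heads probabilities below are assumed values chosen only for illustration (substitute the actual values for the two coins), and the variable names are hypothetical.

```r
# Hypothetical heads probabilities for the two coins (illustrative assumptions)
p_heads <- c(A = 0.75, B = 0.25)
# Equal prior probability for each coin before flipping
prior <- c(A = 0.5, B = 0.5)
# Likelihood of 6 heads in 10 flips under each coin
likelihood <- dbinom(6, size = 10, prob = p_heads)
# Bayes' rule: posterior is proportional to prior times likelihood
posterior <- prior * likelihood / sum(prior * likelihood)
print(posterior)
```

With these assumed values the likelihood ratio is $(0.75/0.25)^2 = 9$, so the posterior probability of coin A is 0.9.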
Suppose instead I had 9 coins, with probabilities 10%, 20%, …, 90%;
as before I flipped one 10 times and got 6 heads. For each coin, find the posterior probability that it is the one I flipped.
Question: which coin(s) is it, and how sure are we? (And, what does it mean when we say how sure we are?)
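One way to work this out is a short R sketch that computes the prior, likelihood, and posterior for all nine coins. It assumes each coin was equally likely to be the one flipped (a uniform prior, carried over from the two-coin example); the variable names are illustrative.

```r
# Heads probabilities of the nine coins: 10%, 20%, ..., 90%
theta <- seq(0.1, 0.9, by = 0.1)
# Assumed uniform prior: each coin equally likely to be the one flipped
prior <- rep(1 / length(theta), length(theta))
# Likelihood of 6 heads in 10 flips for each coin
likelihood <- dbinom(6, size = 10, prob = theta)
# Posterior is proportional to prior times likelihood
posterior <- prior * likelihood / sum(prior * likelihood)
print(round(data.frame(theta, prior, likelihood, posterior), 4))
# Single most probable coin, and the probability it is one of 50%, 60%, 70%
best <- theta[which.max(posterior)]
prob_50_to_70 <- sum(posterior[5:7])
```

Here `best` is a single best guess, while `prob_50_to_70` is one way of saying how sure we are about a range of coins.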
[Figure panels: prior, likelihood, posterior]
Recall: there were nine possible values of the probability of heads: 10%, 20%, …, 90%.
Which coin is it, and how sure are you?
Possible types of answer:
Now suppose we want to estimate the probability of heads for a coin without knowing the possible values. (or, a disease incidence, or error rate in an experiment, …)
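We flip it $n$ times and get $z$ heads.
The likelihood of this, given the prob-of-heads $\theta$, is
$$ p(z \mid \theta) = \binom{n}{z} \theta^z (1 - \theta)^{n - z} . $$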
How to weight the possible values of the prob-of-heads?
What prior would we use if:
the coin is probably close to fair.
the disease is probably quite rare.
no idea whatsoever.
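One flexible family of weightings on $[0, 1]$ is the Beta distribution. As a rough sketch, the shape parameters below are illustrative guesses for the three scenarios, not values taken from the lecture:

```r
# Illustrative Beta(shape1, shape2) priors; the specific shapes are assumptions
priors <- list(
  close_to_fair = c(20, 20),  # concentrated near 1/2
  quite_rare    = c(1, 50),   # most mass near 0
  no_idea       = c(1, 1)     # uniform on [0, 1]
)
# Summarize each prior by its mean and central 95% interval
prior_summary <- t(sapply(priors, function(ab) {
  c(mean  = ab[1] / (ab[1] + ab[2]),
    lower = qbeta(0.025, ab[1], ab[2]),
    upper = qbeta(0.975, ab[1], ab[2]))
}))
print(round(prior_summary, 3))
```

Larger shape parameters give a more concentrated prior, so the same mean can be held with more or less confidence.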
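If the prob-of-heads has prior $\theta \sim \mathrm{Beta}(a, b)$ and the number of heads in $n$ flips is $Z \sim \mathrm{Binomial}(n, \theta)$,
then “miraculously”, the posterior is $(\theta \mid Z = z) \sim \mathrm{Beta}(a + z, b + n - z)$.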
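A quick numerical check of the Beta-Binomial update (prior Beta(a, b), z heads in n flips, posterior Beta(a + z, b + n - z)) can be sketched as follows. The prior shapes and data here are hypothetical, chosen only to exercise the formula.

```r
# Hypothetical prior shapes and data, chosen only for illustration
a <- 2; b <- 3
n <- 10; z <- 6
# Unnormalized posterior: prior density times binomial likelihood
unnorm <- function(theta) dbeta(theta, a, b) * dbinom(z, size = n, prob = theta)
# Normalizing constant by numerical integration over [0, 1]
norm_const <- integrate(unnorm, lower = 0, upper = 1)$value
# Compare with the closed-form Beta(a + z, b + n - z) density on a grid
theta_grid <- seq(0.05, 0.95, by = 0.05)
numeric_post <- unnorm(theta_grid) / norm_const
exact_post <- dbeta(theta_grid, a + z, b + n - z)
max_abs_diff <- max(abs(numeric_post - exact_post))
print(max_abs_diff)
```

The two densities should agree up to numerical integration error, which is why no integration is needed in practice: the posterior shape parameters can be read off directly.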
We flip an odd-looking coin 100 times, and get 65 heads. What is its true* probability of heads?
What prior to use?
Plot the prior and the posterior.
Is it reasonable that the coin is fair ($\theta = 1/2$)?
Best guess at $\theta$, the probability of heads?
How far off are we, probably?
Tools include: rbeta( )
# Uniform prior: Beta(1, 1)
prior_alpha <- prior_beta <- 1
# Conjugate update: add the 65 heads to alpha and the 35 tails to beta
post_alpha <- prior_alpha + 65
post_beta <- prior_beta + 35
post_mean <- post_alpha / (post_alpha + post_beta)
# Plot the prior (black) and posterior (red) densities
xvals <- seq(0, 1, length.out=101)
plot(xvals, dbeta(xvals, prior_alpha, prior_beta),
     type='l', ylim=c(0,10), ylab='density', xlab='theta')
lines(xvals, dbeta(xvals, post_alpha, post_beta),
      col='red')
# Mark the posterior mean and the 95% credible interval
abline(v=post_mean, lty=1, col='green')
abline(v=qbeta(c(.025, .975), post_alpha, post_beta),
       lty=3, col='blue')
legend("topleft", lty=c(1,1,1,3), col=c('black', 'red', 'green', 'blue'),
       legend=c("prior", "posterior", "posterior mean", "95% CI"))
# Dotted black line at theta = 1/2 (a fair coin)
abline(v=1/2, lty=3)
It doesn’t look likely that $\theta = 1/2$: the posterior probability that $\theta \le 1/2$ is
pbeta(1/2, post_alpha, post_beta)
, which is well under 1%. The posterior mean is
0.6470588. Our guess is probably accurate to within about 10% or so; a
95% credible interval is from 0.5522616 to 0.7363926.
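The same summaries can be approximated by simulation with `rbeta()`, which is handy when no closed form is available. This is a sketch under the uniform Beta(1, 1) prior used here; the seed and the number of draws are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, for reproducibility
# Posterior shapes: uniform Beta(1, 1) prior plus 65 heads and 35 tails
post_alpha <- 1 + 65
post_beta <- 1 + 35
# Draw samples from the posterior distribution
draws <- rbeta(1e5, post_alpha, post_beta)
# Monte Carlo estimates of the posterior mean, the 95% credible interval,
# and the posterior probability that theta is at most 1/2
sim_mean <- mean(draws)
sim_ci <- quantile(draws, c(0.025, 0.975))
sim_p_below_half <- mean(draws <= 1/2)
print(c(mean = sim_mean, sim_ci, p_below_half = sim_p_below_half))
```

The simulated values should be close to the exact ones from `qbeta()` and `pbeta()`, with Monte Carlo error shrinking as the number of draws grows.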
If we were being frequentist, a 95% confidence interval for the mean (i.e., the proportion of heads) would be almost exactly the same thing.
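As a rough check on this claim, the sketch below compares the credible interval with two standard frequentist intervals for a binomial proportion: the normal-approximation (Wald) interval and the exact (Clopper-Pearson) interval returned by `binom.test()`. Which interval is meant is not stated, so the choice of these two is an assumption.

```r
n <- 100; heads <- 65
# Sample proportion and its estimated standard error
p_hat <- heads / n
se <- sqrt(p_hat * (1 - p_hat) / n)
# Normal-approximation (Wald) 95% confidence interval
wald_ci <- p_hat + c(-1, 1) * qnorm(0.975) * se
# Exact (Clopper-Pearson) 95% confidence interval
exact_ci <- as.numeric(binom.test(heads, n)$conf.int)
# Bayesian 95% credible interval under a uniform Beta(1, 1) prior
cred_ci <- qbeta(c(0.025, 0.975), 1 + heads, 1 + n - heads)
print(rbind(wald = wald_ci, clopper_pearson = exact_ci, credible = cred_ci))
```

With this much data and a flat prior, the three intervals differ by less than a couple of percentage points at each end.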