Peter Ralph
17 November 2020 – Advanced Biological Statistics
Suppose now we have data from \(n\) different coins from the same source. We don’t assume they have the same \(\theta\), but don’t know what its distribution is, so try to learn it.
\[\begin{aligned} Z_i &\sim \Binom(N_i, \theta_i) \\ \theta_i &\sim \Beta(\alpha, \beta) \\ \alpha &\sim \Unif(0, 100) \\ \beta &\sim \Unif(0, 100) \end{aligned}\]
Goal: find the posterior distribution of \(\alpha\), \(\beta\).
Problem: we don’t have a nice mathematical expression for this posterior distribution.
data {
// stuff you input
}
transformed data {
// stuff that's calculated from the data (just once, at the start)
}
parameters {
// stuff you want to learn the posterior distribution of
}
transformed parameters {
// stuff that's calculated from the parameters (at every step)
}
model {
// the action!
}
generated quantities {
// stuff you want computed also along the way
}
How to do everything: see the user’s manual.
We’ve flipped a coin 10 times and got 6 Heads. We think the coin is close to fair, so put a \(\Beta(20,20)\) prior on it’s probability of heads, and want the posterior distribution.
\[\begin{aligned} Z &\sim \Binom(10, \theta) \\ \theta &\sim \Beta(20, 20) \end{aligned}\] Sample from \[\theta \given Z = 6\]
\[\begin{aligned} Z &\sim \Binom(10, \theta) \\ \theta &\sim \Beta(20, 20) \end{aligned}\]
data {
// stuff you input
}
parameters {
// stuff you want to learn
// the posterior distribution of
}
model {
// the action!
}
\[\begin{aligned} Z &\sim \Binom(10, \theta) \\ \theta &\sim \Beta(20, 20) \end{aligned}\]
data {
int N; // number of flips
int Z; // number of heads
}
parameters {
// stuff you want to learn
// the posterior distribution of
}
model {
// the action!
}
\[\begin{aligned} Z &\sim \Binom(10, \theta) \\ \theta &\sim \Beta(20, 20) \end{aligned}\]
data {
int N; // number of flips
int Z; // number of heads
}
parameters {
// probability of heads
real<lower=0,upper=1> theta;
}
model {
// the action!
}
\[\begin{aligned} Z &\sim \Binom(10, \theta) \\ \theta &\sim \Beta(20, 20) \end{aligned}\]
data {
int N; // number of flips
int Z; // number of heads
}
parameters {
// probability of heads
real<lower=0,upper=1> theta;
}
model {
Z ~ binomial(N, theta);
theta ~ beta(20, 20);
}
With a uniform prior, the “maximum posterior” parameter values are also the maximum likelihood values.
> help(optimizing)
optimizing package:rstan R Documentation
Obtain a point estimate by maximizing the joint posterior
Description:
Obtain a point estimate by maximizing the joint posterior from the
model defined by class 'stanmodel'.
Usage:
## S4 method for signature 'stanmodel'
optimizing(object, data = list(),
## $par
## theta
## 0.5208333
##
## $value
## [1] -33.22939
##
## $return_code
## [1] 0
##
## $theta_tilde
## theta
## [1,] 0.5208333
The maximum a posteriori probability estimate (MAP) is 0.5208333.
post_fun <- function (p) {
n <- 10; z <- 6; a <- b <- 20
lik <- dbinom(z, size=n, prob=p)
prior <- dbeta(p, a, b)
return( prior * lik )
}
curve(post_fun, 0, 1, xlab=expression(theta), ylab='posterior prob')
points(fit$par, post_fun(fit$par), col='red', pch=20)
Note: Since the prior was uniform, then this is the maximum likelihood estimate (MLE) also.
lp__
is the log posterior density. Note n_eff
and Rhat
.
## Inference for Stan model: 5f3115ac9593cded393e67d2d0e3ce84.
## 3 chains, each with iter=10000; warmup=5000; thin=1;
## post-warmup draws per chain=5000, total post-warmup draws=15000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## theta 0.52 0.00 0.07 0.38 0.47 0.52 0.57 0.66 6136 1
## lp__ -35.12 0.01 0.71 -37.17 -35.28 -34.85 -34.67 -34.62 5608 1
##
## Samples were drawn using NUTS(diag_e) at Mon Nov 16 09:44:31 2020.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
Fuzzy caterpillars are good.
Stan uses ggplot2.
## Warning: Removed 2 rows containing missing values (geom_bar).
The samples:
What’s the posterior probability that \(\theta < 0.5\)?
## [1] 0.384
## [1] 0.3877248
\[\begin{aligned} Z_i &\sim \Binom(N_i, \theta_i) \\ \theta_i &\sim \Beta(\alpha, \beta) \\ \alpha &\sim \Unif(0, 100) \\ \beta &\sim \Unif(0, 100) \end{aligned}\]
data {
// the data (input)
}
parameters {
// the parameters (output)
}
model {
// how they are related
}
\[\begin{aligned} Z_i &\sim \Binom(N_i, \theta_i) \\ \theta_i &\sim \Beta(\alpha, \beta) \\ \alpha &\sim \Unif(0, 100) \\ \beta &\sim \Unif(0, 100) \end{aligned}\]
data {
int n; // number of coins
int N[n]; // number of flips
int Z[n]; // number of heads
}
parameters {
// the parameters (output)
}
model {
// how they are related
}
\[\begin{aligned} Z_i &\sim \Binom(N_i, \theta_i) \\ \theta_i &\sim \Beta(\alpha, \beta) \\ \alpha &\sim \Unif(0, 100) \\ \beta &\sim \Unif(0, 100) \end{aligned}\]
data {
int n; // number of coins
int N[n]; // number of flips
int Z[n]; // number of heads
}
parameters {
// probability of heads
real<lower=0,upper=1> theta[n];
real<lower=0,upper=100> alpha;
real<lower=0,upper=100> beta;
}
model {
// how they are related
}
\[\begin{aligned} Z_i &\sim \Binom(N_i, \theta_i) \\ \theta_i &\sim \Beta(\alpha, \beta) \\ \alpha &\sim \Unif(0, 100) \\ \beta &\sim \Unif(0, 100) \end{aligned}\]
data {
int n; // number of coins
int N[n]; // number of flips
int Z[n]; // number of heads
}
parameters {
// probability of heads
real<lower=0,upper=1> theta[n];
real<lower=0, upper=100> alpha;
real<lower=0, upper=100> beta;
}
model {
Z ~ binomial(N, theta);
theta ~ beta(alpha, beta);
// uniform priors "go without saying"
// alpha ~ uniform(0, 100);
// beta ~ uniform(0, 100);
}
Data:
set.seed(23)
ncoins <- 100
true_theta <- rbeta(ncoins, 20, 50)
N <- rep(50, ncoins)
Z <- rbinom(ncoins, size=N, prob=true_theta)
Find the posterior distribution on alpha
and beta
: check convergence with print()
and stan_trace()
, then plot using stan_hist()
, pairs()
, and/or stan_scat()
(with e.g., pars=c("alpha", "beta", "theta[1]", "theta[2]")
).
data {
int n; // number of coins
int N[n]; // number of flips
int Z[n]; // number of heads
}
parameters {
// probability of heads
real<lower=0,upper=1> theta[n];
real<lower=0,upper=100> alpha;
real<lower=0,upper=100> beta;
}
model {
Z ~ binomial(N, theta);
theta ~ beta(alpha, beta);
// uniform priors 'go without saying'
// alpha ~ uniform(0, 100);
// beta ~ uniform(0, 100);
}
Write down and compile the model:
bag_block <- "
data {
int n; // number of coins
int N[n]; // number of flips
int Z[n]; // number of heads
}
parameters {
// probability of heads
real<lower=0,upper=1> theta[n];
real<lower=0,upper=100> alpha;
real<lower=0,upper=100> beta;
}
model {
Z ~ binomial(N, theta);
theta ~ beta(alpha, beta);
// uniform priors 'go without saying'
// alpha ~ uniform(0, 100);
// beta ~ uniform(0, 100);
}
"
bag_model <- stan_model(model_code=bag_block)
Sample from the model:
## Error in .local(object, ...): object 'ncoins' not found
Sample more from the model:
## Error in .local(object, ...): object 'ncoins' not found
Check for convergence: n_eff
are all pretty big, and Rhat
are all equal to 1!
## Error in print(bag_fit): object 'bag_fit' not found
Check diagnostic plots: – whoops, the upper bound of \(\beta \le 100\) is constraining things; we should probably increase that.
## Error in is.stanreg(object): object 'bag_fit' not found
More plots: Hm, the posteriors of \(\alpha\) and \(\beta\) are correlated.
## Error in pairs(bag_fit, pars = c("theta[1]", "theta[2]", "alpha", "beta")): object 'bag_fit' not found
Results:
## Error in is.stanreg(object): object 'bag_fit' not found
What about the mean \(\theta\) in the bag? The mean of a Beta distribution is \(\alpha / (\alpha + \beta)\), so let’s get the posterior distribution of that:
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'extract': object 'bag_fit' not found
post_mean <- samples$alpha / (samples$alpha + samples$beta)
hist(post_mean, main='posterior distribution of mu', xlab=expression(alpha / (alpha + beta)))
## Error in hist.default(post_mean, main = "posterior distribution of mu", : invalid number of 'breaks'
The posterior mean of \(\mu = \alpha / (\alpha + \beta)\) is NaN, and a 95% credible interval is from NA to NA.
int x; // an integer
int y[10]; // ten integers
real z; // a number
real z[2,5]; // a 2x5 array of numbers
vector[10] u; // length 10 vector
matrix[10,10] v; // 10x10 matrix
vector[10] w[10]; // ten length 10 vectors
;
stan_model( )
.sampling( )
.Rhat
, stan_trace( )
, pairs( )
, etc).stan_plot( )
, stan_hist( )
, etc).