Peter Ralph
Advanced Biological Statistics
The brms
package lets you
fit hierarchical models using Stan
with mixed-model syntax!!!
# e.g.
brm(formula = z ~ x + y + (1 + y|f), data = xy,
family = poisson(link='log'))
# or
brm(formula = z ~ x + y + (1 + y|f), data = xy,
family = student(link='identity'))
brm(formula = z ~ x + y + (1 + y|f), data = xy,
family = gaussian(link='identity'))
formula
: almost just like lme4
data
: must contain all the variablesfamily
: distribution of responselink
: connects mean to linear predictorThere are several classes of parameter in a brms model:
b
: the population-level (or, fixed) effectssd
: the standard deviations of group-level (or, random) effectssigma
for the GaussianExamples:
b_x
: the slope of x
: class="b", coef="x"
sd_f
: the SD of effects for levels of f
: class="sd", coef="f"
Pass a vector of “priors”, specified by
set_prior(prior, class="b", ...)
where prior
is some valid Stan code.
brm(formula = z ~ x + y + (1 + y|f), data = xy,
family = gaussian(link='identity'),
prior=c(set_prior("normal(0, 5)", class="b"),
set_prior("cauchy(0, 1)", class="sd", coef="f")))
Default:
## prior class coef group resp dpar nlpar bound source
## (flat) b default
## (flat) b x (vectorized)
## (flat) b y (vectorized)
## lkj(1) cor default
## lkj(1) cor f (vectorized)
## student_t(3, 1.6, 2.5) Intercept default
## student_t(3, 0, 2.5) sd default
## student_t(3, 0, 2.5) sd f (vectorized)
## student_t(3, 0, 2.5) sd Intercept f (vectorized)
## student_t(3, 0, 2.5) sd y f (vectorized)
## student_t(3, 0, 2.5) sigma default
## Compiling Stan program...
## Start sampling
## Warning: There were 121 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: There were 75 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: There were 121 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: z ~ x + y + (1 + y | f)
## Data: xy (Number of observations: 100)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Group-Level Effects:
## ~f (Number of levels: 3)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.08 0.19 0.00 0.52 1.00 926 1132
## sd(y) 1.16 0.42 0.54 2.16 1.00 1668 1890
## cor(Intercept,y) -0.08 0.54 -0.95 0.91 1.00 1134 1187
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.02 0.10 -0.09 0.15 1.00 2467 1498
## x 1.00 0.01 0.98 1.03 1.00 3633 2541
## y 1.90 0.66 0.55 3.20 1.00 994 1200
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.10 0.01 0.09 0.12 1.00 3580 2548
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Or…
launch_shinystan(the_fit)
Summaries of, or samples from, the posteriors of:
fixef( )
: “fixed” effectsranef( )
: “random” effectsfitted( )
: posterior distribution of mean response (see posterior_epred
)predict( )
: posterior distribution of actual responses (see posterior_predict
)hypothesis( )
: get posterior distributions of functions of various parameters (e.g., difference between two classes)conditional_effects( )
: effect of one predictor conditioned on values of othersparnames( )
: list of parameter namespp_check( )
: compare response distribution to posterior predictive simulationsloo( )
leave-one-out crossvalidation for model comparisonbayes_R2( )
: Bayesian \(r^2\)help(brmsformula)
help(brmsfamily)
(but note can use those in help(family)
also)help(set_prior)
and also check what can have a prior with get_prior( )
stancode(the_fit)
(and standata(the_fit)
)loo( )
\[\begin{aligned} Y &\sim \text{Family}(\text{mean}=\mu) \\ \mu &= \text{link}(X \beta) . \end{aligned}\]
We can ask about:
\[\begin{aligned} Y &\sim \text{Family}(\text{mean}=\mu) \\ \mu &= \text{link}(X \beta + Z u) . \end{aligned}\]
We can ask about:
… with or without the group-level (random) effects, \(u\).
batting <- read.csv("data/BattingAveragePlus.csv", header=TRUE, stringsAsFactors=TRUE)
batting$scaled_height <- (batting$height - mean(batting$height))/sd(batting$height)
batting$scaled_weight <- (batting$weight - mean(batting$weight))/sd(batting$weight)
bb_fit <- brm(
Hits | trials(AtBats) ~ 0 + scaled_weight + scaled_height + PriPos + (1 | PriPos:Player),
data = batting,
family = "binomial",
prior = c(prior(normal(0, 5), class = b),
prior(normal(0, 5), class = sd)),
iter = 2000, chains = 3
)
## Compiling Stan program...
## Start sampling
Example: Is height associated with batting average? Which positions tend to be better batters?
Translated: What’s the posterior distribution of the effect of height on logit batting average?
Tools: (all have pars=
arguments)
summary( )
mcmc_hist( )
mcmc_intervals( )
extract( )
## Family: binomial
## Links: mu = logit
## Formula: Hits | trials(AtBats) ~ 0 + scaled_weight + scaled_height + PriPos + (1 | PriPos:Player)
## Data: batting (Number of observations: 886)
## Draws: 3 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 3000
##
## Group-Level Effects:
## ~PriPos:Player (Number of levels: 886)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.14 0.01 0.12 0.16 1.00 1173 1824
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## scaled_weight 0.01 0.01 -0.01 0.04 1.00 1298 1966
## scaled_height -0.01 0.01 -0.03 0.02 1.00 1326 1885
## PriPos1stBase -1.08 0.03 -1.13 -1.03 1.00 1632 2030
## PriPos2ndBase -1.09 0.03 -1.15 -1.04 1.00 1482 2158
## PriPos3rdBase -1.06 0.02 -1.11 -1.01 1.00 1744 2141
## PriPosCatcher -1.16 0.03 -1.21 -1.11 1.00 1851 2193
## PriPosCenterField -1.05 0.03 -1.10 -1.00 1.00 1434 1943
## PriPosLeftField -1.10 0.02 -1.14 -1.05 1.00 1619 1745
## PriPosPitcher -1.91 0.05 -2.00 -1.82 1.00 3786 2374
## PriPosRightField -1.05 0.03 -1.11 -0.99 1.00 1104 1437
## PriPosShortstop -1.10 0.03 -1.16 -1.04 1.00 1653 1911
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Example: What is the mean batting average of each position, as a function of weight?
Tools:
conditional_effects( )
: means in abstract conditionsfitted( )
: expected values, summarized (optionally, for new data)posterior_epred( )
: draws from the posterior mean (optionally, for new data)conditional_effects
for average-height-and-weight:
## Setting all 'trials' variables to 1 by default if not specified otherwise.
conditional_effects
by weight at average height:
## Setting all 'trials' variables to 1 by default if not specified otherwise.
conditional_effects
for shortstops by weight that are one SD above average height:
conditional_effects(bb_fit, effects="scaled_weight",
conditions=data.frame(PriPos="Shortstop", scaled_height=1))
## Setting all 'trials' variables to 1 by default if not specified otherwise.
The fitted( )
values: for the original data
posterior_epred( )
: posterior distribution of expected values
Mean number of hits out of 50 at-bats for a shortstop of average weight that is 1.5 SD above average height?
pp <- posterior_epred(bb_fit,
newdata=data.frame(
PriPos="Shortstop",
scaled_weight=0,
scaled_height=1.5,
AtBats=50),
re_formula=NA
)
head(pp)
## [,1]
## [1,] 12.47813
## [2,] 12.26702
## [3,] 12.26739
## [4,] 12.30489
## [5,] 12.54477
## [6,] 12.13928
Suppose Miguel Cairo, and Franklin Gutierrez go up to bat 150 more times, how many hits will they get?
(mc <- subset(batting, Player %in% c("Miguel Cairo", "Franklin Gutierrez"))[, c("Player", "PriPos", "Hits", "AtBats", "scaled_height", "scaled_weight" )])
## Player PriPos Hits AtBats scaled_height scaled_weight
## 329 Franklin Gutierrez Center Field 39 150 0.2568063 -0.5613294
## 641 Miguel Cairo 1st Base 28 150 -0.1757610 0.6614721
Tools:
predict( )
: summaryposterior_predict( )
: get samples## Player PriPos Hits AtBats scaled_height scaled_weight Estimate Est.Error Q2.5 Q97.5
## 329 Franklin Gutierrez Center Field 39 150 0.2568063 -0.5613294 38.90500 6.355636 27 52
## 641 Miguel Cairo 1st Base 28 150 -0.1757610 0.6614721 34.83633 6.001574 24 47
fitted( )
vs predict( )
predict
always has greater uncertainty.
## Player PriPos Hits AtBats scaled_height scaled_weight predict.Estimate predict.Est.Error predict.Q2.5 predict.Q97.5 fitted.Estimate fitted.Est.Error fitted.Q2.5 fitted.Q97.5
## 329 Franklin Gutierrez Center Field 39 150 0.2568063 -0.5613294 38.88267 6.361425 27 52 38.92534 3.247120 32.81933 45.41366
## 641 Miguel Cairo 1st Base 28 150 -0.1757610 0.6614721 34.82467 5.905609 24 47 34.71744 3.186855 28.87710 41.10412
\[\begin{aligned} Z_i &\sim \Binom(N_i, \theta_i) \\ \theta_i &= \logit(\beta_{p_i} + u_i) \\ u_i &\sim \Normal(0, \sigma^2) \end{aligned}\]
where for the \(i^\text{th}\) player,
## Player PriPos Hits AtBats scaled_height scaled_weight
## 20 Albert Pujols 1st Base 173 607 0.6893737 1.395153
How would we use the posterior to summarize:
pp_check
Question: does our model fit the data?
Possible answer: gee, I dunno, let’s simulate from it and see?
Fit a model.
Draw a set of parameters from the posterior distribution.
With these, simulate a new data set.
Do this a few times, and compare the results to the original dataset.
brms
lets you do this with the pp_check(brms_fit, type='xyz')
method
See the docs for options.
Let’s first go back to the pumpkin data from Week 5:
lme4
:Then, we fit a mixed model with lme4
:
lme_pumpkins <- lmer( weight ~ water * fertilizer + (1|plot:water:fertilizer), data=pumpkins)
summary(lme_pumpkins)
## Linear mixed model fit by REML ['lmerMod']
## Formula: weight ~ water * fertilizer + (1 | plot:water:fertilizer)
## Data: pumpkins
##
## REML criterion at convergence: 170.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.64930 -0.54801 0.03808 0.70427 1.69905
##
## Random effects:
## Groups Name Variance Std.Dev.
## plot:water:fertilizer (Intercept) 0.01064 0.1031
## Residual 0.21505 0.4637
## Number of obs: 120, groups: plot:water:fertilizer, 24
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.0019 0.1158 25.920
## waterwater 6.0103 0.1638 36.696
## fertilizerlow -2.0347 0.1638 -12.423
## fertilizermedium -1.0567 0.1638 -6.452
## waterwater:fertilizerlow -2.9360 0.2316 -12.676
## waterwater:fertilizermedium -2.7175 0.2316 -11.732
##
## Correlation of Fixed Effects:
## (Intr) wtrwtr frtlzrl frtlzrm wtrwtr:frtlzrl
## waterwater -0.707
## fertilizrlw -0.707 0.500
## fertilzrmdm -0.707 0.500 0.500
## wtrwtr:frtlzrl 0.500 -0.707 -0.707 -0.354
## wtrwtr:frtlzrm 0.500 -0.707 -0.354 -0.707 0.500
brms
:Here’s the “same thing” with brms
:
## Compiling Stan program...
## Start sampling
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: weight ~ water * fertilizer + (1 | plot:water:fertilizer)
## Data: pumpkins (Number of observations: 120)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Group-Level Effects:
## ~plot:water:fertilizer (Number of levels: 24)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.11 0.07 0.01 0.28 1.01 824 1490
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.00 0.13 2.75 3.24 1.00 1462 1780
## waterwater 6.02 0.18 5.66 6.36 1.00 1368 2067
## fertilizerlow -2.03 0.17 -2.37 -1.69 1.00 1474 2420
## fertilizermedium -1.05 0.17 -1.39 -0.71 1.00 1516 1806
## waterwater:fertilizerlow -2.94 0.24 -3.43 -2.46 1.00 1499 2099
## waterwater:fertilizermedium -2.73 0.25 -3.22 -2.24 1.00 1541 2401
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.47 0.03 0.41 0.54 1.00 3376 2784
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: weight ~ water * fertilizer + (1 | plot:water:fertilizer)
## Data: pumpkins (Number of observations: 120)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Group-Level Effects:
## ~plot:water:fertilizer (Number of levels: 24)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.11 0.07 0.01 0.28 1.01 824 1490
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.00 0.13 2.75 3.24 1.00 1462 1780
## waterwater 6.02 0.18 5.66 6.36 1.00 1368 2067
## fertilizerlow -2.03 0.17 -2.37 -1.69 1.00 1474 2420
## fertilizermedium -1.05 0.17 -1.39 -0.71 1.00 1516 1806
## waterwater:fertilizerlow -2.94 0.24 -3.43 -2.46 1.00 1499 2099
## waterwater:fertilizermedium -2.73 0.25 -3.22 -2.24 1.00 1541 2401
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.47 0.03 0.41 0.54 1.00 3376 2784
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
## Linear mixed model fit by REML ['lmerMod']
## Formula: weight ~ water * fertilizer + (1 | plot:water:fertilizer)
## Data: pumpkins
##
## REML criterion at convergence: 170.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.64930 -0.54801 0.03808 0.70427 1.69905
##
## Random effects:
## Groups Name Variance Std.Dev.
## plot:water:fertilizer (Intercept) 0.01064 0.1031
## Residual 0.21505 0.4637
## Number of obs: 120, groups: plot:water:fertilizer, 24
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.0019 0.1158 25.920
## waterwater 6.0103 0.1638 36.696
## fertilizerlow -2.0347 0.1638 -12.423
## fertilizermedium -1.0567 0.1638 -6.452
## waterwater:fertilizerlow -2.9360 0.2316 -12.676
## waterwater:fertilizermedium -2.7175 0.2316 -11.732
##
## Correlation of Fixed Effects:
## (Intr) wtrwtr frtlzrl frtlzrm wtrwtr:frtlzrl
## waterwater -0.707
## fertilizrlw -0.707 0.500
## fertilzrmdm -0.707 0.500 0.500
## wtrwtr:frtlzrl 0.500 -0.707 -0.707 -0.354
## wtrwtr:frtlzrm 0.500 -0.707 -0.354 -0.707 0.500
Try out:
launch_shinystan(brms_pumpkins)
conditional_effects(brms_pumpkins)
pp_check(brms_pumpkins, type='scatter')