Peter Ralph
Advanced Biological Statistics
Data from: Donihue, C.M., Herrel, A., Fabre, AC. et al. Hurricane-induced selection on the morphology of an island lizard. Nature 560, 88–91 (2018).
Anolis scriptus morphology and performance from before and after hurricanes
Lizard morphology data collected from two visits to Pine Cay and Water Cay in
the Turks and Caicos Islands. Anolis scriptus lizards were surveyed on both
islands four days before hurricane Irma and again, six weeks later, after the
islands had been hit by Hurricanes Irma and Maria. We measured morphology and
performance and found significant differences in the "before" and "after"
hurricane lizard population morphology. All linear measurements are in MM,
surface area measurements are in MM2 and force in N. Counts correspond to the
number of lamellar scales on the forelimb and hind limb toepads.
Here’s the data: ../Datasets/Hurricane_lizards/. First we’ll read it in, reorder the levels of the Hurricane factor (so that “before” comes before “after”), change units on SVL to centimeters so it’s on the same scale as the other variables (useful below), and drop unused variables.
lizards <- read.csv(
"../Datasets/Hurricane_lizards/lizards.csv",
header=TRUE, stringsAsFactors=TRUE)
lizards$Hurricane <- factor(lizards$Hurricane, levels=c("Before", "After"))
lizards$SVL_cm <- lizards$SVL / 10
varnames <- c("SVL_cm", "Femur", "Tibia", "Metatarsal", "LongestToe", "Humerus",
"Radius", "Metacarpal", "LongestFinger", "FingerCount", "ToeCount",
"MeanFingerArea", "MeanToeArea")
lizards <- lizards[, c(c("Hurricane", "Origin", "Sex"), varnames)]
head(lizards)
## Hurricane Origin Sex SVL_cm Femur Tibia Metatarsal LongestToe Humerus Radius Metacarpal LongestFinger FingerCount ToeCount MeanFingerArea MeanToeArea
## 1 After Pine Cay Male 4.869 10.39 11.87 7.52 7.43 8.66 7.99 2.22 3.19 10 12 1.3327 2.433
## 2 After Pine Cay Female 4.031 8.66 9.79 6.18 6.20 8.01 6.51 2.38 3.55 10 13 0.9613 1.518
## 3 After Pine Cay Male 5.830 12.87 14.76 9.45 9.58 11.72 9.54 3.54 5.09 14 15 2.6313 4.098
## 4 After Pine Cay Female 4.315 8.55 10.29 6.60 6.26 7.43 6.60 2.79 3.55 11 12 1.1777 1.879
## 5 After Pine Cay Female 4.551 10.26 11.02 6.89 7.02 7.71 7.25 2.52 3.37 11 13 1.3843 2.530
## 6 After Pine Cay Female 4.697 10.02 10.78 6.85 7.18 8.45 7.15 2.39 3.26 12 14 1.4260 2.036
Here’s the sample sizes:
## , , Hurricane = Before
##
## Origin
## Sex Pine Cay Water Cay
## Female 15 18
## Male 18 20
##
## , , Hurricane = After
##
## Origin
## Sex Pine Cay Water Cay
## Female 17 22
## Male 29 25
## Hurricane Origin Sex SVL_cm Femur Tibia Metatarsal LongestToe Humerus Radius Metacarpal LongestFinger FingerCount ToeCount MeanFingerArea MeanToeArea
## 1 After Pine Cay Male 4.869 10.39 11.87 7.52 7.43 8.66 7.99 2.22 3.19 10 12 1.3327 2.433
## 2 After Pine Cay Female 4.031 8.66 9.79 6.18 6.20 8.01 6.51 2.38 3.55 10 13 0.9613 1.518
## 3 After Pine Cay Male 5.830 12.87 14.76 9.45 9.58 11.72 9.54 3.54 5.09 14 15 2.6313 4.098
## 4 After Pine Cay Female 4.315 8.55 10.29 6.60 6.26 7.43 6.60 2.79 3.55 11 12 1.1777 1.879
## 5 After Pine Cay Female 4.551 10.26 11.02 6.89 7.02 7.71 7.25 2.52 3.37 11 13 1.3843 2.530
## 6 After Pine Cay Female 4.697 10.02 10.78 6.85 7.18 8.45 7.15 2.39 3.26 12 14 1.4260 2.036
Care, or at least think, about the data.
Look at the data.
Query the data.
Check the results.
Communicate.
Which variables do we expect? Anything that lowers the cetner of gravity might be associated. Also, big feet might be important.
Look at the distributions, with the pre- and post-hurricane histograms overlaid. Or, boxplots.
Look at measurements as a ratio of body length, since bigger bodies = bigger sails. (eg make histograms of this)
With boxplots, we might do t-tests to compare before/after.
Exercise: Write a possible conclusion sentence or two (imagining what might happen). Make sure to communicate (a) the size of the effect we’re looking for, in real-world terms; (b) the strength of the statistical evidence; and (c) context (e.g., how’s the size of the hurricane-induced change compare to other differences).
We found that the typical ratio of hand size to body size was 10% larger after than before (t-test, p-value 0.01). This is roughly 2 standard deviations of the ratios observed within each sex; ratios do not seem to differ between islands.
The body radius decreased by 20% after the hurricane compared to before.
Metatarsal length increased by 5% after the hurricane (possibly due to incrased grip).
We see that lizards with the longest fingers and toes (5% longer on average) survived better on average (t-test of p=0.01); they seem to have been able to hold on better during the hurricane.
lizards_long <- pivot_longer(lizards, col=all_of(varnames), names_to="measurement")
ggplot(lizards_long) + geom_histogram(aes(x=value)) + facet_wrap(~measurement)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
There is not a huge, obvious shift.
Let’s do some t-tests to see what happens:
results <- expand.grid(
variable=varnames,
sex=levels(lizards$Sex),
stringsAsFactors=FALSE
)
results$p <- NA
results$lower_CI <- NA
results$upper_CI <- NA
for (j in 1:nrow(results)) {
v <- results$variable[j]
s <- results$sex[j]
x <- subset(lizards, Sex==s)
t <- t.test(x[[v]][x$Hurricane == "Before"],
x[[v]][x$Hurricane == "After"]
)
results$p[j] <- t$p.value
results$lower_CI[j] <- t$conf.int[1]
results$upper_CI[j] <- t$conf.int[2]
}
This is preliminary, but it seems that maybe males need shorter (?) legs, and females need larger (?) fingers.
## variable sex p lower_CI upper_CI
## 15 Femur Male 3.030e-07 0.70632 1.49627
## 18 LongestToe Male 5.133e-06 0.41631 0.98772
## 12 MeanFingerArea Female 1.204e-05 -0.25470 -0.10373
## 16 Tibia Male 6.230e-05 0.36557 1.02030
## 13 MeanToeArea Female 4.725e-04 -0.36000 -0.10662
## 10 FingerCount Female 1.172e-03 -1.04091 -0.26911
## 17 Metatarsal Male 1.410e-03 0.14257 0.57219
## 14 SVL_cm Male 1.917e-03 0.09161 0.39137
## 2 Femur Female 1.054e-02 0.10359 0.75492
## 20 Radius Male 1.402e-02 0.06389 0.55246
## 6 Humerus Female 3.436e-02 -0.47849 -0.01885
## 22 LongestFinger Male 1.227e-01 -0.03672 0.30369
## 8 Metacarpal Female 1.354e-01 -0.21897 0.03025
## 21 Metacarpal Male 2.715e-01 -0.05845 0.20522
## 1 SVL_cm Female 4.324e-01 -0.13187 0.05706
## 11 ToeCount Female 4.970e-01 -0.65824 0.32257
## 5 LongestToe Female 4.980e-01 -0.13888 0.28289
## 19 Humerus Male 5.556e-01 -0.25189 0.46536
## 25 MeanFingerArea Male 6.349e-01 -0.18544 0.11373
## 23 FingerCount Male 6.674e-01 -0.44854 0.28866
## 3 Tibia Female 6.923e-01 -0.33560 0.22413
## 4 Metatarsal Female 7.478e-01 -0.22756 0.16415
## 26 MeanToeArea Male 8.006e-01 -0.21041 0.27190
## 24 ToeCount Male 8.869e-01 -0.35325 0.40787
## 9 LongestFinger Female 9.214e-01 -0.15176 0.13740
## 7 Radius Female 9.294e-01 -0.17156 0.18756
Let’s look at the Femur ~ SVL plot:
simple_lm <- lm(Femur ~ SVL_cm + Hurricane, data=lizards)
plot(Femur ~ SVL_cm, data=lizards, col=ifelse(Hurricane=="Before", "black", "red"), pch=20)
coefs <- coef(simple_lm)
abline(coefs["(Intercept)"], coefs["SVL_cm"])
abline(coefs["(Intercept)"] + coefs["HurricaneAfter"], coefs["SVL_cm"], col='red', pch=20)
legend("bottomright", lty=1, col=c("black", "red"), legend=c("Before", "After"))
Femur
and MeanFingerArea
,Femur, MeanFingerArea ~ SVL_cm * (Hurricane + Origin)
Later: add I(SVL_cm^2)
, consider other variables.
Goal:
Remaining female lizards of average size on Pine Cay had an average finger pad area that was X% larger (95% CI of L-U%) than before the hurricane. This difference of X square centimeters represents roughly Y standard deviations of finger pad area. Similarly, males of average length …
(same for femur length)
layout(1:2)
hist( resid(lm(Femur ~ SVL_cm, data=lizards)), breaks=20 )
hist( resid(lm(MeanFingerArea ~ SVL_cm, data=lizards)), breaks=20 )
Note: there are some outliers, maybe? We can check that later, and if so, change the family to “student-t”, maybe.
The slope of these lines is about 4 for Femur and less than that for finger area, and the difference between them is about 0.5.
## Family: MV(gaussian, gaussian)
## Links: mu = identity; sigma = identity
## mu = identity; sigma = identity
## Formula: Femur ~ SVL_cm * (Hurricane + Origin) + I(SVL_cm^2)
## MeanFingerArea ~ SVL_cm * (Hurricane + Origin) + I(SVL_cm^2)
## Data: lizards (Number of observations: 163)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Femur_Intercept -2.30 4.32 -10.85 6.07 1.00 1945 2475
## MeanFingerArea_Intercept 0.12 1.18 -2.22 2.41 1.00 1863 2330
## Femur_SVL_cm 3.24 1.71 -0.07 6.61 1.00 2009 2665
## Femur_HurricaneAfter -1.15 0.78 -2.73 0.38 1.00 2306 2283
## Femur_OriginWaterCay -2.22 0.84 -3.87 -0.58 1.00 2088 2375
## Femur_ISVL_cmE2 -0.10 0.17 -0.43 0.23 1.00 2093 2638
## Femur_SVL_cm:HurricaneAfter 0.13 0.16 -0.18 0.44 1.00 2302 2252
## Femur_SVL_cm:OriginWaterCay 0.43 0.17 0.10 0.76 1.00 2014 2447
## MeanFingerArea_SVL_cm -0.21 0.47 -1.11 0.71 1.00 1929 2435
## MeanFingerArea_HurricaneAfter -0.30 0.22 -0.73 0.13 1.00 2469 2569
## MeanFingerArea_OriginWaterCay -0.42 0.23 -0.87 0.02 1.00 2250 2576
## MeanFingerArea_ISVL_cmE2 0.09 0.05 0.00 0.18 1.00 2002 2436
## MeanFingerArea_SVL_cm:HurricaneAfter 0.10 0.04 0.01 0.19 1.00 2459 2439
## MeanFingerArea_SVL_cm:OriginWaterCay 0.10 0.05 0.01 0.19 1.00 2238 2497
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_Femur 0.62 0.03 0.55 0.69 1.00 5309 3583
## sigma_MeanFingerArea 0.17 0.01 0.15 0.19 1.00 4649 3225
##
## Residual Correlations:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## rescor(Femur,MeanFingerArea) 0.22 0.08 0.07 0.37 1.00 4769 2834
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
fuzzy caterpillars good!
Hm - negative posterior correlation between slope and intercept
We’d like to know the propoportional increase in finger pad area (to say “average finger pad area was X% larger after the hurricane”), and to get a credible interval for this (i.e., for the “X% larger” part). To do this, we’ll use posterior_epred()
, since this gets posterior distributions of expected values (given the predictors).
mean_female_length <- with(subset(lizards, Sex=="Female"),
mean(SVL_cm))
newdata <- data.frame(
Origin="Pine Cay",
SVL_cm=mean_female_length,
Hurricane=levels(lizards$Hurricane)
)
pp <- posterior_epred(lfit, newdata=newdata, resp="MeanFingerArea")
# posterior distribution of the relative change
# (ie the change divided by the mean beforehand)
pp_ratio <- (pp[,2] - pp[,1])/pp[,1]
We also want to know how the mean increase in finger pad area compares to between-individual variation, i.e., the standard deviation of finger pad area for lizards of the same length and in the same situation. We get this using posterior_predict( )
, since this obtains the posterior distribution of the response, i.e., of actual toe pad areas (not their mean values).
Remaining female lizards of average size on Pine Cay had an average finger pad area that was 15% larger (95% CI of 7 - 23%) than before the hurricane. This difference of 0.1414 square centimeters represents roughly three-quarters of a standard deviation of finger pad area. Similarly, males of average length …
No outliers in the residuals: we’re probably OK with the Normal family.
## Using 10 posterior draws for ppc type 'scatter' by default.
## Using 10 posterior draws for ppc type 'error_hist' by default.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.