- Experimental Design
- Power analyses
- Multi-factor ANOVA
- Nested ANOVA
- Factorial ANOVA
- Analysis of CoVariance (ANCOVA)
11/06/2018 and 11/08/2018
Survival of climbers of Mount Everest is higher for individuals taking supplemental oxygen than those who don’t.
Why?
The goal of experimental design is to eliminate bias and to reduce sampling error when estimating and testing effects of one variable on another.
\[ Power \propto \frac{(ES)(\alpha)(\sqrt n)}{\sigma}\]
Power is proportional to the combination of these parameters
perc <- read.table('perchlorate_data.tsv', header=T, sep='\t') x <- perc$Strain y <- log10(perc$T4_Hormone_Level) MyANOVA <- aov(y ~ x) summary (MyANOVA) boxplot(y ~ x)
Based on your results, calculate the power for your ANOVA.
pwr.t2n.test(n1=xxx, n2=xxx, d=xxx, sig.level=.05, power=NULL)
Check out the functions in the ‘pwr’ library (Unnecessary in this case, but could use ANOVA version):
pwr.anova.test(k=2, n=190, f=0.25, sig.level=.05, power=NULL)
Each pair of dots represents the two measurements
andrew_data <- read.table('andrew.tsv', header=T, sep=‘\t') head(andrew_data)
andrew_data$TREAT2 <- factor(c(rep(“low”,40),rep(“high”,40))
andrew_data$PATCH <- factor(andrew_data$PATCH)
andrew.agg <- with(andrew_data, aggregate(data.frame(ALGAE), by = list(TREAT2=TREAT2, PATCH=PATCH), mean) library(nlme) andrew.agg <- gsummary(andrew_data, groups=andrew_data$PATCH) boxplot(ALGAE ~ TREAT2, andrew.agg)
nested.aov <- aov(ALGAE ~ TREAT2 + Error(PATCH), data=andrew_data) summary(nested.aov)
library(nlme) VarCorr(lme(ALGAE ~ 1, random = ~1 | TREAT2/PATCH, andrew_data))
rnadata <- read.table('RNAseq.tsv', header=T, sep='') head(rnadata)
gene <- rnadata$Gene80 microbiota <- rnadata$Microbiota genotype <- rnadata$Genotype boxplot(gene ~ microbiota) boxplot(gene ~ genotype) boxplot(gene ~ microbiota*genotype)
rna_aov <- aov(gene ~ microbiota + genotype + microbiota:genotype) rna_aov <- aov(gene ~ microbiota*genotype)
plot(rna_aov) summary(rna_aov) anova(rna_aov)
IntPlot_data
file and IntPlot_Example.R
continuous response and two main effect variables
rnadata <- read.table('RNAseq.tsv', header=T, sep='') gene <- rnadata$Gene80 microbiota <- rnadata$Microbiota genotype <- rnadata$Genotype
make new “pseudo factor,” combining genotype and microbiota
gxm <- interaction(genotype,microbiota) levels(gxm) boxplot(gene ~ gxm)
specify the following 2 contrasts
contrasts(gxm) <- cbind(c(2, -1, 0, -1), c(-1, -1, 3, -1))
Fit the factorial linear model
rna_aov <- aov(gene ~ gxm)
Examine the ANOVA table, using supplied contrasts. Figure out the appropriate titles to give them.
summary(rna_aov, split = list(gxm = list('xxx'=1,'xxx'=2)))
What does the contrast summary tell you about the nature of the interaction?
nlme
& lme4
packages in R
.rnadata <- read.table('RNAseq.tsv', header=T, sep='') head(rnadata)
variables excluding first 5 and last 5 observations
gene <- rnadata$Gene80[6:75] microbiota <- rnadata$Microbiota[6:75] genotype <- rnadata$Genotype[6:75] boxplot(gene ~ microbiota) boxplot(gene ~ genotype) boxplot(gene ~ microbiota*genotype)
Estimate the variance components using Restricted Maximum Likelihood (REML)
library(lme4) lmer(gene ~ 1 + (1 | microbiota) + (1 | genotype) + (1 | microbiota:genotype))
Based on the REML sd estimates, what are the relative contributions of the factors to total variance in gene expression?
longevity_data <- read.table('longevity.csv', header=T, sep=',') head(longevity_data)
Variables
long <- longevity_data$LONGEVITY treat <- longevity_data$TREATMENT thorax <- longevity_data$THORAX
boxplot(long ~ treat) plot(long ~ thorax)
plot(aov(long ~ thorax + treat ), which = 1)
plot(aov(log10(long) ~ thorax + treat ), which = 1)
library(lattice) print(xyplot(log10(long) ~ thorax | treat, type = c("r", "p")))
anova(aov(log10(long) ~ thorax*treat))
anova(aov(thorax ~ treat))