Chapter 3 Course Schedule
3.1 Weeks 1-2
- Data organization and management
- best practices, reproducibility, etc.
- Basic programming fundamentals for data curation
- The Unix environment and fundamental commands
- Formatting and manipulating tabular text files from the terminal
- Introduction to R and RStudio
- Installation/Updates
- R object types and assignment
- Practice with R objects
- vectors, matrices, data frames, etc.
- Applying core programming fundamentals in R
- vectorized operations
- replicate, apply family, ifelse, for loops, etc.
3.2 Week 3
- Plotting/visualizing data as a means of exploration
- Different plot types
- Scale, transformations, etc.
- Fundamentals of plotting in base R
- par
- using palettes, points, sizes, etc. to convey information
- axes and labels
- R Markdown
3.3 Week 4
- Population parameters, samples, and sampling distributions
- Central Limit Theorem and the normal distribution
- Mean and standard deviation
- Probability and probability distributions
- Calculating summary statistics
- Other common summary statistics (quantiles, etc.)
3.4 Week 5
- Parameter estimation
- Simulating data sets with known parameters
- Revisit probability distributions
- Uncertainty in estimation
- Parametric and nonparametric approaches to uncertainty
3.5 Week 6
- Experimental design
- lexicon
- considering sources of variance
- types of variables (categorical, ordinal, ratio)
- confounding variables
- Frequentist hypothesis testing
- error types
- p-values
- degrees of freedom
- statistical power
- multiple testing problem
3.6 Week 7
- Comparing means between groups
- Student’s t-test
- Bootstrapping and randomization to compare means
3.7 Week 8
- Relationships between quantitative variables
- correlation and covariance
- Simple linear regression
- residuals and least squares
- fitting linear regression models