Peter Ralph
21 January 2020 – Advanced Biological Statistics
HairEyeColor package:datasets R Documentation
Hair and Eye Color of Statistics Students
Description:
Distribution of hair and eye color and sex in 592 statistics
students.
Usage:
HairEyeColor
Format:
A 3-dimensional array resulting from cross-tabulating 592
observations on 3 variables. The variables and their levels are
as follows:
No Name Levels
1 Hair Black, Brown, Red, Blond
2 Eye Brown, Blue, Hazel, Green
3 Sex Male, Female
Details:
The Hair x Eye table comes from a survey of students at the
University of Delaware reported by Snee (1974). The split by
‘Sex’ was added by Friendly (1992a) for didactic purposes.
This data set is useful for illustrating various techniques for
the analysis of contingency tables, such as the standard
chi-squared test or, more generally, log-linear modelling, and
graphical methods such as mosaic plots, sieve diagrams or
association plots.
Source:
<URL:
http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/haireye.sas>
Snee (1974) gives the two-way table aggregated over ‘Sex’. The
‘Sex’ split of the ‘Brown hair, Brown eye’ cell was changed to
agree with that used by Friendly (2000).
References:
Snee, R. D. (1974). Graphical display of two-way contingency
tables. _The American Statistician_, *28*, 9-12. doi:
10.2307/2683520 (URL: http://doi.org/10.2307/2683520).
## , , Sex = Male
##
## Eye
## Hair Brown Blue Hazel Green
## Black 32 11 10 3
## Brown 53 50 25 15
## Red 10 10 7 7
## Blond 3 30 5 8
##
## , , Sex = Female
##
## Eye
## Hair Brown Blue Hazel Green
## Black 36 9 5 2
## Brown 66 34 29 14
## Red 16 7 7 7
## Blond 4 64 5 8
(haireye <- data.frame(hair = dimnames(HairEyeColor)[[1]][slice.index(HairEyeColor, 1)],
eye = dimnames(HairEyeColor)[[2]][slice.index(HairEyeColor, 2)],
sex = dimnames(HairEyeColor)[[3]][slice.index(HairEyeColor, 3)],
number = as.vector(HairEyeColor)))
## hair eye sex number
## 1 Black Brown Male 32
## 2 Brown Brown Male 53
## 3 Red Brown Male 10
## 4 Blond Brown Male 3
## 5 Black Blue Male 11
## 6 Brown Blue Male 50
## 7 Red Blue Male 10
## 8 Blond Blue Male 30
## 9 Black Hazel Male 10
## 10 Brown Hazel Male 25
## 11 Red Hazel Male 7
## 12 Blond Hazel Male 5
## 13 Black Green Male 3
## 14 Brown Green Male 15
## 15 Red Green Male 7
## 16 Blond Green Male 8
## 17 Black Brown Female 36
## 18 Brown Brown Female 66
## 19 Red Brown Female 16
## 20 Blond Brown Female 4
## 21 Black Blue Female 9
## 22 Brown Blue Female 34
## 23 Red Blue Female 7
## 24 Blond Blue Female 64
## 25 Black Hazel Female 5
## 26 Brown Hazel Female 29
## 27 Red Hazel Female 7
## 28 Blond Hazel Female 5
## 29 Black Green Female 2
## 30 Brown Green Female 14
## 31 Red Green Female 7
## 32 Blond Green Female 8
Questions:
If hair and eye color are independent, then probabilities of combinations are multiplicative:
\[\begin{aligned} &\P\{\text{black hair and blue eyes}\} \\ &\qquad = \P\{\text{black hair}\} \times \P\{\text{blue eyes}\given\text{black hair}\} \\ \end{aligned}\]
which if independent is \[\begin{aligned} &\hphantom{\P\{\text{black hair and blue eyes}\}} \\ &\qquad = \P\{\text{black hair}\} \times \P\{\text{blue eyes}\} \end{aligned}\]
A model of independence will have a multiplicative form: \[ p_{ab} = p_a \times p_b . \]
Set \(\lambda = \log(p)\), so that \[ \lambda_{ab} = \lambda_a + \lambda_b . \]
Let’s start by looking at just hair and eye color, summing over sex:
## Eye
## Hair Brown Blue Hazel Green
## Black 68 20 15 5
## Brown 119 84 54 29
## Red 26 17 14 14
## Blond 7 94 10 16
In this dataset…
Let \[\begin{aligned} n_{ij} &= (\text{observed}_{ij}) \\ &=(\text{observed number with hair $i$ and eye $j$}) \\ E_{ij} &= (\text{expected}_{ij}) \\ &=(\text{total number}) \times(\text{proportion with hair $i$}) \\ &\qquad \times (\text{proportion with eye $j$}) \\ &= n \times \left(\frac{n_{i\cdot}}{n}\right) \times \left(\frac{n_{\cdot j}}{n}\right) . \end{aligned}\]
Here \(n_{i \cdot}\) and \(n_{\cdot j}\) are the row and column sums.
We want to quantify how different the observed and expected are, inversely weighted by their noisiness: \[\begin{aligned} \sum_{ij} \left( \frac{ (\text{observed})_{ij} - (\text{expected})_{ij} }{ \SE[\text{observed}_{ij}] } \right)^2 \end{aligned}\]
So, what is \(\SE[\text{observed}_{ij}]\)?
Under the model of independence, \[\begin{aligned} n_{ij} &\sim \Binom(n, p_i q_j) , \\ \text{where}\quad p_i &= (\text{prob of hair color $i$}) \\ q_j &= (\text{prob of eye color $j$}) . \end{aligned}\]
So, \[\begin{aligned} \sd[n_{ij}] = \sqrt{ n p_i q_j (1 - p_i q_j) } , \end{aligned}\]
… and so how about this \[\begin{aligned} \SE[n_{ij}] &\approx \sqrt{ n p_i q_j } \\ &= \sqrt{(\text{expected}_{ij})} \qquad \ldots? \end{aligned}\]
\[\begin{aligned} \chi^2 &= \sum_{ij} \frac{ \left((\text{observed})_{ij} - (\text{expected})_{ij} \right)^2 }{ (\text{expected})_{ij} } . \end{aligned}\]
i.e., “observed minus expected squared, divided by expected”.
This gives us a number. What does it mean?
Suppose that \(Z_1, \ldots, Z_k\) are independent \(\Normal(0, 1)\). Then
\[ \chi^2 = Z_1^2 + \cdots + Z_k^2 \]
has the chi squared distribution with \(k\) degrees of freedom.
Notes:
\(\chi^2\) is a unitless nonnegative numbers.
\(\E[\chi^2] = k\).
If instead \(Z_i \sim \Normal(\mu_i, \sigma_i)\), then \(\chi^2 = \sum_{i=1}^k (Z_i - \mu_i)^2 / \sigma_i\).
\(\chi^2 \sim \Gamma(k/2, 1/2)\).
If the number of observations in a contingency table with \(r\) rows and \(c\) columns is large, then the chi-squared statistic has, approximately, the chi-squared distribution with \((r-1)\times(c-1)\) degrees of freedom under the hypothesis of independence of rows and columns.
(Asymptotically, i.e., as the number of observations goes to infinity.)
##
## Pearson's Chi-squared test
##
## data: haireye_2d
## X-squared = 138, df = 9, p-value <2e-16
Um, ok?
Let’s actually look at “observed minus expected”:
haireye_exp <- 0 * haireye_2d
haireye_exp[] <- ( rowSums(haireye_2d)[row(haireye_exp)]
* colSums(haireye_2d)[col(haireye_exp)]
/ sum(haireye_2d) )
## Eye
## Hair Brown Blue Hazel Green
## Black 40.1 39.2 17.0 11.7
## Brown 106.3 103.9 44.9 30.9
## Red 26.4 25.8 11.2 7.7
## Blond 47.2 46.1 20.0 13.7
Observed minus expected:
## Eye
## Hair Brown Blue Hazel Green
## Black 27.86 -19.22 -1.97 -6.68
## Brown 12.72 -19.87 9.07 -1.92
## Red -0.39 -8.79 2.85 6.32
## Blond -40.20 47.88 -9.95 2.27
Normalized by \(\sqrt{\text{expected}}\):
## Eye
## Hair Brown Blue Hazel Green
## Black 4.398 -3.069 -0.477 -1.954
## Brown 1.233 -1.949 1.353 -0.345
## Red -0.075 -1.730 0.852 2.283
## Blond -5.851 7.050 -2.228 0.613
Compute the chi-squared statistic with chisq.test( )
:
## Warning in chisq.test(HairEyeColor[, , "Female"]): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: HairEyeColor[, , "Female"]
## X-squared = 107, df = 9, p-value <2e-16
(definition here)
A permutation test estimates the “probability … under the model” part.
We still need the other stuff.
First, “individualize” the data:
long_haireye <- haireye[rep(1:nrow(haireye), haireye$number),
c("hair", "eye", "sex")]
stopifnot(nrow(long_haireye) == sum(haireye$number))
long_haireye
## hair eye sex
## 1 Black Brown Male
## 1.1 Black Brown Male
## 1.2 Black Brown Male
## 1.3 Black Brown Male
## 1.4 Black Brown Male
## 1.5 Black Brown Male
## 1.6 Black Brown Male
## 1.7 Black Brown Male
## 1.8 Black Brown Male
## 1.9 Black Brown Male
## 1.10 Black Brown Male
## 1.11 Black Brown Male
## 1.12 Black Brown Male
## 1.13 Black Brown Male
## 1.14 Black Brown Male
## 1.15 Black Brown Male
## 1.16 Black Brown Male
## 1.17 Black Brown Male
## 1.18 Black Brown Male
## 1.19 Black Brown Male
## 1.20 Black Brown Male
## 1.21 Black Brown Male
## 1.22 Black Brown Male
## 1.23 Black Brown Male
## 1.24 Black Brown Male
## 1.25 Black Brown Male
## 1.26 Black Brown Male
## 1.27 Black Brown Male
## 1.28 Black Brown Male
## 1.29 Black Brown Male
## 1.30 Black Brown Male
## 1.31 Black Brown Male
## 2 Brown Brown Male
## 2.1 Brown Brown Male
## 2.2 Brown Brown Male
## 2.3 Brown Brown Male
## 2.4 Brown Brown Male
## 2.5 Brown Brown Male
## 2.6 Brown Brown Male
## 2.7 Brown Brown Male
## 2.8 Brown Brown Male
## 2.9 Brown Brown Male
## 2.10 Brown Brown Male
## 2.11 Brown Brown Male
## 2.12 Brown Brown Male
## 2.13 Brown Brown Male
## 2.14 Brown Brown Male
## 2.15 Brown Brown Male
## 2.16 Brown Brown Male
## 2.17 Brown Brown Male
## 2.18 Brown Brown Male
## 2.19 Brown Brown Male
## 2.20 Brown Brown Male
## 2.21 Brown Brown Male
## 2.22 Brown Brown Male
## 2.23 Brown Brown Male
## 2.24 Brown Brown Male
## 2.25 Brown Brown Male
## 2.26 Brown Brown Male
## 2.27 Brown Brown Male
## 2.28 Brown Brown Male
## 2.29 Brown Brown Male
## 2.30 Brown Brown Male
## 2.31 Brown Brown Male
## 2.32 Brown Brown Male
## 2.33 Brown Brown Male
## 2.34 Brown Brown Male
## 2.35 Brown Brown Male
## 2.36 Brown Brown Male
## 2.37 Brown Brown Male
## 2.38 Brown Brown Male
## 2.39 Brown Brown Male
## 2.40 Brown Brown Male
## 2.41 Brown Brown Male
## 2.42 Brown Brown Male
## 2.43 Brown Brown Male
## 2.44 Brown Brown Male
## 2.45 Brown Brown Male
## 2.46 Brown Brown Male
## 2.47 Brown Brown Male
## 2.48 Brown Brown Male
## 2.49 Brown Brown Male
## 2.50 Brown Brown Male
## 2.51 Brown Brown Male
## 2.52 Brown Brown Male
## 3 Red Brown Male
## 3.1 Red Brown Male
## 3.2 Red Brown Male
## 3.3 Red Brown Male
## 3.4 Red Brown Male
## 3.5 Red Brown Male
## 3.6 Red Brown Male
## 3.7 Red Brown Male
## 3.8 Red Brown Male
## 3.9 Red Brown Male
## 4 Blond Brown Male
## 4.1 Blond Brown Male
## 4.2 Blond Brown Male
## 5 Black Blue Male
## 5.1 Black Blue Male
## 5.2 Black Blue Male
## 5.3 Black Blue Male
## 5.4 Black Blue Male
## 5.5 Black Blue Male
## 5.6 Black Blue Male
## 5.7 Black Blue Male
## 5.8 Black Blue Male
## 5.9 Black Blue Male
## 5.10 Black Blue Male
## 6 Brown Blue Male
## 6.1 Brown Blue Male
## 6.2 Brown Blue Male
## 6.3 Brown Blue Male
## 6.4 Brown Blue Male
## 6.5 Brown Blue Male
## 6.6 Brown Blue Male
## 6.7 Brown Blue Male
## 6.8 Brown Blue Male
## 6.9 Brown Blue Male
## 6.10 Brown Blue Male
## 6.11 Brown Blue Male
## 6.12 Brown Blue Male
## 6.13 Brown Blue Male
## 6.14 Brown Blue Male
## 6.15 Brown Blue Male
## 6.16 Brown Blue Male
## 6.17 Brown Blue Male
## 6.18 Brown Blue Male
## 6.19 Brown Blue Male
## 6.20 Brown Blue Male
## 6.21 Brown Blue Male
## 6.22 Brown Blue Male
## 6.23 Brown Blue Male
## 6.24 Brown Blue Male
## 6.25 Brown Blue Male
## 6.26 Brown Blue Male
## 6.27 Brown Blue Male
## 6.28 Brown Blue Male
## 6.29 Brown Blue Male
## 6.30 Brown Blue Male
## 6.31 Brown Blue Male
## 6.32 Brown Blue Male
## 6.33 Brown Blue Male
## 6.34 Brown Blue Male
## 6.35 Brown Blue Male
## 6.36 Brown Blue Male
## 6.37 Brown Blue Male
## 6.38 Brown Blue Male
## 6.39 Brown Blue Male
## 6.40 Brown Blue Male
## 6.41 Brown Blue Male
## 6.42 Brown Blue Male
## 6.43 Brown Blue Male
## 6.44 Brown Blue Male
## 6.45 Brown Blue Male
## 6.46 Brown Blue Male
## 6.47 Brown Blue Male
## 6.48 Brown Blue Male
## 6.49 Brown Blue Male
## 7 Red Blue Male
## 7.1 Red Blue Male
## 7.2 Red Blue Male
## 7.3 Red Blue Male
## 7.4 Red Blue Male
## 7.5 Red Blue Male
## 7.6 Red Blue Male
## 7.7 Red Blue Male
## 7.8 Red Blue Male
## 7.9 Red Blue Male
## 8 Blond Blue Male
## 8.1 Blond Blue Male
## 8.2 Blond Blue Male
## 8.3 Blond Blue Male
## 8.4 Blond Blue Male
## 8.5 Blond Blue Male
## 8.6 Blond Blue Male
## 8.7 Blond Blue Male
## 8.8 Blond Blue Male
## 8.9 Blond Blue Male
## 8.10 Blond Blue Male
## 8.11 Blond Blue Male
## 8.12 Blond Blue Male
## 8.13 Blond Blue Male
## 8.14 Blond Blue Male
## 8.15 Blond Blue Male
## 8.16 Blond Blue Male
## 8.17 Blond Blue Male
## 8.18 Blond Blue Male
## 8.19 Blond Blue Male
## 8.20 Blond Blue Male
## 8.21 Blond Blue Male
## 8.22 Blond Blue Male
## 8.23 Blond Blue Male
## 8.24 Blond Blue Male
## 8.25 Blond Blue Male
## 8.26 Blond Blue Male
## 8.27 Blond Blue Male
## 8.28 Blond Blue Male
## 8.29 Blond Blue Male
## 9 Black Hazel Male
## 9.1 Black Hazel Male
## 9.2 Black Hazel Male
## 9.3 Black Hazel Male
## 9.4 Black Hazel Male
## 9.5 Black Hazel Male
## 9.6 Black Hazel Male
## 9.7 Black Hazel Male
## 9.8 Black Hazel Male
## 9.9 Black Hazel Male
## 10 Brown Hazel Male
## 10.1 Brown Hazel Male
## 10.2 Brown Hazel Male
## 10.3 Brown Hazel Male
## 10.4 Brown Hazel Male
## 10.5 Brown Hazel Male
## 10.6 Brown Hazel Male
## 10.7 Brown Hazel Male
## 10.8 Brown Hazel Male
## 10.9 Brown Hazel Male
## 10.10 Brown Hazel Male
## 10.11 Brown Hazel Male
## 10.12 Brown Hazel Male
## 10.13 Brown Hazel Male
## 10.14 Brown Hazel Male
## 10.15 Brown Hazel Male
## 10.16 Brown Hazel Male
## 10.17 Brown Hazel Male
## 10.18 Brown Hazel Male
## 10.19 Brown Hazel Male
## 10.20 Brown Hazel Male
## 10.21 Brown Hazel Male
## 10.22 Brown Hazel Male
## 10.23 Brown Hazel Male
## 10.24 Brown Hazel Male
## 11 Red Hazel Male
## 11.1 Red Hazel Male
## 11.2 Red Hazel Male
## 11.3 Red Hazel Male
## 11.4 Red Hazel Male
## 11.5 Red Hazel Male
## 11.6 Red Hazel Male
## 12 Blond Hazel Male
## 12.1 Blond Hazel Male
## 12.2 Blond Hazel Male
## 12.3 Blond Hazel Male
## 12.4 Blond Hazel Male
## 13 Black Green Male
## 13.1 Black Green Male
## 13.2 Black Green Male
## 14 Brown Green Male
## 14.1 Brown Green Male
## 14.2 Brown Green Male
## 14.3 Brown Green Male
## 14.4 Brown Green Male
## 14.5 Brown Green Male
## 14.6 Brown Green Male
## 14.7 Brown Green Male
## 14.8 Brown Green Male
## 14.9 Brown Green Male
## 14.10 Brown Green Male
## 14.11 Brown Green Male
## 14.12 Brown Green Male
## 14.13 Brown Green Male
## 14.14 Brown Green Male
## 15 Red Green Male
## 15.1 Red Green Male
## 15.2 Red Green Male
## 15.3 Red Green Male
## 15.4 Red Green Male
## 15.5 Red Green Male
## 15.6 Red Green Male
## 16 Blond Green Male
## 16.1 Blond Green Male
## 16.2 Blond Green Male
## 16.3 Blond Green Male
## 16.4 Blond Green Male
## 16.5 Blond Green Male
## 16.6 Blond Green Male
## 16.7 Blond Green Male
## 17 Black Brown Female
## 17.1 Black Brown Female
## 17.2 Black Brown Female
## 17.3 Black Brown Female
## 17.4 Black Brown Female
## 17.5 Black Brown Female
## 17.6 Black Brown Female
## 17.7 Black Brown Female
## 17.8 Black Brown Female
## 17.9 Black Brown Female
## 17.10 Black Brown Female
## 17.11 Black Brown Female
## 17.12 Black Brown Female
## 17.13 Black Brown Female
## 17.14 Black Brown Female
## 17.15 Black Brown Female
## 17.16 Black Brown Female
## 17.17 Black Brown Female
## 17.18 Black Brown Female
## 17.19 Black Brown Female
## 17.20 Black Brown Female
## 17.21 Black Brown Female
## 17.22 Black Brown Female
## 17.23 Black Brown Female
## 17.24 Black Brown Female
## 17.25 Black Brown Female
## 17.26 Black Brown Female
## 17.27 Black Brown Female
## 17.28 Black Brown Female
## 17.29 Black Brown Female
## 17.30 Black Brown Female
## 17.31 Black Brown Female
## 17.32 Black Brown Female
## 17.33 Black Brown Female
## 17.34 Black Brown Female
## 17.35 Black Brown Female
## 18 Brown Brown Female
## 18.1 Brown Brown Female
## 18.2 Brown Brown Female
## 18.3 Brown Brown Female
## 18.4 Brown Brown Female
## 18.5 Brown Brown Female
## 18.6 Brown Brown Female
## 18.7 Brown Brown Female
## 18.8 Brown Brown Female
## 18.9 Brown Brown Female
## 18.10 Brown Brown Female
## 18.11 Brown Brown Female
## 18.12 Brown Brown Female
## 18.13 Brown Brown Female
## 18.14 Brown Brown Female
## 18.15 Brown Brown Female
## 18.16 Brown Brown Female
## 18.17 Brown Brown Female
## 18.18 Brown Brown Female
## 18.19 Brown Brown Female
## 18.20 Brown Brown Female
## 18.21 Brown Brown Female
## 18.22 Brown Brown Female
## 18.23 Brown Brown Female
## 18.24 Brown Brown Female
## 18.25 Brown Brown Female
## 18.26 Brown Brown Female
## 18.27 Brown Brown Female
## 18.28 Brown Brown Female
## 18.29 Brown Brown Female
## 18.30 Brown Brown Female
## 18.31 Brown Brown Female
## 18.32 Brown Brown Female
## 18.33 Brown Brown Female
## 18.34 Brown Brown Female
## 18.35 Brown Brown Female
## 18.36 Brown Brown Female
## 18.37 Brown Brown Female
## 18.38 Brown Brown Female
## 18.39 Brown Brown Female
## 18.40 Brown Brown Female
## 18.41 Brown Brown Female
## 18.42 Brown Brown Female
## 18.43 Brown Brown Female
## 18.44 Brown Brown Female
## 18.45 Brown Brown Female
## 18.46 Brown Brown Female
## 18.47 Brown Brown Female
## 18.48 Brown Brown Female
## 18.49 Brown Brown Female
## 18.50 Brown Brown Female
## 18.51 Brown Brown Female
## 18.52 Brown Brown Female
## 18.53 Brown Brown Female
## 18.54 Brown Brown Female
## 18.55 Brown Brown Female
## 18.56 Brown Brown Female
## 18.57 Brown Brown Female
## 18.58 Brown Brown Female
## 18.59 Brown Brown Female
## 18.60 Brown Brown Female
## 18.61 Brown Brown Female
## 18.62 Brown Brown Female
## 18.63 Brown Brown Female
## 18.64 Brown Brown Female
## 18.65 Brown Brown Female
## 19 Red Brown Female
## 19.1 Red Brown Female
## 19.2 Red Brown Female
## 19.3 Red Brown Female
## 19.4 Red Brown Female
## 19.5 Red Brown Female
## 19.6 Red Brown Female
## 19.7 Red Brown Female
## 19.8 Red Brown Female
## 19.9 Red Brown Female
## 19.10 Red Brown Female
## 19.11 Red Brown Female
## 19.12 Red Brown Female
## 19.13 Red Brown Female
## 19.14 Red Brown Female
## 19.15 Red Brown Female
## 20 Blond Brown Female
## 20.1 Blond Brown Female
## 20.2 Blond Brown Female
## 20.3 Blond Brown Female
## 21 Black Blue Female
## 21.1 Black Blue Female
## 21.2 Black Blue Female
## 21.3 Black Blue Female
## 21.4 Black Blue Female
## 21.5 Black Blue Female
## 21.6 Black Blue Female
## 21.7 Black Blue Female
## 21.8 Black Blue Female
## 22 Brown Blue Female
## 22.1 Brown Blue Female
## 22.2 Brown Blue Female
## 22.3 Brown Blue Female
## 22.4 Brown Blue Female
## 22.5 Brown Blue Female
## 22.6 Brown Blue Female
## 22.7 Brown Blue Female
## 22.8 Brown Blue Female
## 22.9 Brown Blue Female
## 22.10 Brown Blue Female
## 22.11 Brown Blue Female
## 22.12 Brown Blue Female
## 22.13 Brown Blue Female
## 22.14 Brown Blue Female
## 22.15 Brown Blue Female
## 22.16 Brown Blue Female
## 22.17 Brown Blue Female
## 22.18 Brown Blue Female
## 22.19 Brown Blue Female
## 22.20 Brown Blue Female
## 22.21 Brown Blue Female
## 22.22 Brown Blue Female
## 22.23 Brown Blue Female
## 22.24 Brown Blue Female
## 22.25 Brown Blue Female
## 22.26 Brown Blue Female
## 22.27 Brown Blue Female
## 22.28 Brown Blue Female
## 22.29 Brown Blue Female
## 22.30 Brown Blue Female
## 22.31 Brown Blue Female
## 22.32 Brown Blue Female
## 22.33 Brown Blue Female
## 23 Red Blue Female
## 23.1 Red Blue Female
## 23.2 Red Blue Female
## 23.3 Red Blue Female
## 23.4 Red Blue Female
## 23.5 Red Blue Female
## 23.6 Red Blue Female
## 24 Blond Blue Female
## 24.1 Blond Blue Female
## 24.2 Blond Blue Female
## 24.3 Blond Blue Female
## 24.4 Blond Blue Female
## 24.5 Blond Blue Female
## 24.6 Blond Blue Female
## 24.7 Blond Blue Female
## 24.8 Blond Blue Female
## 24.9 Blond Blue Female
## 24.10 Blond Blue Female
## 24.11 Blond Blue Female
## 24.12 Blond Blue Female
## 24.13 Blond Blue Female
## 24.14 Blond Blue Female
## 24.15 Blond Blue Female
## 24.16 Blond Blue Female
## 24.17 Blond Blue Female
## 24.18 Blond Blue Female
## 24.19 Blond Blue Female
## 24.20 Blond Blue Female
## 24.21 Blond Blue Female
## 24.22 Blond Blue Female
## 24.23 Blond Blue Female
## 24.24 Blond Blue Female
## 24.25 Blond Blue Female
## 24.26 Blond Blue Female
## 24.27 Blond Blue Female
## 24.28 Blond Blue Female
## 24.29 Blond Blue Female
## 24.30 Blond Blue Female
## 24.31 Blond Blue Female
## 24.32 Blond Blue Female
## 24.33 Blond Blue Female
## 24.34 Blond Blue Female
## 24.35 Blond Blue Female
## 24.36 Blond Blue Female
## 24.37 Blond Blue Female
## 24.38 Blond Blue Female
## 24.39 Blond Blue Female
## 24.40 Blond Blue Female
## 24.41 Blond Blue Female
## 24.42 Blond Blue Female
## 24.43 Blond Blue Female
## 24.44 Blond Blue Female
## 24.45 Blond Blue Female
## 24.46 Blond Blue Female
## 24.47 Blond Blue Female
## 24.48 Blond Blue Female
## 24.49 Blond Blue Female
## 24.50 Blond Blue Female
## 24.51 Blond Blue Female
## 24.52 Blond Blue Female
## 24.53 Blond Blue Female
## 24.54 Blond Blue Female
## 24.55 Blond Blue Female
## 24.56 Blond Blue Female
## 24.57 Blond Blue Female
## 24.58 Blond Blue Female
## 24.59 Blond Blue Female
## 24.60 Blond Blue Female
## 24.61 Blond Blue Female
## 24.62 Blond Blue Female
## 24.63 Blond Blue Female
## 25 Black Hazel Female
## 25.1 Black Hazel Female
## 25.2 Black Hazel Female
## 25.3 Black Hazel Female
## 25.4 Black Hazel Female
## 26 Brown Hazel Female
## 26.1 Brown Hazel Female
## 26.2 Brown Hazel Female
## 26.3 Brown Hazel Female
## 26.4 Brown Hazel Female
## 26.5 Brown Hazel Female
## 26.6 Brown Hazel Female
## 26.7 Brown Hazel Female
## 26.8 Brown Hazel Female
## 26.9 Brown Hazel Female
## 26.10 Brown Hazel Female
## 26.11 Brown Hazel Female
## 26.12 Brown Hazel Female
## 26.13 Brown Hazel Female
## 26.14 Brown Hazel Female
## 26.15 Brown Hazel Female
## 26.16 Brown Hazel Female
## 26.17 Brown Hazel Female
## 26.18 Brown Hazel Female
## 26.19 Brown Hazel Female
## 26.20 Brown Hazel Female
## 26.21 Brown Hazel Female
## 26.22 Brown Hazel Female
## 26.23 Brown Hazel Female
## 26.24 Brown Hazel Female
## 26.25 Brown Hazel Female
## 26.26 Brown Hazel Female
## 26.27 Brown Hazel Female
## 26.28 Brown Hazel Female
## 27 Red Hazel Female
## 27.1 Red Hazel Female
## 27.2 Red Hazel Female
## 27.3 Red Hazel Female
## 27.4 Red Hazel Female
## 27.5 Red Hazel Female
## 27.6 Red Hazel Female
## 28 Blond Hazel Female
## 28.1 Blond Hazel Female
## 28.2 Blond Hazel Female
## 28.3 Blond Hazel Female
## 28.4 Blond Hazel Female
## 29 Black Green Female
## 29.1 Black Green Female
## 30 Brown Green Female
## 30.1 Brown Green Female
## 30.2 Brown Green Female
## 30.3 Brown Green Female
## 30.4 Brown Green Female
## 30.5 Brown Green Female
## 30.6 Brown Green Female
## 30.7 Brown Green Female
## 30.8 Brown Green Female
## 30.9 Brown Green Female
## 30.10 Brown Green Female
## 30.11 Brown Green Female
## 30.12 Brown Green Female
## 30.13 Brown Green Female
## 31 Red Green Female
## 31.1 Red Green Female
## 31.2 Red Green Female
## 31.3 Red Green Female
## 31.4 Red Green Female
## 31.5 Red Green Female
## 31.6 Red Green Female
## 32 Blond Green Female
## 32.1 Blond Green Female
## 32.2 Blond Green Female
## 32.3 Blond Green Female
## 32.4 Blond Green Female
## 32.5 Blond Green Female
## 32.6 Blond Green Female
## 32.7 Blond Green Female
Compute the chi-squared statistic with chisq.test( )
:
## List of 9
## $ statistic: Named num 107
## ..- attr(*, "names")= chr "X-squared"
## $ parameter: Named int 9
## ..- attr(*, "names")= chr "df"
## $ p.value : num 7.01e-19
## $ method : chr "Pearson's Chi-squared test"
## $ data.name: chr "he_tab"
## $ observed : 'table' int [1:4, 1:4] 9 64 34 7 36 4 66 16 2 8 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ hair: chr [1:4] "Black" "Blond" "Brown" "Red"
## .. ..$ eye : chr [1:4] "Blue" "Brown" "Green" "Hazel"
## $ expected : num [1:4, 1:4] 18.9 29.5 52.1 13.5 20.3 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ hair: chr [1:4] "Black" "Blond" "Brown" "Red"
## .. ..$ eye : chr [1:4] "Blue" "Brown" "Green" "Hazel"
## $ residuals: 'table' num [1:4, 1:4] -2.28 6.35 -2.51 -1.76 3.49 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ hair: chr [1:4] "Black" "Blond" "Brown" "Red"
## .. ..$ eye : chr [1:4] "Blue" "Brown" "Green" "Hazel"
## $ stdres : 'table' num [1:4, 1:4] -3.14 9.25 -4.26 -2.36 4.9 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ hair: chr [1:4] "Black" "Blond" "Brown" "Red"
## .. ..$ eye : chr [1:4] "Blue" "Brown" "Green" "Hazel"
## - attr(*, "class")= chr "htest"
(What did we actually test?)
Use a permutation test to assess whether the relation between hair and eye color differs by sex.
If we have Poisson-many things of two categories: \[\begin{aligned} A &\sim \Poisson(a) \\ B &\sim \Poisson(b) \end{aligned}\] then the total number of things is also Poisson: \[ A + B \sim \Poisson(a + b) \] and each thing chooses its type independently: \[ A \given A + B \sim \Binom\left(A+B, \frac{a}{a+b}\right) . \]
If we fit a model where
\[\begin{aligned} (\text{number in cell $ij$}) &\sim \Poisson(\lambda_{ij}) \end{aligned}\]
then \[\begin{aligned} \frac{\lambda_{ij}}{\sum_{k\ell} \lambda_{k\ell}} &= (\text{proportion in category $ij$}) \\ \frac{\lambda_{ij}}{\sum_{k} \lambda_{kj}} &= (\text{proportion of row $j$ in column $i$}) \\ \frac{\lambda_{ij}}{\lambda_{ij} + \lambda_{k\ell}} &= (\text{proportion of those in $k\ell$ or $ij$ that are $ij$}) \end{aligned}\]
Suppose each gene falls into one of these categories:
cell death
cell motility
ion transport
… and that we have detected a number of upregulated genes in our experiment.
Do these have anything in common?
I.e., are any gene ontologies enriched among the upregulated genes?
## top term num_genes upregulated downregulated
## 1 cell death neuron death 922 39 27
## 2 cell death oxidative stress 1932 100 54
## 3 cell death necrotic 2020 98 57
## 4 cell death apoptosis 972 59 28
## 5 cell motility actin polymerization 57 4 3
## 6 cell motility sperm motility 379 58 31
## 7 cell motility cilia 1096 147 79
## 8 cell motility axis elongation 248 164 33
## 9 ion transport anion transport 300 10 11
## 10 ion transport cation transport 194 8 7
## 11 ion transport transmembrane transport 1905 46 32
Basic question: are the “upregulated” genes special?
goterms$no_diff <- with(goterms, num_genes - upregulated - downregulated)
gomat <- goterms[, c("upregulated", "downregulated", "no_diff")]
chisq.test(gomat)
## Warning in chisq.test(gomat): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: gomat
## X-squared = 1698, df = 20, p-value <2e-16
Yes. But, how?
Let
\[\begin{aligned} N_i &= (\text{number of no difference genes in category $i$}) \\ &\sim \Poisson(\exp(\lambda_i^N)) \\ U_i &= (\text{number of upregulated genes in category $i$}) \\ &\sim \Poisson(\exp(\lambda_i^U)) \\ D_i &= (\text{number of downregulated genes in category $i$}) \\ &\sim \Poisson(\exp(\lambda_i^D)) . \end{aligned}\]
and
\[\begin{aligned} \lambda_i^N &= \alpha_i \\ \lambda_i^U &= \alpha_i + \delta_+ + \beta_1[\text{top}_i] + \beta_2[i] \\ \lambda_i^D &= \alpha_i + \delta_- + \beta_1[\text{top}_i] + \beta_2[i] + \gamma_i. \end{aligned}\]
Interpret these: how do we see if
## Error: <text>:1:6: unexpected '{'
## 1: data {
## ^
## prior class coef group resp dpar nlpar bound
## 1 b
## 2 b eyeBrown
## 3 b eyeBrown:sexMale
## 4 b eyeGreen
## 5 b eyeGreen:sexMale
## 6 b eyeHazel
## 7 b eyeHazel:sexMale
## 8 b hairBlond
## 9 b hairBlond:eyeBrown
## 10 b hairBlond:eyeGreen
## 11 b hairBlond:eyeHazel
## 12 b hairBlond:sexMale
## 13 b hairBrown
## 14 b hairBrown:eyeBrown
## 15 b hairBrown:eyeGreen
## 16 b hairBrown:eyeHazel
## 17 b hairBrown:sexMale
## 18 b hairRed
## 19 b hairRed:eyeBrown
## 20 b hairRed:eyeGreen
## 21 b hairRed:eyeHazel
## 22 b hairRed:sexMale
## 23 b sexMale
## 24 student_t(3, 2, 10) Intercept
## Warning: partial match of 'dpar' to 'dpars'
## Compiling the C++ model
## Start sampling
## prior class coef group resp dpar nlpar bound
## 1 b
## 2 b eyeBrown
## 3 b eyeBrown:sexMale
## 4 b eyeGreen
## 5 b eyeGreen:sexMale
## 6 b eyeHazel
## 7 b eyeHazel:sexMale
## 8 b hairBlond
## 9 b hairBlond:eyeBrown
## 10 b hairBlond:eyeGreen
## 11 b hairBlond:eyeHazel
## 12 b hairBlond:sexMale
## 13 b hairBrown
## 14 b hairBrown:eyeBrown
## 15 b hairBrown:eyeGreen
## 16 b hairBrown:eyeHazel
## 17 b hairBrown:sexMale
## 18 b hairRed
## 19 b hairRed:eyeBrown
## 20 b hairRed:eyeGreen
## 21 b hairRed:eyeHazel
## 22 b hairRed:sexMale
## 23 b sexMale
## 24 student_t(3, 2, 10) Intercept
## Warning: partial match of 'dpar' to 'dpars'
## Compiling the C++ model
## Start sampling
A collection of \(n\) nonnegative random numbers \((P_1, \ldots, P_n)\) that sums to 1 has a Dirichlet(\(\alpha_1, \ldots, \alpha_n\)) distribution if it has probability density \[ \frac{1}{B(\alpha)} \prod_{i=1}^n p_i^{\alpha_i - 1} . \]
Facts:
If \(U \sim \Beta(\alpha, \beta)\) then \((U, 1-U) \sim \Dirichlet(\alpha, \beta)\).
If \(Y_1, \ldots, Y_n\) are independent, Exponential with rates \(\alpha_1, \ldots, \alpha_n\) then \[ Y / \sum_{i=1}^n Y_i \sim \Dirichlet(\alpha_1, \ldots, \alpha_n) . \]
Useful? Yes, if you need a distribution on numbers that sum to one (e.g., class proportions).