\[ %% % Add your macros here; they'll be included in pdf and html output. %% \newcommand{\R}{\mathbb{R}} % reals \newcommand{\E}{\mathbb{E}} % expectation \renewcommand{\P}{\mathbb{P}} % probability \DeclareMathOperator{\logit}{logit} \DeclareMathOperator{\logistic}{logistic} \DeclareMathOperator{\sd}{sd} \DeclareMathOperator{\var}{var} \DeclareMathOperator{\cov}{cov} \DeclareMathOperator{\Normal}{Normal} \DeclareMathOperator{\Poisson}{Poisson} \DeclareMathOperator{\Beta}{Beta} \DeclareMathOperator{\Binom}{Binomial} \DeclareMathOperator{\Gam}{Gamma} \DeclareMathOperator{\Exp}{Exponential} \DeclareMathOperator{\Cauchy}{Cauchy} \DeclareMathOperator{\Unif}{Unif} \DeclareMathOperator{\Dirichlet}{Dirichlet} \DeclareMathOperator{\Wishart}{Wishart} \DeclareMathOperator{\StudentsT}{StudentsT} \newcommand{\given}{\;\vert\;} \]

Random effects, and mixed models

Peter Ralph

29 October – Advanced Biological Statistics

Example: parent-child heights

Demonstration: heights

Let’s recreate Galton’s classic analysis: midparent height, adjusted for gender, is a good predictor of child height. (How good?)

Link to the data.

##   family father mother gender height kids male female
## 1      1   78.5   67.0      M   73.2    4    1      0
## 2      1   78.5   67.0      F   69.2    4    0      1
## 3      1   78.5   67.0      F   69.0    4    0      1
## 4      1   78.5   67.0      F   69.0    4    0      1
## 5      2   75.5   66.5      M   73.5    4    1      0
## 6      2   75.5   66.5      M   72.5    4    1      0

in class

Here is the source and the results.

Random effects

An example: urchins eat algae

From Logan:

To investigate density-dependent grazing effects of sea urchins on filamentous algae, Andrew and Underwood (1993) measured the percentage of filamentous algae within five quadrats randomly positioned within each of four random patches of reef that were in turn nested within four sea urchin density treatments (no urchins, 33% of natural density, 66% of natural density and 100% of natural density). The sea urchin density treatment was considered a fixed factor and patch within density treatment as well as the individual quadrats were treated as random factors.

An example: urchins eat algae

##   TREAT PATCH QUAD ALGAE
## 1    0%     1    1    46
## 2    0%     1    2    44
## 3    0%     1    3    41
## 4    0%     1    4    29
## 5    0%     1    5    11
## 6    0%     2    1    65

There are four variables: TREAT, PATCH, QUAD and ALGAE

Main effect factor: TREAT

Experimental design

## , , TREAT = 0%
## 
##      QUAD
## PATCH 1 2 3 4 5
##    1  1 1 1 1 1
##    2  1 1 1 1 1
##    3  1 1 1 1 1
##    4  1 1 1 1 1
##    5  0 0 0 0 0
##    6  0 0 0 0 0
##    7  0 0 0 0 0
##    8  0 0 0 0 0
##    9  0 0 0 0 0
##    10 0 0 0 0 0
##    11 0 0 0 0 0
##    12 0 0 0 0 0
##    13 0 0 0 0 0
##    14 0 0 0 0 0
##    15 0 0 0 0 0
##    16 0 0 0 0 0
## 
## , , TREAT = 33%
## 
##      QUAD
## PATCH 1 2 3 4 5
##    1  0 0 0 0 0
##    2  0 0 0 0 0
##    3  0 0 0 0 0
##    4  0 0 0 0 0
##    5  1 1 1 1 1
##    6  1 1 1 1 1
##    7  1 1 1 1 1
##    8  1 1 1 1 1
##    9  0 0 0 0 0
##    10 0 0 0 0 0
##    11 0 0 0 0 0
##    12 0 0 0 0 0
##    13 0 0 0 0 0
##    14 0 0 0 0 0
##    15 0 0 0 0 0
##    16 0 0 0 0 0
## 
## , , TREAT = 66%
## 
##      QUAD
## PATCH 1 2 3 4 5
##    1  0 0 0 0 0
##    2  0 0 0 0 0
##    3  0 0 0 0 0
##    4  0 0 0 0 0
##    5  0 0 0 0 0
##    6  0 0 0 0 0
##    7  0 0 0 0 0
##    8  0 0 0 0 0
##    9  1 1 1 1 1
##    10 1 1 1 1 1
##    11 1 1 1 1 1
##    12 1 1 1 1 1
##    13 0 0 0 0 0
##    14 0 0 0 0 0
##    15 0 0 0 0 0
##    16 0 0 0 0 0
## 
## , , TREAT = 100%
## 
##      QUAD
## PATCH 1 2 3 4 5
##    1  0 0 0 0 0
##    2  0 0 0 0 0
##    3  0 0 0 0 0
##    4  0 0 0 0 0
##    5  0 0 0 0 0
##    6  0 0 0 0 0
##    7  0 0 0 0 0
##    8  0 0 0 0 0
##    9  0 0 0 0 0
##    10 0 0 0 0 0
##    11 0 0 0 0 0
##    12 0 0 0 0 0
##    13 1 1 1 1 1
##    14 1 1 1 1 1
##    15 1 1 1 1 1
##    16 1 1 1 1 1

Response distribution

plot of chunk boxit

Why is this wrong?

## 
## Call:
## lm(formula = ALGAE ~ TREAT, data = andrew_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -39.20 -19.00  -1.30  12.72  57.45 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   39.200      5.152   7.608 6.17e-11 ***
## TREAT33%     -20.200      7.287  -2.772   0.0070 ** 
## TREAT66%     -17.650      7.287  -2.422   0.0178 *  
## TREAT100%    -37.900      7.287  -5.201 1.62e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.04 on 76 degrees of freedom
## Multiple R-squared:  0.2634, Adjusted R-squared:  0.2343 
## F-statistic: 9.059 on 3 and 76 DF,  p-value: 3.362e-05

What we really want: \[ \text{(algae)} = \text{(mean for treatment)} + \text{(mean offset for patch)} + \text{("noise")} . \]

We could do:

ALGAE ~ TREAT + PATCH

… but do we care about all those patch means?

## 
## Call:
## lm(formula = ALGAE ~ TREAT + PATCH, data = andrew_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -36.80  -2.60  -0.50   4.25  42.20 
## 
## Coefficients: (3 not defined because of singularities)
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   34.200      7.728   4.426 3.82e-05 ***
## TREAT33%       1.600     10.929   0.146  0.88406    
## TREAT66%     -14.200     10.929  -1.299  0.19850    
## TREAT100%    -31.600     10.929  -2.891  0.00523 ** 
## PATCH2        27.800     10.929   2.544  0.01339 *  
## PATCH3       -32.000     10.929  -2.928  0.00472 ** 
## PATCH4        24.200     10.929   2.214  0.03038 *  
## PATCH5       -33.200     10.929  -3.038  0.00345 ** 
## PATCH6       -35.800     10.929  -3.276  0.00170 ** 
## PATCH7         1.800     10.929   0.165  0.86970    
## PATCH8            NA         NA      NA       NA    
## PATCH9         8.400     10.929   0.769  0.44495    
## PATCH10       16.800     10.929   1.537  0.12917    
## PATCH11      -19.000     10.929  -1.739  0.08693 .  
## PATCH12           NA         NA      NA       NA    
## PATCH13       -1.000     10.929  -0.092  0.92738    
## PATCH14       -2.600     10.929  -0.238  0.81272    
## PATCH15       -1.600     10.929  -0.146  0.88406    
## PATCH16           NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.28 on 64 degrees of freedom
## Multiple R-squared:  0.6512, Adjusted R-squared:  0.5694 
## F-statistic: 7.964 on 15 and 64 DF,  p-value: 1.05e-09
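The model written above (treatment mean, plus patch offset, plus noise) can be simulated directly, which helps show what goes wrong when patch is ignored. The following is a minimal sketch in base R, not the analysis from the slides: the treatment means and standard deviations are made-up values chosen only to be on roughly the same scale as the urchin data. It compares the naive fit, which treats all 80 quadrats as independent, with a fit to the 16 patch means.

```r
set.seed(1)
# Hypothetical values, chosen only to resemble the scale of the urchin data.
treat_levels <- c("0%", "33%", "66%", "100%")
treat_mean <- c(40, 20, 20, 2)     # assumed treatment means
patch_sd <- 17                     # assumed sd of the patch offsets
noise_sd <- 17                     # assumed sd of quadrat-level noise

# 16 patches, 4 per treatment, 5 quadrats per patch (as in the design table)
design <- expand.grid(QUAD = 1:5, PATCH = 1:16)
design$TREAT <- factor(treat_levels[(design$PATCH - 1) %/% 4 + 1],
                       levels = treat_levels)
patch_offset <- rnorm(16, mean = 0, sd = patch_sd)
design$ALGAE <- treat_mean[as.integer(design$TREAT)] +
    patch_offset[design$PATCH] + rnorm(nrow(design), sd = noise_sd)
design$PATCH <- factor(design$PATCH)

# Naive fit: treats all 80 quadrats as independent observations
naive <- lm(ALGAE ~ TREAT, data = design)

# Patch-level fit: one mean per patch, so only 16 observations
patch_means <- aggregate(ALGAE ~ TREAT + PATCH, data = design, FUN = mean)
by_patch <- lm(ALGAE ~ TREAT, data = patch_means)

# Same point estimates (the design is balanced), different standard errors
naive_se <- coef(summary(naive))[, "Std. Error"]
patch_se <- coef(summary(by_patch))[, "Std. Error"]
print(rbind(naive = naive_se, by_patch = patch_se))
print(c(naive_df = df.residual(naive), by_patch_df = df.residual(by_patch)))
```

Because quadrats in the same patch share an offset, the naive fit counts 76 residual degrees of freedom where the patch-level fit has only 12.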

Random effects

Small modification: \[ \text{(algae)} = \text{(mean for treatment)} + \text{(random offset for patch)} + \text{("noise")} . \]

We add a random intercept:

ALGAE ~ TREAT + (1|PATCH)

## Linear mixed model fit by REML ['lmerMod']
## Formula: ALGAE ~ TREAT + (1 | PATCH)
##    Data: andrew_data
## 
## REML criterion at convergence: 682.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9808 -0.3106 -0.1093  0.2831  2.5910 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  PATCH    (Intercept) 294.3    17.16   
##  Residual             298.6    17.28   
## Number of obs: 80, groups:  PATCH, 16
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   39.200      9.408   4.167
## TREAT33%     -20.200     13.305  -1.518
## TREAT66%     -17.650     13.305  -1.327
## TREAT100%    -37.900     13.305  -2.849
## 
## Correlation of Fixed Effects:
##           (Intr) TREAT3 TREAT6
## TREAT33%  -0.707              
## TREAT66%  -0.707  0.500       
## TREAT100% -0.707  0.500  0.500

## refitting model(s) with ML (instead of REML)
## Data: andrew_data
## Models:
## lm(ALGAE ~ TREAT, data = andrew_data): ALGAE ~ TREAT
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data): ALGAE ~ TREAT + (1 | PATCH)
##                                                       Df    AIC    BIC
## lm(ALGAE ~ TREAT, data = andrew_data)                  5 734.90 746.81
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data)  6 718.83 733.12
##                                                        logLik deviance
## lm(ALGAE ~ TREAT, data = andrew_data)                 -362.45   724.90
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data) -353.42   706.83
##                                                        Chisq Chi Df
## lm(ALGAE ~ TREAT, data = andrew_data)                              
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data) 18.069      1
##                                                       Pr(>Chisq)    
## lm(ALGAE ~ TREAT, data = andrew_data)                               
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data)   2.13e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## refitting model(s) with ML (instead of REML)
## Data: andrew_data
## Models:
## lmer(ALGAE ~ (1 | PATCH), data = andrew_data): ALGAE ~ (1 | PATCH)
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data): ALGAE ~ TREAT + (1 | PATCH)
##                                                       Df    AIC    BIC
## lmer(ALGAE ~ (1 | PATCH), data = andrew_data)          3 721.12 728.27
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data)  6 718.83 733.12
##                                                        logLik deviance
## lmer(ALGAE ~ (1 | PATCH), data = andrew_data)         -357.56   715.12
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data) -353.42   706.83
##                                                        Chisq Chi Df
## lmer(ALGAE ~ (1 | PATCH), data = andrew_data)                      
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data) 8.2938      3
##                                                       Pr(>Chisq)  
## lmer(ALGAE ~ (1 | PATCH), data = andrew_data)                     
## lmer(ALGAE ~ TREAT + (1 | PATCH), data = andrew_data)    0.04031 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
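The comparisons above can be reproduced in outline on simulated data. This is a sketch, assuming the `lme4` package is installed (the block does nothing beyond simulating data if it is not); the data frame `sim` and all numbers in it are hypothetical stand-ins for `andrew_data`.

```r
set.seed(3)
# Simulated stand-in for the urchin data; all numbers here are hypothetical.
lev <- c("0%", "33%", "66%", "100%")
sim <- expand.grid(QUAD = 1:5, PATCH = 1:16)
sim$TREAT <- factor(lev[(sim$PATCH - 1) %/% 4 + 1], levels = lev)
patch_offset <- rnorm(16, sd = 17)
sim$ALGAE <- c(40, 20, 20, 2)[as.integer(sim$TREAT)] +
    patch_offset[sim$PATCH] + rnorm(nrow(sim), sd = 17)
sim$PATCH <- factor(sim$PATCH)

if (requireNamespace("lme4", quietly = TRUE)) {
    library(lme4)
    fit_mixed <- lmer(ALGAE ~ TREAT + (1 | PATCH), data = sim)
    fit_fixed <- lm(ALGAE ~ TREAT, data = sim)
    fit_notreat <- lmer(ALGAE ~ (1 | PATCH), data = sim)

    print(VarCorr(fit_mixed))             # estimated patch and residual sd
    print(fixef(fit_mixed))               # estimated treatment effects
    print(anova(fit_mixed, fit_fixed))    # does the patch term help?
    print(anova(fit_mixed, fit_notreat))  # does the treatment term help?
    patch_effects <- ranef(fit_mixed)$PATCH   # one estimated offset per patch
}
```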

What are the random effects?

## $PATCH
##    (Intercept)
## 1   -4.1565776
## 2   18.9539940
## 3  -30.7586745
## 4   15.9612581
## 5  -13.6335746
## 6  -15.7949950
## 7   15.4624688
## 8   13.9661009
## 9    5.6945114
## 10  12.6775618
## 11 -17.0835341
## 12  -1.2885391
## 13   0.2493947
## 14  -1.0807102
## 15  -0.2493947
## 16   1.0807102
## 
## with conditional variances for "PATCH"

plot of chunk plot_ranef

Notes on mixed models

The math is a lot harder.

For simple linear regression (with fixed effects), the log-likelihood function is, up to constants and a scale factor, just minus the sum of the squared residuals.

But with a mixed model, the likelihood averages over the values of the random effects, which makes everything more difficult.
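To make the contrast concrete, here is a sketch in notation that is not from the slides: \(x_i\) is the vector of predictors for observation \(i\), \(g_i\) is its group, \(u_a\) is the random offset of group \(a\), \(\sigma\) is the standard deviation of the noise, and \(\tau\) is the standard deviation of the random offsets. For ordinary regression with Gaussian noise,

\[ \log L(\beta, \sigma) = -\frac{n}{2} \log(2 \pi \sigma^2) - \frac{1}{2 \sigma^2} \sum_{i=1}^n (y_i - x_i \cdot \beta)^2 , \]

so for any fixed \(\sigma\), maximizing the likelihood over \(\beta\) is the same as minimizing the sum of squared residuals. With a random intercept the offsets are not observed, so the likelihood integrates over them:

\[ L(\beta, \sigma, \tau) = \int \prod_{i=1}^n p(y_i \mid u_{g_i}, \beta, \sigma) \prod_{a} p(u_a \mid \tau) \, du , \]

where \(p(y_i \mid u_{g_i}, \beta, \sigma)\) is the Normal density with mean \(x_i \cdot \beta + u_{g_i}\) and standard deviation \(\sigma\), and \(p(u_a \mid \tau)\) is the Normal density with mean 0 and standard deviation \(\tau\).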

You sometimes have to worry about convergence.

Since the math is harder, mixed-model-fitting functions like lmer( ) have to use various sorts of numerical optimization methods to find the best-fitting parameters.

Sometimes, these may fail.

Notably, many fit by REML (restricted maximum likelihood) by default:

Usage:

     lmer(formula, data = NULL, REML = TRUE, control = lmerControl(),
          start = NULL, verbose = 0L, subset, weights, na.action,
          offset, contrasts = NULL, devFunOnly = FALSE, ...)

Hypothesis testing?

With fixed effects, for a factor f, the comparison

anova( lm(y ~ f - 1), lm(y ~ 1) )

uses the model that \[ y_i = \beta_{f_i} + \epsilon_i \] to test against the null hypothesis that all levels share a single mean, \[ H_0 : \beta_1 = \beta_2 = \cdots = \beta_m . \]
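A small base R sketch of this comparison, on simulated data with made-up group means: the comparison against `lm(y ~ 1)` asks whether a separate mean per level fits better than one shared mean, and it gives the same \(F\) test whether the full model is written as `y ~ f - 1` or as `y ~ f`.

```r
set.seed(2)
m <- 4                                   # number of factor levels (hypothetical)
f <- factor(rep(letters[1:m], each = 10))
group_mean <- c(5, 5, 6, 7)              # assumed group means
y <- group_mean[as.integer(f)] + rnorm(length(f))

null_fit <- lm(y ~ 1)                    # one shared mean
full_fit <- lm(y ~ f - 1)                # one mean per level, no intercept
alt_fit <- lm(y ~ f)                     # same fit, written as offsets from level a

tab1 <- anova(null_fit, full_fit)
tab2 <- anova(null_fit, alt_fit)
print(tab1)
# The two parameterizations give the same F statistic, on m - 1 degrees of freedom
print(c(F_no_intercept = tab1$F[2], F_with_intercept = tab2$F[2]))
```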

With random effects,

anova( lmer(y ~ (1|f)), lm(y ~ 1) )

uses the model that \[\begin{aligned} y_i &= \mu + \beta_{f_i} + \epsilon_i \\ \beta_a &\sim \Normal(0, \eta) \end{aligned}\] to test against the null hypothesis that \[ H_0 : \eta = 0. \]

Back to the height data

Your turn

  1. Add a random effect of family to the model.
  2. How big is the “family” effect?
  3. Assess significance by using anova( ) to compare to a nested model.

Link to the data.

in class

## refitting model(s) with ML (instead of REML)
## Data: galton
## Models:
## lm(height ~ gender + mother + father, data = galton): height ~ gender + mother + father
## lmer(height ~ gender + mother + father + (1 | family), data = galton): height ~ gender + mother + father + (1 | family)
##                                                                       Df
## lm(height ~ gender + mother + father, data = galton)                   5
## lmer(height ~ gender + mother + father + (1 | family), data = galton)  6
##                                                                          AIC
## lm(height ~ gender + mother + father, data = galton)                  3932.8
## lmer(height ~ gender + mother + father + (1 | family), data = galton) 3891.4
##                                                                          BIC
## lm(height ~ gender + mother + father, data = galton)                  3956.8
## lmer(height ~ gender + mother + father + (1 | family), data = galton) 3920.2
##                                                                        logLik
## lm(height ~ gender + mother + father, data = galton)                  -1961.4
## lmer(height ~ gender + mother + father + (1 | family), data = galton) -1939.7
##                                                                       deviance
## lm(height ~ gender + mother + father, data = galton)                    3922.8
## lmer(height ~ gender + mother + father + (1 | family), data = galton)   3879.4
##                                                                        Chisq
## lm(height ~ gender + mother + father, data = galton)                        
## lmer(height ~ gender + mother + father + (1 | family), data = galton) 43.401
##                                                                       Chi Df
## lm(height ~ gender + mother + father, data = galton)                        
## lmer(height ~ gender + mother + father + (1 | family), data = galton)      1
##                                                                       Pr(>Chisq)
## lm(height ~ gender + mother + father, data = galton)                            
## lmer(height ~ gender + mother + father + (1 | family), data = galton)   4.46e-11
##                                                                          
## lm(height ~ gender + mother + father, data = galton)                     
## lmer(height ~ gender + mother + father + (1 | family), data = galton) ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

in class

plot of chunk do_it

Multiple comparisons

A silly example

Suppose 100 people did 100 well-executed experiments to ask if snails move faster while listening to metal than to Mozart.

How many would find a statistically significant difference at \(p < 0.05\)?

Would any find a large effect size?

A less silly example

Suppose someone conducts a well-controlled study that records the salary and the mean daily consumption of 100 different foods in a bunch of people.

How many of the foods would be statistically significantly correlated with income at \(p < 0.05\)?

Would any have a large effect size?

The problem

A \(p\)-value is

the probability of seeing something at least as extreme as what was seen in the data, if the null hypothesis were true.

So, if the null hypothesis is true, then by definition, \(p\)-values are uniformly distributed between 0 and 1.
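This is easy to check by simulation. The sketch below (base R only; the sample sizes and number of experiments are arbitrary choices) runs many two-sample \(t\)-tests in which both groups come from the same distribution, so the null hypothesis is true every time.

```r
set.seed(4)
n_experiments <- 2000
# Each experiment: two groups of 20 drawn from the same distribution,
# so the null hypothesis of "no difference" is true.
pvals <- replicate(n_experiments, t.test(rnorm(20), rnorm(20))$p.value)

# If p-values are uniform, about 5% of them fall below 0.05 ...
print(mean(pvals < 0.05))
# ... and each tenth of [0, 1] gets about a tenth of them.
print(table(cut(pvals, breaks = seq(0, 1, by = 0.1))) / n_experiments)
```

So even with no real effect anywhere, roughly one experiment in twenty comes out "significant" at \(p < 0.05\).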

The Bonferroni Correction

A cutoff of \(p < 0.05\) ensures that, when the null hypothesis is true, you wrongly reject it no more than 5% of the time.

But what if you do \(n\) different tests, all at once?

To keep the probability of wrongly rejecting any of the \(n\) null hypotheses at or below 5%, take a cutoff of \(p < 0.05/n\).

To tolerate some errors, use the false discovery rate.
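Both corrections are available in base R through `p.adjust()`. The sketch below uses simulated \(z\)-scores, with a made-up number of real effects and a made-up effect size, and it uses the Benjamini-Hochberg method (`"BH"`) as one common way to control the false discovery rate; the slides do not say which procedure they use.

```r
set.seed(5)
n_tests <- 1000
n_real <- 50                      # hypothetical number of tests with a real effect
# z-scores: the first n_real have mean 4, the rest are pure noise
z <- rnorm(n_tests, mean = c(rep(4, n_real), rep(0, n_tests - n_real)))
pvals <- 2 * pnorm(-abs(z))       # two-sided p-values

raw <- pvals < 0.05                                       # no correction
bonf <- p.adjust(pvals, method = "bonferroni") < 0.05     # same as p < 0.05 / n
fdr <- p.adjust(pvals, method = "BH") < 0.05              # Benjamini-Hochberg

print(c(raw = sum(raw), bonferroni = sum(bonf), BH = sum(fdr)))
# How many of the rejections are false (i.e., among the null tests)?
is_null <- seq_len(n_tests) > n_real
print(c(raw = sum(raw & is_null), bonferroni = sum(bonf & is_null),
        BH = sum(fdr & is_null)))
```

Bonferroni never rejects more tests than Benjamini-Hochberg, which in turn never rejects more than the uncorrected cutoff.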

Example: Bonferroni

plot of chunk null

Example: False Discovery Rate

plot of chunk null2

Many regressions

Gene expression levels

From “Host Genotype and Microbiota Contribute Asymmetrically to Transcriptional Variation in the Threespine Stickleback Gut”, by Clayton M. Small, Kathryn Milligan-Myhre, Susan Bassham, Karen Guillemin, and William A. Cresko. Genome Biology and Evolution, March 2017.

Study design:

study_design

The data

There are 8506 genes whose expression is measured in 84 fish.

plot of chunk plot_data

Normalize

To put coefficients on the same scale:
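One common way to do this, shown here as an assumption rather than as the normalization actually used in the slides, is to center each gene and divide by its standard deviation with `scale()`. The matrix `raw_expr` below is simulated.

```r
set.seed(7)
# Hypothetical raw expression: 84 fish (rows) by 5 genes (columns),
# with the genes on very different scales
raw_expr <- sapply(c(1, 10, 100, 1000, 10000),
                   function(s) rexp(84, rate = 1 / s))
norm_expr <- scale(raw_expr)       # subtract column mean, divide by column sd

print(round(colMeans(norm_expr), 10))   # each gene now has mean 0 ...
print(apply(norm_expr, 2, sd))          # ... and standard deviation 1
```

After this step, a regression coefficient for any gene is in units of that gene's standard deviation.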

plot of chunk plotdata2

plot of chunk matplot

Fit lots of models:

and extract coefficients, \(p\)-values
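A minimal base R sketch of this step follows. The covariate names match the model formulas in these slides, but the data are simulated pure noise, the factor levels are invented, and only 200 genes are used instead of 8506, so nothing here reproduces the study's results.

```r
set.seed(6)
n_fish <- 84                       # as in the slides
n_genes <- 200                     # far fewer than 8506, to keep this quick
# Hypothetical covariates; the real study design may differ
fish <- data.frame(
    Population = factor(sample(c("A", "B"), n_fish, replace = TRUE)),
    Treatment = factor(sample(c("ctrl", "trt"), n_fish, replace = TRUE)),
    Sex = factor(sample(c("F", "M"), n_fish, replace = TRUE))
)
# Simulated, already-normalized expression: one column per gene
expr <- matrix(rnorm(n_fish * n_genes), nrow = n_fish)

# Fit the two nested models to one gene; return coefficients and the ANOVA p-value
fit_one <- function(y) {
    small <- lm(y ~ Population, data = fish)
    big <- lm(y ~ Population + Treatment + Sex, data = fish)
    c(coef(big)[-1], p = anova(small, big)[2, "Pr(>F)"])
}
results <- t(apply(expr, 2, fit_one))   # one row per gene

print(head(results))
print(c(p05 = sum(results[, "p"] < 0.05),
        bonferroni = sum(results[, "p"] < 0.05 / n_genes)))
```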

The \(p\)-values

… for an ANOVA comparing

    gene expression ~ Population
    gene expression ~ Population + Treatment + Sex

plot of chunk show_pvals

Coefficients, with \(p < 0.05\) in red:

plot of chunk signif

Coefficients, with \(p < 0.05/n\) in red:

plot of chunk signif2

The Bonferroni Bunch

##                 Gene_ID   Genome_Loc Gene_Start_bp
## 379  ENSGACG00000001729 scaffold_882          1249
## 594  ENSGACG00000002048     groupXXI       2362051
## 833  ENSGACG00000002397       groupV        842100
## 985  ENSGACG00000003874     groupXIX       5074262
## 1005 ENSGACG00000003899     groupXIX       5087963
## 1014 ENSGACG00000003911     groupXIX       5234537
## 1028 ENSGACG00000003928     groupXIX       5268421
## 1042 ENSGACG00000003947     groupXIX       5289965
## 1102 ENSGACG00000004020     groupXIX       5466205
## 1109 ENSGACG00000004028     groupXIX       5497683
## 1128 ENSGACG00000004050     groupXIX       5505366
## 1138 ENSGACG00000004063     groupXIX       5536933
## 1155 ENSGACG00000004088     groupXIX       5663914
## 1161 ENSGACG00000004098     groupXIX       5670651
## 1162 ENSGACG00000004099     groupXIX       5675242
## 1165 ENSGACG00000004103     groupXIX       5687641
## 1172 ENSGACG00000004112     groupXIX       5691367
## 1185 ENSGACG00000004129     groupXIX       5819632
## 1193 ENSGACG00000004140     groupXIX       5932949
## 1195 ENSGACG00000004142     groupXIX       5949691
## 1197 ENSGACG00000004145     groupXIX       5951392
## 1254 ENSGACG00000004208     groupXIX       5968125
## 1261 ENSGACG00000004215     groupXIX       5984689
## 1304 ENSGACG00000004276     groupXIX       6011659
## 1338 ENSGACG00000004329     groupXIX       6085324
## 1341 ENSGACG00000004333     groupXIX       6090639
## 1355 ENSGACG00000004350     groupXIX       6093858
## 1359 ENSGACG00000004357     groupXIX       6098400
## 1381 ENSGACG00000004380     groupXIX       6111536
## 1384 ENSGACG00000004384     groupXIX       6122297
## 1426 ENSGACG00000004439     groupXIX       6183509
## 1463 ENSGACG00000004485     groupXIX       6201210
## 1473 ENSGACG00000004498     groupXIX       6213441
## 1490 ENSGACG00000004521     groupXIX       6234660
## 1518 ENSGACG00000004555     groupXIX       6253382
## 1534 ENSGACG00000004578     groupXIX       6269682
## 1559 ENSGACG00000004613     groupXIX       6294749
## 1609 ENSGACG00000004670     groupXIX       6301852
## 1612 ENSGACG00000004675     groupXIX       6304507
## 1616 ENSGACG00000004680     groupXIX       6310208
## 1627 ENSGACG00000004691     groupXIX       6319533
## 1630 ENSGACG00000004695     groupXIX       6323877
## 1651 ENSGACG00000004724     groupXIX       6381558
## 1667 ENSGACG00000004744     groupXIX       6392993
## 1671 ENSGACG00000004748       groupI        474993
## 1710 ENSGACG00000004795     groupXIX       6430393
## 1720 ENSGACG00000004806     groupXIX       6434587
## 1739 ENSGACG00000004827     groupXIX       6437512
## 1772 ENSGACG00000004867     groupXIX       6453754
## 1798 ENSGACG00000004903     groupXIX       6544984
## 1939 ENSGACG00000005078     groupXIX       6569082
## 1946 ENSGACG00000005085     groupXIX       6609904
## 1951 ENSGACG00000005092     groupXIX       6644849
## 1975 ENSGACG00000005119     groupXIX       6652728
## 2009 ENSGACG00000005167     groupXIX       6676365
## 2015 ENSGACG00000005176     groupXIX       6715666
## 2020 ENSGACG00000005181     groupXIX       6726200
## 2072 ENSGACG00000005252     groupXIX       6844584
## 2075 ENSGACG00000005255     groupXIX       6848563
## 2101 ENSGACG00000005293     groupXIX       6862001
## 2127 ENSGACG00000005331     groupXIX       6868449
## 2145 ENSGACG00000005355     groupXIX       6872104
## 2155 ENSGACG00000005365     groupXIX       6873765
## 2180 ENSGACG00000005397     groupXIX       6878614
## 2182 ENSGACG00000005399     groupXIX       6880605
## 2187 ENSGACG00000005406     groupXIX       6886885
## 2194 ENSGACG00000005414     groupXIX       6889077
## 2195 ENSGACG00000005416     groupXIX       6895042
## 2209 ENSGACG00000005436     groupXIX       6921327
## 2233 ENSGACG00000005468     groupXIX       6952818
## 2243 ENSGACG00000005483     groupXIX       7010663
## 2248 ENSGACG00000005489     groupXIX       7013605
## 2263 ENSGACG00000005509     groupXIX       7017971
## 2268 ENSGACG00000005514     groupXIX       7023483
## 2288 ENSGACG00000005541     groupXIX       7040298
## 2307 ENSGACG00000005561     groupXIX       7055587
## 2345 ENSGACG00000005613     groupXIX       7078941
## 2361 ENSGACG00000005632     groupXIX       7091066
## 2377 ENSGACG00000005655       groupX       8128412
## 2381 ENSGACG00000005659     groupXIX       7099791
## 2445 ENSGACG00000005940     groupXIX       7354155
## 2473 ENSGACG00000005974     groupXIX       7374591
## 2481 ENSGACG00000005988     groupXIX       7395960
## 2487 ENSGACG00000005996     groupXIX       7404582
## 2490 ENSGACG00000006001     groupXIX       7409406
## 2508 ENSGACG00000006025     groupXIX       7420099
## 2533 ENSGACG00000006058     groupXIX       7436490
## 2564 ENSGACG00000006101     groupXIX       7477087
## 2571 ENSGACG00000006110     groupXIX       7488251
## 2588 ENSGACG00000006135     groupXIX       7504978
## 2594 ENSGACG00000006141     groupXIX       7515235
## 2613 ENSGACG00000006260     groupXIX       7626845
## 2628 ENSGACG00000006281     groupXIX       7634611
## 2656 ENSGACG00000006315     groupXIX       7644684
## 2678 ENSGACG00000006340     groupXIX       7654842
## 2687 ENSGACG00000006351     groupXIX       7669052
## 2813 ENSGACG00000006516     groupXIX       7716217
## 2815 ENSGACG00000006521     groupXIX       7718951
## 2857 ENSGACG00000006582     groupXIX       7755202
## 2915 ENSGACG00000006659     groupXIX       7928238
## 2934 ENSGACG00000006687     groupXIX       7934480
## 2954 ENSGACG00000008302     groupXIX       9271720
## 2962 ENSGACG00000008315     groupXIX       9287888
## 2981 ENSGACG00000008376     groupXIX       9337694
## 3080 ENSGACG00000008494     groupXIX       9448469
## 3107 ENSGACG00000008531     groupXIX       9501117
## 3113 ENSGACG00000008543     groupXIX       9517063
## 3117 ENSGACG00000008550     groupXIX       9523896
## 3132 ENSGACG00000008572     groupXIX       9532775
## 3136 ENSGACG00000008577     groupXIX       9535480
## 3152 ENSGACG00000008599     groupXIX       9564658
## 3158 ENSGACG00000008607      groupXI       5736635
## 3163 ENSGACG00000008612     groupXIX       9579090
## 3168 ENSGACG00000008617     groupXIX       9607677
## 3185 ENSGACG00000008638     groupXIX       9640525
## 3199 ENSGACG00000008655     groupXIX       9666876
## 3221 ENSGACG00000008687     groupXIX       9749129
## 3255 ENSGACG00000008743     groupXIX       9835581
## 3280 ENSGACG00000008779     groupXIX       9929496
## 3283 ENSGACG00000008783     groupXIX       9933073
## 3311 ENSGACG00000008837     groupXIX       9945600
## 3314 ENSGACG00000008843     groupXIX       9951237
## 3350 ENSGACG00000008898     groupXIX      10047090
## 3359 ENSGACG00000008907     groupXIX      10061760
## 3365 ENSGACG00000008914     groupXIX      10072808
## 3376 ENSGACG00000008928     groupXIX      10081037
## 3392 ENSGACG00000008949     groupXIX      10101541
## 3416 ENSGACG00000008981     groupXIX      10122821
## 3441 ENSGACG00000009014     groupXIX      10134903
## 3442 ENSGACG00000009015     groupXIX      10148833
## 3463 ENSGACG00000009039     groupXIX      10185959
## 3512 ENSGACG00000009107     groupXIX      10282175
## 3583 ENSGACG00000009201     groupXIX      10353716
## 3584 ENSGACG00000009202     groupXIX      10363935
## 3590 ENSGACG00000009210     groupXIX      10390061
## 3656 ENSGACG00000009309     groupXIX      10403743
## 3674 ENSGACG00000009342     groupXIX      10441422
## 3703 ENSGACG00000009378     groupXIX      10583153
## 3727 ENSGACG00000010003     groupXIX      11486565
## 3736 ENSGACG00000010014     groupXIX      11517961
## 3745 ENSGACG00000010024     groupXIX      11574991
## 3747 ENSGACG00000010026    groupXIII       9810875
## 3752 ENSGACG00000010032     groupXIX      11627393
## 3760 ENSGACG00000010042     groupXIX      11688379
## 3768 ENSGACG00000010054     groupXIX      11737187
## 3774 ENSGACG00000010060     groupXIX      11744274
## 3790 ENSGACG00000010863     groupXIX      12871147
## 3814 ENSGACG00000010898     groupXIX      12932396
## 3847 ENSGACG00000010936     groupXIX      12952501
## 3866 ENSGACG00000010963     groupXIX      12980638
## 3868 ENSGACG00000010965     groupXIX      12984613
## 3880 ENSGACG00000010978     groupXIX      12985921
## 3900 ENSGACG00000011004     groupXIX      12995354
## 3929 ENSGACG00000011046     groupXIX      13053541
## 3942 ENSGACG00000011062     groupXIX      13172917
## 3957 ENSGACG00000011081     groupXIX      13227769
## 3997 ENSGACG00000011130     groupXIX      13354738
## 4026 ENSGACG00000011173     groupXIX      13427058
## 4027 ENSGACG00000011175     groupXIX      13430600
## 4031 ENSGACG00000011179     groupXIX      13455597
## 4084 ENSGACG00000011713     groupXIX      14968720
## 4091 ENSGACG00000011723     groupXIX      14986398
## 4092 ENSGACG00000011725     groupXIX      14992460
## 4108 ENSGACG00000011745     groupXIX      15000177
## 4126 ENSGACG00000011771     groupXIX      15033199
## 4213 ENSGACG00000011879     groupXIX      15132156
## 4221 ENSGACG00000011888     groupXIX      15143411
## 4242 ENSGACG00000011914     groupXIX      15189308
## 4265 ENSGACG00000011944     groupXIX      15214002
## 4285 ENSGACG00000011975     groupXIX      15276451
## 4331 ENSGACG00000012034     groupXIX      15311956
## 4344 ENSGACG00000012049     groupXIX      15324801
## 4346 ENSGACG00000012053     groupXIX      15336362
## 4357 ENSGACG00000012066     groupXIX      15348293
## 4359 ENSGACG00000012071     groupXIX      15357367
## 4377 ENSGACG00000012099     groupXIX      15387902
## 4383 ENSGACG00000012110     groupXIX      15399820
## 4462 ENSGACG00000012213     groupXIX      15460729
## 4467 ENSGACG00000012221     groupXIX      15481309
## 4492 ENSGACG00000012349     groupXIX      15712132
## 4521 ENSGACG00000012386     groupIII         50881
## 4523 ENSGACG00000012388     groupXIX      15752174
## 4545 ENSGACG00000012412     groupXIX      15770641
## 4570 ENSGACG00000012441     groupXIX      15776072
## 4584 ENSGACG00000012456     groupXIX      15776176
## 4586 ENSGACG00000012458     groupXIX      15837060
## 4597 ENSGACG00000012474     groupXIX      15918028
## 4638 ENSGACG00000012536     groupXIX      15947984
## 4639 ENSGACG00000012538     groupXIX      15950976
## 4642 ENSGACG00000012541     groupXIX      15963171
## 4672 ENSGACG00000012580     groupXIX      16058808
## 4693 ENSGACG00000012612     groupXIX      16082531
## 4719 ENSGACG00000012640     groupXIX      16212950
## 4804 ENSGACG00000012758     groupXIX      16545546
## 4853 ENSGACG00000012835     groupXIX      16608508
## 4880 ENSGACG00000012874     groupXIX      16640414
## 4922 ENSGACG00000013517     groupXIX      18023897
## 4948 ENSGACG00000013552     groupXIX      18279323
## 4958 ENSGACG00000013564     groupXIX      18286218
## 4991 ENSGACG00000013611     groupXIX      18485565
## 4996 ENSGACG00000013617     groupXIX      18488096
## 5002 ENSGACG00000013623     groupXIX      18554402
## 5020 ENSGACG00000013648     groupXIX      18713908
## 5056 ENSGACG00000013687     groupXIX      18747194
## 5064 ENSGACG00000013695     groupXIX      18764501
## 5071 ENSGACG00000013705     groupXIX      18768929
## 5077 ENSGACG00000013714     groupXIX      18804888
## 5119 ENSGACG00000013768     groupXIX      18860234
## 5124 ENSGACG00000013775     groupXIX      18901520
## 5128 ENSGACG00000013779     groupXIX      18943143
## 5131 ENSGACG00000013784     groupXIX      18973913
## 5135 ENSGACG00000013788     groupXIX      18976351
## 5209 ENSGACG00000013883     groupXIX      19632903
## 5217 ENSGACG00000013896     groupXIX      19648021
## 5220 ENSGACG00000013899     groupXIX      19659023
## 5240 ENSGACG00000013931     groupXIX      19748754
## 5246 ENSGACG00000013939     groupXIX      19756843
## 5266 ENSGACG00000013963     groupXIX      19768602
## 5292 ENSGACG00000013996     groupXIX      19940088
## 5353 ENSGACG00000014081     groupXIX      20045284
## 5359 ENSGACG00000014090     groupXIX      20079687
## 6745 ENSGACG00000018690 scaffold_654          2701
## 6747 ENSGACG00000018692 scaffold_654          7332
## 7089 ENSGACG00000019136      groupIV      21957750
## 7984 ENSGACG00000020294     groupVII      15804561
## 8497 ENSGACG00000022681     groupXIX       8017901
##                                                                                                                            Gene_Description
## 379                                                                    *interleukin-8 [Lateolabrax japonicus];ABI48894 [Source:TopBlasxHit]
## 594                                                        RNA-binding region (RNP1, RRM) containing 3 [Source:ZFIN;Acc:ZDB-GENE-060312-35]
## 833                                                                          stearoyl-CoA desaturase b [Source:ZFIN;Acc:ZDB-GENE-050522-12]
## 985                                                                                     parvin, beta [Source:ZFIN;Acc:ZDB-GENE-030131-4411]
## 1005                                                                                     parvin, gamma [Source:ZFIN;Acc:ZDB-GENE-070410-67]
## 1014                                                                                         plexin b2b [Source:ZFIN;Acc:ZDB-GENE-080902-1]
## 1028                                                        tubulin, gamma complex associated protein 6 [Source:HGNC Symbol;Acc:HGNC:18127]
## 1042            adaptor protein, phosphotyrosine interaction, PH domain and leucine zipper containing 2 [Source:ZFIN;Acc:ZDB-GENE-081016-2]
## 1102                                                                       transmembrane protein 117 [Source:ZFIN;Acc:ZDB-GENE-040426-2809]
## 1109                                                                   twinfilin actin-binding protein 1 [Source:HGNC Symbol;Acc:HGNC:9620]
## 1128                                                       interleukin-1 receptor-associated kinase 4 [Source:ZFIN;Acc:ZDB-GENE-040426-738]
## 1138                                         ADAM metallopeptidase with thrombospondin type 1 motif, 20 [Source:HGNC Symbol;Acc:HGNC:17178]
## 1155                                                                    prickle homolog 1a (Drosophila) [Source:ZFIN;Acc:ZDB-GENE-030724-5]
## 1161                                                                                       periphilin 1 [Source:HGNC Symbol;Acc:HGNC:19369]
## 1162                                                                                                                                       
## 1165                                                                          YY1 associated factor 2 [Source:ZFIN;Acc:ZDB-GENE-041210-115]
## 1172                                                                  glucoside xylosyltransferase 1b [Source:ZFIN;Acc:ZDB-GENE-041210-116]
## 1185                                                                                         contactin 1 [Source:HGNC Symbol;Acc:HGNC:2171]
## 1193                                                     patatin-like phospholipase domain containing 8 [Source:HGNC Symbol;Acc:HGNC:28900]
## 1195                                                     DnaJ (Hsp40) homolog, subfamily B, member 9a [Source:ZFIN;Acc:ZDB-GENE-050626-115]
## 1197                                                                                  dynamin 1-like [Source:ZFIN;Acc:ZDB-GENE-040426-1556]
## 1254                                                                                     caldesmon 1b [Source:ZFIN;Acc:ZDB-GENE-090313-229]
## 1261                                                                   2,3-bisphosphoglycerate mutase [Source:ZFIN;Acc:ZDB-GENE-040718-375]
## 1304                                                       CCR4-NOT transcription complex, subunit 4a [Source:ZFIN;Acc:ZDB-GENE-090313-262]
## 1338                                                                         lactate dehydrogenase Bb [Source:ZFIN;Acc:ZDB-GENE-040718-176]
## 1341                                                                              golgi transport 1Ba [Source:ZFIN;Acc:ZDB-GENE-041210-157]
## 1355                                                             solute carrier family 35, member B4 [Source:ZFIN;Acc:ZDB-GENE-030131-2457]
## 1359                                        coiled-coil-helix-coiled-coil-helix domain containing 3b [Source:ZFIN;Acc:ZDB-GENE-030131-4476]
## 1381                                                                neuroepithelial cell transforming 1 [Source:HGNC Symbol;Acc:HGNC:14592]
## 1384                                                       ankyrin repeat and SOCS box containing 13b [Source:ZFIN;Acc:ZDB-GENE-091118-116]
## 1426                                            dehydrogenase E1 and transketolase domain containing 1 [Source:ZFIN;Acc:ZDB-GENE-041212-44]
## 1463                                                 calcium/calmodulin-dependent protein kinase 1Db [Source:ZFIN;Acc:ZDB-GENE-070112-1872]
## 1473                                                                      cyclin-dependent kinase 16 [Source:ZFIN;Acc:ZDB-GENE-030131-2939]
## 1490                                                                                                                                       
## 1518                                                                                           netrin 4 [Source:ZFIN;Acc:ZDB-GENE-050310-1]
## 1534                                                                  ubiquitin specific peptidase 3 [Source:ZFIN;Acc:ZDB-GENE-030131-5142]
## 1559                                                             mannosidase, alpha, class 2C, member 1 [Source:ZFIN;Acc:ZDB-GENE-101103-4]
## 1609                                                           nei endonuclease VIII-like 1 (E. coli) [Source:ZFIN;Acc:ZDB-GENE-040426-994]
## 1612                                                                         COMM domain containing 4 [Source:ZFIN;Acc:ZDB-GENE-060929-600]
## 1616                                                                                   semaphorin 7A [Source:ZFIN;Acc:ZDB-GENE-030131-3633]
## 1627                                       immunoglobulin superfamily containing leucine-rich repeat 2 [Source:ZFIN;Acc:ZDB-GENE-050320-95]
## 1630                                                                    stimulated by retinoic acid 6 [Source:ZFIN;Acc:ZDB-GENE-060616-252]
## 1651                                                                          stomatin (EPB72)-like 1 [Source:ZFIN;Acc:ZDB-GENE-070209-241]
## 1667                                                             hexosaminidase A (alpha polypeptide) [Source:ZFIN;Acc:ZDB-GENE-050417-283]
## 1671                                                       eukaryotic translation initiation factor 2A [Source:ZFIN;Acc:ZDB-GENE-050626-52]
## 1710                                                       phosphopantothenoylcysteine decarboxylase [Source:ZFIN;Acc:ZDB-GENE-040426-1749]
## 1720                                                                 3-hydroxyacyl-CoA dehydratase 3 [Source:ZFIN;Acc:ZDB-GENE-040426-1200]
## 1739                                                     von Willebrand factor A domain containing 9 [Source:ZFIN;Acc:ZDB-GENE-030131-5804]
## 1772                                                                   DENN/MADD domain containing 4A [Source:ZFIN;Acc:ZDB-GENE-060503-285]
## 1798                                                       mitogen-activated protein kinase kinase 1 [Source:ZFIN;Acc:ZDB-GENE-040426-2759]
## 1939                                               small nuclear RNA activating complex, polypeptide 5 [Source:ZFIN;Acc:ZDB-GENE-041111-50]
## 1946                                                                            SMAD family member 6b [Source:ZFIN;Acc:ZDB-GENE-050419-198]
## 1951                                                                              SMAD family member 3b [Source:ZFIN;Acc:ZDB-GENE-030128-4]
## 1975                                                         alpha- and gamma-adaptin binding protein [Source:ZFIN;Acc:ZDB-GENE-040718-120]
## 2009                                                                                       zgc:162898 [Source:ZFIN;Acc:ZDB-GENE-070410-107]
## 2015                                                          protein inhibitor of activated STAT, 1b [Source:ZFIN;Acc:ZDB-GENE-050419-202]
## 2020                                                                        mortality factor 4 like 1 [Source:ZFIN;Acc:ZDB-GENE-040718-348]
## 2072                                                            damage-specific DNA binding protein 2 [Source:ZFIN;Acc:ZDB-GENE-050419-169]
## 2075                                                   kelch repeat and BTB (POZ) domain containing 4 [Source:ZFIN;Acc:ZDB-GENE-040426-937]
## 2101                                                              immunoglobulin mu binding protein 2 [Source:ZFIN;Acc:ZDB-GENE-050419-258]
## 2127                                                                   chitinase domain containing 1 [Source:ZFIN;Acc:ZDB-GENE-030131-9169]
## 2145                                                           Parkinson disease 7 domain containing 1 [Source:ZFIN;Acc:ZDB-GENE-051030-96]
## 2155                                                                                   CD151 molecule [Source:ZFIN;Acc:ZDB-GENE-041010-137]
## 2180                                                                                si:ch211-247i17.1 [Source:ZFIN;Acc:ZDB-GENE-131121-275]
## 2182                                                   calcium release activated channel regulator 2B [Source:ZFIN;Acc:ZDB-GENE-061215-136]
## 2187                                                                          transmembrane protein 138 [Source:ZFIN;Acc:ZDB-GENE-120912-1]
## 2194                                                                       transmembrane protein 258 [Source:ZFIN;Acc:ZDB-GENE-040426-1739]
## 2195                                                                          myelin regulatory factor [Source:ZFIN;Acc:ZDB-GENE-080204-57]
## 2209                                                                                si:ch1073-89b12.1 [Source:ZFIN;Acc:ZDB-GENE-131121-340]
## 2233                                                                                 synaptotagmin VIIa [Source:ZFIN;Acc:ZDB-GENE-090601-5]
## 2243                                                                                                                                       
## 2248                                                                                   si:dkey-201c1.2 [Source:ZFIN;Acc:ZDB-GENE-110408-61]
## 2263                                                                                                                                       
## 2268                                                               p53-induced death domain protein 1 [Source:ZFIN;Acc:ZDB-GENE-081104-353]
## 2288                                                                                   zmp:0000001167 [Source:ZFIN;Acc:ZDB-GENE-140106-127]
## 2307                                                              ATH1, acid trehalase-like 1 (yeast) [Source:ZFIN;Acc:ZDB-GENE-061103-319]
## 2345                                                          RAB3A interacting protein (rabin3)-like 1 [Source:ZFIN;Acc:ZDB-GENE-110921-5]
## 2361                                                                       Hermansky-Pudlak syndrome 5 [Source:ZFIN;Acc:ZDB-GENE-070410-80]
## 2377                                                          small nuclear ribonucleoprotein 40 (U5) [Source:ZFIN;Acc:ZDB-GENE-040426-978]
## 2381                                                  general transcription factor IIH, polypeptide 1 [Source:ZFIN;Acc:ZDB-GENE-040912-164]
## 2445                                                                        fin bud initiation factor a [Source:ZFIN;Acc:ZDB-GENE-111031-2]
## 2473                                    caseinolytic mitochondrial matrix peptidase chaperone subunit b [Source:ZFIN;Acc:ZDB-GENE-130404-1]
## 2481                                               adaptor-related protein complex 4, epsilon 1 subunit [Source:ZFIN;Acc:ZDB-GENE-061221-3]
## 2487                                          guanine nucleotide binding protein (G protein), beta 5a [Source:ZFIN;Acc:ZDB-GENE-070112-342]
## 2490                                                                                        myosin VC [Source:ZFIN;Acc:ZDB-GENE-131127-196]
## 2508                                                                                        myosin VAb [Source:ZFIN;Acc:ZDB-GENE-050411-72]
## 2533                                                               ribosomal L24 domain containing 1 [Source:ZFIN;Acc:ZDB-GENE-040426-1925]
## 2564                                                                            transcription factor 12 [Source:HGNC Symbol;Acc:HGNC:11623]
## 2571                                                                                    cingulin-like 1 [Source:HGNC Symbol;Acc:HGNC:25931]
## 2588                                                                   ADAM metallopeptidase domain 10b [Source:ZFIN;Acc:ZDB-GENE-071115-1]
## 2594                                                              FANCD2/FANCI-associated nuclease 1 [Source:ZFIN;Acc:ZDB-GENE-030131-6225]
## 2613                                                           sulfide quinone reductase-like (yeast) [Source:ZFIN;Acc:ZDB-GENE-050417-436]
## 2628       CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) small phosphatase like 2b [Source:ZFIN;Acc:ZDB-GENE-030131-1809]
## 2656                                                  poly (ADP-ribose) polymerase family, member 16 [Source:ZFIN;Acc:ZDB-GENE-040426-2289]
## 2678                                                                                                                                       
## 2687                                              UDP glucuronosyltransferase 5 family, polypeptide A1 [Source:ZFIN;Acc:ZDB-GENE-051120-60]
## 2813                                                                 cytochrome c oxidase subunit Vaa [Source:ZFIN;Acc:ZDB-GENE-050522-133]
## 2815                                                             A kinase (PRKA) interacting protein 1 [Source:ZFIN;Acc:ZDB-GENE-030829-24]
## 2857                                            proteasome (prosome, macropain) subunit, alpha type, 1 [Source:ZFIN;Acc:ZDB-GENE-040801-15]
## 2915                                                                                        zgc:56106 [Source:ZFIN;Acc:ZDB-GENE-040426-904]
## 2934                                         pleckstrin homology domain containing, family A member 7a [Source:ZFIN;Acc:ZDB-GENE-050419-75]
## 2954                                                                                 synaptotagmin VIII [Source:ZFIN;Acc:ZDB-GENE-060303-4]
## 2962                                           troponin I type 2a (skeletal, fast), tandem duplicate 1 [Source:ZFIN;Acc:ZDB-GENE-041114-60]
## 2981                                                                    lymphocyte-specific protein 1 [Source:ZFIN;Acc:ZDB-GENE-131127-171]
## 3080                                                                                si:ch1073-174d20.2 [Source:ZFIN;Acc:ZDB-GENE-121214-51]
## 3107                                                                 ring finger and WD repeat domain 3 [Source:ZFIN;Acc:ZDB-GENE-120529-1]
## 3113                                        transmembrane emp24 protein transport domain containing 6 [Source:ZFIN;Acc:ZDB-GENE-131121-182]
## 3117                                                                zinc finger, DHHC-type containing 7 [Source:HGNC Symbol;Acc:HGNC:18459]
## 3132                                                             ankyrin repeat domain 27 (VPS9 domain) [Source:ZFIN;Acc:ZDB-GENE-121105-1]
## 3136                                                             ankyrin repeat domain 27 (VPS9 domain) [Source:ZFIN;Acc:ZDB-GENE-121105-1]
## 3152                                                                                        zgc:162267 [Source:ZFIN;Acc:ZDB-GENE-070410-53]
## 3158                  signal transducer and activator of transcription 3 (acute-phase response factor) [Source:ZFIN;Acc:ZDB-GENE-980526-68]
## 3163                                                                                                                                       
## 3168                                      protein tyrosine phosphatase, receptor-type, Z polypeptide 1a [Source:ZFIN;Acc:ZDB-GENE-090406-1]
## 3185                                                                 aminoadipate-semialdehyde synthase [Source:ZFIN;Acc:ZDB-GENE-061220-8]
## 3199                                                               Ca++-dependent secretion activator 2 [Source:ZFIN;Acc:ZDB-GENE-030903-1]
## 3221                                                         ankyrin repeat and SOCS box containing 15a [Source:ZFIN;Acc:ZDB-GENE-110421-5]
## 3255                                                                  protection of telomeres 1 homolog [Source:ZFIN;Acc:ZDB-GENE-110324-1]
## 3280                                                                                                                                       
## 3283                                                                                       zgc:101553 [Source:ZFIN;Acc:ZDB-GENE-041114-124]
## 3311                                                                                                                                       
## 3314                                                    FtsJ RNA methyltransferase homolog 1 (E. coli) [Source:ZFIN;Acc:ZDB-GENE-041114-83]
## 3350                                                                                    ceramide kinase [Source:HGNC Symbol;Acc:HGNC:19256]
## 3359                                                                                si:ch211-286k11.4 [Source:ZFIN;Acc:ZDB-GENE-131121-159]
## 3365                                                                       GRAM domain containing 4b [Source:ZFIN;Acc:ZDB-GENE-030131-4780]
## 3376                                                                                                                                       
## 3392                                             Bet1 golgi vesicular membrane trafficking protein-like [Source:ZFIN;Acc:ZDB-GENE-040822-2]
## 3416                                                                      G-2 and S-phase expressed 1 [Source:ZFIN;Acc:ZDB-GENE-050522-493]
## 3441                                                                                                                                       
## 3442                                                                     cortactin binding protein 2 [Source:ZFIN;Acc:ZDB-GENE-030131-8134]
## 3463 cystic fibrosis transmembrane conductance regulator (ATP-binding cassette sub-family C, member 7) [Source:ZFIN;Acc:ZDB-GENE-050517-20]
## 3512                                             capping protein (actin filament) muscle Z-line, alpha 2 [Source:HGNC Symbol;Acc:HGNC:1490]
## 3583                                                                                      caveolin 1 [Source:ZFIN;Acc:ZDB-GENE-030131-2415]
## 3584                                                                                       caveolin 2 [Source:ZFIN;Acc:ZDB-GENE-040625-164]
## 3590                                                         testis derived transcript (3 LIM domains) [Source:ZFIN;Acc:ZDB-GENE-040718-59]
## 3656                                                                            centrosomal protein 41 [Source:ZFIN;Acc:ZDB-GENE-040704-35]
## 3674                                                                     dual specificity phosphatase 6 [Source:ZFIN;Acc:ZDB-GENE-030613-1]
## 3703                                            transmembrane and tetratricopeptide repeat containing 3 [Source:ZFIN;Acc:ZDB-GENE-061221-2]
## 3727                           phosphatidylinositol-4-phosphate 3-kinase, catalytic subunit type 2 gamma [Source:HGNC Symbol;Acc:HGNC:8973]
## 3736                                           pleckstrin homology domain containing, family A member 5 [Source:HGNC Symbol;Acc:HGNC:30036]
## 3745                                                                               AE binding protein 2 [Source:HGNC Symbol;Acc:HGNC:24051]
## 3747                                                    ankyrin repeat and death domain containing 1B [Source:ZFIN;Acc:ZDB-GENE-060526-136]
## 3752                                                                phosphodiesterase 3A, cGMP-inhibited [Source:HGNC Symbol;Acc:HGNC:8778]
## 3760                                                              B-cell receptor-associated protein 29 [Source:HGNC Symbol;Acc:HGNC:24131]
## 3768                                                                   HMG-box transcription factor 1 [Source:ZFIN;Acc:ZDB-GENE-050522-414]
## 3774                                           protein kinase, cAMP-dependent, regulatory, type II, beta [Source:HGNC Symbol;Acc:HGNC:9392]
## 3790                                                                 Bloom syndrome, RecQ helicase-like [Source:ZFIN;Acc:ZDB-GENE-070702-5]
## 3814                                                                                   alpha-kinase 3a [Source:ZFIN;Acc:ZDB-GENE-050419-48]
## 3847                                                                         zinc finger protein 592 [Source:ZFIN;Acc:ZDB-GENE-030131-9613]
## 3866                                                                                       zgc:153293 [Source:ZFIN;Acc:ZDB-GENE-060825-315]
## 3868                                                                  RAB19, member RAS oncogene family [Source:HGNC Symbol;Acc:HGNC:19982]
## 3880                                                   cat eye syndrome chromosome region, candidate 5 [Source:ZFIN;Acc:ZDB-GENE-080220-59]
## 3900                                                   Usher syndrome 1C (autosomal recessive, severe) [Source:ZFIN;Acc:ZDB-GENE-060312-41]
## 3929                                                                           MOB kinase activator 2a [Source:ZFIN;Acc:ZDB-GENE-040718-56]
## 3942                              protein tyrosine phosphatase, receptor type, Jb, tandem duplicate 2 [Source:ZFIN;Acc:ZDB-GENE-131120-137]
## 3957                                                                oxysterol binding protein-like 5 [Source:ZFIN;Acc:ZDB-GENE-030131-5872]
## 3997                                                               mitochondrial ribosomal protein L23 [Source:ZFIN;Acc:ZDB-GENE-040625-12]
## 4026                                                                                                                                       
## 4027                                                                  RAB19, member RAS oncogene family [Source:HGNC Symbol;Acc:HGNC:19982]
## 4031                                                                                        im:6904482 [Source:ZFIN;Acc:ZDB-GENE-050506-81]
## 4084                                                                         early endosome antigen 1 [Source:ZFIN;Acc:ZDB-GENE-041111-270]
## 4091                                      nudix (nucleoside diphosphate linked moiety X)-type motif 4a [Source:ZFIN;Acc:ZDB-GENE-031010-33]
## 4092                                                               ubiquitin-conjugating enzyme E2Nb [Source:ZFIN;Acc:ZDB-GENE-040426-1291]
## 4108                                                  nuclear receptor subfamily 1, group H, member 4 [Source:ZFIN;Acc:ZDB-GENE-040718-313]
## 4126                                                                                  si:dkey-103i16.1 [Source:ZFIN;Acc:ZDB-GENE-060503-47]
## 4213                                                                      cation/H+ exchanger protein 2 [Source:ZFIN;Acc:ZDB-GENE-100825-2]
## 4221                                                                         kelch domain containing 10 [Source:HGNC Symbol;Acc:HGNC:22194]
## 4242                                                                cholinergic receptor, muscarinic 2b [Source:ZFIN;Acc:ZDB-GENE-090410-3]
## 4265                                                                                   si:ch211-127m7.3 [Source:ZFIN;Acc:ZDB-GENE-141211-6]
## 4285                                                                          KxDL motif containing 1 [Source:ZFIN;Acc:ZDB-GENE-040801-207]
## 4331                                                               leucine rich repeat containing 17 [Source:ZFIN;Acc:ZDB-GENE-030131-9774]
## 4344                                                                                 si:ch211-236c15.2 [Source:ZFIN;Acc:ZDB-GENE-120709-53]
## 4346                                                               round spermatid basic protein 1-like [Source:HGNC Symbol;Acc:HGNC:24765]
## 4357                                                                           proline rich 5 (renal) [Source:ZFIN;Acc:ZDB-GENE-130530-791]
## 4359                                                                     RAD52 homolog (S. cerevisiae) [Source:ZFIN;Acc:ZDB-GENE-050731-10]
## 4377                                                        ELKS/RAB6-interacting/CAST family member 1a [Source:ZFIN;Acc:ZDB-GENE-091214-5]
## 4383                                                                                                                                       
## 4462                                      nudix (nucleoside diphosphate linked moiety X)-type motif 7 [Source:ZFIN;Acc:ZDB-GENE-131127-212]
## 4467                                                                                                                                       
## 4492                                                                             choline kinase beta [Source:ZFIN;Acc:ZDB-GENE-030131-2928]
## 4521                                                                  chemokine (C-X-C motif) receptor 4 [Source:HGNC Symbol;Acc:HGNC:2561]
## 4523                                                                        progastricsin (pepsinogen C) [Source:HGNC Symbol;Acc:HGNC:8890]
## 4545                                                                                  arylsulfatase A [Source:ZFIN;Acc:ZDB-GENE-050320-118]
## 4570                                                                                                                                       
## 4584                                                                                                                                       
## 4586                                                       SH3 and multiple ankyrin repeat domains 3a [Source:ZFIN;Acc:ZDB-GENE-060503-369]
## 4597                                                        RAB, member of RAS oncogene family-like 2 [Source:ZFIN;Acc:ZDB-GENE-060503-464]
## 4638                                                                                                                                       
## 4639                                                                                                                                       
## 4642                                            putative pyruvate dehydrogenase phosphatase isoenzyme 2 [Source:ZFIN;Acc:ZDB-GENE-000921-2]
## 4672                                                receptor-interacting serine-threonine kinase 3 like [Source:ZFIN;Acc:ZDB-GENE-071115-4]
## 4693                                                                     RNA binding motif protein 28 [Source:ZFIN;Acc:ZDB-GENE-040426-960]
## 4719                                                                         hepatocyte growth factor b [Source:ZFIN;Acc:ZDB-GENE-041014-3]
## 4804                                                     family with sequence similarity 107, member B [Source:ZFIN;Acc:ZDB-GENE-031030-12]
## 4853                                                                                                                                       
## 4880                                                     protein phosphatase 6, regulatory subunit 2b [Source:ZFIN;Acc:ZDB-GENE-070705-441]
## 4922                                                         CCR4-NOT transcription complex, subunit 2 [Source:ZFIN;Acc:ZDB-GENE-070410-70]
## 4948                                                               tetratricopeptide repeat domain 38 [Source:ZFIN;Acc:ZDB-GENE-050522-318]
## 4958                                                                                 si:ch211-59c24.1 [Source:ZFIN;Acc:ZDB-GENE-060503-607]
## 4991                                                                                                                                       
## 4996                                                     calcium release activated channel regulator 2A [Source:HGNC Symbol;Acc:HGNC:28657]
## 5002                                                                                   si:dkeyp-2c8.2 [Source:ZFIN;Acc:ZDB-GENE-081031-100]
## 5020                                                         FYVE, RhoGEF and PH domain containing 4a [Source:ZFIN;Acc:ZDB-GENE-050420-347]
## 5056                                                                       WEE1 homolog 2 (S. pombe) [Source:ZFIN;Acc:ZDB-GENE-030131-5682]
## 5064                                                ATPase, H+ transporting, lysosomal, V1 subunit E1a [Source:ZFIN;Acc:ZDB-GENE-041212-51]
## 5071                                          solute carrier family 25 (glutamate carrier), member 18 [Source:ZFIN;Acc:ZDB-GENE-041111-192]
## 5077                                                   cat eye syndrome chromosome region, candidate 1a [Source:ZFIN;Acc:ZDB-GENE-030902-4]
## 5119                                                                 WAP four-disulfide core domain 1 [Source:ZFIN;Acc:ZDB-GENE-070112-352]
## 5124                                           potassium voltage-gated channel, subfamily G, member 4a [Source:ZFIN;Acc:ZDB-GENE-050419-11]
## 5128                                                                                 si:dkey-246g23.2 [Source:ZFIN;Acc:ZDB-GENE-050419-100]
## 5131                                                            heat shock factor binding protein 1b [Source:ZFIN;Acc:ZDB-GENE-040426-1721]
## 5135                                                                                      zgc:173742 [Source:ZFIN;Acc:ZDB-GENE-030131-6489]
## 5209                                                                                       zgc:103697 [Source:ZFIN;Acc:ZDB-GENE-040912-104]
## 5217                                                           ring finger and SPRY domain containing 1 [Source:ZFIN;Acc:ZDB-GENE-061026-2]
## 5220                                                  ADP-ribosylation factor-like 2 binding protein [Source:ZFIN;Acc:ZDB-GENE-040426-1604]
## 5240                                             apoptosis-inducing factor, mitochondrion-associated, 2 [Source:HGNC Symbol;Acc:HGNC:21411]
## 5246                                                                            Bardet-Biedl syndrome 2 [Source:ZFIN;Acc:ZDB-GENE-020801-1]
## 5266                                    UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 9 [Source:ZFIN;Acc:ZDB-GENE-060503-611]
## 5292                                                                                                                                       
## 5353                                          transport and golgi organization 6 homolog (Drosophila) [Source:ZFIN;Acc:ZDB-GENE-050419-237]
## 5359                                           mitogen-activated protein kinase 8 interacting protein 1 [Source:ZFIN;Acc:ZDB-GENE-101025-1]
## 6745                                                                                si:ch1073-89b12.1 [Source:ZFIN;Acc:ZDB-GENE-131121-340]
## 6747                                                                            fatty acid desaturase 2 [Source:ZFIN;Acc:ZDB-GENE-011212-1]
## 7089                                                                                       plexin C1 [Source:ZFIN;Acc:ZDB-GENE-030131-1620]
## 7984                                                       V-set and immunoglobulin domain containing 1 [Source:HGNC Symbol;Acc:HGNC:28675]
## 8497                                                                                  Small nucleolar RNA SNORA19 [Source:RFAM;Acc:RF00413]
##                     Gene_Symbol
## 379                            
## 594                       rnpc3
## 833                        scdb
## 985                       parvb
## 1005                      parvg
## 1014                    plxnb2b
## 1028                    TUBGCP6
## 1042                      appl2
## 1102                    tmem117
## 1109                       TWF1
## 1128                      irak4
## 1138                   ADAMTS20
## 1155                  prickle1a
## 1161            PPHLN1 (1 of 3)
## 1162                           
## 1165                       yaf2
## 1172                    gxylt1b
## 1185             CNTN1 (1 of 2)
## 1193            PNPLA8 (1 of 2)
## 1195                    dnajb9a
## 1197                      dnm1l
## 1254                     cald1b
## 1261                       bpgm
## 1304                     cnot4a
## 1338                      ldhbb
## 1341                    golt1ba
## 1355                    slc35b4
## 1359                    chchd3b
## 1381              NET1 (1 of 2)
## 1384                     asb13b
## 1426                     dhtkd1
## 1463                    camk1db
## 1473             cdk16 (1 of 2)
## 1490                           
## 1518                       ntn4
## 1534                       usp3
## 1559                     man2c1
## 1609                      neil1
## 1612                     commd4
## 1616                     sema7a
## 1627                      islr2
## 1630                      stra6
## 1651                     stoml1
## 1667                       hexa
## 1671                      eif2a
## 1710                      ppcdc
## 1720                      hacd3
## 1739                       vwa9
## 1772                    dennd4a
## 1798                     map2k1
## 1939                     snapc5
## 1946                     smad6b
## 1951                     smad3b
## 1975                      aagab
## 2009                 zgc:162898
## 2015                     pias1b
## 2020                    morf4l1
## 2072                       ddb2
## 2075                     kbtbd4
## 2101                    ighmbp2
## 2127                      chid1
## 2145                      pddc1
## 2155                      cd151
## 2180          si:ch211-247i17.1
## 2182                    cracr2b
## 2187                    tmem138
## 2194                    tmem258
## 2195                       myrf
## 2209 si:ch1073-89b12.1 (1 of 3)
## 2233                      syt7a
## 2243                           
## 2248            si:dkey-201c1.2
## 2263                           
## 2268                      pidd1
## 2288             zmp:0000001167
## 2307                      athl1
## 2345                    rab3il1
## 2361                       hps5
## 2377                    snrnp40
## 2381                     gtf2h1
## 2445                     fibina
## 2473                      clpxb
## 2481                      ap4e1
## 2487                      gnb5a
## 2490                      myo5c
## 2508                     myo5ab
## 2533                    rsl24d1
## 2564             TCF12 (1 of 2)
## 2571             CGNL1 (1 of 2)
## 2588                    adam10b
## 2594                       fan1
## 2613                      sqrdl
## 2628                   ctdspl2b
## 2656                     parp16
## 2678                           
## 2687                     ugt5a1
## 2813                     cox5aa
## 2815                      akip1
## 2857                      psma1
## 2915                  zgc:56106
## 2934                   plekha7a
## 2954                       syt8
## 2962                   tnni2a.1
## 2981                       lsp1
## 3080         si:ch1073-174d20.2
## 3107                      rfwd3
## 3113                      tmed6
## 3117            ZDHHC7 (1 of 2)
## 3132           ankrd27 (1 of 2)
## 3136           ankrd27 (2 of 2)
## 3152        zgc:162267 (2 of 2)
## 3158                      stat3
## 3163                           
## 3168                    ptprz1a
## 3185                       aass
## 3199                     cadps2
## 3221                     asb15a
## 3255                       pot1
## 3280                           
## 3283                 zgc:101553
## 3311                           
## 3314                      ftsj1
## 3350              CERK (1 of 2)
## 3359          si:ch211-286k11.4
## 3365                    gramd4b
## 3376                           
## 3392                      bet1l
## 3416                      gtse1
## 3441                           
## 3442                    cttnbp2
## 3463                       cftr
## 3512                     CAPZA2
## 3583                       cav1
## 3584                       cav2
## 3590                        tes
## 3656                      cep41
## 3674                      dusp6
## 3703                      tmtc3
## 3727                    PIK3C2G
## 3736           PLEKHA5 (1 of 2)
## 3745             AEBP2 (1 of 2)
## 3747                    ankdd1b
## 3752             PDE3A (2 of 2)
## 3760            BCAP29 (2 of 2)
## 3768                       hbp1
## 3774                    PRKAR2B
## 3790                        blm
## 3814                     alpk3a
## 3847                     znf592
## 3866                 zgc:153293
## 3868             RAB19 (1 of 3)
## 3880                      cecr5
## 3900                      ush1c
## 3929                      mob2a
## 3942                   ptprjb.2
## 3957                     osbpl5
## 3997                     mrpl23
## 4026                           
## 4027             RAB19 (2 of 3)
## 4031                 im:6904482
## 4084                       eea1
## 4091                     nudt4a
## 4092                     ube2nb
## 4108                      nr1h4
## 4126           si:dkey-103i16.1
## 4213                       cax2
## 4221           KLHDC10 (1 of 2)
## 4242                     chrm2b
## 4265           si:ch211-127m7.3
## 4285                       kxd1
## 4331            lrrc17 (1 of 2)
## 4344          si:ch211-236c15.2
## 4346                     RSBN1L
## 4357                       prr5
## 4359                      rad52
## 4377                      erc1a
## 4383                           
## 4462                      nudt7
## 4467                           
## 4492                       chkb
## 4521             CXCR4 (2 of 2)
## 4523                        PGC
## 4545                       arsa
## 4570                           
## 4584                           
## 4586                    shank3a
## 4597                      rabl2
## 4638                           
## 4639                           
## 4642                       pdp2
## 4672                     ripk3l
## 4693                      rbm28
## 4719                       hgfb
## 4804                    fam107b
## 4853                           
## 4880                    ppp6r2b
## 4922                      cnot2
## 4948                      ttc38
## 4958           si:ch211-59c24.1
## 4991                           
## 4996           CRACR2A (1 of 2)
## 5002             si:dkeyp-2c8.2
## 5020                      fgd4a
## 5056                       wee2
## 5064                  atp6v1e1a
## 5071                   slc25a18
## 5077                     cecr1a
## 5119                      wfdc1
## 5124                     kcng4a
## 5128           si:dkey-246g23.2
## 5131                     hsbp1b
## 5135                 zgc:173742
## 5209                 zgc:103697
## 5217                     rspry1
## 5220                     arl2bp
## 5240                      AIFM2
## 5246                       bbs2
## 5266                     b3gnt9
## 5292                           
## 5353                     tango6
## 5359                   mapk8ip1
## 6745 si:ch1073-89b12.1 (3 of 3)
## 6747             fads2 (3 of 3)
## 7089                     plxnc1
## 7984                      VSIG1
## 8497                    SNORA19

The paper

We limited differential expression analysis to only those genes represented by at least two reads per million mapped (“copies per million,” CPM) in at least 12 of the 84 libraries (see supplementary fig. S1, Supplementary Material online). We normalized read counts for these 15,847 genes using TMM normalization (Robinson and Oshlack 2010) as implemented by the calcNormFactors function of the R/Bioconductor package edgeR (Robinson et al. 2010). In order to perform gene-wise differential expression analyses in a general linear model framework (Law et al. 2014), we supplied the TMM normalization factors to the voom function of the R/Bioconductor package limma (Ritchie et al. 2015), which generated appropriately weighted log2CPM expression values for all observations. We then fit a linear model for each gene including the fixed effects of factor levels for host population, host family (nested within host population), sex, and microbiota treatment using the limma lmFit function. We did not include a library “batch” effect in the model because initial nMDS ordination did not suggest batch as a major source of transcriptional variation, and our stratified assignment of samples to batches controlled for any confounding effect of batch with respect to other factors of interest. To account for variation between replicate flasks we incorporated flask as a random effect in the model using the limma duplicateCorrelation function. Each hypothesis of interest was tested, for each gene, using one or more contrasts via moderated t-tests applied by the limma function eBayes. To evaluate the effect of our microbiota treatment we performed a within-OC contrast, a within-FW contrast, and an overall contrast. Genes expressed differentially in any of these three contrasts were interpreted as being associated with the presence of microbes. 
We performed a single contrast to test for an overall effect of host population, and a single contrast to test for an interaction between host population and microbiota, both of these accounting for family differences nested within population. Finally, we performed contrasts to test for an effect of sex and a sex-by-microbiota interaction. For each of these seven contrasts, we controlled the false discovery rate (FDR) at 0.1 using the approach of Benjamini and Hochberg (1995), as implemented by the limma topTable function.
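Two steps in that description are simple enough to sketch in base R: the expression filter (at least 2 CPM in at least 12 of the 84 libraries) and the Benjamini-Hochberg adjustment at an FDR of 0.1. This is only an illustration under stated assumptions: the count matrix is simulated, the p-values are uniform placeholders rather than results of moderated t-tests, and all object names (`counts`, `cpm`, `keep`, `pvals`, `padj`) are made up here. The paper itself used edgeR and limma for normalization and testing, which this sketch does not reproduce.

```r
# Simulated counts: 200 hypothetical genes x 84 libraries (NOT the paper's data).
set.seed(1)
n_genes <- 200
n_libs <- 84
# Widely varying mean expression per gene, so that some genes are rarely seen.
gene_means <- exp(rnorm(n_genes, mean = 3, sd = 3))
# The matrix fills column-wise, so row i uses gene_means[i] in every library.
counts <- matrix(rpois(n_genes * n_libs, lambda = gene_means),
                 nrow = n_genes, ncol = n_libs)

# Counts per million: divide each library (column) by its total, times 1e6.
cpm <- t(t(counts) / colSums(counts)) * 1e6

# Keep genes with at least 2 CPM in at least 12 libraries.
keep <- rowSums(cpm >= 2) >= 12
filtered <- counts[keep, , drop = FALSE]

# Placeholder p-values, one per retained gene (an assumption for illustration;
# in the paper these come from moderated t-tests on each contrast).
pvals <- runif(sum(keep))

# Benjamini-Hochberg adjustment, then call genes significant at FDR 0.1.
padj <- p.adjust(pvals, method = "BH")
significant <- padj <= 0.1

# How many genes survived the filter, and how many were called significant.
c(kept = sum(keep), significant = sum(significant))
```

Because the placeholder p-values are drawn under the null, few or no genes should pass the 0.1 threshold here; with real data the adjusted p-values would be computed separately for each of the seven contrasts.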

Next week

We’re starting on Kruschke, Doing Bayesian Data Analysis.