\[ %% % Add your macros here; they'll be included in pdf and html output. %% \newcommand{\R}{\mathbb{R}} % reals \newcommand{\E}{\mathbb{E}} % expectation \renewcommand{\P}{\mathbb{P}} % probability \DeclareMathOperator{\logit}{logit} \DeclareMathOperator{\logistic}{logistic} \DeclareMathOperator{\SE}{SE} \DeclareMathOperator{\sd}{sd} \DeclareMathOperator{\var}{var} \DeclareMathOperator{\cov}{cov} \DeclareMathOperator{\cor}{cor} \DeclareMathOperator{\Normal}{Normal} \DeclareMathOperator{\MVN}{MVN} \DeclareMathOperator{\LogNormal}{logNormal} \DeclareMathOperator{\Poisson}{Poisson} \DeclareMathOperator{\Beta}{Beta} \DeclareMathOperator{\Binom}{Binomial} \DeclareMathOperator{\Gam}{Gamma} \DeclareMathOperator{\Exp}{Exponential} \DeclareMathOperator{\Cauchy}{Cauchy} \DeclareMathOperator{\Unif}{Unif} \DeclareMathOperator{\Dirichlet}{Dirichlet} \DeclareMathOperator{\Wishart}{Wishart} \DeclareMathOperator{\StudentsT}{StudentsT} \DeclareMathOperator{\Weibull}{Weibull} \newcommand{\given}{\;\vert\;} \]

The multivariate normal distribution

Peter Ralph

1 February 2021 – Advanced Biological Statistics

The multivariate Normal

… also known as the multivariate Gaussian, or MVN.

If \(X\) is a random vector that has a multivariate Normal distribution, we say \[\begin{aligned} X = (X_1, \ldots, X_k) \sim \MVN(\mu, \Sigma) . \end{aligned}\]

The parameters are the mean vector \(\mu\) and covariance matrix \(\Sigma\): \[\begin{aligned} \E[X_i] = \mu_i \end{aligned}\] and \[\begin{aligned} \cov[X_i, X_j] = \Sigma_{i,j} . \end{aligned}\]

Properties:

  1. \(X_i \sim \Normal(\mu_i, \sqrt{\Sigma_{i,i}})\)

  2. If \(\Sigma_{i,j} = 0\) then \(X_i\) and \(X_j\) are independent.

  3. Level curves of the probability density function are ellipses.
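Property 3 can be read off the density. As a sketch, assuming \(\Sigma\) is invertible (positive definite), the density of \(X\) at a point \(x \in \R^k\) is \[\begin{aligned} f(x) = (2\pi)^{-k/2} \det(\Sigma)^{-1/2} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) , \end{aligned}\] which depends on \(x\) only through the quadratic form \((x - \mu)^T \Sigma^{-1} (x - \mu)\). Setting that quadratic form equal to a constant gives an ellipse centered at \(\mu\) (an ellipsoid when \(k > 2\)).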

Example: a univariate linear model

Let’s say that \(X \sim \Normal(0, 1)\) and \[\begin{aligned} Y &= \beta X + \epsilon \\ \epsilon &\sim \Normal(0, \sigma) . \end{aligned}\]

Then \(Y\) also has a Normal distribution, and \[\begin{aligned} \var[X] &= 1, \\ \var[Y] &= \beta^2 + \sigma^2 \qquad \text{and} \\ \cov[X, Y] &= \beta, \end{aligned}\]

so \[\begin{aligned} (X, Y) \sim \MVN\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \beta \\ \beta & \beta^2 + \sigma^2 \end{bmatrix} \right) . \end{aligned}\]
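The variance and covariance above follow from the linearity of covariance in each argument, under the assumption (implicit in the model, but not stated) that \(\epsilon\) is independent of \(X\), so that \(\cov[X, \epsilon] = 0\): \[\begin{aligned} \E[Y] &= \beta \E[X] + \E[\epsilon] = 0, \\ \var[Y] &= \beta^2 \var[X] + 2 \beta \cov[X, \epsilon] + \var[\epsilon] = \beta^2 + \sigma^2, \\ \cov[X, Y] &= \beta \var[X] + \cov[X, \epsilon] = \beta . \end{aligned}\]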

Let’s have a look:

nobs <- 100000
beta <- 0.7
sigma <- 1
xy <- data.frame(x = rnorm(nobs, mean=0, sd=1))
xy$y <- beta * xy$x + rnorm(nobs, mean=0, sd=sigma)
plot(y ~ x, data=xy, asp=1, pch=20, cex=0.25, col=adjustcolor('black', 0.15))

[Figure: scatterplot of y against x for the simulated points]

covmat <- cbind(c(1, beta), c(beta, beta^2 + sigma^2))
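As a quick numerical check, the sample covariance matrix of the simulated data should be close to the theoretical one. The sketch below repeats the simulation so that it runs on its own; the seed and the names `emp_cov` and `max_diff` are additions made here, not part of the original code.

```r
set.seed(1)  # arbitrary seed, added here only for reproducibility

# Same simulation as above
nobs <- 100000
beta <- 0.7
sigma <- 1
xy <- data.frame(x = rnorm(nobs, mean=0, sd=1))
xy$y <- beta * xy$x + rnorm(nobs, mean=0, sd=sigma)

# Theoretical covariance matrix of (X, Y)
covmat <- cbind(c(1, beta), c(beta, beta^2 + sigma^2))

# Empirical covariance matrix, and its largest entrywise deviation from theory
emp_cov <- cov(xy)
max_diff <- max(abs(emp_cov - covmat))
print(emp_cov)
print(max_diff)
```

With this many observations the two matrices should agree to about two decimal places.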

Let’s have a look:

draw_ellipse(covmat, r=seq(0,1.5,length.out=5), col='red')

[Figure: level curves of the density, drawn as red ellipses]
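The helper draw_ellipse( ) is not defined in this section. A minimal stand-in might look like the following, under the assumption that it adds the curves \(\{x : x^T \Sigma^{-1} x = r^2\}\) to the current plot. The name `draw_ellipse_sketch` and the arguments `center`, `npts`, and `add` are hypothetical, and the original helper may differ.

```r
# Hypothetical stand-in for draw_ellipse(): this is an assumption about
# what the helper does, not the original implementation.
draw_ellipse_sketch <- function(covmat, r = 1, center = c(0, 0),
                                npts = 200, add = TRUE, ...) {
  theta <- seq(0, 2 * pi, length.out = npts)
  circle <- cbind(cos(theta), sin(theta))  # unit circle, one point per row
  U <- chol(covmat)                        # upper triangular, t(U) %*% U == covmat
  curves <- vector("list", length(r))
  for (k in seq_along(r)) {
    # Map the circle of radius r[k] to the ellipse x^T covmat^{-1} x = r[k]^2
    pts <- r[k] * circle %*% U
    pts <- sweep(pts, 2, center, "+")
    if (add) lines(pts, ...)               # requires an existing plot
    curves[[k]] <- pts
  }
  invisible(curves)
}

beta <- 0.7
sigma <- 1
covmat <- cbind(c(1, beta), c(beta, beta^2 + sigma^2))

# Compute one curve without drawing, so no plot needs to be open
pts <- draw_ellipse_sketch(covmat, r = 1.5, add = FALSE)[[1]]
```

After the scatterplot has been drawn, calling `draw_ellipse_sketch(covmat, r=seq(0,1.5,length.out=5), col='red')` would overlay the curves on it. The Cholesky factor is used because it turns the unit circle into a curve on which the quadratic form is constant.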

Exercise:

Use mvtnorm::rmvnorm( ) to simulate 10,000 random draws from \[ (X_1, X_2, X_3) \sim \MVN\left(0, \Sigma\right) .\] This will give you a \(10^4 \times 3\) matrix. First make histograms of each variable. Then look at a pairs( ) plot.

  1. \[ \Sigma = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} . \]
  2. \[ \Sigma = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 3 \end{bmatrix} . \]