Peter Ralph
Advanced Biological Statistics
If \(X \sim \Cauchy(\text{center}=\mu, \text{scale}=\sigma)\), then \(X\) has probability density \[\begin{aligned} f(x \given \mu, \sigma) = \frac{1}{\pi\sigma\left( 1 + \left( \frac{x - \mu}{\sigma} \right)^2 \right)} . \end{aligned}\]
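As a quick sanity check, the Cauchy density can be written out by hand and compared with R's built-in `dcauchy()`. This is a minimal sketch: the function name `cauchy_density` and the test values (center 1, scale 2) are illustrative choices, and the formula assumes the standard normalization with a factor of \(1/\sigma\).

```r
# Cauchy density with center mu and scale sigma, written out by hand.
# The 1/sigma factor is what makes the density integrate to 1.
cauchy_density <- function(x, mu = 0, sigma = 1) {
    1 / (pi * sigma * (1 + ((x - mu) / sigma)^2))
}

x <- seq(-10, 10, length.out = 201)

# Largest disagreement with the built-in density on a grid of values.
max_diff <- max(abs(cauchy_density(x, mu = 1, sigma = 2)
                    - dcauchy(x, location = 1, scale = 2)))

# Total probability, integrated numerically over the whole real line.
total <- integrate(cauchy_density, -Inf, Inf, mu = 1, sigma = 2)$value
```

If the two densities agree, `max_diff` is zero up to floating-point error and `total` is close to 1.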
The Cauchy is a good example of a distribution with “heavy tails”: rare, very large values.
If \(X \sim \Cauchy(0,1)\), then \(X\) has a Student’s \(t\) distribution with \(\text{df}=1\).
If \(Z \sim \Normal(0, 1)\) and, given \(Z\), \(X \sim \Normal(0,1/|Z|)\) (mean 0, standard deviation \(1/|Z|\)), then \(X \sim \Cauchy(0,1)\).
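If \(X_1, X_2, \ldots, X_n\) are independent \(\Cauchy(0,1)\) then \(\max(X_1, \ldots, X_n)\) is typically of order \(n\).
If \(X_1, X_2, \ldots, X_n\) are independent \(\Cauchy(0,1)\) then \(\frac{1}{n}(X_1 + \cdots + X_n) \sim \Cauchy(0,1)\).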
Wait, what?!?
A single value has the same distribution as the mean of 1,000 of them?
Let’s look:
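# Compare the distribution of single draws with that of sample means.
# rf: function that draws random values (e.g., rnorm or rcauchy)
# n: number of values averaged per mean; m: number of means (and of single draws)
meanplot <- function (rf, n=1e3, m=100) {
    x <- matrix(rf(n*m), ncol=m)    # each column is one sample of size n
    xm <- colMeans(x)               # m sample means
    layout(t(1:2))
    # left panel: m single values, showing only those in (-5, 5)
    hist(x[1,][abs(x[1,])<5], breaks=20, freq=FALSE,
         main=sprintf("%d samples", m), xlab='value',
         xlim=c(-5,5))
    # right panel: m means of n values each, showing only those in (-5, 5)
    hist(xm[abs(xm)<5], breaks=20, freq=FALSE,
         main=sprintf("%d means of %d each", m, n), xlab='value',
         xlim=c(-5,5))
}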
\(X \sim \Normal(0,1)\)
\(X \sim \Cauchy(0,1)\)
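The same comparison can be made numerically, without plots. The sketch below uses base R only; the helper `iqr_of_means` and the sample sizes are illustrative choices, and the mixture check assumes that the second argument of \(\Normal\) is a standard deviation. It measures spread with the interquartile range (IQR), since the Cauchy has no finite variance.

```r
set.seed(1)
n <- 1000   # values averaged per mean
m <- 1000   # number of means

# IQR of m sample means, each of n draws from the sampler rf.
iqr_of_means <- function(rf, n, m) {
    x <- matrix(rf(n * m), ncol = m)   # each column is one sample
    IQR(colMeans(x))
}

# IQR of a single Cauchy(0,1) value: qcauchy(0.75) = 1, so this equals 2.
iqr_single_cauchy <- qcauchy(0.75) - qcauchy(0.25)

# Means of Normal draws concentrate; means of Cauchy draws do not.
iqr_norm_means <- iqr_of_means(rnorm, n, m)
iqr_cauchy_means <- iqr_of_means(rcauchy, n, m)

# Largest of n Cauchy draws, divided by n, over m replicates.
max_over_n <- replicate(m, max(rcauchy(n)) / n)

# Normal with a random standard deviation 1/|Z| (assumed parameterization).
z <- rnorm(10 * n)
x_mix <- rnorm(10 * n, mean = 0, sd = 1 / abs(z))
iqr_mix <- IQR(x_mix)
```

Under these assumptions, `iqr_norm_means` should be far smaller than the IQR of a single standard Normal value, while `iqr_cauchy_means` and `iqr_mix` should both sit near `iqr_single_cauchy`. The values in `max_over_n` should stay of order one, which is what "the maximum is of order \(n\)" means.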
Suppose you are measuring relative metabolic rates of mice in the wild. Because life is complicated, the accuracy of your measurements varies widely. A model of the measured rate, \(R_i\), for a mouse at temperature \(T_i\) is \[\begin{aligned} R_i &\sim \Normal(120 + 0.7 (T_i - 37), 1/|E_i|) \\ E_i &\sim \Normal(0, 1) , \end{aligned}\] where the second argument of \(\Normal\) is the standard deviation.
Simulate 200 measurements from this model, for temperatures between 36 and 38, and try to infer the true slope (0.7).
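One possible way to start, as a sketch rather than the intended solution: simulate from the model, then compare ordinary least squares with a maximum likelihood fit that assumes Cauchy noise. The variable names, the uniform choice of temperatures, and the use of `optim()` are all assumptions made here for illustration. The simulation also assumes the second argument of \(\Normal\) is a standard deviation, in which case the noise is \(\Cauchy(0,1)\).

```r
set.seed(42)
n <- 200
temp <- runif(n, 36, 38)    # temperatures between 36 and 38 (uniform: an assumption)
E <- rnorm(n)               # E_i ~ Normal(0, 1)
# R_i ~ Normal(120 + 0.7 (T_i - 37), sd = 1/|E_i|)
rate <- rnorm(n, mean = 120 + 0.7 * (temp - 37), sd = 1 / abs(E))

# Ordinary least squares: the heavy-tailed noise can throw this off badly.
ls_fit <- lm(rate ~ I(temp - 37))
slope_ls <- unname(coef(ls_fit)[2])

# Maximum likelihood with Cauchy noise.
# Parameters: intercept offset from the median, slope, log(scale).
center <- median(rate)
neg_loglik <- function(par) {
    mu <- center + par[1] + par[2] * (temp - 37)
    -sum(dcauchy(rate, location = mu, scale = exp(par[3]), log = TRUE))
}
ml_fit <- optim(c(0, 0, 0), neg_loglik)   # Nelder-Mead by default
slope_ml <- ml_fit$par[2]
```

Comparing `slope_ls` and `slope_ml` across several seeds shows how much each estimate moves around the true value of 0.7. The intercept is parameterized relative to the median so that the optimizer starts close to a sensible fit.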