Hypothesis testing and \(p\)-values

Peter Ralph

28 September 2021 – Advanced Biological Statistics

A \(p\)-value is

the probability of seeing a result at least as surprising as what was observed in the data, if the null hypothesis is true.

Usually, this means

a result: the numerical value of a statistic
surprising: big (far from what the null hypothesis predicts)
null hypothesis: the model we use to calculate the \(p\)-value

which can all be defined to suit the situation.

What does a small \(p\)-value mean?

If the null hypothesis were true, then you’d be very unlikely to see a result like the one you actually saw.

So, either the “null hypothesis” is not a good description of reality or something surprising happened.

How useful this is depends on the null hypothesis.

For instance, comparing the prices of instant-bookable Airbnb listings to the rest:

## 
##  Welch Two Sample t-test
## 
## data:  airbnb$price[airbnb$instant_bookable] and airbnb$price[!airbnb$instant_bookable]
## t = 3.6482, df = 5039.8, p-value = 0.0002667
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   4.475555 14.872518
## sample estimates:
## mean of x mean of y 
##  124.6409  114.9668

Also for instance

t.test(airbnb$price)

## 
##  One Sample t-test
## 
## data:  airbnb$price
## t = 91.32, df = 5601, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  116.9734 122.1058
## sample estimates:
## mean of x 
##  119.5396

Is that \(p\)-value useful?

Exercise:

My hypothesis: People tend to have longer index fingers on the hand they write with because writing stretches the ligaments.

(class survey) How many people have a longer index finger on the hand they write with?

(class survey) Everyone flip a coin:

ifelse(runif(1) < 0.5, "H", "T")

We want to estimate the parameter

\[ \theta = \mathbb{P}(\text{a random person's index finger is longer on their writing hand}) , \]

and now we have a fake dataset with \(\theta = 1/2\).

Let’s get some more data:

n <- 37 # class size
sum(ifelse(runif(n) < 1/2, "H", "T") == "H")

Now we can estimate the \(p\)-value for the hypothesis that \(\theta = 1/2\).

A faster method:

replicate(1000, sum(rbinom(n, 1, 1/2) > 0))

or, equivalently,

rbinom(1000, n, 1/2)

(in class)
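One way to turn simulated counts like these into an estimated \(p\)-value is sketched below. The value of `observed` is a made-up placeholder (the real number comes from the class survey), and "surprising" is taken here to mean "at least as far from \(n/2\) as the observed count", which is one reasonable choice rather than the only one.

```r
set.seed(1)        # only so that this sketch is reproducible
n <- 37            # class size, as above
observed <- 24     # HYPOTHETICAL survey result, not real data

# Simulate the number of "longer writing finger" answers if theta = 1/2.
sims <- rbinom(10000, n, 1/2)

# Estimated p-value: the proportion of simulated counts that are at least
# as far from n/2 as the observed count is.
p_sim <- mean(abs(sims - n/2) >= abs(observed - n/2))
p_sim
```

More simulated datasets give a less noisy estimate; a one-sided version would instead use `mean(sims >= observed)`.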

So, where do \(p\)-values come from?

Either math:

Or, computers. (maybe math, maybe simulation, maybe both)
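For the coin-flip null hypothesis \(\theta = 1/2\), the "math" route can be sketched as follows: under the null, the count is Binomial(\(n\), 1/2), so the \(p\)-value is a tail probability of that distribution. The value of `observed` is again a hypothetical placeholder, not a survey result.

```r
n <- 37            # class size, as above
observed <- 24     # HYPOTHETICAL survey result, not real data

# The Binomial(n, 1/2) distribution is symmetric about n/2, so the
# two-sided p-value is twice the upper tail, starting at whichever of
# `observed` and `n - observed` is larger (capped at 1).
k <- max(observed, n - observed)
p_exact <- min(1, 2 * pbinom(k - 1, n, 1/2, lower.tail = FALSE))

# R's built-in exact binomial test should agree with this.
p_builtin <- binom.test(observed, n, p = 1/2)$p.value

c(exact = p_exact, builtin = p_builtin)
```

A simulation-based estimate of the same quantity should land close to these values, with some noise from the finite number of simulations.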

So, where did this \(p\)-value come from?

(tt <- t.test(
        airbnb$price[airbnb$instant_bookable],
        airbnb$price[!airbnb$instant_bookable]
))

## 
##  Welch Two Sample t-test
## 
## data:  airbnb$price[airbnb$instant_bookable] and airbnb$price[!airbnb$instant_bookable]
## t = 3.6482, df = 5039.8, p-value = 0.0002667
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   4.475555 14.872518
## sample estimates:
## mean of x mean of y 
##  124.6409  114.9668

The \(t\) distribution! (see separate slides)
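As a minimal sketch of that connection, the reported \(p\)-value can be recovered from the printed \(t\) statistic and degrees of freedom alone. The numbers below are copied from the output above, so the result should agree with the printed value only up to rounding.

```r
# Values copied from the printed Welch t-test output (already rounded).
t_stat <- 3.6482
df <- 5039.8

# Two-sided p-value: the probability that a t-distributed variable with
# `df` degrees of freedom is at least as far from 0 as t_stat.
p_t <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
p_t
```

With the fitted object in hand, `tt$statistic`, `tt$parameter`, and `tt$p.value` hold the unrounded versions of the same quantities.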