Homework 5: The Mystery Coins

\[ %% % Add your macros here; they'll be included in pdf and html output. %% \newcommand{\R}{\mathbb{R}} % reals \newcommand{\E}{\mathbb{E}} % expectation \renewcommand{\P}{\mathbb{P}} % probability \DeclareMathOperator{\logit}{logit} \DeclareMathOperator{\logistic}{logistic} \DeclareMathOperator{\sd}{sd} \DeclareMathOperator{\var}{var} \DeclareMathOperator{\cov}{cov} \DeclareMathOperator{\cor}{cor} \DeclareMathOperator{\Normal}{Normal} \DeclareMathOperator{\LogNormal}{logNormal} \DeclareMathOperator{\Poisson}{Poisson} \DeclareMathOperator{\Beta}{Beta} \DeclareMathOperator{\Binom}{Binomial} \DeclareMathOperator{\Gam}{Gamma} \DeclareMathOperator{\Exp}{Exponential} \DeclareMathOperator{\Cauchy}{Cauchy} \DeclareMathOperator{\Unif}{Unif} \DeclareMathOperator{\Dirichlet}{Dirichlet} \DeclareMathOperator{\Wishart}{Wishart} \DeclareMathOperator{\StudentsT}{StudentsT} \DeclareMathOperator{\Weibull}{Weibull} \newcommand{\given}{\;\vert\;} \]

Assignment: Your task is to use Rmarkdown to write a short report, readable by a technically literate person. The code you used should not be visible in the final report (unless you have a good reason to show it).

Due: Submit your work via Canvas by the end of the day (midnight) on Thursday, November 18th. Please submit both the Rmd file and the resulting html or pdf file. You can work with other members of class, but I expect each of you to construct and run all of the scripts yourself.

The problem.

I have a sizeable collection of old coins, some of strange shapes. (Not really, but imagine.) I have selected 100 of these and flipped each one some number of times: some more, some less. The resulting data is in the file coin_flips.txt, with one row per coin. Each coin has in intrinsic probability, \(\theta\), with which it comes up Heads; let’s assume that the distribution of these probabilities across my collection of coins is close to a Beta distribution. This means that the number of Heads in each row of the data file has a Beta-Binomial distribution. The overall goal is to infer this distribution of \(\theta\) values, i.e., the values of \(\alpha\) and \(\beta\) in that distribution.

If a given coin has a probability of Heads drawn from a Beta-Binomial(\(\alpha\), \(\beta\)) distribution, then (consulting Wikipedia, the likelihood of getting \(k\) Heads in \(n\) flips is \[ \binom{n}{k}\frac{B(k + \alpha, n - k + \beta)}{B(\alpha, \beta)}, \] where \(B(a, b)\) is the Beta function. It’s easier to work with log-likelihoods, and the logarithm of this likelihood can be computed in R as:

This is the likelihood for a single coin that was flipped \(n\) times getting \(k\) heads, to get the likelihood for an entire dataset of coins you need to add up this log-likelihood across all coins.

Note: Here is example code to make a contour plot, so you know what sort of picture I have in mind for (2). You may of course do this any way you want.

# we'll be plotting a hill with the top at (a, b)
a <- 1.5
b <- 2.3
f <- function (x, y) { - (x - a)^2 - (y - b)^2 }

nx <- 31
xvals <- seq(0, 4, length.out=nx)
yvals <- seq(0, 4, length.out=nx)
xy <- expand.grid(
        x = xvals,
        y = yvals
)
xy$f <- NA
for (j in 1:nrow(xy)) {
    xy$f[j] <- f(xy$x[j], xy$y[j])
}

v2m <- function(v) { matrix(v, nrow=nx) }

image(xvals, yvals, v2m(xy$f))
contour(xvals, yvals, v2m(xy$f), add=TRUE)
max_ij <- arrayInd(which.max(xy$f), dim(v2m(xy$f)))
points(xvals[max_ij[1]], yvals[max_ij[2]], pch="*", cex=3)
legend('topright', pch='*', legend='maximum value')