R?
Rstudio
R analyses in polished RMarkdown files
R resources
R installed
Rstudio environment by locating the following features:
Rstudio by clicking the top left icon - open a new R script.
RMarkdown
R code into descriptive files to keep your life organized
R chunks into Rmarkdown documents
knit button to render markdown
> "You know the greatest danger facing us is ourselves, an irrational fear of the unknown.
But there’s no such thing as the unknown — only things temporarily hidden, temporarily not understood."
>
> --- Captain James T. Kirk
- list_element
    - sub_list_element #double tab to indent
    - sub_list_element #double tab to indent
    - sub_list_element #double tab to indent
- list_element
    - sub_list_element #double tab to indent
# note the space after each dash - this is important!
RMarkdown Files and Rmarkdown Advanced
code chunks in RMarkdown
# symbols
[]
R follows the normal priority of mathematical evaluation (PEMDAS)
Input code chunk and then output
## [1] 16
Input code chunk and then output
## [1] 16
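The chunks that produced the outputs above are not shown, so the following is only a minimal sketch of how R applies the usual order of operations. The variable names are chosen for illustration.

```r
# Multiplication is evaluated before addition
no_parens <- 2 + 3 * 4        # 2 + 12 = 14

# Parentheses override the default order
with_parens <- (2 + 3) * 4    # 5 * 4 = 20

# Exponents are evaluated before multiplication
power_first <- 2^3 * 2        # 8 * 2 = 16

# Exponentiation also binds tighter than a leading minus sign
negative_square <- -2^2       # -(2^2) = -4

print(c(no_parens, with_parens, power_first, negative_square))
```

When in doubt, adding parentheses makes the intended order explicit at no cost.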
<- operator (better than =).
R is case sensitive.
## [1] 6
## [1] 4
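A short sketch of assignment and case sensitivity follows. The object names `my_value` and `My_Value` are hypothetical and only serve to show that R treats them as different objects.

```r
# Assign with the arrow operator (the conventional choice in R)
my_value <- 6

# The equals sign also assigns at the top level, but is less conventional
other_value = 4

# R is case sensitive: My_Value was never created, so looking it up fails.
# tryCatch() turns the error into a message instead of stopping the script.
lookup <- tryCatch(
  My_Value,
  error = function(e) "object not found"
)

print(my_value)
print(other_value)
print(lookup)
```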
These do not work
## [1] 14
## [1] 144
## [1] 2.484907
log is a built-in function of R, and therefore the object of the function needs to be put in parentheses
arguments in the parentheses after the function
print command
## [1] 67
## [1] 69022864
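As a rough illustration of calling functions with arguments in parentheses, the sketch below assumes `x` holds 12, which is consistent with the `2.484907` output earlier but is not confirmed by the text.

```r
# Assumed value; log(12) matches the 2.484907 shown in the output above
x <- 12

# The object of the function goes inside the parentheses
log_x <- log(x)

# Additional arguments follow, separated by commas
log_base10 <- log(100, base = 10)   # logarithm of 100 in base 10
rounded <- round(log_x, digits = 2) # keep two decimal places

# print() displays a value explicitly, which is useful inside scripts
print(log_x)
print(log_base10)
print(rounded)
```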
c stands for concatenate
## [1] "I Love"
## [1] "Biostatistics"
## [1] "I Love" "Biostatistics"
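The three outputs above are consistent with two character objects being joined by `c()`. The sketch below is one way to reproduce them; the names `y` and `xy` are assumptions.

```r
# Two single character values
x <- "I Love"
y <- "Biostatistics"   # hypothetical name for the second object

# c() concatenates its arguments into one vector
xy <- c(x, y)

print(x)
print(y)
print(xy)          # a character vector with two elements
print(length(xy))  # 2
```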
R thinks in terms of vectors
R user to try to write scripts with that in mind
c() function and then entering the exact values with commas separating each element.
## [1]  2  3  4  2  1  2  4  5 10  8  9
## [1]  5  6  7  5  4  5  7  8 13 11 12
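The second output above is the first with 3 added to every element, which is what vectorized arithmetic produces. A minimal sketch, assuming the vector is called `n` as in the function list later in this section:

```r
# Build a vector by listing the exact values, separated by commas
n <- c(2, 3, 4, 2, 1, 2, 4, 5, 10, 8, 9)

# Arithmetic applies to every element at once; no loop is needed
n_plus_3 <- n + 3

print(n)
print(n_plus_3)
```

Writing operations on whole vectors like this is usually shorter and faster than looping over elements.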
x is now what is called a character vector (“I Love”).
factors, and we can redefine our character variables as factors.
## [1] I Love
## Levels: I Love
R “sees” a variable using str() or class() functions.
## chr "I Love"
## [1] "character"
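The outputs above can be reproduced with something like the following sketch. The name `x_factor` is an assumption; the original chunk may have overwritten `x` instead.

```r
# A character value
x <- "I Love"

# Inspect how R sees it
str(x)                      # prints: chr "I Love"
x_class <- class(x)         # "character"

# Redefine the character variable as a factor
x_factor <- as.factor(x)
factor_class <- class(x_factor)    # "factor"
factor_levels <- levels(x_factor)  # "I Love"

print(x_factor)
print(x_class)
print(factor_class)
```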
int stands for integers
dbl stands for doubles, or real numbers (or num)
chr stands for character vectors, or strings
dttm stands for date-times (a date + a time)
lgl stands for logical, vectors that contain only TRUE or FALSE
fctr stands for factors, which R uses to represent categorical variables with fixed possible values
date stands for dates
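To connect the labels above to actual objects, the sketch below checks a few example values with `typeof()` and `class()`. The example values are arbitrary.

```r
# typeof() reports the storage type of a value
type_int <- typeof(5L)        # "integer" (the L suffix forces an integer)
type_dbl <- typeof(5)         # "double"
type_chr <- typeof("I Love")  # "character"
type_lgl <- typeof(TRUE)      # "logical"

# class() reports how R treats richer objects
class_fct  <- class(factor(c("wet", "dry")))                        # "factor"
class_date <- class(as.Date("2020-01-31"))                          # "Date"
class_dttm <- class(as.POSIXct("2020-01-31 12:00:00", tz = "UTC"))  # "POSIXct" "POSIXt"

print(c(type_int, type_dbl, type_chr, type_lgl))
print(class_fct)
print(class_date)
print(class_dttm)
```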
FALSE
TRUE
NA which is ‘not available’ and is the default coding for missing data in R
R numbers are doubles by default.
NA
NaN which is ‘not a number’
Inf
-Inf
Many functions exist to operate on vectors.
mean(n)
median(n)
var(n)
log(n)
exp(n)
sqrt(n)
sum(n)
length(n)
sample(n, replace = T) #has an additional argument (replace=T)
?? from functions within packages).
R and it is easy enough to write your own functions if none already exist to do what you want to do.
seq
sample
## [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3
## [15] 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7
## [29] 2.8 2.9 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 4.1
## [43] 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5.0 5.1 5.2 5.3 5.4 5.5
## [57] 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9
## [71] 7.0 7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 8.0 8.1 8.2 8.3
## [85] 8.4 8.5 8.6 8.7 8.8 8.9 9.0 9.1 9.2 9.3 9.4 9.5 9.6 9.7
## [99] 9.8 9.9 10.0
## [1] 10.0 9.9 9.8 9.7 9.6 9.5 9.4 9.3 9.2 9.1 9.0 8.9 8.8 8.7
## [15] 8.6 8.5 8.4 8.3 8.2 8.1 8.0 7.9 7.8 7.7 7.6 7.5 7.4 7.3
## [29] 7.2 7.1 7.0 6.9 6.8 6.7 6.6 6.5 6.4 6.3 6.2 6.1 6.0 5.9
## [43] 5.8 5.7 5.6 5.5 5.4 5.3 5.2 5.1 5.0 4.9 4.8 4.7 4.6 4.5
## [57] 4.4 4.3 4.2 4.1 4.0 3.9 3.8 3.7 3.6 3.5 3.4 3.3 3.2 3.1
## [71] 3.0 2.9 2.8 2.7 2.6 2.5 2.4 2.3 2.2 2.1 2.0 1.9 1.8 1.7
## [85] 1.6 1.5 1.4 1.3 1.2 1.1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3
## [99] 0.2 0.1 0.0
## [1] 100.00 98.01 96.04 94.09 92.16 90.25 88.36 86.49 84.64 82.81
## [11] 81.00 79.21 77.44 75.69 73.96 72.25 70.56 68.89 67.24 65.61
## [21] 64.00 62.41 60.84 59.29 57.76 56.25 54.76 53.29 51.84 50.41
## [31] 49.00 47.61 46.24 44.89 43.56 42.25 40.96 39.69 38.44 37.21
## [41] 36.00 34.81 33.64 32.49 31.36 30.25 29.16 28.09 27.04 26.01
## [51] 25.00 24.01 23.04 22.09 21.16 20.25 19.36 18.49 17.64 16.81
## [61] 16.00 15.21 14.44 13.69 12.96 12.25 11.56 10.89 10.24 9.61
## [71] 9.00 8.41 7.84 7.29 6.76 6.25 5.76 5.29 4.84 4.41
## [81] 4.00 3.61 3.24 2.89 2.56 2.25 1.96 1.69 1.44 1.21
## [91] 1.00 0.81 0.64 0.49 0.36 0.25 0.16 0.09 0.04 0.01
## [101] 0.00
## [1] 100.00 98.01 96.04 94.09 92.16 90.25 88.36 86.49 84.64 82.81
## [11] 81.00 79.21 77.44 75.69 73.96 72.25 70.56 68.89 67.24 65.61
## [21] 64.00 62.41 60.84 59.29 57.76 56.25 54.76 53.29 51.84 50.41
## [31] 49.00 47.61 46.24 44.89 43.56 42.25 40.96 39.69 38.44 37.21
## [41] 36.00 34.81 33.64 32.49 31.36 30.25 29.16 28.09 27.04 26.01
## [51] 25.00 24.01 23.04 22.09 21.16 20.25 19.36 18.49 17.64 16.81
## [61] 16.00 15.21 14.44 13.69 12.96 12.25 11.56 10.89 10.24 9.61
## [71] 9.00 8.41 7.84 7.29 6.76 6.25 5.76 5.29 4.84 4.41
## [81] 4.00 3.61 3.24 2.89 2.56 2.25 1.96 1.69 1.44 1.21
## [91] 1.00 0.81 0.64 0.49 0.36 0.25 0.16 0.09 0.04 0.01
## [101] 0.00
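The vector functions and sequences above can be sketched as follows. The definitions of `seq_2`, `seq_square`, and `seq_square_new` are assumptions inferred from the printed outputs, and `std_error()` is a hypothetical helper included only to show how a user-written function looks.

```r
# Summary functions operate on the whole vector
n <- c(2, 3, 4, 2, 1, 2, 4, 5, 10, 8, 9)
n_sum    <- sum(n)
n_length <- length(n)
n_median <- median(n)

# A missing value propagates unless it is explicitly removed
mean_with_na <- mean(c(n, NA))                # NA
mean_no_na   <- mean(c(n, NA), na.rm = TRUE)  # same as mean(n)

# sample() takes an extra argument; replace = TRUE resamples with replacement
set.seed(1)  # makes the random draw repeatable
resampled <- sample(n, replace = TRUE)

# seq() builds regular sequences; a negative step counts down
seq_1 <- seq(0, 10, by = 0.1)
seq_2 <- seq(10, 0, by = -0.1)

# Two equivalent ways to square every element
seq_square     <- seq_2^2
seq_square_new <- seq_2 * seq_2

# Writing your own function (hypothetical example: standard error of the mean)
std_error <- function(values) {
  sd(values) / sqrt(length(values))
}
n_se <- std_error(n)

print(c(n_sum, n_length, n_median))
print(n_se)
```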
Complete Exercises 1.4-1.7
x <- rnorm(n = 10000, mean = 0, sd = 10)
y <- sample(1:10000, 10000, replace = TRUE)
xy <- cbind(x, y)
plot(xy)
dnorm() generates the probability density, which can be plotted using the curve() function.
add=TRUE
x <- rnorm(1000, 0, 100)
hist(x, xlim = c(-500, 500))
curve(50000 * dnorm(x, 0, 100), xlim = c(-500, 500), add = TRUE,
    col = "Red")
R Functions
R can guess what you mean because of order…
## [1] 5.7478597 -14.7850405 0.7835355 -10.0918965 11.9909998
## [6] 2.2570687 15.9292746 3.9519431 -8.4260325 -4.0817148
R and get something you really didn’t want…
## [1] 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000
R Functions
## [1] 6.869129 10.663631 5.367006 19.060287 10.631596 13.703436 5.277918
## [8] 4.030967 11.677516 7.926794
## [1] 6.869129 10.663631 5.367006 19.060287 10.631596 13.703436 5.277918
## [8] 4.030967 11.677516 7.926794
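The calls behind the outputs above are not shown. The sketch below is a guess that is consistent with them, and illustrates matching arguments by position versus by name in `rnorm()`.

```r
# Positional matching: R assumes the order n, mean, sd
set.seed(42)
by_position <- rnorm(10, 0, 10)

# Named arguments can be given in any order and give the same result
set.seed(42)
by_name <- rnorm(sd = 10, mean = 0, n = 10)

same_result <- identical(by_position, by_name)

# Unnamed arguments in the wrong order still run, but mean something else:
# here R reads mean = 1000 and sd = 0, so every draw is exactly 1000
wrong_order <- rnorm(10, 1000, 0)

print(same_result)
print(wrong_order)
```

Naming arguments costs a few keystrokes and protects against this kind of silent mistake.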
hist function.
plot function (as well as a number of more sophisticated plotting functions).
high level plotting function, which sets the stage
Low level plotting functions will tweak the plots and make them beautiful
seq_1 <- seq(0, 10, by = 0.1)
plot(seq_1, xlab = "space", ylab = "function of space", type = "p",
    col = "red")
par(mfrow = c(2, 2))
plot(seq_1, xlab = "time", ylab = "p in population 1", type = "p",
    col = "red")
plot(seq_2, xlab = "time", ylab = "p in population 2", type = "p",
    col = "green")
plot(seq_square, xlab = "time", ylab = "p2 in population 2",
    type = "p", col = "blue")
plot(seq_square_new, xlab = "time", ylab = "p in population 1",
    type = "l", col = "yellow")
Complete Exercises 1.8-1.9
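The split between high-level and low-level plotting functions mentioned in this part can be sketched as below. The data and labels are illustrative assumptions; only base graphics functions are used.

```r
# Illustrative data: a sequence and its square
seq_1 <- seq(0, 10, by = 0.1)
seq_square <- seq_1^2

# Every 10th point, to be highlighted later
marked <- seq(1, length(seq_1), by = 10)

# High-level call: opens the plot and sets up axes and labels.
# type = "n" draws the frame without any data.
plot(seq_1, seq_square, type = "n", xlab = "time", ylab = "value")

# Low-level calls: add to the existing plot
lines(seq_1, seq_square, col = "blue")                        # the curve
points(seq_1[marked], seq_square[marked], col = "red", pch = 16)
abline(h = 50, lty = 2)                                       # dashed reference line
legend("topleft",
       legend = c("curve", "every 10th point"),
       col = c("blue", "red"),
       lty = c(1, NA),
       pch = c(NA, 16))
```

Low-level functions fail if no plot is open, so a high-level call such as `plot()` or `hist()` always comes first.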
R you can generate your own random data set drawn from nearly any distribution very easily.
mydata <- data.frame(habitat, temp, elevation)
row.names(mydata) <- c("Reedy Lake", "Pearcadale", "Warneet",
    "Cranbourne", "Lysterfield", "Red Hill", "Devilbend", "Olinda")
head(mydata)
##             habitat temp elevation
## Reedy Lake    mixed  3.4       0.0
## Pearcadale      wet  3.4       9.2
## Warneet         wet  8.4       3.8
## Cranbourne      wet  3.0       5.0
## Lysterfield     dry  5.6       5.6
## Red Hill        dry  8.1       4.1
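The vectors `habitat`, `temp`, and `elevation` are not defined in the text shown here. The sketch below rebuilds the data frame from the six rows visible in the `head()` output only, so it has six sites rather than the eight named in the original `row.names()` call.

```r
# Values copied from the six rows printed above
habitat   <- factor(c("mixed", "wet", "wet", "wet", "dry", "dry"))
temp      <- c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1)
elevation <- c(0.0, 9.2, 3.8, 5.0, 5.6, 4.1)

# Combine equal-length vectors into columns of a data frame
mydata <- data.frame(habitat, temp, elevation)

# Label each row with its site name
row.names(mydata) <- c("Reedy Lake", "Pearcadale", "Warneet",
                       "Cranbourne", "Lysterfield", "Red Hill")

# Inspect the structure and the first rows
str(mydata)
head(mydata)

# Rows can then be looked up by site name
warneet_temp <- mydata["Warneet", "temp"]
print(warneet_temp)
```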
R is being able to import data from an external source
R look in the PWD, whereas a full path can also be used
write.csv(YourFile, "yourfile.csv", quote = F, row.names = T)
write.table(YourFile, "yourfile.txt", quote = F, row.names = T,
    sep = "\t")
R, that allows you to analyze just a subset of the data.
Complete Exercises 1.10-1.11
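A minimal sketch of writing a file, reading it back, and taking a subset follows. The object name `site_data` and the temporary file path are assumptions made so the example runs anywhere; the three rows reuse values from the data frame printed earlier.

```r
# A small data frame with site names as row names
site_data <- data.frame(
  habitat = c("mixed", "wet", "dry"),
  temp = c(3.4, 8.4, 5.6),
  row.names = c("Reedy Lake", "Warneet", "Lysterfield")
)

# Write to a temporary location; a full path works the same way
# as a file name relative to the working directory
csv_path <- file.path(tempdir(), "site_data.csv")
write.csv(site_data, csv_path, quote = FALSE, row.names = TRUE)

# Read it back, using the first column as row names
reloaded <- read.csv(csv_path, row.names = 1)

# Subset by a condition on a column
warm_sites <- subset(reloaded, temp > 5)

# The same idea with bracket indexing: rows where habitat is "wet"
wet_only <- reloaded[reloaded$habitat == "wet", ]

print(reloaded)
print(warm_sites)
print(wet_only)
```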