# R

At the risk of sounding like a broken record, you’re going to need three things to write R.

• A computer.
• The `R` interpreter.
• A text editor.

R is a free software environment for statistical computing and graphics. You can download a copy from the Comprehensive R Archive Network (CRAN). Versions are available for Windows, macOS and Unixen. Like Perl, R can run scripts written with a text editor and saved to file; however, it is common to use R through its interactive command-line interface, at the `>` prompt.

Text that should be input into the R command line will look like this, in red:

`t.test( c(1,2,3,4), c(4,6,7,6) )`

In R itself, each input line will be preceded by the `>` prompt, but I have left these off so that you can copy-and-paste input text directly into R and have it run without any modification.

Text that is output from R will look like this, in blue:

```
	Welch Two Sample t-test

data:  c(1, 2, 3, 4) and c(4, 6, 7, 6)
t = -3.6056, df = 5.996, p-value = 0.0113
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -5.455968 -1.044032
sample estimates:
mean of x mean of y
     2.50      5.75
```
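The printed report is only one view of what `t.test` returns. As a minimal sketch (the variable name `result` is my own choice, not something used elsewhere in these posts), the individual numbers can be pulled out of the returned object by name:

```r
# Re-run the test shown above, but keep the result instead of just printing it
result <- t.test(c(1, 2, 3, 4), c(4, 6, 7, 6))

# The printed report is assembled from named components of the object
result$statistic   # the t value
result$parameter   # the Welch-adjusted degrees of freedom
result$p.value     # the p-value
result$conf.int    # the 95 percent confidence interval
result$estimate    # the two sample means
```

This is handy when you want to reuse a p-value or a confidence interval in later calculations rather than copying it from the screen.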

For everyday programming (which, in my job, involves a lot of text-munging), I use Perl, so this tutorial does not cover all the ins-and-outs of R objects, closures, modules, functional programming, etc., as I mostly use R in the limited domain of statistics and graphing. However, it does include some of the theory underlying the statistical tests. Unlike the Perl tutorial, which is more of a reference work, this tutorial is based around a large number of worked exercises, which will hopefully help you practise the material and stretch you beyond it.

1. Running R code. Saving code to file and running it in R. Loading data into R from CSVs. Vectors and functions for manipulating them. `c`, `mean`, `sum`, `max`, `min`, `seq` and `rep`.
2. Kinds of data. Discrete, categorical, (un)bounded, continuous, ordinal, numeric, and friends.
3. Formatting data. Saving data from spreadsheets for import into R. `read`, `data.frame` and `write`.
4. Plotting data. Boxplots and basic graphical parameters in R. `plot`, `boxplot` and the `~` “modelled on” syntax.
5. Descriptive statistics. Measures of central tendency. Measures of data variability. Histograms and degrees of freedom. `hist`, `var`, `sd`, `median` and `summary`.
6. Statistical testing. Fisher p-value and significance, binomial distributions. `binom.test`. Distribution functions: `dbinom`, `pbinom` and `qbinom`.
7. Statistical power. Neyman-Pearson hypothesis testing. Type I and II errors. Random variables from distributions: `rbinom`. Installing packages.
8. The F test. Comparison of the variance of two sets of continuous data. `var.test`, and `rnorm`.
9. The t test. Comparison of the means of two sets of continuous data. `t.test`, `tapply` and `qqnorm`.
10. Linear regression. Correlation of two sets of continuous data. Checking residuals for normality. `lm`, `segments`, `abline`, `predict`, `residuals` and `fitted`.
11. The χ² test. Comparison of expected and observed count data. `chisq.test`, `head`, `str`, `table`, `list`, `matrix` and `dimnames`.
12. Analysis of variance (ANOVA). Basics, and one-way ANOVA. `aov`, `anova` and `TukeyHSD`.
13. Two-way ANOVA. Interaction plots and model simplification. `interaction.plot` and `update`.
14. Nonlinear regression. Non-linear least squares curve fitting. `nls`, `coef`, `log` and `lines`.
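To give a flavour of the functions named in the list, here is a small taster rather than a substitute for the posts themselves. The `heights` vector holds made-up numbers purely for illustration; everything else is base R.

```r
# Vectors and the functions from post 1 (heights are invented example data)
heights <- c(1.62, 1.75, 1.80, 1.68)
mean(heights)
sum(heights)
max(heights)
min(heights)
seq(1, 10, by = 3)            # 1 4 7 10
rep(c("a", "b"), times = 2)   # "a" "b" "a" "b"

# Descriptive statistics from post 5
var(heights)
sd(heights)
median(heights)
summary(heights)

# Distribution functions from post 6:
# probability of exactly 5 heads in 10 tosses of a fair coin (252/1024)
dbinom(5, size = 10, prob = 0.5)
# probability of 5 heads or fewer (638/1024)
pbinom(5, size = 10, prob = 0.5)
```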

The posts above don’t cover every conceivable test, but here’s a handy flowchart that will help you find out what test you actually need. It’s not perfect though: an ANOVA is a perfectly sensible technique for analysing a one-factor two-level design with a continuous response variable with normal errors, but the flowchart will lead you to the t test. There’s rarely a single, correct answer to ‘how should I analyse this data set?’, but there are certainly many answers to that question!
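The point about ANOVA and the t test can be checked directly. The sketch below uses invented data and variable names (`response`, `group`) and assumes the classical equal-variance t test (`var.equal = TRUE`); R's default Welch test will not match the ANOVA exactly unless the two sample variances happen to be equal.

```r
# Invented data: one factor with two levels, continuous response
response <- c(1, 2, 3, 4, 4, 6, 7, 6)
group    <- factor(rep(c("control", "treated"), each = 4))

# Classical (equal-variance) t test, using the ~ "modelled on" syntax
t_result <- t.test(response ~ group, var.equal = TRUE)

# One-way ANOVA on the same data; the first element of the summary is the table
aov_result <- aov(response ~ group)
aov_table  <- summary(aov_result)[[1]]

# With two groups, F is the square of t, and the two p-values agree
t_result$statistic^2
aov_table[["F value"]][1]
t_result$p.value
aov_table[["Pr(>F)"]][1]
```

Either analysis leads to the same conclusion here, which is why the choice between them for a two-level design is largely a matter of taste.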

• • student on 18/07/2017 at 15:49

Please could you do a tutorial for mixed effect models?

• • polypompholyx on 19/07/2017 at 14:28
Author

That’s extremely tempting, but it would probably need me to write at least one, probably two, on generalised linear models first. I’ll keep it in mind, but can’t promise anything!

• • student on 21/07/2017 at 16:26

ooh okay, thank you 🙂

• • Robert Berdan on 26/03/2018 at 02:10

Can I have permission to use your Stentor diagram on my web site http://www.canadiannaturephotographer.com? I am writing an article which will contain my own photomicrographs of Stentors and other ciliates. The site’s purpose is education and inspiration. You are welcome to use some of my photographs on your site if you like.

The article is to be posted in about 1-2 weeks; it is still under development.
Cheers
Robert Berdan
Calgary, Alberta

• • polypompholyx on 27/03/2018 at 19:15
Author

Yes, certainly. All my images and text are released under a Creative Commons Attribution ShareAlike license, so if you’re just planning on using it in a post on your site, you should feel free to reuse them as long as you say where they’re from. I look forward to seeing your micrographs!
