The latest version of R installed (R version 3.5.2).
RStudio v.1.1 or better; RStudio v.1.2 also works (note that this version is not officially released yet).
install.packages("knitr")
install.packages("rmarkdown")
install.packages("rticles")
install.packages("tinytex")
tinytex::install_tinytex() #Please run this command as well to install a LaTeX distribution. This may take a few minutes to install (~150MB).
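The installation steps above can also be scripted so that only missing packages are installed. This is a minimal sketch, assuming the four packages listed above; the names `workshop_pkgs`, `is_installed` and `missing_pkgs` are illustrative and not part of the workshop material.

```r
# Packages used in this workshop (taken from the install commands above).
workshop_pkgs <- c("knitr", "rmarkdown", "rticles", "tinytex")

# requireNamespace() returns TRUE when a package is installed and can be loaded.
is_installed <- vapply(workshop_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- workshop_pkgs[!is_installed]

# Install only what is missing, and only in an interactive session
# (an assumption made here to avoid surprise downloads inside scripts).
if (length(missing_pkgs) > 0 && interactive()) {
  install.packages(missing_pkgs)
}

# Named logical vector: one TRUE/FALSE per package.
print(is_installed)
```

The LaTeX distribution still has to be installed separately with `tinytex::install_tinytex()`, as noted above.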
Useful trick: In Tools -> Global Options -> Code -> Display, check “Show whitespace characters”. This will let you see space and newline characters in the editor.
Useful trick: In RStudio -> Preferences -> Appearance, change the editor theme.
Workshop here: https://seb951.github.io/rmarkdown_workshop/Rmarkdown/rmarkdown_main.html
Download the workshop material on GitHub. Unzip it and double-click on the Rmarkdown.Proj file.
Don’t hesitate to comment (new issues) or request changes (pull request).
Follow the .html (web browser) and the .Rmd (R studio) documents. Try and experiment.
~2 hours: Introduction and practice
~10 minutes pause
~1 hour: other formats (.docx, .pdf and Shiny).
Let’s look at a few examples on the Rstudio gallery
Markdown is a lightweight markup language with a plain-text formatting syntax that is easy to read and easy to write. It is designed so that it can be easily converted to HTML and many other formats (e.g. PDF and MS Word .docx).
Like other markup languages (e.g. HTML and LaTeX), it is completely independent of R.
Typically, files have the extension .md.
Look at this example. Examine the html render (GitHub automatically interprets .md files) and the raw file.
An extension of the Markdown syntax that enables R code to be embedded and executed.
Generate fully reproducible reports in different static and dynamic output formats.
Most of these packages are maintained by the R studio team (https://rmarkdown.rstudio.com/, Yihui Xie)
Plain text files that typically have the file extension .Rmd.
Write text & code in R studio.
Knit: The R package rmarkdown feeds the .Rmd file to the R package knitr.
knitr executes the code and creates a new Markdown (.md) document which includes the code and its output.
The .md document is subsequently transformed into .html/.tex/.docx by pandoc. (Note that .tex files need to be transformed by pdflatex into .pdf files. We’ll come back to that later.)
Pandoc is a universal document converter, independent of R.
By default, RStudio comes with rmarkdown, knitr, and pandoc (but not pdflatex).
When you click the Knit button (top left), a document will be generated that includes both the content and the output of the R code within the document. You can also use the render() function.
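The two stages described above (knitr first, then pandoc) can be tried from the console. This is a minimal sketch, assuming knitr and rmarkdown are installed; the file names `pipeline_demo.Rmd` and `pipeline_demo_step1.md` and the chunk label `demo_chunk` are made up for illustration.

```r
# A tiny, hypothetical .Rmd document written to a temporary directory.
rmd_file <- file.path(tempdir(), "pipeline_demo.Rmd")
writeLines(c(
  "---",
  "title: \"Pipeline demo\"",
  "output: html_document",
  "---",
  "",
  "Inline result: `r 10 + 5`",
  "",
  "```{r demo_chunk}",
  "summary(c(1, 2, 3))",
  "```"
), rmd_file)

# Stage 1 only: knitr executes the code and writes a plain Markdown (.md) file.
md_file <- knitr::knit(
  rmd_file,
  output = file.path(tempdir(), "pipeline_demo_step1.md"),
  quiet = TRUE
)

# Both stages: render() knits the document and then calls pandoc.
# This needs pandoc, so it is skipped when pandoc cannot be found.
if (rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, output_format = "html_document", quiet = TRUE)
  print(html_file)
}
```

Opening the intermediate .md file shows the chunk output already inserted as text, which is what pandoc then converts.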
file > new file > R Markdown > HTML
Save it (“myfirstRmarkdown.Rmd”)
Knit
Examine the .html output.
Examine the .Rmd file structure.
Markdown provides an easy way to make standard types of formatted text, like:
italics (*text*) or italics (_text_)
bold (**bold**)
backslash (\) to escape a special character so that it is interpreted as a literal character
“# and space” at the beginning of line for a header level (6 levels, # to ######)
bold italic (_**bold italic**_)
links ([links](https://www.rmarkdown.rstudio.com))
<!-- comments -->
newline (line break): two spaces at the end of the line
paragraph break: two carriage returns
*** for a horizontal line
quoted text (`quoted text`)
> Quoted text: 1st way
> more quoted text
> still more quoted text
Quoted text: 1st way
more quoted text
still more quoted text
`Quoted text: 2nd way`
`more quoted text`
`still more quoted text`
Quoted text: 2nd way
more quoted text
still more quoted text
```
Quoted text: 3rd way
more quoted text
still more quoted text
```
Quoted text: 3rd way
more quoted text
still more quoted text
Species | Counts
--------- | -----
H. sapiens | 24
M. musculus | 442
Species | Counts |
---|---|
H. sapiens | 24 |
M. musculus | 442 |
Write some text now (add italicized/bold text, some URLs, and an itemized list, have fun!).
You can use this wikipedia text and list of roses subgenera as an example to reproduce.
Convert the document to an HTML webpage.
---
title: "Rmarkdown"
author: "Sebastien Renaut"
date: '2018-03-12'
output: html_document
---
The header (metadata) is written in YAML, which stands for “YAML Ain’t Markup Language” (https://en.wikipedia.org/wiki/YAML#History_and_name).
Header specifies configurations (what kind of document will be created, and the options chosen).
It is not required (defaults then apply).
It uses Python-style indentation to specify some options.
Many options are possible, depending on the type of document you are generating. See below for some examples.
Note that some options can be specified for the whole document (in the header), for individual code chunks, or both (chunk options supersede header options). More on code chunks later.
---
title: "Rmarkdown"
author: "Sebastien Renaut"
date: "March 20, 2019"
output:
  html_document:
    code_folding: hide
    highlight: tango
    number_sections: T
    theme: cerulean
    toc: yes
    toc_depth: 3
---
Note the indentation in the .Rmd document for the output options.
Note that date is populated via an R function.
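One common way to populate the date is to place inline R code in the header, for example date: "`r format(Sys.Date(), '%B %d, %Y')`". The exact call used in this workshop is not shown, so the sketch below is an assumption that reproduces the “March 20, 2019” style; `today_string` and `fixed_date_string` are illustrative names.

```r
# Format today's date as "Month day, year". Month names follow the current locale.
today_string <- format(Sys.Date(), "%B %d, %Y")
print(today_string)

# The same pattern applied to a fixed date, to make the format easier to inspect.
fixed_date_string <- format(as.Date("2019-03-20"), "%B %d, %Y")
print(fixed_date_string)
```

Because the code runs each time the document is knitted, the date always reflects the day the report was generated.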
See the official R markdown lessons for more information. But these are some formats of interest:
output: html_document
output: ioslides_presentation
output: pdf_document (this requires a LaTeX distribution to be installed; we’ll get to that later).
output: word_document (.docx)
interactive shiny apps (we’ll get to that later as well).
toc: yes
Generates a table of contents.
toc_depth: 3
Depth of headers included in the table of contents.
number_sections: T
Adds section numbering to headers. Note that if you do not want a certain heading to be numbered, you can add {-} or {.unnumbered} after the heading, e.g. `# Preface {-}`.
theme:
specifies the theme to use for the page (“cerulean”, “journal”, “flatly”, “readable”, “spacelab”, “united”, and “cosmo”).
highlight:
Code syntax highlighting style (e.g. “tango”, “pygments”, “kate”, “zenburn”).
code_folding: hide
Code is hidden, but each chunk has its own button for showing or hiding code.
See the cheatsheet and official R markdown book for more options.
Change the theme of your R Markdown document
Change highlighting
Add Table of Content
Save, knit and play with options.
The real power of R Markdown comes from mixing Markdown syntax with chunks of code.
A code chunk is interpreted by knitr. It works essentially the same as the R syntax we are familiar with.
A main code chunk may look like this:
```{r example, include = T, message = T, warning=T, echo = F, fig.cap="A figure of random points"}
#Running some R code.
x = rexp(1000)
min(x)
max(x)
hist(x)
```
## [1] 0.0008266717
## [1] 8.883759
On the 1st line, I specify that the chunk runs the R programming language.
Then, I give the chunk a UNIQUE name and specify options.
There are a large number of chunk options in knitr, documented here.
include = FALSE: code and results will NOT appear in the finished file. The code is still run, and the results can be used by other chunks.
echo = F: prevents code, but not results, from appearing in the finished file. This is a useful way to embed figures.
message = F: prevents messages that are generated by code from appearing in the finished file.
warning = F: prevents warnings that are generated by code from appearing in the finished file.
fig.cap = "...": adds a caption to graphical results.
fig.width=..., fig.height=...: change the figure width/height.
By default R studio creates a Global Options code chunk. Let’s examine this chunk:
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
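The setup chunk above can carry more than one default. This is a minimal sketch, assuming you want code, messages and warnings hidden throughout the document; the specific values (including the figure size) are illustrative choices, not the workshop's settings.

```r
library(knitr)

# Set document-wide defaults; any individual chunk can still override them
# in its own header, e.g. {r my_chunk, echo = TRUE}.
opts_chunk$set(
  echo = FALSE,      # hide code
  message = FALSE,   # hide messages
  warning = FALSE,   # hide warnings
  fig.width = 6,     # default figure width, in inches
  fig.height = 4     # default figure height, in inches
)

# Query a current default to confirm that it was applied.
print(opts_chunk$get("echo"))
```

Setting defaults once in the setup chunk keeps the other chunk headers short, since only the exceptions need to be written out.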
see cheat sheet for more info.
Note that you can also run inline code. For example, ` r 10+5 ` would be processed as 15.
Run inline code.
Can you find options to print code, but not run it?
Also, try clicking the green arrow in the .Rmd on the right to execute a code chunk and preview its output.
R Markdown can read and execute different languages!
## rmarkdown_main.Rmd
## rmarkdown_main.html
## rmarkdown_main.log
## ['hello', 'python!']
## Hello perl!
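The output above comes from chunks whose header names another language instead of r (for example {bash}, {python} or {perl}). The sketch below lists the language engines that knitr knows about; `engines` is an illustrative variable name, and running a chunk in another language also requires the corresponding interpreter to be installed on your machine.

```r
# Names of all language engines registered in knitr.
engines <- sort(names(knitr::knit_engines$get()))
print(engines)

# Check that the languages used in this section are supported.
print(c("bash", "python", "perl") %in% engines)
```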
Inline mathematical material is set off by single dollar-sign characters, as in the LaTeX typesetting language.
So to write \(E = mc^{2}\), you’d write: $E = mc^{2}$
\(\sum_{i=1}^n ASV\)
\(F_{(1,69)}\) = 1.27, p-value=0.26
\(A = \pi*r^{2}\)
\(\sqrt{b^2 - 4ac}\)
If you need to use an actual dollar sign, you need to preface it with a backslash: \$E = mc^{2}\$ is displayed as the literal text $E = mc^{2}$, whereas $E = mc^{2}$ is rendered as \(E = mc^{2}\).
Double dollar signs ($$ ... $$) produce displayed (centered) formulas: \[\sqrt{b^2 - 4ac}\]
See more example equations from this McGill math R Markdown tutorial.
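For reference, the rendered formulas in this section could be typed in the .Rmd file as follows. This is a reconstruction from the rendered output, so the exact source lines used in the workshop may differ slightly.

```latex
$\sum_{i=1}^n ASV$
$F_{(1,69)}$ = 1.27, p-value=0.26
$A = \pi*r^{2}$
$\sqrt{b^2 - 4ac}$
$$\sqrt{b^2 - 4ac}$$
```

The first four lines are inline formulas; the last line, wrapped in double dollar signs, is displayed on its own centered line.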
There are several ways to include figures.
A figure can be included directly from a URL on the web:
![optional caption here](https://upload.wikimedia.org/wikipedia/commons/7/7f/Rosa_persica%3B_Baikonur_01.jpg){width=250px}
If the figure is small, it can be added to the text directly, e.g.: Today, we are using [inline image] to generate webpages with images…
This is an image previously saved in the figures directory
![](../figures/rosa_banksiae.JPG){width=250px}
In all these cases, images are rendered by pandoc and not knitr, so pandoc options need to be specified, rather than knitr/R graphics options:
It’s simple, but options can be tricky.
You may need to play with spacing, figure size, and figure position.
Options are specified directly after the URL or link (eg. {width=250px} or {width=50%}).
Images can also be interpreted by knitr as below:
```{r graphic_example, out.width = "20%", fig.cap = "rosa_banksiae", echo = F,fig.align = "center"}
knitr::include_graphics("../figures/rosa_banksiae.JPG")
```
```{r roses, out.width = "50%",echo = F,out.extra='style="float:right; padding:10px"'}
knitr::include_graphics("../figures/rosa_banksiae.JPG")
```
The genus Rosa is subdivided into four subgenera:
Graphs can also be generated directly by R code, specified in a code chunk (with R options specified in the code chunk) and interpreted by knitr, as we did previously.
```{r another example, echo = F, message = F}
library(ggplot2)
mtcars_ggplot = ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() + geom_smooth()
mtcars_ggplot
```
```{r out.width=c('50%', '50%'), fig.show='hold',echo=F,message = F}
mtcars_ggplot
plot(rnorm(10))
```
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
By default, R Markdown displays data frames and matrices as they would appear in the R terminal.
You can use the knitr::kable function for additional formatting, as in the .Rmd file below.
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#With kable function from knitr (better looking)
knitr::kable(head(mtcars),digits =1,caption = "A motorcars table")
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.9 | 2.6 | 16.5 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.9 | 2.9 | 17.0 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.8 | 2.3 | 18.6 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.1 | 3.2 | 19.4 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.1 | 3.4 | 17.0 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.8 | 3.5 | 20.2 | 1 | 0 | 3 | 1 |
Find a picture on the web. Save it.
Add it to document either directly, or in a code chunk.
Try adjusting size.
Add a table using knitr.
Footnotes: add [^1] in the text, and add the reference at the end using this format: [^1]: Renaut 2019. R Markdown footnote. Number 1. pp1-2.
csl: ../csl/peerj.csl
bibliography: ../biblio/test_library.bib
The Citation Style Language (.csl) file specifies the reference format.
It is an open XML-based language that describes the formatting of citations and bibliographies. Reference management programs using .csl include Zotero, Mendeley and Papers3.
Most journals should have a .csl file on this GitHub repo, but you could also create your own.
@article{altschul1997gapped,
title={Gapped BLAST and PSI-BLAST: a new generation of protein database search programs},
author={Altschul, Stephen F and Madden, Thomas L and Sch{\"a}ffer, Alejandro A and Zhang, Jinghui and Zhang, Zheng and Miller, Webb and Lipman, David J},
journal={Nucleic acids research},
volume={25},
number={17},
pages={3389--3402},
year={1997},
publisher={Oxford University Press}
}
Here, I created a .bib file (../biblio/test_library.bib) in the reference management software Papers3.
I often copy .bib references directly from Google Scholar and add them to a .bib database text file.
The bioinformatics program BLAST (Altschul et al., 1997) has been cited nearly 70,000 times. These are three random references (Thibert-Plante & Hendry, 2010; Wagner et al., 2012; Yoshida et al., 2014) from my database.
Each citation must have a unique key, composed of ‘@’ + the citation identifier from the .bib database file.
Citations go inside square brackets [ ] and are separated by semicolons (;).
You can also write in-text citations by removing the square brackets. For example, Altschul et al. (1997) is cited a lot.
A minus sign (-) before the @ will suppress mention of the author in the citation. This can be useful when the author is already mentioned in the text. For example, the BLAST algorithm by Stephen Altschul and a bunch of other people (1997) has been cited 70,000 times.
By default, references are added at the end of the document. Use the code <div id="refs"></div> to place references elsewhere.
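The rendered sentences above hide the citation syntax that produced them. The sketch below writes a small, hypothetical .Rmd file showing the three forms described in this section. Only the key altschul1997gapped comes from the .bib entry shown earlier; `another_key`, the file name `citation_demo.Rmd` and the example sentences are placeholders.

```r
# Lines of a hypothetical .Rmd file illustrating the citation syntax.
citation_demo <- c(
  "---",
  "bibliography: ../biblio/test_library.bib",
  "csl: ../csl/peerj.csl",
  "---",
  "",
  "BLAST [@altschul1997gapped] has been cited nearly 70,000 times.",
  "",
  "@altschul1997gapped is cited a lot.",
  "",
  "The BLAST algorithm by Stephen Altschul and others [-@altschul1997gapped] is cited a lot.",
  "",
  "Several keys are separated by semicolons: [@altschul1997gapped; @another_key]."
)

# Write the file to a temporary directory and print it back for inspection.
demo_file <- file.path(tempdir(), "citation_demo.Rmd")
writeLines(citation_demo, demo_file)
cat(readLines(demo_file), sep = "\n")
```

Knitting such a file would also require the .bib and .csl files to exist at the paths given in its header, and every cited key to be present in the .bib file.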
Find 3 papers in Google Scholar. Copy references to a text file (in bibTeX format). Save it with a .bib extension.
Find another Citation Style Language from this GitHub repo (e.g. Nature, PLOS ONE, Indian Journal Of Dermatology, etc.). (hint: type ‘t’ in GitHub repo to activate search function). Save as text file (.csl extension) and modify it in the header.
(Note that references below are generated automatically, except for the footnote.)
Altschul SF., Madden TL., Schäffer AA., Zhang J., Zhang Z., Miller W., Lipman DJ. 1997. Gapped blast and psi-blast: A new generation of protein database search programs. Nucleic acids research 25:3389–3402.
Thibert-Plante X., Hendry A. 2010. The consequences of phenotypic plasticity for ecological speciation. Journal Of Evolutionary Biology:1–17.
Wagner CE., Keller I., Wittwer S., Selz OM., Mwaiko S., Greuter L., Sivasundar A., Seehausen O. 2012. Genome-wide RAD sequence data provide unprecedented resolution of species boundaries and relationships in the Lake Victoria cichlid adaptive radiation. 22:787–798.
Yoshida K., Makino T., Yamaguchi K., Shigenobu S., Hasebe M., Kawata M., Kume M., Mori S., Peichel CL., Toyoda A., Fujiyama A., Kitano J. 2014. Sex Chromosome Turnover Contributes to Genomic Divergence between Incipient Stickleback Species. PLoS Genetics 10:e1004223.
Renaut 2019. R Markdown footnote. Number 1. pp1-2