Tutorial
Document Code with R Markdown
Authors:
Last Updated: Apr 8, 2021
You will need to have the rmarkdown and knitr packages installed on your computer prior to completing this tutorial. Refer to the setup materials to get these installed.
Learning Objectives
At the end of this activity, you will:
- Know how to create an R Markdown file in RStudio.
- Be able to write a script with text and R code chunks.
- Create an R Markdown document ready to be ‘knit’ into an HTML document to share your code and results.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.
Install R Packages
- knitr: install.packages("knitr")
- rmarkdown: install.packages("rmarkdown")
- raster: install.packages("raster")
- rgdal: install.packages("rgdal")
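If some of these packages are already on your computer, a quick check can avoid reinstalling them. The following is a minimal sketch rather than part of the setup materials: the helper name `find_missing` is hypothetical, and the `install.packages()` call is left commented out so that nothing is installed until you choose to run it.

```r
# Packages listed in this tutorial.
required <- c("knitr", "rmarkdown", "raster", "rgdal")

# Hypothetical helper: return the packages that cannot be loaded.
find_missing <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

not_installed <- find_missing(required)

if (length(not_installed) > 0) {
  message("Not yet installed: ", paste(not_installed, collapse = ", "))
  # install.packages(not_installed)  # uncomment to install them
} else {
  message("All required packages are installed.")
}
```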
Download Data
Download NEON Teaching Data Subset: TEAK-Data Institute 2016
The LiDAR and imagery data used to create this raster teaching data subset were collected over the National Ecological Observatory Network's (NEON) Lower Teakettle field site and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Data Portal.
You will want to create a data directory for all the Data Institute teaching datasets. We suggest the pathway be ~/Documents/data/NEONDI-2016 or the equivalent for your operating system. Once you've downloaded and unzipped the dataset, move it to this directory.
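The data directory can also be created from R. This is a small sketch that assumes the suggested path above; `ensure_dir` is a hypothetical helper name, and you may need to adjust the path for your operating system.

```r
# Suggested location from this tutorial; adjust for your operating system.
data_dir <- "~/Documents/data/NEONDI-2016"

# Hypothetical helper: create a directory (and any missing parent
# directories) if needed, then report whether it exists.
ensure_dir <- function(path) {
  path <- path.expand(path)
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE, showWarnings = FALSE)
  }
  dir.exists(path)
}

ensure_dir(data_dir)
```

The function returns TRUE when the directory exists afterwards, so it is safe to run more than once.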
Additional Resources
- R Markdown Cheatsheet: a very handy reference for using R Markdown
- R Markdown Reference Guide: a more extensive reference for R Markdown
- Introduction to R Markdown by Garrett Grolemund: a tutorial for learning R Markdown
Create an Rmd File
RMarkdown in RStudio Video
Our goal in this series is to document our workflow. We can do this by
- Creating an R Markdown (RMD) file in RStudio and
- Rendering that RMD file to HTML using knitr.
Watch the 6:38 video below to learn more about how you can convert an R Markdown file to HTML (or other formats) using knitr in RStudio.
The text size in the video is small so you may want to watch the video in
full screen mode.
Now that you have a sense of how R Markdown can be used in RStudio, you are ready to create your own RMD document. Do the following:
- Create a new R Markdown file and choose HTML as the desired output format.
- Enter a Title (Explore NEON LiDAR Data) and Author Name (your name). Then click OK.
- Save the file using the following format: LastName-institute-week3.rmd NOTE: The document title is not the same as the file name.
- Hit the knit button in RStudio (as is done in the video above). What happens?
If everything went well, you should have an HTML format (web page) output after you hit the knit button. Note that this HTML output is built from a combination of code and documentation that was written using markdown syntax.
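The Knit button in RStudio calls `rmarkdown::render()`, so the same step can be run from the console. The sketch below is illustrative only: it writes a tiny placeholder document to a temporary folder (the contents and chunk name are made up for the example) and renders it only if rmarkdown and pandoc are available on your machine.

```r
fence <- strrep("`", 3)  # three backticks, built here to keep the example tidy

# A minimal placeholder document: YAML header, markdown text, one code chunk.
rmd_lines <- c(
  "---",
  'title: "Explore NEON LiDAR Data"',
  'author: "Your Name"',
  "output: html_document",
  "---",
  "",
  "Some text written in markdown.",
  "",
  paste0(fence, "{r simple-sum}"),
  "1 + 2",
  fence
)

rmd_file <- file.path(tempdir(), "LastName-institute-week3.Rmd")
writeLines(rmd_lines, rmd_file)

# Render only when rmarkdown and pandoc are available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, output_format = "html_document",
                                 quiet = TRUE)
  message("HTML written to: ", html_file)
} else {
  message("rmarkdown or pandoc not found; skipping the render step.")
}
```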
Next, we'll break down the structure of an R Markdown file.
Understand Structure of an R Markdown file
Let's next review the structure of an R Markdown (.Rmd) file. There are three main content types:
- Header: the text at the top of the document, written in YAML format.
- Markdown sections: text that describes your workflow written using markdown syntax.
- Code chunks: Chunks of R code that can be run and also can be rendered using knitr to an output document.
Next let's explore each section type.
Header -- YAML block
An R Markdown file always starts with a header written using YAML syntax. There are four default elements in the RStudio-generated YAML header:
- title: the title of your document. Note, this is not the same as the file name.
- author: who wrote the document.
- date: by default this is the date that the file is created.
- output: what format will the output be in. We will use HTML.
A YAML header may be structured differently depending upon how you are using it. Learn more on the R Markdown documentation page.
- Title: Provide a title that fits the code that will be in your RMD.
- Author: Add your name here.
- Output: Leave the default output setting: html_document. We will be rendering an HTML file.
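To see how the header is set apart from the rest of the file, the following sketch pulls the YAML block out of a few example lines. The field values are placeholders and `extract_yaml_header` is a hypothetical helper written for this illustration; it is not part of knitr or rmarkdown.

```r
# Example lines of an R Markdown file; the values are placeholders.
rmd_lines <- c(
  "---",
  'title: "Explore NEON LiDAR Data"',
  'author: "Your Name"',
  'date: "April 8, 2021"',
  "output: html_document",
  "---",
  "",
  "Text written in markdown starts here."
)

# Hypothetical helper: return the lines between the first two "---"
# delimiters, or an empty vector if the file does not start with a header.
extract_yaml_header <- function(lines) {
  delimiters <- which(trimws(lines) == "---")
  if (length(delimiters) < 2 || delimiters[1] != 1) {
    return(character(0))
  }
  if (delimiters[2] - delimiters[1] < 2) {
    return(character(0))
  }
  lines[(delimiters[1] + 1):(delimiters[2] - 1)]
}

header <- extract_yaml_header(rmd_lines)

# Field names are the text before the first colon on each line.
sub(":.*$", "", header)
# "title" "author" "date" "output"
```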
R Markdown Text/Markdown Blocks
An RMD document contains a mixture of code chunks and markdown blocks where you can describe aspects of your processing workflow. The markdown blocks use the same markdown syntax that we learned in the week 2 materials. In these blocks you might describe the data that you are using, how it's being processed, and what the outputs are. You may even add some information that interprets the outputs.
When you render your document to HTML, this markdown will appear as text on the output HTML document.
Learn More about RStudio Markdown Basics
Explore Your R Markdown File
Look closely at the pre-populated markdown and R code chunks in your RMD file.
Does any of the markdown syntax look familiar?
- Are any words in bold?
- Are any words in italics?
- Are any words highlighted as code?
If you are unsure, the answers are at the bottom of this page.
- Remove the template markdown and code chunks added to the RMD file by RStudio. (Be sure to keep the YAML header!)
- At the very top of your RMD document - after the YAML header, add the bio and short research description that you wrote last week in markdown syntax to the RMD file.
- Between your profile and the research descriptions, add a header that says About My Project (or something similar).
- Add a new header stating R Markdown Activity and text below that explaining that this page demonstrates using some of the NEON Teakettle LiDAR data products in R. The wording of this text should clearly describe the code and outputs that you will be adding to the page.
Code chunks
Code chunks are where your R code goes. All code chunks start and end with three backticks (also called graves). On your keyboard, the backtick can be found on the same key as the tilde. A backtick is not the same as an apostrophe!
The initial line of a code chunk must appear as:
```{r chunk-name-with-no-spaces}
# code goes here
```
The r part of the chunk header identifies this chunk as an R code chunk and is mandatory. Next to the {r, there is a chunk name. This name is not required for basic knitting; however, it is good practice to give each chunk a unique name, as it is required for more advanced knitting approaches.
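One way to check that chunk names are unique is to scan the chunk headers in a file. The sketch below works on a few example lines; `chunk_names` is a hypothetical helper and the pattern only covers the simple header forms shown in this tutorial, so treat it as an illustration rather than a complete parser.

```r
fence <- strrep("`", 3)  # three backticks

# Example document lines; the chunk names are placeholders.
rmd_lines <- c(
  paste0(fence, "{r setup-library }"),
  "library(raster)",
  fence,
  "Some markdown text.",
  paste0(fence, "{r load-data, eval=FALSE}"),
  "1 + 2",
  fence,
  paste0(fence, "{r load-data}"),
  "3 + 4",
  fence
)

# Hypothetical helper: pull the name out of each R chunk header.
chunk_names <- function(lines) {
  headers <- grep("^```\\{r[ ,}]", lines, value = TRUE)
  # Keep the text after "{r" up to the first comma or closing brace.
  names_found <- sub("^```\\{r[ ,]*([^,}]*).*$", "\\1", headers)
  names_found <- trimws(names_found)
  # A first entry containing "=" is an option, not a name.
  names_found[grepl("=", names_found, fixed = TRUE)] <- ""
  names_found
}

found <- chunk_names(rmd_lines)

# Names that appear more than once (unnamed chunks are ignored).
found[duplicated(found) & nzchar(found)]
# "load-data"
```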
Activity: Add Code Chunks to Your R Markdown File
Continue working on your document. Below the last section that you've just added, create a code chunk that loads the packages required to work with raster data in R.
```{r setup-library }
library(rgdal)
library(raster)
```
In R scripts, setting the working directory is normally done once near the beginning of your script. In R Markdown files, knit code chunks behave a little differently, and a warning appears upon knitting a chunk that sets a working directory.
```{r code-setwd}
# set working directory to ensure R can find the file we wish to import.
# This will depend on your local environment.
setwd("~/Documents/data/NEONDI-2016/")
```
You changed the working directory to ~/Documents/data/NEONDI-2016/ (probably via setwd()). It will be restored to [directory path of current .rmd file]. See the Note section in ?knitr::knit
This is a problem if you want to set the working directory in one code chunk and read or write data in another code chunk. To allow for a working data directory that is different from your Rmd file's current directory, you can store the directory path in a string variable.
```{r code-setwd-stringvariable}
# set working directory as a string variable for use in other code chunks.
# This will depend on your local environment.
wd <- "~/Documents/data/NEONDI-2016/"
setwd(wd)
```
The setwd(wd) line could be at the start of a lengthier code chunk that reads from and writes to data files. Alternatively, since the variable will be kept in this document's R environment, it can be used with paste() or paste0() when you need to refer to a filepath. Proceed to the next step for an example of this.
(For further instruction on setting the working directory, see the NEON Data Skills tutorial Set A Working Directory in R.)
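As an alternative to the string-variable approach, knitr has a `root.dir` package option that sets the directory in which later chunks are evaluated when knitting. The sketch below is not the method used in this tutorial; it assumes the suggested data path and would normally sit in a setup chunk near the top of the document, since the option applies to the chunks that follow it.

```r
# Suggested data location from this tutorial; adjust for your machine.
data_dir <- "~/Documents/data/NEONDI-2016/"

# Set the directory in which knitr evaluates later chunks.
# Guarded so the example also runs where knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_knit$set(root.dir = data_dir)
}

# In later chunks, paths could then be written relative to data_dir, e.g.
# raster("NEONdata/D17-California/TEAK/2013/lidar/TEAK_lidarDSM.tif")
```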
Let's add another chunk that loads the TEAK_lidarDSM raster file.
```{r load-dsm-raster }
# check for the working directory
getwd()
# In this new chunk, the working directory has reverted to default upon knitting.
# Combining the working directory string variable and
# additional path to the file, import a DSM file.
teak_dsm <- raster(paste0(wd, "NEONdata/D17-California/TEAK/2013/lidar/TEAK_lidarDSM.tif"))
```
Now run the code in this chunk.
You can run code chunks:
- Line-by-line: with cursor on current line, Ctrl + Enter (Windows/Linux) or Command + Enter (Mac OS X).
- By chunk: You can run the entire chunk (or multiple chunks) by clicking on the "Run" button in the upper right corner of the RStudio script panel and choosing the appropriate option (Run Current Chunk, Run Next Chunk). Keyboard shortcuts are available for these options.
Code chunk options
You can also add arguments or options to each code chunk. These arguments allow you to customize how or if you want code to be processed or appear on the output HTML document. Code chunk arguments are added on the first line of a code chunk after the name, within the curly brackets.
The example below is a code chunk that will not be "run", or evaluated, by R. The code within the chunk will appear on the output document; however, there will be no outputs from the code.
```{r intro-option, eval=FALSE}
# the code here will not be processed by R
# but it will appear on your output document
1+2
```
We often use eval=FALSE when the chunk exports a file that we don't need to re-export but we want to document the code used to export the file.
Three common code chunk options are:
- eval = FALSE: Do not evaluate (or run) this code chunk when knitting the RMD document. The code in this chunk will still render in our knitted HTML output; however, it will not be evaluated or run by R.
- echo = FALSE: Hide the code in the output. The code is evaluated when the RMD file is knit; however, only the output is rendered on the output document.
- results = "hide": The code chunk will be evaluated but the results of the code will not be rendered on the output document. This is useful if you are viewing the structure of a large object (e.g. outputs of a large data.frame).
Multiple code chunk options can be used for the same chunk. For more on code chunk options, read R Markdown: The Definitive Guide or the knitr documentation.
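The same option names can also be given document-wide defaults with `knitr::opts_chunk$set()`, typically in a setup chunk near the top of the file. This is a brief sketch; the values shown are knitr's usual defaults and are listed only to show the syntax. Any option written in an individual chunk header still overrides them for that chunk.

```r
# Guarded so the example also runs where knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  # Defaults applied to every later chunk unless a chunk overrides them.
  knitr::opts_chunk$set(
    echo = TRUE,        # show the code in the output
    eval = TRUE,        # run the code
    results = "markup"  # show text results (use "hide" to suppress them)
  )

  # Inspect one of the defaults just set.
  knitr::opts_chunk$get("results")
}
```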
Update your RMD file as follows:
- Add a new code chunk that plots the TEAK_lidarDSM raster object that you imported above. Experiment with plot colors and be sure to add a plot title.
- Run the code chunk that you just added to your RMD document in R (e.g. run in console, not knitting). Does it create a plot with a title?
- In another new code chunk, import and plot another raster file from the NEON data subset that you downloaded. The TEAK_lidarCHM is a good raster to plot.
- Finally, create histograms for both rasters that you've imported into R.
- Be sure to document your steps as you go using both code comments and markdown syntax in between the code chunks.
For help opening and plotting raster data in R, see the NEON Data Skills tutorial Plot Raster Data in R.
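If you would like to try the plotting calls before working with the TEAK files, the sketch below uses a small raster filled with random values. These values are synthetic stand-ins, not NEON measurements, and the titles and color ramp are arbitrary choices; the same `plot()` and `hist()` calls should apply to a raster that you import yourself.

```r
# Synthetic elevation values stand in for the TEAK data (not real measurements).
set.seed(1)
elev <- matrix(runif(100, min = 2000, max = 2500), nrow = 10)

# Guarded so the example also runs where raster is not installed.
if (requireNamespace("raster", quietly = TRUE)) {
  library(raster)
  demo_dsm <- raster(elev)

  # Plot with a title and a built-in color ramp.
  plot(demo_dsm, main = "Demo surface (synthetic values)",
       col = terrain.colors(25))

  # Histogram of the cell values.
  hist(demo_dsm, main = "Distribution of demo values",
       xlab = "Elevation (m)")
}
```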
Now continue on to the next tutorial to learn how to knit this document into an HTML file.
- Are any words in bold? - Yes, “Knit” on line 10
- Are any words in italics? - No
- Are any words highlighted as code? - Yes, “echo = FALSE” on line 22