Tutorial
Time Series Culmination Activity: Plot using Facets & Plot NDVI with Time Series Data
Authors: Megan A. Jones, Leah A. Wasser
Last Updated: Jun 9, 2024
This tutorial is a culmination activity for the series on working with tabular time series data in R . Refer to Data Carpentry's : Introduction to Geospatial Raster and Vector Data with R for further related content.
Learning Objectives
After completing this tutorial, you will be able to:
- Apply
ggplot2
anddplyr
skills to a new dataset. - Set min/max axis values in
ggplot()
to align data on multiple plots.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.
Install R Packages
-
ggplot2:
install.packages("ggplot2")
-
scales:
install.packages("scales")
-
gridExtra:
install.packages("gridExtra")
-
grid:
install.packages("grid")
-
dplyr:
install.packages("dplyr")
-
reshape2:
install.packages("reshape2")
More on Packages in R – Adapted from Software Carpentry.
Download Data
NEON Teaching Data Subset: Meteorological Data for Harvard Forest
The data used in this lesson were collected at the National Ecological Observatory Network's Harvard Forest field site. These data are proxy data for what will be available for 30 years on the NEON data portal for the Harvard Forest and other field sites located across the United States.
Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
An overview of setting the working directory in R can be found here.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.
Recommended Tutorials
This tutorial uses both dplyr
and ggplot2
. If you are new to either of these
R packages, we recommend the following NEON Data Skills tutorials before
working through this one.
Plot NDVI & PAR using Daily Data
NDVI Data
Normalized Difference Vegetation Index (NDVI) is an indicator of how green vegetation is.
Watch this two and a half minute video from Karen Joyce that explains what NDVI is and why it is used.
NDVI is derived from remote sensing data based on a ratio the reluctance of visible red spectra and near-infrared spectra. The NDVI values vary from -1.0 to 1.0.
The imagery data used to create this NDVI data were collected over the National Ecological Observatory Network's Harvard Forest field site.
The imagery was created by the U.S. Geological Survey (USGS) using a multispectral scanner on a Landsat Satellite . The data files are Geographic Tagged Image-File Format (GeoTIFF). The tutorial Extract NDVI Summary Values from a Raster Time Series explains how to create this NDVI file from raster data.
Read In the Data
We need to read in two datasets: the 2009-2011 micrometeorological data and the 2011 NDVI data for the Harvard Forest.
# Remember it is good coding technique to add additional libraries to the top of
# your script
library(lubridate) # for working with dates
library(ggplot2) # for creating graphs
library(scales) # to access breaks/formatting functions
library(gridExtra) # for arranging plots
library(grid) # for arranging plots
library(dplyr) # for subsetting by season
# set working directory to ensure R can find the file we wish to import
wd <- "~/Git/data/"
# read in the Harvard micro-meteorological data; if you don't already have it
harMetDaily.09.11 <- read.csv(
file=paste0(wd,"NEON-DS-Met-Time-Series/HARV/FisherTower-Met/Met_HARV_Daily_2009_2011.csv"),
stringsAsFactors = FALSE
)
#check out the data
str(harMetDaily.09.11)
## 'data.frame': 1095 obs. of 47 variables:
## $ X : int 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 ...
## $ date : chr "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" ...
## $ jd : int 1 2 3 4 5 6 7 8 9 10 ...
## $ airt : num -15.1 -9.1 -5.5 -6.4 -2.4 -4.9 -2.6 -3.2 -9.9 -11.1 ...
## $ f.airt : chr "" "" "" "" ...
## $ airtmax : num -9.2 -3.7 -1.6 0 0.7 0 -0.2 -0.5 -6.1 -8 ...
## $ f.airtmax: chr "" "" "" "" ...
## $ airtmin : num -19.1 -15.8 -9.5 -11.4 -6.4 -10.1 -5.1 -9.9 -12.5 -15.9 ...
## $ f.airtmin: chr "" "" "" "" ...
## $ rh : int 58 75 69 59 77 65 97 93 78 77 ...
## $ f.rh : chr "" "" "" "" ...
## $ rhmax : int 76 97 97 78 97 88 100 100 89 92 ...
## $ f.rhmax : chr "" "" "" "" ...
## $ rhmin : int 33 47 41 40 45 38 77 76 63 54 ...
## $ f.rhmin : chr "" "" "" "" ...
## $ dewp : num -21.9 -12.9 -10.9 -13.3 -6.2 -10.9 -3 -4.2 -13.1 -14.5 ...
## $ f.dewp : chr "" "" "" "" ...
## $ dewpmax : num -20.4 -6.2 -6.4 -9.1 -1.7 -7.5 -0.5 -0.6 -11.2 -10.5 ...
## $ f.dewpmax: chr "" "" "" "" ...
## $ dewpmin : num -23.5 -21 -14.3 -16.3 -12.1 -13 -7.6 -11.8 -15.2 -18 ...
## $ f.dewpmin: chr "" "" "" "" ...
## $ prec : num 0 0 0 0 1 0 26.2 0.8 0 1 ...
## $ f.prec : chr "" "" "" "" ...
## $ slrt : num 8.4 3.7 8.1 8.3 2.9 6.3 0.8 2.8 8.8 5.7 ...
## $ f.slrt : chr "" "" "" "" ...
## $ part : num 16.7 7.3 14.8 16.2 5.4 11.7 1.8 7 18.2 11.4 ...
## $ f.part : chr "" "" "" "" ...
## $ netr : num -39.4 -16.6 -35.3 -24.7 -19.4 -18.9 5.6 -21.7 -31.1 -16 ...
## $ f.netr : chr "" "" "" "" ...
## $ bar : int 1011 1005 1004 1008 1006 1009 991 987 1005 1015 ...
## $ f.bar : chr "" "" "" "" ...
## $ wspd : num 2.4 1.4 2.7 1.9 2.1 1 1.4 0 1.3 1 ...
## $ f.wspd : chr "" "" "" "" ...
## $ wres : num 2.1 1 2.5 1.6 1.9 0.7 1.3 0 1.1 0.6 ...
## $ f.wres : chr "" "" "" "" ...
## $ wdir : int 294 237 278 292 268 257 86 0 273 321 ...
## $ f.wdir : chr "" "" "" "" ...
## $ wdev : int 29 42 24 31 26 44 24 0 20 50 ...
## $ f.wdev : chr "" "" "" "" ...
## $ gspd : num 13.4 8.1 13.9 8 11.6 5.1 9.1 0 10.1 5 ...
## $ f.gspd : chr "" "" "" "" ...
## $ s10t : num 1 1 1 1 1 1 1.1 1.2 1.4 1.3 ...
## $ f.s10t : logi NA NA NA NA NA NA ...
## $ s10tmax : num 1.1 1 1 1 1.1 1.1 1.1 1.3 1.4 1.4 ...
## $ f.s10tmax: logi NA NA NA NA NA NA ...
## $ s10tmin : num 1 1 1 1 1 1 1 1.1 1.3 1.2 ...
## $ f.s10tmin: logi NA NA NA NA NA NA ...
# read in the NDVI CSV data; if you dont' already have it
NDVI.2011 <- read.csv(
file=paste0(wd,"NEON-DS-Met-Time-Series/HARV/NDVI/meanNDVI_HARV_2011.csv"),
stringsAsFactors = FALSE
)
# check out the data
str(NDVI.2011)
## 'data.frame': 11 obs. of 6 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ meanNDVI : num 0.365 0.243 0.251 0.599 0.879 ...
## $ site : chr "HARV" "HARV" "HARV" "HARV" ...
## $ year : int 2011 2011 2011 2011 2011 2011 2011 2011 2011 2011 ...
## $ julianDay: int 5 37 85 133 181 197 213 229 245 261 ...
## $ Date : chr "2011-01-05" "2011-02-06" "2011-03-26" "2011-05-13" ...
In the NDVI dataset, we have the following variables:
- 'X': an integer identifying each row
- meanNDVI: the daily total NDVI for that area. (It is a mean of all pixels in the original raster).
- site: "HARV" means all NDVI values are from the Harvard Forest
- year: "2011" all values are from 2011
- julianDay: the numeric day of the year
- Date: a date in format "YYYY-MM-DD"; currently in chr class
-
Ensure that date fields within both datasets are in the Date class. If not, convert the data to the Date class.
-
The NDVI data are limited to 2011, however, the meteorological data are from 2009-2011. Subset and retain only the 2011 meteorological data. Name it
harMet.daily2011
.
HINT: If you are having trouble subsetting the data, refer back to Subset & Manipulate Time Series Data with dplyr tutorial.
Now that we have our datasets with Date class dates and limited to 2011, we can begin working with both.
Plot NDVI Data from a .csv
These NDVI data were derived from a raster and are now integers in a
data.frame
, therefore we can plot it like any of our other values using
ggplot()
. Here we plot meanNDVI
by Date
.
# plot NDVI by date
ggplot(NDVI.2011, aes(Date, meanNDVI))+
geom_point(colour = "forestgreen", size = 4) +
ggtitle("Daily NDVI at Harvard Forest, 2011")+
theme(legend.position = "none",
plot.title = element_text(lineheight=.8, face="bold",size = 20),
text = element_text(size=20))
Two y-axes or Side-by-Side Plots?
When we have different types of data like NDVI (scale: 0-1 index units), Photosynthetically Active Radiation (PAR, scale: 0-65.8 mole per meter squared), or temperature (scale: -20 to 30 C) that we want to plot over time, we cannot simply plot them on the same plot as they have different y-axes.
One option, would be to plot both data types in the same plot space but each
having it's own axis (one on left of plot and one on right of plot). However,
there is a line of graphical representation thought that this is not a good
practice. The creator of ggplot2
ascribes to this dislike of different y-axes
and so neither qplot
nor ggplot
have this functionality.
Instead, plots of different types of data can be plotted next to each other to allow for comparison. Depending on how the plots are being viewed, they can have a vertical or horizontal arrangement.
Plot the NDVI vs Date (previous plot) and PAR vs Date (create a new plot) in the same viewer so we can more easily compare them.
Hint: If you are having a hard time arranging the plots in a single grid, refer back to Plotting Time Series with ggplot in R tutorial.
The figures from this Challenge are nice but a bit confusing as the dates on the x-axis don't exactly line up. To fix this we can assign the same min and max to both x-axes so that they align. The syntax for this is:
limits=c(min=VALUE,max=VALUE)
.
In our case we want the min and max values to
be based on the min and max of the NDVI.2011$Date
so we'll use a function
specifying this instead of a single value.
We can also assign the date format for the x-axis and clearly label both axes.
# plot PAR
plot2.par.2011 <- plot.par.2011 +
scale_x_date(labels = date_format("%b %d"),
date_breaks = "3 months",
date_minor_breaks= "1 week",
limits=c(min=min(NDVI.2011$Date),max=max(NDVI.2011$Date))) +
ylab("Total PAR") + xlab ("")
# plot NDVI
plot2.NDVI.2011 <- plot.NDVI.2011 +
scale_x_date(labels = date_format("%b %d"),
date_breaks = "3 months",
date_minor_breaks= "1 week",
limits=c(min=min(NDVI.2011$Date),max=max(NDVI.2011$Date)))+
ylab("Total NDVI") + xlab ("Date")
# Output with both plots
grid.arrange(plot2.par.2011, plot2.NDVI.2011)
Second, plot PAR, air temperature and NDVI in a single pane for ease of comparison.