Series
Introduction to working with PhenoCam Images
Phenology—the study of how nature changes seasonally—is a core focus of research enabled by NEON data and infrastructure. NEON gathers phenological data at its field sites through various methods, including field sampling of specific plants and insects, flyovers to collect remote sensing data, and in situ phenocams that collect phenology image data. These data are hosted by the PhenoCam Network and are also accessible via the NEON data portal as NEON Data Products DP1.00033.001, DP1.00042.001, and DP1.20002.001.
The tutorials in this series were developed in close collaboration with PhenoCam Network scientists.
This series provides instruction on how to work with photographs from the PhenoCam Network to extract the data you need for phenology or time-series analyses.
Learning Objectives
After completing the series, you will be able to:
- Access and download raw and processed data from the PhenoCam network
- Extract time-series data from any stack of digital images, including but not limited to the images of the PhenoCam network
- Quantitatively remove data affected by bad observation conditions (such as the presence of fog and clouds, or camera field-of-view shifts)
- Analyze the extracted time-series to address environmental questions
Things You’ll Need To Complete This Series
Install R & Set up RStudio
To complete the tutorial series, you will need an updated version of R (version 3.4 or later) and, preferably, RStudio installed on your computer. R is a programming language that specializes in statistical computing. It is a powerful tool for exploratory data analysis. To interact with R, we strongly recommend RStudio, an integrated development environment (IDE).
Install R Packages
You can choose to install packages with each lesson, or you can download all of the necessary R packages now by running the code below.
install.packages('rgdal')
utils::install.packages('devtools', repos = "http://cran.us.r-project.org" )
install.packages('hazer')
install.packages('jpeg')
install.packages('lubridate')
install.packages('data.table')
install.packages('phenocamapi')
install.packages('phenocamr')
install.packages('xROI')
devtools::install_github('khufkens/phenor')
install.packages('daymetr')
install.packages('tidyverse')
install.packages('shiny')
install.packages('xaringan')
install.packages('maps')
install.packages('raster')
install.packages('sp')
More on Packages in R – Adapted from Software Carpentry.
Interacting with the PhenoCam Server using the phenocamapi R Package
Authors: Bijan Seyednasrollah
Last Updated: Nov 23, 2020
The phenocamapi R package was developed to simplify interacting with the PhenoCam network dataset and to perform data wrangling steps on PhenoCam sites' data and metadata.
This tutorial will show you the basic commands for accessing PhenoCam data through the PhenoCam API. The phenocamapi R package is developed and maintained by Bijan Seyednasrollah, and the most recent release is available on GitHub (PhenocamAPI). Additional vignettes show how to merge external time series (e.g., flux data) with PhenoCam time series.
We begin with several useful skills and tools for extracting PhenoCam data directly from the server:
- Exploring the PhenoCam metadata
- Filtering the dataset by site attributes
- Downloading PhenoCam time-series data
- Extracting the list of midday images
- Downloading midday images for a given time range
Exploring PhenoCam metadata
Each PhenoCam site has specific metadata including but not limited to how a site is set up and where it is located, what vegetation type is visible from the camera, and its climate regime. Each PhenoCam may have zero to several Regions of Interest (ROIs) per vegetation type. The phenocamapi package is an interface to interact with the PhenoCam server to extract those data and process them in an R environment.
To explore the PhenoCam data, we'll use several packages for this tutorial.
library(data.table)
library(phenocamapi)
library(lubridate)
library(jpeg)
We can obtain an up-to-date data.frame of the metadata of the entire PhenoCam network using the get_phenos() function. The returned value is a data.table, which simplifies further data exploration.
# obtaining the phenocam site metadata from the server as data.table
phenos <- get_phenos()
# checking out the first few sites
head(phenos$site)
#> [1] "acadia" "aguatibiaeast" "aguatibianorth" "ahwahnee" "alleypond"
#> [6] "alligatorriver"
# checking out the columns
colnames(phenos)
#> [1] "site" "lat" "lon"
#> [4] "elev" "active" "utc_offset"
#> [7] "date_first" "date_last" "infrared"
#> [10] "contact1" "contact2" "site_description"
#> [13] "site_type" "group" "camera_description"
#> [16] "camera_orientation" "flux_data" "flux_networks"
#> [19] "flux_sitenames" "dominant_species" "primary_veg_type"
#> [22] "secondary_veg_type" "site_meteorology" "MAT_site"
#> [25] "MAP_site" "MAT_daymet" "MAP_daymet"
#> [28] "MAT_worldclim" "MAP_worldclim" "koeppen_geiger"
#> [31] "ecoregion" "landcover_igbp" "dataset_version1"
#> [34] "site_acknowledgements" "modified" "flux_networks_name"
#> [37] "flux_networks_url" "flux_networks_description"
Now we have a better idea of the types of metadata that are available for the PhenoCams.
Remove null values
We may want to explore some of the patterns in the metadata before we jump into specific locations. Let's look at Mean Annual Precipitation (MAP) and Mean Annual Temperature (MAT) across the different field sites and classify those by the primary vegetation type (primary_veg_type) for each site. We can find out what the abbreviations for the vegetation types mean from the following table:
| Abbreviation | Description |
|--------------|-------------|
| AG | agriculture |
| DB | deciduous broadleaf |
| DN | deciduous needleleaf |
| EB | evergreen broadleaf |
| EN | evergreen needleleaf |
| GR | grassland |
| MX | mixed vegetation (generally EN/DN, DB/EN, or DB/EB) |
| SH | shrubs |
| TN | tundra (includes sedges, lichens, mosses, etc.) |
| WT | wetland |
| NV | non-vegetated |
| RF | reference panel |
| XX | unspecified |
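Before plotting, it can be useful to see how many sites fall into each vegetation class. A minimal sketch, using base R's table() on the metadata we just downloaded:
# tally the number of PhenoCam sites per primary vegetation type
table(phenos$primary_veg_type)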
To do this, we'd first want to remove the sites with no data and then plot the data.
# removing the sites with unknown MAT and MAP values
phenos <- phenos[!((MAT_worldclim == -9999)|(MAP_worldclim == -9999))]
# extracting the PhenoCam climate space based on the WorldClim dataset
# and plotting the sites across the climate space, with each vegetation type shown as a different symbol and color
phenos[primary_veg_type=='DB', plot(MAT_worldclim, MAP_worldclim, pch = 19, col = 'green', xlim = c(-5, 27), ylim = c(0, 4000))]
#> NULL
phenos[primary_veg_type=='DN', points(MAT_worldclim, MAP_worldclim, pch = 1, col = 'darkgreen')]
#> NULL
phenos[primary_veg_type=='EN', points(MAT_worldclim, MAP_worldclim, pch = 17, col = 'brown')]
#> NULL
phenos[primary_veg_type=='EB', points(MAT_worldclim, MAP_worldclim, pch = 25, col = 'orange')]
#> NULL
phenos[primary_veg_type=='AG', points(MAT_worldclim, MAP_worldclim, pch = 12, col = 'yellow')]
#> NULL
phenos[primary_veg_type=='SH', points(MAT_worldclim, MAP_worldclim, pch = 23, col = 'red')]
#> NULL
legend('topleft', legend = c('DB','DN', 'EN','EB','AG', 'SH'),
pch = c(19, 1, 17, 25, 12, 23),
col = c('green', 'darkgreen', 'brown', 'orange', 'yellow', 'red' ))
Filtering using attributes
Alternatively, we may want to only include Phenocams with certain attributes in
our datasets. For example, we may be interested only in sites with a co-located
flux tower. For this, we'd want to filter for those with a flux tower using the
flux_sitenames
attribute in the metadata.
# store sites with flux_data available and the FLUX site name is specified
phenofluxsites <- phenos[flux_data==TRUE&!is.na(flux_sitenames)&flux_sitenames!='',
.(PhenoCam=site, Flux=flux_sitenames)] # return as table
# and specify which variables to retain
phenofluxsites <- phenofluxsites[Flux!='']
# see the first few rows
head(phenofluxsites)
#> PhenoCam Flux
#> 1: alligatorriver US-NC4
#> 2: arsbrooks10 US-Br1: Brooks Field Site 10- Ames
#> 3: arsbrooks11 US-Br3: Brooks Field Site 11- Ames
#> 4: arscolesnorth LTAR
#> 5: arscolessouth LTAR
#> 6: arsgreatbasinltar098 US-Rws
We can further identify which of those PhenoCams with a flux tower are also in deciduous broadleaf forests (primary_veg_type=='DB').
#list deciduous broadleaf sites with flux tower
DB.flux <- phenos[flux_data==TRUE&primary_veg_type=='DB',
site] # return just the site names as a list
# see the first few rows
head(DB.flux)
#> [1] "alligatorriver" "bartlett" "bartlettir" "bbc1" "bbc2"
#> [6] "bbc3"
PhenoCam time series
PhenoCam time series are time series data extracted from the ROIs of a given site.
Obtain ROIs
To download the phenological time series from the PhenoCam, we need to know the site name, vegetation type, and ROI ID. This information can be obtained from each specific PhenoCam page on the PhenoCam website or by using the get_rois() function.
# obtaining the list of all the available ROI's on the PhenoCam server
rois <- get_rois()
# view what information is returned
colnames(rois)
#> [1] "roi_name" "site" "lat" "lon"
#> [5] "roitype" "active" "show_link" "show_data_link"
#> [9] "sequence_number" "description" "first_date" "last_date"
#> [13] "site_years" "missing_data_pct" "roi_page" "roi_stats_file"
#> [17] "one_day_summary" "three_day_summary" "data_release"
# view first few locations
head(rois$roi_name)
#> [1] "alligatorriver_DB_1000" "arbutuslake_DB_1000" "arbutuslakeinlet_DB_1000"
#> [4] "arbutuslakeinlet_EN_1000" "arbutuslakeinlet_EN_2000" "archboldavir_AG_1000"
Download time series
The get_pheno_ts() function can download a time series and return the result as a data.table. To work with the Duke Forest Hardwood Stand (dukehw) PhenoCam, and specifically the ROI DB_1000, we can run the following code.
# list ROIs for dukehw
rois[site=='dukehw',]
#> roi_name site lat lon roitype active show_link show_data_link
#> 1: dukehw_DB_1000 dukehw 35.97358 -79.10037 DB TRUE TRUE TRUE
#> sequence_number description first_date last_date
#> 1: 1000 canopy level DB forest at awesome Duke forest 2013-06-01 2020-03-17
#> site_years missing_data_pct roi_page
#> 1: 6.6 3.0 https://phenocam.sr.unh.edu/webcam/roi/dukehw/DB_1000/
#> roi_stats_file
#> 1: https://phenocam.sr.unh.edu/data/archive/dukehw/ROI/dukehw_DB_1000_roistats.csv
#> one_day_summary
#> 1: https://phenocam.sr.unh.edu/data/archive/dukehw/ROI/dukehw_DB_1000_1day.csv
#> three_day_summary data_release
#> 1: https://phenocam.sr.unh.edu/data/archive/dukehw/ROI/dukehw_DB_1000_3day.csv NA
# to obtain the DB 1000 from dukehw
dukehw_DB_1000 <- get_pheno_ts(site = 'dukehw', vegType = 'DB', roiID = 1000, type = '3day')
# what data are available
str(dukehw_DB_1000)
#> Classes 'data.table' and 'data.frame': 830 obs. of 35 variables:
#> $ date : chr "2013-06-01" "2013-06-04" "2013-06-07" "2013-06-10" ...
#> $ year : int 2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
#> $ doy : int 152 155 158 161 164 167 170 173 176 179 ...
#> $ image_count : int 57 76 77 77 77 78 21 0 0 0 ...
#> $ midday_filename : chr "dukehw_2013_06_01_120111.jpg" "dukehw_2013_06_04_120119.jpg" "dukehw_2013_06_07_120112.jpg" "dukehw_2013_06_10_120108.jpg" ...
#> $ midday_r : num 91.3 76.4 60.6 76.5 88.9 ...
#> $ midday_g : num 97.9 85 73.2 82.2 95.7 ...
#> $ midday_b : num 47.4 33.6 35.6 37.1 51.4 ...
#> $ midday_gcc : num 0.414 0.436 0.432 0.42 0.406 ...
#> $ midday_rcc : num 0.386 0.392 0.358 0.391 0.377 ...
#> $ r_mean : num 87.6 79.9 72.7 80.9 83.8 ...
#> $ r_std : num 5.9 6 9.5 8.23 5.89 ...
#> $ g_mean : num 92.1 86.9 84 88 89.7 ...
#> $ g_std : num 6.34 5.26 7.71 7.77 6.47 ...
#> $ b_mean : num 46.1 38 39.6 43.1 46.7 ...
#> $ b_std : num 4.48 3.42 5.29 4.73 4.01 ...
#> $ gcc_mean : num 0.408 0.425 0.429 0.415 0.407 ...
#> $ gcc_std : num 0.00859 0.0089 0.01318 0.01243 0.01072 ...
#> $ gcc_50 : num 0.408 0.427 0.431 0.416 0.407 ...
#> $ gcc_75 : num 0.414 0.431 0.435 0.424 0.415 ...
#> $ gcc_90 : num 0.417 0.434 0.44 0.428 0.421 ...
#> $ rcc_mean : num 0.388 0.39 0.37 0.381 0.38 ...
#> $ rcc_std : num 0.01176 0.01032 0.01326 0.00881 0.00995 ...
#> $ rcc_50 : num 0.387 0.391 0.373 0.383 0.382 ...
#> $ rcc_75 : num 0.391 0.396 0.378 0.388 0.385 ...
#> $ rcc_90 : num 0.397 0.399 0.382 0.391 0.389 ...
#> $ max_solar_elev : num 76 76.3 76.6 76.8 76.9 ...
#> $ snow_flag : logi NA NA NA NA NA NA ...
#> $ outlierflag_gcc_mean: logi NA NA NA NA NA NA ...
#> $ outlierflag_gcc_50 : logi NA NA NA NA NA NA ...
#> $ outlierflag_gcc_75 : logi NA NA NA NA NA NA ...
#> $ outlierflag_gcc_90 : logi NA NA NA NA NA NA ...
#> $ YEAR : int 2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
#> $ DOY : int 152 155 158 161 164 167 170 173 176 179 ...
#> $ YYYYMMDD : chr "2013-06-01" "2013-06-04" "2013-06-07" "2013-06-10" ...
#> - attr(*, ".internal.selfref")=<externalptr>
We now have a variety of data related to this ROI from the Hardwood Stand at Duke Forest.
Green Chromatic Coordinate (GCC) is a measure of "greenness" of an area and is widely used in PhenoCam images as an indicator of the green pigment in vegetation. Let's use this measure to look at changes in GCC over time at this site. Looking back at the available data, we have several options for GCC. gcc90 is the 90th quantile of GCC in the pixels across the ROI (for more details, see the PhenoCam v1 description). We'll use this as it tracks the upper greenness values while not including many outliers.
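As an aside, GCC has a simple definition: Gcc = G / (R + G + B). As a quick sanity check, we can recompute it from the midday RGB means already in our table (gcc_check is a hypothetical column name added here for illustration):
# Gcc = G / (R + G + B), computed from the midday RGB means
dukehw_DB_1000[, gcc_check := midday_g / (midday_r + midday_g + midday_b)]
# compare with the midday_gcc column returned by the API
head(dukehw_DB_1000[, .(midday_gcc, gcc_check)])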
Before we can plot gcc_90, we need to fix our dates by converting them to Date class so they plot correctly.
# date variable into date format
dukehw_DB_1000[,date:=as.Date(date)]
# plot gcc_90
dukehw_DB_1000[,plot(date, gcc_90, col = 'green', type = 'b')]
#> NULL
mtext('Duke Forest, Hardwood', font = 2)
Download midday images
While PhenoCam sites may have many images in a given day, many simple analyses can use just the midday image when the sun is most directly overhead the canopy. Therefore, extracting a list of midday images (only one image a day) can be useful.
# obtaining midday_images for dukehw
duke_middays <- get_midday_list('dukehw')
# see the first few rows
head(duke_middays)
#> [1] "http://phenocam.sr.unh.edu/data/archive/dukehw/2013/05/dukehw_2013_05_31_150331.jpg"
#> [2] "http://phenocam.sr.unh.edu/data/archive/dukehw/2013/06/dukehw_2013_06_01_120111.jpg"
#> [3] "http://phenocam.sr.unh.edu/data/archive/dukehw/2013/06/dukehw_2013_06_02_120109.jpg"
#> [4] "http://phenocam.sr.unh.edu/data/archive/dukehw/2013/06/dukehw_2013_06_03_120110.jpg"
#> [5] "http://phenocam.sr.unh.edu/data/archive/dukehw/2013/06/dukehw_2013_06_04_120119.jpg"
#> [6] "http://phenocam.sr.unh.edu/data/archive/dukehw/2013/06/dukehw_2013_06_05_120110.jpg"
Now we have a list of all the midday images from this PhenoCam. Let's download one and plot it.
# download a file
destfile <- tempfile(fileext = '.jpg')
# download only the first available file
# modify the `[1]` to download other images
download.file(duke_middays[1], destfile = destfile, mode = 'wb')
# plot the image
img <- try(readJPEG(destfile))
if(class(img)!='try-error'){
par(mar= c(0,0,0,0))
plot(0:1,0:1, type='n', axes= FALSE, xlab= '', ylab = '')
rasterImage(img, 0, 0, 1, 1)
}
Download midday images for a given time range
Now we can access all the midday images and download them one at a time. However, we frequently want all the images within a specific time range of interest. We'll learn how to do that next.
# open a temporary directory
tmp_dir <- tempdir()
# download a subset. Example dukehw 2017
download_midday_images(site = 'dukehw', # which site
y = 2017, # which year(s)
months = 1:12, # which month(s)
days = 15, # which days on month(s)
download_dir = tmp_dir) # where on your computer
#> (progress bar output and repeated warnings omitted; download_midday_images() warns and skips when a file already exists in the download directory)
#> [1] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45"
# list of downloaded files
duke_middays_path <- dir(tmp_dir, pattern = 'dukehw', full.names = TRUE)
head(duke_middays_path)
#> [1] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45/dukehw_2017_01_15_120109.jpg"
#> [2] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45/dukehw_2017_02_15_120108.jpg"
#> [3] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45/dukehw_2017_03_15_120151.jpg"
#> [4] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45/dukehw_2017_04_15_120110.jpg"
#> [5] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45/dukehw_2017_05_15_120108.jpg"
#> [6] "/var/folders/bn/w43q_t8s3_xckn5j4plhb289fqhhfx/T//RtmpwGdn45/dukehw_2017_06_15_120120.jpg"
We can demonstrate the seasonality of Duke Forest as observed from the camera. (Note: this code may take a while to run through the loop.)
n <- length(duke_middays_path)
par(mar= c(0,0,0,0), mfrow=c(4,3), oma=c(0,0,3,0))
for(i in 1:n){
img <- readJPEG(duke_middays_path[i])
plot(0:1,0:1, type='n', axes= FALSE, xlab= '', ylab = '')
rasterImage(img, 0, 0, 1, 1)
mtext(month.name[i], line = -2)
}
mtext('Seasonal variation of forest at Duke Hardwood Forest', font = 2, outer = TRUE)
The goal of this section was to show how to download a limited number of midday images from the PhenoCam server. However, more extensive datasets should be downloaded directly from the PhenoCam data archive.
The phenocamapi R package is developed and maintained by Bijan Seyednasrollah. The most recent release is available from https://github.com/bnasr/phenocamapi.
Detecting Foggy Images using the hazer Package
Authors: Bijan Seyednasrollah
Last Updated: Nov 23, 2020
In this tutorial, you will learn how to:
- perform basic image processing, and
- estimate image haziness as an indication of fog, cloud, or other natural or artificial factors, using the hazer R package.
Read & Plot Image
We will use several packages in this tutorial. All are available from CRAN.
# load packages
library(hazer)
library(jpeg)
library(data.table)
Before we start the image processing steps, let's read in and plot an image. This image is an example image that comes with the hazer package.
# read the path to the example image
jpeg_file <- system.file(package = 'hazer', 'pointreyes.jpg')
# read the image as an array
rgb_array <- jpeg::readJPEG(jpeg_file)
# plot the RGB array on the active device panel
# first set the margin in this order:(bottom, left, top, right)
par(mar=c(0,0,3,0))
plotRGBArray(rgb_array, bty = 'n', main = 'Point Reyes National Seashore')
When we work with images, the data are generally at the scale of individual pixels. Therefore, for large images we will be working with large matrices that hold the value for each pixel. Keep this in mind before opening some of the matrices we'll be creating in this tutorial, as it can take a while for them to load.
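For example, we can check the size of the array we just read in; the first two values are the pixel dimensions of the image and the third is the number of color channels:
# rows (image height), columns (image width), and the 3 RGB channels
dim(rgb_array)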
Histogram of RGB channels
A histogram of the colors can be useful for understanding what our image is made up of. Using the density() function from the base stats package, we can extract the density distribution of each color channel.
# color channels can be extracted from the matrix
red_vector <- rgb_array[,,1]
green_vector <- rgb_array[,,2]
blue_vector <- rgb_array[,,3]
# plotting
par(mar=c(5,4,4,2))
plot(density(red_vector), col = 'red', lwd = 2,
main = 'Density function of the RGB channels', ylim = c(0,5))
lines(density(green_vector), col = 'green', lwd = 2)
lines(density(blue_vector), col = 'blue', lwd = 2)
In hazer we can also extract three basic elements of an RGB image:
- Brightness
- Darkness
- Contrast
Brightness
The brightness matrix comes from the maximum value of the R, G, or B channel for each pixel. We can extract and show the brightness matrix using the getBrightness() function.
# extracting the brightness matrix
brightness_mat <- getBrightness(rgb_array)
# unlike the RGB array which has 3 dimensions, the brightness matrix has only two
# dimensions and can be shown as a grayscale image,
# we can do this using the same plotRGBArray function
par(mar=c(0,0,3,0))
plotRGBArray(brightness_mat, bty = 'n', main = 'Brightness matrix')
Here the grayscale is used to show the value of each pixel's maximum brightness of the R, G or B color channel.
To extract a single brightness value for the image, depending on our needs, we can compute summary statistics or simply use the mean of this matrix.
# the main quantiles
quantile(brightness_mat)
#> 0% 25% 50% 75% 100%
#> 0.09019608 0.43529412 0.62745098 0.80000000 0.91764706
# create histogram
par(mar=c(5,4,4,2))
hist(brightness_mat)
Why are we getting so many values in the high range of the brightness? Where does this correlate to on the RGB image?
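As noted above, the simplest single-value summary is the mean of the matrix:
# a single brightness value summarizing the whole image
mean(brightness_mat)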
Darkness
Darkness is determined by the minimum of the R, G, or B color channel for each pixel. Similarly, we can extract and show the darkness matrix using the getDarkness() function.
# extracting the darkness matrix
darkness_mat <- getDarkness(rgb_array)
# the darkness matrix has also two dimensions and can be shown as a grayscale image
par(mar=c(0,0,3,0))
plotRGBArray(darkness_mat, bty = 'n', main = 'Darkness matrix')
# main quantiles
quantile(darkness_mat)
#> 0% 25% 50% 75% 100%
#> 0.03529412 0.23137255 0.36470588 0.47843137 0.83529412
# histogram
par(mar=c(5,4,4,2))
hist(darkness_mat)
Contrast
The contrast of an image is the difference between its darkness and brightness. The contrast matrix is calculated as the difference between the brightness and darkness matrices, and can quickly be extracted using the getContrast() function.
# extracting the contrast matrix
contrast_mat <- getContrast(rgb_array)
# the contrast matrix has also 2D and can be shown as a grayscale image
par(mar=c(0,0,3,0))
plotRGBArray(contrast_mat, bty = 'n', main = 'Contrast matrix')
# main quantiles
quantile(contrast_mat)
#> 0% 25% 50% 75% 100%
#> 0.0000000 0.1450980 0.2470588 0.3333333 0.4509804
# histogram
par(mar=c(5,4,4,2))
hist(contrast_mat)
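Because contrast is defined above as the difference between the brightness and darkness matrices, we can check that relationship directly (a quick sketch, assuming getContrast() implements exactly this difference):
# should be TRUE, or report only negligible numeric differences
all.equal(contrast_mat, brightness_mat - darkness_mat)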
Image fogginess & haziness
Haziness of an image can be estimated using the getHazeFactor() function. This function is based on the method described in Mao et al. (2014). The technique was originally developed for "detecting foggy images and estimating the haze degree factor" for a wide range of outdoor conditions.
The function returns two numeric values:
- haze, the haze degree, and
- A0, the global atmospheric light, as explained in the original paper.
The PhenoCam standards classify any image with a haze degree greater than 0.4 as significantly foggy.
# extracting the haze matrix
haze_degree <- getHazeFactor(rgb_array)
print(haze_degree)
#> $haze
#> [1] 0.2251633
#>
#> $A0
#> [1] 0.7105258
Here we have the haze values for our image. Note that the values might be slightly different due to rounding errors on different platforms.
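Given the 0.4 standard mentioned above, we can already classify this single image (a one-line sketch):
# apply the PhenoCam fog threshold to this image
haze_degree$haze > 0.4
#> [1] FALSE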
Process sets of images
We can use for loops or the lapply family of functions to extract the haze values for a stack of images.
You can download the related datasets from here (direct download).
Download and extract the zip file to be used as input data for the following step.
# to download via R
dir.create('data')
#> Warning in dir.create("data"): 'data' already exists
destfile = 'data/pointreyes.zip'
download.file(destfile = destfile, mode = 'wb', url = 'http://bit.ly/2F8w2Ia')
unzip(destfile, exdir = 'data')
# set up the input image directory
#pointreyes_dir <- '/path/to/image/directory/'
pointreyes_dir <- 'data/pointreyes/'
# get a list of all .jpg files in the directory
pointreyes_images <- dir(path = pointreyes_dir,
pattern = '*.jpg',
ignore.case = TRUE,
full.names = TRUE)
Now we can use a for loop to process all of the images to get the haze and A0 values. (Note: this loop may take a while to run.)
# number of images
n <- length(pointreyes_images)
# create an empty data.table to fill with haze and A0 values
haze_mat <- data.table()
# the process takes a bit, a progress bar lets us know it is working.
pb <- txtProgressBar(0, n, style = 3)
for(i in 1:n) {
image_path <- pointreyes_images[i]
img <- jpeg::readJPEG(image_path)
haze <- getHazeFactor(img)
haze_mat <- rbind(haze_mat,
data.table(file = image_path,
haze = haze[1],
A0 = haze[2]))
setTxtProgressBar(pb, i)
}
#> (progress bar output omitted)
Now we have a table with haze and A0 values for all our images. Let's compare the five images with the lowest and the five with the highest haze values.
haze_mat[,haze:=unlist(haze)]
# five least hazy and five most hazy images
top5_low_haze <- haze_mat[order(haze), file][1:5]
top5_high_haze <- haze_mat[order(-haze), file][1:5]
par(mar= c(0,0,0,0), mfrow=c(5,2), oma=c(0,0,3,0))
for(i in 1:5){
img <- readJPEG(top5_low_haze[i])
plot(0:1,0:1, type='n', axes= FALSE, xlab= '', ylab = '')
rasterImage(img, 0, 0, 1, 1)
img <- readJPEG(top5_high_haze[i])
plot(0:1,0:1, type='n', axes= FALSE, xlab= '', ylab = '')
rasterImage(img, 0, 0, 1, 1)
}
mtext('Separating out foggy images of Point Reyes, CA', font = 2, outer = TRUE)
Let's classify those into hazy and non-hazy as per the PhenoCam standard of 0.4.
# classify image as hazy: T/F
haze_mat[haze>0.4,foggy:=TRUE]
haze_mat[haze<=0.4,foggy:=FALSE]
head(haze_mat)
#> file haze A0 foggy
#> 1: data/pointreyes//pointreyes_2017_01_01_120056.jpg 0.2249810 0.6970257 FALSE
#> 2: data/pointreyes//pointreyes_2017_01_06_120210.jpg 0.2339372 0.6826148 FALSE
#> 3: data/pointreyes//pointreyes_2017_01_16_120105.jpg 0.2312940 0.7009978 FALSE
#> 4: data/pointreyes//pointreyes_2017_01_21_120105.jpg 0.4536108 0.6209055 TRUE
#> 5: data/pointreyes//pointreyes_2017_01_26_120106.jpg 0.2297961 0.6813884 FALSE
#> 6: data/pointreyes//pointreyes_2017_01_31_120125.jpg 0.4206842 0.6315728 TRUE
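A quick tally shows how the images split between the two classes (the counts will depend on the images you downloaded):
# count foggy vs. clear images
table(haze_mat$foggy)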
Now we can copy the foggy and clear images into separate folders, retaining the foggy images while keeping them apart from the non-foggy ones that we want to analyze.
# identify directory to move the foggy images to
foggy_dir <- paste0(pointreyes_dir, 'foggy')
clear_dir <- paste0(pointreyes_dir, 'clear')
# if a new directory, create new directory at this file path
dir.create(foggy_dir, showWarnings = FALSE)
dir.create(clear_dir, showWarnings = FALSE)
# copy the files to the new directories
file.copy(haze_mat[foggy==TRUE,file], to = foggy_dir)
#> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [29] FALSE FALSE
file.copy(haze_mat[foggy==FALSE,file], to = clear_dir)
#> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [29] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Now that we have our images separated, we can get the full list of haze values only for those images that are not classified as "foggy". (Note: the FALSE values returned by file.copy() above simply mean a file of the same name already existed at the destination, so nothing was copied.)
# this is an alternative approach instead of a for loop
# loading all the images as a list of arrays
pointreyes_clear_images <- dir(path = clear_dir,
pattern = '*.jpg',
ignore.case = TRUE,
full.names = TRUE)
img_list <- lapply(pointreyes_clear_images, FUN = jpeg::readJPEG)
# getting the haze value for the list
# patience - this takes a bit of time
haze_list <- t(sapply(img_list, FUN = getHazeFactor))
# view first few entries
head(haze_list)
#> haze A0
#> [1,] 0.224981 0.6970257
#> [2,] 0.2339372 0.6826148
#> [3,] 0.231294 0.7009978
#> [4,] 0.2297961 0.6813884
#> [5,] 0.2152078 0.6949932
#> [6,] 0.345584 0.6789334
We can then use these values for further analyses and data correction.
The hazer R package is developed and maintained by Bijan Seyednasrollah. The most recent release is available from https://github.com/bnasr/hazer.
Extracting Timeseries from Images using the xROI R Package
Authors: Bijan Seyednasrollah
Last Updated: Jun 10, 2024
In this tutorial, we'll learn how to use xROI, an interactive open-source toolkit that facilitates the process of time series extraction and improves the quality of the final data. The xROI package provides a responsive environment for scientists to interactively:
a) delineate regions of interest (ROIs), b) handle field of view (FOV) shifts, and c) extract and export time series data characterizing color-based metrics.
Using the xROI R package, the user can detect FOV shifts with minimal difficulty. The software gives the user the opportunity to readjust mask files or redraw new ones every time an FOV shift occurs.
xROI Design
The R language and Shiny package were used as the main development tools for xROI, while Markdown, HTML, CSS, and JavaScript were used to improve the interactivity. While Shiny apps are primarily used for web-based applications to be used online, the package authors used Shiny for its graphical user interface capabilities. In other words, both the User Interface (UI) and server modules are run locally from the same machine, and hence no internet connection is required (after installation). The xROI UI presents a side panel for data entry and three main tab pages, each responsible for a specific task. The server-side element consists of R and bash scripts. Image processing and geospatial features were performed using the Geospatial Data Abstraction Library (GDAL) and the rgdal and raster R packages.
Install xROI
The latest release of xROI can be directly downloaded and installed from the development GitHub repository.
# install devtools first
utils::install.packages('devtools', repos = "http://cran.us.r-project.org" )
# use devtools to install from GitHub
devtools::install_github("bnasr/xROI")
xROI depends on many R packages including: raster, rgdal, sp, jpeg, tiff, shiny, shinyjs, shinyBS, shinyAce, shinyTime, shinyFiles, shinydashboard, shinythemes, colourpicker, rjson, stringr, data.table, lubridate, plotly, moments, and RCurl. All the required libraries and packages will be automatically installed with the installation of xROI. The package offers a fully interactive high-level interface as well as a set of low-level functions for ROI processing.
Launch xROI
A comprehensive user manual for low-level image processing using xROI is available from the GitHub xROI repository. While the user manual includes a set of examples for each function, here we will learn to use the graphical interactive mode.
Calling the Launch() function, as we'll do below, opens the interactive mode in your operating system's default web browser. The landing page offers an example dataset to explore the different modules, or you can upload a new dataset of images. The interactive mode can be launched from an interactive R environment:
# load xROI
library(xROI)
# launch xROI
Launch()
Or from the command line (e.g., bash in Linux, Terminal in macOS, or Command Prompt in Windows) where an R engine is already installed:
Rscript -e "xROI::Launch(Interactive = TRUE)"
End xROI
When you are done with the xROI interface, you can close the tab in your browser and end the session in R. In RStudio, press the Esc key or click the Stop button in the console to interrupt the running app.
Use xROI
To get some hands-on experience with xROI, we can analyze images from the dukehw site of the PhenoCam network.
You can download the data set from this link (direct download).
Follow the steps below:
First, save and extract (unzip) the file on your computer.
Second, open the data set in xROI by setting the file path to your data:
# launch data in ROI
# first edit the path below to the downloaded directory you just extracted
xROI::Launch('/path/to/extracted/directory')
# alternatively, you can run without specifying a path and use the interface to browse
Now, draw an ROI and fill in the metadata.
Then, save the metadata and explore its content.
Now we can explore whether there is any FOV shift in the dataset using the CLI processor tab.
Finally, we can go to the Time series extraction tab, extract the time series, save the output, and explore the dataset in R.
Challenge: Use xROI
Let's use xROI on a slightly more challenging site with field of view shifts.
Download and extract the data set from this link (direct download, 218 MB) and follow the above steps to extract the time-series.
The xROI R package is developed and maintained by Bijan Seyednasrollah. The most recent release is available from https://github.com/bnasr/xROI.
Modeling phenology with the R package phenor
Authors: Koen Hufkens
Last Updated: Nov 23, 2020
This tutorial focuses on aggregating and combining various climate and phenology data sources for modeling purposes, using the phenor R package. It covers the various data sources (in particular, PhenoCam data), the structure of the formatted data, and the final modeling procedures using various phenology models.
R Skill Level: Introduction - you've got the basics of R down and understand the general structure of tabular data and lists.
Learning Objectives
After completing this tutorial, you will be able to:
- download PhenoCam time series data
- process time series data into transition date products (phenological events)
- download colocated climate data
- format these data in a standardized scheme
- use formatted data to calibrate phenology models
- make phenology predictions using forecast climate data
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and RStudio loaded on your computer to complete this tutorial. Optionally, a login to the Pan European Phenology Project (PEP725) website can be used for data retrieval.
Install R Packages
These R packages will be used in the tutorial below. Please make sure they are installed prior to starting the tutorial.
- devtools: install.packages("devtools")
- phenor: install_github("khufkens/phenor")
- phenocamr: install.packages("phenocamr")
- maps: install.packages("maps")
This tutorial has three parts:
- Introductions to the relevant R packages
- Aggregating & formatting the data
- Modeling phenology
Due to the large size of the data involved, we will learn how to obtain research-quality data in the aggregating data steps, but we will use pre-subsetted data sets for the modeling. The pre-subsetted sets can be downloaded at the end of each section or directly accessed during the modeling section.
The R packages
phenor
The phenor R package is a phenology modeling framework in R. The framework leverages measurements of vegetation phenology from four common phenology observation datasets combined with (global) retrospective and projected climate data. Currently, the package focuses on North America and Europe and relies heavily on Daymet and E-OBS climate data for underlying climate driver data in model optimization. The package supports global gridded CMIP5 forecasts for RCP4.5 and RCP8.5 climate change scenarios using the NASA Earth Exchange global downscaled daily projections.
Phenological model calibration and validation data are derived from four main sources:
- the transition dates derived from PhenoCam time series and included in this package.
- the MODIS MCD12Q2 phenology product using the MODISTools R package
- the Pan European Phenology Project (PEP725)
- the USA National Phenological Network (USA-NPN).
phenocamr
We will also use the phenocamr package to process data provided through the PhenoCam API and past data releases. Although the use of standard product releases is encouraged, in some instances you might want more control over the data processing and the transition date products generated. phenocamr provides this flexibility.
Get PhenoCam Data
In this tutorial, you are going to download PhenoCam time series, extract transition dates, and combine the derived spring phenology data with Daymet data to calibrate a spring phenology model. Finally, you will make projections for the end of the century under an RCP8.5 CMIP5 model scenario.
The PhenoCam Network includes data from around the globe (map). However, there are other data sources that may be of interest, including the Pan European Phenology Project (PEP725). For more on accessing data from PEP725, please see the final section of this tutorial.
First, we need to set up our R environment.
# uncomment for install
# install.packages("devtools")
# install_github("khufkens/phenor")
# install.packages("phenocamr")
# install.packages("maps")
library("phenocamr")
library("phenor")
library("maps")
library("raster")
To download phenology data from the PhenoCam network, use the download_phenocam() function from the phenocamr package. This function allows you to download site-based data and process it according to a standardized methodology. A full description of the methodology is provided in Scientific Data: Tracking vegetation phenology across diverse North American biomes using PhenoCam imagery (Richardson et al. 2018).
The code below downloads the 3-day time series for deciduous broadleaf data at the Bartlett Experimental Forest (bartlettir) PhenoCam site and estimates the phenophases (spring & autumn). For a detailed description of the download procedure, consult the phenocamr R package documentation.
# download the three day time series for deciduous broadleaf data at the
# Bartlett site and will estimate the phenophases (spring + autumn).
phenocamr::download_phenocam(
frequency = 3,
veg_type = "DB",
roi_id = 1000,
site = "bartlettir",
phenophase = TRUE,
out_dir = "."
)
## Downloading: bartlettir_DB_1000_3day.csv
## -- Flagging outliers!
## -- Smoothing time series!
## -- Estimating transition dates!
Setting out_dir = "." causes the downloaded data, both the 3-day time series and the calculated transition dates, to be stored in your current working directory. You can change this if you want to save the data elsewhere. You will get feedback on the processing steps completed.
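As a quick check, you can list the files the download created in your working directory (a sketch assuming the default file naming shown below):
# verify the downloaded files are in the working directory
list.files(".", pattern = "bartlettir")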
We can now load this data; both the time series and the transition files.
# load the time series data
df <- read.table("bartlettir_DB_1000_3day.csv", header = TRUE, sep = ",")
# read in the transition date file
td <- read.table("bartlettir_DB_1000_3day_transition_dates.csv",
header = TRUE,
sep = ",")
Threshold values
Now let's plot the data to see what we are working with. But first, let's subset the transition dates (td) for each year to those where 25% of the greenness amplitude (of the 90th percentile Gcc) is reached (threshold_25).
# select the rising (spring dates) for 25% threshold of Gcc 90
td <- td[td$direction == "rising" & td$gcc_value == "gcc_90",]
# create a simple line graph of the smooth Green Chromatic Coordinate (Gcc)
# and add points for transition dates
plot(as.Date(df$date), df$smooth_gcc_90, type = "l", xlab = "Date",
ylab = "Gcc (90th percentile)")
points(x = as.Date(td$transition_25, origin = "1970-01-01"),
y = td$threshold_25,
pch = 19,
col = "red")
Now we can see the transition date for each year of interest and the annual patterns of the Gcc.
However, if you want more control over the parameters used during processing, you can run through the three default processing steps as implemented in download_phenocam() and set parameters manually.
Of particular interest is the option to specify your own threshold used in determining transition dates. In the example below, we will set the upper threshold value to 80% of the amplitude (or 0.8). We will visualize the data as above, showing the newly found transition dates along the Gcc curve.
# the first step in phenocam processing is flagging of the outliers
# on the file you visualized in the previous step
detect_outliers("bartlettir_DB_1000_3day.csv",
out_dir = ".")
# the second step involves smoothing the data using an optimization approach
# we force the procedure as it will be skipped if smoothed data is already
# available
smooth_ts("bartlettir_DB_1000_3day.csv",
out_dir = ".",
force = TRUE)
# the third and final step is the generation of phenological transition dates
td <- phenophases("bartlettir_DB_1000_3day.csv",
internal = TRUE,
upper_thresh = 0.8)
Now we have manually set the parameters that were the defaults for our first plot. Note that there are also lower and middle threshold parameters; the order matters, so always use the relevant parameter (for details, check transition_dates()).
Now we can again plot the annual pattern with the transition dates.
# split out the rising (spring) component for Gcc 90th
td <- td$rising[td$rising$gcc_value == "gcc_90",]
# we can now visualize the upper threshold
plot(as.Date(df$date), df$smooth_gcc_90, type = "l",
xlab = "Date",
ylab = "Gcc (90th percentile)")
points(x = as.Date(td$transition_80, origin = "1970-01-01"),
y = td$threshold_80,
pch = 19,
col = "red")
With the above examples you can get a feeling of how to manually re-process PhenoCam time series.
Phenocam Subsetted Data Set
To allow our models to run in a timely manner, we will use the subsetted data that is included with the phenor package for the modeling portion of this tutorial. All deciduous broadleaf forest data in the PhenoCam V1.0 have been processed using the above settings. This data set is called phenocam_DB.
Alternatively, you can download a curated dataset from the ORNL DAAC, as fully described in Scientific Data: Tracking vegetation phenology across diverse North American biomes using PhenoCam imagery (Richardson et al. 2018). A limited copy, including only time series and transition dates, is also mirrored as a GitHub repo (500 MB).
Get Climate Data
In order to calibrate phenology models, additional climate data are required. Some of these data are dynamically queried during the formatting of the data. Alternatively, we can get climate data from another source, like the Coupled Model Intercomparison Project (CMIP5). The forecast CMIP5 data are gridded and too large to process dynamically. In order to use the CMIP5 data to make phenology projections, the data need to be downloaded one year at a time and subset where possible to reduce file sizes. Below you will find the instructions to download the 2090 CMIP5 data for the RCP8.5 scenario of the MIROC5 model. The data will be stored in the R temporary directory for later use. Please note that this is a large file (> 4 GB).
# download source cmip5 data into your temporary directory
# please note this is a large download: >4GB!
phenor::download_cmip5(
year = 2090,
path = tempdir(),
model = "MIROC5",
scenario = "rcp85"
)
phenor::download_cmip5(
year = 2010,
path = tempdir(),
model = "MIROC5",
scenario = "rcp85"
)
Format Phenology & Climate Data
If both phenology and climate data are available, you can aggregate and format the data for modeling purposes. All functions in the phenor package with a format_ prefix serve this purpose, although some might lack phenology validation data.
You can format PhenoCam data using the format_phenocam() function, which requires you to provide the correct path to PhenoCam transition date files, like those we downloaded above. This function will match the transition dates from PhenoCam data with the appropriate Daymet data (dynamically).
In the next code chunk, we will format the PhenoCam transition date data (in your working directory, "."), specifying the direction of the curve to be considered and setting the Gcc percentile, threshold, and temporal offset.
# Format the phenocam transition date data
# Specify the direction of the curve
# Specify the gcc percentile, threshold and the temporal offset
phenocam_data <- phenor::format_phenocam(
path = ".",
direction = "rising",
gcc_value = "gcc_90",
threshold = 50,
offset = 264,
internal = TRUE
)
## Processing 1 sites
## (progress bar output omitted)
# When internal = TRUE, the data will be returned to the R
# workspace, otherwise the data will be saved to disk.
# view data structure
str(phenocam_data)
## List of 1
## $ bartlettir:List of 13
## ..$ site : chr "bartlettir"
## ..$ location : num [1:2] 44.1 -71.3
## ..$ doy : int [1:365] -102 -101 -100 -99 -98 -97 -96 -95 -94 -93 ...
## ..$ ltm : num [1:365] 13.5 14.1 13.6 13 11.9 ...
## ..$ transition_dates: num [1:9] 133 129 122 133 130 128 136 130 138
## ..$ year : num [1:9] 2008 2009 2010 2011 2012 ...
## ..$ Ti : num [1:365, 1:9] 16 17.2 16.8 15.5 16.2 ...
## ..$ Tmini : num [1:365, 1:9] 7 10 10.5 7.5 6.5 11 16 14.5 7.5 3 ...
## ..$ Tmaxi : num [1:365, 1:9] 25 24.5 23 23.5 26 29 28.5 24 20 18 ...
## ..$ Li : num [1:365, 1:9] 11.9 11.9 11.8 11.8 11.7 ...
## ..$ Pi : num [1:365, 1:9] 0 0 0 0 0 0 5 6 0 0 ...
## ..$ VPDi : num [1:365, 1:9] 1000 1240 1280 1040 960 1320 1800 1640 1040 760 ...
## ..$ georeferencing : NULL
## - attr(*, "class")= chr "phenor_time_series_data"
As you can see, this formats a nested list of data. This nested list is consistent across all format_ functions.
Finally, when making projections for the coming century, you can use the format_cmip5() function. This function does not rely on phenology data but creates a consistent data structure so models can easily use these data. In addition, there is the option to spatially constrain the (global) data with an extent parameter. The extent is a vector with coordinates defining the region of interest as xmin, xmax, ymin, ymax in degrees of longitude and latitude.
Because this code involves a large download, we do not show its output.
# format the cmip5 data
cmip5_2090 <- phenor::format_cmip5(
path = tempdir(),
year = 2090,
offset = 264,
model = "MIROC5",
scenario = "rcp85",
extent = c(-95, -65, 24, 50),
internal = FALSE
)
cmip5_2010 <- phenor::format_cmip5(
path = tempdir(),
year = 2010,
offset = 264,
model = "MIROC5",
scenario = "rcp85",
extent = c(-95, -65, 24, 50),
internal = FALSE
)
Climate Training Dataset
Given the large size of the climate projection data above, we will use a subsetted and formatted training dataset. In the modeling section below, we will read these data directly into R.
Alternatively, you can download it here as a zip file (128 MB) or obtain the data by cloning the GitHub repository,
git clone https://github.com/khufkens/phenocamr_phenor_demo.git
Now that we have the needed phenology and climate projection data, we can create our model.
Phenology Model Parameterization
Gathering all this data serves as input to a model calibration routine. This routine tweaks parameters in the model specification in order to best fit the response to the available phenology data using the colocated climate driver data.
The default optimization method uses Simulated Annealing to find optimal parameter sets. Ideally the routine is run for >10K iterations (longer for complex models). When the procedure ends, by default, a plot of the modeled ~ measured data is provided in addition to model fit statistics. This gives you quick feedback on model accuracy.
For a full list of all models included and their model structure, please refer to the package documentation and Hufkens et al. 2018. An integrated phenology modelling framework in R: Phenology modelling with phenor.
For the phenology data, we'll use the example data that come with phenor. This will allow our models to run faster than if we used all the data we downloaded in the second part of this tutorial. phenocam_DB includes a subset of the deciduous broadleaf forest data in the PhenoCam V1.0. This has all been processed using the settings we used above.
# load example data
data("phenocam_DB")
# Calibrate a simple Thermal Time (TT) model using simulated annealing
# on the phenocam data. This routine might take some
# time to execute.
phenocam_par <- model_calibration(
model = "TT",
data = phenocam_DB,
method = "GenSA",
control = list(max.call = 4000),
par_ranges = sprintf("%s/extdata/parameter_ranges.csv", path.package("phenor")),
plot = TRUE)
##
## Call:
## stats::lm(formula = data$transition_dates ~ out)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.311 -5.321 -1.247 4.821 35.776
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0009523 4.9037867 0.00 1
## out 0.9933004 0.0397814 24.97 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.737 on 356 degrees of freedom
## Multiple R-squared: 0.6365, Adjusted R-squared: 0.6355
## F-statistic: 623.4 on 1 and 356 DF, p-value: < 2.2e-16
# you can specify or alter the parameter ranges in the file located at the path below
# copy this file and use the par_ranges argument to use your custom version
print(sprintf("%s/extdata/parameter_ranges.csv", path.package("phenor")))
## [1] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/phenor/extdata/parameter_ranges.csv"
We can list the parameters by looking at one of the nested list items (par).
# only list the TT model parameters, ignore other
# ancillary fields
print(phenocam_par$par)
## [1] 176.35246 -4.39729 549.56298
Phenology Model Predictions
To finally evaluate how these results would change phenology by the end of the century, we use the formatted CMIP5 data as drivers for estimate_phenology().
We will use demo CMIP5 data, instead of the data we downloaded earlier, so that our model processes faster.
# download the cmip5 files from the demo repository
download.file("https://github.com/khufkens/phenocamr_phenor_demo/raw/master/data/phenor_cmip5_data_MIROC5_2090_rcp85.rds",
"phenor_cmip5_data_MIROC5_2090_rcp85.rds")
download.file("https://github.com/khufkens/phenocamr_phenor_demo/raw/master/data/phenor_cmip5_data_MIROC5_2010_rcp85.rds",
"phenor_cmip5_data_MIROC5_2010_rcp85.rds")
# read in cmip5 data
cmip5_2090 <- readRDS("phenor_cmip5_data_MIROC5_2090_rcp85.rds")
cmip5_2010 <- readRDS("phenor_cmip5_data_MIROC5_2010_rcp85.rds")
Now that we have both the PhenoCam data and the climate data, we can run our model projections.
# project results forward to 2090 using the phenocam parameters
cmip5_projection_2090 <- phenor::estimate_phenology(
par = phenocam_par$par, # provide parameters
data = cmip5_2090, # provide data
model = "TT" # make sure to use the same model !
)
# project results forward to 2010 using the phenocam parameters
cmip5_projection_2010 <- phenor::estimate_phenology(
par = phenocam_par$par, # provide parameters
data = cmip5_2010, # provide data
model = "TT" # make sure to use the same model !
)
If the input data are gridded, the output will automatically be formatted as raster data, which can be plotted as a map using the raster package.
Let's view our model.
# plot the gridded results and overlay
# a world map outline
par(oma = c(0,0,0,0))
raster::plot(cmip5_projection_2090, main = "DOY")
maps::map("world", add = TRUE)
Maybe more interesting is showing the difference between the start (2010) and the end (2090) of the century.
# plot the gridded results and overlay
# a world map outline for reference
par(oma = c(0,0,0,0))
raster::plot(cmip5_projection_2010 - cmip5_projection_2090,
main = expression(Delta * "DOY"))
maps::map("world", add = TRUE)
What can you take away from these model visualizations?
PEP725 data
To get phenology data for Europe, you will likely want to use the Pan European Phenology Project (PEP725). This section teaches you how to access PEP725 data.
PEP725 Log In
Downloading data from the PEP725 network using phenor is more elaborate as it requires a login on the PEP725 website before you can access any data.
In order to move forward with this tutorial, create a login on the PEP725 website and save your login details in a plain text file (.txt) containing your email address and password on the first and second line, respectively. Name this file appropriately (e.g., pep725_credentials.txt).
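For example, a minimal sketch of creating that file from R (the email address and password below are placeholders; substitute your own registration details):
# write PEP725 credentials to a plain text file:
# email on line 1, password on line 2
writeLines(c("your.email@example.com", "your_password"),
           "~/pep725_credentials.txt")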
PEP725 Data Availability
To download PEP725 data you need to find out which data are available. You can either consult the data portal of the website, or use the check_pep725_species() function. This function allows you to either list all species in the dataset, or search by (partial) matches on the species names.
# to list all species use
species_list <- phenor::check_pep725_species(list = TRUE)
# to search only for Quercus (oak) use
quercus_nr <- phenor::check_pep725_species(species = "quercus")
# return results
head(species_list)
head(quercus_nr)
A query for Quercus returns a species ID number of 111. Once you have established the required species number you can move forward and download the species data.
phenor::download_pep725(
credentials = "~/pep725_credentials.txt",
species = 111,
path = ".",
internal = FALSE
)
The data use policy does not allow us to redistribute the data, so this concludes the tutorial portion on downloading PEP725 observational data. However, the use of the formatting functions required by phenor is consistent, and the example using PhenoCam data above should make you confident in processing data from the PEP725 database once downloaded.
PEP Climate Data
For the formatting of the PEP725 data, no automated routine is provided due to the size of the download and the policy of the E-OBS dataset. Register and download the E-OBS data for the 0.25 degree regular grid for the best estimates of TG, TN, TX, RR, PP (0.5 degree data are supported but not recommended).
Format PEP Climate Data
Similarly, the PEP725 data have a dedicated formatting function in the phenor package, format_pep725(). It will use the previously downloaded E-OBS data to provide the required climate data for the downloaded PEP725 data (both file directories are requested). In addition, you need to specify which BBCH-scale value you would like to see included in the final formatted dataset.
# provisional query, code not run due to download / login requirements
pep725_data <- phenor::format_pep725(
pep_path = ".",
eobs_path = "/your/eobs/path/",
bbch = "11",
offset = 264,
count = 60,
resolution = 0.25
)