Event - Workshop
NEON Brownbag: Intro to Working with HDF5
May 28, 2015
Hosted By:
NEON
This workshop will provide hands-on experience working with hierarchical data formats (HDF5) in R.
Objectives
After completing this workshop, you will be able to:
- Describe what the Hierarchical Data Format (HDF5) is.
- Create and read from HDF5 files in R.
- Read and visualize time series data stored in HDF5 format.
Things to Do Before the Workshop
To participate in this workshop, you will need a laptop with the most current version of R and, preferably, RStudio installed. For details on setting up R & RStudio on Mac, Windows, or Linux operating systems, please see Additional Set Up Instructions below.
Install R Libraries
Please install or update each package prior to the start of the workshop.
- rhdf5:
source("http://bioconductor.org/biocLite.R")
biocLite("rhdf5")
- ggplot2:
install.packages("ggplot2")
- dplyr: data manipulation at its finest!
install.packages("dplyr")
- scales: this library makes it easier to plot time series data
install.packages("scales")
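Before the workshop, it can help to confirm which of the four packages are already present. The following is a minimal sketch using base R only; the variable names are illustrative and not part of the workshop materials.

```r
# Packages the workshop asks for (taken from the list above).
workshop_pkgs <- c("rhdf5", "ggplot2", "dplyr", "scales")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
is_installed <- vapply(workshop_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)

# Anything listed here still needs to be installed.
missing_pkgs <- workshop_pkgs[!is_installed]

if (length(missing_pkgs) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
}
```

Note that recent Bioconductor releases distribute rhdf5 through the BiocManager package (BiocManager::install("rhdf5")) rather than the biocLite script, so the rhdf5 command above may need adjusting on a current R installation.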
Data to Download
NEON Teaching Data Subset: Sample Tower Temperature - HDF5
These temperature data were collected by the National Ecological Observatory Network's flux towers at field sites across the US. The entire dataset can be accessed by request from the NEON Data Portal.
NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Experimental Range field site. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Data Portal.
NEON Teaching Data Subset: Field Site Spatial Data
These remote sensing data files provide information on the vegetation at the National Ecological Observatory Network's San Joaquin Experimental Range and Soaproot Saddle field sites. The entire dataset can be accessed by request from the NEON Data Portal.
Background Reading
- If you are unfamiliar with the HDF5 format, please read the *Hierarchical Data Formats - What is HDF5?* tutorial.
Schedule
Additional Set Up Instructions
R & RStudio
Prior to the workshop you should have R and, preferably, RStudio installed on your computer.
Setting Up R & RStudio
Windows R/RStudio Setup
- Download R for Windows here
- Run the .exe file that was just downloaded
- Go to the RStudio Download page
- Under Installers select RStudio X.XX.XXX - Windows Vista/7/8/10
- Double click the file to install it
Once R and RStudio are installed, click to open RStudio. If you don't get any error messages you are set. If there is an error message, you will need to re-install the program.
Mac R/RStudio Setup
- Go to CRAN and click on Download R for (Mac) OS X
- Select the .pkg file for the version of OS X that you have and the file will download.
- Double click on the file that was downloaded and R will install
- Go to the RStudio Download page
- Under Installers select RStudio 0.98.1103 - Mac OS X XX.X (64-bit) to download it.
- Once it's downloaded, double click the file to install it
Once R and RStudio are installed, click to open RStudio. If you don't get any error messages you are set. If there is an error message, you will need to re-install the program.
Linux R/RStudio Setup
- R is available through most Linux package managers. You can download the binary files for your distribution from CRAN, or you can use your package manager (e.g., for Debian/Ubuntu run
sudo apt-get install r-base
and for Fedora run
sudo yum install R
).
- To install RStudio, go to the RStudio Download page
- Under Installers select the version for your distribution.
- Once it's downloaded, double click the file to install it
Once R and RStudio are installed, click to open RStudio. If you don't get any error messages you are set. If there is an error message, you will need to re-install the program.
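Beyond checking for error messages, the installed version can be confirmed from the R console. This is a small sketch in base R; the minimum version used below is only a placeholder, since the page asks for "the most current version" without naming a number.

```r
# Print the version string of the running R session.
print(R.version.string)

# Compare against a minimum version. "3.0.0" is a placeholder threshold,
# not a requirement stated by the workshop.
min_version <- "3.0.0"
r_is_recent <- getRversion() >= min_version
message("R is at least ", min_version, ": ", r_is_recent)

# RStudio sets the RSTUDIO environment variable to "1" in its sessions,
# so this reports whether the code is running inside RStudio.
in_rstudio <- identical(Sys.getenv("RSTUDIO"), "1")
message("Running inside RStudio: ", in_rstudio)
```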
Set Working Directory to Downloaded Data
1) Download Data
After clicking on the Download Data button, the data will automatically download to the computer.
2) Locate .zip file
Second, find the downloaded .zip file. Many browsers save downloaded files to your computer’s Downloads directory. If you have previously set a different directory (folder) for downloaded files, the .zip file will download there.
3) Move to **data** directory
Third, move the downloaded file to a directory called data within the Documents directory on your computer. You can place the data in another location; however, you will then need to set your R working directory to that location rather than the one we demonstrate in the workshop.
4) Unzip/uncompress
Fourth, we need to unzip/uncompress the file so that the data files can be accessed. Use your favorite tool that can unpack/open .zip files (e.g., WinZip, Archive Utility, etc.). The files will now be accessible in three directories.
These directories contain all of the subdirectories and files that we will use in this workshop.
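Step 4 can also be done from within R using the unzip() function from the utils package. The sketch below assumes the download sits in the Downloads directory; the .zip file name is hypothetical, so substitute the name your browser actually saved. The helper name unpack_download is also illustrative.

```r
# Extract a downloaded .zip file into the data directory.
# Returns the paths of the extracted files, or character(0) if the
# .zip file cannot be found.
unpack_download <- function(zip_path, data_dir) {
  if (!file.exists(zip_path)) {
    message("No .zip file found at: ", zip_path)
    return(character(0))
  }
  # Create the target directory if it does not exist yet.
  dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
  # unzip() extracts the archive and returns the extracted file paths.
  utils::unzip(zip_path, exdir = data_dir)
}

# Hypothetical file name; replace it with the name of your download.
zip_path <- file.path("~", "Downloads", "NEON-teaching-data.zip")
data_dir <- file.path("~", "Documents", "data")

extracted <- unpack_download(zip_path, data_dir)

# List the top-level directories that now sit inside the data directory.
list.dirs(data_dir, recursive = FALSE)
```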
5) Set working directory
Fifth, we need to set the working directory in R to this data directory, which is the parent of the directories containing the data we want. For complete directions on how to do that, check out the Set A Working Directory in R tutorial.
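As a rough illustration of step 5, the snippet below points R at a data directory inside Documents, as described in step 3. It assumes the default layout suggested above; adjust the path if you placed the data elsewhere.

```r
# Path to the data directory described in step 3 (Documents/data).
# path.expand() turns "~" into the full home directory path.
data_dir <- path.expand(file.path("~", "Documents", "data"))

if (dir.exists(data_dir)) {
  # Make the data directory the working directory for this session.
  setwd(data_dir)
} else {
  message("Directory not found: ", data_dir)
}

# Confirm where R is now looking, and what it can see there.
getwd()
list.files()
```

Checking dir.exists() first avoids the error that setwd() raises when the directory is missing.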
Install HDFView
The free HDFView application allows you to explore the contents of an HDF5 file.
To install HDFView:
- Go to the HDFView download page.
- From the section titled HDF-Java 2.1x Pre-Built Binary Distributions select the HDFView download option that matches the operating system and computer setup (32 bit vs 64 bit) that you have. The download will start automatically.
- Open the downloaded file:
  - Mac: You may want to add the HDFView application to your Applications directory.
  - Windows: Unzip the file, open the folder, run the .exe file, and follow the directions to complete installation.
- Open HDFView to ensure that the program installed correctly.
Data Tip: The HDFView application requires Java to be up to date. If you are having issues opening HDFView, try to update Java first!
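If HDFView will not open, the structure of an HDF5 file can also be listed from R with the rhdf5 package. The sketch below is not part of the workshop materials: it writes a small throwaway file so that it runs without the workshop data, and the group name, dataset name, and temperature values are all made up for illustration.

```r
if (requireNamespace("rhdf5", quietly = TRUE)) {
  library(rhdf5)

  # Write a tiny temporary file so the example does not depend on
  # the workshop data.
  h5_path <- tempfile(fileext = ".h5")
  h5createFile(h5_path)
  h5createGroup(h5_path, "tower")  # hypothetical group name
  # Made-up values stored in a hypothetical dataset.
  h5write(c(12.1, 13.4, 11.8), h5_path, "tower/temperature")

  # h5ls() lists every group and dataset in the file, similar to the
  # tree view that HDFView displays.
  contents <- h5ls(h5_path)
  print(contents)

  # Read the dataset back into R.
  temps <- h5read(h5_path, "tower/temperature")
  print(temps)
} else {
  message("rhdf5 is not installed; see the package list earlier on this page.")
}
```

To inspect one of the downloaded teaching files instead, pass its path to h5ls() in place of h5_path.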
QGIS (Optional)
QGIS is a cross-platform, open source geographic information system (GIS).
Online LiDAR Data/las Viewer (Optional)
Plas.io is an open source LiDAR data viewer developed by Martin Isenburg of LAStools and several of his colleagues.
Location:
TBD