This tutorial explores working with date and time fields in R. We will review
the differences between as.Date, POSIXct, and POSIXlt as used to convert
a date/time field in character (string) format to a date-time format that is
recognized by R. This conversion supports efficient plotting, subsetting, and
analysis of time series data.
Learning Objectives
After completing this tutorial, you will be able to:
Describe various date-time classes and data structures in R.
Explain the difference between the POSIXct and POSIXlt data classes and
why POSIXct may be preferred for some tasks.
Convert a column containing date-time information in character
format to a date-time R class.
Convert a date-time column to different date-time classes.
Write out a date-time class object in different ways (month-day,
month-day-year, etc).
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded on
your computer to complete this tutorial.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce
learned skills. If available, the code for challenge solutions is found in the
downloadable R script of the entire lesson, available in the footer of each lesson page.
The Data Approach
In the Intro to Time Series Data in R tutorial,
we imported a time series dataset in .csv format into R. We learned how to
quickly plot these data by converting the date column to an R Date class.
In this tutorial we will explore how to work with a column that contains both a
date AND a time stamp.
We will use functions from both base R and the lubridate package to work
with date-time data classes.
# Load packages required for entire script
library(lubridate) #work with dates
#Set the working directory and place your downloaded data there
wd <- "~/Git/data/"
Import CSV File
First, let's import our time series data. We are interested in temperature,
precipitation and photosynthetically active radiation (PAR) - metrics that are
strongly associated with vegetation green-up and brown-down (phenology or
phenophase timing). We will use the hf001-10-15min-m.csv file
that contains atmospheric data for the NEON Harvard Forest field site,
aggregated at 15-minute intervals. Download the dataset for these exercises here.
# Load csv file of 15 min meteorological data from Harvard Forest
# contained within the downloaded directory, or available for download
# directly from:
# https://harvardforest.fas.harvard.edu/data/p00/hf001/hf001-10-15min-m.csv
# stringsAsFactors=FALSE so strings (series of letters/words/numerals) remain characters
harMet_15Min <- read.csv(
file=paste0(wd,"NEON-DS-Met-Time-Series/HARV/FisherTower-Met/hf001-10-15min-m.csv"),
stringsAsFactors = FALSE)
Date and Time Data
Let's revisit the data structure of our harMet_15Min object. What is the class
of the date-time column?
# view column data class
class(harMet_15Min$datetime)
## [1] "character"
# view sample data
head(harMet_15Min$datetime)
## [1] "2005-01-01T00:15" "2005-01-01T00:30" "2005-01-01T00:45"
## [4] "2005-01-01T01:00" "2005-01-01T01:15" "2005-01-01T01:30"
Our datetime column is stored as a character class. We need to convert it to a
date-time class. What happens when we use the as.Date method that we learned
about in the
Intro to Time Series Data in R tutorial?
# convert column to date class
dateOnly_HARV <- as.Date(harMet_15Min$datetime)
# view data
head(dateOnly_HARV)
## [1] "2005-01-01" "2005-01-01" "2005-01-01" "2005-01-01" "2005-01-01"
## [6] "2005-01-01"
When we use as.Date, we lose the time stamp.
Explore Date and Time Classes
R - Date Class - as.Date
As we just saw, the Date class doesn't store any time information. When we
use the as.Date() method to convert a date stored as a character class to an R
Date class, it ignores all values after the date string.
# Convert character data to date (no time)
myDate <- as.Date("2015-10-19 10:15")
str(myDate)
## Date[1:1], format: "2015-10-19"
# what happens if the date has text at the end?
myDate2 <- as.Date("2015-10-19Hello")
str(myDate2)
## Date[1:1], format: "2015-10-19"
As we can see above, the as.Date() function converts the characters that it
recognizes as part of a date into a Date class and ignores all other
characters in the string.
R - Date-Time - The POSIX classes
If we have a column containing both date and time we need to use a class that
stores both date AND time. Base R offers two closely related classes for date
and time: POSIXct and POSIXlt.
POSIXct
The as.POSIXct method converts a date-time string into a POSIXct class.
# Convert character data to date and time.
timeDate <- as.POSIXct("2015-10-19 10:15")
str(timeDate)
## POSIXct[1:1], format: "2015-10-19 10:15:00"
timeDate
## [1] "2015-10-19 10:15:00 MDT"
as.POSIXct stores both a date and time with an associated time zone. The
default time zone is the one your computer is set to, which is most often your
local time zone.
POSIXct stores date and time as the number of seconds since 1 January 1970.
Negative numbers are used to store dates prior to 1970.
Thus, the POSIXct format stores each date and time as a single value in units of
seconds. Storing the data this way optimizes use in data.frames and speeds up
computation, processing, and conversion to other formats.
# to see the data in this 'raw' format, i.e., not formatted according to the
# class type to show us a date we recognize, use the `unclass()` function.
unclass(timeDate)
## [1] 1445271300
## attr(,"tzone")
## [1] ""
Here we see the number of seconds from 1 January 1970 to 19 October 2015
(1445271300). Also, we see that a time zone (tzone) is associated with the value in seconds.
**Data Tip:** The `unclass` method in R allows you
to view how a particular R object is stored.
POSIXlt
The POSIXct format is optimized for storage and computation. However, humans
aren't quite as computationally efficient as computers! Also, we often want to
quickly extract some portion of the data (e.g., months). The POSIXlt class
stores date and time information in a format that we are used to seeing (e.g.,
second, min, hour, day of month, month, year, numeric day of year, etc).
When we convert a string to POSIXlt and view it in R, it still looks
similar to the POSIXct format. However, unclass() shows us that the data are
stored differently. The POSIXlt class stores the hour, minute, second, day,
month, and year separately.
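For example (a short sketch added here for completeness; the exact time zone
fields in the output will depend on your computer's settings):
# convert character data to POSIXlt date and time
timeDate_lt <- as.POSIXlt("2015-10-19 10:15")
timeDate_lt
# prints like POSIXct, e.g. "2015-10-19 10:15:00" plus your local time zone
# view the underlying list of components
unclass(timeDate_lt)
# returns a list whose elements for this date-time include:
# $sec = 0, $min = 15, $hour = 10, $mday = 19, $mon = 9, $year = 115,
# plus $wday, $yday, $isdst and time zone attributes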
Months in POSIXlt
POSIXlt has a few quirks. First, the month values stored in the POSIXlt
object use zero-based indexing. This means that month #1 (January) is stored
as 0, and month #2 (February) is stored as 1. Notice in the output above,
October is stored as the 9th month ($mon = 9).
Years in POSIXlt
Years are also stored differently in the POSIXlt class. Year values are stored
using a base index value of 1900. Thus, 2015 is stored as 115 ($year = 115;
that is, 115 years since 1900).
**Data Tip:** To learn more about how R stores
information within date-time and other objects check out the R documentation
by using `?` (e.g., `?POSIXlt`). NOTE: you can use this same syntax to learn
about particular functions (e.g., `?ggplot`).
Having dates classified as separate components makes POSIXlt computationally
more resource intensive to use in data.frames. This slows things down! We will
thus use POSIXct for this tutorial.
**Data Tip:** There are other R packages that
support date-time data classes, including `readr`, `zoo` and `chron`.
Convert to Date-time Class
When we convert from a character to a date-time class we need to tell R how
the date and time information are stored in each string. To do this, we can use
format=. Let's have a look at one of our date-time strings to determine its
format.
# view one date-time field
harMet_15Min$datetime[1]
## [1] "2005-01-01T00:15"
Looking at the results above, we see that our data are stored in the format:
Year-Month-Day "T" Hour:Minute (2005-01-01T00:15). We can use this information
to populate our format string using the following designations for the
components of the date-time data:
%Y - year
%m - month
%d - day
%H:%M - hours:minutes
**Data Tip:** A list of options for date-time format
is available in the `strptime` function help: `help(strptime)`. Check it out!
The "T" inserted into the middle of datetime isn't a standard character for
date and time, however we can tell R where the character is so it can ignore
it and interpret the correct date and time as follows:
format="%Y-%m-%dT%H:%M".
# convert single instance of date/time in format year-month-day hour:min:sec
as.POSIXct(harMet_15Min$datetime[1],format="%Y-%m-%dT%H:%M")
## [1] "2005-01-01 00:15:00 MST"
# The format of date-time MUST match the specified format or the data will not
# convert; see what happens when you try it a different way or without the "T"
# specified
as.POSIXct(harMet_15Min$datetime[1],format="%d-%m-%Y%H:%M")
## [1] NA
as.POSIXct(harMet_15Min$datetime[1],format="%Y-%m-%d%H:%M")
## [1] NA
Using the syntax we've learned, we can convert the entire datetime column into
POSIXct class.
new.date.time <- as.POSIXct(harMet_15Min$datetime,
format="%Y-%m-%dT%H:%M" #format time
)
# view output
head(new.date.time)
## [1] "2005-01-01 00:15:00 MST" "2005-01-01 00:30:00 MST"
## [3] "2005-01-01 00:45:00 MST" "2005-01-01 01:00:00 MST"
## [5] "2005-01-01 01:15:00 MST" "2005-01-01 01:30:00 MST"
# what class is the output
class(new.date.time)
## [1] "POSIXct" "POSIXt"
About Time Zones
Above, we successfully converted our data into a date-time class. However, what
time zone is the new.date.time object that we created using? Its values display
like this:
2005-04-15 03:30:00 MDT
It appears as if our data were assigned Mountain Time (MST/MDT - the time zone
of the computer where these tutorials were originally processed, in Colorado).
However, we know from the metadata, explored in the
Why Metadata Are Important: How to Work with Metadata in Text & EML Format tutorial,
that these data were stored in Eastern Standard Time.
Assign Time Zone
When we convert a date-time formatted column to the POSIXct class, we need to
assign an associated time zone. If we don't assign a time zone, R will
default to the local time zone that is defined on your computer.
There are individual designations for different time zones and time zone
variants (e.g., whether the time occurs during daylight saving time).
**Data Tip:** Codes for time zones can be found in this
time zone table.
The code for the Eastern time zone that is the closest match to the NEON Harvard
Forest field site is America/New_York. Let's convert our datetime field
one more time, and define the associated timezone (tz=).
# assign time zone to just the first entry
as.POSIXct(harMet_15Min$datetime[1],
format = "%Y-%m-%dT%H:%M",
tz = "America/New_York")
## [1] "2005-01-01 00:15:00 EST"
The output above shows us that the time zone is now correctly set as EST.
Convert to Date-time Data Class
Now, using the syntax that we learned above, we can convert the entire
datetime column to a POSIXct class.
# convert to POSIXct date-time class
harMet_15Min$datetime <- as.POSIXct(harMet_15Min$datetime,
format = "%Y-%m-%dT%H:%M",
tz = "America/New_York")
# view structure and time zone of the newly defined datetime column
str(harMet_15Min$datetime)
## POSIXct[1:376800], format: "2005-01-01 00:15:00" "2005-01-01 00:30:00" ...
tz(harMet_15Min$datetime)
## [1] "America/New_York"
Now that our datetime data are properly identified as a POSIXct date-time
data class we can continue on and look at the patterns of precipitation,
temperature, and PAR through time.
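As a quick, hedged preview (not part of the original lesson code), we could plot
the 15-minute precipitation through time with qplot(); this assumes the
15-minute file stores precipitation in a column named prec, as the daily Harvard
Forest files do.
# load ggplot2 for qplot()
library(ggplot2)
# plot 15-minute precipitation through time
qplot(x = datetime, y = prec,
      data = harMet_15Min,
      main = "15-Minute Precipitation\nNEON Harvard Forest Field Site")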
This tutorial covers what metadata are, and why we need to work with
metadata. It covers the 3 most common metadata formats: text file format,
web page format and Ecological Metadata Language (EML).
R Skill Level: Introduction - you've got the basics of R down and
understand the general structure of tabular data.
Learning Objectives
After completing this tutorial, you will be able to:
Import a .csv file and examine the structure of the related R
object.
Use a metadata file to better understand the content of a dataset.
Explain the importance of including metadata details in your R script.
Describe what an EML file is.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded on
your computer to complete this tutorial.
Install R Packages
When presented in a workshop, the EML package will be presented as a
demonstration!
If completing the EML portion of the tutorial on your own, you must
install EML directly from GitHub (the package is in development and is not
yet on CRAN). You will need to install devtools to do this.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce
learned skills. If available, the code for challenge solutions is found in the
downloadable R script of the entire lesson, available in the footer of each lesson page.
Understand Our Data
In order to work with any data, we need to understand three things about the
data:
Structure
Format
Processing methods
If the data are collected by other people and organizations, we might also need
further information about:
What metrics are included in the dataset
The units those metrics are stored in
An explanation of where the metrics are stored in the data and what they are "called"
(e.g., what the column names are in a spreadsheet)
The time range the data cover
The spatial extent the data cover
The above information, and more, is stored in metadata - data
about the data. Metadata is information that describes a dataset and is integral
to working with external data (data that we did not collect ourselves).
Metadata Formats
Metadata come in different formats. We will discuss three of those in this tutorial:
Ecological Metadata Language (EML): A standardized metadata format stored
as XML, which is machine readable. Although EML is a standard, it is
common to encounter metadata stored differently in EML files created by different
organizations.
Text file: Sometimes metadata files can be found in text files that are either
downloaded with a data product OR that are available separately for the data.
Directly on a website (HTML / XML): Sometimes data are documented directly
in text format, on a web page.
**Data Tip:** When you find metadata for a dataset
that you are working with, immediately **DOWNLOAD AND SAVE IT** to the directory
on your computer where you saved the data. It is also a good idea to document
the URL where you found the metadata and data in a "readme" text file!
Metadata Stored on a Web Page
The metadata for the data that we are working with for the Harvard Forest field
site are stored in both EML format and on a web page. Let's explore the web
page first.
Let's begin by visiting the Harvard Forest Data Archive page for the Fisher
Meteorological Station. At the top of the page, there is a list
of data available for Harvard Forest. NOTE: hf001-06: daily (metric) since
2001 (preview) is the data that we used in the
previous tutorial.
Scroll down to the Overview section on the website. Take note of the
information provided in that section and answer the questions in the
Challenge below.
### Challenge: Explore Metadata
Explore the metadata stored on the Harvard Forest LTER web page. Answer the
following questions.
What is the time span of the data available for this site?
You have some questions about these data. Who is the lead investigator / who
do you contact for more information? And how do you contact them?
Where is this field site located? How is the site location information stored
in the metadata? Is there any information that may be useful to you viewing the
location? HINT: What if you were not familiar with Harvard as a site / from
another country, etc?
Field Site Information: What is the elevation for the site? What is the
dominant vegetation cover for the site? HINT: Is dominant vegetation easy to
find in the metadata?
How is snow dealt with in the precipitation data?
Are there some metadata attributes that might be useful to access in a script
in R or Python rather than viewed on a web page? HINT: Can you answer all of
the questions above from the information provided on this website? Is there
additional information that you might want to find on the page?
View Metadata For Metrics of Interest
For this tutorial series, we are interested in the drivers of plant phenology -
specifically air and soil temperature, precipitation and photosynthetically
active radiation (PAR). Let's look for descriptions of these variables in the
metadata and determine several key attributes that we will need prior to working
with the data.
### Challenge: Metrics of Interest Metadata
View the metadata at the URL above. Answer the following questions about the
Harvard Forest LTER data - hf001-10: 15-minute (metric) since 2005:
What is the column heading name where each variable (air temperature, soil
temperature, precipitation and PAR) is stored?
What are the units that each variable are stored in?
What is the frequency of measurement collected for each and how are noData
values stored?
Where is the date information stored (in what field) and what timezone are
the dates stored in?
Why Metadata on a Web Page Is Not Ideal
It is nice to have a web page that displays metadata information; however,
accessing metadata on a web page has drawbacks:
If the web page URL changes or the site goes down, the information is lost.
It's also more challenging to access metadata in text format on a web page
programmatically - for example, using R as an interface - which we often
want to do when working with larger datasets.
A machine readable metadata file is better - especially when we are working with
large data and we want to automate and carefully document workflows. The
Ecological Metadata Language (EML) is one machine readable metadata format.
Ecological Metadata Language (EML)
While much of the documentation that we need to work with at the Harvard Forest
field site is available directly on the
Harvard Forest Data Archive page,
the website also offers metadata in EML format.
Introduction to EML
The Ecological Metadata Language (EML) is a data specification developed
specifically
to document ecological data. An EML file is created using an XML-based format.
This means that content is embedded within hierarchical tags. For example,
the title of a dataset might be embedded in a <title> tag as follows:
<title>Fisher Meteorological Station at Harvard Forest since 2001</title>
Similarly, the creator of a dataset can also be found in a hierarchical tag
structure.
The EML package for R is designed to read and allow users to work with EML
formatted metadata. In this tutorial, we demonstrate how we can use EML in an
automated workflow.
NOTE: The EML package is still being developed, therefore we will not
explicitly teach all details of how to use it. Instead, we will provide
an example of how you can access EML files programmatically and background
information so that you can further explore EML and the EML package if you
need to work with it further.
EML Terminology
Let's first discuss some basic EML terminology. In the
context of EML, each EML file documents a dataset. This dataset may
consist of one or more files that contain the data in data tables. In the
case of our tabular meteorology data, the structure of our EML file includes:
The dataset. A dataset may contain
one or more data tables associated with it that may contain different types of
related information. For this Harvard Forest meteorological data, the data
tables contain tower measurements including precipitation and temperature, that
are aggregated at various time intervals (15 minute, daily, etc) and that date
back to 2001.
The data tables. Data tables refer to the actual data that make up the
dataset. For the Harvard Forest data, each data table contains a suite of
meteorological metrics, including precipitation and temperature (and associated
quality flags), that are aggregated at a particular time interval (e.g. one
data table contains monthly average data, another contains 15 minute
averaged data, etc).
Work With EML in R
To begin, we will load the EML package, which can be installed directly from ROpenSci's GitHub repository.
# install R EML tool
# load devtools (if you need to install "EML")
#library("devtools")
# IF YOU HAVE NOT DONE SO ALREADY: install EML from github -- package in
# development; not on CRAN
#install_github("ropensci/EML", build=FALSE, dependencies=c("DEPENDS", "IMPORTS"))
# load ROpenSci EML package
library(EML)
# load ggmap for mapping
library(ggmap)
# load tmaptools for mapping
library(tmaptools)
Next, we will read in the LTER EML file - directly from the online URL - using
read_eml(). This file documents multiple data products that can be downloaded.
Check out the
Harvard Forest Data Archive Page for Fisher Meteorological Station
for more on this dataset and to download the archive files directly.
Note that because this EML file is large, it takes a few seconds for the file to
load.
# data location
# http://harvardforest.fas.harvard.edu:8080/exist/apps/datasets/showData.html?id=hf001
# table 4 http://harvardforest.fas.harvard.edu/data/p00/hf001/hf001-04-monthly-m.csv
# import EML from Harvard Forest Met Data
# note: for this tutorial we read the full EML file directly from the Harvard Forest
# website; a truncated version is also available (see the commented-out line below)
eml_HARV <- read_eml("https://harvardforest1.fas.harvard.edu/sites/harvardforest.fas.harvard.edu/files/data/eml/hf001.xml")
# import a truncated version of the eml file for quicker demonstration
# eml_HARV <- read_eml("http://neonscience.github.io/NEON-R-Tabular-Time-Series/hf001-revised.xml")
# view size of object
object.size(eml_HARV)
## 1299568 bytes
# view the object class
class(eml_HARV)
## [1] "emld" "list"
The read_eml() function creates an emld object (a list structure). Its elements
can be accessed using the $ operator rather than a typical subset [ ] approach.
Explore Metadata Attributes
We can begin to explore the contents of our EML file and associated data that it
describes. Let's start at the dataset level. We can use slots to view
the contact information for the dataset and a description of the methods.
# view the contact name listed in the file
eml_HARV$dataset$creator
## $individualName
## $individualName$givenName
## [1] "Emery"
##
## $individualName$surName
## [1] "Boose"
# view information about the methods used to collect the data as described in EML
eml_HARV$dataset$methods
## $methodStep
## $methodStep$description
## $methodStep$description$section
## $methodStep$description$section[[1]]
## [1] "<title>Observation periods</title><para>15-minute: 15 minutes, ending with given time. Hourly: 1 hour, ending with given time. Daily: 1 day, midnight to midnight. All times are Eastern Standard Time.</para>"
##
## $methodStep$description$section[[2]]
## [1] "<title>Instruments</title><para>Air temperature and relative humidity: Vaisala HMP45C (2.2m above ground). Precipitation: Met One 385 heated rain gage (top of gage 1.6m above ground). Global solar radiation: Licor LI200X pyranometer (2.7m above ground). PAR radiation: Licor LI190SB quantum sensor (2.7m above ground). Net radiation: Kipp and Zonen NR-LITE net radiometer (5.0m above ground). Barometric pressure: Vaisala CS105 barometer. Wind speed and direction: R.M. Young 05103 wind monitor (10m above ground). Soil temperature: Campbell 107 temperature probe (10cm below ground). Data logger: Campbell Scientific CR10X.</para>"
##
## $methodStep$description$section[[3]]
## [1] "<title>Instrument and flag notes</title><para>Air temperature. Daily air temperature is estimated from other stations as needed to complete record.</para><para>Precipitation. Daily precipitation is estimated from other stations as needed to complete record. Delayed melting of snow and ice (caused by problems with rain gage heater or heavy precipitation) is noted in log - daily values are corrected if necessary but 15-minute values are not. The gage may underestimate actual precipitation under windy or cold conditions.</para><para>Radiation. Whenever possible, snow and ice are removed from radiation instruments after precipitation events. Depth of snow or ice on instruments and time of removal are noted in log, but values are not corrected or flagged.</para><para>Wind speed and direction. During ice storms, values are flagged as questionable when there is evidence (from direct observation or the 15-minute record) that ice accumulation may have affected the instrument's operation.</para>"
Identify & Map Data Location
Looking at the coverage for our data, there is only one unique x and y value.
This suggests that our data were collected at a single (x,y) point location. We know
this is a tower so a point location makes sense. Let's grab the x and y
coordinates and create a quick context map. We will use ggmap to create our
map.
NOTE: If this were a rectangular extent, we'd want the bounding box not just
the point. This is important if the data are in raster, HDF5, or a similar format.
We need the extent to properly geolocate and process the data.
# grab x coordinate from the coverage information
XCoord <- eml_HARV$dataset$coverage$geographicCoverage$boundingCoordinates$westBoundingCoordinate[[1]]
# grab y coordinate from the coverage information
YCoord <- eml_HARV$dataset$coverage$geographicCoverage$boundingCoordinates$northBoundingCoordinate[[1]]
# make a map and add the xy coordinates to it
ggmap(get_stamenmap(rbind(as.numeric(paste(geocode_OSM("Massachusetts")$bbox))),
                    zoom = 11, maptype = c("toner")),
      extent = TRUE) +
  geom_point(aes(x = as.numeric(XCoord), y = as.numeric(YCoord)),
             color = "darkred", size = 6, pch = 18)
The above example demonstrated how we can extract information from an EML
document and use it programmatically in R! This is just the beginning of what
we can do!
Metadata For Your Own Data
Now, imagine that you are working with someone else's data and you don't have a
metadata file associated with it. How do you know what units the data are in?
How the data were collected? What location the data cover? It is equally
important to create metadata for your own data, to make your data more
easily shareable.
This tutorial will demonstrate how to import a time series dataset stored in .csv
format into R. It will explore data classes for columns in a data.frame and
will walk through how to
convert a date, stored as a character string, into a date class that R can
recognize and plot efficiently.
Learning Objectives
After completing this tutorial, you will be able to:
Open a .csv file in R using read.csv() and understand why we
are using that file type.
Work with data stored in different columns within a data.frame in R.
Examine R object structures and data classes.
Convert dates, stored as a character class, into an R date
class.
Create a quick plot of a time-series dataset using qplot.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce
learned skills. If available, the code for challenge solutions is found in the
downloadable R script of the entire lesson, available in the footer of each lesson page.
Data Related to Phenology
In this tutorial, we will explore atmospheric data (including temperature,
precipitation and other metrics) collected by sensors mounted on a
flux tower
at the NEON Harvard Forest field site. We are interested in exploring
changes in temperature, precipitation, Photosynthetically Active Radiation (PAR) and day
length throughout the year -- metrics that impact changes in the timing of plant
phenophases (phenology).
About .csv Format
The data that we will use are in .csv (comma-separated values) file format. The
.csv format is a plain text format in which each value in the dataset is
separated by a comma and each "row" in the dataset is separated by a line break.
Plain text formats work well across platforms (Mac, PC, Linux,
etc.) and can be read by many different tools. The plain text
format is also less likely to become obsolete over time.
To begin, let's import the data into R. We can use base R functionality
to import a .csv file. We will use the ggplot2 package to plot our data.
# Load packages required for entire script.
# library(PackageName) # purpose of package
library(ggplot2) # efficient, pretty plotting - required for qplot function
# set working directory to ensure R can find the file we wish to import
# provide the location for where you've unzipped the lesson data
wd <- "~/Git/data/"
**Data Tip:** Good coding practice -- install and
load all libraries at top of script.
If you decide you need another package later on in the script, return to this
area and add it. That way, with a glance, you can see all packages used in a
given script.
Once our working directory is set, we can import the file using read.csv().
# Load csv file of daily meteorological data from Harvard Forest
harMet.daily <- read.csv(
file=paste0(wd,"NEON-DS-Met-Time-Series/HARV/FisherTower-Met/hf001-06-daily-m.csv"),
stringsAsFactors = FALSE
)
stringsAsFactors=FALSE
When reading in files we most often use stringsAsFactors = FALSE. This
setting ensures that non-numeric data (strings) are not converted to
factors.
What Is A Factor?
A factor is similar to a category. However, factors can be numerically interpreted
(they can have an order) and have levels associated with them.
Examples of factors:
Month Names (an ordinal variable): Month names are non-numerical but we know
that April (month 4) comes after March (month 3) and each could be represented
by a number (4 & 3).
1 and 2s to represent male and female sex (a nominal variable): Numerical
interpretation of non-numerical data but no order to the levels.
After loading the data it is easy to convert any field that should be a factor by
using as.factor(). Therefore it is often best to read in a file with
stringsAsFactors = FALSE.
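For example (a brief illustrative sketch, not taken from the lesson data):
# character vector of month names
month_chr <- c("March", "April", "March")
class(month_chr)
## [1] "character"
# convert to a factor after import
month_fac <- as.factor(month_chr)
class(month_fac)
## [1] "factor"
# factor levels default to alphabetical order
levels(month_fac)
## [1] "April" "March"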
Data.Frames in R
The read.csv() function imports our .csv file into a data.frame object in R. data.frames
are ideal for working with tabular data - they are similar to a spreadsheet.
# what type of R object is our imported data?
class(harMet.daily)
## [1] "data.frame"
Data Structure
Once the data are imported, we can explore their structure. There are several
ways to examine the structure of a data frame:
head(): shows us the first 6 rows of the data (tail() shows the last 6
rows).
str(): displays the structure of the data as R interprets it.
Let's use both to explore our data.
# view first 6 rows of the dataframe
head(harMet.daily)
## date jd airt f.airt airtmax f.airtmax airtmin f.airtmin rh
## 1 2001-02-11 42 -10.7 -6.9 -15.1 40
## 2 2001-02-12 43 -9.8 -2.4 -17.4 45
## 3 2001-02-13 44 -2.0 5.7 -7.3 70
## 4 2001-02-14 45 -0.5 1.9 -5.7 78
## 5 2001-02-15 46 -0.4 2.4 -5.7 69
## 6 2001-02-16 47 -3.0 1.3 -9.0 82
## f.rh rhmax f.rhmax rhmin f.rhmin dewp f.dewp dewpmax f.dewpmax
## 1 58 22 -22.2 -16.8
## 2 85 14 -20.7 -9.2
## 3 100 34 -7.6 -4.6
## 4 100 59 -4.1 1.9
## 5 100 37 -6.0 2.0
## 6 100 46 -5.9 -0.4
## dewpmin f.dewpmin prec f.prec slrt f.slrt part f.part netr f.netr
## 1 -25.7 0.0 14.9 NA M NA M
## 2 -27.9 0.0 14.8 NA M NA M
## 3 -10.2 0.0 14.8 NA M NA M
## 4 -10.2 6.9 2.6 NA M NA M
## 5 -12.1 0.0 10.5 NA M NA M
## 6 -10.6 2.3 6.4 NA M NA M
## bar f.bar wspd f.wspd wres f.wres wdir f.wdir wdev f.wdev gspd
## 1 1025 3.3 2.9 287 27 15.4
## 2 1033 1.7 0.9 245 55 7.2
## 3 1024 1.7 0.9 278 53 9.6
## 4 1016 2.5 1.9 197 38 11.2
## 5 1010 1.6 1.2 300 40 12.7
## 6 1016 1.1 0.5 182 56 5.8
## f.gspd s10t f.s10t s10tmax f.s10tmax s10tmin f.s10tmin
## 1 NA M NA M NA M
## 2 NA M NA M NA M
## 3 NA M NA M NA M
## 4 NA M NA M NA M
## 5 NA M NA M NA M
## 6 NA M NA M NA M
# View the structure (str) of the data
str(harMet.daily)
## 'data.frame': 5345 obs. of 46 variables:
## $ date : chr "2001-02-11" "2001-02-12" "2001-02-13" "2001-02-14" ...
## $ jd : int 42 43 44 45 46 47 48 49 50 51 ...
## $ airt : num -10.7 -9.8 -2 -0.5 -0.4 -3 -4.5 -9.9 -4.5 3.2 ...
## $ f.airt : chr "" "" "" "" ...
## $ airtmax : num -6.9 -2.4 5.7 1.9 2.4 1.3 -0.7 -3.3 0.7 8.9 ...
## $ f.airtmax: chr "" "" "" "" ...
## $ airtmin : num -15.1 -17.4 -7.3 -5.7 -5.7 -9 -12.7 -17.1 -11.7 -1.3 ...
## $ f.airtmin: chr "" "" "" "" ...
## $ rh : int 40 45 70 78 69 82 66 51 57 62 ...
## $ f.rh : chr "" "" "" "" ...
## $ rhmax : int 58 85 100 100 100 100 100 71 81 78 ...
## $ f.rhmax : chr "" "" "" "" ...
## $ rhmin : int 22 14 34 59 37 46 30 34 37 42 ...
## $ f.rhmin : chr "" "" "" "" ...
## $ dewp : num -22.2 -20.7 -7.6 -4.1 -6 -5.9 -10.8 -18.5 -12 -3.5 ...
## $ f.dewp : chr "" "" "" "" ...
## $ dewpmax : num -16.8 -9.2 -4.6 1.9 2 -0.4 -0.7 -14.4 -4 0.6 ...
## $ f.dewpmax: chr "" "" "" "" ...
## $ dewpmin : num -25.7 -27.9 -10.2 -10.2 -12.1 -10.6 -25.4 -25 -16.5 -5.7 ...
## $ f.dewpmin: chr "" "" "" "" ...
## $ prec : num 0 0 0 6.9 0 2.3 0 0 0 0 ...
## $ f.prec : chr "" "" "" "" ...
## $ slrt : num 14.9 14.8 14.8 2.6 10.5 6.4 10.3 15.5 15 7.7 ...
## $ f.slrt : chr "" "" "" "" ...
## $ part : num NA NA NA NA NA NA NA NA NA NA ...
## $ f.part : chr "M" "M" "M" "M" ...
## $ netr : num NA NA NA NA NA NA NA NA NA NA ...
## $ f.netr : chr "M" "M" "M" "M" ...
## $ bar : int 1025 1033 1024 1016 1010 1016 1008 1022 1022 1017 ...
## $ f.bar : chr "" "" "" "" ...
## $ wspd : num 3.3 1.7 1.7 2.5 1.6 1.1 3.3 2 2.5 2 ...
## $ f.wspd : chr "" "" "" "" ...
## $ wres : num 2.9 0.9 0.9 1.9 1.2 0.5 3 1.9 2.1 1.8 ...
## $ f.wres : chr "" "" "" "" ...
## $ wdir : int 287 245 278 197 300 182 281 272 217 218 ...
## $ f.wdir : chr "" "" "" "" ...
## $ wdev : int 27 55 53 38 40 56 24 24 31 27 ...
## $ f.wdev : chr "" "" "" "" ...
## $ gspd : num 15.4 7.2 9.6 11.2 12.7 5.8 16.9 10.3 11.1 10.9 ...
## $ f.gspd : chr "" "" "" "" ...
## $ s10t : num NA NA NA NA NA NA NA NA NA NA ...
## $ f.s10t : chr "M" "M" "M" "M" ...
## $ s10tmax : num NA NA NA NA NA NA NA NA NA NA ...
## $ f.s10tmax: chr "M" "M" "M" "M" ...
## $ s10tmin : num NA NA NA NA NA NA NA NA NA NA ...
## $ f.s10tmin: chr "M" "M" "M" "M" ...
**Data Tip:** You can adjust the number of rows
returned when using the `head()` and `tail()` functions. For example you can use
`head(harMet.daily, 10)` to display the first 10 rows of your data rather than 6.
Classes in R
The structure results above let us know that the attributes in our data.frame
are stored as several different data types or classes as follows:
chr - Character: holds strings composed of letters and
words. Character class data cannot be interpreted numerically - that is to say,
we cannot perform math on these values even if they contain only numbers.
int - Integer: holds whole numbers without decimals.
Mathematical operations can be performed on integers.
num - Numeric: accepts data in a wide variety of numeric formats,
including decimals (floating point values) and integers. Numeric also accepts
larger numbers than int will.
Storing variables using different classes is a strategic decision by R (and
other programming languages) that optimizes processing and storage. It allows:
data to be processed more quickly & efficiently.
the program (R) to minimize the storage size.
Differences Between Classes
Certain functions can be performed on certain data classes and not on others.
For example:
a <- "mouse"
b <- "sparrow"
class(a)
## [1] "character"
class(b)
## [1] "character"
# subtract a-b
a-b
## Error in a - b: non-numeric argument to binary operator
You cannot subtract two character values because they are not numbers.
Additionally, performing summary statistics and other calculations of different
types of classes can yield different results.
# create a new object
speciesObserved <- c("speciesb","speciesc","speciesa")
speciesObserved
## [1] "speciesb" "speciesc" "speciesa"
# determine the class
class(speciesObserved)
## [1] "character"
# calculate the minimum
min(speciesObserved)
## [1] "speciesa"
# create numeric object
prec <- c(1,2,5,3,6)
# view class
class(prec)
## [1] "numeric"
# calculate min value
min(prec)
## [1] 1
We can calculate the minimum value for speciesObserved, a character
data class; however, it does not return a quantitative minimum. It simply
returns the first element using alphabetical (rather than numeric) order.
Yet, we can calculate the quantitative minimum value for prec, a numeric
data class.
Plot Data Using qplot()
Now that we've got classes down, let's plot one of the metrics in our data,
air temperature -- airt. Given this is a time series dataset, we want to plot
air temperature as it changes over time. We have a date column, date, so
let's use that as our x-axis variable and airt as our y-axis variable.
We will use the qplot() (for quick plot) function in the ggplot2 package.
The syntax for qplot() requires the x- and y-axis variables and then the R
object that the variables are stored in.
**Data Tip:** Add a title to the plot using
`main="Title string"`.
# quickly plot air temperature
qplot(x=date, y=airt,
data=harMet.daily,
main="Daily Air Temperature\nNEON Harvard Forest Field Site")
We have successfully plotted some data. However, what is happening on the
x-axis?
R is trying to plot EVERY date value in our data, on the x-axis. This makes it
hard to read. Why? Let's have a look at the class of the x-axis variable - date.
# View data class for each column that we wish to plot
class(harMet.daily$date)
## [1] "character"
class(harMet.daily$airt)
## [1] "numeric"
In this case, the date column is stored in our data.frame as a character
class. Because it is a character, R does not know how to plot the dates as a
continuous variable. Instead it tries to plot every date value as a text string.
The airt data class is numeric so that metric plots just fine.
Date as a Date-Time Class
We need to convert our date column, which is currently stored as a character,
to a date class that can be displayed as a continuous variable. Lucky
for us, R has a Date class. We can convert the date field to a Date class
using as.Date().
# convert column to date class
harMet.daily$date <- as.Date(harMet.daily$date)
# view R class of data
class(harMet.daily$date)
## [1] "Date"
# view results
head(harMet.daily$date)
## [1] "2001-02-11" "2001-02-12" "2001-02-13" "2001-02-14" "2001-02-15"
## [6] "2001-02-16"
Now that we have adjusted the date, let's plot again. Notice that it plots
much more quickly now that R recognizes date as a date class. R can
aggregate ticks on the x-axis by year instead of trying to plot every day!
# quickly plot the data and include a title using main=""
# In title string we can use '\n' to force the string to break onto a new line
qplot(x=date,y=airt,
data=harMet.daily,
main="Daily Air Temperature w/ Date Assigned\nNEON Harvard Forest Field Site")
### Challenge: Using ggplot2's qplot function
Create a quick plot of the precipitation. Use the full time frame of data available
in the harMet.daily object.
Do precipitation and air temperature have similar annual patterns?
Create a quick plot examining the relationship between air temperature and precipitation.
Hint: you can modify the X and Y axis labels using xlab="label text" and
ylab="label text".
This tutorial defines Julian (year) day as most often used in an ecological
context, explains why Julian days are useful for analysis and plotting, and
teaches how to create a Julian day variable from a Date or Date/Time class
variable.
Learning Objectives
After completing this tutorial, you will be able to:
Define a Julian day (year day) as used in most ecological
contexts.
Convert a Date or Date/Time class variable to a Julian day
variable.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce
learned skills. If available, the code for challenge solutions is found in the
downloadable R script of the entire lesson, available in the footer of each lesson page.
Convert Between Time Formats - Julian Days
Julian day, as most often used in an ecological context, is a continuous count
of the number of days beginning at January 1 each year. Thus, each year will
have up to 365 (non-leap year) or 366 (leap year) days.
**Data Note:** This format can also be called ordinal
day or year day. In some contexts, Julian day can refer specifically to a
numeric day count since 1 January 4713 BCE or as a count from some other origin
day, instead of an annual count of 365 or 366 days.
Including a Julian day variable in your dataset can be very useful when
comparing data across years, when plotting data, and when matching your data
with other types of data that include Julian day.
Load the Data
Load this dataset that we will use to convert a date into a year day or Julian
day.
Notice the date is read in as a character and must first be converted to a Date
class.
# Load packages required for entire script
library(lubridate) #work with dates
# set working directory to ensure R can find the file we wish to import
wd <- "~/Git/data/"
# Load csv file of daily meteorological data from Harvard Forest
# stringsAsFactors=FALSE so strings (series of letters/words/numerals) remain characters
harMet_DailyNoJD <- read.csv(
file=paste0(wd,"NEON-DS-Met-Time-Series/HARV/FisherTower-Met/hf001-06-daily-m-NoJD.csv"),
stringsAsFactors = FALSE
)
# what data class is the date column?
str(harMet_DailyNoJD$date)
## chr [1:5345] "2/11/01" "2/12/01" "2/13/01" "2/14/01" "2/15/01" ...
# convert "date" from chr to a Date class and specify current date format
harMet_DailyNoJD$date<- as.Date(harMet_DailyNoJD$date, "%m/%d/%y")
Convert with yday()
To quickly convert from Date to Julian day, we can use yday(), a
function from the lubridate package.
# to learn more about it type
?yday
We want to create a new column in the existing data frame, titled julian, that
contains the Julian day data.
# convert with yday into a new column "julian"
harMet_DailyNoJD$julian <- yday(harMet_DailyNoJD$date)
# make sure it worked all the way through.
head(harMet_DailyNoJD$julian)
## [1] 42 43 44 45 46 47
tail(harMet_DailyNoJD$julian)
## [1] 268 269 270 271 272 273
**Data Tip:** In this tutorial we converted from
`Date` class to a Julian day, however, you can convert between any recognized
date/time class (POSIXct, POSIXlt, etc) and Julian day using `yday`.
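For example (a small illustrative sketch, not from the lesson data):
# yday() also works directly on a POSIXct date-time value
yday(as.POSIXct("2015-10-19 10:15", tz = "America/New_York"))
## [1] 292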
This tutorial reviews how to plot a raster in R using the plot()
function. It also covers how to layer a raster on top of a hillshade to produce
an elegant map.
Learning Objectives
After completing this tutorial, you will be able to:
Plot a single band raster in R.
Layer a raster dataset on top of a hillshade to create an elegant
basemap.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded
on your computer to complete this tutorial.
Set Working Directory: This lesson will explain how to set the working directory. You may wish to set your working directory to some other location, depending on how you prefer to organize your data.
In this tutorial, we will plot the Digital Surface Model (DSM) raster
for the NEON Harvard Forest Field Site. We will use the hist() function as a
tool to explore raster values, and we will render classified plots using the
breaks argument to get bins that are meaningful representations of our data.
We will use the terra package in this tutorial. If you do not
have the DSM_HARV variable as defined in the previous tutorial, Intro To Raster In R, please download it using neonUtilities, as shown in that tutorial.
library(terra)
# set working directory
wd <- "~/data/"
setwd(wd)
# import raster into R
dsm_harv_file <- paste0(wd, "DP3.30024.001/neon-aop-products/2022/FullSite/D01/2022_HARV_7/L3/DiscreteLidar/DSMGtif/NEON_D01_HARV_DP3_732000_4713000_DSM.tif")
DSM_HARV <- rast(dsm_harv_file)
First, let's plot our Digital Surface Model object (DSM_HARV) using the
plot() function. We add a title using the argument main="title".
# Plot raster object
plot(DSM_HARV, main="Digital Surface Model - HARV")
Plotting Data Using Breaks
We can view our data "symbolized" or colored according to ranges of values
rather than using a continuous color ramp. This is comparable to a "classified"
map. However, to assign breaks, it is useful to first explore the distribution
of the data using a histogram. The breaks argument in the hist() function
tells R to use fewer or more breaks or bins.
If we name the histogram, we can also view counts for each bin and assigned
break values.
# Plot distribution of raster values
DSMhist<-hist(DSM_HARV,
breaks=3,
main="Histogram Digital Surface Model\n NEON Harvard Forest Field Site",
col="lightblue", # changes bin color
xlab= "Elevation (m)") # label the x-axis
# Where are breaks and how many pixels in each category?
DSMhist$breaks
## [1] 300 350 400 450
DSMhist$counts
## [1] 355269 611685 33046
Looking at our histogram, R has binned out the data as follows:
300-350m, 350-400m, 400-450m
We can determine that most of the pixel values fall in the 350-400m range with a
few pixels falling in the lower and higher range. We could specify different
breaks, if we wished to have a different distribution of pixels in each bin.
We can use those bins to plot our raster data. We will use the
terrain.colors() function to create a palette of 3 colors to use in our plot.
The breaks argument allows us to add breaks. To specify where the breaks
occur, we use the following syntax: breaks=c(value1,value2,value3).
We can include as few or many breaks as we'd like.
# plot using breaks.
plot(DSM_HARV,
breaks = c(300, 350, 400, 450),
col = terrain.colors(3),
main="Digital Surface Model (DSM) - HARV")
Data Tip: Note that when we assign break values
a set of 4 values will result in 3 bins of data.
Format Plot
If we need to create multiple plots using the same color palette, we can create
an R object (myCol) for the set of colors that we want to use. We can then
quickly change the palette across all plots by simply modifying the myCol
object.
We can label the x- and y-axes of our plot too using xlab and ylab.
# Assign color to a object for repeat use/ ease of changing
myCol = terrain.colors(3)
# Add axis labels
plot(DSM_HARV,
breaks = c(300, 350, 400, 450),
col = myCol,
main="Digital Surface Model - HARV",
xlab = "UTM Easting (m)",
ylab = "UTM Northing (m)")
Or we can also turn off the axes altogether.
# or we can turn off the axis altogether
plot(DSM_HARV,
breaks = c(300, 350, 400, 450),
col = myCol,
main="Digital Surface Model - HARV",
axes=FALSE)
Challenge: Plot Using Custom Breaks
Create a plot of the Harvard Forest Digital Surface Model (DSM) that has:
Six classified ranges of values (break points) that are evenly divided among
the range of pixel values.
Axis labels
A plot title
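One possible approach (a sketch, not the official solution; it assumes the
DSM_HARV object defined above and uses seq() to create six evenly spaced bins
between the raster's minimum and maximum values):
# find the range of cell values, then create 7 evenly spaced break points (6 bins)
DSMrange <- range(values(DSM_HARV), na.rm = TRUE)
DSMbreaks <- seq(DSMrange[1], DSMrange[2], length.out = 7)
# plot with six classified ranges, axis labels, and a title
plot(DSM_HARV,
     breaks = DSMbreaks,
     col = terrain.colors(6),
     main = "Digital Surface Model - HARV, 6 Classes",
     xlab = "UTM Easting (m)",
     ylab = "UTM Northing (m)")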
Hillshade & Layering Rasters
The terra package has built-in functions called terrain for calculating
slope and aspect, and shade for computing hillshade from the slope and aspect.
A hillshade is a raster that maps the shadows and texture that you would see
from above when viewing terrain.
The alpha value determines how transparent the colors will be (0 being
transparent, 1 being opaque). You can also change the color palette, for example,
use rainbow() instead of terrain.colors().
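Below is a minimal sketch of that layering workflow (the sun angle, direction,
alpha, and color choices here are illustrative, not values from the original
lesson):
# compute slope and aspect in radians, as required by shade()
slope  <- terrain(DSM_HARV, v = "slope",  unit = "radians")
aspect <- terrain(DSM_HARV, v = "aspect", unit = "radians")
# compute the hillshade from slope and aspect
hill <- shade(slope, aspect, angle = 40, direction = 270)
# plot the hillshade in greyscale, then overlay the DSM with transparency
plot(hill, col = grey(0:100/100), legend = FALSE,
     main = "DSM with Hillshade - HARV")
plot(DSM_HARV, col = terrain.colors(25), alpha = 0.5, add = TRUE, legend = FALSE)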
For a full tutorial on rasters & raster data, please go through the
Intro to Raster Data in R tutorial
which provides foundational concepts even if you aren't working with R.
A raster is a dataset made up of cells or pixels. Each pixel represents a value
associated with a region on the earth’s surface.
The spatial resolution of a raster refers to the size of each cell
in meters. This size in turn relates to the area on the ground that the pixel
represents. Source: National Ecological Observatory Network
There are several ways that we can get from the point data collected by lidar to
the surface data that we want for different Digital Elevation Models or similar
data we use in analyses and mapping.
Basic gridding does not allow for the recovery/classification of data in any area
where data are missing. Interpolation (including Triangulated Irregular Network
(TIN) Interpolation) allows for gaps to be covered so that there are not holes
in the resulting raster surface.
Interpolation can be done in a number of different ways, some of which are
deterministic and some are probabilistic.
When converting a set of sample points to a grid, there are many
different approaches that should be considered. Source: National Ecological
Observatory Network
Gridding Points
When creating a raster, you may choose to perform a direct gridding of the data.
This means that you calculate one value for every cell in the raster where there
are sample points. This value may be a mean of all points, a max, min or some other
mathematical function. All other cells will then have no data values associated with
them. This means you may have gaps in your data if the point spacing is not well
distributed with at least one data point within the spatial coverage of each raster
cell.
When you directly grid a dataset, values will only be calculated
for cells that overlap with data points. Thus, data gaps will not be filled.
Source: National Ecological Observatory Network
We can create a raster from points through a process called gridding. Gridding is the process of taking a set of points and using them to create a surface composed of a regular grid.
Animation showing the general process of taking lidar point
clouds and converting them to a raster format. Source: Tristan Goulden,
National Ecological Observatory Network
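To make the idea concrete, here is a hedged sketch of direct gridding with the
terra package; the points below are fabricated for illustration and are not part
of the lesson downloads.
library(terra)
# fabricate lidar-like sample points with x/y coordinates and an elevation value z
pts_df  <- data.frame(x = runif(500, 0, 100),
                      y = runif(500, 0, 100),
                      z = rnorm(500, 350, 10))
pts_vec <- vect(pts_df, geom = c("x", "y"))
# template raster defining the grid; cells containing no points will remain NA
template <- rast(xmin = 0, xmax = 100, ymin = 0, ymax = 100, resolution = 5)
# one value per cell: here, the mean of all points falling inside that cell
gridded <- rasterize(pts_vec, template, field = "z", fun = mean)
plot(gridded, main = "Directly Gridded Points (mean per cell)")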
Spatial Interpolation
Spatial interpolation involves calculating the value for a query point (or
a raster cell) with an unknown value from a set of known sample point values that
are distributed across an area. There is a general assumption that points closer
to the query point are more strongly related to that cell than those farther away.
However this general assumption is applied differently across different
interpolation functions.
Interpolation methods will estimate values for cells where no known values exist.
Deterministic & Probabilistic Interpolators
There are two main types of interpolation approaches:
Deterministic: create surfaces directly from measured points using a
weighted distance or area function.
Probabilistic (Geostatistical): utilize the statistical properties of the
measured points. Probabilistic techniques quantify the spatial auto-correlation
among measured points and account for the spatial configuration of the sample
points around the prediction location.
Different methods of interpolation are better for different datasets. This table
lays out the strengths of some of the more common interpolation methods.
We will focus on deterministic methods in this tutorial.
Deterministic Interpolation Methods
Let's look at a few different deterministic interpolation methods to understand
how different methods can affect an output raster.
Inverse Distance Weighted (IDW)
Inverse distance weighted (IDW) interpolation calculates the values of a query
point (a cell with an unknown value) using a linearly weighted combination of values
from nearby points.
IDW interpolation calculates the value of an unknown cell center value (a query point) using surrounding points with the assumption that closest points
will be the most similar to the value of interest. Source: QGIS
Key Attributes of IDW Interpolation
The raster is derived based upon an assumed linear relationship between the
location of interest and the distance to surrounding sample points. In other
words, sample points closest to the cell of interest are assumed to be more related
to its value than those farther away.
Bounded/exact estimation, hence it cannot interpolate beyond the min/max range
of data point values. Estimating only within the range of existing sample point
values can yield "flattened" peaks and valleys -- especially if the data didn't
capture those high and low points.
Interpolated points are average values
Good for point data that are equally distributed and dense. Assumes a consistent
trend or relationship between points and does not accommodate trends within the
data (e.g., east to west, elevational, etc.).
IDW interpolation looks at the linear distance between the unknown value and surrounding points. Source: J. Abrecht, CUNY
Power
The power value changes the "weighting" of the IDW interpolation by specifying
how strongly points further away from the query point impact the calculated value
for that point. Power values range from 0-3+ with the default setting generally
being 2. A larger power value produces a more localized result - values farther
away from the cell have less impact on its calculated value, and values closer to
the cell impact its value more. A smaller power value produces a more averaged
result where sample points farther away from the cell have a greater impact on
the cell's calculated value.
Lower power values: more averaged result, potential for a smoother surface.
As power decreases, the influence of distant sample points is larger. This
yields a smoother surface that is more averaged.
Greater power values: more localized result, potential for more peaks and
valleys around sample point locations. As power increases, the influence of
sample points falls off more rapidly with distance. The query cell values become
more localized and less averaged.
IDW Take Home Points
IDW is good for:
Data whose distribution is strongly (and linearly) correlated with
distance. For example, noise falls off very predictably with distance.
Providing explicit control over the influence of distance (compared to Spline
or Kriging).
IDW is not so good for:
Data whose distribution depends on more complex sets of variables
because it can account only for the effects of distance.
Other features:
You can create a smoother surface by decreasing the power, increasing the
number of sample points used or increasing the search (sample points) radius.
You can create a surface that more closely represents the peaks and dips of
your sample points by decreasing the number of sample points used, decreasing
the search radius or increasing the power.
You can increase IDW surface accuracy by adding breaklines to the
interpolation process that serve as barriers. Breaklines represent abrupt
changes in elevation, such as cliffs.
Spline
Spline interpolation fits a curved surface through the sample points of your
dataset. Imagine stretching a rubber sheet across your points and gluing it to
each sample point along the way -- what you get is a smooth stretched sheet with
bumps and valleys. Unlike IDW, spline can estimate values above and below the
min and max values of your sample points. Thus it is good for estimating high
and low values not already represented in your data.
Estimating values outside of the range of sample input data.
Creating a smooth continuous surface.
Spline is not so good for:
Points that are close together and have large value differences. Slope calculations can yield over and underestimation.
Data with large, sudden changes in values that need to be represented (e.g., fault lines, extreme vertical topographic changes, etc). NOTE: some tools like ArcGIS have introduced a spline with barriers function in recent years.
Natural Neighbor Interpolation
Natural neighbor interpolation finds the closest subset of data points to the
query point of interest. It then applies weights to those points to calculate an
average estimated value based upon their proportionate areas derived from their
corresponding
Voronoi polygons
(see figure below for definition). The natural neighbor interpolator adapts
locally to the input data using points surrounding the query point of interest.
Thus there is no radius, number of points or other settings needed when using
this approach.
This interpolation method works equally well on regular and irregularly spaced data.
A Voronoi diagram is created by taking pairs of points that are close together and drawing a line that is equidistant between them and perpendicular to the line connecting them. Source: Wikipedia
Natural neighbor interpolation uses the area of each Voronoi polygon associated
with the surrounding points to derive a weight that is then used to calculate an
estimated value for the query point of interest.
To calculate the weighted area around a query point, a secondary Voronoi diagram
is created using the immediately neighboring points and mapped on top of the
original Voronoi diagram created using the known sample points (image below).
A secondary Voronoi diagram is created using the immediately neighboring points and mapped on top of the original Voronoi diagram created using the
known sample points to create a weighted natural neighbor interpolated raster.
Image Source: ESRI
Natural Neighbor Interpolation is good for:
Data whose spatial distribution is variable (and also data that are regularly distributed).
Categorical data.
Providing a smoother output raster.
Natural Neighbor Interpolation is not as good for:
Data where the interpolation needs to be spatially constrained (to a particular number of points or a set distance).
Data where sample points further away from or beyond the immediate "neighbor points" need to be considered in the estimation.
Other features:
Good as a local interpolator
Interpolated values fall within the range of values of the sample data
Surface passes through input samples; not above or below
Supports breaklines
Triangulated Irregular Network (TIN)
The Triangulated Irregular Network (TIN) is a vector based surface where sample
points (nodes) are connected by a series of edges creating a triangulated surface.
The TIN format remains the most true to the point distribution, density and
spacing of a dataset. It also may yield the largest file size!
A TIN created from LiDAR data collected by the NEON AOP over
the NEON San Joaquin (SJER) field site.
For more on the TIN process see this information from ESRI:
Overview of TINs
Interpolation in R, GRASS GIS, or QGIS
These additional resources point to tools and information for gridding in R, GRASS GIS and QGIS.
R functions
The following packages and functions may be useful when creating grids in R; a minimal IDW sketch using gstat follows the list.
gstat::idw()
stats::loess()
akima::interp()
fields::Tps()
fields::splint()
spatial::surf.ls()
geospt::rbf()
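For example, here is a minimal IDW sketch with gstat and sp. The sample points (pts), grid (grd), and elev values are made up for illustration and are not part of any NEON download.
# A hypothetical IDW example using gstat and sp
library(gstat)
library(sp)
# make up some sample points with an elevation attribute
set.seed(42)
pts <- data.frame(x = runif(50, 0, 100),
                  y = runif(50, 0, 100),
                  elev = rnorm(50, mean = 300, sd = 20))
coordinates(pts) <- ~x+y
# build a regular grid of query cells to interpolate onto
grd <- expand.grid(x = seq(0, 100, by = 2), y = seq(0, 100, by = 2))
coordinates(grd) <- ~x+y
gridded(grd) <- TRUE
# inverse distance weighted interpolation; idp is the power parameter
idw_surface <- gstat::idw(elev ~ 1, pts, newdata = grd, idp = 2)
# plot the predicted surface
spplot(idw_surface["var1.pred"], main = "IDW interpolated elevation (power = 2)")
Increasing idp makes the surface more closely honor individual sample points; decreasing it produces a smoother, more averaged surface.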
QGIS tools
Check out the documentation on different interpolation plugins
Interpolation
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Experimental Range (SJER) field site in March of 2021. The data used in this lesson are the 1 km by 1 km mosaic tile named NEON_D17_SJER_DP3_257000_4112000_reflectance.h5. If you already completed the previous lesson in this tutorial series, you do not need to download these data again. The entire SJER reflectance dataset can be accessed from the NEON Data Portal.
Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
For this tutorial, you should be comfortable reading data from a HDF5 file and have a general familiarity with hyperspectral data. If you aren't familiar with these steps already, we highly recommend you work through the Introduction to Working with Hyperspectral Data in HDF5 Format in R tutorial before moving on to this tutorial.
Everything on our planet reflects electromagnetic radiation from the Sun, and different types of land cover often have dramatically different reflectance properties across the spectrum. One of the most powerful aspects of the NEON Imaging Spectrometer (NIS, or hyperspectral sensor) is that it can accurately measure these reflectance properties at a very high spectral resolution. When you plot the reflectance values across the observed spectrum, you will see
that different land cover types (vegetation, pavement, bare soils, etc.) have distinct patterns in their reflectance values, a property that we call the
'spectral signature' of a particular land cover class.
In this tutorial, we will extract the reflectance values for all bands of a single pixel to plot a spectral signature for that pixel. In order to do this,
we need to pair the reflectance values for that pixel with the wavelength values of the bands that are represented in those measurements. We will also need to adjust the reflectance values by the scaling factor that is saved as an 'attribute' in the HDF5 file. First, let's start by defining the working
directory and reading in the example dataset.
# Call required packages
library(rhdf5)
library(plyr)
library(ggplot2)
library(neonUtilities)
wd <- "~/data/" #This will depend on your local environment
setwd(wd)
If you haven't already downloaded the hyperspectral data tile (in one of the previous tutorials in this series), you can use the neonUtilities function byTileAOP to download a single reflectance tile. You can run help(byTileAOP) to see more details on what the various inputs are. For this exercise, we'll specify the UTM Easting and Northing to be (257500, 4112500), which will download the tile with the lower left corner (257000, 4112000).
byTileAOP(dpID = 'DP3.30006.001',
site = 'SJER',
year = '2021',
easting = 257500,
northing = 4112500,
savepath = wd)
This file will be downloaded into a nested subdirectory under the ~/data folder (your working directory), inside a folder named DP3.30006.001 (the Data Product ID). The file should show up in this location: ~/data/DP3.30006.001/neon-aop-products/2021/FullSite/D17/2021_SJER_5/L3/Spectrometer/Reflectance/NEON_D17_SJER_DP3_257000_4112000_reflectance.h5.
Now we can read in the file and look at the contents using h5ls. You can move this file to a different location, but make sure to change the path accordingly.
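A minimal sketch of that step is below, assuming the tile stayed in the default download location described above; h5_file is the object used in the code that follows, so adjust the path if you moved the file.
# build the path to the downloaded reflectance tile (default byTileAOP location)
h5_file <- paste0(wd, "DP3.30006.001/neon-aop-products/2021/FullSite/D17/2021_SJER_5/L3/Spectrometer/Reflectance/NEON_D17_SJER_DP3_257000_4112000_reflectance.h5")
# list the groups and datasets contained in the HDF5 file
h5ls(h5_file)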
Next, let's read in the wavelength center associated with each band in the HDF5 file. We will later match these with the reflectance values and show both in our final spectral signature plot.
# read in the wavelength information from the HDF5 file
wavelengths <- h5read(h5_file,"/SJER/Reflectance/Metadata/Spectral_Data/Wavelength")
Extract Z-dimension data slice
Next, we will extract all reflectance values for one pixel. This makes up the spectral signature or profile of the pixel. To do that, we'll use the h5read() function. Here we pick an arbitrary pixel at (100,35), and use the NULL value to select all bands from that location.
# extract all bands from a single pixel
aPixel <- h5read(h5_file,"/SJER/Reflectance/Reflectance_Data",index=list(NULL,100,35))
# The line above generates a vector of reflectance values.
# Next, we reshape the data and turn them into a dataframe
b <- adply(aPixel,c(1))
# create clean data frame
aPixeldf <- b[2]
# add wavelength data to matrix
aPixeldf$Wavelength <- wavelengths
head(aPixeldf)
## V1 Wavelength
## 1 206 381.6035
## 2 266 386.6132
## 3 274 391.6229
## 4 297 396.6327
## 5 236 401.6424
## 6 236 406.6522
Scale Factor
Then, we can pull the spatial attributes that we'll need to adjust the reflectance values. Often, large raster data contain floating point (values with decimals) information. However, floating point data consume more space (yield a larger file size) compared to integer values. Thus, to keep the file sizes smaller, the data will be scaled by a factor of 10, 100, 10000, etc. This scale factor will be noted in the data attributes.
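A hedged sketch of that step follows. The attribute name Scale_Factor and the helper objects reflInfo and scaleFact are assumptions based on typical NEON reflectance HDF5 files; confirm them against the attributes of your own file.
# read the attributes attached to the reflectance dataset
# (the Scale_Factor attribute name is an assumption; check your file's attributes)
reflInfo <- h5readAttributes(h5_file, "/SJER/Reflectance/Reflectance_Data")
reflInfo
# divide the raw integer values by the scale factor to recover reflectance
scaleFact <- reflInfo$Scale_Factor
aPixeldf$Reflectance <- aPixeldf$V1 / as.vector(scaleFact)
head(aPixeldf)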
After completing this tutorial, you will be able to:
Filter data, alone and combined with simple pattern matching using grepl().
Use the group_by function in dplyr.
Use the summarise function in dplyr.
"Pipe" functions using dplyr syntax.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded
on your computer to complete this tutorial.
Install R Packages
neonUtilities: install.packages("neonUtilities") tools for working with
NEON data
dplyr: install.packages("dplyr") used for data manipulation
Intro to dplyr
When working with data frames in R, it is often useful to manipulate and
summarize data. The dplyr package in R offers one of the most comprehensive
group of functions to perform common manipulation tasks. In addition, the
dplyr functions are often of a simpler syntax than most other data
manipulation functions in R.
Elements of dplyr
There are several elements of dplyr that are unique to the library, and that
do very cool things!
dplyr aims to provide a function for each basic verb of data manipulation, like those below (a short sketch using a toy data frame follows this list):
filter() (and slice())
filter rows based on values in specified columns
arrange()
sort data by values in specified columns
select() (and rename())
view and work with data from only specified columns
distinct()
view and work with only unique values from specified columns
mutate() (and transmute())
add new data to the data frame
summarise()
calculate specified summary statistics on data
sample_n() and sample_frac()
return a random sample of rows
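To make these verbs concrete, here is a minimal sketch using a small made-up data frame; the object toy and its columns are purely illustrative and are not part of the NEON data used later in this tutorial.
library(dplyr)
# a tiny made-up data frame to illustrate the verbs
toy <- data.frame(species = c("A", "B", "A", "C"),
                  weight = c(10, 25, 12, 30))
# filter(): keep only rows where weight is greater than 11
filter(toy, weight > 11)
# arrange(): sort rows by weight, largest first
arrange(toy, desc(weight))
# select(): keep only the species column
select(toy, species)
# mutate(): add a new column derived from an existing one
mutate(toy, weight_doubled = weight * 2)
# summarise(): collapse the data frame to a single summary value
summarise(toy, mean_weight = mean(weight))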
Format of function calls
The single table verb functions share these features:
The first argument is a data.frame (or a dplyr special class tbl_df, known
as a 'tibble').
dplyr can work with data.frames as is, but if you're dealing with large
data it's worthwhile to convert them to a tibble, to avoid printing
a lot of data to the screen. You can do this with the function as_tibble().
Calling the class function on a tibble will return the vector
c("tbl_df", "tbl", "data.frame").
The subsequent arguments describe how to manipulate the data (e.g., based on
which columns, using which summary statistics), and you can refer to columns
directly (without using $).
The result is always a new tibble.
Function calls do not generate 'side-effects'; you always have to assign the
results to an object
Grouped operations
Certain functions (e.g., group_by, summarise, and other 'aggregate functions')
allow you to get information for groups of data, in one fell swoop. This is like
performing database operations without knowing SQL or any other database-specific code.
Powerful stuff!
Piping
We often need to get a subset of data using one function, and then use
another function to do something with that subset (and we may do this multiple
times). This leads to nesting functions, which can get messy and hard to keep
track of. Enter 'piping', dplyr's way of feeding the output of one function into
another, and so on, without the hassle of parentheses and brackets.
Let's say we want to start with the data frame my_data, apply function1(),
then function2(), and then function3(). This is what that looks like without
piping:
function3(function2(function1(my_data)))
This is messy, difficult to read, and the reverse of the order our functions
actually occur. If any of these functions needed additional arguments, the
readability would be even worse!
The piping operator %>% takes everything in front of it and "pipes" it into
the first argument of the function after. So now our example looks like this:
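my_data %>%
  function1() %>%
  function2() %>%
  function3()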
This runs identically to the original nested version!
For example, if we want to find the mean body weight of male mice, we'd do this:
myMammalData %>% # start with a data frame
filter(sex=='M') %>% # first filter for rows where sex is male
summarise (mean_weight = mean(weight)) # find the mean of the weight
# column, store as mean_weight
This is read as "from data frame myMammalData, select only males and return
the mean weight in a new column called mean_weight".
Download Small Mammal Data
Before we get started, we will need to download our data to investigate. To
do so, we will use the loadByProduct() function from the neonUtilities
package to download data straight from the NEON servers. To learn more about
this function, please see the Download and Explore NEON data tutorial here.
Let's look at the NEON small mammal capture data from Harvard Forest (within
Domain 01) for all of 2014. This site is located in the heart of the Lyme
disease epidemic.
# load packages
library(dplyr)
library(neonUtilities)
# load the NEON small mammal capture data
# NOTE: the check.size = TRUE argument means the function
# will require confirmation from you that you want to load
# the quantity of data requested
loadData <- loadByProduct(dpID="DP1.10072.001", site = "HARV",
startdate = "2014-01", enddate = "2014-12",
check.size = TRUE) # Console requires your response!
# if you'd like, check out the data
str(loadData)
The loadByProduct() function calls the NEON server, downloads the monthly
data files, and 'stacks' them together to form two tables of data called
'mam_pertrapnight' and 'mam_perplotnight'. It also saves the readme file,
explanations of variables, and validation metadata, and combines these all into
a single 'list' that we called 'loadData'. The only part of this list that we
really need for this tutorial is the 'mam_pertrapnight' table, so let's extract
just that one and call it 'myData'.
myData <- loadData$mam_pertrapnight
class(myData) # Confirm that 'myData' is a data.frame
## [1] "data.frame"
Use dplyr
For the rest of this tutorial, we are only going to be working with three
variables within 'myData':
scientificName a string of "Genus species"
sex a string with "F", "M", or "U"
identificationQualifier a string noting uncertainty in the species
identification
filter()
This function:
extracts only a subset of rows from a data frame according to specified
conditions
is similar to the base function subset(), but with simpler syntax
inputs: data object, any number of conditional statements on the named columns
of the data object
output: a data object of the same class as the input object (e.g.,
data.frame in, data.frame out) with only those rows that meet the conditions
For example, let's create a new dataframe that contains only female Peromyscus
maniculatus, one of the key small mammal players in the life cycle of the Lyme
disease-causing bacterium.
# filter on `scientificName` is Peromyscus maniculatus and `sex` is female.
# two equals signs (==) signifies "is"
data_PeroManicFemales <- filter(myData,
scientificName == 'Peromyscus maniculatus',
sex == 'F')
# Note how we were able to put multiple conditions into the filter statement,
# pretty cool!
So we have a dataframe with our female P. maniculatus, but how many are there?
# how many female P. maniculatus are in the dataset
# we could simply count the number of rows in the new dataset
nrow(data_PeroManicFemales)
## [1] 98
# or we could write it as a sentence
print(paste('In 2014, NEON technicians captured',
nrow(data_PeroManicFemales),
'female Peromyscus maniculatus at Harvard Forest.',
sep = ' '))
## [1] "In 2014, NEON technicians captured 98 female Peromyscus maniculatus at Harvard Forest."
That's awesome that we can quickly and easily count the number of female
Peromyscus maniculatus that were captured at Harvard Forest in 2014. However,
there is a slight problem. There are two Peromyscus species that are common
in Harvard Forest: Peromyscus maniculatus (deer mouse) and Peromyscus leucopus
(white-footed mouse). These species are difficult to distinguish in the field,
leading to some uncertainty in the identification, which is noted in the
'identificationQualifier' data field by the term "cf. species" or "aff. species".
(For more information on these terms and 'open nomenclature' please see this great Wiki article here).
When the field technician is certain of their identification (or if they forget
to note their uncertainty), identificationQualifier will be NA. Let's run the
same code as above, but this time specifying that we want only those observations
that are unambiguous identifications.
# filter on `scientificName` is Peromyscus maniculatus and `sex` is female.
# two equals signs (==) signifies "is"
data_PeroManicFemalesCertain <- filter(myData,
scientificName == 'Peromyscus maniculatus',
sex == 'F',
identificationQualifier =="NA")
# Count the number of un-ambiguous identifications
nrow(data_PeroManicFemalesCertain)
## [1] 0
Woah! So every single observation of a Peromyscus maniculatus had some level
of uncertainty that the individual may actually be Peromyscus leucopus. This
is understandable given the difficulty of field identification for these species.
Given this challenge, it will be best for us to consider these mice at the genus
level. We can accomplish this by searching for only the genus name in the
'scientificName' field using the grepl() function.
grepl()
This is a function in the base package (i.e., it isn't part of dplyr) that is
part of the suite of Regular Expressions functions. grepl uses regular
expressions to match patterns in character strings. Regular expressions offer
very powerful and useful tricks for data manipulation. They can be complicated
and therefore are a challenge to learn, but well worth it!
Here, we present a very simple example.
inputs: pattern to match, character vector to search for a match
output: a logical vector indicating whether the pattern was found within
each element of the input character vector
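As a quick, self-contained illustration (the character vector here is made up, not the NEON data):
# grepl() returns TRUE wherever the pattern appears in the string
grepl('Peromyscus', c('Peromyscus leucopus', 'Myodes gapperi', 'Peromyscus maniculatus'))
## [1]  TRUE FALSE  TRUE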
Let's use grepl to learn more about our possible disease vectors. In reality,
all species of Peromyscus are viable players in Lyme disease transmission, and
they are difficult to distinguish in a field setting, so we really should be
looking at all species of Peromyscus. Since we don't have genera split out as
a separate field, we have to search within the scientificName string for the
genus -- this is a simple example of pattern matching.
We can use the dplyr function filter() in combination with the base function
grepl() to accomplish this.
# combine filter & grepl to get all Peromyscus (a part of the
# scientificName string)
data_PeroFemales <- filter(myData,
grepl('Peromyscus', scientificName),
sex == 'F')
# how many female Peromyscus are in the dataset
print(paste('In 2014, NEON technicians captured',
nrow(data_PeroFemales),
'female Peromyscus spp. at Harvard Forest.',
sep = ' '))
## [1] "In 2014, NEON technicians captured 558 female Peromyscus spp. at Harvard Forest."
group_by() + summarise()
An alternative to using the filter function to subset the data (and make a new
data object in the process), is to calculate summary statistics based on some
grouping factor. We'll use group_by(), which does basically the same thing as
SQL or other tools for interacting with relational databases. For those
unfamiliar with SQL, no worries - dplyr provides lots of additional
functionality for working with databases (local and remote) that does not
require knowledge of SQL. How handy!
The group_by() function in dplyr allows you to perform functions on a subset
of a dataset without having to create multiple new objects or construct for()
loops. The combination of group_by() and summarise() are great for
generating simple summaries (counts, sums) of grouped data.
NOTE: Be conscientious about using summarise with other summary functions! You
need to think about weighting for means and variances, and summarise doesn't
work precisely for medians if there are any missing data (e.g., if there was no
value recorded, maybe it was for a good reason!).
Continuing with our small mammal data, since the diversity of the entire small
mammal community has been shown to impact disease dynamics among the key
reservoir species, we really want to know more about the demographics of the
whole community. We can quickly generate counts by species and sex in 2 lines of
code, using group_by and summarise.
# how many of each species & sex were there?
# step 1: group by species & sex
dataBySpSex <- group_by(myData, scientificName, sex)
# step 2: summarize the number of individuals of each using the new df
countsBySpSex <- summarise(dataBySpSex, n_individuals = n())
## `summarise()` regrouping output by 'scientificName' (override with `.groups` argument)
# view the data (just top 10 rows)
head(countsBySpSex, 10)
## # A tibble: 10 x 3
## # Groups: scientificName [5]
## scientificName sex n_individuals
## <chr> <chr> <int>
## 1 Blarina brevicauda F 50
## 2 Blarina brevicauda M 8
## 3 Blarina brevicauda U 22
## 4 Blarina brevicauda <NA> 19
## 5 Glaucomys volans M 1
## 6 Mammalia sp. U 1
## 7 Mammalia sp. <NA> 1
## 8 Microtus pennsylvanicus F 2
## 9 Myodes gapperi F 103
## 10 Myodes gapperi M 99
Note: the output of step 1 (dataBySpSex) does not look any different than the
original dataframe (myData), but the application of subsequent functions (e.g.,
summarise) to this new dataframe will produce distinctly different results than
if you tried the same operations on the original. Try it if you want to see the
difference! This is because the group_by() function converted dataBySpSex
into a "grouped_df" rather than just a "data.frame". In order to confirm this,
try using the class() function on both myData and dataBySpSex. You can
also read the help documentation for this function by running the code:
?group_by()
# View class of 'myData' object
class(myData)
## [1] "data.frame"
# View class of 'dataBySpSex' object
class(dataBySpSex)
## [1] "grouped_df" "tbl_df" "tbl" "data.frame"
# View help file for group_by() function
?group_by()
Pipe functions together
We created multiple new data objects during our explorations of dplyr
functions, above. While this works, we can produce the same results more
efficiently by chaining functions together and creating only one new data object
that encapsulates all of the previously sought information: filter on only
females, grepl to get only Peromyscus spp., group_by individual species, and
summarise the numbers of individuals.
# combine several functions to get a summary of the numbers of individuals of
# female Peromyscus species in our dataset.
# remember %>% are "pipes" that allow us to pass information from one function
# to the next.
dataBySpFem <- myData %>%
filter(grepl('Peromyscus', scientificName), sex == "F") %>%
group_by(scientificName) %>%
summarise(n_individuals = n())
## `summarise()` ungrouping output (override with `.groups` argument)
# view the data
dataBySpFem
## # A tibble: 3 x 2
## scientificName n_individuals
## <chr> <int>
## 1 Peromyscus leucopus 455
## 2 Peromyscus maniculatus 98
## 3 Peromyscus sp. 5
Cool!
Base R only
So that is nice, but we had to install a new package dplyr. You might ask,
"Is it really worth it to learn new commands if I can do this in base R?" While
we think "yes," why don't you see for yourself? Here is the base R code needed
to accomplish the same task.
# For reference, the same output but using R's base functions
# First, subset the data to only female Peromyscus
dataFemPero <- myData[myData$sex == 'F' &
grepl('Peromyscus', myData$scientificName), ]
# Option 1 --------------------------------
# Use aggregate and then rename columns
dataBySpFem_agg <-aggregate(dataFemPero$sex ~ dataFemPero$scientificName,
data = dataFemPero, FUN = length)
names(dataBySpFem_agg) <- c('scientificName', 'n_individuals')
# view output
dataBySpFem_agg
## scientificName n_individuals
## 1 Peromyscus leucopus 455
## 2 Peromyscus maniculatus 98
## 3 Peromyscus sp. 5
# Option 2 --------------------------------------------------------
# Do it by hand
# Get the unique scientificNames in the subset
sppInDF <- unique(dataFemPero$scientificName[!is.na(dataFemPero$scientificName)])
# Use a loop to calculate the numbers of individuals of each species
sciName <- vector(); numInd <- vector()
for (i in sppInDF) {
sciName <- c(sciName, i)
numInd <- c(numInd, length(which(dataFemPero$scientificName==i)))
}
#Create the desired output data frame
dataBySpFem_byHand <- data.frame('scientificName'=sciName,
'n_individuals'=numInd)
# view output
dataBySpFem_byHand
## scientificName n_individuals
## 1 Peromyscus leucopus 455
## 2 Peromyscus maniculatus 98
## 3 Peromyscus sp. 5
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce
learned skills. If available, the code for challenge solutions is found in the
downloadable R script of the entire lesson, available in the footer of each lesson page.
Additional Resources
Consider reviewing the documentation for the
RHDF5 package.
About HDF5
The HDF5 file can store large, heterogeneous datasets that include metadata. It
also supports efficient data slicing, or extraction of particular subsets of a
dataset, which means that you don't have to read large files into the computer's
memory/RAM in their entirety in order to work with them.
To access HDF5 files in R, we will use the rhdf5 library which is part of
the Bioconductor
suite of R libraries. It might also be useful to install
the
free HDF5 viewer
which will allow you to explore the contents of an HDF5 file using a graphic interface.
First, let's get R set up. As noted above, the rhdf5 library is part of the
Bioconductor suite of R packages; as of May 2020 this package was not yet on CRAN.
# Install rhdf5 package (only need to run if not already installed)
#install.packages("BiocManager")
#BiocManager::install("rhdf5")
# Call the R HDF5 Library
library("rhdf5")
# set working directory to ensure R can find the file we wish to import and where
# we want to save our files
wd <- "~/Git/data/" #This will depend on your local environment
setwd(wd)
Now, let's create a basic H5 file with one group and one dataset in it.
# Create hdf5 file
h5createFile("vegData.h5")
## [1] TRUE
# create a group called aNEONSite within the H5 file
h5createGroup("vegData.h5", "aNEONSite")
## [1] TRUE
# view the structure of the h5 we've created
h5ls("vegData.h5")
## group name otype dclass dim
## 0 / aNEONSite H5I_GROUP
Next, let's create some dummy data to add to our H5 file.
# create some sample, numeric data
a <- rnorm(n=40, mean=1, sd=1)
someData <- matrix(a,nrow=20,ncol=2)
Write the sample data to the H5 file.
# add some sample data to the H5 file located in the aNEONSite group
# we'll call the dataset "temperature"
h5write(someData, file = "vegData.h5", name="aNEONSite/temperature")
# let's check out the H5 structure again
h5ls("vegData.h5")
## group name otype dclass dim
## 0 / aNEONSite H5I_GROUP
## 1 /aNEONSite temperature H5I_DATASET FLOAT 20 x 2
View a "dump" of the entire HDF5 file. NOTE: use this command with CAUTION if you
are working with larger datasets!
# we can look at everything too
# but be cautious using this command!
h5dump("vegData.h5")
## $aNEONSite
## $aNEONSite$temperature
## [,1] [,2]
## [1,] 0.33155432 2.4054446
## [2,] 1.14305151 1.3329978
## [3,] -0.57253964 0.5915846
## [4,] 2.82950139 0.4669748
## [5,] 0.47549005 1.5871517
## [6,] -0.04144519 1.9470377
## [7,] 0.63300177 1.9532294
## [8,] -0.08666231 0.6942748
## [9,] -0.90739256 3.7809783
## [10,] 1.84223101 1.3364965
## [11,] 2.04727590 1.8736805
## [12,] 0.33825921 3.4941913
## [13,] 1.80738042 0.5766373
## [14,] 1.26130759 2.2307994
## [15,] 0.52882731 1.6021497
## [16,] 1.59861449 0.8514808
## [17,] 1.42037674 1.0989390
## [18,] -0.65366487 2.5783750
## [19,] 1.74865593 1.6069304
## [20,] -0.38986048 -1.9471878
# Close the file. This is good practice.
H5close()
Add Metadata (attributes)
Let's add some metadata (called attributes in HDF5 land) to our dummy temperature
data. First, open up the file.
# open the file, create a class
fid <- H5Fopen("vegData.h5")
# open up the dataset to add attributes to, as a class
did <- H5Dopen(fid, "aNEONSite/temperature")
# Provide the NAME and the ATTR (what the attribute says) for the attribute.
h5writeAttribute(did, attr="Here is a description of the data",
name="Description")
h5writeAttribute(did, attr="Meters",
name="Units")
Now we can add some attributes to the group as well.
# let's add some attributes to the group
did2 <- H5Gopen(fid, "aNEONSite/")
h5writeAttribute(did2, attr="San Joaquin Experimental Range",
name="SiteName")
h5writeAttribute(did2, attr="Southern California",
name="Location")
# close the files, groups and the dataset when you're done writing to them!
H5Dclose(did)
H5Gclose(did2)
H5Fclose(fid)
Working with an HDF5 File in R
Now that we've created our H5 file, let's use it! First, let's have a look at
the attributes of the dataset and group in the file.
# look at the attributes of the temperature dataset
h5readAttributes(file = "vegData.h5",
name = "aNEONSite/temperature")
## $Description
## [1] "Here is a description of the data"
##
## $Units
## [1] "Meters"
# look at the attributes of the aNEONsite group
h5readAttributes(file = "vegData.h5",
name = "aNEONSite")
## $Location
## [1] "Southern California"
##
## $SiteName
## [1] "San Joaquin Experimental Range"
# let's grab some data from the H5 file
testSubset <- h5read(file = "vegData.h5",
name = "aNEONSite/temperature")
testSubset2 <- h5read(file = "vegData.h5",
name = "aNEONSite/temperature",
index=list(NULL,1))
H5close()
Once we've extracted data from our H5 file, we can work with it
in R.
# create a quick plot of the data
hist(testSubset2)
Challenge: Work with HDF5 Files
Time to practice the skills you've learned. Open up the D17_2013_SJER_vegStr.csv
in R.
Create a new HDF5 file called vegStructure.
Add a group in your HDF5 file called SJER.
Add the veg structure data to that folder.
Add some attributes to the SJER group and to the data.
Now, repeat the above with the D17_2013_SOAP_vegStr csv.
Name your second group SOAP
Hint: read.csv() is a good way to read in .csv files.
After completing this tutorial, you will be able to:
Publish & share an interactive plot of the data using Plotly.
Subset data by date (if completing Additional Resources code).
Set a NoData Value to NA in R (if completing Additional Resources code).
Things You'll Need To Complete This Lesson
Please be sure you have the most current version of R and, preferably,
RStudio to write your code.
R Libraries to Install:
ggplot2: install.packages("ggplot2")
plotly: install.packages("plotly")
Data to Download
Part of this lesson is to access and download the data directly from NOAA's National
Climate Divisional Database. If instead you would prefer to download the data as a
single compressed file, it can be downloaded from the NEON Data Skills account
on FigShare.
Set Working Directory This lesson assumes that you have set your working
directory to the location of the downloaded and unzipped data subsets.
R Script & Challenge Code: NEON data lessons often contain challenges that
reinforce learned skills. If available, the code for challenge solutions is found
in the downloadable R script of the entire lesson, available in the footer of each lesson page.
Research Question
What was the pattern of drought leading up to the 2013 flooding events in Colorado?
Drought Data - the Palmer Drought Index
The Palmer Drought Severity Index
is an overall index of drought that is useful to understand drought associated
with agriculture. It uses temperature, precipitation and soil moisture data
to calculate water supply and demand. The values of the Palmer Drought Severity Index range from extreme drought
(values < -4.0) through near normal (-0.49 to 0.49) to extremely moist (> 4.0).
Access the Data
This section of the tutorial describes how to access and download the data
directly from NOAA's National Climate Divisional Database. You can also
download the pre-compiled data as a compressed file following the directions
in the Data Download section at the top of this lesson. Even if you choose to
use the pre-compiled data, you should still go through this section to learn
about the data we are using and the metadata that accompanies it.
The data used in this tutorial comes from
NOAA's Climate Divisional Database (nCLIMDIV).
This dataset contains temperature, precipitation, and drought indication data
from across the United States. Data can be accessed over national, state, or
divisional extents.
Explore the
nCLIMDIV portal
to learn more about the data they provide and how to retrieve it.
The nCLIMDIV page shows several boxes where we can enter search criteria to find
the particular datasets we need. First, though, we should figure out:
what data are available,
what format the data are available in,
what spatial and temporal extent for the data can we access,
the meaning of any abbreviations & technical terms used in the data files, and
any other information that we'd need to know before deciding which datasets to work with.
What data are available?
Below the search boxes, the nCLIMDIV page shows a table (reproduced below) with the different
indices that are included in any downloaded dataset. Here we see that the
Palmer Drought Severity Index (PDSI) is one of many indices available.
Indices
PCP: Precipitation Index
TAVG: Temperature Index
TMIN: Minimum Temperature Index
TMAX: Maximum Temperature Index
PDSI: Palmer Drought Severity Index
PHDI: Palmer Hydrological Drought Index
ZNDX: Palmer Z-Index
PMDI: Modified Palmer Drought Severity Index
CDD: Cooling Degree Days
HDD: Heating Degree Days
SPnn: Standard Precipitation Index
Sample Data
Below the table is a link to the Comma Delimited Sample where you can see
a sample of what the data look like. Using the table above we can identify
most of the headers. YearMonth is also pretty self-explanatory - it is the year
then the month digit (YYYYMM) with no space. However, what do StateCode
and Division mean? We need to know more about these abbreviations before we can use this dataset.
Metadata File
Below the table is another link to the Divisional Data Description. Click on
this link and you will be taken to a page with the metadata for the nCLIMDIV data (this file
is included in the Download Data .zip file -- divisional-readme.txt).
Skim through this metadata file.
Can you find out what the StateCode is?
What other information is important or interesting to you?
We are interested in the Palmer Drought Severity Index. What information
does the metadata give you about this Index?
Download the Data
Now that we know a bit more about the contents of the dataset, we can access and download
the particular data we need to explore the drought leading up to the 2013 flooding
events in Colorado.
Let's look at the 25-year period from 1991-2015. We will enter the following
information on the State tab to get our desired dataset:
Select Nation: US
Select State: Colorado
Start Period: January (01) 1991
End Period: December (12) 2015
Text Output: Comma Delimited
These search criteria result in a text file (.txt) that you can open in
your browser or download and open with a text editor.
Save the Data
To save this data file to your computer, right click on the link and select Save link as.
Each download from this site is given a unique code, therefore your file
will have a slightly different name from this example. To help remember where the data
are from, add the initials _CO to the end of the file name (but before the
file extension) so that it looks like this: CDODiv8506877122044_CO.txt (remember
that the code name for your file will be different).
Save the file to the ~/Documents/data/disturb-events-co13/drought directory
that you created in the set up for this tutorial.
Load the Data in R
Now that we have the data we can start working with it in R.
R Libraries
We will use ggplot2 to efficiently plot our data and plotly to create
interactive plots of our data.
We are interested in looking at the severity (or lack thereof) of drought in
Colorado before the 2013 floods. The first step is to import the data we just downloaded into R.
Our data have a header (which we saw in the sample file). This first row represents the
variable name for each column. We will use header=TRUE when we import the
data to tell R to use that first row as a list of column names rather than a row of data.
# Load the packages we will use in this lesson
library(ggplot2) # efficient plotting
library(plotly)  # interactive plots
## Set your working directory to ensure R can find the file we wish to import and where we want to save our files. Be sure to move the downloaded files into your working directory!
wd <- "~/Git/data/" # This will depend on your local environment
setwd(wd)
# Import CO state-wide nCLIMDIV data
nCLIMDIV <- read.csv(paste0(wd,"disturb-events-co13/drought/CDODiv8506877122044_CO.txt"), header = TRUE)
# view data structure
str(nCLIMDIV)
## 'data.frame': 300 obs. of 21 variables:
## $ StateCode: int 5 5 5 5 5 5 5 5 5 5 ...
## $ Division : int 0 0 0 0 0 0 0 0 0 0 ...
## $ YearMonth: int 199101 199102 199103 199104 199105 199106 199107 199108 199109 199110 ...
## $ PCP : num 0.8 0.44 1.98 1.27 1.63 1.88 2.69 2.44 1.36 1.06 ...
## $ TAVG : num 21.9 32.5 34.9 41.9 53.5 62.5 66.5 65.5 57.5 47.4 ...
## $ PDSI : num -1.37 -1.95 -1.77 -1.89 -2.11 0.11 0.6 1.03 0.95 0.67 ...
## $ PHDI : num -1.37 -1.95 -1.77 -1.89 -2.11 -1.79 -1.11 1.03 0.95 0.67 ...
## $ ZNDX : num -0.9 -2.17 -0.07 -0.92 -1.25 0.33 1.49 1.5 0.07 -0.54 ...
## $ PMDI : num -0.4 -1.48 -1.28 -1.63 -2.11 -1.57 -0.15 1.03 0.89 0.09 ...
## $ CDD : int 0 0 0 0 3 62 95 73 12 0 ...
## $ HDD : int 1296 868 900 678 343 113 30 45 227 555 ...
## $ SP01 : num -0.4 -1.78 0.89 -0.56 -0.35 0.65 0.96 0.7 -0.01 -0.26 ...
## $ SP02 : num -0.47 -1.42 -0.11 0.09 -0.67 0.15 1.01 1.07 0.42 -0.26 ...
## $ SP03 : num 0.05 -1.28 -0.36 -0.56 -0.19 -0.28 0.55 1.11 0.78 0.13 ...
## $ SP06 : num 0.42 0.15 0.03 -0.47 -0.86 -0.48 -0.03 0.51 0.24 0.35 ...
## $ SP09 : num 0.41 0.11 0.85 -0.07 -0.07 -0.19 -0.02 0 0.03 0.01 ...
## $ SP12 : num 0.69 0.41 0.43 0.08 -0.06 0.39 0.19 0.49 0.21 0.03 ...
## $ SP24 : num -0.41 -0.72 -0.49 -0.37 -0.26 -0.24 -0.06 0.12 0.05 0.11 ...
## $ TMIN : num 9.5 17.7 22.3 28.4 39.3 48.1 52 51.6 43.1 31.9 ...
## $ TMAX : num 34.3 47.4 47.5 55.3 67.7 76.9 81.1 79.5 72 62.9 ...
## $ X : logi NA NA NA NA NA NA ...
Using head() or str() allows us to view just a sampling of our data.
One of the first things we always check is whether R interpreted the data in
the format we want.
The Palmer Drought Severity Index (PDSI) is a number, so it was imported correctly!
Date Fields
Let's have a look at the date field in our data, which has the column name YearMonth.
Viewing the structure, it appears as if it is not in a standard date format.
It imported into R as an integer (int).
$ YearMonth: int 199101 199102 199103 199104 199105 ...
We want to convert these numbers into a date field. We might be able to use the
as.Date function to convert our string of numbers into a date that R will
recognize.
# convert to date, and create a new Date column
nCLIMDIV$Date <- as.Date(nCLIMDIV$YearMonth, format="%Y%m")
Oops, that doesn't work! R returned an origin error. The date class expects to
have day, month, and year data instead of just year and month. R needs a day
of the month in order to properly
convert this to a date class. The origin error is saying that it doesn't know where
to start counting the dates. (Note: If you have the zoo package installed you
will not get this error but Date will be filled with NAs.)
We can add a fake "day" to our YearMonth data using the paste0 function. Let's
add 01 to each field so R thinks that each date represents the first of the
month.
#add a day of the month to each year-month combination
nCLIMDIV$Date <- paste0(nCLIMDIV$YearMonth,"01")
#convert to date
nCLIMDIV$Date <- as.Date(nCLIMDIV$Date, format="%Y%m%d")
# check to see it works
str(nCLIMDIV$Date)
## Date[1:300], format: "1991-01-01" "1991-02-01" "1991-03-01" "1991-04-01" ...
We've now successfully converted our integer class YearMonth column into the
Date column in a date class.
Plot the Data
Next, let's plot the data using ggplot().
# plot the Palmer Drought Index (PDSI)
palmer.drought <- ggplot(data=nCLIMDIV,
aes(Date,PDSI)) + # x is Date & y is drought index
geom_bar(stat="identity",position = "identity") + # bar plot
xlab("Date") + ylab("Palmer Drought Severity Index") + # axis labels
ggtitle("Palmer Drought Severity Index - Colorado \n 1991-2015") # title on 2 lines (\n)
# view the plot
palmer.drought
Great - we've successfully created a plot!
Questions
Which values, positive or negative, correspond to years of drought?
Were the months leading up to the September 2013 floods a drought?
What overall patterns do you see in "wet" and "dry" years over these 25 years?
Is the average year in Colorado wet or dry?
Are there more wet years or dry years over this period of time?
These last two questions are a bit hard to determine from this plot. Let's look
at a quick summary of our data to help us out.
#view summary stats of the Palmer Drought Severity Index
summary(nCLIMDIV$PDSI)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.090 -1.702 0.180 -0.310 1.705 5.020
#view histogram of the data
hist(nCLIMDIV$PDSI, # the date we want to use
main="Histogram of PDSI", # title
xlab="Palmer Drought Severity Index (PDSI)", # x-axis label
col="wheat3") # the color of the bars
Now we can see that the "median" year is slightly wet (0.180) but the
"mean" year is slightly dry (-0.310), although both are within the "near-normal"
range of -0.41 to 0.41. We can also see that there is a longer tail on the dry side
of the histogram than on the wet side.
Both of these figures only give us a static view of the data. There are
packages in R that allow us to create figures that can be published
to the web and allow us to explore the data in a more interactive way.
Plotly - Interactive (and Online) Plots
Plotly
allows you to create interactive plots that can be shared online. If
you are new to Plotly, view the companion mini-lesson
Interactive Data Vizualization with R and Plotly
to learn how to set up an account and access your username and API key.
Step 1: Connect your Plotly account to R
First, we need to connect our R session to our Plotly account. If you only want
to create interactive plots and not share them on a Plotly account, you can skip
this step.
# set plotly user name
Sys.setenv("plotly_username"="YOUR_plotly_username")
# set plotly API key
Sys.setenv("plotly_api_key"="YOUR_api_key")
Step 2: Create a Plotly plot
We can create an interactive version of our plot using Plotly. We should simply be able to use our existing ggplot object palmer.drought with the
ggplotly() function to create an interactive plot.
# Use existing ggplot plot & view as plotly plot in R
palmer.drought_ggplotly <- ggplotly(palmer.drought)
palmer.drought_ggplotly
That doesn't look right. The current plotly package has a bug! This
bug has been reported and a fix may come out in future updates to the package.
Until that happens, we can build our plot again using the plot_ly() function.
In the future, you could just skip the ggplot() step and plot directly with
plot_ly().
# use plotly function to create plot
palmer.drought_plotly <- plot_ly(nCLIMDIV, # the R object dataset
type= "bar", # the type of graph desired
x=nCLIMDIV$Date, # our x data
y=nCLIMDIV$PDSI, # our y data
orientation="v") %>% # for bars to orient vertically ("h" for horizontal)
layout(title="Palmer Drought Severity Index - Colorado 1991-2015") # add the title via layout(), since plot_ly() itself does not take a title argument
palmer.drought_plotly
Questions
Check out the differences between the ggplot() and the plot_ly() plot.
From both plots, answer these questions (Hint: Hover your cursor over the bars
of the plotly plot)
What is the minimum value?
In what month did the lowest PDSI occur?
In what month, and of what magnitude, did the highest PDSI occur?
What was the drought severity value in August 2013, the month before the
flooding?
Contrast the ggplot() and plot_ly() functions.
What are the biggest differences you see between the ggplot() and plot_ly() functions?
When might you want to use ggplot()?
When might plot_ly() be better?
Step 3: Push to Plotly online
Once the plot is built with a Plotly related function (plot_ly or ggplotly) you can post the plot to your online account. Make sure you are signed in to Plotly to complete this step.
# publish plotly plot to your plot.ly online account when you are happy with it
# skip this step if you haven't connected a Plotly account
api_create(palmer.drought_plotly)
Questions
Now that we can see the online Plotly user interface, we can explore our plots a bit more.
Each plot can have comments added below the plot to serve as a caption. What would be appropriate information to add for this plot?
Who might you want to share this plot with? What tools are there to share this plot?
Challenge: Does spatial scale affect the pattern?
In the steps above we have been looking at data aggregated across the entire state of Colorado. What if we look just at the watershed that includes the Boulder area where our investigation is centered?
If you zoom in on the
Divisional Map,
you can see that Boulder, CO is in the CO-04 Platte River Drainage.
Use the divisional data and the process you've just learned to create a plot of
the PDSI for Colorado Division 04 to compare to the plot for the state of
Colorado that you've already made.
If you are using the downloaded dataset accompanying this lesson, this data can be found at "drought/CDODiv8868227122048_COdiv04.txt".
Challenge: Do different measures show the same pattern?
The nCLIMDIV dataset includes not only the Palmer Drought Severity Index
but also several other measures of precipitation, drought, and temperature.
Choose one and repeat the steps above to see if a different but related
measure shows a similar pattern. Make sure to go back to the
metadata so that you understand what the index or measurement you choose represents.
Additional Resources
No Data Values
If you choose to explore other time frames or spatial scales you may
come across data that appear as if they have a negative value -99.99.
If this were real, it would be a very severe drought!
This value is just a common placeholder for a No Data Value.
Think about what happens if the instruments failed for a little while and stopped producing meaningful data. The instruments can't record 0 for this Index because 0 means "normal" levels. Using a blank entry isn't a good idea for several
reasons: blanks cause problems for software reading a file, they can mess up table structure, and you don't know if the data were missing (someone forgot to enter them)
or if no data were available. Therefore, a placeholder value is often used. This value changes between disciplines but -9999 or -99.99 are common.
In R, we need to assign this placeholder value to NA, which is R's
representation of a null or NoData value. If we don't, R will include the -99.99 value whenever calculations are performed
or plots are created. By defining the placeholders as NA, R will correctly interpret, and not plot, the bad values.
Using the nCLIMDIV dataset for the entire US, this is how we'd assign the placeholder
value to NA and plot the data.
# NoData Value in the nCLIMDIV data from 1990-2015 US spatial scale
nCLIMDIV_US <- read.csv(paste0(wd,"disturb-events-co13/drought/CDODiv5138116888828_US.txt"), header = TRUE)
# add a day of the month to each year-month combination
nCLIMDIV_US$Date <- paste0(nCLIMDIV_US$YearMonth,"01")
# convert to date
nCLIMDIV_US$Date <- as.Date(nCLIMDIV_US$Date, format="%Y%m%d")
# check to see it works
str(nCLIMDIV_US$Date)
## Date[1:312], format: "1990-01-01" "1990-02-01" "1990-03-01" "1990-04-01" ...
# view histogram of data -- great way to check the data range
hist(nCLIMDIV_US$PDSI,
main="Histogram of PDSI values",
col="springgreen4")
# easy to see the "wrong" values near 100
# check for these values using min() - what is the minimum value?
min(nCLIMDIV_US$PDSI)
## [1] -99.99
# assign NA to the rows where the PDSI value is -99.99
# Note: you may want to do this across the entire data.frame or with other columns
# but check out the metadata -- there are different NoData Values for each column!
nCLIMDIV_US[nCLIMDIV_US$PDSI==-99.99,] <- NA # == is shorthand for "is equal to"
#view the histogram again - does the range look reasonable?
hist(nCLIMDIV_US$PDSI,
main="Histogram of PDSI value with NA value assigned",
col="springgreen4")
# that looks better!
#plot Palmer Drought Index data
ggplot(data=nCLIMDIV_US,
aes(Date,PDSI)) +
geom_bar(stat="identity",position = "identity") +
xlab("Year") + ylab("Palmer Drought Severity Index") +
ggtitle("Palmer Drought Severity Index - Colorado\n1991 thru 2015")
## Warning: Removed 2 rows containing missing values (geom_bar).
# The warning message lets you know that two "entries" will be missing from the
# graph -- these are the ones we assigned NA.
Subsetting Data
After you have downloaded the data, you might decide that you want to plot only
a subset of the data range you downloaded -- say, just the decade 2005 to 2015
instead of the full record from 1990 to 2015. With the Plotly interactive plots you can zoom in on
that section, but even so you might want a plot with only a section of the data.
You could re-download the data with a new search, but that would create extra,
possibly confusing, data files! Instead, we can easily select a subset of the data within R. Once we
have a column of data defined as a Date class in R, we can quickly
subset the data by date and create a new R object using the subset() function.
To subset, we use the subset function, and specify:
the R object that we wish to subset,
the date column and start date of the subset, and
the date column and end date of the subset.
Let's subset the data.
# subset out data between 2005 and 2015
nCLIMDIV2005.2015 <- subset(nCLIMDIV, # our R object dataset
Date >= as.Date('2005-01-01') & # start date
Date <= as.Date('2015-12-31')) # end date
# check to make sure it worked
head(nCLIMDIV2005.2015$Date) # head() shows first 6 lines
## [1] "2005-01-01" "2005-02-01" "2005-03-01" "2005-04-01" "2005-05-01"
## [6] "2005-06-01"
tail(nCLIMDIV2005.2015$Date) # tail() shows last 6 lines
## [1] "2015-07-01" "2015-08-01" "2015-09-01" "2015-10-01" "2015-11-01"
## [6] "2015-12-01"
Now we can plot this decade of data. Hint, we can copy/paste and edit the
previous code.
# use plotly function to create plot
palmer_plotly0515 <- plot_ly(nCLIMDIV2005.2015, # the R object dataset
type= "bar", # the type of graph desired
x=nCLIMDIV2005.2015$Date, # our x data
y=nCLIMDIV2005.2015$PDSI, # our y data
orientation="v") %>% # for bars to orient vertically ("h" for horizontal)
layout(title="Palmer Drought Severity Index - Colorado 2005-2015") # add the title via layout()
palmer_plotly0515
# publish plotly plot to your plot.ly online account when you are happy with it
# skip this step if you haven't connected a Plotly account
api_create(palmer_plotly0515)