Here we will provide an overview of the National Ecological Observatory
Network (NEON). Please carefully read through these materials and links that
discuss NEON’s mission and design.
Learning Objectives
At the end of this activity, you will be able to:
Explain the mission of the National Ecological Observatory Network (NEON).
Explain how sites are located within the NEON project design.
Explain the different types of data that will be collected and provided by NEON.
The NEON Project Mission & Design
To capture ecological heterogeneity across the United States, NEON’s design
divides the continent into 20 statistically different eco-climatic domains. Each
NEON field site is located within an eco-climatic domain.
The Science and Design of NEON
To gain a better understanding of the broad scope of NEON, watch this 4-minute
video.
Explore the NEON field site map. Do the following:
Zoom in on a study area of interest to see if there are any NEON field sites that are nearby.
Use the menu below the map to filter sites by name, type, domain, or state.
Select one field site of interest.
Click on the marker in the map.
Then click on Site Details to jump to the field site landing page.
Data Institute Participant -- Thought Questions:
Use the map above to answer these questions. Consider the research question that
you may explore as your Capstone Project at the Institute, or a current
project that you are working on, and answer the following questions:
Are there NEON field sites that are in study regions of interest to you?
What domains are the sites located in?
What NEON field sites do your current research or Capstone Project ideas
coincide with?
Are the sites core or relocatable?
Are they terrestrial or aquatic?
Are there data available for the NEON field site(s) that you are most
interested in? What kind of data are available?
Watch this 3:06 minute video exploring the data that NEON collects.
Read the
Data Collection Methods
page to learn more about the different types of data that NEON collects and
provides. Then, follow the links below to learn more about each collection method:
NEON also collects samples and specimens on which the other data products are based. These samples are also available for research and education purposes. Learn more:
NEON Biorepository.
Airborne Remote Sensing
Watch this 5 minute video to better understand the NEON Airborne Observation
Platform (AOP).
Data Institute Participant – Thought Questions:
Consider either your current or future research or the question you’d like to
address at the Institute.
Which types of NEON data may be more useful to address these questions?
What non-NEON data resources could be combined with NEON data to help address your question?
What challenges, if any, could you foresee when beginning to work with these data?
Data Tip: NEON also provides support for your own
research, including proposals to fly the AOP over other study sites, a mobile
tower/instrumentation setup, and others. Learn more about the
Assignable Assets programs.
Access NEON Data
NEON data are processed and go through quality assurance and quality control (QA/QC) checks at NEON headquarters in Boulder, CO.
NEON carefully documents every aspect of sampling design, data collection, processing and delivery. This documentation is freely available through the NEON data portal.
Explore NEON Data Products.
On the page for each data product in the catalog you can find the basic information
about the product, find the data collection and processing protocols, and link
directly to downloading the data.
Additionally, some types of NEON data are also available through the data portals
of other organizations. For example,
NEON Terrestrial Insect DNA Barcoding Data
is available through the
Barcode of Life Datasystem (BOLD).
Similarly, NEON phenocam images are available from the
Phenocam network site.
More details on where else the data are available can be found in the Availability and Download
section on the Product Details page for each data product (visit
Explore Data Products
to access individual Product Details pages).
Pathways to access NEON Data
There are several ways to access data from NEON:
Via the NEON data portal.
Explore and download data. Note that much of the tabular data is available in zipped
.csv files for each month and site of interest. To combine these files, use the
neonUtilities package (R tutorial, Python tutorial).
Use R or Python to programmatically access the data. NEON and community members
have created code packages to directly access the data through an API. Learn more
about the available resources by reading the Code Resources page or visiting the
NEONScience GitHub repo.
Using the NEON API. Access NEON data directly using a custom API call.
Access NEON data through partners' portals. Where NEON data directly overlap
with other community resources, NEON data can be accessed through those portals.
Examples include Phenocam, BOLD, Ameriflux, and others. You can learn more in the
documentation for individual data products.
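As a rough illustration of the API pathway listed above, the sketch below builds request URLs for the NEON API using only the Python standard library. The base URL, the `products` and `data` endpoint layout, and the example product code `DP1.10003.001` and site code `WOOD` are assumptions based on NEON's public API documentation and are not given in this tutorial, so check the current documentation before relying on them. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the NEON API; confirm against the current NEON documentation.
NEON_API_BASE = "https://data.neonscience.org/api/v0"


def product_url(product_code):
    """Build the URL that describes one data product (hypothetical helper)."""
    return f"{NEON_API_BASE}/products/{product_code}"


def data_url(product_code, site_code, year_month):
    """Build the URL that lists files for one product, site, and month (YYYY-MM)."""
    return f"{NEON_API_BASE}/data/{product_code}/{site_code}/{year_month}"


def fetch_json(url, timeout=30):
    """Request a URL and decode the JSON body. Requires network access."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Example codes are illustrative assumptions, not taken from this tutorial.
    print(product_url("DP1.10003.001"))
    print(data_url("DP1.10003.001", "WOOD", "2015-07"))
```

Passing either URL to `fetch_json` would perform the actual request. Building the URL separately from fetching it makes it easy to inspect the call before sending anything over the network.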
Data Institute Participant – Thought Questions:
Use the Data Portal tools to investigate the data availability for the field
sites you’ve already identified in the previous Thought Questions.
What types of aquatic/terrestrial data are currently available? Remote sensing data?
Of these, what type of data are you most interested in working with for your project while at the Institute?
What time period do the data cover?
What format is the downloadable file available in?
Where is the metadata to support these data?
Data Institute Participants: Intro to NEON Culmination Activity
Write up a brief summary of a project that you might want to explore while at the
Data Institute in Boulder, CO. Include the types of NEON (and other) data that you
will need to implement this project. Save this summary as you will be refining
and adding to your ideas over the next few weeks.
The goal of this activity is for you to begin to think about a Capstone Project
that you wish to work on at the end of the Data Institute. This project will ideally be
performed in groups, so over the next few weeks you'll have a chance to view the other
project proposals and merge projects to collaborate with your colleagues.
Once you have Git and Bash installed, you are ready to configure Git.
On this page you will:
Create a directory for all future GitHub repositories created on your computer
To ensure Git is properly installed and to create a working directory for GitHub,
you will need to know a bit of shell -- brief crash course below.
Crash Course on Shell
The Unix shell has been around longer than most of its users have been alive.
It has survived so long because it’s a power tool that allows people to do
complex things with just a few keystrokes. More importantly, it helps them
combine existing programs in new ways and automate repetitive tasks so they
aren’t typing the same things over and over again. Use of the shell is
fundamental to using a wide range of other powerful tools and computing
resources (including “high-performance computing” supercomputers).
Set up the directory where we will store all of the GitHub repositories
during the Institute,
Make sure Git is installed correctly, and
Gain comfort using bash so that we can use it to work with Git & GitHub.
Accessing Shell
How one accesses the shell depends on the operating system being used.
OS X: The bash program is called Terminal. You can search for it in Spotlight.
Windows: Git Bash came with your download of Git for Windows. Search Git Bash.
Linux: Default is usually bash, if not, type bash in the terminal.
Bash Commands
$
The dollar sign is a prompt, which shows us that the shell is waiting for
input; your shell may use a different character as a prompt and may add
information before the prompt.
When typing commands, either from these tutorials or from other sources, do not
type the prompt ($), only the commands that follow it.
In these tutorials, subsequent lines that follow a prompt and do not start with
$ are the output of the command.
listing contents - ls
Next, let's find out where we are by running a command called pwd -- print
working directory. At any moment, our current working directory is our
current default directory, i.e., the directory that the computer assumes we
want to run commands in unless we explicitly specify something else. Here, the
computer's response is /Users/neon, which is NEON’s home directory:
$ pwd
/Users/neon
**Data Tip:** Home Directory Variation - The home
directory path will look different on different operating systems. On Linux it
may look like `/home/neon`, and on Windows it will be similar to
`C:\Documents and Settings\neon` or `C:\Users\neon`.
(It may look slightly different for different versions of Windows.)
In the examples that follow, we use Mac output as the default; Linux and Windows
output may differ slightly but should be generally similar.
If you are not, by default, in your home directory, you get there by typing:
$ cd ~
Now let's learn the command that will let us see the contents of our own
file system. We can see what's in our home directory by running ls --listing.
$ ls
Applications Documents Library Music Public
Desktop Downloads Movies Pictures
(Again, your results may be slightly different depending on your operating
system and how you have customized your filesystem.)
ls prints the names of the files and directories in the current directory in
alphabetical order, arranged neatly into columns.
**Data Tip:** What is a directory? That is a folder! Read the section on
Directory vs. Folder
if you find the wording confusing.
Change directory -- cd
Now we want to move into our Documents directory where we will create a
directory to host our GitHub repository (to be created in Week 2). The command
to change locations is cd followed by a directory name if it is a
sub-directory in our current working directory or a file path if not.
cd stands for "change directory", which is a bit misleading: the command
doesn't change the directory, it changes the shell's idea of what directory we
are in.
To move to the Documents directory, we can use the following command
to get there:
$ cd Documents
This command will move us from our home directory into our Documents
directory. cd doesn't print anything, but if we run pwd after it, we can
see that we are now in /Users/neon/Documents.
If we run ls now, it lists the contents of /Users/neon/Documents, because
that's where we now are:
$ pwd
/Users/neon/Documents
$ ls
data/ elements/ animals.txt planets.txt sunspot.txt
Now we can create a new directory called GitHub that will contain our GitHub
repositories when we create them later.
We can use the command mkdir NAME (“make directory”):
$ mkdir GitHub
There is no output.
Since GitHub is a relative path (i.e., doesn't have a leading slash), the
new directory is created in the current working directory:
$ ls
data/ elements/ GitHub/ animals.txt planets.txt sunspot.txt
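For readers who later script these steps, the same relative-path behavior can be sketched in Python. This is an illustration only: it runs inside a throwaway temporary directory rather than your real Documents folder, and the function name is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def make_github_dir(parent):
    """Create a GitHub directory inside `parent` and return the sorted listing.

    Mirrors the shell steps `cd parent`, `mkdir GitHub`, and `ls`.
    """
    previous = Path.cwd()
    os.chdir(parent)  # like `cd`
    try:
        # A relative path is resolved against the current working directory.
        Path("GitHub").mkdir(exist_ok=True)
        # Like `ls`: names of everything in the current directory.
        return sorted(p.name for p in Path.cwd().iterdir())
    finally:
        os.chdir(previous)  # restore the original working directory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(make_github_dir(scratch))
```

Because `GitHub` has no leading slash, it is created wherever the working directory happens to be at that moment, which is why the function changes directory first.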
**Data Tip:** This material is a much abbreviated form of the
Software Carpentry Unix Shell for Novices
workshop. Want a better understanding of shell? Check out the full series!
Is Git Installed Correctly?
All of the above commands are bash commands, not Git-specific commands. We
still need to check that Git is installed correctly. One of the easiest
ways is to check which version of Git we have installed.
Git commands start with git.
We can use git --version to see which version of Git is installed:
$ git --version
git version 2.5.4 (Apple Git-61)
If you get a git version number, then Git is installed!
If you get an error, Git isn’t installed correctly. Reinstall and repeat.
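If you ever need to make the same check from a script, the version string can be parsed programmatically. The sketch below is a hypothetical helper that extracts the numeric version from output shaped like the example above. It assumes the output begins with `git version` followed by dot-separated numbers, which may not hold for every Git build.

```python
import re


def parse_git_version(output):
    """Return the version in `git --version` output as a tuple of ints.

    Returns None when the text does not look like a Git version line.
    """
    match = re.match(r"git version (\d+(?:\.\d+)*)", output.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    # The sample string is the output shown in the tutorial.
    print(parse_git_version("git version 2.5.4 (Apple Git-61)"))
```

Returning a tuple of integers makes versions directly comparable, for example `parse_git_version(text) >= (2, 0)`.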
Setup Git Global Configurations
Now that we know Git is correctly installed, we can set it up for use.
When we use Git on a new computer for the first time, we need to configure a
few things. Below are a few examples of configurations we will set as we get
started with Git:
our name and email address,
to colorize our output,
what our preferred text editor is,
and that we want to use these settings globally (i.e. for every project)
On a command line, Git commands are written as git verb, where verb is what
we actually want to do.
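Set up your own Git with the following commands, replacing the placeholder name,
email address, and editor shown here with your own information:
$ git config --global user.name "Your Name"
$ git config --global user.email "you@example.com"
$ git config --global color.ui "auto"
$ git config --global core.editor "nano -w"
The four commands above only need to be run once: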
the flag --global tells Git to use the settings for every project in your user
account on this computer.
You can check your settings at any time:
$ git config --list
You can change your configuration as many times as you want; just use the
same commands to choose another editor or update your email address.
Now that Git is set up, you will be ready to start the Week 2 materials to learn
about version control and how Git & GitHub work.
**Data Tip:**
GitHub Desktop
is a GUI (one of many) for
using GitHub that is free and available for both Mac and Windows operating
systems. NEON Data Skills workshops & Data Institutes only teach how to
use Git through the command line and do not support use of GitHub Desktop
(or any other GUI); however, you are welcome to check it out and use it if you
would like to.
Run the installer and follow the steps below (these may look slightly different depending on Git version number):
Welcome to the Git Setup Wizard: Click on "Next".
Information: Click on "Next".
Select Destination Location: Click on "Next".
Select Components: Click on "Next".
Select Start Menu Folder: Click on "Next".
Adjusting your PATH environment:
Select "Use Git from the Windows Command Prompt" and click on "Next".
If you forget to do this, programs that you need for the event will not work properly.
If this happens, rerun the installer and select the appropriate option.
Configuring the line ending conversions: Click on "Next".
Keep "Checkout Windows-style, commit Unix-style line endings" selected.
Configuring the terminal emulator to use with Git Bash:
Select "Use Windows' default console window" and click on "Next".
Configuring experimental performance tweaks: Click on "Next".
Completing the Git Setup Wizard: Click on "Finish".
This will provide you with both Git and Bash in the Git Bash program.
Install Bash for Mac OS X
The default shell in most versions of Mac OS X is bash (macOS 10.15 Catalina and
later default to zsh, but bash is still included), so there is no
need to install anything. You access bash from the Terminal
(found in
/Applications/Utilities). You may want to keep
Terminal in your dock for this workshop.
Install Bash for Linux
The default shell is usually Bash, but if your
machine is set up differently you can run it by opening a
terminal and typing bash. There is no need to
install anything.
Git Setup
Git is a version control system that lets you track who made changes to what
when and has options for easily updating a shared or public version of your code
on GitHub. You will need a
supported
web browser (current versions of Chrome, Firefox or Safari, or Internet Explorer
version 9 or above).
Git installation instructions borrowed and modified from
Software Carpentry.
Git for Windows
Git should be installed on your computer as part of your Bash install.
Install Git on Macs by downloading and running the most recent installer for
"mavericks" if you are using OS X 10.9 and higher -or- if using an
earlier OS X, choose the most recent "snow leopard" installer, from
this list.
After installing Git, there will not be anything in your
/Applications folder, as Git is a command line program.
**Data Tip:**
If you are running Mac OS X El Capitan, you might encounter errors when trying to
use Git. Make sure you update Xcode.
Read more - a Stack Overflow Issue.
Git on Linux
If Git is not already available on your machine you can try to
install it via your distro's package manager. For Debian/Ubuntu run
sudo apt-get install git and for Fedora run
sudo yum install git.
Setting Up R & RStudio
Windows R/RStudio Setup
Please visit the CRAN Website to download the latest version of R for Windows.
Download the latest version of RStudio for Windows
Double click the file to install it
Once R and RStudio are installed, click to open RStudio. If you don't get any error messages you are set. If there is an error message, you will need to re-install the program.
Once it's downloaded, double click the file to install it
Once R and RStudio are installed, click to open RStudio. If you don't get any error messages you are set. If there is an error message, you will need to re-install the program.
Linux R/RStudio Setup
R is available through most Linux package managers.
You can download the binary files for your distribution
from CRAN. Or
you can use your package manager (e.g. for Debian/Ubuntu
run sudo apt-get install r-base and for Fedora run
sudo yum install R).
Under Installers select the version for your distribution.
Once it's downloaded, double click the file to install it
Once R and RStudio are installed, click to open RStudio. If you don't get any error messages you are set. If there is an error message, you will need to re-install the program.
Once R and RStudio are installed (in
Install Git, Bash Shell, R & RStudio
), open RStudio to make sure it works and you don’t get any error messages. Then,
install the needed R packages.
Install/Update R Packages
Please make sure all of these packages are installed and up to date on your
computer prior to the Institute.
The rhdf5 package is not on CRAN and must be downloaded directly from
Bioconductor. This can be done using these two commands directly in your R
console.
From the section titled HDF-Java 2.1x Pre-Built Binary Distributions
select the HDFView download option that matches the operating system and
computer setup (32 bit vs 64 bit) that you have. The download will start
automatically.
Open the downloaded file.
Mac - You may want to add the HDFView application to your Applications
directory.
Windows - Unzip the file, open the folder, run the .exe file, and follow
directions to complete installation.
Open HDFView to ensure that the program installed correctly.
**Data Tip:**
The HDFView application requires Java to be up to date. If you are having issues
opening HDFView, try to update Java first!
Install QGIS
QGIS is a free, open-source GIS program. Installation is optional for the 2018
Data Institute. We will not directly be working with QGIS, however, some past
participants have found it useful to have during the capstone projects.
To install QGIS:
Download the QGIS installer on the
QGIS download page here. Follow the installation directions below for your
operating system.
Windows
Select the appropriate QGIS Standalone Installer Version for your computer.
The download will automatically start.
Open the .exe file and follow prompts to install (installation may take a
while).
Open QGIS to ensure that it is properly downloaded and installed.
Select the current version of QGIS. The file download (.dmg format) should
start automatically.
Once downloaded, run the .dmg file. When you run the .dmg, it will create a
directory of installer packages that you need to run in a particular order.
IMPORTANT: read the READ ME BEFORE INSTALLING.rtf file!
Install the packages in the directory in the order indicated.
GDAL Complete.pkg
NumPy.pkg
matplotlib.pkg
QGIS.pkg - NOTE: you need to install GDAL, NumPy and matplotlib in order to
successfully install QGIS on your Mac!
**Data Tip:** If your computer doesn't allow you to
open these packages because they are from an unknown developer, right click on
the package and select Open With >Installer (default). You will then be asked
if you want to open the package. Select Open, and the installer will open.
Once all of the packages are installed, open QGIS to ensure that it is properly
installed.
LINUX
Select the appropriate download for your computer system.
Note: if you have previous versions of QGIS installed on your system, you may
run into problems. Check out
Verifiability and reproducibility are among the cornerstones of the scientific
process. They are what allows scientists to "stand on the shoulder of giants".
Maintaining reproducibility requires that all data management, analysis, and
visualization steps behind the results presented in a paper are documented and
available in full detail. Reproducibility here means that someone else should
either be able to obtain the same results given all the documented inputs and
the published instructions for processing them, or if not, the reasons why
should be apparent.
From Reproducible Science Curriculum
Learning Objectives
At the end of this activity, you will be able to:
Summarize the four facets of reproducibility.
Describe several ways that reproducible workflows can improve your workflow and research.
Explain several ways you can incorporate reproducible science techniques into
your own research.
Getting Started with Reproducible Science
Please view the online slide-show below which summarizes concepts taught in the
Reproducible Science Curriculum.
Reproducibility spectrum for published research.
Source: Peng, R.D., "Reproducible Research in Computational Science," Science (2011): 1226–1227, via Reproducible Science Curriculum
The Nature Publishing group has also created a
Reporting Checklist
for its authors that focuses primarily on reporting issues but also includes
sections for sharing code.
Recent open-access issue of
Ecography
focusing on reproducible ecology and software packages available for use.
A nice short blog post with an annotated bibliography of "Top 10 papers discussing reproducible research in computational science" from Lorena Barba:
Barba group reproducibility syllabus.
After completing this tutorial, you will be able to:
Define hyperspectral remote sensing.
Explain the fundamental principles of hyperspectral remote sensing data.
Describe the key attributes that are required to effectively work with
hyperspectral remote sensing data in tools like R or Python.
Describe what a "band" is.
Mapping the Invisible
About Hyperspectral Remote Sensing Data
The electromagnetic spectrum is composed of thousands of bands representing
different types of light energy. Imaging spectrometers (instruments that collect
hyperspectral data) break the electromagnetic spectrum into groups of bands
that support classification of objects by their spectral properties on the
Earth's surface. Hyperspectral data consist of many bands -- up to hundreds of
bands -- that cover the electromagnetic spectrum.
The NEON imaging spectrometer collects data within the 380nm to 2510nm portions
of the electromagnetic spectrum within bands that are approximately 5nm in
width. This results in a hyperspectral data cube that contains approximately
426 bands - which means big, big data.
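The band count follows from simple arithmetic on the figures above. The sketch below assumes uniformly spaced 5 nm bands across the full 380 nm to 2510 nm range, which only approximates the real sensor; the function name is hypothetical.

```python
def approximate_band_count(start_nm, end_nm, band_width_nm):
    """Estimate how many bands of a fixed width fit in a spectral range."""
    return round((end_nm - start_nm) / band_width_nm)


if __name__ == "__main__":
    # (2510 - 380) / 5 = 426 bands under the uniform-width assumption.
    print(approximate_band_count(380, 2510, 5))
```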
Key Metadata for Hyperspectral Data
Bands and Wavelengths
A band represents a group of wavelengths. For example, the wavelength values
between 695nm and 700nm might be one band as captured by an imaging spectrometer.
The imaging spectrometer collects reflected light energy in a pixel for light
in that band. Often when you work with a multi or hyperspectral dataset, the
band information is reported as the center wavelength value. This value
represents the center point value of the wavelengths represented in that band.
Thus, in a band spanning 695-700 nm, the center would be 697.5 nm.
Imaging spectrometers collect reflected light information within
defined bands or regions of the electromagnetic spectrum. Source: National
Ecological Observatory Network (NEON)
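The center wavelength of a band is the midpoint of its lower and upper edges. The minimal sketch below reproduces the 695-700 nm example and then applies the same calculation to a few contiguous 5 nm bands. The band edges beyond 700 nm are made up for illustration, and the function name is hypothetical.

```python
def band_center(lower_nm, upper_nm):
    """Return the center wavelength of a band given its lower and upper edges."""
    return (lower_nm + upper_nm) / 2


if __name__ == "__main__":
    # The band used as the example in the text.
    print(band_center(695, 700))

    # Centers of three contiguous 5 nm bands starting at 695 nm (illustrative edges).
    centers = [band_center(lower, lower + 5) for lower in range(695, 710, 5)]
    print(centers)
```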
Spectral Resolution
The spectral resolution of a dataset that has more than one band refers to the
width of each band in the dataset. In the example above, a band was defined as
spanning 695-700nm. The width or spectral resolution of the band is thus 5
nanometers. To see an example of this, check out the band widths for the
Landsat sensors.
Full Width Half Max (FWHM)
The full width half max (FWHM) will also often be reported in a multi or
hyperspectral dataset. This value represents the spread of the band around that
center point.
The Full Width Half Max (FWHM) of a band is the width of the band, in
nanometers, measured where the sensor's response falls to half of its maximum. In this
case, the FWHM for Band C is 5 nm.
In the illustration above, the band that covers 695-700nm has a FWHM of 5 nm.
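To make the term concrete, suppose a band's spectral response is modeled as a Gaussian curve. This is an assumption for illustration only, since real sensor response functions vary. Under that model the FWHM is the width of the curve at half its peak, and it relates to the standard deviation by FWHM = 2 * sqrt(2 * ln 2) * sigma. The function names below are hypothetical.

```python
import math

# Ratio between FWHM and standard deviation for a Gaussian (about 2.355).
FWHM_PER_SIGMA = 2 * math.sqrt(2 * math.log(2))


def sigma_from_fwhm(fwhm_nm):
    """Convert a Gaussian band's FWHM to its standard deviation."""
    return fwhm_nm / FWHM_PER_SIGMA


def gaussian_response(wavelength_nm, center_nm, fwhm_nm):
    """Relative response (peak = 1) of a Gaussian band at a given wavelength."""
    sigma = sigma_from_fwhm(fwhm_nm)
    return math.exp(-((wavelength_nm - center_nm) ** 2) / (2 * sigma ** 2))


if __name__ == "__main__":
    # Band centered at 697.5 nm with a 5 nm FWHM: the response at the band
    # edges (695 nm and 700 nm) is half of the peak response.
    for wavelength in (695.0, 697.5, 700.0):
        print(wavelength, round(gaussian_response(wavelength, 697.5, 5.0), 3))
```

Under this model the band edges at 695 nm and 700 nm are exactly the points where the response drops to half of its peak, which is what a 5 nm FWHM describes.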
While a general spectral resolution of the sensor is often provided, not all
sensors create bands of uniform widths. For instance, bands 1-9 of Landsat 8 are
listed below (courtesy of USGS).