Data Product Revisions and Releases
How do we support transparency and reproducibility in the science generated using NEON data? In part, by thoroughly documenting our data and by publishing static, unchangeable versions of our data each year that remain available for the lifetime of the Observatory.
NEON data are published as “data products”, each of which is a collection of measurements, organized into one or more tables or files, that were generated from the same sensor assembly or collected using the same protocol or set of protocols. When working with NEON data, it is important to understand the comparability of data within a data product. In general, all of the data within a product are comparable from the first date of collection to the last. That said, it is important to understand whether the data you work with are Provisional or Released, and whether a data product has been Revised over time.
Provisional and Released Data
Data are initially published with a Provisional status, which means that data may be updated on an as-needed basis, without guarantee of reproducibility. Until the first data Release was published in January of 2021, all NEON data were Provisional. Publishing data as Provisional allows NEON to make data available more rapidly, while retaining the ability to make corrections or additions as the need for them is identified.
After initial publication, a lag time occurs before the data are formally Released. During this lag time, extra quality control (QC) procedures, which are described in data product specific documentation, may be performed. This lag time also ensures all data from laboratory analyses are available before a data Release. Additionally, the user community will have had the opportunity to work with the data and provide quality-related feedback.
Each Release consists of a complete set of data files that will not be changed further and will remain accessible (except for AOP datasets) throughout the lifetime of the Observatory. For more information on how AOP datasets are handled, see the "AOP Release Process" section below.
Each year’s Release will include all data collected for each included data product up to a subsystem-specific lag period prior to the release date. For IS, AOP, and OS data products, the respective lag periods are 6, 12, and 12 months. Exceptions may be made for certain OS data products that need more time to obtain and publish external lab data, and for other unexpected disruptions to data availability. Users should refer to the release manifest for the specific data that are included in each release.
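The lag periods above can be turned into a rough cutoff date. The following is a minimal sketch, not NEON's actual release logic: the function names and the example release date are hypothetical, and the result is only a nominal cutoff, since the exceptions described above and the release manifest take precedence.

```python
import calendar
from datetime import date

# Lag periods in months per subsystem, as stated in the text above.
LAG_MONTHS = {"IS": 6, "AOP": 12, "OS": 12}


def subtract_months(d, months):
    """Return the date `months` calendar months before `d`.

    The day is clamped to the last valid day of the target month
    (for example, August 31 minus 6 months becomes the last day of February).
    """
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def nominal_cutoff(release_date, subsystem):
    """Nominal latest collection date eligible for a Release (hypothetical helper)."""
    return subtract_months(release_date, LAG_MONTHS[subsystem])


if __name__ == "__main__":
    # Hypothetical release date, used purely for illustration.
    example_release = date(2024, 1, 29)
    for subsystem in LAG_MONTHS:
        print(subsystem, nominal_cutoff(example_release, subsystem))
```

Treat the output as a first guess when deciding whether a given month of data is likely to be in a Release, and confirm against the release manifest.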
Digital Object Identifiers (DOI) for Released Data
Each data product within a Release is associated with a Digital Object Identifier (DOI) for reference and citation. DOI URLs will always resolve back to their corresponding data product release’s landing webpage and are thus ideal for citing NEON data in publications and applications. Data products that are bundled with another product, and are not downloadable individually, use their parent products’ DOIs. Data products that are hosted fully by another repository are not included in any release and are not assigned DOIs by NEON.
AOP Release Process
Unlike NEON's other systems (IS and OS), AOP does not preserve historical versions of data. Most AOP data products are high volume; thus it is expensive to store and make openly and freely available more than the most recent version. Therefore, NEON's annual releases for AOP data products are only available for the current release year – from the date of release to approximately 11 months later when the AOP team begins preparing for the next year’s release. For example, Release-2022 versions of AOP data products were available from January 20, 2022 until mid-December 2022.
DOIs for AOP data products for a given Release will be tombstoned prior to each subsequent Release. These tombstoned data can be thought of as “out of print”: the DOI for each data product release is still valid, but the version of the data that the DOI referred to is no longer available for download. A DOI that has been tombstoned will resolve to the data product release's webpage, which explains that the released version of the product is no longer available for download (e.g., Discrete return LiDAR point cloud (DP1.30003.001), also shown in the figure below).
Prior to each annual NEON Release, AOP scientists review the existing data and reprocess data if any issues are identified. They then begin a month-long transition period during which older files are replaced with newer ones. During this period, the current data release tag may no longer point to exactly the same files for AOP data products that are undergoing updates. Although the data portal or API may indicate availability of a data product at specific sites for specific months, some files may be unavailable for a day or two before being replaced by updated versions.
NEON will publish a Data Notification indicating when AOP is transitioning between one Release and the next (e.g., AOP Data Availability Notification – Release 2024). We suggest holding off on downloading AOP data during this interim period, or submitting a request through the Contact Us form to obtain information about the status of the data product(s) you are interested in.
Data Quality in Provisional and Released Data
Data quality review is an active and continual process for NEON data products, including both automated and manual procedures that may detect quality issues in published data. For an overview of NEON quality assurance and quality control procedures, see the Data Quality page. These processes yield corrections and quality flagging in Provisional data that are applied on a rolling basis, without notice to end users. If corrections are also identified for data already included in a Release, those data are also updated but only on an annual basis such that updates are included in the next Release. Thanks to these processes, the data in each year’s Release are the highest-quality data available at the time. The most recently published Provisional data have been subject only to the automated quality control procedures, and these data are the most likely to change in response to additional quality assessment.
Some data products are processed on a schedule that aligns with the Releases, resulting in a more distinct change between Provisional and Release, or between Releases. Stage-discharge rating curves and Continuous discharge are initially calculated and published Provisionally using the previous water year’s model, and are recalculated at the end of the water year before inclusion in the Release. Similarly, Eddy covariance data are reprocessed prior to the Release each year using the latest code version, so that data within a Release are all based on the same code.
Downloading and Using Provisional and Released Data
The default download for any given data product from the NEON Data Portal will include the most recent Release plus all Provisional data generated since the Release. Alternatively, you may select a specific Release.
The API includes a Releases endpoint, as well as information about releases in the Products, Sites, and Data endpoints.
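As a minimal sketch of querying the Releases endpoint from Python, using only the standard library: the base URL, the `/releases/<tag>` path form, and the `RELEASE-2024` tag format are assumptions here rather than details stated on this page, so check them against the API documentation. The response is returned as parsed JSON without assuming any particular fields.

```python
import json
from urllib.request import urlopen

# Assumed public base URL of the NEON API; verify against the API documentation.
API_BASE = "https://data.neonscience.org/api/v0"


def releases_url(tag=None, base=API_BASE):
    """Build the URL for the Releases endpoint.

    With no tag, the URL addresses the list of releases; with a tag
    (assumed format: "RELEASE-2024"), it addresses a single release.
    """
    url = f"{base.rstrip('/')}/releases"
    return f"{url}/{tag}" if tag else url


def fetch_json(url, timeout=30):
    """Fetch a URL and return the parsed JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = fetch_json(releases_url())
    # Print the raw structure rather than assuming specific field names.
    print(json.dumps(payload, indent=2)[:1000])
```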
A manifest file is included with all downloaded data packages. This file provides names and information about all files included in the package, including file size, checksums for verification purposes, and permanent links to each file. The manifest also specifies whether each file is provisional or associated with a release.
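The checksums in the manifest can be used to confirm that a downloaded file is intact. The sketch below assumes the checksum is an MD5 hex digest, which this page does not state; confirm the algorithm against the manifest itself and pass a different `algorithm` name if needed. The function names are hypothetical, and the expected checksum and size are supplied by the caller rather than read from any assumed manifest layout.

```python
import hashlib
import os


def file_checksum(path, algorithm="md5", chunk_size=1 << 20):
    """Compute a hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_checksum, expected_size=None, algorithm="md5"):
    """Return True if the file matches the expected checksum (and size, if given).

    `expected_checksum` and `expected_size` are the values listed for this
    file in the manifest; the MD5 default is an assumption.
    """
    if expected_size is not None and os.path.getsize(path) != expected_size:
        return False
    return file_checksum(path, algorithm) == expected_checksum.strip().lower()
```

Checking the size first is a cheap way to reject truncated downloads before hashing a large file.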
Your download will be a single zip file containing files organized in folders. Our R package, neonUtilities (v2.0 and above), provides a range of functions for working with these files, including the ability to join files across sites and months for IS and OS data.
If you have downloaded the same provisional data file on two different dates, it is possible to discover whether changes have occurred by inspecting the time stamp at the end of each file name (for instrumented and observational data products, not data products from the airborne observation platform). The time stamp corresponds to the date and time at which the file was created, and is only changed when data are republished. More information is available at Data Formats and Conventions.
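The file-name comparison described above can be sketched as follows. This assumes the time stamp sits immediately before the file extension in the form `YYYYMMDDTHHMMSSZ` (UTC); see Data Formats and Conventions for the authoritative naming rules. The function names and the file names in the usage example are illustrative only.

```python
import re
from datetime import datetime, timezone

# Assumed pattern: ".YYYYMMDDTHHMMSSZ" directly before the final extension.
_STAMP = re.compile(r"\.(\d{8}T\d{6}Z)\.[A-Za-z0-9]+$")


def publication_stamp(file_name):
    """Return the trailing time stamp as an aware UTC datetime, or None if absent."""
    match = _STAMP.search(file_name)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def was_republished(earlier_name, later_name):
    """Return True if the later download carries a newer time stamp."""
    earlier = publication_stamp(earlier_name)
    later = publication_stamp(later_name)
    if earlier is None or later is None:
        raise ValueError("no time stamp found in one of the file names")
    return later > earlier


if __name__ == "__main__":
    # Illustrative file names, not real NEON files.
    first = "NEON.D01.HARV.DP1.00001.001.example.2023-01.basic.20230201T101500Z.csv"
    second = "NEON.D01.HARV.DP1.00001.001.example.2023-01.basic.20230615T083000Z.csv"
    print(was_republished(first, second))
```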
Please plan to publish and archive any Provisional data used in a publication in an appropriate repository, as NEON will not assign a DOI until the data are included in an official Release. Please read our Publishing Research Outputs page to learn more.
For more details about downloading and using Releases and Provisional data, see the tutorial Understanding Releases and Provisional Data.
Learn about Release 2024 | Learn about Release 2023 | Learn about Release 2022 | Learn about Release 2021
Data Product Revisions
If an instrument or protocol changes significantly enough that users should be aware of potential incompatibilities, we will generate a new Revision of the data product, denoted by a change in the data product identifier. Data from different revisions of the same data product are not directly comparable and should be combined with caution in any analysis. Upon a data product revision, the REV field of the data product identifier will be incremented. The data product identifier takes the form DPL.PRNUM.REV, where DPL is the data product level, PRNUM is the product number, and REV is the product revision. Each data product revision will be findable on the Explore Data Product page, along with a short summary of the changes made between revisions.
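To make the DPL.PRNUM.REV structure concrete, the following sketch splits an identifier into its three fields and flags pairs that differ only in revision. The field widths in the pattern (one digit after "DP", a five-digit product number, a three-digit revision) are inferred from the example DP1.30003.001 and are an assumption; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class ProductId(NamedTuple):
    level: str      # DPL, e.g. "DP1"
    number: str     # PRNUM, e.g. "30003"
    revision: int   # REV, e.g. 1 for "001"


# Field widths are assumed from the example identifier DP1.30003.001.
_PATTERN = re.compile(r"^(DP\d)\.(\d{5})\.(\d{3})$")


def parse_product_id(identifier):
    """Split a DPL.PRNUM.REV identifier into its fields."""
    match = _PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a DPL.PRNUM.REV identifier: {identifier!r}")
    level, number, revision = match.groups()
    return ProductId(level, number, int(revision))


def differs_only_in_revision(id_a, id_b):
    """True when two identifiers name the same product at different revisions.

    Such data are not directly comparable and should be combined with caution.
    """
    a, b = parse_product_id(id_a), parse_product_id(id_b)
    return (a.level, a.number) == (b.level, b.number) and a.revision != b.revision


if __name__ == "__main__":
    print(parse_product_id("DP1.30003.001"))
    # "DP1.30003.002" is a made-up revision used only to exercise the check.
    print(differs_only_in_revision("DP1.30003.001", "DP1.30003.002"))
```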
Last updated January 29, 2024