- Data note
- Open access
XPolaris: an R-package to retrieve United States soil data at 30-meter resolution
BMC Research Notes volume 14, Article number: 327 (2021)
Abstract
Objectives
This data article introduces the “XPolaris” R-package, designed to facilitate access to detailed soil data at any geographical location within the contiguous United States (CONUS). Without requiring advanced R-programming skills, XPolaris enables users to convert raster data from the POLARIS database into a traditional spreadsheet format [i.e., Comma-Separated Values (CSV)] for further data analyses.
Data description
The core of this publication is a code tutorial that assists users in retrieving soil raster data within the CONUS. All data is sourced from the POLARIS database, a 30-m probabilistic map of soil series and different soil properties [Chaney et al. Geoderma 274:54, 2016; Chaney et al. Water Resour Res 55:2916, 2019]. POLARIS is an optimization of the Soil Survey Geographic (SSURGO) database that addresses issues of spatial disaggregation and harmonization and fills spatial gaps. POLARIS was constructed using a machine learning algorithm, the Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees (DSMART-HPC) [Odgers et al. Geoderma 214:91, 2014]. Although the data is easily accessible in a raster format, retrieving large amounts of data can be time-consuming or require advanced programming skills.
Objective
The objective of this dataset [1] is to introduce the R-package “XPolaris”, a collection of functions for retrieving soil data from the POLARIS database [2, 3]. Although POLARIS raster images are easily accessible and a client API (Application Programming Interface) has recently been released [4], programming skills are still necessary to retrieve large amounts of data. The core functionalities of XPolaris therefore facilitate access to soil data regardless of the number of geographical locations. Because each raster image holds a large volume of data, efficient code is needed to meet the user's request with the minimum amount of downloading. Examples of research publications that take advantage of soil information from the POLARIS database are presented below:
In [5], gridded soil data (soil organic matter, clay, silt, and sand at 0–15 cm) was obtained for 679 site-years across North America. The research aimed to predict corn yield using a machine learning algorithm (conditional random forests). About 50% of corn yield variability was explained by crop management and soil variables, with previous crop and soil organic matter as the most relevant features.
In [6], soil water variables (ksat, θsaturated, θresidual, and van Genuchten–Mualem parameters) from 95 US locations were used in the SWAP model to simulate the reduction in crop evapotranspiration (drought stress). The project aimed to predict soybean biological nitrogen fixation using linear model regularization (elastic net). This method identified the soil and weather variables most strongly associated with nitrogen fixation (40% of evaluated features).
Data description
Data files are deposited in the Harvard Dataverse repository “Retrieving POLARIS data using R-software” [1]. The RMarkdown file (*.rmd) (Data file 1 in Table 1) was generated using R version 4.0.3 (macOS, 64-bit) and RStudio v1.4.1103. It presents XPolaris and its core functionalities. There is no limit on the amount of data a user can retrieve. However, image download speed depends on the internet connection, and large objects can exceed the memory limit of the R environment and/or the machine. The code chunks must be executed in the order in which they appear in the RMarkdown file. Users can replace the location data with their own.
In the tutorial portable document file (*.pdf) (Data file 2 in Table 1) users are introduced to the input format (Sect. “Introduction” of the tutorial) and the three functions related to: (1) checking images from which location data must be retrieved (Sect. “Location areas”); (2) downloading raster images covering requested soil variables and depths (Sect. “Downloading images”); and (3) extracting the soil data from the images to generate a CSV output for further analyses (Sect. “Extracting soil data”). Details on the function arguments are included in another portable document file (*.pdf) (Data file 3 in Table 1).
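The first step, checking which images cover the requested locations, amounts to mapping each coordinate to a raster tile and grouping locations that share a tile so that each image is fetched only once. The following is a minimal sketch of that logic in Python, not the package's own R functions. It assumes (this is not stated in the text) that the rasters are split into 1° × 1° tiles labelled by their integer latitude and longitude bounds; the function names, the label pattern, and the coordinates are hypothetical.

```python
import math


def tile_name(lat, lon):
    """Label of the assumed 1-degree tile containing (lat, lon).

    The 'lat<min><max>_lon<min><max>' pattern is an assumption made
    for illustration, not a documented naming scheme.
    """
    lat_min = math.floor(lat)
    lon_min = math.floor(lon)
    return f"lat{lat_min}{lat_min + 1}_lon{lon_min}{lon_min + 1}"


def group_by_tile(locations):
    """Map each tile label to the IDs of the locations it contains."""
    tiles = {}
    for site_id, lat, lon in locations:
        tiles.setdefault(tile_name(lat, lon), []).append(site_id)
    return tiles


# Hypothetical locations (ID, latitude, longitude) in decimal degrees.
locations = [
    ("site_1", 39.19, -96.58),
    ("site_2", 39.21, -96.60),
    ("site_3", 38.05, -97.93),
]

tiles = group_by_tile(locations)
for name, sites in sorted(tiles.items()):
    print(name, sites)
```

Under these assumptions the first two sites share a tile, so three locations would require only two image downloads per variable, statistic, and depth.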
The POLARIS database provides 13 soil variables (Data file 2 in Table 1) related to physical and chemical properties (e.g., soil organic matter, pH, clay, silt, sand, bulk density, and ksat) at six different depth layers (0–5, 5–15, 15–30, 30–60, 60–100, and 100–200 cm) and a 30-m spatial resolution. Because the database was constructed from a probabilistic model [7], values are summarized by their mean, mode, median (p50), and 5th (p5) and 95th (p95) percentiles. All POLARIS raster files use a geographic coordinate system (GCS) and the WGS84 datum.
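To give a sense of how quickly a full request grows, the combinations described above can be enumerated. The sketch below (Python, for illustration only) uses the six depth layers and five summary statistics listed in the text; the short labels, the helper name, and the use of "clay" as the example variable are choices made here, not identifiers from the database or the package.

```python
from itertools import product

# Depth layers (top, bottom) in cm and summary statistics, as listed in the text.
DEPTHS_CM = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]
STATISTICS = ["mean", "mode", "p50", "p5", "p95"]
N_VARIABLES = 13  # number of soil variables reported for POLARIS


def layers_for(variable):
    """All (variable, statistic, depth label) combinations for one variable."""
    return [
        (variable, stat, f"{top}-{bottom} cm")
        for stat, (top, bottom) in product(STATISTICS, DEPTHS_CM)
    ]


clay_layers = layers_for("clay")
layers_per_variable = len(clay_layers)            # 5 statistics x 6 depths
total_layers = N_VARIABLES * layers_per_variable  # all variables combined

print(layers_per_variable, total_layers)
```

Each variable therefore comes in 30 statistic-by-depth combinations, and all 13 variables together in 390, which is why restricting a request to the variables, statistics, and depths actually needed keeps downloads manageable.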
The CSV file (Dataset 1 in Table 1) is an example of location input. It contains three geographical coordinates in Kansas for which soil data will be retrieved and on which the R functions will be tested. The example data also comes with the XPolaris package [8]. XPolaris simplifies code implementation by sparing users from writing extensive functions. In addition, the package was tested across different operating systems and is released on CRAN [9].
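Before any download, it can help to confirm that a location file parses cleanly and that every point lies roughly within the CONUS. The sketch below is written in Python for illustration and is not part of XPolaris. The column names (ID, lat, long), the coordinates, and the coarse bounding box are all assumptions made here; they are not taken from the deposited dataset.

```python
import csv
import io

# Hypothetical location file: an identifier plus decimal-degree coordinates.
EXAMPLE_CSV = """ID,lat,long
site_1,39.19,-96.58
site_2,38.05,-97.93
site_3,37.70,-100.90
"""

# Coarse, approximate bounding box for the CONUS (an assumption, for screening only).
LAT_RANGE = (24.0, 50.0)
LON_RANGE = (-125.0, -66.0)


def read_locations(handle):
    """Parse location rows and flag whether each falls inside the coarse box."""
    rows = []
    for record in csv.DictReader(handle):
        lat = float(record["lat"])
        lon = float(record["long"])
        inside = (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
                  and LON_RANGE[0] <= lon <= LON_RANGE[1])
        rows.append({"ID": record["ID"], "lat": lat, "long": lon,
                     "in_conus_box": inside})
    return rows


locations = read_locations(io.StringIO(EXAMPLE_CSV))
outside = [row["ID"] for row in locations if not row["in_conus_box"]]
print(len(locations), "locations read;", len(outside), "outside the box")
```

A check like this catches swapped latitude and longitude columns or a missing negative sign on western longitudes, both of which would otherwise point to areas with no POLARIS coverage.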
Limitations
- The local machine must have available disk space to store the raster images.
- Visualization functions are not included for the retrieved soil data.
- Currently, soil data cannot be summarized within spatial polygons.
- Soil data output is not directly compatible with crop simulation models (e.g., APSIM, DSSAT).
Availability of data and materials
The data described in this Data note can be freely and openly accessed on Harvard Dataverse: https://doi.org/10.7910/DVN/DCZ0N3 [1]. Please see Table 1 for details about the data.
Abbreviations
- SWAP: Soil Water Atmosphere and Plant
- ksat: Saturated hydraulic conductivity
- θsaturated: Saturated soil water content
- θresidual: Residual water content
- DSMART: Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees
- SSURGO: Soil Survey Geographic database
- CSV: Comma-Separated Values
- DSSAT: Decision Support System for Agrotechnology Transfer
References
Moro Rosso LH, de Borja Reis AF, Correndo AA, Ciampitti IA. Retrieving POLARIS data using R-software. Harvard Dataverse, V2. 2021. https://doi.org/10.7910/DVN/DCZ0N3.
Chaney NW, Wood EF, McBratney AB, Hempel JW, Nauman TW, Brungard CW, Odgers NP. POLARIS: a 30-meter probabilistic soil series map of the contiguous United States. Geoderma. 2016;274:54–67. https://doi.org/10.1016/j.geoderma.2016.03.025.
Chaney NW, Minasny B, Herman JD, Nauman TW, Brungard CW, Morgan CLS, McBratney AB, Wood EF, Yimam Y. POLARIS soil properties: 30-m probabilistic maps of soil properties over the contiguous United States. Water Resour Res. 2019;55:2916–38. https://doi.org/10.1029/2018WR022797.
Correndo AA, Rotundo JL, Tremblay N, Archontoulis S, Coulter JA, Ruiz-Diaz D, Franzen D, Franzluebbers A, Nafziger E, Schwalbert R, Steinke K, Williams J, Messina CD, Ciampitti IA. Assessing the uncertainty of maize yield without nitrogen fertilization. Field Crops Res. 2021;260:107985. https://doi.org/10.1016/j.fcr.2020.107985.
de Borja Reis AF, Moro Rosso LH, Purcell LC, Naeve S, Casteel SN, Kovács P, Archontoulis S, Davidson D, Ciampitti IA. Environmental factors associated with nitrogen fixation prediction in soybean. Front Plant Sci. 2021;12. https://doi.org/10.3389/fpls.2021.675410.
Odgers NP, Sun W, McBratney AB, Minasny B, Clifford D. Disaggregating and harmonising soil map units through resampled classification trees. Geoderma. 2014;214–215:91–100. https://doi.org/10.1016/j.geoderma.2013.09.024.
Moro Rosso LH, de Borja Reis AF, Correndo AA, Ciampitti IA. XPolaris: Retrieving Soil Data from POLARIS. 2021. https://cran.r-project.org/web/packages/XPolaris/index.html.
R Core Team. R: A Language and Environment for Statistical Computing. 2021. https://www.r-project.org/.
Acknowledgements
The authors thank the Kansas Corn Commission and Kansas State University for the financial support of LMR’s M.S. program and Dr. Ciampitti’s research program. Contribution no. 22-024-J from the Kansas Agricultural Experiment Station.
Funding
Kansas State University and Kansas State Research and Extension.
Author information
Contributions
LHMR drafted the programming code, prepared the visualization, and wrote the data-note draft. AFBR revised the programming code and the data-note. AAC revised the programming code, the visualization, and the data-note. IAC revised the programming code and the data-note, and supervised the project. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
Authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Moro Rosso, L.H., de Borja Reis, A.F., Correndo, A.A. et al. XPolaris: an R-package to retrieve United States soil data at 30-meter resolution. BMC Res Notes 14, 327 (2021). https://doi.org/10.1186/s13104-021-05729-y
DOI: https://doi.org/10.1186/s13104-021-05729-y