XPolaris: an R-package to retrieve United States soil data at 30-meter resolution

Abstract

Objectives

This data article introduces the “XPolaris” R-package, designed to facilitate access to detailed soil data at any geographical location within the contiguous United States (CONUS). Without requiring advanced R programming skills, XPolaris enables users to convert raster data from the POLARIS database into a traditional spreadsheet format [i.e., Comma-Separated Values (CSV)] for further data analyses.

Data description

The core of this publication is a code tutorial designed to assist users in retrieving soil raster data within the CONUS. All data is sourced from the POLARIS database, a 30-m probabilistic map of soil series and soil properties [Chaney et al. Geoderma 274:54, 2016; Chaney et al. Water Resour Res 55:2916, 2019]. POLARIS is an optimization of the Soil Survey Geographic (SSURGO) database that addresses issues of spatial disaggregation and harmonization and fills spatial gaps. POLARIS was constructed using a machine learning algorithm, Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees (DSMART-HPC) [Odgers et al. Geoderma 214:91, 2014]. Although the data is easily accessible in raster format, retrieving large amounts of data can be time-consuming or require advanced programming skills.

Objective

The objective of this dataset [1] is to introduce the R-package “XPolaris”, a collection of functions for retrieving soil data from the POLARIS database [2, 3]. Although POLARIS raster images are easily accessible and a client API (Application Programming Interface) has recently been released [4], programming skills are necessary to retrieve large amounts of data. The core functionalities of XPolaris therefore facilitate access to soil data regardless of the number of geographical locations. Because each raster image holds a large volume of data, efficient code is needed to meet the user's needs with a minimum of downloads. Examples of research publications that take advantage of soil information from the POLARIS database are presented below:

In [5], gridded soil data (soil organic matter, clay, silt, and sand at 0–15 cm) was obtained for 679 site-years across North America. The research aimed to predict corn yield using a machine learning algorithm (conditional random forests). About 50% of corn yield variability was explained by crop management and soil variables, with previous crop and soil organic matter as the most relevant features.

In [6], soil water variables (ksat, θsaturated, θresidual, and van Genuchten–Mualem parameters) from 95 US locations were used in the SWAP model for simulating crop evapotranspiration reduction (drought stress). The project aimed to predict soybean biological nitrogen fixation using linear model regularization (elastic net). This method identified the soil and weather variables most strongly associated with nitrogen fixation (40% of evaluated features).

Data description

Data files are deposited in the Harvard Dataverse repository “Retrieving POLARIS data using R-software” [1]. The RMarkdown file (*.rmd) (Data file 1 in Table 1) was generated using R version 4.0.3 (macOS, 64-bit) and RStudio v1.4.1103. It presents XPolaris and its core functionalities. There is no limit on the amount of data a user can retrieve. However, image download depends on the internet connection, and large objects can exceed the memory limit of the R environment and/or the machine. The code chunks must be executed in the order in which they are presented in the RMarkdown file. Users can replace the location data with their own.
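As a rough illustration of the memory caveat above, the Python sketch below estimates the uncompressed in-memory size of raster images. The tile size of 3600 × 3600 pixels (a 1° × 1° tile at 1 arc-second, roughly 30 m) and the 4-byte pixel are assumptions made for illustration, not figures taken from the tutorial, and the function name is hypothetical.

```python
def raster_megabytes(rows, cols, bytes_per_pixel=4, n_rasters=1):
    """Uncompressed size, in megabytes (10**6 bytes), of single-band rasters."""
    return rows * cols * bytes_per_pixel * n_rasters / 1e6

# Assumed tile: 1 degree x 1 degree at 1 arc-second -> 3600 x 3600 pixels.
one_tile = raster_megabytes(3600, 3600)

# The same tile for 3 variables x 6 depth layers, one statistic each.
eighteen_tiles = raster_megabytes(3600, 3600, n_rasters=3 * 6)

print(f"{one_tile:.1f} MB per raster, {eighteen_tiles:.1f} MB for 18 rasters")
```

Under these assumptions a single raster is modest in size, but the total grows linearly with the number of variables, depth layers, statistics, and tiles requested.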

Table 1 Overview of data files/data sets

In the tutorial PDF file (Data file 2 in Table 1), users are introduced to the input format (Sect. “Introduction” of the tutorial) and to three functions for: (1) checking the images from which location data must be retrieved (Sect. “Location areas”); (2) downloading raster images covering the requested soil variables and depths (Sect. “Downloading images”); and (3) extracting the soil data from the images to generate a CSV output for further analyses (Sect. “Extracting soil data”). Details on the function arguments are included in a separate PDF file (Data file 3 in Table 1).
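The first of the three steps above, working out which images cover the requested locations, can be sketched in Python as follows. This is not the XPolaris implementation: it assumes, for illustration only, that images are organized as 1° × 1° tiles identified by their south-west corner, and the function names and coordinates are hypothetical. Grouping locations by tile means each image needs to be downloaded only once.

```python
import math

def tile_for(lat, lon):
    """Return the (south, west) integer corner of the 1-degree tile containing a point."""
    return (math.floor(lat), math.floor(lon))

def tiles_needed(locations):
    """Map each unique tile to the IDs of the locations it covers."""
    tiles = {}
    for loc_id, lat, lon in locations:
        tiles.setdefault(tile_for(lat, lon), []).append(loc_id)
    return tiles

# Hypothetical locations (ID, latitude, longitude) in decimal degrees.
locations = [
    ("A", 39.20, -96.60),
    ("B", 39.75, -96.10),
    ("C", 38.10, -97.90),
]

needed = tiles_needed(locations)
for (south, west), ids in sorted(needed.items()):
    print(f"tile lat {south}..{south + 1}, lon {west}..{west + 1}: {ids}")
```

Here locations A and B fall in the same tile, so three locations require only two images per variable, depth layer, and statistic.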

The POLARIS database provides 13 soil variables (Data file 2 in Table 1) related to physical and chemical properties (e.g., soil organic matter, pH, clay, silt, sand, bulk density, ksat, etc.) at six different depth layers (0–5, 5–15, 15–30, 30–60, 60–100, and 100–200 cm) and a 30-m spatial resolution. Because the database was constructed from a probabilistic model [7], values are summarized by their mean, mode, median (p50), 5th (p5) and 95th (p95) percentiles. All POLARIS raster files use a geographic coordinate system (GCS) and the WGS84 datum.
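Because values are reported per depth layer, a common follow-up step is to combine layers into a single value for a custom depth interval. The Python sketch below shows one way to do this, a thickness-weighted mean over the six layers listed above. It is an illustration rather than part of XPolaris; the function name and the clay values are hypothetical.

```python
# The six POLARIS depth layers listed above, as (top_cm, bottom_cm).
LAYERS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]

def depth_weighted_mean(values, top_cm, bottom_cm, layers=LAYERS):
    """Thickness-weighted mean of per-layer values over [top_cm, bottom_cm]."""
    total = 0.0
    thickness = 0.0
    for (layer_top, layer_bottom), value in zip(layers, values):
        # Thickness of the part of this layer inside the requested interval.
        overlap = min(bottom_cm, layer_bottom) - max(top_cm, layer_top)
        if overlap > 0:
            total += value * overlap
            thickness += overlap
    if thickness == 0:
        raise ValueError("requested interval does not overlap any layer")
    return total / thickness

# Hypothetical clay contents (%) for the six layers at one location.
clay = [20.0, 22.0, 26.0, 30.0, 32.0, 33.0]
print(depth_weighted_mean(clay, 0, 30))
```

For the 0–30 cm interval the weights are 5, 10, and 15 cm, giving (20·5 + 22·10 + 26·15) / 30. A plain weighted mean like this suits variables on a linear scale; a variable reported on a log scale would need to be back-transformed first.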

The CSV file (Dataset 1 in Table 1) is an example of location input, containing three geographical coordinates in Kansas for which soil data will be retrieved and on which the R functions will be tested. The example data is also distributed with the XPolaris package [8]. XPolaris facilitates code implementation by sparing users from writing extensive functions. In addition, the package was tested across different operating systems and released on CRAN [9].
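Before replacing the example with their own locations, users may want to check that every point lies inside the area the database covers. The Python sketch below parses a location table and rejects points outside a rough CONUS bounding box. The column names (`ID`, `lat`, `long`), the bounding-box limits, and the three rows are assumptions for illustration; they are not the values in Dataset 1.

```python
import csv
import io

# Approximate CONUS bounding box in decimal degrees (an assumption for illustration).
LAT_RANGE = (24.0, 50.0)
LON_RANGE = (-125.0, -66.0)

def read_locations(text):
    """Parse CSV text with ID, lat, long columns and reject points outside the box."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        lat, lon = float(row["lat"]), float(row["long"])
        inside = (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
                  and LON_RANGE[0] <= lon <= LON_RANGE[1])
        if not inside:
            raise ValueError(f"location {row['ID']} is outside the assumed CONUS box")
        rows.append((row["ID"], lat, lon))
    return rows

# Hypothetical three-row input; not the values shipped in Dataset 1.
example_csv = """ID,lat,long
site1,39.20,-96.60
site2,38.10,-97.90
site3,37.50,-100.80
"""

print(read_locations(example_csv))
```

A bounding box is only a coarse filter: it also admits points in Canada, Mexico, and the oceans, so a stricter check would test against the CONUS boundary itself.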

Limitations

  • The local machine must have available disk space to store the raster images.

  • Visualization functions are not included for the retrieved soil data.

  • Currently, soil data cannot be summarized within spatial polygons.

  • Soil data output is not directly compatible with crop simulation models (e.g., APSIM, DSSAT).

Availability of data and materials

The data described in this Data note can be freely and openly accessed on Harvard Dataverse: https://doi.org/10.7910/DVN/DCZ0N3 [1]. Please see Table 1 for details about the data.

Abbreviations

SWAP:

Soil Water Atmosphere and Plant

ksat:

Saturated hydraulic conductivity

θsaturated:

Saturated soil water content

θresidual:

Residual water content

DSMART:

Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees

SSURGO:

Soil Survey Geographic database

CSV:

Comma-Separated Values

DSSAT:

Decision Support System for Agrotechnology Transfer

References

  1. Moro Rosso LH, de Borja Reis AF, Correndo AA, Ciampitti IA. Retrieving POLARIS data using R-software. Harvard Dataverse, V2. 2021. https://doi.org/10.7910/DVN/DCZ0N3.

  2. Chaney NW, Wood EF, McBratney AB, Hempel JW, Nauman TW, Brungard CW, Odgers NP. POLARIS: a 30-meter probabilistic soil series map of the contiguous United States. Geoderma. 2016;274:54–67. https://doi.org/10.1016/j.geoderma.2016.03.025.

  3. Chaney NW, Minasny B, Herman JD, Nauman TW, Brungard CW, Morgan CLS, McBratney AB, Wood EF, Yimam Y. POLARIS soil properties: 30-m probabilistic maps of soil properties over the contiguous United States. Water Resour Res. 2019;55:2916–38. https://doi.org/10.1029/2018WR022797.

  4. https://github.com/chaneyn/polaris_api_client.

  5. Correndo AA, Rotundo JL, Tremblay N, Archontoulis S, Coulter JA, Ruiz-Diaz D, Franzen D, Franzluebbers A, Nafziger E, Schwalbert R, Steinke K, Williams J, Messina CD, Ciampitti IA. Assessing the uncertainty of maize yield without nitrogen fertilization. Field Crops Res. 2021;260:107985. https://doi.org/10.1016/j.fcr.2020.107985.

  6. de Borja Reis AF, Moro Rosso LH, Purcell LC, Naeve S, Casteel SN, Kovács P, Archontoulis S, Davidson D, Ciampitti IA. Environmental factors associated with nitrogen fixation prediction in soybean. Front Plant Sci. 2021;12. https://doi.org/10.3389/fpls.2021.675410.

  7. Odgers NP, Sun W, McBratney AB, Minasny B, Clifford D. Disaggregating and harmonising soil map units through resampled classification trees. Geoderma. 2014;214–215:91–100. https://doi.org/10.1016/j.geoderma.2013.09.024.

  8. Moro Rosso LH, de Borja Reis AF, Correndo AA, Ciampitti IA. XPolaris: Retrieving Soil Data from POLARIS. 2021. https://cran.r-project.org/web/packages/XPolaris/index.html.

  9. R Core Team. R: A Language and Environment for Statistical Computing. 2021. https://www.r-project.org/.

Acknowledgements

The authors express their gratitude for the financial support provided by the Kansas Corn Commission and Kansas State University for sponsoring LMR’s M.S. program and Dr. Ciampitti’s research program. Contribution no. 22-024-J from the Kansas Agricultural Experiment Station.

Funding

Kansas State University and Kansas State Research and Extension.

Author information

Contributions

LHMR drafted the programming code, the visualization, and the data note. AFBR revised the programming code and the data note. AAC revised the programming code, the visualization, and the data note. IAC revised the programming code and the data note, and supervised the project. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Luiz H. Moro Rosso, Adrian A. Correndo or Ignacio A. Ciampitti.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Moro Rosso, L.H., de Borja Reis, A.F., Correndo, A.A. et al. XPolaris: an R-package to retrieve United States soil data at 30-meter resolution. BMC Res Notes 14, 327 (2021). https://doi.org/10.1186/s13104-021-05729-y

  • DOI: https://doi.org/10.1186/s13104-021-05729-y
