XPolaris: an R-package to retrieve United States soil data at 30-meter resolution
BMC Research Notes volume 14, Article number: 327 (2021)
This data article introduces the “XPolaris” R-package, designed to facilitate access to detailed soil data at any geographical location within the contiguous United States (CONUS). Without requiring advanced R programming skills, XPolaris enables users to convert raster data from the POLARIS database into a traditional spreadsheet format [i.e., Comma-Separated Values (CSV)] for further data analyses.
The core of this publication is a code-tutorial envisioned to assist users in retrieving soil raster data within the CONUS. All data is sourced from the POLARIS database, a 30-m probabilistic map of soil series and different soil properties [Chaney et al. Geoderma 274:54, 2016, Chaney et al. Water Resour Res 55:2916, 2019]. POLARIS represents an optimization of the Soil Survey Geographic (SSURGO) database, circumventing issues of spatial disaggregation, harmonizing, and filling spatial gaps. POLARIS was constructed using a machine learning algorithm, the Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees (DSMART-HPC) [Odgers et al. Geoderma 214:91, 2014]. Although the data is easily accessible in a raster format, retrieving large amounts of data can be time-consuming or require advanced programming skills.
The objective of this dataset is to introduce the R-package “XPolaris”, a collection of functions for retrieving soil data from the POLARIS database [2, 3]. Although POLARIS raster images are easily accessible and a client API (Application Programming Interface) has recently been released, programming skills are still necessary to retrieve large amounts of data. The core functionalities of XPolaris therefore facilitate access to soil data regardless of the number of geographical locations. Because each raster image holds a large volume of data, efficient coding is necessary to meet the user's needs with a minimal download. Examples of research publications taking advantage of soil information from the POLARIS database are presented below:
In Correndo et al. (Field Crops Res 260:107985, 2021), gridded soil data (soil organic matter, clay, silt, and sand at 0–15 cm) was obtained for 679 site-years across North America. The research aimed to predict corn yield using a machine learning algorithm (conditional random forests). About 50% of corn yield variability was explained by crop management and soil variables, with previous crop and soil organic matter as the most relevant features.
In de Borja Reis et al. (Front Plant Sci 12, 2021), soil water variables (ksat, θsaturated, θresidual, and van Genuchten–Mualem parameters) from 95 US locations were used in the SWAP model for simulating crop evapotranspiration reduction (drought stress). The project aimed to predict soybean biological nitrogen fixation using linear model regularization (elastic net). This method identified soil and weather variables most strongly associated with nitrogen fixation (40% of evaluated features).
Data files are deposited in the Harvard Dataverse repository “Retrieving POLARIS data using R-software”. The RMarkdown file (*.rmd) (Data file 1 in Table 1) was generated using R version 4.0.3 (macOS, 64-bit) and RStudio v1.4.1103. It presents XPolaris and its core functionalities. There is no limit on the amount of data retrieved by the user. However, image download time depends on the internet connection, and large objects can exceed the memory limit of the R environment and/or machine. The code chunks must be executed in the order they are presented in the RMarkdown file. Users can replace the location data with their own.
In the tutorial PDF file (*.pdf) (Data file 2 in Table 1), users are introduced to the input format (Sect. “Introduction” of the tutorial) and to three functions for: (1) checking the images from which location data must be retrieved (Sect. “Location areas”); (2) downloading raster images covering the requested soil variables and depths (Sect. “Downloading images”); and (3) extracting the soil data from the images to generate a CSV output for further analyses (Sect. “Extracting soil data”). Details on the function arguments are included in another PDF file (*.pdf) (Data file 3 in Table 1).
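The first step above (finding which images cover the requested locations) can be illustrated with a minimal base-R sketch. It does not call XPolaris. The helper `tile_id()` is hypothetical, the 1° × 1° tiling and the label format are assumptions about how POLARIS images are split, and the coordinates are made up for illustration.

```r
# Hypothetical helper (not part of XPolaris): label the 1 x 1 degree tile
# that contains each point. Tile size and label format are assumptions.
tile_id <- function(lat, long) {
  lat0 <- floor(lat)   # southern edge of the tile
  lon0 <- floor(long)  # western edge of the tile
  sprintf("lat%d%d_lon%d%d", lat0, lat0 + 1, lon0, lon0 + 1)
}

# Illustrative coordinates in Kansas (decimal degrees, WGS84).
locations <- data.frame(
  ID   = c("site_A", "site_B", "site_C"),
  lat  = c(39.19, 39.05, 37.99),
  long = c(-96.58, -96.90, -100.81),
  stringsAsFactors = FALSE
)

locations$tile <- tile_id(locations$lat, locations$long)

# Locations that share a tile need only one image per variable and depth.
tiles_needed <- unique(locations$tile)
print(locations)
print(tiles_needed)
```

Under these assumptions, site_A and site_B fall in the same tile, so three locations require only two images for each variable, statistic, and depth.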
The POLARIS database provides 13 soil variables (Data file 2 in Table 1) related to physical and chemical properties (e.g., soil organic matter, pH, clay, silt, sand, bulk density, ksat, etc.) at six different depth layers (0–5, 5–15, 15–30, 30–60, 60–100, and 100–200 cm) and a 30-m spatial resolution. Because the database was constructed from a probabilistic model, values are summarized by their mean, mode, median (p50), 5th (p5) and 95th (p95) percentiles. All POLARIS raster files use a geographic coordinate system (GCS) and the WGS84 datum.
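To get a sense of how quickly requests grow, the counts stated above can be combined in a short base-R sketch. The depth and statistic labels below simply mirror the text and are not necessarily the identifiers used by POLARIS or XPolaris.

```r
# Depth layers (cm) and summary statistics as listed in the text.
depths     <- c("0-5", "5-15", "15-30", "30-60", "60-100", "100-200")
statistics <- c("mean", "mode", "p50", "p5", "p95")
n_variables <- 13  # number of soil variables stated in the text

# Every statistic x depth combination available for a single soil variable.
layers <- expand.grid(statistic = statistics, depth = depths,
                      stringsAsFactors = FALSE)

layers_per_variable <- nrow(layers)                      # 5 x 6 = 30
layers_per_location <- layers_per_variable * n_variables # 30 x 13 = 390

print(layers_per_variable)
print(layers_per_location)
```

Requesting only the statistics, variables, and depths actually needed keeps the number of images, and therefore the download, small.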
The CSV file (Dataset 1 in Table 1) is an example of location input, containing three geographical coordinates in Kansas for which soil data will be retrieved and on which the R functions will be tested. The example data also comes with the XPolaris package. XPolaris simplifies code implementation by sparing users from writing extensive functions. In addition, the package was tested across different operating systems and is released on CRAN.
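A location input of this kind could be prepared and checked as in the following base-R sketch. The column names (`ID`, `lat`, `long`), the coordinates, and the approximate CONUS bounding box are all assumptions made for illustration; the tutorial (Data file 2 in Table 1) describes the actual input format.

```r
# Hypothetical location table; column names and coordinates are illustrative
# and are not the values stored in Dataset 1.
locations <- data.frame(
  ID   = c("site_A", "site_B", "site_C"),
  lat  = c(39.19, 38.86, 37.99),
  long = c(-96.58, -99.33, -100.81),
  stringsAsFactors = FALSE
)

# Rough CONUS bounding box in decimal degrees (WGS84); approximate limits.
in_conus <- with(locations,
                 lat >= 24.5 & lat <= 49.5 & long >= -125 & long <= -66.5)

# Stop early if a point is outside CONUS or an ID is repeated.
stopifnot(all(in_conus), anyDuplicated(locations$ID) == 0)

# Round trip through CSV, the spreadsheet format used for input and output.
csv_path <- file.path(tempdir(), "locations.csv")
write.csv(locations, csv_path, row.names = FALSE)
locations_in <- read.csv(csv_path, stringsAsFactors = FALSE)
str(locations_in)
```

Checking coordinates before any download avoids requesting images for points that POLARIS does not cover.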
The local machine must have available disk space to store the raster images.
Visualization functions are not included for the retrieved soil data.
Currently, soil data cannot be summarized within spatial polygons.
Soil data output is not directly compatible with crop simulation models (e.g., APSIM, DSSAT).
Availability of data and materials
The data described in this Data note can be freely and openly accessed on Harvard Dataverse: https://doi.org/10.7910/DVN/DCZ0N3. Please see Table 1 for details about the data.
- SWAP: Soil Water Atmosphere and Plant
- ksat: Saturated hydraulic conductivity
- θsaturated: Saturated soil water content
- θresidual: Residual water content
- DSMART-HPC: Disaggregation and Harmonisation of Soil Map Units Through Resampled Classification Trees
- SSURGO: Soil Survey Geographic database
- DSSAT: Decision Support System for Agrotechnology Transfer
Moro Rosso LH, de Borja Reis AF, Correndo AA, Ciampitti IA. Retrieving POLARIS data using R-software. Harvard Dataverse, V2. 2021. https://doi.org/10.7910/DVN/DCZ0N3.
Chaney NW, Wood EF, McBratney AB, Hempel JW, Nauman TW, Brungard CW, Odgers NP. POLARIS: a 30-meter probabilistic soil series map of the contiguous United States. Geoderma. 2016;274:54–67. https://doi.org/10.1016/j.geoderma.2016.03.025.
Chaney NW, Minasny B, Herman JD, Nauman TW, Brungard CW, Morgan CLS, McBratney AB, Wood EF, Yimam Y. POLARIS soil properties: 30-m probabilistic maps of soil properties over the contiguous United States. Water Resour Res. 2019;55:2916–38. https://doi.org/10.1029/2018WR022797.
Correndo AA, Rotundo JL, Tremblay N, Archontoulis S, Coulter JA, Ruiz-Diaz D, Franzen D, Franzluebbers A, Nafziger E, Schwalbert R, Steinke K, Williams J, Messina CD, Ciampitti IA. Assessing the uncertainty of maize yield without nitrogen fertilization. Field Crops Res. 2021;260:107985. https://doi.org/10.1016/j.fcr.2020.107985.
de Borja Reis AF, Moro Rosso LH, Purcell LC, Naeve S, Casteel SN, Kovács P, Archontoulis S, Davidson D, Ciampitti IA. Environmental factors associated with nitrogen fixation prediction in soybean. Front Plant Sci. 2021;12. https://doi.org/10.3389/fpls.2021.675410.
Odgers NP, Sun W, McBratney AB, Minasny B, Clifford D. Disaggregating and harmonising soil map units through resampled classification trees. Geoderma. 2014;214–215:91–100. https://doi.org/10.1016/j.geoderma.2013.09.024.
Moro Rosso LH, de Borja Reis AF, Correndo AA, Ciampitti IA. XPolaris: Retrieving Soil Data from POLARIS. 2021. https://cran.r-project.org/web/packages/XPolaris/index.html.
R Core Team. R: A Language and Environment for Statistical Computing. 2021. https://www.r-project.org/.
The authors express their gratitude for the financial support provided by the Kansas Corn Commission and Kansas State University, which sponsored LMR’s M.S. program and Dr. Ciampitti’s research program. Contribution no. 22-024-J from the Kansas Agricultural Experiment Station.
Kansas State University and Kansas State Research and Extension.
Ethics approval and consent to participate
Consent for publication
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Moro Rosso, L.H., de Borja Reis, A.F., Correndo, A.A. et al. XPolaris: an R-package to retrieve United States soil data at 30-meter resolution. BMC Res Notes 14, 327 (2021). https://doi.org/10.1186/s13104-021-05729-y
- Soil properties