Volume 55, Issue 4 p. 2916-2938
Research Article

POLARIS Soil Properties: 30-m Probabilistic Maps of Soil Properties Over the Contiguous United States

Nathaniel W. Chaney (Corresponding Author)
Department of Civil and Environmental Engineering, Duke University, Durham, NC, USA
Correspondence to: N. W. Chaney

Budiman Minasny
Department of Environmental Sciences, Faculty of Agriculture and Environment, The University of Sydney, Sydney, New South Wales, Australia

Jonathan D. Herman
Department of Civil and Environmental Engineering, University of California, Davis, CA, USA

Travis W. Nauman
U.S. Geological Survey, Southwest Biological Science Center, Moab, UT, USA

Colby W. Brungard
Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM, USA

Cristine L. S. Morgan
Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, USA

Alexander B. McBratney
Department of Environmental Sciences, Faculty of Agriculture and Environment, The University of Sydney, Sydney, New South Wales, Australia

Eric F. Wood
Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ, USA

Yohannes Yimam
Formation Environmental LLC, Sacramento, CA, USA
First published: 05 February 2019
Citations: 70

Abstract

Soils play a critical role in the cycling of water, energy, and carbon in the Earth system. Until recently, due primarily to a lack of soil property maps of sufficiently high quality and spatial detail, little emphasis has been placed on providing high-resolution measured soil parameter estimates for land surface models and hydrologic models. This study introduces Probabilistic Remapping of SSURGO (POLARIS) soil properties, a database of 30-m probabilistic soil property maps over the contiguous United States (CONUS). The mapped variables over CONUS include soil texture, organic matter, pH, saturated hydraulic conductivity, Brooks-Corey and Van Genuchten water retention curve parameters, bulk density, and saturated water content. POLARIS soil properties was assembled by (1) depth harmonizing and aggregating the pedons in the National Cooperative Soil Survey Soil Characterization Database and the components in the Soil Survey Geographic Database into a database of 21,481 different soil series, each soil series having its own vertical profiles of different soil properties, (2) pruning the original POLARIS soil series maps using conventional soil maps to improve soil series prediction accuracy, and (3) merging the assembled soil series databases with the pruned POLARIS soil series maps to construct the soil property maps over CONUS. POLARIS soil properties includes 100-bin histograms for each layer and variable per grid cell and a series of summary statistics at 30-, 300-, and 3,000-m spatial resolution. Evaluation of POLARIS soil properties using in situ measurements shows an average R² of 0.41, a normalized root-mean-square error of 12%, and a normalized mean absolute error of 8.8%.

Key Points

  • This study introduces POLARIS soil properties, a set of probabilistic 30-m soil property maps over the contiguous United States
  • POLARIS soil properties provides improved local prior distributions of soil parameters for land surface and hydrologic models
  • In situ evaluation shows an average R² of 0.41, a normalized root-mean-square error of 12%, and a normalized mean absolute error of 8.8%

1 Introduction

Soil plays a critical role in the cycling of water, energy, and carbon in the Earth system at a range of temporal and spatial scales. Soil helps regulate and influence ecosystems through the provision of nutrients for living organisms; the storage, cycling, and release of water, heat, carbon, nitrogen, and other essential nutrients; and the provision of a medium for vegetation growth and the anchoring of artificial structures (Brady & Weil, 2008; Chaney et al., 2014; Crow et al., 2012; Farouki, 1986; Grayson et al., 1997; Lichstein et al., 2014; Manzoni & Porporato, 2009; Rodriguez-Iturbe & Porporato, 2004). Recognizing the importance of soil in the Earth system, land surface models aim to simulate soil function through the simulated coupling of the water, heat, and carbon states and fluxes (Best et al., 2011; Koster et al., 2000; Liang et al., 1994; Milly et al., 2014; Niu et al., 2011; Oleson et al., 2013).

The accuracy of the simulated soil processes in land surface models is strongly tied to a robust characterization of the underlying soil properties within these models; in other words, the model parameters (e.g., water retention curve parameters) can play a key role in the modeled processes (Demaria et al., 2012; Hou et al., 2012; Melsen et al., 2016; Rosero et al., 2010). For example, under heavy rainfall the partitioning of precipitation into infiltration and surface runoff can vary strongly depending on the antecedent soil moisture conditions and soil composition; under identical conditions, a soil with high clay and low sand content will commonly have a much higher fraction of surface runoff than a soil with high sand and low clay content. Prior sensitivity analysis studies have shown that soil parameters are one of the key sources of uncertainty in hydrologic and land surface models (Chaney et al., 2015).

The significant uncertainty in existing soil property maps is one of the primary reasons for the continued reliance on parameter calibration in hydrologic and land surface models. To calibrate the model parameters, prior distributions are constrained using observations of water, energy, and carbon states and fluxes to improve model performance (Cibin et al., 2010; Döll et al., 2003; Harding et al., 2014; Sheffield et al., 2006; Troy et al., 2008). In the simplest case, these prescribed prior distributions are assumed spatially invariant and do not account for local environmental characteristics (Chaney et al., 2015). This underconstrained condition effectively allows optimization routines to trade physical realism for model performance by compensating for other uncertainties in meteorological forcing and model structure, a problem amplified by spatially distributed models with many degrees of freedom (Gupta et al., 2008). As a result, optimization routines may arrive at one of several behavioral parameter sets, a pervasive challenge known as parameter equifinality (Beven, 2006).

Recognizing these challenges, recent studies have improved parameter estimation techniques based on spatially varying geophysical characteristics, including the Model Parameter Regionalization (MPR) approach (Samaniego et al., 2010). This method develops nonlinear transfer functions linking high-resolution geophysical data, including soil properties, to hydrologic model parameters. Using a target hydrologic model, the coefficients of the transfer functions are calibrated against water and energy flux observations, and the resulting parameters are upscaled to the desired model scale. The MPR approach has been applied across a wide range of scales and hydrologic conditions and has demonstrated a significant potential to help move beyond previous regionalization approaches (e.g., Mizukami et al., 2017; Rakovec et al., 2016; Samaniego et al., 2017). MPR and other parameter regionalization approaches strongly depend on the availability of fine-scale soil data to inform the existing underlying soil properties, which typically include continental-scale soil data sets (e.g., Miller & White, 1998). However, these data are limited by their poor representation of both uncertainty and multiscale heterogeneity, which limits their ability to bridge the persistent gap between model estimates and macroscale observations of land surface states and fluxes. An opportunity remains for fine-scale probabilistic soil property data to inform parameter estimation frameworks for hydrologic and land surface models.

These persistent challenges of prescribing reliable model parameters in hydrologic and land surface models are among the key motivations of digital soil mapping (DSM). The underlying basis of DSM is to leverage existing high-resolution soil and environmental data, semiautomated computing approaches (e.g., machine learning), and the physical relationship between soil properties and the physical environment (e.g., topography, parent material, land cover, and climate) to create digital soil maps (note that DSM is different from digitizing traditional soil maps; McBratney et al., 2003). DSM presents a unique opportunity to address many of the underlying challenges of existing soil data sets used in land surface modeling: (1) provide an improved and spatially complete and consistent characterization of the observed multiscale heterogeneity of soil properties, (2) improve the quality of the deterministic soil parameter estimates, and (3) provide locally relevant prior distributions for soil properties to bound the unavoidable uncertainties in soil data sets. These DSM efforts are leading to the emergence of multiple continental and global data sets at spatial resolutions between 30 m and 1 km (Arrouays et al., 2014; Chaney et al., 2016; Hengl et al., 2017; Ramcharan et al., 2017; Sanchez et al., 2009).

Recognition that DSM can address the weaknesses of the soil property maps used for land surface models and hydrologic models has led to the recent development of the probabilistic POLARIS soil series maps over the contiguous United States (CONUS; Chaney et al., 2016). Derived from the Soil Survey Geographic Database (SSURGO), which compiles a century's worth of soil surveys, POLARIS provides predictions with uncertainties of soil series, the narrowest category in the United States Department of Agriculture (USDA) soil classification system, over the contiguous United States at a 30-m spatial resolution. This study first improves the original POLARIS data set (known as POLARIS soil series for the remainder of this study) and then derives property information to produce 30-m maps of a suite of 13 soil property variables at six different depth layers over CONUS. This database, known as POLARIS soil properties, was assembled by (1) harmonizing and aggregating the pedon data in the National Cooperative Soil Survey (NCSS) soil characterization database (SCD) and the component information in SSURGO into a database of 21,481 different soil series, each soil series having its own vertical profiles of different soil properties, (2) pruning the original probabilistic POLARIS soil series maps using SSURGO and the second version of the State Soil Geographic data set (STATSGO2) to improve soil series prediction accuracy, and (3) merging the assembled soil series databases with the pruned POLARIS soil series maps to construct maps of 100-bin histograms for each layer and property per 30-m grid cell over CONUS. A series of summary statistics were also derived over the domain including mean, mode, median, 5th percentile, and 95th percentile at 1, 10, and 100 arcsec (30, 300, and 3,000 m). POLARIS soil properties was then evaluated using SCD. Finally, a nonparametric sensitivity metric was used to quantify the changes between global prior distributions of the mapped soil properties and the localized 30-m posterior distributions of the same properties.

2 Data

2.1 POLARIS Soil Series

POLARIS soil series (Chaney et al., 2016) provides 30-m soil series predictions with uncertainties over CONUS. It was constructed using available high-resolution geospatial environmental data (e.g., elevation and land cover data) and random forests, a machine learning method, to remap SSURGO over CONUS. For each grid cell, it predicts the 50 most probable soil series with their associated prediction probabilities based on local environmental covariates. POLARIS soil series provides a spatially continuous, internally consistent, probabilistic prediction of soil series. These probabilistic soil series maps aimed to build on the wealth of data in SSURGO by gap-filling unmapped areas using survey data from the surrounding regions, removing artificial discontinuities at political boundaries, and spatially disaggregating multicomponent polygons. However, in practice, the lack of sufficient constraints on the predictions led to soil series predictions that were generally highly uncertain and at times physically implausible. As such, prior to mapping the soil properties, this paper further constrains POLARIS soil series (explained below) using SSURGO and STATSGO2 to improve the soil series predictions.

2.2 SSURGO and STATSGO2

SSURGO is a compilation of the highest detail soil surveys over the United States (Soil Survey Staff, 2018). Managed and updated annually by the NCSS, it is a polygon vector map where each polygon is assigned to a map unit; map units are usually shared among polygons. Generally, for a given region, these polygons are built by local surveyors extrapolating observed soil information (i.e., pedons) through soil/landscape models and available environmental data (e.g., aerial images). Furthermore, STATSGO2 generalizes the higher detail soil survey maps in SSURGO to provide a coarser soil survey database over the contiguous United States (Soil Survey Staff, 2017). In areas where detailed soil survey maps are not available, soil environmental covariates (e.g., geology, topography, vegetation, and climate) are used to gap fill missing areas with existing map units in STATSGO2.

In SSURGO, map units generally consist of a set of soil-type components that describe the underlying soil and landscape characteristics of the map unit (e.g., soil texture). These components can be either soil series (or a higher taxonomic level) or other characteristic land features such as urban areas, water bodies, or rock outcrops. For each component, a set of soil properties are commonly reported at multiple horizons. These properties include percent sand, percent silt, percent clay, field capacity, permanent wilting point, organic matter, and bulk density, among others. These properties come from different sources including local expert knowledge, lab measurements, and pedotransfer functions. In many cases, each property at each horizon provides a lower bound, a representative value, and an upper bound—the parameters for a triangular distribution.

2.3 NASIS

The National Soil Information System (NASIS) database is the official Natural Resources Conservation Service database that contains in situ pedon observations made over the years by soil surveyors over the United States. Each point in NASIS is assigned a soil series based on local surveyor knowledge; the majority of data in NASIS are pedons used for correlating map units across landscapes and field soil transects used for determining map unit composition and component percentages in SSURGO. NASIS sites with latitude and longitude values within CONUS, including areas not currently mapped by SSURGO, were used in this study. The curated NASIS database consists of over 200,000 point observations. Following the approach used to evaluate POLARIS soil series in Chaney et al. (2016), NASIS was used to evaluate the pruned POLARIS soil series and to assess its improvement.

2.4 NCSS SCD

The NCSS SCD provides lab measurements for a suite of soil properties for over 20,000 pedons (National Cooperative Soil Survey, 2016). Since collection, these sampled pedons have been analyzed to extract soil properties of interest including soil texture, organic matter content, permanent wilting point, field capacity, and bulk density. The majority of the SCD pedons were obtained and analyzed over the last 40 years with 75% of the analysis happening over the past 20 years. As the reference SCD over the United States, it is the primary database from which individual component soil properties in SSURGO are defined. In this paper, the 2017 version of SCD was used to augment each SSURGO-derived soil series database, to fit the pedotransfer functions in NeuroTheta, and to evaluate the resulting POLARIS soil properties. To ensure a robust evaluation of the final soil property maps, the pedons in SCD are split at random into training (80%) and evaluation (20%) data sets. For the remainder of this article, the training data set will be defined as SCDt and the evaluation data set as SCDe. The withheld SCD set is labeled as an evaluation set instead of a validation set because the SCD data will still have some representation in the SSURGO property data used in other parts of the prediction process; it is thus not a completely independent test set, as would be expected for a formal validation. However, separating out the 20% will provide some degree of independence in our evaluation.
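
The random 80/20 split into SCDt and SCDe can be illustrated with a short sketch. It assumes pedons are identified by integer IDs; the function name `split_pedons`, the fixed seed, and the use of NumPy are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def split_pedons(pedon_ids, train_fraction=0.8, seed=0):
    """Randomly split pedon IDs into training (SCDt) and evaluation (SCDe) sets."""
    rng = np.random.default_rng(seed)  # the seed is an arbitrary, hypothetical choice
    ids = np.asarray(pedon_ids)
    shuffled = rng.permutation(ids)
    n_train = int(round(train_fraction * ids.size))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical pedon identifiers standing in for the SCD pedons.
scd_t, scd_e = split_pedons(np.arange(20000))
```

Because the split is by pedon rather than by horizon, all layers of a withheld pedon stay out of the training set in this sketch.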

3 Methods

3.1 Creating POLARIS Soil Properties

To create POLARIS soil properties, three main steps were followed: (1) assemble for each soil series a property database from SSURGO and SCDt, (2) prune the original POLARIS soil series using SSURGO and STATSGO2, and (3) use the assembled soil series database of property profiles and the pruned POLARIS soil series to create gridded soil property maps over CONUS (i.e., POLARIS soil properties). This section provides an overview of each of these steps. The soil properties that have been assembled and mapped are outlined in Table 1.

Table 1. Soil Properties That Are Mapped Over the Contiguous United States at a 30-m Spatial Resolution
Name                 Min      Max    Units    Origin             Scale
Sand                 0.0      100    %        SSURGO and SCDt    Linear
Silt                 0.0      100    %        SSURGO and SCDt    Linear
Clay                 0.0      100    %        SSURGO and SCDt    Linear
θs                   0.24     0.81   m3/m3    SSURGO and SCDt    Linear
θr                   0.001    0.25   m3/m3    NeuroTheta         Linear
Bulk density         0.5      2.0    g/cm3    SSURGO and SCDt    Linear
Ksat                 0.001    250    cm/hr    SSURGO             Log10
Organic matter       0.001    100    %        SSURGO and SCDt    Log10
pH                   3.0      10     N/A      SSURGO and SCDt    Linear
hb (Brooks-Corey)    0.1      250    cm       NeuroTheta         Log10
λ (Brooks-Corey)     0.1      1.0    cm       NeuroTheta         Linear
α (Van Genuchten)    0.01     10     cm       NeuroTheta         Log10
n (Van Genuchten)    1.0      2.5    N/A      NeuroTheta         Linear
  • Note. The properties for each soil series database from which the maps are created originate from different sources including SSURGO components, SCDt pedons, and the NeuroTheta pedotransfer function. The table above provides the property name, its minimum and maximum value, the origin of the data, and the scale on which the property is mapped. Note that θs is assumed to be equal to the porosity which is calculated from the measured bulk density. SSURGO = Soil Survey Geographic Database; SCD = soil characterization database; N/A = not applicable.
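
The minimum, maximum, and scale columns of Table 1 fix the histogram bin edges used throughout. A minimal sketch of how such edges could be built is given below; it assumes that "Log10" means bins of equal width in log10 space, and the helper name `bin_edges` is hypothetical.

```python
import numpy as np

def bin_edges(vmin, vmax, scale, nbins=100):
    """Return nbins + 1 edges spanning [vmin, vmax] on a linear or log10 scale."""
    if scale == "linear":
        return np.linspace(vmin, vmax, nbins + 1)
    if scale == "log10":
        # Assumption: equal-width bins in log10 space.
        return np.logspace(np.log10(vmin), np.log10(vmax), nbins + 1)
    raise ValueError(f"unknown scale: {scale}")

# Two rows of Table 1: clay (linear, 0-100 %) and Ksat (log10, 0.001-250 cm/hr).
clay_edges = bin_edges(0.0, 100.0, "linear")
ksat_edges = bin_edges(0.001, 250.0, "log10")
```

Sharing one set of edges per property is what lets histograms from different soil series and layers be averaged bin by bin.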

3.1.1 Assemble a Properties Database for Each Soil Series

  1. Harmonize and aggregate the SSURGO components and SCDt pedons. The first step in creating the profile database for each soil series was to assemble and harmonize the property data available for that soil series. The following steps were applied to each soil series found in POLARIS soil series. Each component $c_s$ that belongs to soil series s across all map units in SSURGO was harmonized over depth by vertically interpolating the minimum, mode, and maximum profiles reported for property p to six layers (0 to 5, 5 to 15, 15 to 30, 30 to 60, 60 to 100, and 100 to 200 cm; Bishop et al., 1999). For each harmonized layer $l_{c,p}$, the derived minimum, mode, and maximum values were used to assemble a triangular distribution; 1,000 samples were drawn from each triangular distribution. All samples per layer $l_{c,p}$ were then merged and summarized via a 100-bin histogram with bin edges defined by the minimum, maximum, and scale parameters outlined in Table 1. The same approach was used for a soil series' corresponding pedons in SCD, except that in this case a single-value distribution was used (i.e., the mean is the measured value). For each layer $l_{c,p}$, the computed histogram derived from the SSURGO components and the computed histogram derived from the SCD pedons were merged into a single 100-bin histogram by averaging. This nonparametric approach avoids assumptions about the underlying shape of the resulting distribution while ensuring consistency across soil series: for each soil property, the same bin edges are used for all layers and soil series (see Figure 1).
  2. NeuroTheta: Parameters for water retention curves. Although SSURGO and SCD provide many soil properties, they lack the hydraulic properties necessary to parameterize hydrologic and land surface models, which is a key motivation for the development of POLARIS soil properties. To address this challenge, the approach introduced in Minasny and McBratney (2002) was used to estimate the distribution of plausible Van Genuchten and Brooks-Corey parameters per layer for each soil series; the water retention curve parameters are outlined in Table 1 (Figure 2). This method uses an Artificial Neural Network (ANN) to relate the water retention curve parameters to observed θ1,500, θ33, θs, sand, clay, and bulk density. NeuroTheta uses an ensemble of ANNs that were fit via bootstrapping to a data set of water retention data resulting from the combination of the SCD (National Cooperative Soil Survey, 2016), UNSODA (Nemes et al., 2001), and GRIZZLY (Haverkamp et al., 1997) data sets. To adequately provide uncertainty information in the predicted water retention curve parameters, the following steps are used. First, NeuroTheta is run 100 times per soil series layer using 100 different sets of predictors that were assembled using Latin Hypercube Sampling (McKay et al., 1979). This approach is possible because each predictor for a given soil series and layer is a 100-bin histogram. Second, for each set of predictors, NeuroTheta is run using 10 different ANNs selected from the ensemble of trained ANNs. The combination of the ANN ensemble and Latin Hypercube Sampling led to 1,000 samples per water retention curve parameter. As with the already existing soil properties per soil series, these 1,000 samples were summarized via 100-bin histograms with bin edges defined by the minimum, maximum, and scale parameters outlined in Table 1. For further details on NeuroTheta see supporting information S1.
  3. Completing each soil series database: Gap filling. To ensure that each soil series profile properties database is complete, an additional step was taken to gap fill missing soil properties for soil series with missing data. First, soil series were aggregated by different taxonomic levels (family, subgroup, group, suborder, and order). Second, 100-bin histograms were assembled for each variable and layer by averaging the computed 100-bin histograms of a given taxonomic level's immediately adjacent finer level; the coarser the taxonomic level, the more spread exists in the posterior distributions. For each variable and layer, the missing properties for each soil series were gap filled by assigning the histogram of the soil series' closest taxonomic level that has a corresponding populated histogram. For each soil series, the resulting database provides 100-bin histograms per layer for each variable in Table 1.
Figure 1. Harmonization of the Boswell soil series using the profile information from the SSURGO components and SCD pedons. In the depth-harmonized profile, the black line represents the mean of each layer and the gray shading shows the spread from the 1st to 99th percentiles derived from each layer's computed histogram. SSURGO = Soil Survey Geographic Database; SCD = soil characterization database.
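
The histogram construction in step 1 (triangular sampling of SSURGO components, single-value pedon measurements, and merging by averaging) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names, the seed, and all numeric inputs are hypothetical, and depth interpolation is assumed to have been done already.

```python
import numpy as np

def component_histogram(triangles, edges, n_samples=1000, seed=0):
    """Pool triangular samples from all components of one layer into a normalized histogram.

    triangles: list of (minimum, mode, maximum) tuples, one per SSURGO component.
    """
    rng = np.random.default_rng(seed)
    samples = np.concatenate([
        rng.triangular(lo, mode, hi, size=n_samples) for lo, mode, hi in triangles
    ])
    counts, _ = np.histogram(samples, bins=edges)
    return counts / counts.sum()

def pedon_histogram(values, edges):
    """Histogram of measured pedon values (each pedon contributes a single value)."""
    counts, _ = np.histogram(values, bins=edges)
    return counts / counts.sum()

edges = np.linspace(0.0, 100.0, 101)  # clay (%) bins, linear scale as in Table 1
# Hypothetical (min, mode, max) clay values for three components of one layer.
ssurgo_hist = component_histogram([(20, 30, 40), (25, 35, 50), (15, 28, 38)], edges)
# Hypothetical measured clay values from three pedons of the same series and layer.
scd_hist = pedon_histogram([27.0, 33.5, 41.2], edges)
merged = 0.5 * (ssurgo_hist + scd_hist)  # merge the two sources by averaging
```

Averaging normalized histograms gives the two data sources equal weight regardless of how many components or pedons each contains; whether the study weights them this way is an assumption of the sketch.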
Figure 2. Application of NeuroTheta to estimate the Brooks-Corey and Van Genuchten hydraulic parameters for the Boswell soil series. The predictors used in NeuroTheta are the depth-harmonized and merged SSURGO and SCDt profiles. Each layer's black line represents the mean and the gray shading shows the spread from the 1st to 99th percentiles derived from each layer's computed histogram. SSURGO = Soil Survey Geographic Database; SCD = soil characterization database.
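
The sampling design in step 2 (100 Latin hypercube predictor sets, each passed through 10 ensemble members) can be sketched as below. The sketch draws one value per equal-probability stratum from each predictor histogram through its inverse CDF. The function name, the example histograms, and especially the `toy_ensemble` are hypothetical placeholders: the toy functions are not NeuroTheta and only show how 100 x 10 = 1,000 samples arise.

```python
import numpy as np

def lhs_from_histograms(hists, edges_list, n=100, seed=0):
    """Draw n Latin hypercube samples, one column per predictor histogram."""
    rng = np.random.default_rng(seed)
    samples = np.empty((n, len(hists)))
    for j, (hist, edges) in enumerate(zip(hists, edges_list)):
        # One uniform draw inside each of n equal-probability strata, in random order.
        u = (rng.permutation(n) + rng.random(n)) / n
        cdf = np.concatenate([[0.0], np.cumsum(hist) / np.sum(hist)])
        # Inverse CDF: interpolate linearly between bin edges.
        samples[:, j] = np.interp(u, cdf, edges)
    return samples

# Hypothetical predictor histograms for one layer: sand (%) and clay (%).
edges = np.linspace(0.0, 100.0, 101)
sand_hist = np.zeros(100)
sand_hist[10:30] = 1.0
clay_hist = np.zeros(100)
clay_hist[40:60] = 1.0
X = lhs_from_histograms([sand_hist, clay_hist], [edges, edges], n=100)

# Placeholder ensemble of 10 toy functions (NOT the NeuroTheta ANNs).
toy_ensemble = [lambda X, k=k: 0.05 + 0.001 * k + 0.001 * X[:, 1] for k in range(10)]
predictions = np.concatenate([member(X) for member in toy_ensemble])  # 1,000 values
```

The 1,000 predicted values would then be binned with the Table 1 edges of the corresponding parameter.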

3.1.2 Pruning POLARIS Soil Series

In the original POLARIS soil series, the only constraints on soil series predictions were provided by the environmental covariates. Although these maps were complete in that the observed soil series could in most cases be found within the 10 most probable predicted soil series, the sheer number of possible soil series led to low probabilities in the predictions and high confusion among the top predictions per grid cell. In many cases, the most probable soil series were found to be substantially different from each other to the extent that simply merging their properties would not be physically consistent. To maximize the accuracy of POLARIS soil properties, the original POLARIS soil series was pruned using existing legacy soil surveys (SSURGO and STATSGO2). The term “pruned” is used due to the Random Forest (i.e., ensemble of decision trees) origin of POLARIS soil series. It should be noted that the pruning is not done explicitly on the original Random Forests that were used to assemble POLARIS soil series. Instead, the pruning is performed implicitly by removing soil series from the 10 most probable soil series at each grid cell. This histogram of soil series represents the terminal node of the given grid cell's corresponding original Random Forest.

First, the SSURGO and STATSGO2 vector maps were rasterized to a 30-m spatial resolution. To avoid removing correlated soil series, the pruning of POLARIS soil series was done at the subgroup taxonomic level. For each grid cell, the pruning consisted of removing all soil series whose corresponding subgroup is not in the list of plausible subgroups in the legacy soil survey data. When available, priority was given to SSURGO over STATSGO2. In the rare cases where no predicted soil series remain after pruning, the original POLARIS soil series predictions were left unaltered. Furthermore, to minimize the reappearance of spatial discontinuities in SSURGO after pruning, all soil series that did not have the same taxonomic order as the most probable soil series were removed. The probabilities of the remaining soil series after pruning were then normalized for each grid cell to ensure that they add up to 100%; the resulting database of 30-m maps over CONUS is called pruned POLARIS soil series (pPOLARIS soil series). Figure 3 shows an example of the pruning process over a 1 arc degree domain in southeastern Kansas.

Figure 3. Example of pruning POLARIS soil series over a 1 arc degree domain in southeastern Kansas. At each 30-m grid cell, POLARIS soil series is pruned at the subgroup level using SSURGO and STATSGO2. SSURGO = Soil Survey Geographic Database; STATSGO2 = second version of the State Soil Geographic data set.
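
The per-grid-cell pruning logic described in this section can be sketched as follows. The series names, the data layout (plain dictionaries), and the function name `prune_cell` are hypothetical. Two points are assumptions of the sketch rather than statements from the text: the "most probable soil series" used for the order filter is taken after subgroup pruning, and a cell with no surviving series is returned completely unaltered.

```python
def prune_cell(predictions, taxonomy, plausible_subgroups):
    """Prune one grid cell's soil series predictions and renormalize.

    predictions: dict mapping series name -> predicted probability.
    taxonomy: dict mapping series name -> (subgroup, order).
    plausible_subgroups: subgroups found at this cell in the legacy survey
        (SSURGO where available, otherwise STATSGO2).
    """
    kept = {s: p for s, p in predictions.items()
            if taxonomy[s][0] in plausible_subgroups}
    if not kept:
        # Rare case: nothing survives, so keep the original predictions.
        return dict(predictions)
    # Remove series whose order differs from that of the most probable survivor.
    top_order = taxonomy[max(kept, key=kept.get)][1]
    kept = {s: p for s, p in kept.items() if taxonomy[s][1] == top_order}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}

# Hypothetical series names with illustrative subgroup and order labels.
taxonomy = {
    "Alpha": ("Typic Paleudults", "Ultisols"),
    "Beta": ("Typic Argiudolls", "Mollisols"),
    "Gamma": ("Aquic Argiudolls", "Mollisols"),
    "Delta": ("Typic Hapludalfs", "Alfisols"),
}
predictions = {"Alpha": 0.4, "Beta": 0.3, "Gamma": 0.2, "Delta": 0.1}
plausible = {"Typic Argiudolls", "Aquic Argiudolls", "Typic Hapludalfs"}
pruned = prune_cell(predictions, taxonomy, plausible)
```

In this example Alpha is dropped by the subgroup filter, Delta by the order filter, and the remaining probabilities are rescaled to sum to 1.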

3.1.3 PROPR: Assembling the Maps of Soil Properties

Having assembled a property database for each soil series and pruned the POLARIS soil series predictions (pPOLARIS soil series), maps of soil properties were assembled by merging both data sets for each grid cell (see Figure 4). Following Odgers et al. (2015), this was accomplished for each layer per grid cell by computing the weighted mean of the soil property histograms of all soil series predicted by pPOLARIS soil series in that grid cell, with the predicted probabilities as weights. This intermediate data set contains 100-bin histograms for each layer and soil property per grid cell over CONUS. A set of products was then derived from these histograms, including the mean, median, mode, and the 5th and 95th percentiles.

Figure 4. POLARIS soil properties was assembled by merging the pPOLARIS data set with the 20,000+ soil series over the contiguous United States. The resulting database is composed of 30-m maps over the domain for the mean, median, mode, 5th percentile, and 95th percentile for six vertical layers for all variables in Table 1.
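
The merging step and the derived summary statistics can be sketched for a single grid cell and layer as follows. The function names and inputs are hypothetical, the weights are assumed to be the pPOLARIS probabilities, and the statistics are approximated from the binned distribution (bin centers for the mean and mode, linear interpolation of the CDF for percentiles) on a linear-scale property.

```python
import numpy as np

def merge_histograms(probabilities, histograms):
    """Probability-weighted mean of per-series histograms for one cell and layer."""
    w = np.asarray(probabilities, dtype=float)
    h = np.asarray(histograms, dtype=float)
    merged = (w[:, None] * h).sum(axis=0) / w.sum()
    return merged / merged.sum()

def summarize(hist, edges):
    """Approximate mean, mode, median, and 5th/95th percentiles from a histogram."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    cdf = np.concatenate([[0.0], np.cumsum(hist)])

    def quantile(p):
        return float(np.interp(p, cdf, edges))

    return {
        "mean": float(np.sum(hist * centers)),
        "mode": float(centers[np.argmax(hist)]),
        "median": quantile(0.50),
        "p5": quantile(0.05),
        "p95": quantile(0.95),
    }

edges = np.linspace(0.0, 100.0, 101)  # clay (%) bins
# Two hypothetical series: one with all mass at 20-21 % clay, one at 40-41 %.
series_a = np.zeros(100)
series_a[20] = 1.0
series_b = np.zeros(100)
series_b[40] = 1.0
cell_hist = merge_histograms([0.75, 0.25], [series_a, series_b])
stats = summarize(cell_hist, edges)
```

For properties mapped on a log10 scale, the same calculation would presumably be carried out on log-transformed edges; that detail is not covered by this sketch.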

3.2 Assessing Changes in the Parameter Distributions

One of the primary objectives of hydrologic and land surface models is to provide optimal estimates of states and fluxes (e.g., runoff). Given the large uncertainty in model parameters (especially soil properties), these models generally use optimization techniques to tune model parameters to ensure the best model performance. Due to the coarse and poor characterization of horizontal and vertical soil heterogeneity, soil parameters in previous studies have mainly come from prior distributions (generally uniform) that do not account for local environmental characteristics (e.g., geology). The maps in POLARIS soil properties provide a path toward revisiting these oversimplifications by providing more constrained parameter distributions (e.g., residual water content) at each layer of each 30-m grid cell. Note that the priors used in hydrologic and land surface models would be the posteriors in POLARIS soil properties.

To quantify how much POLARIS soil properties constrains the prior distributions of the soil parameters that are used in hydrologic and land surface models (i.e., water retention curve parameters), for each grid cell and layer a distance metric was computed based on the range between the 5th and 95th percentiles of the empirical prior and posterior distributions for parameter x
$$r_x = \frac{x_{95}^{\mathrm{post}} - x_{5}^{\mathrm{post}}}{x_{95}^{\mathrm{prior}} - x_{5}^{\mathrm{prior}}} \qquad (1)$$

This range metric r is a nonparametric statistic that provides insight into how different the posterior distribution is compared to the prior distribution; the minimum value is 0.0 (in the rare case that the range between the 5th and 95th percentiles is condensed to zero) and there is no defined maximum value since the range between 5th and 95th percentiles can also increase in the localized posterior distributions. Lower values of r generally indicate more constrained posterior distributions of soil properties, which should enable improved high-resolution hydrologic and land surface modeling. For further details see Chaney et al. (2015), which employed a similar metric based on cumulative distribution functions. To facilitate evaluation over CONUS, a depth-weighted average of each variable's range metric was calculated for each 30-m grid cell and upscaled to a 1-km spatial resolution.
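
A minimal sketch of the range metric follows. It assumes that r is the ratio of the posterior 5th-95th percentile range to the prior 5th-95th percentile range, which matches the stated minimum of 0.0 and the absence of a maximum, and that "depth-weighted" means weighting each layer by its thickness. The function names and example histograms are hypothetical.

```python
import numpy as np

# Thickness (cm) of the six layers: 0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm.
LAYER_THICKNESS_CM = np.array([5.0, 10.0, 15.0, 30.0, 40.0, 100.0])

def percentile_range(hist, edges, lo=0.05, hi=0.95):
    """Width between the lo and hi quantiles of a binned distribution."""
    cdf = np.concatenate([[0.0], np.cumsum(hist) / np.sum(hist)])
    q_lo, q_hi = np.interp([lo, hi], cdf, edges)
    return float(q_hi - q_lo)

def range_metric(prior_hist, posterior_hist, edges):
    """Assumed form of r: posterior 5th-95th range over prior 5th-95th range."""
    return percentile_range(posterior_hist, edges) / percentile_range(prior_hist, edges)

def depth_weighted(r_per_layer):
    """Thickness-weighted average of the per-layer range metric (an assumption)."""
    return float(np.average(r_per_layer, weights=LAYER_THICKNESS_CM))

edges = np.linspace(0.0, 100.0, 101)
prior = np.ones(100)       # hypothetical flat prior over the full property range
posterior = np.zeros(100)
posterior[40:50] = 1.0     # hypothetical posterior confined to 40-50
r = range_metric(prior, posterior, edges)
r_bar = depth_weighted([r] * 6)
```

Here the posterior range is 9 units against a prior range of 90, so r is about 0.1, indicating a strongly constrained posterior.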

4 Results

4.1 Properties Database for Each Soil Series

The SSURGO components and SCD pedons were harmonized and merged following the approach introduced in section 3.1.1. As an example, Figure 5 shows the resulting profiles for the 15 variables defined in Table 1 for the Boswell soil series. The black line represents the mean of each layer, and the shading shows the spread from the 1st to 99th percentiles; these percentiles were derived from each layer's corresponding histogram. For example, organic matter and saturated hydraulic conductivity decrease with depth. Some variables show little change with depth (e.g., pH) while others show significant differences (e.g., clay). Similar profile information is assembled for all 21,481 soil series that are used to build the maps in POLARIS soil properties.

Figure 5. Example depth-harmonized profiles of the 15 variables defined in Table 1 for the Boswell soil series. The black line represents the mean of each layer, and the gray shading shows the spread from the 1st to 99th percentiles.

Figure 6 shows the per-layer histograms of soil properties for all variables, representing prior distributions over all soil series in CONUS. These prior distributions can be interpreted as the distributions that were then conditioned using local environmental covariates and available legacy soil data to arrive at the localized posterior distributions in the POLARIS properties database. These results exemplify the nonuniformity of these distributions (e.g., sand shows strong multimodality) and thus reinforce the need for a nonparametric approach to summarize soil property distributions. Figure 6 also reveals the underlying challenges of using SSURGO component data. In the case of saturated hydraulic conductivity, the multimodality makes it apparent that most pedons likely report one of only 4–6 different values, illustrating that lookup tables are the most common origin of Ksat estimates in SSURGO. Finally, we note that the bubbling pressure (hbBC) values are roughly an order of magnitude below those reported in previous studies, suggesting an area for further improvement of the NeuroTheta pedotransfer function.

The histograms for each variable and layer are combined for all soil series to arrive at a distribution for all variables and layers over the contiguous United States. These distributions can be interpreted as the prior distributions which are then conditioned using local environmental covariates and available legacy soil data to arrive at the local posterior distributions in POLARIS soil properties.

4.2 Pruning POLARIS Soil Series

To increase the prediction accuracy of POLARIS soil series, the original predicted probabilistic maps were pruned using SSURGO and STATSGO2 (see section 3.1.2). To illustrate the improvement in pPOLARIS soil series, Figure 7 compares the evaluation of predictions from SSURGO, POLARIS soil series, and pPOLARIS soil series at all taxonomic levels using NASIS pedons. As noted in Chaney et al. (2016), the majority of NASIS observations are field soil transect observations used for developing SSURGO map unit concepts. As such, NASIS is not a completely independent evaluation of SSURGO, POLARIS soil series, or pPOLARIS soil series. However, since no other comparable spatial database of soil observations exists for CONUS, NASIS is chosen as the best option for evaluation. This evaluation consists of comparing each NASIS pedon soil series name (or higher taxonomic level) to the predicted soil series, where rank 1 corresponds to the most probable soil series prediction (see Chaney et al., 2016, for more details on this evaluation approach). The distribution of these rank matches is summarized via the empirical cumulative distribution function of the rank match values.
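The rank-match evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the study: the function names, the dictionary layout, and the placeholder series names other than Boswell are hypothetical, and pedons whose series never appears among the predictions are assumed to count toward the denominator only.

```python
from typing import Dict, List, Optional


def rank_match(predicted: Dict[str, float], observed: str) -> Optional[int]:
    """Return the 1-based rank of `observed` among the predictions sorted by
    descending probability, or None if it was not predicted at all."""
    ranked = sorted(predicted, key=lambda name: predicted[name], reverse=True)
    return ranked.index(observed) + 1 if observed in ranked else None


def ecdf_of_ranks(ranks: List[Optional[int]], max_rank: int) -> List[float]:
    """Fraction of sites with a match at rank <= k, for k = 1..max_rank.
    Sites without a match (None) only contribute to the denominator."""
    n = len(ranks)
    return [
        sum(1 for r in ranks if r is not None and r <= k) / n
        for k in range(1, max_rank + 1)
    ]


# Hypothetical (prediction, observation) pairs at three collocated grid cells.
cells = [
    ({"Boswell": 0.6, "SeriesB": 0.3, "SeriesC": 0.1}, "Boswell"),  # rank 1
    ({"Boswell": 0.2, "SeriesB": 0.5, "SeriesC": 0.3}, "SeriesC"),  # rank 2
    ({"Boswell": 0.7, "SeriesB": 0.3}, "SeriesD"),                  # no match
]
ranks = [rank_match(pred, obs) for pred, obs in cells]
ecdf = ecdf_of_ranks(ranks, max_rank=3)
```

The first element of `ecdf` corresponds to the rank 1 accuracy discussed in the text; the same functions could be applied at higher taxonomic levels by replacing series names with, for example, order names.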

Evaluation of the soil series predictions in SSURGO, POLARIS soil series, and pPOLARIS soil series at all taxonomic levels (soil series, family, subgroup, group, suborder, and order) using NASIS pedons. The comparison for POLARIS soil series and pPOLARIS soil series is performed for NASIS sites where SSURGO contains soil surveys (w/ SSURGO) and for NASIS sites where SSURGO does not contain soil surveys (w/o SSURGO). The evaluation consists of searching for each NASIS site's corresponding soil series name (or higher taxonomic level) in the predicted soil series at the collocated 30-m grid cell where rank 1 corresponds to the most probable soil series prediction. The distribution of these rank matches is summarized by displaying the empirical cumulative distribution function of the rank match values. SSURGO = Soil Survey Geographic Database; NASIS = National Soil Information System.

The evaluation of SSURGO illustrates that at the soil series level, 40% of the NASIS pedons have a match at rank 1 (i.e., the component with the highest percentage area of map unit). This increases to above 70% when examining at the taxonomic order level. The accuracy of the original POLARIS soil series shows a large decrease in performance when compared to SSURGO (∼15% of the NASIS sites have a match at rank 1). However, POLARIS soil series has the added benefit of providing predictions, albeit with low accuracies, over unmapped regions in SSURGO. For pPOLARIS soil series, the rank 1 accuracy at the soil series level increases to 26% with minimal changes in accuracy over unmapped regions. However, the added value of pPOLARIS soil series becomes more apparent toward higher taxonomic levels. At the great group level, at rank 1 there are relatively minimal differences between SSURGO and pPOLARIS soil series. The accuracy of the taxonomic order in areas that are unmapped in SSURGO is also substantially improved when compared to the original POLARIS. These results highlight the benefits of constraining soil class predictions with available legacy soil surveys while illustrating the remaining uncertainties in pPOLARIS soil series.

4.3 POLARIS Soil Properties

After assembling each soil series profile database and pruning POLARIS soil series, the PROPR algorithm was used to merge the two products and produce soil property maps for all the variables in Table 1 (see section 3.1.3). For each layer and variable of each grid cell in pPOLARIS soil series, the histograms of all the corresponding soil series were merged by summing the weighted densities of each bin; the weights are the probabilities of the soil series predictions from pPOLARIS soil series. This leads to 100-bin histograms for 6 vertical layers (0 to 5, 5 to 15, 15 to 30, 30 to 60, 60 to 100, and 100 to 200 cm) and 13 different variables for all 30-m grid cells over CONUS. Note that technically the final soil property maps should be known as pPOLARIS soil properties. However, for simplicity POLARIS soil properties is used instead.
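The bin-wise merge can be illustrated with a short sketch. It is not the PROPR implementation; the function name, the dictionary-based inputs, the four-bin toy histograms (in place of the 100 bins used in the product), and the final renormalization step are assumptions made for illustration.

```python
from typing import Dict, List


def merge_histograms(series_prob: Dict[str, float],
                     series_hist: Dict[str, List[float]]) -> List[float]:
    """Sum the per-bin densities of each soil series, weighted by that series'
    predicted probability, then renormalize so the bins sum to 1."""
    n_bins = len(next(iter(series_hist.values())))
    merged = [0.0] * n_bins
    for name, prob in series_prob.items():
        for i, density in enumerate(series_hist[name]):
            merged[i] += prob * density
    total = sum(merged)
    # Renormalizing is an assumption; it matters only when the probabilities
    # of the retained series do not sum to exactly 1.
    return [m / total for m in merged] if total > 0 else merged


# Toy example for one variable, one layer, one grid cell (all bins share edges).
series_hist = {
    "SeriesA": [0.5, 0.5, 0.0, 0.0],
    "SeriesB": [0.0, 0.0, 0.5, 0.5],
}
series_prob = {"SeriesA": 0.75, "SeriesB": 0.25}
posterior = merge_histograms(series_prob, series_hist)
```

In this toy case the merged histogram places three quarters of its mass in the bins occupied by the more probable series, which is the behavior the weighting is meant to produce.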

Figure 8 shows examples of the local posterior distributions over a 1 arc degree domain in southeastern Kansas. The 100-bin histograms of θs for all six layers are shown for eight distinct cells in the domain; the mean of each layer is superimposed on each histogram as a black line. The eight examples illustrate the impact that the local environmental information can have on constraining the priors (Figure 8). However, the degree to which they are constrained varies between the eight grid cells. On one hand, the example on the center left of the domain (Figure 8d) shows highly constrained distributions for all layers. On the other hand, the example in the center of the domain (Figure 8e) shows a large spread between 0.33 and 0.55; this case is the least constrained when compared to its corresponding priors.

The posterior distributions of θs for all six vertical layers for a selected number of 30-m grid cells (A–H) in a 1° domain in southeastern Kansas are shown via 100-bin histograms. The mean of each layer is superimposed on each histogram as a black line. The map in the center shows the POLARIS soil properties depth-weighted mean of θs.

This 30-m histogram database over CONUS was used to derive a series of summary statistics; these include the mean, median, mode, 5th percentile, and 95th percentile. Figure 9 shows the depth-weighted average of the mean for all 13 variables as described in Table 1 that are mapped over CONUS. Figure 9 also shows the estimated values of θ1,500 and θ33 which were calculated from the mapped Van Genuchten parameters. Furthermore, to illustrate the uncertainty in the predictions, Figure 10 shows the maps for the depth-weighted 5th percentile, mean, and 95th percentile for sand, silt, and clay. A visual analysis of these maps illustrates the large uncertainty that can emerge in the predictions. For example, over the western United States there is a significant spread of uncertainty in the plausible mapped 30-m sand estimates; while in other regions, such as the Mississippi valley, the range is much less. The Sandhills of Nebraska are one of the regions with greatest predictability as shown by the reduced differences between the 5th and 95th percentiles for sand, silt, and clay.
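How summary statistics might be derived from a binned distribution is sketched below. The sketch assumes equally spaced bins between a variable's minimum and maximum, linear interpolation of the cumulative distribution inside a bin, and depth weights proportional to the thickness of the six layers; the function names and the toy histogram are hypothetical, and the product's own procedure may differ in these details.

```python
from typing import List, Sequence

# Thicknesses of the six layers (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm).
LAYER_THICKNESS_CM = (5, 10, 15, 30, 40, 100)


def bin_centers(vmin: float, vmax: float, n_bins: int) -> List[float]:
    width = (vmax - vmin) / n_bins
    return [vmin + (i + 0.5) * width for i in range(n_bins)]


def hist_mean(weights: Sequence[float], vmin: float, vmax: float) -> float:
    centers = bin_centers(vmin, vmax, len(weights))
    return sum(w * c for w, c in zip(weights, centers)) / sum(weights)


def hist_mode(weights: Sequence[float], vmin: float, vmax: float) -> float:
    """Center of the bin with the largest weight."""
    centers = bin_centers(vmin, vmax, len(weights))
    return centers[max(range(len(weights)), key=lambda i: weights[i])]


def hist_percentile(weights: Sequence[float], vmin: float, vmax: float,
                    q: float) -> float:
    """Percentile q (0-100), interpolating linearly within the crossing bin."""
    target = q / 100.0 * sum(weights)
    width = (vmax - vmin) / len(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        if w > 0 and cumulative + w >= target:
            return vmin + (i + (target - cumulative) / w) * width
        cumulative += w
    return vmax


def depth_weighted(values: Sequence[float],
                   thickness: Sequence[float] = LAYER_THICKNESS_CM) -> float:
    """Thickness-weighted average of one value per layer."""
    return sum(v * t for v, t in zip(values, thickness)) / sum(thickness)


# Toy four-bin histogram on a 0-100 range (the product uses 100 bins).
toy = [1.0, 1.0, 1.0, 1.0]
stats = {
    "mean": hist_mean(toy, 0.0, 100.0),
    "median": hist_percentile(toy, 0.0, 100.0, 50),
    "p5": hist_percentile(toy, 0.0, 100.0, 5),
    "p95": hist_percentile(toy, 0.0, 100.0, 95),
}
```

Because the deepest layer is 100 cm thick, it carries half of the weight in `depth_weighted`, which is worth keeping in mind when reading the depth-weighted maps.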

The 30-m histogram database over contiguous United States is used to derive the mean for each layer and variable per 30-m grid cell. Each panel shows the depth-weighted average of the mean for 15 soil properties over contiguous United States. Note that the shown permanent wilting point (θ1,500) and field capacity (θ33) are derived from the mapped Van Genuchten parameters.
Maps of the mean, 5th percentile, and 95th percentile for sand, silt, and clay over contiguous United States.

To show the fine-scale detail available in the derived maps of soil properties, Figure 11 shows 30-m predictions for different layers and variables over several 1 arc degree regions. For comparison, the SCDt and SCDe observations are plotted over each prediction as pentagons and squares, respectively. When compared to the SCD observations, the performance of the predicted properties varies across the regions. For example, POLARIS soil properties captures the strong contrast in pH between the central and adjacent regions in the observations. However, there are cases such as the map of field capacity in northern California (Figure 11g), where the correlation between the observations and predictions is not very strong. Additionally, the comparison between the map of organic matter in southern Montana (Figure 11c) and the map of clay in southeastern Kansas (Figure 11b) shows the limited progress in POLARIS soil series toward predicting high-resolution maps of soil properties from legacy soil surveys of varying resolution (i.e., spatial disaggregation), which was one of the original main objectives in creating POLARIS soil series.

(a–i) Regional examples of the POLARIS soil properties for a selection of soil properties and layers. Each plot's inset map shows the geographic location of the 1 arc degree box. The depth-harmonized SCD property measurements are superimposed on top of their collocated 30-m grid cell. Measurements from SCDt and SCDe are shown as pentagons and squares, respectively. SCD = soil characterization database.

4.4 Evaluation of POLARIS Soil Properties Using SCD

To evaluate the localized (30 m) soil property predictions, POLARIS soil properties was compared to the SCDe measurements. The evaluation results are shown in Figure 12 via scatter plots with summary metrics provided in Table 2. Note that (1) the SCDe values of θr and θs are not direct measurements but were derived from NeuroTheta and bulk density, respectively, and (2) for consistency, values that are outside of the bounds defined in Table 1 are treated as outliers and removed. Furthermore, the predicted wilting point and field capacity values were derived from the mapped Brooks-Corey and Van Genuchten parameters. The scatter plots are binned into hexagonal regions where the color shows the density. The best performance occurs for organic matter, silt, sand, and pH, and the worst performance occurs for bulk density, θs, θwp, and θfc. The poor performance for the water content variables is most likely related to the poor performance in the mapping of bulk density, which is used to estimate θs and is a predictor in NeuroTheta.
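For reference, the standard Van Genuchten and Brooks-Corey retention forms, from which water contents at 33 and 1,500 kPa can be computed, are sketched below. The parameter values are hypothetical, the common constraint m = 1 - 1/n is assumed for the Van Genuchten form, and the pressure and α (or hb) are assumed to be in mutually consistent units; any parameters stored in log-transformed form would need to be back-transformed first.

```python
def theta_van_genuchten(h: float, theta_r: float, theta_s: float,
                        alpha: float, n: float) -> float:
    """Water content at suction h (h >= 0), with m = 1 - 1/n assumed.
    alpha must be in the reciprocal of the units of h."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m


def theta_brooks_corey(h: float, theta_r: float, theta_s: float,
                       hb: float, lam: float) -> float:
    """Water content at suction h; saturated below the bubbling pressure hb."""
    if h <= hb:
        return theta_s
    return theta_r + (theta_s - theta_r) * (hb / h) ** lam


# Hypothetical parameters for a single layer (suction in kPa, alpha in 1/kPa).
params_vg = dict(theta_r=0.05, theta_s=0.45, alpha=0.1, n=1.5)
theta_fc_vg = theta_van_genuchten(33.0, **params_vg)     # field capacity
theta_wp_vg = theta_van_genuchten(1500.0, **params_vg)   # wilting point

params_bc = dict(theta_r=0.05, theta_s=0.45, hb=2.0, lam=0.3)
theta_fc_bc = theta_brooks_corey(33.0, **params_bc)
theta_wp_bc = theta_brooks_corey(1500.0, **params_bc)
```

Both forms depend on θs, so errors in the mapped bulk density that feed into θs would carry over to the derived field capacity and wilting point.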

Evaluation of POLARIS soil properties using SCDe. Each property value at each layer in POLARIS soil properties (y axis) is compared to its corresponding observation contained within SCDe (x axis). For each soil property, the resulting scatter plot is binned into hexagons; the shown colors illustrate the number of points in each hexagon where blue is low and yellow is high. SCD = soil characterization database.
Table 2. Evaluation of the POLARIS Soil Properties With Measurements From SCD Using R2, RMSE, nRMSE, MAE, and nMAE

Property        R2     RMSE    nRMSE (%)   MAE     nMAE (%)   25th to 75th (%)   5th to 95th (%)   0th to 100th (%)
Sand            0.54   18      18          13      13         39                 75                89
Silt            0.58   14      14          10      10         39                 75                89
Clay            0.46   11      11          7.8     7.8        35                 70                87
Bulk density    0.30   0.17    8.4         0.13    6.3        30                 60                79
pH              0.60   0.79    8.6         0.59    6.4        35                 71                88
Organic matter  0.58   2.3     2.3         1.8     1.8        44                 79                94
θs              0.31   0.060   10          0.046   7.7        32                 66                89
θr              0.37   0.027   14          0.019   9.7        72                 94                97
θfc,vg          0.30   0.078   16          0.061   13         N/A                N/A               N/A
θwp,vg          0.29   0.059   12          0.042   8.5        N/A                N/A               N/A
θfc,bc          0.30   0.079   16          0.063   13         N/A                N/A               N/A
θwp,bc          0.29   0.058   12          0.042   8.5        N/A                N/A               N/A

Note. Each property's minimum and maximum as defined in Table 1 are used to normalize their respective RMSE and MAE to compute the nRMSE and nMAE. The table also includes the percentage of horizons (all layers and all sites) where the corresponding SCD measurement is found between the 25th to 75th, 5th to 95th, and 0th to 100th predicted percentiles. SCD = soil characterization database; RMSE = root-mean-square error; nRMSE = normalized root-mean-square error; MAE = mean absolute error; nMAE = normalized mean absolute error; N/A = not applicable.
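The normalization described in the table note can be made concrete with a small sketch. The function name and the toy values are hypothetical, and R2 is computed here as the squared Pearson correlation, which is an assumption because the definition used for Table 2 is not restated in this section.

```python
import math
from typing import Dict, Sequence


def evaluate(pred: Sequence[float], obs: Sequence[float],
             vmin: float, vmax: float) -> Dict[str, float]:
    """RMSE and MAE, plus their values normalized by (vmax - vmin) in percent."""
    n = len(obs)
    errors = [p - o for p, o in zip(pred, obs)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    value_range = vmax - vmin
    return {
        "R2": cov * cov / (var_p * var_o),  # squared Pearson r (assumed)
        "RMSE": rmse,
        "nRMSE": 100.0 * rmse / value_range,
        "MAE": mae,
        "nMAE": 100.0 * mae / value_range,
    }


# Toy sand percentages; 0 and 100 are assumed as the bounds for normalization.
metrics = evaluate(pred=[10.0, 20.0, 30.0, 40.0],
                   obs=[12.0, 18.0, 33.0, 37.0],
                   vmin=0.0, vmax=100.0)
```

For a variable bounded by 0 and 100 the normalized and unnormalized errors coincide numerically, which is consistent with the identical RMSE and nRMSE entries for sand, silt, and clay in Table 2.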

The probabilistic soil property maps in POLARIS soil properties provide another approach to evaluation, namely, whether the SCDe measurements can be found within the local posterior distributions. To this end, Table 2 also includes the percentage of horizons in which the SCDe measurements for a given variable fall within the bounds of the predicted 25th to 75th, 5th to 95th, and 0th to 100th percentiles. On average 41% of the horizons bound the measurements within the 25th and 75th percentiles for all variables. This average increases to approximately 74% when considering the range between the 5th and 95th percentiles and increases to 89% when considering the range between the 0th and 100th percentiles. These results are encouraging as they show that even though deterministic predictions can at times be far off from the measured values, the measurements are generally captured within the local posterior distributions. Having realistic constraints on parameter values for hydrologic and land surface models provides an opportunity to more adequately bind optimized parameters to physical reality and to better understand and characterize the structural deficiencies in these models.
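The interval-based evaluation amounts to counting how often a measurement falls between two predicted percentiles. A minimal sketch, with a hypothetical function name and made-up values for four horizons, is:

```python
from typing import Sequence


def coverage_percent(obs: Sequence[float], lower: Sequence[float],
                     upper: Sequence[float]) -> float:
    """Percentage of horizons whose measurement lies within [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return 100.0 * inside / len(obs)


# Hypothetical measurements and predicted percentiles for four horizons.
measured = [20.0, 35.0, 50.0, 65.0]
p25, p75 = [15.0, 36.0, 45.0, 50.0], [25.0, 44.0, 55.0, 60.0]
p5, p95 = [5.0, 30.0, 40.0, 45.0], [40.0, 50.0, 60.0, 70.0]

cover_25_75 = coverage_percent(measured, p25, p75)
cover_5_95 = coverage_percent(measured, p5, p95)
```

As a point of reference, perfectly calibrated predictive distributions would give nominal coverages of 50% and 90% for the 25th to 75th and 5th to 95th percentile intervals, respectively.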

4.5 Assessing Changes in the Parameter Distributions

The primary motivation behind the development of POLARIS soil properties was to combine existing soil data (soil survey and pedon information) with high-resolution environmental information to provide robust, highly constrained, and localized distributions of soil properties. To quantify how different the posterior distributions are from a given property's corresponding prior distribution, the depth-weighted range metric described in section 3.2 was computed for each variable per 30-m grid cell in POLARIS soil properties. This metric was computed for two different prior distributions: (1) the CONUS distributions assembled in this study and shown in Figure 6 and (2) uniform distributions assembled using the minimum and maximum values defined in Table 1. The comparison between these two cases indicates the extent to which the CONUS priors from Figure 6 constrain the variables relative to the commonly used uniform distributions.

Figure 13 shows the maps of the depth-weighted range metric computed by comparing this paper's assembled CONUS-wide prior distributions (Figure 6) and POLARIS soil properties. A value of 1.0 indicates no change relative to the prior, while a value of 0.0 indicates that the distribution has been reduced to a single value. The spatial average reduction in x95 − x5 in the posterior distributions is largest for saturated hydraulic conductivity (0.44), pH (0.45), clay (0.51), sand (0.53), silt (0.55), and λ (0.59) while the smallest changes occur in bulk density (0.61), n (0.61), θr (0.63), θs (0.63), α (0.69), organic matter (0.70), and hb (0.71). It is important to note that although a metric value of 1.0 is used as the maximum threshold in Figure 13, there are grid cells where the actual values exceed 1.0. The limited change in the water retention curve parameters when compared to other properties such as soil texture can most likely be attributed to the prediction uncertainties in the ANN pedotransfer function in NeuroTheta (see section 3.1.1 for details). Figure 13 also highlights regional differences in the uncertainty reductions contributed by POLARIS soil properties.
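The metric itself is defined in section 3.2 and is not restated here. The sketch below therefore rests on an assumption inferred from the description above: for each layer, the ratio of the posterior to the prior 5th to 95th percentile range (x95 − x5) is taken, and the ratios are averaged with weights proportional to layer thickness. Function names and numbers are hypothetical.

```python
from typing import Sequence, Tuple

# Thicknesses of the six layers (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm).
LAYER_THICKNESS_CM = (5, 10, 15, 30, 40, 100)


def uniform_p5_p95(vmin: float, vmax: float) -> Tuple[float, float]:
    """5th and 95th percentiles of a uniform distribution on [vmin, vmax]."""
    span = vmax - vmin
    return vmin + 0.05 * span, vmin + 0.95 * span


def depth_weighted_range_ratio(post_p5: Sequence[float],
                               post_p95: Sequence[float],
                               prior_p5: Sequence[float],
                               prior_p95: Sequence[float],
                               thickness: Sequence[float] = LAYER_THICKNESS_CM
                               ) -> float:
    """Thickness-weighted mean of (posterior range) / (prior range) per layer.
    1.0 means no change from the prior; 0.0 means a single-valued posterior."""
    ratios = [
        (hi - lo) / (prior_hi - prior_lo)
        for lo, hi, prior_lo, prior_hi
        in zip(post_p5, post_p95, prior_p5, prior_p95)
    ]
    return sum(r * t for r, t in zip(ratios, thickness)) / sum(thickness)


# Hypothetical clay example against a uniform prior on an assumed 0-100 range.
prior_lo, prior_hi = uniform_p5_p95(0.0, 100.0)
ratio = depth_weighted_range_ratio(
    post_p5=[10.0] * 6, post_p95=[55.0] * 6,
    prior_p5=[prior_lo] * 6, prior_p95=[prior_hi] * 6,
)
```

Under this reading, values above 1.0 arise wherever a local posterior range is wider than the prior range for the layers that dominate the depth weighting.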

The depth-weighted range metric between the contiguous United States-wide prior (see Figure 6) and posterior distributions for the mapped soil properties in POLARIS soil properties is computed at each 30-m grid cell and then upscaled to a 1-km spatial resolution over contiguous United States for visualization.

Finally, Figure 14 shows the maps of the depth-weighted range metric computed by comparing POLARIS soil properties with uniform priors. In this comparison, the ranking in spatially averaged reduction in x95 − x5 from largest to smallest is n (0.23), clay (0.24), bulk density (0.25), λ (0.26), hb (0.26), pH (0.27), θs (0.27), saturated hydraulic conductivity (0.27), organic matter (0.28), α (0.29), silt (0.37), θr (0.45), and sand (0.50). The differences with the results from Figure 13 are noticeable, suggesting that POLARIS soil properties substantially constrains the default uniform distributions that are commonly used in model parameter optimization routines.

The depth-weighted range metric between the uniform prior and posterior distributions for the mapped soil properties in POLARIS soil properties is computed at each 30-m grid cell and then upscaled to a 1-km spatial resolution over contiguous United States for visualization.

5 Discussion

5.1 Improved Prior Distributions of Soil Properties for Hydrologic and Land Surface Models

The representation of horizontal and vertical heterogeneity in hydrologic and land surface models remains a persistent challenge at regional to global scales. This is primarily due to (1) a lack of spatially complete data sets that describe the soil properties and (2) a lack of robust and realistic measures of uncertainty within these data. Addressing these issues is critical as uncertainties in soil properties continue to limit our understanding of their role in the water, energy, and biogeochemical cycles (Baroni et al., 2017; Chaney et al., 2014; Folberth et al., 2016; Lichstein et al., 2014; Rezanezhad et al., 2016). POLARIS soil properties addresses this challenge by leveraging DSM approaches to move beyond simple deterministic estimates to provide locally relevant distributions for each property and layer at a 30-m spatial resolution.

Because histograms are used to propagate uncertainties from the SCDt pedons and SSURGO components through the probabilistic soil series maps of POLARIS soil series into POLARIS soil properties, most observed soil properties in SCDe are found within the histograms reported for each variable and layer per grid cell. When an observation is not found within its corresponding local posterior, the cause is the soil series prediction accuracy of pPOLARIS soil series, which in turn reflects weaknesses in the methods used to create pPOLARIS soil series and in the training data set (SSURGO). Furthermore, these histograms provide a robust nonparametric approach to account for the underlying shape of each distribution. In short, POLARIS soil properties provides the modeling community with localized prior distributions that will help move beyond the ubiquitous use of ill-informed global uniform distributions for parameter optimization.

The soil property maps developed in this study can serve as input to existing parameter estimation frameworks for hydrologic and land surface models. Connecting the data in POLARIS soil properties to parameters associated with a specific model and scale would involve some combination of transfer functions, upscaling, regularization, and calibration against observed fluxes. Note that in these applications, for variables such as the saturated hydraulic conductivity, one would want to use the weighted geometric or harmonic mean instead of the currently computed weighted arithmetic mean. The regularization step is typically needed to reduce the dimension of the calibration problem for spatially distributed models. Here POLARIS soil properties could provide an a priori parameter field for which a constant multiplier could be calibrated (Pokhrel & Gupta, 2010; Pokhrel et al., 2008). While these considerations are beyond the scope of this paper, recent work has shown significant improvements in parameter estimation (e.g., Mizukami et al., 2017; Samaniego et al., 2010) that could further benefit from improved high-resolution probabilistic soil information. This has two potentially significant benefits: (1) a path toward reducing the possibility of multiple plausible yet different parameter sets (i.e., parameter equifinality; Beven, 2006) and (2) constraining the model parameter space thus elucidating deficiencies in the representation of processes within hydrologic and land surface models.
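The difference between the three weighted means mentioned above for saturated hydraulic conductivity can be seen in a small sketch. The function names, the two Ksat values, and the equal weights are hypothetical.

```python
import math
from typing import Sequence


def weighted_arithmetic(values: Sequence[float],
                        weights: Sequence[float]) -> float:
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_geometric(values: Sequence[float],
                       weights: Sequence[float]) -> float:
    """exp of the weighted mean of log(values); requires positive values."""
    log_mean = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_mean / sum(weights))


def weighted_harmonic(values: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Reciprocal of the weighted mean of reciprocals; requires positive values."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


# Two hypothetical Ksat values two orders of magnitude apart, equally weighted.
ksat = [1.0, 100.0]
weights = [1.0, 1.0]
k_arith = weighted_arithmetic(ksat, weights)
k_geom = weighted_geometric(ksat, weights)
k_harm = weighted_harmonic(ksat, weights)
```

The arithmetic mean is dominated by the most conductive value, while the harmonic mean is dominated by the least conductive one; the geometric mean lies in between. If a variable is stored in log10 space, the arithmetic mean of the stored values corresponds to the log10 of the geometric mean of the original values.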

5.2 Challenge: Disregarding the Covariance of Soil Properties

The probabilistic soil property maps in POLARIS soil properties provide independent probability distributions for each layer and soil variable per 30-m grid cell over CONUS. Although this level of detail provides a significant step toward a more complete characterization of the uncertainty in soil information, the assumed independence between these distributions both spatially and between properties is a pending issue that must be addressed moving forward.
  1. Property covariance. The soil properties in Table 1 can be interpreted as metrics used to quantify soil structure and function. As such, these summary metrics are commonly interconnected (e.g., sand and saturated hydraulic conductivity). Similar to existing regional to global soil properties data sets, POLARIS soil properties disregards this correlation and summarizes a grid cell's soil properties through univariate, independent distributions. This simplification can lead to physical inconsistencies among soil properties. Although steps can be taken to minimize these physical inconsistencies (e.g., only sample strongly uncorrelated properties), the only robust solution is to account for the covariance among properties (i.e., joint distributions).
  2. Spatial covariance. Not accounting for the spatial covariance of a given soil property in the univariate distributions within POLARIS soil properties is most likely less of an issue than the interproperty covariance. Nonetheless, it could lead to unrealistic sampled spatial soil patterns. For example, sampling from the distributions of θs for all layers in a set of adjacent grid cells can lead to a three-dimensional structure of θs that disregards existing horizontal and vertical correlation. Another example illustrates the challenges with regard to the vertical covariance. If θs of all the measured pedons that correspond to a given soil series decreases linearly with depth but there are large differences in the mean between the pedons, then the uncertainty bounds per layer would be high. Without including information regarding the vertical covariance, when sampling the layers independently, one could obtain flat or even inverted (low to high) θs profiles that do not correspond with the observations. At least for vertical correlation, this does not need to be the case; the pedons in SCDt and components in SSURGO from which the mapped distributions are assembled have characteristic vertical profiles; accounting for this observed vertical covariance in the mapped distributions would already improve the physical consistency of the sampled profiles.

Moving forward, there should be a concerted effort within DSM to incorporate interproperty covariance and intraproperty spatial covariance into three-dimensional property maps. This is especially relevant as these products move beyond providing only deterministic (and uncertain) estimates and instead aim to provide a complete characterization of plausible soil configurations (e.g., Arrouays et al., 2014). It appears that the most viable approach would be to assemble joint distributions per grid cell that would account for the different layers and properties; these joint distributions could be summarized via multivariate Gaussian distributions or an n by m histogram where n is the number of layers and m is the number of properties. The independence assumption would then be restricted to intergrid cell variability.
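The effect of vertical covariance on sampled profiles can be illustrated with a multivariate Gaussian, one of the options mentioned above. The sketch below is purely illustrative: the layer means, the standard deviation, and the covariance model (correlation decaying as rho to the power of the layer separation) are hypothetical and are not derived from POLARIS soil properties.

```python
import math
import random
from typing import List


def cholesky(a: List[List[float]]) -> List[List[float]]:
    """Lower-triangular L with L L^T = a, for a positive definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_profile(means: List[float], low: List[List[float]],
                   rng: random.Random) -> List[float]:
    """Draw one profile: means + L z, with z a vector of standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in means]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(means)]


def fraction_decreasing(rho: float, n_samples: int, seed: int) -> float:
    """Share of sampled theta_s profiles that decrease strictly with depth."""
    means = [0.50, 0.48, 0.46, 0.44, 0.42, 0.40]  # hypothetical layer means
    sd = 0.05                                      # hypothetical per-layer spread
    n = len(means)
    cov = [[sd * sd * rho ** abs(i - j) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    rng = random.Random(seed)
    count = 0
    for _ in range(n_samples):
        profile = sample_profile(means, low, rng)
        if all(a > b for a, b in zip(profile, profile[1:])):
            count += 1
    return count / n_samples


independent = fraction_decreasing(rho=0.0, n_samples=2000, seed=0)
correlated = fraction_decreasing(rho=0.99, n_samples=2000, seed=0)
```

With rho set to 0 the layers are sampled independently and most profiles lose the decreasing trend of the means, whereas strong vertical correlation preserves it in nearly all samples while leaving the per-layer spread unchanged.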

5.3 Future Improvements: POLARIS Soil Series Predictions

Although the POLARIS soil properties database provides unprecedented detail and uncertainty information for hydrologic and land surface models, the evaluation results in sections 4.2 and 4.4 suggest that substantial improvements are plausible in the near future to further constrain the derived 30 m prior soil property distributions over CONUS. More specifically, the largest gains would most likely come from revisiting and improving the probabilistic maps of soil series in POLARIS soil series.

To this end, one option is to follow the suggestions outlined in Chaney et al. (2016) to improve the original method used to produce the first version of POLARIS soil series. However, although this approach will most likely provide more robust predictions, it is unclear whether it will improve on SSURGO's accuracy, which was one of the original goals of POLARIS soil series. Given that the accuracy of predicting the correct soil series when using the most probable component in SSURGO is ∼40% (see Figure 7), the plausible upper bound of accuracy using this approach would likely not exceed SSURGO's accuracy by much.

Another approach is to directly use the NASIS pedons (which underlie SSURGO and STATSGO2) and existing high-resolution environmental covariates to map the soil series (and higher taxonomic levels) over the contiguous United States. Although similar to Ramcharan et al. (2017), this approach would move beyond using a single model for the entire domain by incorporating a modified version of the moving window approach used in Chaney et al. (2016). The limitation of this approach would be the varying density of NASIS pedons over CONUS.

Finally, an alternative approach is to combine both methods by relying primarily on NASIS pedons in regions where the density of in situ samples is adequate and using the disaggregation approach in regions where the density of NASIS in situ sites is insufficient. Regardless of which approach is used, the most robust improvement in soil series predictions will most likely come from methods that take full advantage of the wealth of data available in the century's worth of soil survey in both in situ and vector format.

6 Conclusions

Advances in DSM over the past decade have led to strong innovations in predicting soil information at regional, continental, and global scales. This paper leveraged these advances to produce a state-of-the-art database of soil properties for hydrologic and land surface models over the contiguous United States at a 30-m spatial resolution. To accomplish this goal, POLARIS soil properties was assembled by combining a pruned version of POLARIS soil series with the pedon data in the NCSS SCD and the component property information in SSURGO. The resulting product includes 100-bin histograms per soil layer and variable per grid cell over CONUS. A series of statistics were also derived from these histograms including mean, median, mode, 5th percentile, and 95th percentile. Moving forward, improvements of POLARIS soil properties should focus on improving the accuracy of POLARIS soil series and accounting for both the spatial covariance and interproperty covariance.

POLARIS soil properties provides a unique opportunity in hydrology and land surface modeling to move beyond coarse legacy soil maps and to revisit the use of simplistic prior distributions of model parameters in optimization routines. By providing locally relevant prior distributions for all layers and variables per 30-m grid cell, more realistic constraints can be placed on the optimization routines of land surface and hydrologic models. This offers two potentially significant benefits: (1) a path toward minimizing the possibility of multiple plausible yet different parameter sets and (2) constraining the model parameter space, thus elucidating deficiencies in the representation of processes within hydrologic and land surface models. Finally, a more complete characterization of soil properties in space will facilitate improved modeling of the Earth system that will in turn impact its applications in numerical weather forecasting, precision agriculture, and drought and flood forecasting.

Acknowledgments

The 1 arcsec (∼30 m), 10 arcsec (∼300 m), and 100 arcsec (∼3,000 m) versions of POLARIS soil properties are available at www.polaris.earth. Portions of this work were funded by the U.S. Geological Survey Land Change Science program. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.