Volume 128, Issue 12 e2023JF007319
Research Article
Open Access

Aeolian Sediment Transport Responses to Vegetation Cover Change: Effects of Sampling Error on Model Uncertainty

Robert R. K. Wojcikiewicz

Corresponding Author

Robert R. K. Wojcikiewicz

Plant and Environmental Sciences Department, New Mexico State University, Las Cruces, NM, USA

USDA-ARS Jornada Experimental Range, Las Cruces, NM, USA

Correspondence to:

R. R. K. Wojcikiewicz,

[email protected]

Contribution: Conceptualization, Methodology, Formal analysis, ​Investigation, Writing - original draft, Writing - review & editing

Search for more papers by this author
Nicholas P. Webb

Nicholas P. Webb

USDA-ARS Jornada Experimental Range, Las Cruces, NM, USA

Contribution: Conceptualization, Methodology, ​Investigation, Resources, Data curation, Writing - original draft, Writing - review & editing

Search for more papers by this author
Brandon L. Edwards

Brandon L. Edwards

USDA-ARS Jornada Experimental Range, Las Cruces, NM, USA

Contribution: Data curation, Writing - review & editing

Search for more papers by this author
Justin W. Van Zee

Justin W. Van Zee

USDA-ARS Jornada Experimental Range, Las Cruces, NM, USA

Contribution: Data curation, Writing - review & editing

Search for more papers by this author
Ericha M. Courtright

Ericha M. Courtright

USDA-ARS Jornada Experimental Range, Las Cruces, NM, USA

Contribution: Data curation, Writing - review & editing

Search for more papers by this author
Brad F. Cooper

Brad F. Cooper

USDA-ARS Jornada Experimental Range, Las Cruces, NM, USA

Contribution: Data curation

Search for more papers by this author
Niall P. Hanan

Niall P. Hanan

Plant and Environmental Sciences Department, New Mexico State University, Las Cruces, NM, USA

Search for more papers by this author
First published: 08 December 2023

Abstract

Although it is widely known that observations of aeolian sediment transport are susceptible to large sampling errors, sample designs are frequently used that do not sufficiently reduce the measurement uncertainties inherent in the study of aeolian processes. Here, we examine the influence of sample size (n) and sampling location on uncertainty in models of aeolian sediment transport responses to vegetation cover change. We compare measurements from a stratified random array of 27 horizontal sediment mass flux samplers to vegetative cover data collected at a 1 ha site over a period of nearly 6 years. To assess the sensitivity of modeled relationships between aeolian transport and vegetative cover to sample design, we analyze statistical regressions for all possible combinations of sample size and sampler locations. We show that at least 17 randomly located samplers are needed to consistently capture the sediment mass flux response to vegetative cover change. We found that multiple statistically significant models can describe the sediment flux-vegetative cover relationship when using smaller sample sizes, demonstrating the risks of inferring sediment transport response from an underpowered sample design. Across vegetative functional groups, we found that woody cover generally influenced aeolian sediment transport rates more than herbaceous cover, while model uncertainty at large sample sizes (n > 17) showed the limitation of using vegetative cover as an indicator of aeolian sediment transport. Our results suggest an evaluation of sampling practices in aeolian sediment transport studies may be needed to avoid inferential errors that are likely pervasive in this field of study.

Key Points

  • Heterogeneity in soil, vegetation, and wind cause aeolian sediment flux to have large spatiotemporal variability

  • Models describing change in aeolian sediment fluxes with ground cover are highly sensitive to sampling error

  • Large sample sizes and a variety of environmental indicators are needed to produce consistent models that can inform wind erosion management

Plain Language Summary

Studies of aeolian sediment transport are prone to sampling errors, largely due to the many factors that control the transport process. However, researchers often look to simplify sample designs and monitoring methods as monitoring is time consuming and expensive. To that end, vegetative cover is an easily collected and interpreted environmental indicator that affects the rates of aeolian sediment transport. However, little consideration has been given to how many samples are needed to confidently model the relationship between changes in vegetative cover and sediment transport. We investigate how changes in sample size, modeling approaches, and vegetative cover may affect inferences made about aeolian sediment transport. We show that using traditional approaches in aeolian research (using small sample sizes and relatively simple environmental data) could risk false conclusions being drawn about aeolian transport processes, even with models that appear to be accurate. If sampling does not adequately describe the sediment transport “population” response, variability in field measurements can be so large that it undermines our understanding of how and why aeolian sediment transport rates change over time. These findings may require a re-evaluation of typical sediment transport sampling practices.

1 Introduction

Aeolian research is highly susceptible to sampling errors because observations of key processes and controls often do not adequately characterize spatiotemporal variability at a site (Poortinga et al., 2014; von Holdt et al., 2019). This is because aeolian sediment transport is threshold-driven and responds nonlinearly to controlling factors such as wind speed and turbulence intensity; the cover, distribution and structure of surface roughness (e.g., vegetation, rocks); and disturbances that affect vegetation and soil properties like crusting and availability of loose erodible material (Gillies, 2013; Shao, 2010). Saltation mass fluxes, for example, are typically not normally distributed and exhibit large spatial and temporal variability (Bergametti & Gillette, 2010). However, fieldwork and wind tunnel studies of aeolian processes are expensive, and are frequently limited by project budgets and timelines, so sample sizes are often small (Chappell et al., 2003b). These factors create a situation in which, unless sampling is carefully planned, there is great potential to under-sample the spatial and temporal variability of sediment transport and thus misrepresent and misinterpret results (Sterk & Stein, 1997).

Sample designs that only measure the temporal variance of aeolian transport (e.g., monitoring a single or very few locations over time) inadequately measure the spatial variance for a treatment or area (e.g., too few sampling locations). Further, studies in which sampling is located subjectively or for reasons of convenience (e.g., locating samplers in the middle of unvegetated gaps instead of randomly, access to power sources, infrastructure, etc.) will result in spatially biased measurements. Both risk producing data that do not represent the aeolian transport dynamics for a study site. The consequence is that inference about change detection and models describing process responses to controlling factors may be statistically invalid, even if tests suggest a statistically significant change or relationship. This is because rates of type I (falsely inferring change) and type II (falsely inferring no change) errors are determined not only by sample size, but more specifically whether spatial and temporal variances of the population are captured by the sample design (de Gruijter et al., 2006). These errors are not confined to analyses of variance. For example, fitting a model to data describing changes in sediment mass flux with vegetation cover at a given study area assumes that the data adequately represent how the mass flux population changes over the cover gradient for the study area in question. If a sample does not represent the population for the corresponding area, analyses of variance and model fitting may suggest there is a statistically significant change when there is not, or may indicate a certain rate of change in a model when in fact the population behavior is different (Serdar et al., 2021). Adequately encapsulating the spatiotemporal variability of aeolian sediment transport is therefore critical for interpreting whether sediment fluxes are different or changing among treatments, and how environmental controls such as varying ground cover influence rates of sediment transport change (Chappell et al., 2003a).

A common interest of aeolian researchers and land managers affected by sample design is understanding how different vegetation types and vegetation disturbances influence sediment transport rates (Okin, 2008; Okin et al., 2006; Webb et al., 2014). Data on the relationships between vegetation indicators (e.g., fractional cover, lateral cover, scaled canopy gap) and sediment mass fluxes, and on changes in sediment mass fluxes with changing vegetation communities can be used to develop predictive models and to set benchmarks to assess erosion risk and whether management objectives are being met (Karl et al., 2017; Webb et al., 2020). Land uses and climate change are driving land cover change in ways that can increase aeolian sediment transport and the need for robust models and management tools. For example, vegetation has substantially changed in the Chihuahuan Desert of North America over the past 150 years, with grasslands on sandy soils transitioning to shrubland. These changes have led to increases in bare ground connectivity and increased aeolian transport, which have reinforced the ecological transition through positive feedbacks (Okin et al., 2006). The spatial variability of vegetation has also increased, as what was previously more uniform grassland turned into “islands” of shrubs in nebkha dune fields (Floyd & Gill, 2011; McGlynn & Okin, 2006). These dynamics make it important to understand how vegetation change influences sediment transport rates with confidence so that managers can now act to mitigate wind erosion and avoid further potentially irreversible degradation (Bestelmeyer et al., 2018). Developing predictive models and identifying cover thresholds and benchmarks requires knowing that statistical inference about aeolian transport responses to vegetation cover change is robust. This requires ensuring that studies adequately sample both sediment transport rates and vegetation change.

Multiple studies have sought to quantitatively describe the relationships between vegetation indicators and aeolian sediment transport through field observations and wind tunnel experiments. These studies explored how horizontal sediment mass fluxes (Q, g m−1 s−1) change with changing ground cover, focusing on both the direct and aerodynamic sheltering effects of standing and prostrate vegetation (Mayaud & Webb, 2017; Wolfe & Nickling, 1993). Different relationships between cover and Q have been reported, although studies often show an exponential decline in Q as the fractional vegetation cover increases, with the rate of decrease depending on vegetation type and growth form (e.g., Findlater et al., 1990; Leys, 1991). This relationship has been observed in both wind tunnels and field studies with mixed vegetation communities (e.g., Lancaster & Baas, 1998; Li et al., 2007). Power, exponential, and log-linear regressions have all been used to fit observed relationships between Q and vegetation cover (Lancaster & Baas, 1998; Leys, 19911992; Munson et al., 2011), and Lyles and Allison (1981) showed examples where linear regressions could be used to fit values of Q that had not been log-transformed. Empirical evidence of a negative exponential relationship between Q and vegetation cover is generally supported by drag-partitioning studies that have described the aerodynamic effects of vegetation on aeolian sediment transport (e.g., Chappell & Webb, 2016; Marticorena & Bergametti, 1995; Okin, 2008; Raupach et al., 1993). Nonetheless, it is possible that the studies cited here were susceptible to sampling error, as the spatial variability of Q typically isn't measured in wind tunnels, and field study sample sizes (n) have often been small (e.g., n ≤ 5) with non-random sediment sampler placement. Webb et al. (2019) demonstrated that large sample sizes (i.e., n ≥ 27 sediment samplers per hectare) and random sampling are needed to confidently infer aeolian sediment transport changes from analysis of variance studies at the plot scale. Their results suggest that there may be large uncertainties in studies of aeolian sediment transport responses to vegetation change, and in the potential to infer the robustness of models that may appear statistically significant, when in fact they are undermined by type I or type II errors. However, research is yet to directly address how sample designs used to monitor aeolian sediment transport rates influence error rates of model fitting and the validity of models.

Here, we assess the effect of sampling error on the use of ground cover indicators to make inferences about aeolian sediment transport. Specifically, our objective is to assess how the form and statistical significance of the relationship between Q and ground cover is influenced by sample size (n) and location of sediment samplers. We compare repeated high-density samples of horizontal sediment mass flux with measurements of total, woody, and non-woody vegetation cover spanning a period of nearly 6 years to explore how changes in the percentage of ground cover influence Q. We show that when using an insufficient number of samplers (n < ∼17), multiple seemingly significant relationships can be inferred. Finally, multiple types of regression (non-linear, log-linear, and linear) are applied to the different comparisons to examine the effectiveness of total and fractional cover as a predictor for Q. The novel contribution of this work is to explicitly address how sample design impacts the form and error rates of models fit to measurements to describe aeolian sediment transport responses to environmental change. While Webb et al. (2019) found that large sample sizes are needed to confidently detect changes in aeolian sediment transport rates, the current study goes beyond Webb et al. (2019) by assessing the implications of sampling error for model fitting. This work represents a crucial next step in understanding how monitoring approaches affect inference about aeolian sediment transport rates, patterns across landscapes, and change through time, and is needed for researchers and land managers to assess the uncertainty of models and their limitations for informing management decisions.

2 Data and Methods

2.1 Study Area

Data were collected at the Jornada Experimental Range (JER) National Wind Erosion Research Network (NWERN) site, situated in the south-central region of New Mexico, USA, within the Jornada Basin. The Jornada Basin is bound by the San Andres mountains to the east and the Rio Grande Valley to the west (Figure 1). The vegetation at the JER site consists of a mix of warm season grasses, perennial forbs, and moderately sized honey mesquite shrubs (Prosopis glandulosa Torr.) (Webb, Herrick, et al., 2016). The climate in the area is arid, with the majority of rain falling between July and October due to the summer monsoon, with a high degree of interannual variability in rainfall (Wainwright, 2006). The NWERN comprises 18 sites located across the western US and Canada with the goal of improving the understanding of wind erosion and dust emission processes across land use and land cover types (Webb, Herrick, et al., 2016). Each site is 1 ha (100 × 100 m) and intensively instrumented to measure aeolian sediment transport rates (horizontal and vertical mass fluxes), meteorological conditions including wind speed, and soil and vegetation properties that influence wind erosion following standardized protocols (Webb et al., 2015) (Figure 2).

Details are in the caption following the image

Geographic location of the study area in southern New Mexico. The star represents the precise location of the study area in the Jornada Basin (latitude 32.63, longitude −106.74), shown here in relation to the boundaries of the Jornada Experimental Range and Chihuahuan Desert Rangeland Research Center.

Details are in the caption following the image

Sampling design and layout of Modified Wilson and Cooke samplers in the study area. Modified from the National Wind Erosion Research Network website (winderosionnetwork.org).

2.2 Horizontal Sediment Mass Flux Data

Spatiotemporal sediment mass flux data were collected using 27 Modified Wilson and Cooke (MWAC) sediment samplers (Figure 2). The MWAC sediment sampler masts have samplers with inlets at heights of 0.10, 0.25, 0.5, and 0.85 m above ground level. A regular 3 × 3 square grid was used to stratify the study area. Three MWAC sediment sampler masts were installed at random locations within each grid cell. The data set spans a time period of nearly 6 years (September 2015–June 2021; Table 1) and sediment from the 27 samplers was collected approximately every 28 days. Only the 20 collection dates that coincided with vegetation transect monitoring at the study area (described in the following section) were used (Table 1). Sediment was extracted from the samplers using a wet collection method and then weighed to determine sediment masses following Webb et al. (2015).

Table 1. Dates of Sediment Sampler Collections and Corresponding Vegetation Measurements
Date of sediment collection Date of vegetation measurement
24-09-2015 22-09-2015
28-03-2016 21-03-2016
19-07-2016 13-07-2016
24-10-2016 06-10-2016
29-03-2017 08-03-2017
08-08-2017 23-08-2017
14-11-2017 18-10-2017
08-03-2018 14-03-2018
11-06-2018 06-06-2018
03-10-2018 26-10-2018
05-03-2019 05-03-2019
18-09-2019 20-09-2019
31-10-2019 18-10-2019
13-12-2019 12-12-2019
24-03-2020 17-03-2020
08-06-2020 09-06-2020
15-09-2020 01-09-2010
14-12-2020 22-12-2020
15-03-2021 17-03-2021
15-06-2021 08-06-2021
Vertically integrated sediment mass fluxes were calculated for each 28-day sampling period and MWAC mast by first normalizing the sediment masses by the MWAC inlet areas (2.34 × 10−4 m2)), then using non-linear least squares regression. Two- or three-parameter functions were fit to the regressions depending on whether sediment was detected at three or four heights respectively. The models were then integrated from 0 to 1.0 m height and divided by sampling periods to establish Q as follows:
Q = 0 1 q ( z ) d z , $Q=\underset{0{}}{\overset{1}{\int }}q(z)dz,$ (1)
where q(z) is the sediment mass collected per unit inlet area (m−2) per sampling period (day) at heights z (m) and Q is expressed with units of g m−1 day−1. Four of the 27 samplers were not used in this study because the fits of q versus z were rejected (R2 < 0.98) in enough instances to have an insufficient number of dates with acceptable data. These rejections are largely due to physical blockages of organic matter in the inlets of the MWAC samplers, such as plant litter, spiders, and spider webs. To control for seasonal variability in wind speed (Gillette & Monger, 2006), sediment fluxes were normalized by the number of minutes during each sampling period for which wind speed recorded at the site was greater than 6 m s−1 (this value is referred to as w when reporting normalized values of Q) when sediment transport was observed to be initiated at the site. This threshold was frequently measured at an adjacent site on a sandy loam soil with similar surface conditions (Webb, Galloza, et al., 2016).

2.3 Vegetation Fractional Cover Data

Estimates of total, woody and non-woody vegetation cover (%) at the site were made from data collected roughly four times a year using the line-point intercept (LPI) method (Herrick et al., 2018). The LPI method was implemented along three 100 m transects with 60° spacing, intersecting at 50 m in the center of the site. LPI data were recorded every 0.25 m along the transects. Percent cover of vegetation functional groups (herbaceous, woody, total cover) was estimated following US Department of Agriculture (USDA) Plants Database (https://plants.usda.gov/) designations using the terradactyl R package (McCord et al., 2022).

2.4 Model Fitting and Analysis

Measurements of horizontal sediment flux and vegetation cover were selected within 2 weeks of each other (Table 1), and seasonal variability was captured by the vegetation measurements. The biggest structural vegetation change at the site occurs during the mesquite green-up in the spring, and vegetation measurements are taken on both sides of this change to avoid comparing sediment flux and vegetation cover during periods of change (Browning et al., 2017; Ziegler et al., 2023). In order to examine the importance of sample size for model uncertainty, we examined how the functional relationship between Q and vegetative cover indicators changed when certain samplers were excluded from the analysis, effectively limiting the sample size. To study the effects of spatial variability in aeolian sediment transport at the site, we also looked at different groupings of these artificially limited sample sizes (i.e., different sampler locations). This was achieved by creating all possible combinations for n = 1 sampler up to n = 22 samplers creating ∼8 million combinations. Each combination group represented a variety of different possible sampler locations for a given n, and thus different possible spatial means and variances of Q. There were more possible combinations for some groups than others. For example, when only looking at combinations of 1 from a set of 23, there are only 23 possibilities. The number of possible combinations grows to 1,352,078 for the combinations of both 11 and 12 samplers and then decreases until the combinations of 22 samplers, where there are again only 23 possible combinations (where n = 23, there is only one possible combination). Each combination's sediment mass fluxes were compared against the total, non-woody, and woody cover percentages using power (Equation 2), log-linear (Equation 3), and linear (Equation 4) regressions. When back transformed, Equation 3 takes the form of a negative exponential equation (i.e., y = a · ebx).
y = a x b $y=a{x}^{-b}$ (2)
log ( y ) = b x + a $\log (y)=bx+a$ (3)
y = b x + a $y=bx+a$ (4)

3 Results

Percent vegetation cover recorded at the site during the sampling period ranged from 14.4% to 36.5% for total foliar cover, 5.9% to 19.7% for woody cover, and 4.8% to 25.3% for non-woody cover (Figure 3). Non-normalized sediment fluxes on the dates used in this study ranged from 3.7 to 208.65 g m−1 day−1 (Figure 4).

Details are in the caption following the image

Changes in the measured values of (a) total cover, (b) woody cover, and (c) non-woody cover at the Jornada Experimental Range National Wind Erosion Research Network site. We used 20 measurements of vegetative cover, which correspond to the 20 sediment flux measurements used in this study (Table 1).

Details are in the caption following the image

Changes in the measured values of sediment flux (Q) over a roughly 6-year period of data collection at the Jornada Experimental Range National Wind Erosion Research Network (NWERN) site. Each data point, whether a black point or blue cross represents a mean value of the 27 Modified Wilson and Cooke samplers at the NWERN site. Black points represent measurements taken but not used in this study. Blue crosses represent the 20 measurements that were used in this study and correspond with a measurement of vegetative cover (Table 1). Standard errors about the mean are also reported for these measurements denoted by blue crosses. Note that here, values of Q are not normalized by periods of high wind speed, as they are throughout the rest of this study.

3.1 Power Regressions

For the power regressions of normalized Q against total cover, the median b (Equation 2) coefficient values were largest where n = 1, and asymptotically approached the value of 9.7, with median values approximately remaining at 9.7 where n > 4 (Figure 5). Variability around the median was generally consistent and rapidly decreased where n > 19. P-values for the coefficients were very low for all n, with the median p-values decreasing from 0.0001 as n increased. The log transformed intercept values showed almost the exact same trends in median value and variability as the coefficients.

Details are in the caption following the image

Boxplots showing summary values of power regressions (Equation 2), which are fit to values of Q taken from all possible combinations of samplers against values of fractional cover, organized into 3 columns and 3 rows. The x-axis in every graph ranges from 1 to 23 and represents the number of samplers taken from the set of 23 used to make sampler combinations. Columns (from l to r) are organized as showing the results of total vegetative cover, woody cover, and non-woody cover regressions. The y-axis in the top two rows for each figure represents coefficient values and p-values, respectively. The y-axis in the bottom row shows y-intercept values, representing theoretical flux amounts if there was no cover, and is in units of log(g m−1 s−1/w). The boxplots presented here show the median (black line), the 75th and 25th percentile of values (white boxes above and below the median), and 1.5 ± IQR (dotted lines).

The woody cover coefficients showed a similar trend, although the median coefficient values were higher, approaching a value of 17.6 as n increased. Variability around the mean was initially consistent with variability decreasing where n > 16. p-values were slightly higher for the woody cover regressions, with median values decreasing from 0.04 where n = 1 to ∼0.000001 where n = 9. From n = 10 to 23, median p-values increased slightly. The log transformed intercept values showed very similar trends in median value and variability as the coefficients.

For the non-woody regressions, coefficient values were smaller, with median coefficient values increasing with n and converging near ∼0.8. Variability around the medians for non-woody cover was smaller than for the total and woody regressions, but a similar trend was seen in that there was a consistent level of variability around the median that began to decrease when n > 17. p-values for the non-woody coefficients were significantly higher than for the other two regressions and had larger variability, with median p-values steadily decreasing from 0.75 where n = 1 to 0.2 where n = 23. The log-transformed intercept values showed almost the exact same trends in median value and variability as the coefficients.

3.2 Log-Linear Regressions

For the log-linear regressions of normalized Q against total cover, median b (Equation 3) coefficient values were highest where n = 1 and approached a value of −0.53 as n increased (Figure 6). Variability around the median increased until n = 17 and then decreased rapidly where n > 17. The intercept values had very similar variability trends to coefficient values, but median values increased, following a nearly perfect inverted relationship to the coefficient trends. p-values for the coefficients were generally low, with median p-values decreasing asymptotically from 0.12 as n increased.

Details are in the caption following the image

Same as Figure 5 but for log-linear regressions (Equation 3, or Equation 4 if back-transformed). Y-intercept values are not log-transformed.

The woody cover regression coefficient values showed a similar trend to total cover values, although values were slightly larger, with medians approaching −0.73 as n increased. The variability trends around median values were very similar to the total cover trends. The intercept values also showed a similar relationship to coefficient values as the total cover regressions, with intercept values increasing with n, effectively showing an inverse shape to the coefficient means. p-values decreased asymptotically from 0.6 where n = 1 with increasing n.

Non-woody coefficient means approached a value of −0.81 with increasing n with similar but slightly less pronounced trends in variability as the power and untransformed linear regression types. The median intercept values for non-woody regressions showed an inverted shape to median coefficient values with increasing n. The non-woody p-values showed a somewhat irregular trend with more variability than total and woody regression p-values, with the highest p-value median recorded where n = 6, decreasing to 0.1 where n = 23. No significant median p-values were observed at any value of n.

3.3 Linear Regressions

For the untransformed linear regressions of normalized Q against total cover, the median coefficient values of b (Equation 4) were largest where n = 1 and approached a value of −5.4 as n increased (Figure 7). Variability around the median was consistent until n > 17, where variability decreased. We found that the intercept values had very similar variability trends to the coefficient values, with median values showing a near exact inverted relationship to the coefficient trends. p-values for the coefficients were generally low, with median p-values decreasing asymptotically from 0.05 as n increased.

Details are in the caption following the image

Same as Figure 5 but for linear regressions (Equation 3). Y-intercept values are not log-transformed.

The woody coefficient values showed a similar trend to the total cover coefficients, although values were slightly larger, with medians approaching −7.2 as n increased. Variability around the median had similar trends to the total cover regressions, although variability was slightly larger and decreased rapidly when n > 17. Median intercept values showed a similar inverted trend of coefficients, with slightly more variability. p-values were slightly higher than the total cover regressions, asymptotically decreasing from 0.11 with increasing n.

Non-woody coefficient means approached a value of −8.4 with increasing n. Variability around the median remained consistent until n > 18, beyond which it gradually decreased. The non-woody intercept values showed similar trends in coefficient variability along with an inversion of the coefficient trend as n increased. Non-woody regression median p-values were the highest of the three regressions, asymptotically decreasing from 0.41 with increasing n. The variability of p-values was also higher than for the total and woody cover regressions.

3.4 Summary and Illustration of Sampling Effects

Across all sample sizes (n), a negative relationship between normalized Q and percent ground cover is evident. However, there were large variances in regression slopes where n < ∼17, with slope variance dramatically decreasing when n ≥ ∼17 (Table 2). Between different plant functional groups, woody cover regressions had smaller variances and more significant slopes than non-woody regressions, with total cover regressions having the least variance and most significant results of the three plant functional cover groups. Across all models, there were significant regression values at very small sample sizes across regression types, with both woody and total cover groups having some significant regressions where n = 1. However, results from all the sampler combinations show that when n < 17, a wide variety of regression model intercepts and coefficients are reported, with the majority of the models appearing to be highly statistically significant. Consistency among regression models (i.e., low levels of variance) only emerged when sample sizes were large (n ≥ 17).

Table 2. Summary Table of Key Changes in Variance and Significance of Regressions Across All Sampler Combinations and Cover Types
Regression type Total cover Woody cover Non-woody cover
n where coefficient variance begins to clearly decrease Power 20 17 18
Log-linear 18 18 18
Linear 18 18 19
Coefficient value where n = 23 (highest sampling resolution) Power 9.7 17.6 0.8
Log-linear −0.53 −0.73 −0.81
Linear −5.4 −7.2 −8.4
n needed to achieve significant median coefficient values (p < 0.05) Power 1 1 No median significant p-values observed
Log-linear 3 7 No median significant p-values observed
Linear 1 2 14

Figure 8 illustrates the effects of sampling error on regression model uncertainty when using a subset of the data and models. Each collection of points in Figure 8 represents a combination of five sediment sampler masts (i.e., n = 5). In total, seven sampler combinations are used from the more than 30,000 regressions where n = 5 in Figure 8. The figure reveals the risks of inadequately sampling aeolian sediment transport rates when trying to understand aeolian process responses to vegetation cover change. For measurements from a single combination of five sediment sampler masts, three different regression models (power, log-linear, and linear) may all have statistically significant (p < 0.05) fits (Figure 8a). If those five sediment sampler masts were set up in three different configurations, three different model forms of the same type (e.g., power) may all have statistically significant (p < 0.05) fits (Figure 8b). Similarly, three different regression models (log-linear, linear, and power) could be fit to describe the response of normalized Q to three different sampler configurations, all with statistically significant (p < 0.05) fits (Figure 8c). As demonstrated in Figures 5-7, this outcome of large uncertainty in both the type (e.g., power, log-linear, linear) and form (coefficient and intercept) of the regression models could occur using almost any combination of sample size and sediment sampler locations where n < 17, due to small sample sizes producing highly variable models, most of which may appear statistically significant.

Details are in the caption following the image

Various regression types fit for both the same and different combinations of five samplers. Each regression line in this figure has a significant slope coefficient. For each graph, the x-axis represents total vegetative cover (%) and the y-axis represents values of Q normalized by wind speed. Panel (a) shows the three different regressions used in this study fit to a single combination of five samplers. Panel (b) demonstrates three different power regressions used to fit three different combinations of five samplers, and (c) shows the three different regression types used to fit three other different combinations of 5 samplers. We note that the log-linear regressions shown here have had their y-axis values back-transformed and thus appear as a negative exponential curve with the formula: (y = a · ebx).

4 Discussion

Heterogeneity in vegetation cover, meteorological conditions, and soil surface conditions at different scales lead to large spatiotemporal variability in aeolian transport processes. Robust monitoring sample designs are therefore needed to reduce uncertainty and to accurately capture the trends and relationships of sediment transport. For the first time, our results have demonstrated how inference about the type (e.g., power, log-linear, linear) and form (coefficient and intercept) of regression models describing the relationship between aeolian sediment transport rates and vegetation cover indicators are highly sensitive to sampling error due to inadequate sample size and sample location. Many models may appear to be statistically significant, but resampling revealed that consistent, statistically significant models only emerged with larger sample sizes (n ≥ ∼17) at our study site. Intensive sampling of a site is therefore required to adequately capture the effects of vegetation change on aeolian horizontal sediment mass flux with robust regression model fitting. At smaller sample sizes (n < 17), statistically significant regression fits may be observed, but large variances in the coefficients and intercepts for those regressions reflect multiple possible fits (i.e., large uncertainty) depending on sample locations. As such, robust inference about the true nature of the response of aeolian sediment transport to changing vegetation cover would not be possible due to the large variance brought about by sampling not adequately representing the population—thus, any observed relationship between Q and vegetation cover would most likely be affected by sampling error. Because different sampler locations can lead to different model fits appearing to be significant, the apparent significance of a fitted model may simply depend on the spatial distribution (i.e., location) of samplers and whether the sample design, if inadequately sized, happens to capture the spatial variability of aeolian sediment transport across the area of interest. Non-random sampler placement (e.g., in the middle of unvegetated gaps) thus has great potential to bias estimates of the mean sediment transport rate at a site and impact models and inference about responses to vegetation cover change.

This study examines aeolian sediment transport responses to changing vegetation at a single site through time. The work builds on previous research by Sterk and Stein (1997), Chappell et al. (2003a2003b), and Webb et al. (2019) showing the importance of sample size for mapping and change detection. Webb et al. (2019) found that, across land use and land cover types, sample designs that use small sample sizes and non-random sampling are ineffective for detecting statistically significant changes in Q with a high level of confidence. This study provides further evidence that the spatial heterogeneity of controls on aeolian sediment transport affects how many samplers are needed to accurately capture the relationship between vegetation cover and Q, and there is a universal need for intensive sampling of aeolian sediment transport processes. Our results provide important new insights into the effects of sampling error on models, showing that conventional sampling methods in aeolian field studies are likely underpowered (use insufficient n), which leads to an increased risk of inadequately describing the spatial variability of sediment transport, greater uncertainty about aeolian process responses to environmental change, and inferences that may not be statistically sound despite model fits that are statistically significant.

Our findings have implications for how sampling should be designed to monitor aeolian sediment transport processes. Studies in a variety of environments have demonstrated the underlying physical mechanisms of large spatiotemporal variability in aeolian sediment transport, which include variability in soil texture, dynamic soil properties (e.g., physical and biological soil crusts, loose erodible material), and wind flow over the ground surface as it is influenced by turbulence due to differential surface heating and interactions with non-erodible roughness elements from sand grains to rocks and vegetation (Gillette, 1999; Webb & Strong, 2011). These properties all drive variability in soil erodibility, wind erosivity, and sediment transport at scales from centimeters to kilometers. For example, Baas and Sherman (2005) observed that small irregularities in surface topography and turbulent eddies, generated from higher in the boundary layer, result in turbulence at the soil surface that produces irregular sediment transport patterns over unvegetated surfaces. The presence of vegetation has been shown to further add to the variability of wind flow over the soil surface due to drag partitioning (e.g., Dupont et al., 2014; Okin, 2005; Walter et al., 2012). These studies suggest that the spatiotemporal variability of aeolian sediment mass fluxes can be large no matter what the soil type or ground cover. Implicit in this variability is the need for large sample sizes to describe aeolian transport rates and their change over space and time, as suggested by Sterk and Stein (1997) and Chappell et al. (2003a). Webb et al. (2019) found that structurally simple sites, for example, fallow agricultural fields, had large spatial and temporal variances of Q and required the same sampling approaches to structurally complex sites like the shrub-invaded grassland site used in the present study. Critically, here we have demonstrated that the minimum sample size required to establish a robust regression model relating Q to vegetation cover change (n > 17) is consistent with that found by Webb et al. (2019) required to detect a change in Q across space and through time. The field studies cited above and extensive research on sampling theory that preceded our work (e.g., de Gruijter et al., 2006) suggest that, when the objective is to describe aeolian transport patterns and processes over an area, the need for large sample sizes and random sample designs will exist irrespective of scale. Pragmatically, sampling should therefore be designed with the support of a pilot study and/or power analysis based on measured variances of Q, accounting for the expected amount of change in Q within and between sites, the expected spatial and temporal variances, and desired confidence that the detected change occurred. Assuming that a statistically significant model is robust is risky unless it is clear that the underpinning sampling has sufficient power to detect change. Here we have shown that sampling errors arising from under-powered sample designs will increase type I and type II error rates in regression model fitting—increasing the uncertainty of aeolian models and reducing the effectiveness of models to support research and land management decisions.

Our results also show quantitatively for the first time that the significance of the relationship between aeolian sediment transport rates and vegetation cover differs by plant functional group. The total foliar cover of all plants provided the most statistically meaningful regressions. This result was not surprising, as the total foliar cover indicator represents the totality of vegetative roughness at a site. Woody and non-woody cover indicators had similar relationships with normalized Q as total foliar cover, however, suggesting that both functional groups are an important control on sediment mass flux at the study site. A detailed statistical analysis to determine the relative contributions of each functional group to the overall functional relationship is beyond the scope of this study, but the analysis presented here allows for some broad conclusions. We found that woody cover regression models tended to have lower p-values than non-woody regression models with the same sample size, suggesting that the woody cover metric provided more information about a potential aeolian sediment transport response. Additionally, the power non-woody regressions show increasing values of b, which is opposite to the total and woody cover trends, and the p-values of non-woody log-linear regressions increase with low values of n (3 < n < 6). This suggests that non-woody cover provides a weaker control on the flux process and may lead to noisier relationships. This interpretation is supported by drag partition theory on the effect of roughness elements on Q (e.g., Edwards et al., 2022; Okin, 2008; Raupach et al., 1993). Because woody plants at the study site (honey mesquite) are significantly larger in length, width and height than grasses and forbs, they should extract more momentum from the wind due to drag, and aerodynamically shelter more of the soil surface than herbaceous cover. As the total number of sediment samplers at the site was large (n = 27), we are confident that the influence of woody and herbaceous vegetation on spatial variability of sediment transport was accounted for in measurements of Q, and the woody cover metric was more effective in explaining the functional relationship. As shown in Figure 3, herbaceous vegetation cover has been steadily decreasing, and woody cover increasing, at the site since 2015, so it would be expected that the woody cover indicator would have a stronger, or increasingly strong, effect on aeolian sediment transport as the site transitions toward a shrub-dominated state (Bergametti & Gillette, 2010; Bestelmeyer et al., 2018; Gillette et al., 2006).

The three regression types used in this study were chosen as each had been used at least once in previous studies of sediment transport processes. Their purpose in this study is to show the multiple ways that one could model a statistically significant relationship between Q and vegetative cover, rather than investigating which model was most accurate. However, there are some aspects of the modeling equations that are worth discussing here due to some anomalous statistical behavior. Such a trend is present in the p-values of the woody cover power regressions, which start increasing at n > 9. Since woody cover appears to play a more crucial role than non-woody cover in controlling sediment flux at this site, we believe that the increase in p-values is more related to the power regression used rather than the result of the functional group's influence on sediment transport. While both power and log-linear regressions model a negative exponential shape, the power regression has less flexibility in assimilating increasing data due to its symmetric nature, infinite value at x = 0, and potential stretching of its “tail,” which may make it more challenging to reject the null hypothesis (b = 0).

Finally, our results provide new insights into the potential limitations of percent vegetative ground cover as an indicator of wind erosion, which has implications for land management. Even with a sample design that minimizes sampling error, it is apparent that multiple interpretations of the same data set may be possible instead of data revealing a clear functional relationship. While some uncertainty is expected in functional relationships derived from field data, our results support the interpretation that, in the absence of information about surface roughness structure (i.e., height and spatial distribution of plants, rocks, etc.), percent ground cover indicators do not describe the first-order aerodynamic effects of vegetation on aeolian sediment transport (Okin, 2008; Raupach et al., 1993; Webb et al., 2014). With sufficient sampling resolution, percent vegetative ground cover can be used as an indicator to broadly predict how Q will respond to changes in vegetation including functional group cover over time, but other environmental information is likely necessary to make vegetation management decisions with confidence about potential aeolian sediment transport response rates. Relying on percent vegetation ground cover indicators alone, especially with small sample sizes, could be misleading for identifying functional (aeolian and ecological) thresholds for sediment transport often used to set monitoring benchmarks and to assess ecological risks of wind erosion and management success (Chappell et al., 2018; Webb et al., 2021). Additional site-specific environmental information could include vegetation height and the unvegetated canopy gap size distribution—an indicator of the spatial distribution or arrangement of vegetation at a site. Data to calculate these structural indicators can be collected using standard methods of Herrick et al. (2018), as implemented in this study and nationally in the US and Mongolia, by federal agencies to support rangeland health assessments and management (e.g., Toevs et al., 2011; Webb, Herrick, et al., 2016).

5 Conclusions

Model development and land management needs have driven the desire to understand aeolian sediment transport responses to vegetation cover change. However, monitoring aeolian transport rates is highly susceptible to sampling error due to the many factors that influence wind erosivity and soil having large spatiotemporal variability. This variability necessitates a detailed understanding of how sampling can sufficiently represent environmental heterogeneity to describe patterns and changes in aeolian transport rates. Key to reducing sampling error in field and wind tunnel studies are sample designs that adequately represent the spatiotemporal variability of aeolian sediment transport processes across a study area. Here, we investigated the effects of sample design and sampling error on regression models to describe aeolian transport responses to vegetation cover change. We fit three different types of regression models to all possible combinations of measured horizontal sediment mass flux (Q) and vegetative ground cover data from the intensely sampled JER NWERN site to elucidate the nature of the functional relationship between Q and vegetation cover and to establish the sensitivity of the observed relationship to sample size and location. Our results show that model uncertainty and interpretation of aeolian sediment transport-vegetation cover relationships are strongly affected by sample design.

Here, for the first time, our results explicitly demonstrate the sensitivity of model form and error rates to sample designs to measure the impact of environmental change on Q. We found that at least 17 randomly located sediment samplers were needed to reduce sampling uncertainty and produce a consistent regression model fit to describe the sediment transport-vegetation cover relationship with power or negative exponential (log-linear) models. Using less than 17 samplers, or non-random sampling, can produce apparently significant model fits but with large variability in parameter values. Thus, underpowered sampling runs the risk of encouraging false inferences about the relationship between vegetation cover and sediment mass fluxes. Practically, this research has shown that conducting power analyses, which can be informed by pilot studies, will be important for ensuring that sample designs used in aeolian research enable robust model fitting and inference about the behavior of aeolian transport processes. As monitoring with small sample sizes and biased sample locations remains common in aeolian research, we hope our novel and explicit demonstration of the impacts of sampling error on models will encourage a more considered approach to sampling and its implications for understanding and informing the management of aeolian systems.

At the shrub-invaded grassland site used in this study, we found that percent woody cover was likely a better predictor of sediment transport than non-woody cover—reflecting the ongoing loss of herbaceous vegetation at the site as well as the stronger attenuating effect of larger woody mesquite shrubs on sediment transport. Our novel approach to analyzing measured responses in Q to changes in vegetation functional group cover over time highlights the limitations of using percent vegetation cover alone as a predictor of aeolian sediment transport. Even when sampling error is minimized, we found that different types and forms of models described the normalized Q-vegetation cover relationship at high levels of statistical significance. Using indicators of vegetation structure in addition to cover to interpret sediment transport data could more effectively inform management strategies to reduce wind erosion.

Acknowledgments

This research was a contribution of the Long-Term Agroecosystem Research (LTAR) network supported by the U.S. Department of Agriculture (USDA). Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the US Government. We thank Sarah McCord and Joseph Brehm for their assistance with data preparation and indicator calculations. This research was also supported by funding from the National Science Foundation to New Mexico State University for the Jornada Basin Long Term Ecological Research Program (DEB 2025166), and by the College of Agricultural, Consumer and Environmental Sciences at New Mexico State University.

    Data Availability Statement

    Data used in this research are available on the Zenodo repository https://doi.org/10.5281/zenodo.7574914, Webb (2023) and are cited and referenced in the manuscript.