Groundwater Pumping Impacts on Real Stream Networks: Testing the Performance of Simple Management Tools
Abstract
Quantifying reductions in streamflow due to groundwater pumping (“streamflow depletion”) is essential for conjunctive management of groundwater and surface water resources. Analytical models are widely used to estimate streamflow depletion but include potentially problematic assumptions such as simplified stream-aquifer geometry and rely on largely untested depletion apportionment equations to distribute depletion from a well among different stream reaches. Here, we use archetypal numerical models to evaluate the sensitivity of five depletion apportionment equations to stream networks with varying drainage densities, topographic relief, and groundwater recharge rates; and statistically evaluate the sources of error for each equation. We introduce a new depletion apportionment equation called web squared which considers stream network geometry, and find that it performs the best under most conditions tested. For all depletion apportionment equations, performance decreases with increases in drainage density, relief, or recharge rates, and all equations struggle to estimate depletion in short stream reaches. Poorly performing apportionment equations tend to underestimate streamflow depletion relative to numerical model results, leading to a negative bias and underpredicted variability, while error in the best performing apportionment equations tends to be due to imperfect correlation. From a management perspective, apportionment equations with error due to bias and variability are preferable as they correctly identify which reaches will be affected and can be statistically corrected. Overall, these results indicate that the web squared method introduced here, which explicitly considers stream geometry, performs the best over a range of real-world conditions, and will be most accurate in flatter and drier environments.
Key Points
- New streamflow depletion apportionment equation with stream geometry performs best across a variety of stream network geometries
- Performance of all depletion apportionment equations decreases with increased drainage density, relief, and groundwater recharge rates
- Spatial application of Kling-Gupta Efficiency is useful for identifying different sources of error and accompanying management implications
Plain Language Summary
Pumping groundwater for human uses such as irrigation can reduce flow in streams by intercepting water which otherwise would have eventually flowed into the river channel or causing water to flow out of the stream into the subsurface. This “streamflow depletion” reduces the water available to downstream users and ecosystems. Due to a lack of data and resources, relatively simple (“analytical”) groundwater models are often used to estimate pumping impacts, but they are based on unrealistic assumptions, such as straight streams. In this study, we introduce a new “depletion apportionment” equation used to estimate pumping impacts that considers the spatial configuration of real stream networks. By comparing it to more complex (“numerical”) groundwater models, we find that our new equation works better than existing equations under a variety of conditions. All of the depletion apportionment equations we test perform best in flatter, drier settings where streams are spaced further apart. Finally, we compare the causes of error among equations, which have different implications for water management decisions. Overall, our results show that stream geometry is an important factor to consider when making groundwater pumping decisions, and the new depletion apportionment equation introduced here is a useful tool for water managers.
1 Introduction
Groundwater is a critical contributor to streamflow and supports both aquatic ecosystems and human needs (Acreman et al., 2014; Booth et al., 2016; Gleeson & Richter, 2017; Zektser et al., 2005). For instance, groundwater discharge into streams provides a stable supply of water during dry periods and is a key regulator of water temperature, an important water quality parameter for aquatic ecosystems (Johnson et al., 2017; Kurylyk et al., 2014, 2015; Strauch et al., 2017; Zorn et al., 2012). It has long been recognized that groundwater pumping can reduce streamflow via the “capture” of groundwater that would have otherwise discharged into a stream (Barlow et al., 2018; Bredehoeft, 1982, 2002; Theis, 1941). In extreme cases, pumping may even reverse the hydraulic gradient at the stream and induce infiltration from the streambed into the aquifer (Barlow & Leake, 2012). Reductions in groundwater discharge to and/or induced infiltration from streams are broadly known as “streamflow depletion,” and can have devastating effects on ecosystems and downstream water users (Barlow & Leake, 2012; Zorn et al., 2012).
Streamflow depletion cannot be measured directly, but it can be estimated using either numerical or analytical models. Numerical models (e.g., MODFLOW) are widely used for the evaluation of pumping impacts on groundwater levels and discharge to streams (Ahlfeld et al., 2016; Bredehoeft & Kendy, 2008; Lackey et al., 2015). However, numerical models are time- and labor-intensive to construct, validate, and apply (Rathfelder, 2016). Therefore, they are typically generated for a specific aquifer and used at local to regional scales (Leake et al., 2010; Nyholm et al., 2002).
On the other hand, analytical models of streamflow depletion have the advantage of being computationally simple, and are therefore often used for water management and permitting decisions (Jayawan et al., 2016; Miller et al., 2007; Reeves et al., 2009). However, analytical solutions adopt a suite of potentially problematic assumptions often including that of an infinite horizontal aquifer bounded by a single linear stream (Glover & Balmer, 1954; Hantush, 1965; Hunt, 1999; Jenkins, 1968; Theis, 1941). Several studies have evaluated the performance of different analytical models via comparison with numerical models, and found that resistance to flow through the streambed (Jayawan et al., 2016; Sophocleous et al., 1995); subsurface heterogeneity and anisotropy (Li et al., 2016); aquifer storativity (Jayawan et al., 2016); and the degree of aquifer penetration by the stream channel (Butler et al., 2001; Sophocleous et al., 1995) are particularly important considerations.
To use analytical models in real-world settings, geometric methods known as “depletion apportionment” equations are used to distribute streamflow depletion calculated analytically for a single reach to stream networks with multiple reaches. However, relatively little research has compared the performance of different depletion apportionment equations. Reeves et al. (2009), the only study the authors are aware of, evaluated nine depletion apportionment equations via comparison with output from a MODFLOW numerical model during the development of the Michigan Water Withdrawal Assessment Tool (http://www.deq.state.mi.us/wwat). They elected to use an inverse-distance weighting approach (described in more detail in section 2.2) in their tool because it performed reasonably well compared to numerical model output, was relatively simple to calculate, and has a theoretical basis in analytical solutions to streamflow depletion with multiple streams (Wilson, 1993). However, this comparison was based on a single watershed within a larger regional-scale groundwater flow model, and therefore the transferability of their conclusions to stream networks with different hydrological characteristics (e.g., drainage density, topographic relief, and groundwater recharge) is unknown.
To enhance the utility of analytical models as a management tool, we ask, which depletion apportionment equations compare most favorably to numerical model simulations across a range of realistic stream networks? Using the groundwater flow system around Nanaimo, British Columbia (Canada) as an exemplar, we test a suite of analytical depletion apportionment methods across stream networks varying in drainage density, topographic relief, and groundwater recharge rates. We make three novel contributions to the literature: (1) the introduction of two new depletion apportionment equations, which we call web and web squared (section 2.2); (2) a novel spatial application of model evaluation criteria typically used for time series data (section 2.4) and the development of new visualization methods to assess sources of error (sections 3.1 and 4.2); and (3) evaluation and sensitivity analysis of five depletion apportionment equations across diverse stream network geometries (sections 3.1–3.3) to guide their use in water resource management.
2 Methods
2.1 Modeling Approach
Modeling approaches to quantify streamflow depletion within a stream network can be broadly divided into three groups (Table 1): (1) analytical models paired with depletion apportionment equations; (2) archetypal numerical models which simplify real-world conditions to evaluate processes in a generalizable manner; and (3) site-specific numerical models. The choice of approach depends on the aims of a particular study, and the modeler must weigh trade-offs between complexity, available resources, and intended model application. For water resource management, analytical solutions are often used for preliminary analysis and in data-scarce settings due to the relative simplicity of developing and implementing them. As resources and interest become available, analytical models are often superseded by site-specific numerical models, which allow for detailed exploration of the effects of different management strategies on local surface water-groundwater interactions.
| | Analytical models with apportionment equations | Archetypal numerical models | Site-specific numerical models |
---|---|---|---|
Boundary conditions | Analytical models consider one or two streams with simplified geometry and constant head; depletion apportionment equations distribute depletion to different stream reaches. | Complex stream geometry simulated as constant river boundary condition with specified head. | Complex stream geometry represented by a mix of boundary conditions such as river, constant head, drain, etc. |
Parameter values, input data, and geometry | Assume flat, infinite homogeneous, isotropic aquifers with no vertical flow. Input data sets exist for most aquifers. | Simplified subsurface; topographic relief can be included. Moderate input data requirements which exist for most aquifers. | Heterogeneous and anisotropic, multiple layers with complex geometry. Many regions do not have enough data. |
Required effort, skill, and calibration | Moderate effort (minutes–days) and skill (generalists). Not calibrated. | Significant effort (weeks) and skill (specialists). Not calibrated. | Significant effort (months) and skill (experts). Calibrated to hydrogeologic and hydrologic measurements. |
Examples from literature | Foglia et al. (2013); Jayawan et al. (2016); Reeves et al. (2009). Only Reeves tested depletion apportionment equations. | Kendy and Bredehoeft (2006); Konikow & Leake (2014); Lackey et al. (2015). | Ahlfeld et al. (2016); Feinstein et al. (2016); Fienen et al. (2018); Reeves et al. (2009). |
In this study, our goal was to evaluate the sensitivity of the performance of depletion apportionment equations to different stream network geometries by systematically varying drainage density, topographic relief, and groundwater recharge rates. Thus, we elected to use archetypal numerical models for comparison to eliminate local, site-specific complexity, and instead focus on process-based understanding (Gleeson et al., 2016; Voss, 2011a, 2011b; Zipper et al., 2017b). Archetypal models use a realistic set of hydraulic parameters to provide broadly relevant output, and are therefore not calibrated as they are not intended to recreate real-world conditions. This approach allows us to isolate the impacts of stream network geometry on streamflow depletion and answer the question posed in section 1. Furthermore, we are not testing the performance of one or multiple analytical models, as has been accomplished in previous work (Butler et al., 2001; Jayawan et al., 2016; Li et al., 2016; Sophocleous et al., 1995; Spalding & Khaleel, 1991). Rather, we are comparing the distribution of depletion within a stream network among various depletion apportionment equations (section 2.2) with our archetypal numerical model (section 2.3).
Our archetypal domain was based on the groundwater system around the City of Nanaimo on Vancouver Island, British Columbia, Canada (Figure 1). We selected this domain due to a strong east-west gradient in drainage density, calculated as the length of stream per 1,500 m spatial resolution grid cell. We took advantage of this natural gradient by selecting three subdomains corresponding to low, medium, and high drainage density for testing the apportionment equations (Figure 1). Each of these domains has 62 stream reaches, but they vary in area from 7.6 km2 (high density) to 81.6 km2 (low density). Stream network geometry is from the Canadian National Hydro Network (Government of Canada, 2016).

To test the depletion apportionment equations, we created a grid of synthetic pumping wells in each drainage density domain, the spacing of which varied between drainage densities due to the order of magnitude difference in domain size. After creating the grid, we eliminated wells in MODFLOW cells which contained a river segment (see section 2.3 for more details about the MODFLOW model). This led to slight differences in the total number of wells between the domains, though all had at least 50 pumping wells. In the low density domain, there were 62 wells spaced at 1,080 m; in the medium density domain, there were 52 wells spaced at 1,009 m; and in the high density domain, there were 54 wells spaced at 494 m (Figure 1c).
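The well-placement procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: the function names, the rectangular domain, and the boolean `river_mask` array are hypothetical, and rows are counted upward from the lower-left corner for simplicity (MODFLOW itself numbers rows from the top).

```python
import numpy as np


def make_well_grid(xmin, xmax, ymin, ymax, spacing):
    """Return an (n, 2) array of well coordinates on a regular grid."""
    xs = np.arange(xmin + spacing / 2, xmax, spacing)
    ys = np.arange(ymin + spacing / 2, ymax, spacing)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])


def drop_wells_in_river_cells(wells, river_mask, x0, y0, dx, dy):
    """Remove wells that fall inside model cells flagged as river cells."""
    cols = ((wells[:, 0] - x0) // dx).astype(int)
    rows = ((wells[:, 1] - y0) // dy).astype(int)
    return wells[~river_mask[rows, cols]]


# Hypothetical 1 km x 1 km domain with 100 m cells and wells every 500 m.
wells = make_well_grid(0.0, 1000.0, 0.0, 1000.0, spacing=500.0)

# Hypothetical stream running along the third row of cells (y = 200-300 m).
river_mask = np.zeros((10, 10), dtype=bool)
river_mask[2, :] = True

kept = drop_wells_in_river_cells(wells, river_mask, 0.0, 0.0, 100.0, 100.0)
print(len(wells), "candidate wells,", len(kept), "kept")
```

Because candidate wells are removed rather than moved, the final well count depends on how the grid happens to intersect the stream network, which is why the totals differ slightly between domains.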
2.2 Depletion Apportionment Equations
To evaluate different depletion apportionment equations, we calculated the streamflow depletion fraction for each stream reach while pumping each well using five different apportionment equations (Figure 2). The first three (Thiessen polygon, inverse distance, and inverse distance squared) were previously evaluated in Reeves et al. (2009) for the Kalamazoo aquifer in Michigan, while the final two (web inverse distance and web inverse distance squared) are new contributions in this study which are designed to consider the entire geometry of a stream network, rather than a single point on each stream reach.




where d is the horizontal distance from the well to the closest point on stream reach j, and n is the total number of stream reaches.
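As a hedged illustration consistent with the definitions above, inverse distance weighting of this kind is commonly written in the form below. The exponent w is a symbol introduced here for illustration (w = 1 for inverse distance, w = 2 for inverse distance squared), and this sketch should not be read as a reproduction of the numbered equations.

```latex
f_i = \frac{1 / d_i^{\,w}}{\sum_{j=1}^{n} 1 / d_j^{\,w}}
```

By construction, the fractions f_i sum to 1 across the n stream reaches, so all depletion from a well is assigned to some reach in the network.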


These apportionment equations have strong theoretical and physical justification. Wilson (1993) demonstrated that the proportion of flow to a pumping well between two parallel streams is a function of the inverse of the distance between each stream and the well, as used by Reeves et al. (2009) to justify the use of the inverse distance method, which also supports the web methods. The web-based method of subdividing a stream into equally spaced points extends the inverse distance approaches based on the analysis of Kollet et al. (2002), who demonstrated that where the assumption of infinitely long streams is not valid (e.g., real stream networks), capture fraction is equal to the integral of changes in leakage along the length of a finite stream reach; by breaking the stream up into equally spaced points, the web methods distribute depletion based on the finite length of each stream reach, rather than a single point as used in the inverse distance and Thiessen polygon methods. Thus, finer point spacing in the web method may better account for different stream network geometries, though it would increase computational cost of performing these calculations, which can be significant depending on the total length of streams in the domain. Finally, the squared term (in both the inverse distance squared and web squared methods) is intended to give greater weight to stream reaches closer to the well (Reeves et al., 2009). We conducted exploratory analysis using a range of exponents for the inverse distance and web approaches in addition to squared (e.g., d^3, d^4, etc.), but elected to conduct our full analysis using only d and d^2 since higher exponents did not significantly improve performance and are less justified by hydrologic theory.
Given that both the inverse distance and web-based methods include all streams within the domain, the use of a nonlinear weighting parameter may be more important as the size of the area tested increases due to far-field streams representing a larger proportion of the overall stream network.
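The closest-point and web weighting schemes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the exponent argument `w`, and the example coordinates are assumptions introduced here, and the points along each reach are assumed to have been generated beforehand at equal spacing.

```python
import numpy as np


def inverse_distance_fractions(d_closest, w=1):
    """Apportion depletion using the closest point on each reach.

    d_closest: distance from the well to the closest point on each reach.
    w: weighting exponent (1 = inverse distance, 2 = inverse distance squared).
    """
    weights = 1.0 / np.asarray(d_closest, dtype=float) ** w
    return weights / weights.sum()


def web_fractions(well_xy, reach_points, w=1):
    """Apportion depletion using every equally spaced point on every reach.

    reach_points: list holding one (n_points, 2) coordinate array per reach.
    Distances of zero are not handled (wells are assumed to be off-stream).
    """
    well_xy = np.asarray(well_xy, dtype=float)
    reach_weights = []
    for pts in reach_points:
        d = np.linalg.norm(np.asarray(pts, dtype=float) - well_xy, axis=1)
        reach_weights.append(np.sum(1.0 / d ** w))
    reach_weights = np.array(reach_weights)
    return reach_weights / reach_weights.sum()


# Hypothetical example: a well at the origin with a short reach and a long
# reach whose closest points are both 100 m away (points every 50 m).
well = (0.0, 0.0)
short_reach = np.column_stack([np.full(3, 100.0), np.arange(0.0, 150.0, 50.0)])
long_reach = np.column_stack([np.full(21, -100.0), np.arange(0.0, 1050.0, 50.0)])

f_id = inverse_distance_fractions([100.0, 100.0], w=1)
f_web2 = web_fractions(well, [short_reach, long_reach], w=2)
print("inverse distance:", f_id)
print("web squared:     ", f_web2)
```

In this toy case the closest-point method splits depletion evenly because both reaches are equally close, whereas the web squared method assigns more depletion to the longer reach, which is the behavior the web approach is designed to capture.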
2.3 Numerical Modeling
To evaluate the performance of the different analytical apportionment equations in a variety of stream network geometries, we performed a sensitivity analysis by comparing depletion apportionment equation results to archetypal numerical models parameterized with different drainage densities, topographic relief, and recharge rates. We selected these variables for sensitivity analysis because they exert a strong control on stream network geometry: drainage density by defining the spatial distribution of streams, topographic relief by changing the vertical position of both streams and pumping wells, and groundwater recharge by changing the water table geometry and the aquifer thickness. Given our focus on stream and aquifer geometry, we did not conduct a sensitivity analysis to the parameters controlling subsurface flow (e.g., Table 2). Previous research has focused on this (Butler et al., 2001; Jayawan et al., 2016; Li et al., 2016; Sophocleous et al., 1995) and future work will investigate additional stream geometries under a wide range of subsurface parameterizations.
Parameter | Value |
---|---|
Number of rows × number of columns | Low density: 200 × 100; Medium density: 105 × 135; High density: 62 × 56 |
Cell width × cell height | Low density: 107.3 m × 103.6 m; Medium density: 101.1 m × 100.9 m; High density: 101.5 m × 100.4 m |
Number of layers | 10 |
Layer thickness | 10 m |
Hydraulic conductivity (isotropic) | 1 × 10−5 m s−1 |
Specific storage | 1 × 10−5 m−1 |
Specific yield | 0.2 |
Effective porosity | 0.14 |
Total porosity | 0.3 |
First, we tested sensitivity to drainage density by creating an archetypal steady-state numerical model of each drainage density domain using MODFLOW-2005 (Harbaugh, 2005), a finite-difference saturated groundwater flow model which has previously been used to evaluate the performance of analytical solutions of streamflow depletion (Butler et al., 2001; Jayawan et al., 2016; Reeves et al., 2009; Sophocleous et al., 1995). As discussed above (section 2.1), these models were intended to be simplified representations of the groundwater system around Nanaimo BC to isolate the impact of different stream geometries on streamflow depletion, rather than site-specific calibrated numerical models (Table 1).
Most parameters were constant across the three drainage density domains (Table 2), and selected to be representative of a typical sandy alluvial aquifer (Fetter, 2000). Each domain had a flat land surface with a homogeneous unconfined aquifer extending 100 m below ground for the initial simulations. Streams were represented using the river (RIV) package as 4 m in depth, 10 m in width, with a streambed thickness of 1 m and streambed conductivity of 0.01 m s−1. Recent work has highlighted the challenges associated with estimating capture and allocation of streamflow depletion in nonlinear groundwater systems (Nadler et al., 2018; Schneider et al., 2017); we elected to simulate unconfined aquifers (which have nonlinear steady-state head distributions between boundaries, unlike confined aquifers) and use the RIV package for streams (in which leakage is a nonlinear function of head in the aquifer and stream, unlike the linear General-Head Boundary package) to evaluate depletion apportionment equations in a system more closely mimicking real-world conditions.
We simulated pumping wells using the well (WEL) package. Wells were screened over the entire aquifer thickness (100 m) and sequentially pumped at a rate of 1,000 m3 d−1 such that there was a separate model realization for each pumping well and domain. We ignored the potential contributions of nonflowing surface water features; lakes within the domain were not considered, and the ocean (which is along the north edge of the medium density domain and all edges of the low density domain except the west) was set as inactive (no-flow) cells to avoid variable-density flow and contribution to pumping from ocean water, which was outside the scope of this study.
Second, we conducted an additional sensitivity analysis of our depletion apportionment equations to topographic relief and groundwater recharge rates in the low density domain since this domain had the best overall performance in the flat simulations (see section 3.1) and thus should be more sensitive to changes than a poorly performing domain whose performance cannot decrease as much. We first introduced relief into the domain using the Canada digital elevation model (Natural Resources Canada, 1997). The top of the numerical model domain was defined as the land surface, which ranged from 0 to 211 m above sea level (masl). The top nine layers were terrain-following and 10 m in thickness, and the bottommost layer extended to −100 masl. Wells were screened over their top 100 m. We then tested the effects of groundwater recharge using the recharge (RCH) package. We applied five different recharge rates (0.01, 0.05, 0.1, 0.5, and 1.0 m yr−1) to represent a range of recharge/hydraulic conductivity ratios (3.17 × 10−5 to 3.17 × 10−3); recharge, which is not typically included in analytical streamflow depletion solutions (Glover & Balmer, 1954; Hunt, 1999; Theis, 1941), introduces an additional aspect of nonlinearity to the groundwater flow system and allows us to further explore the limits of depletion apportionment equations. To compensate for the increased supply of water, we also increased the pumping rate to 5,000 m3 d−1. All other parameters were the same as the flat low density model.
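The quoted range of recharge/hydraulic conductivity ratios follows from a unit conversion, sketched below. The year length of 365.25 days is an assumption made here; the hydraulic conductivity is the value listed in Table 2.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length

K = 1e-5  # hydraulic conductivity from Table 2, m/s
recharge_m_per_yr = [0.01, 0.05, 0.1, 0.5, 1.0]

# Convert each recharge rate to m/s, then divide by K (dimensionless ratio).
ratios = [r / SECONDS_PER_YEAR / K for r in recharge_m_per_yr]

for r, ratio in zip(recharge_m_per_yr, ratios):
    print(f"{r:>5} m/yr -> R/K = {ratio:.2e}")
```

The smallest and largest ratios round to 3.17e-05 and 3.17e-03, matching the range given in the text.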

Combined, this model design provided an opportunity to test depletion apportionment equations in a nonlinear setting.
2.4 Model Evaluation
We evaluated the performance of the different analytical apportionment equations via comparison to MODFLOW output. The output variable evaluated was f_i, the fraction of total streamflow depletion occurring within each stream reach for a given well, which could vary from 0% (the pumping well has no effect on stream-aquifer interactions in a given reach) to 100% (all streamflow depletion from a pumping well came from a single reach). Following Reeves et al. (2009), we calculated fit for a given depletion apportionment equation using only reaches with >5% streamflow depletion in either the MODFLOW or depletion apportionment approaches, so that performance evaluation would not be overly affected by minor differences in small estimates of depletion. As an example to illustrate the methodology, Figure 3 shows the data for an arbitrary pumping well in each of the drainage density domains (corresponding to rows) with all of the depletion apportionment equations (columns). For a given row, only reaches colored cyan, green, orange, or red are used to compare MODFLOW and the depletion apportionment approach, and dark blue reaches are ignored.
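The >5% screening rule described above can be sketched as a simple mask. The depletion values below are hypothetical numbers chosen for illustration, not results from this study.

```python
import numpy as np

# Hypothetical depletion fractions (%) for one well across six reaches.
f_modflow = np.array([62.0, 21.0, 9.0, 4.0, 3.0, 1.0])
f_analytical = np.array([48.0, 25.0, 4.0, 11.0, 2.0, 10.0])

THRESHOLD = 5.0  # percent, following the >5% criterion described above

# Keep a reach if EITHER approach predicts more than 5% depletion.
keep = (f_modflow > THRESHOLD) | (f_analytical > THRESHOLD)

print(keep)
print(f_modflow[keep], f_analytical[keep])
```

Using "either" rather than "both" retains reaches where one approach predicts meaningful depletion and the other does not, so large disagreements are not screened out of the evaluation.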

Example plot showing estimated depletion for different stream reaches under each apportionment method for a single pumping well (red dot). Supporting information Figure S1 shows a map of depletion for a given reach.



While the hydrologic community has traditionally used the KGE on time series data, our model output data is spatial, corresponding to steady-state streamflow depletion estimates associated with different stream reach and well combinations. This novel use of the KGE allowed us to spatially evaluate both overall fit, and the performance related to correlation (r), variability (γ), and bias (β). The overall KGE and each of the individual metrics (r, γ, β) have an ideal value of 1.
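A minimal sketch of how the KGE and its three components could be computed for paired reach-level estimates is given below. It assumes the modified KGE formulation commonly used in hydrology, in which β is the ratio of means and γ is the ratio of coefficients of variation; the function name and example values are hypothetical, and the exact formulation used in this study may differ.

```python
import numpy as np


def kge_components(sim, obs):
    """Return (KGE, r, gamma, beta) for paired simulated/observed values.

    Assumed formulation:
      r     = Pearson correlation between sim and obs
      beta  = mean(sim) / mean(obs)   (bias ratio)
      gamma = CV(sim) / CV(obs)       (variability ratio)
      KGE   = 1 - sqrt((r-1)^2 + (beta-1)^2 + (gamma-1)^2)
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    beta = sim.mean() / obs.mean()
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())
    kge = 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return kge, r, gamma, beta


# Hypothetical example: the apportionment equation predicts exactly half
# of the depletion estimated by the numerical model in every reach.
obs = np.array([60.0, 25.0, 10.0, 5.0])   # "observed" = numerical model
sim = 0.5 * obs                            # "simulated" = apportionment
kge, r, gamma, beta = kge_components(sim, obs)
print(f"KGE={kge:.3f}  r={r:.3f}  gamma={gamma:.3f}  beta={beta:.3f}")
```

In this example the correlation and variability terms are ideal while the bias term is not, so all of the departure from the ideal KGE of 1 is attributable to bias.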



3 Results
3.1 Sensitivity to Drainage Density
Across all drainage densities in the flat domains, the web squared method consistently best matched MODFLOW results, followed by the inverse distance squared method (Table 3; Figure 4). All depletion apportionment equations had significant (p < 0.001) positive linear relationships with MODFLOW estimates across all drainage densities, with R^2 values ranging from 0.24 (Thiessen, low density) to 0.76 (web squared, medium density). For both the inverse distance and web methods, the squared equations performed better than the linear equations across all drainage densities, as the linear equations consistently underestimated depletion (Figures 4a–4c).
Drainage density | Relief | Recharge (mm yr−1) | KGE: Thiessen | KGE: Inverse distance | KGE: Inverse distance squared | KGE: Web | KGE: Web squared |
---|---|---|---|---|---|---|---|
Sensitivity to drainage density in flat domains | | | | | | | |
High | No | 0 | −0.043 | 0.139 | 0.447 | 0.079 | **0.543** |
Medium | No | 0 | 0.450 | 0.165 | 0.608 | 0.152 | **0.626** |
Low | No | 0 | 0.648 | 0.247 | 0.686 | 0.215 | **0.765** |
Sensitivity to relief and recharge in low drainage density domain | | | | | | | |
Low | Yes | 0 | 0.573 | 0.169 | 0.590 | 0.100 | **0.596** |
Low | Yes | 10 | 0.569 | 0.176 | 0.591 | 0.096 | **0.594** |
Low | Yes | 50 | 0.560 | 0.161 | 0.578 | 0.091 | **0.585** |
Low | Yes | 100 | 0.555 | 0.156 | 0.577 | 0.091 | **0.580** |
Low | Yes | 500 | 0.520 | 0.130 | **0.545** | 0.065 | 0.535 |
Low | Yes | 1,000 | 0.433 | 0.074 | **0.463** | 0.003 | 0.440 |
- Note. Bold text is the best performance for each domain. MSE is shown in supporting information Table S1.

Performance of each method and domain; only well/reach combinations with a depletion of >5% included. (a–c) MODFLOW versus analytical depletion apportionment for high, medium, and low drainage density domains. All linear best-fit lines are statistically significant (p < 0.05). (d–f) Difference between analytical and MODFLOW approaches for high, medium, and low drainage density domains.
For all depletion apportionment equations, performance decreased as drainage density increased, with the lowest KGE in the high density domain, intermediate in the medium density domain, and highest in the low density domain (Table 3). The decrease in performance of the depletion apportionment equations at higher drainage densities was associated with a systematic underestimation of depletion, particularly at low levels of depletion (Figures 4a and 4d). This pattern was strongest for the area-based Thiessen polygon method, which performed the worst in the high density domain but the third best in the medium and low density domains. However, the slopes of the best fit lines for the inverse distance squared and web squared approaches were closest to 1 in all domains, indicating they scale effectively across a range of depletion magnitudes in all drainage density domains.
All of the depletion apportionment equations performed poorly at predicting depletion in short stream reaches (Figure 5), which are in many cases <0.01 km, or an order of magnitude smaller than MODFLOW cell sizes (supporting information Figure S2 and Table 2). These small reaches are primarily found in the low drainage density domain (Figure 2 and supporting information Figures S2 and S3) at the base of a topographically steep area (supporting information Figure S4), potentially representing springs. This led to a relatively consistent spatial distribution of error across all depletion apportionment equations, though the Thiessen polygon approach also had frequent errors near the boundaries of the domain where polygons abut the domain edge in one or more directions (supporting information Figure S5). Dividing a stream into individual reaches represented by line segments is typically based on the locations of confluences, and short stream reaches are a potential source of error which may be more important in highly branching stream networks.

Performance of each depletion apportionment relative to MODFLOW as a function of stream reach length. See supporting information Figure S2 for distribution of stream reach lengths in each domain.
The cause of error (bias, correlation, or variability) was more strongly controlled by the choice of depletion apportionment equation than drainage density (Figure 6). The web squared method, which performed the best, tended to have among the most evenly distributed error profiles with 37–71% due to correlation, 23–43% due to variability, and 6–21% due to bias. Error in the inverse distance squared method was mostly correlation (61–93%), with the remainder due to bias (5–24%) and variability (2–20%). For the Thiessen polygon approach, virtually all (85–100%) error was due to imperfect correlation. Error in the linear inverse distance and web methods was due primarily to variability and bias, which are linked due to the systematic underestimation of depletion by the apportionment equations (Figure 4). Across all domains and depletion apportionment equations, there was a negative bias among stream reaches with >5% depletion complemented by a positive bias among stream reaches with <5% depletion. This indicates that the apportionment equations underpredicted depletion relative to the numerical model in the stream reaches which are most strongly affected by pumping, while simultaneously overestimating small amounts of depletion in many reaches which had relatively minor or no depletion in the numerical model. This bias was negatively correlated with drainage density, with the smallest bias in the low density domain.
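The percentages above describe how total error (MSE) is split among correlation, variability, and bias. A minimal sketch of one such split is given below, assuming the standard three-term decomposition of MSE into a correlation term, a squared difference of standard deviations, and a squared difference of means; the function name and example values are hypothetical, and the source does not state the exact decomposition used.

```python
import numpy as np


def mse_error_shares(sim, obs):
    """Split MSE into correlation, variability, and bias shares.

    Assumed decomposition (population standard deviations):
      MSE = 2*s_sim*s_obs*(1 - r) + (s_sim - s_obs)^2 + (m_sim - m_obs)^2
    Shares are undefined if sim and obs are identical (MSE = 0).
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    s_sim, s_obs = sim.std(), obs.std()
    terms = {
        "correlation": 2.0 * s_sim * s_obs * (1.0 - r),
        "variability": (s_sim - s_obs) ** 2,
        "bias": (sim.mean() - obs.mean()) ** 2,
    }
    total = sum(terms.values())
    mse = np.mean((sim - obs) ** 2)
    shares = {name: value / total for name, value in terms.items()}
    return mse, total, shares


# Hypothetical reach-level depletion (%) from the numerical model (obs)
# and from an apportionment equation that underestimates depletion (sim).
obs = np.array([60.0, 25.0, 10.0, 5.0])
sim = np.array([40.0, 22.0, 6.0, 7.0])
mse, total, shares = mse_error_shares(sim, obs)
print(f"MSE = {mse:.2f}")
for name, share in shares.items():
    print(f"{name:>12}: {100 * share:.1f}%")
```

The three shares sum to 1 and can therefore be plotted directly as coordinates on a ternary diagram.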

Ternary diagrams visualizing overall fit (KGE) and contribution of bias, variability, and correlation to total error (MSE). (a) Comparison between depletion apportionment equations and drainage density for flat, no recharge simulations. Shapes are size-coded by KGE, such that larger points have a better overall fit. (b) Annotated ternary diagram highlighting relevance of different types of error to streamflow depletion management. Pop-out scatterplots show examples analogous to Figure 4 for each endmember point of the ternary diagram.
3.2 Sensitivity to Relief
When we incorporated topographic relief into the low density domain, the rank-ordering of the depletion apportionment equations remained unchanged (from best to worst: web squared, inverse distance squared, Thiessen polygon, inverse distance, web; Table 3), though the gap between the web squared and inverse distance squared methods decreased dramatically. For the best method (web squared), the decrease in performance due to the introduction of relief into the low density domain was approximately equal to the decrease in performance associated with going from low to medium drainage density (Table 3). However, while performance decreased due to relief, the patterns of performance were comparable with the flat domain; for example, the inverse distance squared method had the closest slope to 1.0 (Figure 7a), the inverse distance and web methods consistently underestimated depletion (Figures 7a and 7e), and the causes of error remained primarily correlation errors for the best-performing approaches (Figure 7i), especially Thiessen polygon. As in the flat domains, there was a negative bias for all depletion apportionment equations, with the smallest bias using the Thiessen polygon approach.

3.3 Sensitivity to Recharge
As the amount of groundwater recharge increased, the performance of all depletion apportionment equations decreased (Table 3). Web squared performed the best at recharge rates ≤ 100 mm yr−1 (followed by inverse distance squared), while inverse distance squared performed the best at recharge rates ≥ 500 mm yr−1 (followed by web squared). Despite this change in rank order at high recharge levels, the performance of the web squared and inverse distance squared methods was extremely similar across all recharge rates, differing only at the second decimal place of KGE for recharge rates ≤ 1,000 mm yr−1, and MSE for the web squared method was lowest for all scenarios simulated (Table 3 and supporting information Table S1). As noted with the introduction of relief (section 3.2), the patterns of performance remained comparable both to the flat domain and among different recharge rates: the slope of the inverse distance squared method was closest to 1.0 (Figures 7a–7d), depletion was consistently underestimated by the inverse distance and web methods (Figures 7a–7h), and the causes of error for the best-performing approaches remained correlation errors (Figures 7i–7l), especially Thiessen polygon.
For several well-reach combinations, MODFLOW-predicted depletion was either <0% (meaning less river leakage when the well was pumped) or >100% (meaning greater than the total leakage summed across all reaches). These two unusual circumstances are by definition related in equation 6: it is impossible for depletion of >100% to occur in a reach without negative depletion occurring elsewhere in the domain. Negative depletion estimates occurred when high recharge rates led to strong head gradients, including head rising above the surface elevation (supporting information Figure S4), due to the no-flow boundaries along the edges of our domain. Pumping slightly reduced the gradients in places, leading to changes in watershed divide locations.
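The link between depletion above 100% and negative depletion can be illustrated with a small numerical sketch. It assumes that equation 6 (not reproduced in this section) expresses each reach's depletion as its change in river leakage divided by the total change summed across all reaches. The function name, the sign convention (leakage from stream to aquifer is positive), and the leakage values are hypothetical and chosen only for illustration.

```python
def depletion_percent(leakage_no_pumping, leakage_pumping):
    """Percent of total depletion assigned to each reach.

    Assumed form: (change in leakage in a reach) / (sum of changes over
    all reaches) * 100. Leakage from stream to aquifer is taken as positive.
    """
    changes = [p - n for n, p in zip(leakage_no_pumping, leakage_pumping)]
    total = sum(changes)
    if total == 0:
        raise ValueError("pumping caused no net change in leakage")
    return [100.0 * c / total for c in changes]

# Hypothetical leakage values for three reaches (arbitrary units).
baseline = [10.0, 5.0, 8.0]
pumped = [13.6, 4.4, 8.0]  # the second reach leaks less when the well is pumped

pct = depletion_percent(baseline, pumped)
print([round(p, 1) for p in pct])
```

Because the percentages sum to 100 by construction, a value above 100% in one reach must be offset by a negative value in at least one other reach.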
4 Discussion
4.1 Depletion Apportionment Equation Performance
In order to use analytical streamflow depletion models as effective groundwater-surface water management tools, it is necessary to understand where and under what conditions they perform effectively. Previous work by Reeves et al. (2009) tested nine depletion apportionment equations for a single stream reach in Michigan, and concluded that an inverse distance weighting approach using the closest point on each stream reach to a well was reasonably effective in comparison with numerical model results and grounded in hydrogeologic theory (Wilson, 1993). In this study, we tested this conclusion in a variety of settings including multiple stream network geometries, topography, and groundwater recharge conditions. We found that a new method introduced here (web squared) outperforms the inverse distance approach under most of the conditions simulated (Table 3 and supporting information Table S1). This indicates that complete stream network geometry, rather than a single point on each stream, is a critical consideration for the accurate use of analytical solutions.
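The contrast between a closest-point approach and one that uses the full stream geometry can be sketched as follows. This is a minimal illustration rather than the code used in this study: it assumes that the inverse distance methods weight each reach by the inverse of the distance (raised to an exponent) from the well to the closest point on that reach, and that the web methods sum the same inverse-distance weights over all points into which a reach has been discretized. The function names, coordinates, and point spacing are hypothetical.

```python
import math

def point_distance(well, point):
    """Horizontal distance between a well and a stream point (same units)."""
    return math.hypot(point[0] - well[0], point[1] - well[1])

def normalize(weights):
    """Scale weights so that the fractions across all reaches sum to 1."""
    total = sum(weights.values())
    return {reach: w / total for reach, w in weights.items()}

def inverse_distance_fractions(well, reaches, exponent=1):
    """Weight each reach by its closest point only (exponent=2 for 'squared')."""
    return normalize({
        reach: 1.0 / min(point_distance(well, p) for p in pts) ** exponent
        for reach, pts in reaches.items()
    })

def web_fractions(well, reaches, exponent=1):
    """Weight each reach by all of its points (exponent=2 for 'web squared')."""
    return normalize({
        reach: sum(1.0 / point_distance(well, p) ** exponent for p in pts)
        for reach, pts in reaches.items()
    })

# Hypothetical layout (meters): a very short reach close to the well and a
# long reach farther away, discretized into points every 100 m.
# Assumes the well does not coincide with any stream point (distance > 0).
well = (0.0, 0.0)
reaches = {
    "short": [(100.0, 0.0)],
    "long": [(200.0, float(y)) for y in range(-400, 401, 100)],
}

ids = inverse_distance_fractions(well, reaches, exponent=2)
web2 = web_fractions(well, reaches, exponent=2)
print(ids, web2)
```

In this hypothetical layout, the closest-point weighting assigns most of the depletion to the short nearby reach, while the web weighting shifts depletion toward the long reach because every point along it contributes.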
Stream length was an important control on the performance of all of the depletion apportionment equations, with a substantially worse fit to MODFLOW results in very short (<0.1 km) stream reaches (Figure 5). These short streams are found primarily in the low density domain at the base of a topographically steep feature and potentially represent springs, a type of groundwater-dependent ecosystem which is particularly vulnerable to pumping (Currell, 2016; Eamus et al., 2015; Rohde et al., 2017). Given that the length of these reaches is smaller than the MODFLOW grid cells used to represent them, this error may be driven by a scale mismatch between the two methods; finer meshes in numerical models may be necessary to accurately estimate depletion in these short reaches.
Additionally, the performance of all depletion apportionment equations decreased when topographic relief and groundwater recharge were introduced into the domain. This is because both relief and recharge increase spatial variability in the water table within the domain, creating local water table gradients that drive flow but are ignored by the purely geometric depletion apportionment equations. For instance, a stream reach in close proximity to a well may be on the other side of a local groundwater divide, and therefore be relatively unaffected by pumping, or have an increase in groundwater discharge following pumping (negative streamflow depletion in Figure 7) due to an increase in the contributing area to that stream reach associated with a shift in the groundwater divide. Currently, the depletion apportionment equations use only horizontal distance between each well and a stream reach (section 2.2), and including terms representing elevation differences and local variability in elevation may be a path to improve performance, particularly in high-relief settings.
4.2 Importance of Different Sources of Error
In this study, we apply the KGE spatially and develop a novel approach to quantify and visualize the contribution of different sources of error (e.g., Figure 6). We weighted the different types of error (correlation, bias, variability) equally in the calculation of the KGE. However, depending on study, policy, or management goals, it is possible to assign different weights to these components, which may influence the selection of the preferred depletion apportionment equation. Figure 6b highlights some of the considerations associated with different types of error. For instance, methods where error is primarily due to bias and variability are best at identifying which streams are affected by a pumping well, though the magnitude of depletion may be incorrect; however, this may be statistically corrected if the degree of bias/variability is known. In contrast, methods where error is primarily due to correlation are most effective at predicting mean network-wide depletion, but not at identifying specific reaches which may be affected. Given that error in the web squared method tends to be less associated with correlation than either the inverse distance squared or Thiessen polygon approaches, this is further support for its use in screening for potential streamflow depletion.
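The effect of weighting the KGE components can be sketched as below. The sketch uses the standard formulation of Gupta et al. (2009), in which the KGE is one minus the Euclidean distance from the ideal point in the space of correlation (r), variability ratio (alpha), and bias ratio (beta), with optional scaling factors on each component. The function names and depletion values are hypothetical, and this is not the code used to produce the results in this study.

```python
import math
from statistics import mean, pstdev

def kge_components(sim, obs):
    """Return (r, alpha, beta): correlation, variability ratio, bias ratio.

    Assumes obs has nonzero mean and both series have nonzero spread.
    """
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / len(obs)
    return cov / (sd_s * sd_o), sd_s / sd_o, mu_s / mu_o

def kge(sim, obs, weights=(1.0, 1.0, 1.0)):
    """KGE with scaling factors for correlation, variability, and bias."""
    r, alpha, beta = kge_components(sim, obs)
    w_r, w_a, w_b = weights
    distance = math.sqrt(
        (w_r * (r - 1)) ** 2 + (w_a * (alpha - 1)) ** 2 + (w_b * (beta - 1)) ** 2
    )
    return 1.0 - distance

# Hypothetical depletion (%) in four reaches: the analytical estimate ranks
# the reaches correctly but underestimates every value by half.
modflow = [40.0, 30.0, 20.0, 10.0]
analytical = [0.5 * v for v in modflow]

equal_weight_kge = kge(analytical, modflow)
correlation_only_kge = kge(analytical, modflow, weights=(1.0, 0.0, 0.0))
print(equal_weight_kge, correlation_only_kge)
```

In this example all of the error is due to bias and variability, so a weighting that considers only correlation scores the estimate as perfect, while equal weighting penalizes it.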
The prioritization of different types of errors, therefore, is a local decision depending on social and political priorities (Acreman et al., 2014; Quevauviller et al., 2016). The flexibility of the KGE and the ability to decompose mean squared error into its various components (Gudmundsson et al., 2012; Gupta et al., 2009) make it a valuable tool for evaluating analytical models and depletion apportionment equations, so that water managers in locations without existing numerical models can choose appropriate tools. For environmental reasons, conservative estimates of depletion are preferred as they avoid overallocation of water resources (Gleeson & Richter, 2017; Jayawan et al., 2016; Rathfelder, 2016; Reeves et al., 2009). Concerningly, all of the depletion apportionment equations tested here had a negative bias in our archetypal domain, ranging from −0.2% (Thiessen polygon, flat low density domain) to −72.2% (inverse distance, flat high density domain) (Figures 4 and 7). A negative bias means that (on average) streamflow depletion will be underestimated when using the depletion apportionment equation relative to the numerical model. This differs from previous work by Rathfelder (2016), which found that analytical models tended to overpredict depletion relative to a calibrated numerical model; however, Rathfelder (2016) was looking at transient depletion for a single stream over a relatively short (2 year) timeframe, while our study investigates long-term steady-state depletion distributed among a network. These results highlight the importance of quantifying bias locally and correcting where possible, and the need for additional testing of depletion apportionment equations under transient conditions.
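The decomposition of mean squared error referred to above can be sketched using the identity given by Gupta et al. (2009), which splits MSE into a bias term, a variability term, and a correlation term. The sketch below uses population standard deviations so that the three terms sum exactly to the MSE; the function name and depletion values are hypothetical, and the exact normalization used in this study's figures is not assumed.

```python
from statistics import mean, pstdev

def mse_decomposition(sim, obs):
    """Split MSE into bias, variability, and correlation terms.

    MSE = (mu_s - mu_o)^2 + (sd_s - sd_o)^2 + 2*sd_s*sd_o*(1 - r),
    where the last term is computed as 2*(sd_s*sd_o - cov) to avoid
    dividing by a zero standard deviation.
    """
    n = len(obs)
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    parts = {
        "bias": (mu_s - mu_o) ** 2,
        "variability": (sd_s - sd_o) ** 2,
        "correlation": 2.0 * (sd_s * sd_o - cov),
    }
    mse = sum((s - o) ** 2 for s, o in zip(sim, obs)) / n
    return mse, parts

# Hypothetical depletion (%) in four reaches.
modflow = [40.0, 30.0, 20.0, 10.0]
analytical = [25.0, 30.0, 10.0, 5.0]

mse, parts = mse_decomposition(analytical, modflow)
shares = {name: value / mse for name, value in parts.items()}
print(mse, shares)
```

Dividing each term by the MSE gives the share of error attributable to each source, which is one way to compare apportionment equations whose total error is similar but whose management implications differ.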
4.3 Operationalization and Future Research Needs
These results highlight the potential of depletion apportionment equations to accurately distribute streamflow depletion and estimate capture fraction within a variety of different stream networks, even in nonlinear groundwater flow systems such as the unconfined aquifers tested here (Nadler et al., 2018). To operationalize these apportionment equations, it is necessary to combine them with analytical streamflow depletion models, in particular those considering transient pumping effects to determine the timescales over which impacts will occur. However, it is critical to test and evaluate the performance of analytical models both separate from and combined with depletion apportionment equations to ensure that their performance is sufficient to address management-related questions. The State of Michigan's Water Withdrawal Assessment Tool, described in Reeves et al. (2009), provides one model for how these tools can be tested and operationalized; and our ongoing research is testing several combinations of analytical models with depletion apportionment equations under transient conditions in multiple hydrogeological settings to determine where these tools can be effectively implemented.
Our study also highlights several factors impacting streamflow depletion which should be explored in future work. First, model boundary conditions should be sufficiently far from both the wells and the stream reaches of interest. Where nonflowing surface water features such as a coastline are present, these can introduce a considerable source of error, as depletion apportionment equations have not been tested for variable density flow (e.g., saltwater intrusion). Second, given that streams may potentially dry as a result of pumping which can lead to nonlinearities in the base flow response to pumping (Ahlfeld et al., 2016), the streamflow-routing (SFR; Niswonger & Prudic, 2005) MODFLOW package may be preferred to the river (RIV) package used in this study (Feinstein et al., 2016, 2018). However, given that analytical models assume that streams will not dry, using SFR would be less directly comparable to analytical model results, which are more closely approximated by the Generalized Head Boundary (GHB) package. Finally, as noted in section 2, this study focused on the effects of stream geometry, and we do not assess the sensitivity of our results to subsurface parameters controlling groundwater flow such as hydraulic conductivity, streambed conductance, or aquifer heterogeneity; or to the discretization of stream networks into points for the web and web squared methods.
5 Synthesis and Conclusions
Groundwater is widely used for irrigation around the world and groundwater pumping can be a major driver of low streamflow, particularly by exacerbating hydrologic drought (de Graaf et al., 2014; Siebert et al., 2010; Veldkamp et al., 2017; Wada et al., 2012, 2013; Zipper et al., 2017a). To avoid negative impacts of streamflow depletion on ecosystems and stakeholders, it is essential to both quantify the source of water used by wells and put that knowledge into the hands of management decision-makers (Gleeson et al., 2012; Irvine, 2018; Van Loon et al., 2016). Due to the high effort, expertise, and data required to make a site-specific numerical model (Table 1), analytical models paired with depletion apportionment equations may be an essential management tool that can be used to screen pumping wells to avoid excessive depletion. Our main conclusions are as follows:
- Web squared, a new method introduced here which explicitly considers stream network geometry, performs the best across a range of drainage density, topographic, and groundwater recharge scenarios, followed by the inverse distance squared method.
- The performance of all depletion apportionment equations decreased as drainage density increased, topographic relief was included, groundwater recharge increased, and stream reach length shortened.
- The KGE and error decomposition approaches demonstrated here are valuable metrics for assessing the performance of streamflow depletion approaches, as they allow for the separate assessment of performance criteria (correlation, bias, variability) with different management implications.
Future work is needed to test the performance of these depletion apportionment methods in different hydrostratigraphic settings and with additional complexity, such as subsurface heterogeneity and transient groundwater flow conditions, to better constrain their use as conjunctive groundwater-surface water management tools.
Acknowledgments
We appreciate helpful discussions with Ben Kerr and Tara Forstner during the analysis and writing process, as well as comments from Marc Bierkens and two anonymous reviewers. This work was funded by a Natural Sciences and Engineering Research Council Collaborative Research and Development Grant (NSERC CRD) to the University of Victoria and Foundry Spatial. Support to Andreas Hartmann was provided by the Emmy Noether-Programme of the German Research Foundation (DFG; grant HA 8113/1-1; project “Global Assessment of Water Stress in Karst Regions in a Changing World”). Data and code are available on GitHub (https://github.com/szipper/NanaimoAttributionMethods). All analyses were performed using R 3.4.3 (R Core Team, 2017) and graphics were made using ggplot2 (Wickham, 2009), ggtern (Hamilton, 2017), and InkScape (The Inkscape Team, 2015).