Volume 55, Issue 4 p. 2939-2960
Research Article

Can Improved Flow Partitioning in Hydrologic Models Increase Biogeochemical Predictability?

Mahyar Shafii

Corresponding Author

Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, Ontario, Canada

Water Institute, University of Waterloo, Waterloo, Ontario, Canada

Correspondence to: M. Shafii,

[email protected]

James R. Craig

Water Institute, University of Waterloo, Waterloo, Ontario, Canada

Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, Ontario, Canada

Merrin L. Macrae

Water Institute, University of Waterloo, Waterloo, Ontario, Canada

Department of Geography and Environmental Studies, Wilfrid Laurier University, Waterloo, Ontario, Canada

Department of Geography and Environmental Management, University of Waterloo, Waterloo, Ontario, Canada

Michael C. English

Department of Geography and Environmental Studies, Wilfrid Laurier University, Waterloo, Ontario, Canada

Sherry L. Schiff

Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, Ontario, Canada

Water Institute, University of Waterloo, Waterloo, Ontario, Canada

Philippe Van Cappellen

Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, Ontario, Canada

Water Institute, University of Waterloo, Waterloo, Ontario, Canada

Nandita B. Basu

Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, Ontario, Canada

Water Institute, University of Waterloo, Waterloo, Ontario, Canada

Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, Ontario, Canada

First published: 21 March 2019


Hydrologic models partition flows into surface and subsurface pathways, but their calibration is typically conducted only against streamflow. Here we argue that unless model outcomes are constrained using flow pathway data, multiple partitioning schemes can lead to the same streamflow. This point becomes critical for biogeochemical modeling as individual flow paths may yield unique chemical signatures. We show how information on flow pathways can be used to constrain hydrologic flow partitioning and how improved partitioning can lead to better water quality predictions. As a case study, an agricultural basin in Ontario is used to demonstrate that using tile discharge data could increase the performance of both the hydrology and the nitrogen transport models. Watershed-scale tile discharge was estimated based on sparse tile data collected at some tiles using a novel regression-based approach. Through a series of calibration experiments, we show that utilizing tile flow signatures as calibration criteria improves model performance in the prediction of nitrate loads in both the calibration and validation periods. Predictability of nitrate loads is improved even with no tile flow data and by model calibration only against an approximate understanding of annual tile flow percent. However, despite high values of goodness-of-fit metrics in this case, temporal dynamics of predictions are inconsistent with reality. For instance, the model predicts significant tile discharge in summer with no tile flow occurrence in the field. Hence, the proposed tile flow upscaling approach and the partitioning-constrained model calibration are vital steps toward improving the predictability of biogeochemical models in tiled landscapes.

Key Points

  • Tile flow data were used to constrain flow partitioning in hydrologic models
  • Novel methodology developed to upscale from single tile to watershed scale
  • Constraining tile flow resulted in better performance of the nitrogen model

1 Introduction

The amount of nitrogen (N) circulating in the biosphere has dramatically increased since the second half of the twentieth century due to human activities such as the intensification of agriculture for food and energy production (Galloway et al., 2008; Schlesinger, 2009; Vitousek et al., 1997). Human activities have led to increasing N leaching to groundwater (Puckett et al., 2011) and surface water (Howden et al., 2010), and increased loadings have not only put drinking water supply at risk but also impacted ecosystem health, for example, through eutrophication of receiving waters (e.g., Blann et al., 2009; Rabalais et al., 2002; Rockström et al., 2009; Scavia et al., 2006; Schulz, 2004; Testa et al., 2014; Turner et al., 2006; Turner & Rabalais, 1994; Vitousek et al., 1997). As a result, the fate and transport of N within terrestrial and in-stream domains have received increased attention among water researchers.

Many different models have been developed in the past few decades to describe N dynamics in watersheds by establishing functional links between watershed hydrologic responses and water quality. These models, which are of different levels of complexity (e.g., MIKESHE; Christiansen et al., 2004; SPNM; Williams, 1980; AGNPS; Young & Shepherd, 1995; SWAT; Arnold et al., 1998; HSPF; Bicknell et al., 1996; SWIM; Krysanova et al., 1998; and HGS; Yang et al., 2018), incorporate a number of hydrochemical processes for the simulation of the coupled catchment hydrology and biogeochemistry. Given the dependence of N cycling upon the physical characteristics of the landscape and hydrologic controls (Cirmo & McDonnell, 1997; Vidon & Hill, 2004), an improved understanding of the correct internal catchment processes and flow pathways is essential for model simulations to be realistic. However, the incorporation of complex processes in models generally involves more parameters to be estimated via calibration, which in turn increases the degrees of freedom and leads to problems of parameter identifiability, that is, the equifinality issue in which several parameter sets might eventually turn out to be equally acceptable (Beven, 2006; Kirchner, 2006).

The literature shows that to enhance model identifiability, diagnostic model evaluation methods can be employed to impose proper constraints for model discrimination (Gupta et al., 2008). For example, model evaluation utilizing hydrological signatures extracted from streamflow data has been shown to improve model predictions (e.g., Euser et al., 2013; Gupta et al., 2008; Pokhrel et al., 2012; Shafii & Tolson, 2015; Winsemius et al., 2009; Yilmaz et al., 2008). Flow pathways are critically important in biogeochemical modeling because water flowing through different pathways may yield very different biogeochemical signatures. It is therefore desirable to constrain the hydrologic partitioning in models based on streamflow signatures and/or additional datasets. Constraining models using the information extracted from experts' knowledge and such datasets has previously proven to be valuable in hydrological modeling. Recently, Shafii et al. (2017) showed that utilizing hydrological signatures derived only from streamflow data did not necessarily result in correct flow partitioning. Thus, to reduce uncertainties in solute transport modeling, measurements along different hydrologic flow paths are warranted. Because it is extremely difficult, or even impossible, to physically separate water from different paths, there is very limited research on constraining flow partitioning based on actual data on flow pathways (e.g., van der Velde et al., 2010). As such, researchers use auxiliary data (mostly tracers and isotopes) to constrain flow partitioning and/or individual hydrochemical processes. Many previous studies have employed such data in a multicriteria calibration context (Hooper et al., 1988; Mroczkowski et al., 1997), for hydrograph separation (Ladouche et al., 2001; Soulsby et al., 2003), and for estimating the fraction of young and old water (Son & Sivapalan, 2007; Vaché & McDonnell, 2006). Moreover, it has been shown that soft data (Seibert & McDonnell, 2002; Son & Sivapalan, 2007) are also helpful in model conditioning. For instance, Yen et al. (2014) consider regional estimates of the annual denitrification mass and the fraction of groundwater nitrate load as soft data for parameter estimation.

Artificial drainage, for example, tile drainage, is known to be an important pathway for water and solute transport in agricultural catchments (Basu et al., 2010; Blann et al., 2009; Boland-Brien et al., 2014; Guan et al., 2011; Macrae et al., 2007a; Sloan et al., 2016). The percentage of Ontario farms with tile drainage has increased over the past decade, and the ability of tiles to convey nutrients from agricultural fields is tied to eutrophication problems in receiving waters such as Lake Erie (Jarvie et al., 2017). Some watershed models include mathematical representations of tile flow. However, because it is generally not known where tiles are located and when tile flow occurs, this flow pathway is usually considered by models in an approximate way, and the underlying parameters are adjusted via model calibration with respect to streamflow data (e.g., David et al., 2009; Green et al., 2006). Only a few modeling studies have utilized tile discharge data to validate their models, either a posteriori, that is, after streamflow calibration (e.g., Boles et al., 2015; Davis et al., 2000; Moriasi et al., 2013), or by constraining physically based models for streamflow predictions at field scale (Hansen et al., 2013; Rozemeijer et al., 2010; Vrugt et al., 2004). To our knowledge, no studies in the literature have demonstrated the direct use of tile flow data and corresponding metrics for improving biogeochemical predictability.

Our overall hypothesis for this study is that improving flow partitioning in a hydrologic model using subsurface tile flow data in addition to streamflow data will enhance the biogeochemical predictability of a catchment-scale nitrogen model. We test this hypothesis in a small, first-order agricultural catchment in Ontario where daily streamflow and nitrate data, and intermittent tile flow data, are available. We develop a novel methodology to upscale intermittent tile flow data at a subset of tiles in the catchment to estimate catchment-scale tile discharge. Since tile flow data are not available in all catchments, we further evaluate the ability of soft data—expert knowledge on percent tile flow in a catchment—to constrain model outcomes. The main objective in this research is to demonstrate that appropriate constraining of flow partitioning in a hydrologic model increases the ability of the model to reproduce streamflow and nitrogen dynamics.

2 Material and Methods

2.1 Study Area

The Strawberry Creek Watershed (SCW) is a small (~2.7 km²) first-order catchment located in Southern Ontario, Canada (Figure 1). SCW is a tributary of Hopewell Creek that discharges into the Grand River, which ultimately flows into Lake Erie. Land use in SCW is dominantly agriculture, with corn, soybean, and cereal as major crops. Soils are a poorly drained mixture of gray-brown luvisols and Melanic brunisols (Present & Wicklund, 1971). Surficial soil textures are predominantly loam and silt loam. A layer of clay-rich Maryhill till present at approximately 2-m depth makes shallow groundwater flow primarily lateral (Mengis et al., 1999). We use soil data obtained from Agriculture and Agri-Food Canada, C. S. I. S. C. and NSDB (National Soil DataBase) (2010) to characterize soil properties in the model. Approximately 65% of the catchment is artificially drained by subsurface tile drains, made of either perforated clay or polyethylene tubing ~10 cm in diameter and installed approximately 1 m below the soil surface. Seven active tile drains exist in the basin, as shown in Figure 1b.

Figure 1. (a) Study area located in southwestern Ontario (solid line represents the Canada-United States border); (b) catchment boundary, land use, and the inlet of seven active (A-G) and two plugged (PL) tiles in the catchment.

Mean annual precipitation is 909 mm, and mean annual air temperature (1971–2000) is 7 °C, with monthly means ranging from −7 °C in January to 20 °C in July (Environment Canada, 2004). The obtained annual potential evapotranspiration is 590 mm, which results in an aridity index of 0.65. Mean annual runoff in the time frame 1996–2002 was 323 mm (Macrae et al., 2007a). A total of nine tile drain outlets drain into the stream, two of which (denoted PL in Figure 1) were plugged during the data collection period and did not contribute to stream hydrochemistry. The majority of annual tile discharge occurs between the months of November and April (Macrae et al., 2007b), increasing in response to snowmelt and precipitation events. During dry periods, perennial groundwater inputs sustain stream base flow with no tile and/or surface flow in the basin. Exceptionally dry periods occur only rarely (e.g., the summer of 2001); during these periods, streamflow ceases completely.

Measurements of streamflow and tile discharge, climate data, and chemistry at the outlet of the basin are reported in Macrae et al. (2007a, 2007b); daily data (forcings and streamflow) from the March 2000 to August 2001 period are used for calibrating the model. Daily tile discharge measurements are available for the year 2001 from one of the seven working tiles in the basin (tile B; see Figure 1); daily nitrate data are not available at this tile. To constrain model outcomes using tile flow, we employ tile data in the time frame January–August 2001, which overlaps the streamflow and nitrate data used in this study. We also use the data reported in Rashid (2013) for model validation; all streamflow and nitrate data available in the time frame September 1997 to April 1999 are used for evaluating the performance of the calibrated hydrology and nitrogen models. We consider a 1-year warm-up period between 1 September 1996 and 31 August 1997.

The hydrologic and nitrogen models developed in this study run at the scale of Hydrologic Response Units (HRUs). Each HRU is a particular combination of land use and soil type, delineated using land use data obtained from the data platform of the Grand River Conservation Authority (https://data.grandriver.ca/) and soil data from the existing soil database (Agriculture and Agri-Food Canada, C. S. I. S. C., and NSDB (National Soil DataBase), 2010). The soil data comprise polygons, each providing different soil parameters for the underlying soil profile. Land use data show that the land class in the catchment is either cropland or forest. Overlaying soil and land use data, we identified six HRUs with surface areas ranging from 11 to 124 ha. Figure 1 shows the two HRU types (i.e., forest ~10% and cropland ~90% of the catchment).

2.1.1 Tile Discharge Estimation Upscaling Approach

One of the main contributions in this study is the constraining of the outcomes of our hydrology model using signatures that are calculated based on tile flow measurements. As such, we required a continuous record of daily discharge from all tiles at the catchment scale, which was not directly available in the study watershed. Rather, the available data included a complete time series of daily tile discharge at only one tile outlet in the basin (tile B; see Figure 1), as well as scattered tile discharge measurements made simultaneously at all seven active tiles (including tile B) throughout the data collection period. Macrae et al. (2007b) showed that there were strong statistical relationships between the discharge through tile B and the other tiles in the watershed. Consequently, to estimate daily tile discharge at the catchment scale, we established a linear regression equation between log-transformed discharge at tile B (independent variable) and logarithm of the sum of discharges of all seven tiles (dependent variable). The equation derived in this way (with r² of 0.87 between dependent and independent variables) was used to estimate total tile flow at the catchment outlet at every time step. Statistically, such an equation would give long-term average tile flow at the catchment scale on a particular time step based on flow in tile B without accounting for any uncertainties in the data and/or the regression model. To take uncertainties into account, we needed certain acceptability thresholds extracted from the regression model that could define the upper and lower bound for tile flow on a particular day. As such, we determined the statistical prediction intervals for daily tile flow values using the regression model. As explained later in section 2.3.1, tile discharge time series and the corresponding intervals were eventually utilized in the model calibration process.
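
The upscaling step can be illustrated with a minimal sketch, assuming ordinary least squares in log10 space and the textbook prediction interval for a single new observation. The function names, the example flows, and the choice of log base are hypothetical and are not taken from the study. Flows are assumed to be strictly positive, and the critical t value is passed in by the caller (e.g., from a t table with n − 2 degrees of freedom).

```python
import math

def fit_loglog(q_tile_b, q_all_tiles):
    """OLS fit of log10(total tile flow) on log10(tile B flow).

    Assumes all flows are strictly positive; zero-flow days would need
    separate handling (an assumption of this sketch).
    """
    x = [math.log10(q) for q in q_tile_b]
    y = [math.log10(q) for q in q_all_tiles]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    resid_std = math.sqrt(sse / (n - 2))  # residual standard error
    return {"slope": slope, "intercept": intercept, "resid_std": resid_std,
            "n": n, "x_mean": x_mean, "sxx": sxx}

def predict_total_tile_flow(fit, q_b, t_crit):
    """Return (lower, estimate, upper) total tile flow for one tile B flow.

    t_crit is the two-sided critical t value for the desired coverage,
    supplied by the caller.
    """
    x0 = math.log10(q_b)
    y0 = fit["intercept"] + fit["slope"] * x0
    half_width = t_crit * fit["resid_std"] * math.sqrt(
        1.0 + 1.0 / fit["n"] + (x0 - fit["x_mean"]) ** 2 / fit["sxx"])
    # Back-transform from log10 space to flow units.
    return 10 ** (y0 - half_width), 10 ** y0, 10 ** (y0 + half_width)

# Hypothetical paired measurements (same units for both series).
fit = fit_loglog([1.0, 2.0, 4.0, 8.0, 16.0], [6.0, 9.0, 22.0, 38.0, 85.0])
lower, estimate, upper = predict_total_tile_flow(fit, 5.0, t_crit=3.182)
```

Because the interval is built in log space and then back-transformed, the resulting bounds are asymmetric around the estimate in flow units.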

2.2 Model Setup

2.2.1 Hydrologic Model

We modified the hydrologic model developed in Shafii et al. (2017) to include the tile-drain pathway. The hydrologic model is developed within the flexible modeling framework RAVEN (Craig, 2017). Figure 2a schematically illustrates the hydrologic model and the underlying processes including precipitation, canopy interception, snowmelt, infiltration, overland runoff, evapotranspiration, percolation, tile flow, base flow, and finally hydrologic routing. We assume the landscape to consist of two soil compartments, with the upper one representing the soil above the tile drains and the lower compartment representing the part of the soil below the tiles. It is assumed that the water table fluctuates in the lower compartment, which means that this compartment is partially saturated. When this storage is full, the water table rises above the tiles (i.e., storage in the upper compartment increases). The storage in the lower compartment controls the amount of base flow and potential evapotranspiration from this compartment (see Appendix A). Our approach to formulating these two fluxes is similar to a previous study by Ye et al. (2012). Note that the terms compartment and zone are used interchangeably in this paper to refer to the upper and lower boxes in Figure 2.

Figure 2. Model structure, hydrologic (a) and nitrogen (b), showing soil compartments or zones (upper and lower), nitrogen pools (organic and mineral), and hydrochemical processes. Appendix B details the nitrate mixing mechanism in the passive storage.

In terms of processes in the upper compartment, if the moisture in the upper soil compartment exceeds field capacity, percolation (i.e., vertical drainage) occurs from the upper to the lower compartment. Subsequently, tiles are activated if the remaining moisture in the overlying soil still exceeds field capacity. The rate of tile flow increases from zero (at field capacity) toward a maximum rate (when the overlying soil becomes saturated); this maximum rate is a calibration parameter. This assumption is in line with recent findings in Lam et al. (2016) who showed with high-frequency data that tiles in loamy soils do not flow until field capacity is reached.
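
The threshold behavior of the tile pathway can be sketched as follows. The linear ramp between field capacity and saturation is an assumption made here for illustration only; the actual functional form is the one given in Appendix A, and the function and argument names are hypothetical.

```python
def tile_flow_rate(storage, field_capacity, saturation, max_rate):
    """Tile flow as a function of upper-compartment storage.

    Zero at or below field capacity, max_rate at saturation.
    The linear interpolation in between is an assumption of this sketch.
    """
    if storage <= field_capacity:
        return 0.0
    fraction = (storage - field_capacity) / (saturation - field_capacity)
    return max_rate * min(fraction, 1.0)

# Hypothetical storages in mm and a maximum rate in mm/day.
dry = tile_flow_rate(100.0, 150.0, 300.0, 8.0)      # below field capacity
halfway = tile_flow_rate(225.0, 150.0, 300.0, 8.0)  # midway to saturation
```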

All processes of the hydrologic model, which runs at the HRU scale, are mathematically described in Appendix A. Table A1 describes the parameters of the hydrologic model that must be adjusted by calibration. These calibrated parameters are spatially invariant (i.e., constant over all six HRUs delineated in the study area). Note that, in accordance with common practice for tile drainage in Ontario soils, the tile pathway in the model is located at a depth of 1 m, which means the thickness of the upper soil compartment is 1 m. The depth of the lower soil compartment is estimated via calibration.

2.2.2 Biogeochemical Model

The hydrologic model is coupled with a nitrogen model to simulate nitrate loss at the catchment scale. The concepts employed for developing the nitrogen model are based on previous studies (van der Velde et al., 2010; Hrachowitz et al., 2013). As shown in Figure 2b, the model has two nitrogen reservoirs in the upper soil compartment (organic and mineral) and two conceptual mineral nitrogen reservoirs in the lower zone, namely, the active and passive storage. The details of passive storage and the internal mixing in the lower zone will be described later. We assume the mineral reservoirs to be composed of nitrate. We do not explicitly model ammonium dynamics in the system; field measurements collected in the study area indicate that ammonium represents only 3.5% of the mineral pool (ammonium + nitrate) in the top 10 cm of the soil (Solondz, 2005). The fluxes that modify the nitrogen reservoirs within the upper zone (i.e., top 1 m of the soil profile) include (i) input fluxes composed of fertilizer added 90% to the mineral and 10% to the organic pools (as shown in Figure 2), atmospheric deposition added to the mineral pool, biological fixation, and crop residue added to the organic nitrogen pool; (ii) plant uptake; (iii) mineralization of organic matter, which releases nitrate; (iv) denitrification, that is, bacterial reduction of nitrate under anoxic conditions due to potentially high water content in the upper zone; (v) leaching to the lower zone; and (vi) nitrate transport through tiles to the stream. Note that surface runoff is assumed to carry no nitrate in our model. The processes considered in the lower zone include (i) denitrification, due to the fact that this zone is mostly water saturated; (ii) plant uptake; and (iii) nitrate transport through base flow to the stream.

Model processes are linked to temporally variable factors (e.g., soil moisture, temperature, and growing season) using a number of rate constants that are estimated via calibration. Appendix B provides detailed mathematical representations of the different processes, and Table B1 describes all variables and calibration parameters used in the proposed nitrogen model. Note that most of the nitrogen processes are coupled with the corresponding hydrologic fluxes, which in turn are functions of climate and soil controls (see the equations in Appendix A). More specifically, plant uptake of nitrate is proportional to the evapotranspiration flux, nitrate leaching is dependent upon percolation to the lower soil compartment, and tile and groundwater nitrate exports are related to tile discharge and base flow, respectively.

Catchments always exhibit low-pass filter characteristics whereby the amplitudes and high-frequency variability of solute input signals are attenuated significantly (e.g., Basu et al., 2010; Kirchner, 2003; Kirchner et al., 2000; Martinec et al., 1974). To conduct coupled hydrologic and biogeochemical modeling, models must explicitly account for celerities in streamflow simulation (i.e., short time scales) and velocities for solute transport (i.e., long characteristic times; McDonnell & Beven, 2014). To accommodate both time scales, we follow previous studies (e.g., Barnes & Bonell, 1996) and introduce passive mixing storage in the underlying soil compartment of the model, shown in dark blue in Figure 2. The passive storage does not impact the water storage but influences the solute dynamics by a partial (rather than complete) mixing mechanism in order to emulate transit time distribution characteristics; see details in Appendix B. Using such passive storage has proven to be a simple approach to simulating solute transport (e.g., Fenicia et al., 2010; Hrachowitz et al., 2013, 2015; Shaw et al., 2008). The upper compartment is assumed to be well-mixed, and partial mixing is applied only to the lower compartment. The partial mixing requires two parameters, the size of the passive storage and a mixing coefficient, both of which are estimated via calibration.
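
To make the idea of partial mixing concrete, the sketch below shows one possible mass-conserving formulation in which the active and passive storages each move a fraction of the way toward their fully mixed concentration in a time step. This is a hypothetical formulation for illustration only and is not the mechanism specified in Appendix B; the function name, arguments, and example values are assumptions.

```python
def partial_mix(mass_active, vol_active, mass_passive, vol_passive, mix_coeff):
    """Exchange solute between active and passive storage for one time step.

    mix_coeff = 0 gives no exchange; mix_coeff = 1 gives complete mixing.
    Water volumes are unchanged, so the passive storage affects only the
    solute signal (hypothetical formulation, not the one in Appendix B).
    """
    # Concentration both stores would share if they were completely mixed.
    c_mixed = (mass_active + mass_passive) / (vol_active + vol_passive)
    new_active = mass_active + mix_coeff * (c_mixed * vol_active - mass_active)
    new_passive = mass_passive + mix_coeff * (c_mixed * vol_passive - mass_passive)
    return new_active, new_passive

# Hypothetical example: 10 kg nitrate in 100 mm of active storage,
# none in 400 mm of passive storage, mixing coefficient 0.3.
active, passive = partial_mix(10.0, 100.0, 0.0, 400.0, 0.3)
```

A large passive volume combined with a small mixing coefficient damps and delays the nitrate signal in base flow without altering the simulated water balance.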

In terms of model discretization, the model is run on two HRU types, namely, cropland comprising four HRUs and forest consisting of two HRUs. The hydrologic fluxes of four agricultural HRUs and two forest HRUs are aggregated and used for transporting nitrogen. All biogeochemical processes described earlier are applied to the cropland HRUs, whereas no fertilizer application is considered in the forest HRUs. Appendix B elaborates on the data used in our case study to estimate variables such as fertilizer amounts and application date, atmospheric deposition, and biological fixation. Expressions for mixing and biogeochemical processes in the system are lumped over the two HRU types (i.e., cropland and forest). The coupled model runs at the daily time scale. Once the simulation is finished over the entire time frame, we calculate the annual, monthly, and daily loads of nitrate delivered via tile pathway and at the catchment outlet.

2.3 Model Calibration

The calibration of our coupled model is a two-step process. First, the hydrologic model is calibrated (step 1), and subsequently, the parameters of the nitrogen model are estimated (step 2). Sequential model calibration allows us to evaluate whether constraining flow pathways in the hydrology model improves biogeochemical predictability. Given that streamflow data are more widely available than nitrate data, sequential calibration also allows us to test the predictive ability of our water quality module when such data are not available. In step 1, we implemented multiple calibration scenarios (detailed in section 2.3.1), identified hydrologically consistent model realizations, and transferred their hydrologic fluxes to the nitrogen model. Subsequently, model parameters were estimated in step 2 considering a number of metrics and signatures quantified based on nitrogen loading, elaborated on in section 2.3.3.

2.3.1 Calibration of Hydrologic Model

To demonstrate the importance of flow partitioning for correct hydrochemical predictions, we consider three calibration scenarios for our hydrologic model where the information extracted from tile data is utilized sequentially. All calibration experiments in step 1 involve a set of criteria that are different in each calibration scenario. The multicriteria optimization algorithm AMALGAM (Vrugt & Robinson, 2007) is used to conduct all calibration experiments and identify optimal parameter sets. Due to the multicriteria approach, the optimization algorithm may yield multiple optimal solutions, called the Pareto solutions in the optimization terminology. These solutions are equally acceptable because there are no other solutions that outperform them against all the underlying criteria. As a result, we use all Pareto model realizations in the process of comparing different calibration scenarios. To reduce the impact of parameter initialization on the optimal results, we run AMALGAM in 10 trials in all three calibration scenarios and store all Pareto solutions obtained in different trials in a pool for further analyses and comparisons.

SCENARIO-1: Model Calibration Using Hydrological Signatures

In SCENARIO-1, we consider the maximization of the NSE metric (Nash & Sutcliffe, 1970) calculated based on streamflow at the catchment outlet as one calibration criterion, that is, aiming to minimize the deviations between observed and simulated discharge time series. Even though such a calibration strategy is commonly used among hydrologists, previous studies have shown that using only goodness-of-fit metrics is not sufficient to reproduce streamflow dynamics (e.g., Gupta et al., 2009; Martinez & Gupta, 2010; Shafii et al., 2017). To improve model identification in this case, three hydrological signatures extracted from streamflow time series are utilized as additional calibration criteria in SCENARIO-1. These signatures (see Table 1) are (1) annual water yield and the high-flow volume in the flow duration curve, that is, the sum of discharges with 0–0.25 probability of exceedance, at (2) monthly and (3) daily time scales. SCENARIO-1 focuses on signatures related to high flows (i.e., 1, 2, and 3 in Table 1) because a large proportion of catchment-scale nitrate export occurs during high flow events (Macrae et al., 2007a).
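
The calibration criteria described above can be sketched in a few lines. The NSE expression is the standard one; the exceedance probability formula (rank divided by n + 1), the sign convention for percent bias, and all function names are assumptions of this sketch rather than details taken from the study.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator

def high_flow_volume(flows, max_exceedance=0.25):
    """Sum of discharges whose exceedance probability is at most max_exceedance.

    Exceedance probability is taken as rank / (n + 1) with rank 1 for the
    largest flow (a plotting-position assumption of this sketch).
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return sum(q for rank, q in enumerate(ranked, start=1)
               if rank / (n + 1) <= max_exceedance)

def percent_bias(observed_value, simulated_value):
    """Percent deviation of a simulated signature from its observed value."""
    return 100.0 * (simulated_value - observed_value) / observed_value

# Hypothetical daily flows.
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sim = [1.2, 1.9, 3.1, 4.2, 4.8, 6.3, 6.9, 7.6]
bias = percent_bias(high_flow_volume(obs), high_flow_volume(sim))
acceptable = -5.0 <= bias <= 5.0  # acceptability range of signatures 1-3
```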

Table 1. Hydrological Signatures Used in Different Calibration Scenarios, the Corresponding References, and Their Acceptability Thresholds

| ID | Hydrological signature [dim.] | Scenario 1 | Scenario 2 | Scenario 3 | Reference | Acceptability range |
|----|-------------------------------|------------|------------|------------|-----------|---------------------|
| 1 | Annual water yield [L³] | x | x | x | Shafii and Tolson (2015) | [−5, 5] %bias between obs. and sim. |
| 2 | Flow duration curve high-flow volume, monthly [L³] | x | x | x | Shafii and Tolson (2015) | [−5, 5] %bias between obs. and sim. |
| 3 | Flow duration curve high-flow volume, daily [L³] | x | x | x | Shafii and Tolson (2015) | [−5, 5] %bias between obs. and sim. |
| 4 | Tile flow contribution in annual water yield [−] |  | x | x | Culley et al. (1983) and Macrae et al. (2007b) | 40%–60% |
| 5 | Average monthly tile flow in February 2001 [L³ T⁻¹] |  |  | x | This study | [32, 51] L/s |
| 6 | Average monthly tile flow in Jul and Aug 2001 [L³ T⁻¹] |  |  | x | This study | [0, 1.3] L/s |

| ID | Nitrate loading signature [dim.] | Scenario 1 | Scenario 2 | Scenario 3 | Reference | Acceptability range |
|----|----------------------------------|------------|------------|------------|-----------|---------------------|
| 1 | Annual export [M] | x | x | x | This study | [−10, 10] %bias between obs. and sim. |
| 2 | Jan to Apr export [M] | x | x | x | This study | [−20, 20] %bias between obs. and sim. |

In SCENARIO-1, we investigate whether the selected hydrological signatures might implicitly support proper flow partitioning and nitrogen modeling, without incorporating any information about tiles. This calibration scenario is inspired by the methodology developed in Shafii and Tolson (2015) and attempts to locate optimal parameter values in the search space. To do so, the optimization algorithm finds all solutions whose simulated signatures are sufficiently close to the observed signatures. In other words, the deviations between observed and simulated signatures for the optimal solutions fall within the acceptability ranges provided in Table 1. Including hydrological signatures as calibration diagnostics improves the ability of the search algorithm to find optimal parameter sets using only streamflow data without flow partitioning metrics.

SCENARIO-2: Signature-Based Model Calibration Using Estimated, Annual Average Tile Flow

SCENARIO-2 is similar to SCENARIO-1, the only difference being that we include an additional signature in the calibration criteria set, namely, the proportion of tile flow contribution to the annual water yield. Our rationale here is that even though continuous tile data are not available in most tile-drained catchments, it may be possible to constrain the proportion of tile discharge in the annual water balance. We explore in SCENARIO-2 whether such understanding is helpful for constraining model-generated flow partitioning and eventually improving the nitrogen model performance. In our study area, we identified an acceptable range for tile discharge contribution as 40%–60% based on Macrae et al. (2007b), who reported a contribution of 42% for the same study basin, and Culley et al. (1983), who observed 60% in a field-scale study in another catchment in southern Ontario. As in SCENARIO-1, a number of Pareto solutions are identified in this scenario.

SCENARIO-3: Constraining Hydrologic Model With Continuous Tile Flow Data

In this calibration scenario, we add extra signatures into the calibration criteria set based on the estimated continuous daily tile discharge at the catchment scale using the regression-based approach explained in section 2.1.1. Our goal is to investigate whether combining such information with previous signatures could further improve model performance. The tile flow upscaling approach based on a linear regression analysis is prone to uncertainties. Therefore, we do not use the estimated daily catchment-scale tile flows to calibrate our model (e.g., we do not use any goodness-of-fit metric, such as NSE, between simulated and observed total tile flow values). Rather, we define two hydrological signatures and use the regression results (i.e., 95% prediction intervals) to determine their acceptability limits. These signatures are monthly tile flow in February 2001 and sum of tile flow in July and August 2001, which are shown along with the corresponding acceptability limits in Table 1. Essentially, the acceptability limits considered for these signatures are the 95% prediction intervals that the regression analysis provides given the representational uncertainty involved. Our approach is consistent with previous research on the identification of limits of acceptability (Beven, 2006; Blazkova & Beven, 2009; Liu et al., 2009).

2.3.2 Identifying Behavioral Hydrologic Models

The hydrologic model calibration process is an automatic sampling routine that, upon visiting many solutions in the search space, yields the Pareto optimal solutions. Because Pareto solutions are not necessarily the best results with respect to all calibration criteria (i.e., it is possible that those solutions perform very poorly against a subset of criteria), it is not logical to transfer all Pareto solutions to the nitrogen model calibration step. As such, we decided to select only those solutions on the Pareto front for which the deviation of all signatures fell within the corresponding acceptability bound. In other words, solutions of interest have to satisfy signatures 1–3, 1–4, and 1–5 in SCENARIO-1, SCENARIO-2, and SCENARIO-3, respectively. However, we noticed that such solutions often did not exist in the pool of Pareto solutions obtained in 10 different trials in each scenario; therefore, we relaxed the acceptability bounds so that a number of solutions could pass the selection criteria. To determine the relaxation factor, we tried several extensions of the acceptability bounds until at least five solutions remained in the resulting set of selected models. Extending the acceptability limits of signature deviation to ±10 for signatures 1–3 and relaxing the acceptability limits by 20% for signatures 4–6 met this criterion. In this way, 7, 12, and 5 model realizations were identified in SCENARIO-1, SCENARIO-2, and SCENARIO-3, respectively. Such solutions are called behavioral solutions hereafter, and their hydrologic fluxes are transferred to the nitrogen model calibration step.
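The selection-with-relaxation step described above can be illustrated with a short Python sketch. It is not the code used in the study: the function names, the dictionary layout, and the list of trial relaxation factors are hypothetical, and a single multiplicative factor is applied to all limits for simplicity, whereas the study relaxed signatures 1–3 and 4–6 differently.

```python
def is_behavioral(deviations, limits, relax=1.0):
    """Return True if every signature deviation lies within its relaxed limit.

    deviations: dict mapping signature name -> signed deviation
    limits:     dict mapping signature name -> acceptability limit (absolute value)
    relax:      multiplicative relaxation factor (assumption: same for all signatures)
    """
    return all(abs(deviations[name]) <= relax * limits[name] for name in limits)


def select_behavioral(solutions, limits, min_count=5,
                      relax_factors=(1.0, 1.1, 1.2, 1.5, 2.0)):
    """Widen the limits step by step until at least min_count solutions pass.

    Returns the relaxation factor that was used and the solutions kept.
    If no factor is large enough, the result for the last factor is returned.
    """
    kept = []
    relax = 1.0
    for relax in relax_factors:
        kept = [s for s in solutions if is_behavioral(s, limits, relax)]
        if len(kept) >= min_count:
            break
    return relax, kept
```

In use, `solutions` would hold one dictionary of signature deviations per Pareto solution, and the returned list would be the set of behavioral models passed to the nitrogen model calibration.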

2.3.3 Estimation of Nitrogen Model Parameters

To evaluate whether constraining flow partitioning also results in a more meaningful reproduction of biogeochemical dynamics (i.e., our main objective), we transfer the hydrologic fluxes associated with behavioral solutions obtained in the previous step to the nitrogen model and test the ability of each solution to simulate biogeochemical fluxes. Subsequently, we calibrate the nitrogen model for each behavioral solution. Figure 3 summarizes our approach to finding the optimal calibration results for the nitrogen model considering a hypothetical case with two hydrologic calibration metrics and two nitrogen calibration criteria (both to be maximized). For each behavioral model (out of the pool of Pareto solutions) identified in the hydrologic model calibration step (shown in the left panel in Figure 3), we run the nitrogen model calibration routine to identify the Pareto optimal parameter sets (i.e., the middle panel in Figure 3). The metrics that we use for this step are the NSE measure calculated based on nitrate loads at the catchment outlet, as well as the two signatures for nitrogen loading (annual nitrate export and January–April nitrate export; the acceptability thresholds considered for these signatures are also shown in Table 1). In this way, the number of nitrogen model calibration trials in each scenario equals the number of behavioral hydrologic models obtained in that scenario (3 in the hypothetical case shown in Figure 3). Once the Pareto solutions are identified in each nitrogen model's calibration trial, all of them are deemed candidate best solutions and are called Level-1 Pareto solutions hereafter. Then, these solutions are sorted again to identify the final Pareto models, which are called Level-2 Pareto solutions in the rest of this paper (right panel in Figure 3). In this way, Level-2 solutions are a subset of Level-1 solutions.

Figure 3. Schematic of our model calibration approach composed of hydrologic model calibration (left panel), nitrogen model calibration resulting in Level-1 Pareto solutions (middle panel), and final Pareto dominance sorting to identify Level-2 Pareto nitrogen model realizations (rightmost panel); the figure involves a hypothetical case with two criteria for both hydrologic and nitrogen model calibration, both of which are to be maximized. The left panel also shows the behavioral zone (hatched area) and the underlying behavioral solutions (black circles) in the hydrology step.
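The two-level sorting can be sketched as follows, assuming (as in the hypothetical case of Figure 3) that all criteria are to be maximized. The function names and the tuple representation of a solution are illustrative choices, not taken from the study's implementation.

```python
def dominates(a, b):
    """True if a is at least as good as b in every criterion and strictly
    better in at least one (all criteria maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points of a list of criterion tuples."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def two_level_pareto(trials):
    """trials: one list of criterion tuples per behavioral hydrologic model.

    Level-1: union of the Pareto fronts found in each calibration trial.
    Level-2: Pareto front of the pooled Level-1 solutions.
    """
    level1 = [p for trial in trials for p in pareto_front(trial)]
    level2 = pareto_front(level1)
    return level1, level2
```

By construction every Level-2 point is drawn from the Level-1 pool, which mirrors the subset relationship stated in the text.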

3 Results

3.1 Hydrological Predictability

3.1.1 Hydrological Model Performance

Model calibration is carried out for the three aforementioned scenarios (SCENARIO-1, SCENARIO-2, and SCENARIO-3) for 10 trials of 1,000 simulations each (i.e., 10,000-simulation computational budget). We followed the behavioral solution selection process described in section 2.3.2 and identified behavioral solutions in each scenario, taking into account all samples taken by our search algorithm in the calibration of the hydrologic model in the different scenarios. NSE values associated with behavioral solutions were in the range 0.65–0.7, 0.7–0.75, and 0.64–0.69 in SCENARIO-1, SCENARIO-2, and SCENARIO-3, respectively. The validation NSE values were in the range 0.41–0.59, 0.55–0.63, and 0.59–0.61 in SCENARIO-1, SCENARIO-2, and SCENARIO-3, respectively. NSE values indicate that SCENARIO-2 outperforms the other two scenarios in the calibration period; however, NSE values for scenarios 2 and 3 are nearly identical in the validation period. As pointed out earlier, all behavioral solutions fell within relaxed acceptability thresholds, meaning that the deviations for signatures 1–3 are similar among all scenarios. However, signatures 4–6 were different for each scenario. Figure 4 illustrates the values of signatures 4–6 associated with the behavioral solutions obtained in each scenario during the calibration period. Also shown in Figure 4 are the extended acceptability levels considered for identifying behavioral solutions. It is observed in Figure 4 that SCENARIO-2 and SCENARIO-3 satisfy signatures 4 and 5; however, only SCENARIO-3 is able to satisfy signature 6 (summer tile flow). In other words, even though similar NSE values have been obtained in different calibration scenarios, model performance looks different across scenarios when signatures are taken into account. Such variations in model performance will have implications with respect to biogeochemical predictability, which will be investigated in section 3.2.

Figure 4. Range of values of signatures 4–6 (Table 1) associated with behavioral solutions (vertical bars) obtained in calibration scenarios SC1, SC2, and SC3 in the calibration period. Horizontal lines represent extended limits of acceptability for identifying behavioral solutions.

3.1.2 Hydrologic Partitioning

There are three lateral flow pathways in our hydrologic model, namely, overland runoff, tile flow, and base flow (Figure 2). Figures 5a–5c demonstrate the temporal variability of the contribution of tile flow to the annual water yield at the monthly scale (between January and August 2001) obtained in the three calibration scenarios for the hydrologic model. We calculated the observed tile flow contribution and the corresponding 95% prediction intervals (shown in Figure 5) using streamflow measurements at the catchment outlet and regression-based tile flow estimates. Simulated points illustrated in Figure 5 are behavioral solutions obtained in each scenario. It is observed that when flow partitioning is added into the calibration criteria set (i.e., scenario 3), the simulated range of tile flow contribution demonstrates a higher level of consistency with the observed bound compared with those given by SCENARIO-1 and SCENARIO-2. For instance, in SCENARIO-1, there are a number of solutions with no tile flow at all, which is completely inconsistent with reality. However, unless flow partitioning signatures are utilized in model calibration, it is likely that such solutions end up in the behavioral solutions set. A signature based on an approximate tile flow contribution (i.e., SCENARIO-2) was not sufficient to reproduce tile flow dynamics either, as simulated points fell outside of the prediction intervals of the regression model (middle panel in Figure 5), especially in summer months when there is generally no tile flow in the basin. Overall, flow partitioning reveals that SCENARIO-3 is the most appropriate calibration scenario. In the next section we compare the three scenarios from the nitrogen dynamics perspective.

Figure 5. Temporal variability of tile flow contribution in the total water yield; observed (solid triangles) and 95% prediction intervals (hatched area) are obtained from regression; simulated values are all behavioral solutions obtained in each calibration scenario.

3.2 Biogeochemical Predictability

3.2.1 Nitrogen Model Performance: Nash Sutcliffe Efficiency

The nitrogen model is run using the hydrologic fluxes from all the behavioral solutions obtained in the three scenarios described in section 2.3.1. We then estimate the parameters of the nitrogen model as described previously in section 2.3.3 and shown in Figure 3. Upon running the optimization algorithm for calibrating the nitrogen model, there were 14, 43, and 11 Level-1 Pareto solutions found in scenarios 1 to 3, respectively. Figure 6 illustrates the observed nitrate loadings and simulations generated by Level-1 Pareto solutions during the calibration (left panel) and validation (right panel) period for the three calibration scenarios. Despite some differences in peak simulations, the ensemble of simulated time series is generally similar. Therefore, graphical comparison among the three calibration scenarios is not straightforward, and a number of quantitative quality metrics are needed for a sounder comparison. In the rest of this section, we will focus on such quantities.

Figure 6. Nitrate loading observations (solid circles) and Level-1 Pareto simulations (triangles) obtained during calibration (left panel) and validation (right panel) for the three calibration scenarios (rows).

Nitrate loading NSE values obtained in the calibration and validation periods for all Level-1 Pareto solutions are shown in Figure 7. For the calibration period, the NSE values for nitrate load are comparable among all scenarios, although SCENARIO-2 yields slightly higher NSE values. This is in contrast to the validation period, when both SCENARIO-2 and SCENARIO-3 give distinctly higher NSE values compared to SCENARIO-1. The superior NSE values for the nitrate model in SCENARIO-2 and SCENARIO-3 highlight the importance of correct characterization of flow pathways when hydrologic fluxes are subsequently used to drive biogeochemical fluxes. It is also observed in Figure 7 that uncertainty in model performance is reduced for SCENARIO-2 in the calibration period and for SCENARIO-3 in the validation period. In other words, when only hydrological signatures are used to select the best hydrologic models (SCENARIO-1), with no constraints on flow partitioning, it is likely that the selected model realizations with high hydrologic performance do not perform as well with respect to solute transport, especially in an independent time frame (i.e., the validation period).

Figure 7. Nitrate loading NSE mean and range (vertical bar) and individual values (solid circles) associated with all Level-1 Pareto solutions obtained in the nitrogen model calibration experiment following three hydrologic model calibration scenarios.
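For reference, the Nash-Sutcliffe efficiency used throughout this section follows the standard definition, NSE = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))². The Python sketch below implements that textbook formula; the function name and the plain-list inputs are illustrative and are not taken from the study's code.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency of a simulated series against observations.

    Returns 1.0 for a perfect fit and 0.0 when the simulation is no better
    than the mean of the observations; values can be negative.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - residual / variance
```

Applied separately to the calibration and validation periods of a daily nitrate load series, this yields the kind of paired values compared across scenarios in Figure 7.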

3.2.2 Model Transferability

We assessed model transferability (i) from hydrology to nitrogen modeling and (ii) through time from the calibration to validation period. To do so, we calculated and plotted daily streamflow and nitrate loading NSE values (Figure 8a), as well as daily nitrate loading NSE values for the calibration and validation period (Figure 8b). Note that it is desired for points in Figure 8 to fall in the top right corner, which indicates low deterioration of model performance through cross-constituent and temporal transfer. In general, for a model realization to be transferable, the coordinate values of the points shown in Figure 8 need to be simultaneously as close as possible to (1, 1). Figure 8a shows the results for all Level-1 Pareto solutions. Overall, inclusion of the tile flow signatures (scenarios 2 and 3) led to higher NSE values for both streamflow and nitrate loading. Notably, SCENARIO-2, despite not being able to capture the temporal dynamics of flow partitioning adequately (Figure 5), resulted in higher NSE values compared to SCENARIO-3. This finding indicates that if only NSE is considered for model performance assessment, a model may achieve the right answers (i.e., high NSE) for the wrong reasons (i.e., tile flows obtained in SCENARIO-2 were not consistent with reality, as shown in Figure 5). Thus, the identification of the best scenario requires more evaluation metrics to be taken into consideration.

Figure 8. Streamflow and nitrate loading NSE in calibration period for Level-1 Pareto solutions (a), and nitrate loading NSE in calibration and validation period for Level-1 and Level-2 Pareto solutions (b). Also shown are the boundaries encapsulating all Level-1 Pareto solutions in different scenarios, and the 1:1 line; the ideal region is the top right corner.

Figure 8b demonstrates nitrate loading NSE values obtained during the calibration (x axis) and the validation (y axis) period for both Level-1 and Level-2 Pareto solutions. Among all Level-1 Pareto solutions, 2, 3, and 2 solutions were identified as Level-2 in scenarios 1, 2, and 3, respectively. Similar to previous observations, SCENARIO-2 yields the most transferable solutions, followed by SCENARIO-3 and SCENARIO-1, for both Level-1 and Level-2 solutions. The above analysis indicates that accounting for flow partitioning signatures in the calibration of our hydrologic model, even with only an approximate understanding of the percent tile flow contribution, consistently improves biogeochemical predictability. The finding of enhanced model transferability for SCENARIO-2 compared to SCENARIO-3 (the latter being more robust based on flow partitioning) is somewhat counterintuitive. To further evaluate this finding with respect to metrics beyond the goodness of fit, we investigate (in the next section) the ability of the model to capture seasonal nitrate loadings.

3.2.3 Nitrate Loading Signatures

Figure 9a illustrates the range of deviations in the signature values for Level-1 (vertical bars) and Level-2 (solid circles) Pareto solutions obtained by running the nitrate model using the hydrologic fluxes generated in the different calibration scenarios (using behavioral models). In general, nitrate signatures associated with Level-1 and Level-2 Pareto solutions fall below the 20% acceptability threshold in all three scenarios during the calibration period. However, in the validation period, it is only in SCENARIO-3 that all Level-2 Pareto solutions fall below the acceptability limit for both the annual and the seasonal nitrate signatures. In other words, the signature values for Level-2 Pareto solutions in the validation period collectively indicate that SCENARIO-3 systematically outperforms the other two scenarios. It was shown earlier that SCENARIO-2 had resulted in higher NSE values in comparison to the other two scenarios; however, nitrate signatures in Figure 9 reveal that appropriate flow partitioning using measured tile flow data leads to the most biogeochemically robust model.

Figure 9. Nitrate signature deviations associated with Level-1 (vertical bars) and Level-2 (solid circles) Pareto solutions in calibration (a) and validation (b) period.

4 Discussion

The research involved the development of a coupled hydrology-nitrogen model for an agricultural catchment, and the model runs at the HRU scale. The study basin is a heterogeneous mix of corn, soy, and winter wheat, with some alfalfa or hay. However, we do not know which fields were in which crops over the study period. Therefore, we assigned an average nitrogen fertilizer amount of 85 kg/ha to all agricultural HRUs within the basin. It was our expectation that the effects of spatial heterogeneity in crops on nitrogen dynamics at the catchment scale would be marginal. Moreover, a number of calibration parameters of the nitrogen model, such as the mineralization rate constant, are considered constant spatially because we did not have sufficient field data to differentiate across fields. Thus, rate constants such as that of mineralization represent spatially averaged effective rates. These simplifications are justified by the fact that our spatially simplified model proves capable of properly capturing the temporal nitrate dynamics at the catchment scale. This finding is consistent with previous studies in similar agricultural catchments where the hydrological response of highly engineered hydrologic systems could be captured by simplified models (Basu, Rao, et al., 2010).

Our approach to calibrating the model is both to consider goodness-of-fit metrics (between simulated and observed streamflow time series) and to utilize flow partitioning signatures that are quantified based on the flow pathway data. It has been argued that NSE is not always the best measure of a model's performance (Gupta et al., 2009), and our analysis (e.g., section 3.2) confirms that a model can indeed have a higher NSE but perform more poorly on other metrics. Multiple previous studies advocated for the evaluation of model performance against not a single, but a range of performance measures for improving the process of scrutinizing acceptable models (Beven, 2006; Gupta et al., 2008, 2009; Martinez & Gupta, 2010). Our results demonstrate that involving partitioning signatures based on tile data is one way of conducting a multicriteria model assessment. Constraining flow partitioning could also be carried out using other readily available hydrological data such as soil moisture (Wanders et al., 2014), total water storage data (Rakovec et al., 2016), crop yield (Malagó et al., 2017), and vegetation indices such as the normalized difference vegetation index (Narasimhan et al., 2005) obtained from remote sensing. Using such data for model calibration has been promising for improving the estimation of the parameters involved in the description of base flow and evapotranspiration. Future work would involve the simultaneous use of several alternate data sets to increase the consistency of hydrologic models.

In the multicriteria model calibration approach developed in our study, model parameters are adjusted sequentially, that is, first streamflow and then nitrate fluxes are calibrated. If streamflow and nitrogen data are both available in a catchment, it is often the practice to calibrate them simultaneously. We conducted a simultaneous calibration experiment in our study (results not shown here), and observed that once both streamflow and nitrate data were used, the differences in model performance among the different calibration scenarios were indeed less significant. This is possibly because the information contained in the catchment-scale nitrate data guides the search algorithm naturally towards the optimal region in parameter space that partitions the flow correctly between the surface and the subsurface pathways. However, nitrate data are generally available at far fewer locations in the watershed than streamflow data. This study shows that tile flow measurements (even coarse upscaled estimates) utilized by the proposed signature-based model calibration approach provide an additional source of information for the development of biogeochemical models when sufficient water quality data are not available.

It is interesting to note that using any available information on flow pathways (e.g., approximate estimates, sporadic sampling, and/or continuous measurements of tile flow) enhances model identification (e.g., SCENARIO-2 model calibration in our study). This is an important finding as it suggests the possibility of using such constraints even in cases where detailed information on tile flows is not available. Tile data have previously been used to constrain streamflow simulations by physically-based models at small scales (Hansen et al., 2013; Rozemeijer et al., 2010; Vrugt et al., 2004); the main challenge, however, is to upscale such an approach to larger scales, where the trade-off between detail in the model parameterization and computational cost becomes important. Our finding that models can be constrained using only sporadic measurements of tile flow makes it easier to upscale these approaches from the small watershed to the regional scale. Our study catchment is small enough that a physically-based numerical model could probably have represented tile flows adequately at reasonable computational cost. However, to overcome the computational requirements of these models at larger regional scales, methodologies similar to the one outlined in our study are expected to be useful.

It is shown in this research that using flow partitioning signatures in the calibration process results in more appropriate model response during the validation period (e.g., results obtained in SCENARIO-2 and SCENARIO-3 in comparison to those obtained in SCENARIO-1). This improved biogeochemical predictability is achieved through an increase in parameter adequacy, rather than modifications in the model structure (because we only consider one model structure). It has been previously argued that the deterioration of model performance when going from a calibration period to any independent validation period is potentially due to deficiency in model structure (Martinez & Gupta, 2010). Such a structural deficiency could be attributed to lack of complexity (i.e., an insufficient set of processes, parameters, and storage compartments) and/or to mal-description of processes (i.e., when key processes are not represented or the mathematical descriptions of processes are inadequate). Here we show that improvements in the representation of flow pathways by choosing more appropriate parameter values, rather than more complex model structures, reduce the probability of model performance deterioration in the prediction mode, both for hydrology and nitrogen dynamics. Future work can involve testing the effects of both alternate model structures and parameter adequacy on flow partitioning signatures. The ideal approach to watershed model development is certainly finding the balance between adequate parameterization (based on a set of performance and partitioning measures) and reasonable model structure in terms of represented processes and how different storage units are connected to each other. Nevertheless, our findings highlight the importance of collecting data on flow paths (e.g., tile flow in our case study) prior to the development of biogeochemical models under such a balanced approach.

5 Summary

The main objective of this research was to demonstrate that proper constraining of flow partitioning in a hydrologic model enhances the ability of the model to predict biogeochemical fluxes. Specifically, we use as a case study a 2.7-km² agricultural watershed in southern Ontario and show through a series of numerical experiments how incorporation of tile flow signatures, in addition to streamflow data, increases the ability of the model to predict nitrate loads at the catchment outlet. Three calibration scenarios were created. The first scenario used continuous streamflow data and hydrologic signatures based on streamflow. The second scenario considered everything in the first scenario and a criterion based on annual tile flow as a proportion of annual water yield. This criterion was defined based on previous studies in southern Ontario that provided estimates for tile discharge. In the third scenario, two signatures were added to the set of calibration criteria, which were defined based on tile flow measurements in the study area. A novel regression methodology was developed to upscale continuous tile flow data at a single tile in the catchment, in conjunction with sparse monitoring data at all seven tiles in the catchment. Catchment-scale tile flow was estimated using regression analysis where tile discharge measurements at a subset of the entire tile network were upscaled to determine tile flow at the basin level. The three scenarios were used to generate a set of equally acceptable behavioral solutions, and the hydrologic fluxes from these solutions were used to run the biogeochemical model. The biogeochemical model comprised three pools of nitrogen (one organic and one nitrate pool in the soil above tiles and one nitrate pool in the soil below tiles), and simulated catchment-scale nitrogen dynamics represented in terms of organic and nitrate loadings to the stream. The biogeochemical model was evaluated using goodness of fit between simulated and observed nitrate loads, as well as two nitrate signatures at the catchment outlet.

Results revealed that as we incrementally incorporated more information from the tile flow dataset into the calibration process, both streamflow and nitrate loads were reproduced more precisely. When no information on tiles was included in model calibration, the resulting model response significantly underestimated nitrate export despite acceptable reproduction of streamflow. One main reason for this poor performance was that a number of behavioral solutions in this scenario did not generate any tile flow but reproduced the streamflow dynamics fairly satisfactorily. Because tile discharge was not a factor in selecting behavioral models in this scenario, such solutions were selected. On the other hand, when the hydrologic model was constrained against tile flow data using relevant signatures, both streamflow and nitrate predictions were improved and tile discharge was consistent with reality. However, when average annual catchment-scale tile flow was estimated based on previous reports without any consideration of temporal variability (i.e., the second scenario), despite higher goodness-of-fit metrics, model outcome was inconsistent with reality. For example, significant tile flow was generated by the model during summer when tiles were known to be inactive. On the other hand, accounting for tile flow seasonality based on actual observations in the field (i.e., the third scenario) resulted in more realistic simulations, as well as more appropriate nitrate predictions in the validation period as quantified by nitrate loading signatures.

The difference in model performance among calibration scenarios with and without hydrologic partitioning signatures highlights the importance of tile flow as a major pathway for nitrate transport, which requires modelers to collect some information (e.g., similar to our second and third calibration scenarios) for constraining tile discharge in their models. Even if tile discharge data are obtained for only a subset of tiles in the catchment, the upscaling approach developed in this study can be used to appropriately incorporate the resulting data to constrain solute transport models.


This research was supported with funding from NSERC Strategic Partnership grant (STPGP-447692-2013) and also with funding from Canada Excellence Research Chair in Ecohydrology in the Department of Earth and Environmental Sciences at University of Waterloo. The authors would also like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC), Innovation Fund Denmark (IFD), Swedish Research Council (FORMAS), and Fundação para a Ciência e a Tecnologia (FCT), for funding, in the frame of the collaborative international consortium (LEAP) financed under the ERA-NET Cofund WaterWorks2015 Call. This ERA-NET is an integral part of the 2016 Joint Activities developed by the Water Challenges for a Changing World Joint Programme Initiative (Water JPI). The data used in this research for the development of the hydrology and nitrogen models can be found in two published theses, (i) https://scholars.wlu.ca/etd/492/ and (ii) https://scholars.wlu.ca/geog_etd/1/, and are also available by request from the corresponding author [email protected].

    Appendix A

    This appendix provides the equations implemented in the hydrologic part of the coupled model developed in this study. Table A1 defines different variables (of time or space) and calibration parameters (with their prior ranges) used in the equations in this appendix. The values associated with variables shown in Table A1 are spatially different over each HRU and are determined based on soil and climatic data. However, parameter values are constant over all HRUs. Hydrologic fluxes (such as infiltration and percolation) are not shown in Table A1 and are detailed only in the text. Note that mm/d is the unit for all hydrologic fluxes, which include infiltration Minf, the combination of excess rainfall and snowmelt R, overland flow Moverland, potential snowmelt Mmelt,TI, soil evaporation from the upper and lower compartments, percolation Mperc, tile flow Mtile, and base flow Mbase.

    Table A1. Variables and Calibration Parameters of the Hydrologic Model

    | Quantity | Description | Unit | Prior range^a |
    |---|---|---|---|
    | Variables: f(t/x)^b | | | |
    | PET(x,t) | Potential evapotranspiration rate | [mm/day] | N/A |
    | φsoil(x,t) | Soil compartment water content | [mm] | N/A |
    | φmax(x) = Hn | Maximum soil storage | [mm] | N/A |
    | HU | Upper soil compartment thickness | [mm] | 1000 |
    | n(x) | Porosity | [−] | N/A |
    | φtens(x) = φFC(x) − φWP(x) | Tension storage | [mm] | N/A |
    | φFC(x) | Storage at field capacity | [mm] | N/A |
    | φWP(x) | Storage at wilting point | [mm] | N/A |
    | Supper(x,t) | Water storage in upper soil compartment | [mm] | N/A |
    | Slower(x,t) | Water storage in lower soil compartment | [mm] | N/A |
    | Ta(x,t) | Average daily air temperature | [°C] | N/A |
    | Parameters^c | | | |
    | Mp,max | Maximum percolation rate | [mm/day] | [0.1–30] |
    | Mi,max | Maximum tile flow rate | [mm/day] | [0.01–1.0] |
    | HL | Lower soil compartment thickness | [mm] | [100–500] |
    | k | Base flow coefficient | [1/day] | [0.001–0.5] |
    | β | Infiltration parameter | [−] | [1.5–2.5] |
    | Srain | Rain canopy storage | [mm] | [1–10] |
    | Ssnow | Snow canopy storage | [mm] | [1–10] |
    | CPET | PET correction factor | [−] | [0.5–1.5] |
    | α | Precipitation partitioning coefficient | [−] | [0–1] |
    | ΔT | Rain-snow mixture temperature range | [°C] | [0.0–3.0] |
    | Ma | DD factor | [mm/day/°C] | [2.0–5.0] |

    • Note. Notation t and x are removed from the equations provided below.
    • a Brackets show the range used in calibration, and nonbracketed numbers are either fixed values or noncalibration variables.
    • b t: time, x: space (that is, over the model's computational units or HRUs); when not present, the symbol represents a constant.
    • c Constant over space and time, estimated via calibration.

    The hydrologic model is formulated based on the mass balance in two soil compartments (upper and lower compartments shown in Figure 2) and a surface water storage from which water is routed to the basin outlet. Water balance equations are formulated for the upper (Supper) and lower (Slower) soil compartments in terms of the fluxes described below. Minf is the infiltration rate, calculated from R, the combination of excess rainfall and snowmelt; φmax, the maximum soil storage; φsoil, the soil water content in the upper soil compartment; and β, a coefficient. Overland flow (Moverland) to the surface water storage is calculated as Moverland = R − Minf. To calculate snowmelt, total precipitation is first partitioned into rain and snow by calculating α, the proportion of snow, as α = 0.5 − Ta/ΔT, where Ta is the average air temperature (°C) and ΔT is a temperature range (°C); α varies from 1 to 0 as temperature changes from −ΔT/2 to ΔT/2. The potential snowmelt rate Mmelt,TI is calculated as Mmelt,TI = Ma · max(Ta, 0), where Ma is the degree-day (DD) factor.
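The snow partitioning and degree-day relations above translate directly into code, as the hedged Python sketch below shows. Two parts are assumptions rather than content of this appendix: α is clipped to the interval [0, 1] outside the mixing range, and the infiltration function uses an HBV-type form, R · (1 − (φsoil/φmax)^β), chosen only because it is consistent with the listed inputs. All function names are hypothetical.

```python
def snow_fraction(t_air, delta_t):
    """Proportion of precipitation falling as snow: alpha = 0.5 - Ta / dT.

    ASSUMPTION: alpha is clipped to [0, 1] outside the range -dT/2 .. dT/2,
    and a zero-width range is treated as a sharp threshold at 0 degC.
    """
    if delta_t <= 0.0:
        return 1.0 if t_air <= 0.0 else 0.0
    return min(1.0, max(0.0, 0.5 - t_air / delta_t))


def potential_melt(t_air, dd_factor):
    """Potential snowmelt rate [mm/day]: Ma * max(Ta, 0)."""
    return dd_factor * max(t_air, 0.0)


def infiltration_hbv(r, phi_soil, phi_max, beta):
    """ASSUMED HBV-type infiltration [mm/day]: R * (1 - (phi_soil/phi_max)**beta).

    This functional form is an illustration only; it is not confirmed by the
    text of this appendix.
    """
    saturation = min(1.0, max(0.0, phi_soil / phi_max))
    return r * (1.0 - saturation ** beta)


def overland_flow(r, m_inf):
    """Overland flow [mm/day]: R - Minf."""
    return r - m_inf
```

With ΔT = 2 °C, for example, precipitation at 0 °C is split evenly between rain and snow, while at −1 °C and below it is all snow.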

    Excess rainfall is the remainder of precipitation after subtracting snow and canopy interception (5% of rainfall up to a maximum value, i.e., Srain and Ssnow). Note that water stored in the canopy evaporates at the PET rate, which is estimated using the formulation of Hargreaves and Samani (1985); see the RAVEN user manual (Craig, 2017) for more information. PET is also adjusted using a constant correction factor (CPET), which is a calibration parameter.

    Soil evaporation is linearly proportional to soil saturation and is distributed between the upper and lower soil compartments according to root fraction. If soil moisture exceeds the tension storage, soil evaporation from the upper (urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0008) and lower (urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0009) compartments is calculated as urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0010 and urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0011, respectively, where φtens is the tension storage, and U and L refer to the upper and lower soil compartments, respectively. ζU and ζL are fixed at 0.7 and 0.3, respectively. Soil evaporation can deplete the storage down to the wilting point storage, but no further.

    Percolation (Mperc) becomes active when soil moisture in the upper compartment exceeds the storage at field capacity and is calculated using a threshold-driven linear model urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0012 where Mp,max is the maximum percolation rate, φFC is the storage at field capacity, and U refers to the upper soil compartment. Similar to percolation, tile flow (Mtile) also becomes active when the upper soil compartment holds more water than the storage at field capacity and is calculated as urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0013 where Mi,max is the maximum tile flow rate. Base flow (Mbase) is calculated as urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0014 where k is the base flow coefficient and urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0015 is the soil water content in the lower compartment. During each time step, the overland, tile, and base flow components are stored in the surface water, which is then routed towards the watershed outlet. Since the time of concentration calculated using the empirical formulations in Fang et al. (2008) is less than a day in the catchment, water generated in each time step can be transferred to the outlet on the same day.

    Appendix B

    This appendix provides the equations implemented in the nitrogen mass balance module of the coupled model. Table B1 lists the variables and calibration parameters (and their prior ranges) associated with the nitrate module, which are also used in the equations provided in this appendix. Because detailed field survey data were unavailable, we estimated a constant annual fertilizer amount from previous modeling work in the Grand River watershed by Liu et al. (2016), who reported time-dependent fertilizer application based on the literature. Fertilizer is applied only to agricultural HRUs (i.e., no fertilizer is considered in forest HRUs), and Fo and Fi are distributed uniformly over the first two weeks of May in each year (i.e., the values shown in Table B1 are used to calculate the daily amount of fertilizer in those two weeks). Atmospheric deposition and biological fixation are based on the study by Zhang (2016) and are uniformly distributed over the time frames given in Table B1 (i.e., the values shown for BF and Da in Table B1 are used to calculate the daily amounts of fixation and deposition). Da is applied to all HRUs, whereas BF is applied only to agricultural HRUs.
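As an illustration of how an annual Table B1 value translates into a daily input, the sketch below spreads an annual amount uniformly over the application window. It assumes that "the first two weeks of May" means 1 to 14 May; the function names and the `is_agricultural` flag are hypothetical.

```python
from datetime import date, timedelta


def daily_may_application(annual_rate, day, is_agricultural=True, n_days=14):
    """Daily input [kg/ha/day] from an annual amount [kg/ha/yr] applied
    uniformly over the first n_days of May (assumed window: 1-14 May)."""
    if not is_agricultural:
        return 0.0  # no fertilizer on non-agricultural (e.g., forest) HRUs
    if day.month == 5 and 1 <= day.day <= n_days:
        return annual_rate / n_days
    return 0.0


def annual_total(annual_rate, year):
    """Sum the daily inputs over one calendar year (sanity check)."""
    start = date(year, 1, 1)
    n = (date(year + 1, 1, 1) - start).days
    return sum(daily_may_application(annual_rate, start + timedelta(days=i))
               for i in range(n))
```

For the inorganic fertilizer value of 85 kg/ha/yr, this gives 85/14 ≈ 6.07 kg/ha/day on each of the 14 application days and zero otherwise, so the annual total is preserved.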

    Table B1. Variables and Calibration Parameters of the Nitrogen Model
    Quantity Description Unit Prior rangea
    Variables: f(t/x)b
    Worg(t,x) Nitrogen mass in the organic pool in the upper zone [kg] N/A
    Wmin(t,x) Nitrate mass in the mineral pool in the upper zone [kg] N/A
    Wsat,a(t,x) Nitrate mass in the active pool in the lower zone [kg] N/A
    Wsat,p(t,x) Nitrate mass in the passive pool in the lower zone [kg] N/A
    Fo(t) Organic fertilizer applied in the first 2 weeks of May [kg/ha/yr] 10
    Fi(t) Inorganic fertilizer applied in the first 2 weeks of May [kg/ha/yr] 85
    BF(t) Biological fixation during growing season [kg/ha/yr] 43
    Da(t) Atmospheric deposition [kg/ha/yr] 10
    Rc(t) Crop residue at the end of September [kg/ha/yr] 100
    RZ(t) Storage in the upper compartment [mm] N/A
    Slower(t) Storage in the lower compartment [mm] N/A
    km Mineralization rate constant [1/day] [1e−4–1e−3]
    kd Denitrification rate constant [1/day] [0.002–0.005]
    pm Partial mixing coefficient [−] [0.5–1]
    Spassive Water content in the passive storage [mm] [5,000–25,000]
    • Note. Notation t and x are removed from the equations provided below.
    • a Brackets show the range used in calibration, and nonbracketed numbers are either fixed values or noncalibration variables.
    • b t: time, x: space (i.e., HRU types); when not present, the symbol represents a constant.
    • c Constant over space and time, estimated via calibration.
    The nitrate model is formulated based on the mass balances of three nitrogen pools: (i) the organic pool and (ii) the mineral pool in the upper zone, and (iii) the mineral pool in the lower zone. The mass balance equations and the mathematical representation of the processes involved are provided in the rest of this appendix.
    1. Organic pool in the upper zone:
    where Worg is the nitrogen mass in the organic pool in the upper zone and M is mineralization, calculated as M = kmWorg, where km is the mineralization rate constant. Fo, BF, and Rc are the organic fertilizer, biological fixation, and crop residue terms shown in Table B1. The initial value of Worg is estimated from a surface density of 44.35 g/m2. To account for the impact of temperature, mineralization does not occur if the temperature is below 0 °C.
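The organic pool balance described above can be sketched as follows. This is a hedged illustration: the explicit daily (Euler) update, the function names, and the treatment of Fo, BF, and Rc as additive inputs with M as the only loss are assumptions drawn from the term definitions, not a reproduction of the model code.

```python
def surface_density_to_kg_per_ha(g_per_m2):
    """Convert a surface density from g/m2 to kg/ha (1 g/m2 = 10 kg/ha)."""
    return g_per_m2 * 10.0


def mineralization(w_org, k_m, t_air):
    """First-order mineralization M = km * Worg; switched off below 0 degC."""
    if t_air < 0.0:
        return 0.0
    return k_m * w_org


def step_organic_pool(w_org, f_o, bf, r_c, k_m, t_air, dt=1.0):
    """Advance the organic pool by one step (explicit Euler, an assumption).

    Inputs f_o, bf, r_c are daily rates; returns (new pool, mineralization).
    """
    m = mineralization(w_org, k_m, t_air)
    return w_org + dt * (f_o + bf + r_c - m), m
```

Under these assumptions, the initial surface density of 44.35 g/m2 corresponds to about 443.5 kg/ha, and with km = 1e−3 per day roughly 0.44 kg/ha is mineralized on a day with above-zero temperature.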
    2. Mineral pool in the upper zone:
    where Wmin is the nitrogen mass in the mineral pool in the upper zone, Fi and Da are inorganic fertilizer and atmospheric deposition, respectively (see Table B1), M is mineralization, DNrz is denitrification (equation B3), Urz is plant uptake from the upper zone (equation B4), NLeach is leaching to the lower zone (equation B5), and NT is nitrate export via tile flow (equation B6):
    k1 and k2 are correction factors for temperature and soil saturation, respectively. Temperature is an important factor controlling denitrification in soil (Dawson & Murphy, 1972; Hill et al., 2000; Knowles, 1981). We incorporated this effect through the correction factor k1, a function of temperature similar to that used by Pohlert et al. (2007) and Stockle and Campbell (1989). Specifically, k1 increases linearly from 0 (when the temperature is 0 °C) to 1 (when the temperature is 10 °C) and remains at 1 as the temperature increases further. Moreover, as denitrification requires anaerobic conditions, its rate is highest when the soil is saturated or close to saturation: k2 accounts for this control and varies linearly from 0 at soil water content φFC to 1 at full saturation (i.e., soil water storage φmax). This saturation control means that denitrification does not occur when the soil water content is below field capacity. The initial value of Wmin is estimated from a surface density of 1.64 g/m2.
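The two correction factors are simple clipped linear ramps and can be sketched as below. The ramps follow the description above; the first-order form DNrz = k1·k2·kd·Wmin used in `denitrification` is an assumption (the text does not reproduce equation B3), and all function names are hypothetical.

```python
def temperature_factor(t_air, t_full=10.0):
    """k1: linear from 0 at 0 degC to 1 at t_full (10 degC), constant above."""
    return min(1.0, max(0.0, t_air / t_full))


def saturation_factor(storage, s_fc, s_max):
    """k2: linear from 0 at field capacity (s_fc) to 1 at saturation (s_max)."""
    if s_max <= s_fc:
        return 0.0  # assumed guard against a degenerate storage range
    return min(1.0, max(0.0, (storage - s_fc) / (s_max - s_fc)))


def denitrification(w_min, k_d, t_air, storage, s_fc, s_max):
    """Assumed first-order form: k1 * k2 * kd * Wmin."""
    k1 = temperature_factor(t_air)
    k2 = saturation_factor(storage, s_fc, s_max)
    return k1 * k2 * k_d * w_min
```

At 5 °C with storage halfway between field capacity and saturation, both factors equal 0.5, so the effective rate constant is one quarter of kd.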
    The nitrogen processes formulated in equations (B4–B6) are coupled with the underlying hydrologic fluxes and water storages. For example, plant uptake from the upper zone (Urz) is calculated as the mineral nitrogen pool in the upper zone (Wmin) multiplied by the ratio between soil evaporation from the upper soil compartment (urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0022) and water storage in the upper compartment (RZ). This ratio determines the proportion of nitrogen mass that is transported via plant uptake. The same approach to mass transport is used in the formulation of the other nitrogen processes. Accordingly, leaching from the upper zone (NLeach) is determined from the same nitrogen pool (Wmin), percolation (Mperc), and RZ. Tile flow from the upper soil compartment (Mtile) is likewise used to estimate nitrate export from tiles (NT).
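The flux-to-storage scaling described above can be expressed with one helper that is reused for uptake, leaching, and tile export. This is a sketch under stated assumptions: the text gives the ratio form explicitly only for plant uptake, so applying the identical form to leaching and tile export, capping the exported fraction at 1, and the names used here are all assumptions.

```python
def flux_proportional_mass(mass, water_flux, water_storage):
    """Mass moved by a water flux: mass * (flux / storage).

    The fraction is capped at 1 so that no more than the available
    mass can leave the pool in one step (an assumption).
    """
    if water_storage <= 0.0 or water_flux <= 0.0:
        return 0.0
    return mass * min(1.0, water_flux / water_storage)


def upper_zone_exports(w_min, evap_upper, m_perc, m_tile, rz):
    """Return (U_rz, N_leach, N_T) from the upper-zone mineral pool."""
    uptake = flux_proportional_mass(w_min, evap_upper, rz)
    leach = flux_proportional_mass(w_min, m_perc, rz)
    tile = flux_proportional_mass(w_min, m_tile, rz)
    return uptake, leach, tile
```

For instance, with Wmin = 20 kg, RZ = 100 mm, and a percolation flux of 2 mm/day, 2% of the pool (0.4 kg) leaches in that day.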
    3. Nitrate pool in the lower zone:
    From the nitrogen perspective, the lower zone is composed of two nitrogen pools, the active and passive pool. As detailed below, leaching from the upper zone is split between these two pools, but they have water and solute exchange with one another as well. Mass balance in the active pool in the lower zone is formulated as:
    where Wsat,a is the nitrogen mass in the active pool. Denitrification (DNsat,a), plant uptake (Usat,a), leaching from the upper zone to the active pool (NLeach,a), and base flow nitrate export (NB), all in the active pool, are calculated as follows:
    k1 and k2 are estimated as previously described for the upper zone. Usat,a is calculated based on Wsat,a, evaporation from the lower compartment (urn:x-wiley:00431397:media:wrcr23894:wrcr23894-math-0028), and water content in the lower compartment (Slower). NB is calculated based on the amount of base flow (Mbase), as well as the ratio between Wsat,a and Slower. Leaching from the upper zone to the active pool (NLeach,a) is controlled by the partial mixing coefficient pm, which partitions percolation between direct recharge of the active pool and exchange with the passive pool. The last term in equation B7 is the amount of nitrate that comes from the passive storage during water/solute exchange. This term is calculated based on pm, nitrogen mass in the passive pool (Wsat,p), percolation (Mperc), and water content in the passive pool (Spassive). Note that the water flux associated with this transport is (1 − pm)Mperc because the rest of the percolation goes directly to the active pool. Initial values for Wsat,a and Wsat,p are estimated from a surface density of 1.64 g/m2.
    Mass balance in the passive storage is written as

    The first right-hand side term is leaching from the upper zone to the passive pool, calculated as NLeach,p = (1 − pm)NLeach, and the second term is the amount of nitrate that leaves the passive pool toward the active pool. Note that, as in equation B7, the water flux associated with this transport is (1 − pm)Mperc. All terms used in equation B12 were defined previously.
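The partial mixing between the active and passive pools can be sketched as below. Only NLeach,p = (1 − pm)NLeach and the water flux (1 − pm)Mperc are stated in the text; taking NLeach,a as the complement pm·NLeach and computing the passive-to-active nitrate flux as that water flux times the passive concentration Wsat,p/Spassive are assumptions, as are the function and variable names.

```python
def lower_zone_mixing(n_leach, m_perc, w_sat_p, s_passive, p_m):
    """Split leaching between pools and compute the passive-to-active return.

    Returns (n_leach_a, n_leach_p, n_exchange):
      n_leach_a  = pm * N_leach                     (assumed complement)
      n_leach_p  = (1 - pm) * N_leach               (stated in the text)
      n_exchange = (1 - pm) * M_perc * W_sat,p / S_passive   (assumed form)
    """
    n_leach_a = p_m * n_leach
    n_leach_p = (1.0 - p_m) * n_leach
    n_exchange = (1.0 - p_m) * m_perc * w_sat_p / s_passive
    return n_leach_a, n_leach_p, n_exchange


def step_passive_pool(w_sat_p, n_leach_p, n_exchange, dt=1.0):
    """Explicit update of the passive pool: gain from leaching, loss to active."""
    return w_sat_p + dt * (n_leach_p - n_exchange)
```

Because Spassive is large (5,000 to 25,000 mm in the calibration range), the return flux from the passive pool is small relative to its stored mass, which is what lets this pool act as a slowly responding store in the sketch.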

    Concentration at the basin outlet: Finally, the concentration at the basin outlet (Co) is calculated as the ratio of the nitrate mass exported via tile flow and base flow to river discharge (Q):

    Note that Q is composed of overland runoff (Moverland), tile flow (Mtile), and base flow (Mbase), where the first two processes drain water from the upper compartment and the last drains the lower compartment.
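A minimal sketch of the outlet concentration calculation follows. The ratio (NT + NB)/Q comes from the description above; the zero-flow guard, the function names, and the unit conversion (which assumes mass fluxes in kg/ha/day and water fluxes in mm/day) are assumptions added for illustration.

```python
def outlet_concentration(n_tile, n_base, m_overland, m_tile, m_base):
    """Co = (N_T + N_B) / Q, with Q = M_overland + M_tile + M_base.

    Returns 0.0 when there is no discharge (assumed convention).
    """
    q = m_overland + m_tile + m_base
    if q <= 0.0:
        return 0.0
    return (n_tile + n_base) / q


def kg_per_ha_mm_to_mg_per_l(conc):
    """Convert (kg/ha)/(mm) to mg/L: 1 mm over 1 ha is 10 m3, so the factor is 100."""
    return conc * 100.0
```

Overland runoff carries no nitrate mass in this ratio, so it only dilutes the outlet concentration: adding overland flow raises Q while leaving the numerator unchanged.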