# Storage selection functions: A coherent framework for quantifying how catchments store and release water and solutes

## Abstract

We discuss a recent theoretical approach combining catchment-scale flow and transport processes into a unified framework. The approach is designed to characterize the hydrochemistry of hydrologic systems and to meet the challenges posed by empirical evidence. StorAge Selection functions (SAS) are defined to represent the way catchment storage supplies the outflows with water of different ages, thus regulating the chemical composition of out-fluxes. Biogeochemical processes are also reflected in the evolving residence time distribution and thus in age-selection. Here we make the case for the routine use of SAS functions and look forward to areas where further research is needed.

## Key Points:

- Storage selection functions recapitulate age dynamics
- Formulation of transport by travel time distributions
- Flow and transport at catchment scales

## 1 Introduction

The time spent by a parcel of water within a watershed, from input to the present time, is a random variable commonly referred to as age or, equivalently, residence time. The age dynamics of the water in storage within a catchment system directly affects the chemical composition of hydrologic fluxes, including solute concentrations in the discharge through the catchment outlet that are routinely measured in the field. Residence time can be seen as the master variable of catchment hydrology because its distribution captures the integrated description of the ensemble of physical processes coexisting within a hydrologic system and the bulk effects of biogeochemical reactions undergone by solutes transported by water [e.g., *Rinaldo and Marani*, 1987; *Rinaldo et al*., 1989; *Maloszewski et al*., 1992; *Kirchner et al*., 2001; *Weiler et al*., 2003; *Kirchner*, 2003; *Botter et al*., 2005; *McGuire and McDonnell*, 2006; *Stumpp et al*., 2009; *Kirchner et al*., 2010; *Hrachowitz et al*., 2010a, 2010b; *McDonnell et al*., 2010; *Harman et al*., 2011; *Cvetkovic et al*., 2012; *van der Velde et al*., 2012; *Beven*, 2012a; *Heidbuchel et al*., 2012; *Birkel et al*., 2012; *McMillan et al*., 2012; *Hrachowitz et al*., 2013; *Heidbuechel et al*., 2013; *Kirchner and Neal*, 2013; *Davies et al*., 2013; *Harman and Kim*, 2014; *McDonnell and Beven*, 2014; *Harman*, 2015].

As experimental evidence has mounted on the mutual links among the hydrochemistry of waters in storage and in fluxes, climatic regimes, soil and landscape attributes, and the age of streamflows [e.g., *Weiler and Flühler*, 2004; *McGuire et al*., 2005; *Hrachowitz et al*., 2009; *Soulsby et al*., 2011; *Tetzlaff et al*., 2014], a growing awareness has developed of the fundamental importance of a unifying theoretical framework for catchment-scale flow and transport phenomena [e.g., *Rinaldo and Marani*, 1987; *Kirchner*, 2003; *Sivapalan et al*., 2003]. Such a framework must formally characterize the age dynamics in hydrologic systems and, as a consequence, comprehensively recapitulate hydrograph and tracer information.

Evidence of complex age dynamics originates from the observation that much (if not most) of runoff had been stored in the catchment for much longer than event waters (the so-called old-water paradox) [e.g., *Stewart and McDonnell*, 1991; *Kirchner*, 2003; *Bishop et al*., 2004; *Divine and McDonnell*, 2005; *McGuire and McDonnell*, 2006; *McGuire et al*., 2007; *McDonnell et al*., 2010]. Different approaches exist to reconcile the dynamic character of runoff with the catchment's ability to store and mix large amounts of water and solutes without postulating, say, the stationarity of the hydrologic processes involved or the shapes of age distributions [see e.g., *McDonnell and Beven*, 2014]. A recently proposed approach, based on StorAge Selection (SAS) functions [*Botter et al*., 2011], provides in our view a sound basis for the quest for a watershed theory. SAS functions, formally defined in the next section, represent the way outflows are composed of stored water of different ages, and so recapitulate the hydrologic processes generating them. They thus encapsulate the integral effect of dispersion mechanisms driving solute transport within a hydrologic control volume, and explicitly regulate the composition of out-fluxes.

Here we make the case for the routine use of SAS functions and look at areas where further research is needed.

## 2 Storage Selection Functions: An Overview

The SAS framework relies on the representation of a hydrologic system as a dynamic population of water parcels growing older as they move within the system until they leave through any of the outflows. The process of water/solute transport from entry to exit can be described either through a time-forward or a time-backward perspective, which coincide only under steady transport [*Niemi*, 1977; *Rinaldo et al*., 2011]. In the forward case, one focuses on injections at a specific time and considers the distribution of future arrival times to an outlet. In the backward representation, the ages of particles in storage within (or leaving) the system at a given time are considered in terms of their distribution of entrance times. A sample of stream water contains parcels of different ages that entered the catchment at different times; hence, the backward representation is particularly suitable when at-a-station flux measurements of solutes are involved (say, contaminant release or mineral weathering). We shall focus here on the backward formulation and avoid technicalities, including the dual forward formulation [see e.g., *Benettin et al*., 2015a]. Suffice it here to recall that the forward approach focuses on the future evolution of the system using the current state as initial condition, and requires the introduction of a water parcel's life expectancy, which quantifies the time it will spend within the system before being sampled by one of the outflows. The sum of age and life expectancy is the parcel's travel time. Forward and backward formulations differ practically and conceptually, as they yield analogous formal relationships only in the special case of stationary systems [*Niemi*, 1977]. Age and life expectancy distributions evolve distinctively in response to unsteady hydrologic fluxes, and both can be used to describe the fate of solutes measured at the exit control surface(s), with relative advantages that depend on the specific case at hand.

The travel time distribution (TTD), $p_Q(T,t)$, defines the probability distribution of the ages *T* of particles that are *leaving* as discharge, *Q*, at time *t*. Likewise, any other outflow is characterized by a distinct age distribution (e.g., $p_{ET}(T,t)$, representing the distribution of the ages *T* in storage removed by evapotranspiration at time *t*). The residence time distribution (RTD), $p_S(T,t)$, in contrast, defines the probability distribution of the ages that characterize the water storage at time *t*. Residence and travel time distributions are not independent from one another but tightly regulated by a Master Equation (ME), expressing mass and age continuity [*Botter et al*., 2011]. The ME shows that residence and travel time distributions share the legacy of the sequence of inputs and their subsequent aging, and accounts for the transformation of age-tagged storage into fluxes out of the system. To characterize such transformation, *Botter et al*. [2011] introduced SAS functions (originally termed mixing functions) as:

$$
\omega_Q(T,t) = \frac{p_Q(T,t)}{p_S(T,t)}, \qquad \omega_{ET}(T,t) = \frac{p_{ET}(T,t)}{p_S(T,t)}.
$$

SAS functions identify, in a spatially integrated manner, the relationship between the set of ages available within a hydrologic control volume and the ages of the particles removed as outflows through the system boundaries (in our example, *Q* and *ET*). As such, SAS functions are volume-integrated analogues of the advection-dispersion equation (ADE) of water mass density along an age dimension [e.g., *Ginn et al*., 2009; *Delhez et al*., 1999; *Kirchner et al*., 2001; *Cornaton and Perrochet*, 2006; *Fiori and Russo*, 2008; *Ali et al*., 2014]. SAS functions can be derived explicitly for general dispersion models in finite 1-D domains by accounting for different boundary conditions, showing that larger dispersivities increase their uniformity ($\omega \to 1$) [*Benettin et al*., 2013a]. SAS functions are meant to describe which water parcels (i.e., those that recently entered the catchment or those that have been in storage for some time) contribute to streamflow production and plant uptake, as shown in the sketch of Figure 1. They embed the entirety of the physical processes determining the arrival of stored water at the boundary of a hydrologic system, to exit as discharge, evapotranspiration, or any other flux (say, through pumping or recharge to deep aquifers). Thus, at any time there exists a true SAS function which is impossible to measure directly unless every possible age in the transport volume is tagged. With current technology, this is clearly not an option. Viable approaches are thus based on the adoption of a functional form (fixed or time-varying) and on the calibration of its parameters by contrasting theoretical results with data. The specification of SAS functions, jointly with the knowledge of the relevant hydrologic fluxes, allows for a complete statistical characterization (say, of age and travel time distributions) through analytical or numerical solutions of the ME.
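For orientation, the ME referred to above can be sketched in the backward form commonly associated with *Botter et al*. [2011]. The block below is our transcription under stated assumptions: $S(t)$ denotes total storage, $J(t)$ the input flux (e.g., precipitation), and the boundary condition assigns age zero to all incoming water. It is meant as a reading aid, not as a derivation.

$$
\frac{\partial \left[ S(t)\, p_S(T,t) \right]}{\partial t} + \frac{\partial \left[ S(t)\, p_S(T,t) \right]}{\partial T} = - Q(t)\, \omega_Q(T,t)\, p_S(T,t) - ET(t)\, \omega_{ET}(T,t)\, p_S(T,t), \qquad S(t)\, p_S(0,t) = J(t).
$$

The left-hand side expresses aging of the stored volume of each age, while the right-hand side removes water of age *T* in proportion to each outflow and to the corresponding SAS function.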

SAS theory is hypothesized to represent a general and flexible tool for characterizing the hydrochemistry of different types of flow systems potentially relevant to hydrology and biogeochemistry. A key issue for the practical use of SAS functions, however, is the requirement that $p_Q = \omega_Q\, p_S$ needs to be a probability distribution (i.e., $p_Q \geq 0$ and $\int_0^\infty p_Q(T,t)\, dT = 1$). Because $p_S$ evolves in time, $\omega_Q$ should also change to satisfy normalization. This renders a closed-form expression for the SAS functions impractical in most cases. An elegant procedure to tackle the general case has been proposed by *van der Velde et al*. [2012], who characterized the sampling behavior by expressing the SAS functions not as a function of age, but rather as a function of the cumulative distribution of ages stored in the control volume, i.e., *T* maps into $P_S(T,t)$. In the residence time age domain $P_S$, the SAS function is a probability distribution function and thus it can be parameterized, e.g., by a beta distribution [*van der Velde et al*., 2012, 2014] or through a power law defining the preference for old/young water [*Queloz et al*., 2015]. If the storage that actively participates in transport is a small fraction of the total, the resulting travel time distributions should be relatively insensitive to the actual size of the overall storage, prompting *Harman* [2015] to suggest a formulation in which $\omega_Q$ is defined in terms of the product $S_T = S(t)\, P_S(T,t)$ (rather than $P_S$), termed age-ranked storage. The ME can then be conveniently expressed by mass balance in $S_T$, where $S(t)$ no longer appears in the equation explicitly. *Harman* [2015] also demonstrated an important feature of SAS functions: they can be parameterized to vary in time with other system state variables (such as relative catchment wetness), so as to capture the temporal dynamics of transporting processes.
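The cumulative-storage parameterization described above lends itself to a simple numerical illustration. The following is a minimal sketch, not the scheme of any of the cited papers: it assumes a single outflow, an explicit time step, and a power-law cumulative SAS function $\Omega_Q(P_S) = P_S^k$ (the function name `step_age_ranked_storage`, the exponent `k`, and all numerical values are hypothetical). Because $\Omega_Q$ runs from 0 to 1 over the cumulative storage, the outflow age fractions sum to one by construction, which is the normalization property discussed in the text.

```python
import numpy as np

def step_age_ranked_storage(parcels, J, Q, k, dt=1.0):
    """Advance age-binned storage by one explicit time step.

    parcels[i] is the stored volume of age class i (index 0 = youngest).
    Discharge ages are selected with the assumed cumulative SAS function
    Omega_Q(P_S) = P_S**k: k < 1 prefers young water, k > 1 prefers old
    water, and k = 1 is random sampling.
    """
    S = parcels.sum()                        # total storage (assumed > 0)
    P_S = np.cumsum(parcels) / S             # cumulative residence time distribution
    Omega = P_S ** k                         # cumulative SAS function in the P_S domain
    frac_out = np.diff(Omega, prepend=0.0)   # share of discharge drawn from each age class
    removed = np.minimum(Q * dt * frac_out, parcels)  # never remove more than is stored
    aged = parcels - removed
    # Ageing: every class moves up one index; the new input is the youngest class
    new_parcels = np.concatenate(([J * dt], aged))
    return new_parcels, removed

# Hypothetical run: 100 mm of initial storage, steady fluxes of 2 mm per step,
# and a preference for young water (k = 0.5)
parcels = np.array([100.0])
for _ in range(200):
    parcels, removed = step_age_ranked_storage(parcels, J=2.0, Q=2.0, k=0.5)

print("storage:", parcels.sum(), "last discharge volume:", removed.sum())
```

Letting `k` depend on a state variable such as relative storage would be one way to mimic the time-variable SAS functions mentioned above; that extension is left out here to keep the sketch short.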

Interestingly, analytical solutions of the transport problem are available for $\omega = 1$, i.e., where sampling occurs proportionally to the volumes of the various ages in storage (random sampling) [*Botter et al*., 2010]. In such a case, the age distributions in storage and in the outflows coincide [*Botter et al*., 2011; *Hrachowitz et al*., 2013]. Such solutions enable the derivation of more general analytical expressions for randomly sampled storages arranged in series or in parallel, collectively far from randomly sampled, which result in increasing/decreasing SAS functions corresponding to preference for older/younger ages, respectively [*Bertuzzo et al*., 2013; *Benettin et al*., 2013a, 2015b].
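The simplest instance of the random sampling case can be checked numerically. The sketch below assumes steady conditions (constant input equal to the outflow, constant storage), for which random sampling yields the familiar exponential age distribution $p_S(T) = p_Q(T) = (Q/S)\, e^{-QT/S}$ with mean age $S/Q$. All numerical values are hypothetical, and the explicit age bookkeeping is our own simplification rather than the general time-variable solution of the cited papers.

```python
import numpy as np

# Hypothetical steady-state setting: constant input J equal to the outflow Q
S, Q, dt = 100.0, 2.0, 0.1          # storage [mm], flux [mm/d], time step [d]
n_steps = 5000                      # 500 d, i.e. ten times the mean age S/Q

# Explicit bookkeeping of age classes: parcels[i] is the volume of age i*dt.
# Water present before the simulation starts is not tracked.
parcels = np.zeros(n_steps)
for _ in range(n_steps):
    parcels *= 1.0 - Q * dt / S     # random sampling: same fraction removed from every age
    parcels = np.roll(parcels, 1)   # ageing by one time step
    parcels[0] = Q * dt             # new input enters with age zero

ages = np.arange(n_steps) * dt
p_numeric = parcels / (parcels.sum() * dt)   # normalized age density of the tracked storage
p_exact = (Q / S) * np.exp(-Q * ages / S)    # closed-form exponential solution

max_abs_error = np.max(np.abs(p_numeric - p_exact))
mean_age = np.sum(ages * p_numeric) * dt
print(f"max |numeric - exact| = {max_abs_error:.2e}, mean age = {mean_age:.1f} d")
```

Since every age class loses the same fraction at each step, the age distribution of the outflow is identical to that of the storage, which is the coincidence of distributions noted above.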

Model identification is central to estimates of hydrologic fluxes that cannot be directly measured, like large-scale ET or internal fluxes among different compartments. Ranking approaches by discounting for the number of parameters can be done formally by comparative analyses of measured and computed quantities [*Akaike*, 1974; *Beven*, 2012b]. Examples already exist of revealing comparative computational analyses based on hydrochemical field data, e.g., for chloride and nitrate at the Hupsel brook catchment in the Netherlands [*van der Velde et al*., 2010, 2012; *Benettin et al*., 2013b], the Monchaldorf catchment in Switzerland for pesticide circulation at catchment scales [*Bertuzzo et al*., 2013], upland catchments in the Scottish highlands for chloride and water stable isotopes [*Hrachowitz et al*., 2013; *Birkel et al*., 2015], and the upper and lower Hafren catchment in Wales for chloride transport [*Harman*, 2015; *Benettin et al*., 2015b]. General numerical solutions for the relevant statistics have also been provided [e.g. *Fiori and Russo*, 2008; *Fiori*, 2012; *Bertuzzo et al*., 2013; *Ali et al*., 2014; *Harman*, 2015].
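As an illustration of ranking models while discounting for the number of parameters, the sketch below uses one common least-squares form of Akaike's criterion, $\mathrm{AIC} = n \ln(\mathrm{RSS}/n) + 2k$, where $n$ is the number of observations, RSS the residual sum of squares, and $k$ the number of calibrated parameters. The function name, the two candidate models, and all concentration values are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def aic_least_squares(observed, simulated, n_params):
    """AIC for a least-squares fit with Gaussian errors (up to an additive constant)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    n = observed.size
    rss = np.sum((observed - simulated) ** 2)   # residual sum of squares
    return n * np.log(rss / n) + 2 * n_params

# Made-up stream concentrations and two made-up model outputs
obs = [10.0, 12.0, 9.0, 11.0, 13.0]
sim_simple = [10.5, 11.5, 9.5, 10.5, 12.5]      # e.g. a 2-parameter SAS model
sim_complex = [10.4, 11.6, 9.4, 10.6, 12.6]     # e.g. a 5-parameter SAS model

aic_simple = aic_least_squares(obs, sim_simple, n_params=2)
aic_complex = aic_least_squares(obs, sim_complex, n_params=5)

# The lower AIC is preferred: a better fit must outweigh the parameter penalty
best = "simple" if aic_simple < aic_complex else "complex"
print(f"AIC simple = {aic_simple:.2f}, AIC complex = {aic_complex:.2f}, preferred: {best}")
```

In this toy case the more complex model fits slightly better but is penalized for its extra parameters, so the simpler model is preferred. With real data, the outcome depends on record length and on how much the additional parameters actually reduce the residuals.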

## 3 Tracking Tracers and SAS Functions: An Outlook

- The availability of high-quality hydrochemical data sets [e.g., *Kirchner and Neal*, 2013] in diverse regions of the world, including semiarid, cold, or tropical ones, remains central to our understanding of solute retention in catchments and to interpreting hydrobiogeochemical processes in general. In brief, current approaches are geographically biased. High-frequency measurements should be undertaken for the relevant input and output fluxes, and maintained for timespans longer than mean travel times, typically ranging from months to years. This places constraints on the interpretation of field data. Hydrologists, in fact, would need to re-evaluate the traditional interpretation of field experiments in terms of concentration breakthrough curves to instantaneous pulses (the forward picture) in the light of long-term at-a-station gauging of exiting ages in streamflow (the backward picture). The theoretical apparatus underlying SAS functions challenges the uniqueness and usefulness of contrasting observed forward breakthrough curves with input concentration time series in light of the now-realized need to predict and understand their backward dual. This holds whenever the system under investigation cannot be seen as the host of steady state transport phenomena, which is quite unlikely in the hydrologic response of natural watersheds;
- Integrating information from different tracers still represents a challenge. Spiking rainfall with artificial tracers at catchment scales is manageable only at small spatial scales with current technologies, and thus the use of natural tracers represents a practical choice for catchment-scale hydrochemical transport studies. Depending on the tracer, however, one may introduce a bias to either young or old water: high tritium may suggest the presence of old water on the timescale of decades [see e.g., *Morgenstern et al*., 2010; *Stewart*, 2012], while water stable isotopes mostly provide information on younger water (months to a few years) [*Seeger and Weiler*, 2004]. Moreover, the methodology itself used to analyze the tracers may be structurally biased [*Bethke and Johnson*, 2008; *Cornaton et al*., 2011]. The effects of evaporation and transpiration on isotope/solute transport are an ongoing matter of concern for many natural tracers. Hence, challenges to hydrologic transport that might be impacted by the use of SAS functions include tools that enable a spatially explicit characterization of inputs and improved understanding of the isotopic/solute composition of soil and vegetation water. In this context, model-guided field validation seems like a sensible step for interpreting laboratory evidence and scaled-up, catchment-scale transport experiments;
- SAS functions propose a paradigm shift from a variety of approaches that require significant assumptions (e.g., to fit parameters of stationary transit time distributions to observed data from unsteady systems) to a coherent mathematical framework which explicitly takes into account flow and transport under variable hydrologic drivers. Applications to diverse hydrologic settings, already underway, are needed to build an archive of case studies. More work is thus needed on accurate large-scale numerical studies employing arbitrary SAS functions varying with distinctive state variables and embedding significant geomorphological complexity [but see e.g., *van der Velde et al*., 2014; *Harman*, 2015; *Benettin et al*., 2015b; *Queloz et al*., 2015];
- Approaches that consider an entire catchment as a unique control volume, without storage partitions of any kind, are of interest owing to our enhanced ability to measure and/or estimate directly in- and out-fluxes without introducing internal subdivisions with their own state variables [e.g., *Kendall et al*., 2001]. In such a case, however, the behavior of whole catchment-scale SAS functions proves far more complex than a random sampling scheme, for which exact solutions are available [*Botter et al*., 2011; *Benettin et al*., 2013a; *Bertuzzo et al*., 2013]. To that end, *Harman* [2015] demonstrates how SAS theory can be used to parameterize a whole-catchment SAS function that is nonuniform and varies in time with a catchment state variable (relative storage);
- The meaning (and possibly the existence) of ideal tracers that sample the same velocity field as water parcels needs to be revisited in the light of SAS functions. In fact, even if a tracer is assumed to be passive to chemical reactions and degradation in soils and flows perfectly along with water without retardation of any nature (but see *Oberg and Sanden* [2005]), it still may show limited (or enhanced) affinity to be selected by vegetation in settings where transpiration is significant. Under the above circumstances, the process of evapotranspiration would directly impact the chemical or isotopic composition of the storage and hence stream water quality. The residence times of solutes would thus be inevitably different from those of water parcels, regardless of the ideal nature of the tracer seen from a mere soil-water mass exchange perspective. Hence, deeper understanding of vegetation affinity for uptake of different ages is critical [*Brooks et al*., 2010; *McDonnell*, 2014]. Also, we need to develop suitable technologies for the direct measurement of solute uptake from large-scale assemblages of diverse vegetation cover. This is key to mass balance closures devoid of critical assumptions. Upscaling sap water sampling and biomass analyzed for residuals to levels producing statistical significance is one possible way, lacking to date a breakthrough technology for noninvasive bulk measurements;
- Another challenge, and a call to action, pertains to the study of catchment-scale reaction kinetics within the above framework. The residence time distribution inherently defines contact times between fixed and mobile phases driving biogeochemical cycling and mass exchange phenomena [*Rinaldo and Marani*, 1987; *Rinaldo et al*., 1989] (but see also, e.g., [*Brusseau et al*., 1989; *Botter et al*., 2006; *Maher*, 2010; *Basu et al*., 2010]). Thus, regardless of the physical, chemical, or biological reaction undergone by the solute mass within stored water parcels (or lack thereof), the mobile mass sampled by outflows would be inevitably controlled by the evolving residence time distribution and thus by age-selection. A critical issue, and an open challenge, will, therefore, reside in bridging complex geochemistry issues with the large-scale, integrative characterization required by catchment transport scales. Many biogeochemical processes occurring along hydrological flow paths (e.g., redox reactions, heterogeneous reaction kinetics, varying solubility equilibria to name a few) may be of importance for other solutes, and one would need to assess quantitatively what storage selection can achieve in this regard. Moreover, as in *Botter et al*. [2005], an assessment of spatial effects is needed so as to highlight at what nonpoint-source injection scales (relative to correlation scales of heterogeneous properties) one could assume the processes as chiefly driven only by contact times between fixed and mobile phases. A distinct advantage of SAS function approaches is that transport parameters pertaining to travel times may be decoupled from the ones characterizing any reaction kinetics, thereby allowing a direct use of multiple tracer field and experimental evidence and an effective parameter calibration;
- Top-down and bottom-up approaches are both needed to develop a general, empirically supported theory [see e.g., *Sivapalan et al*., 2003]. A top-down approach requires application to a large number of systems where appropriate data are available, and eventually a synthesis of the estimated SAS functions to infer patterns. A bottom-up approach involves analytical, numerical, and experimental analysis elucidating the relationship between the structure and dynamics of individual systems and their emergent SAS functions. Such efforts, both needed, should focus on site-specific dominant controls on storage selection, and how they can be parameterized in a systematic manner.

Regardless of the open challenges for which action is called, we conclude that the advancement postulated by SAS functions already provides the hydrologic community new and fundamental tools for coherently addressing catchment-scale flow and transport phenomena.

## Acknowledgments

A.R., P.B., and E.B. wish to thank the Swiss National Science Foundation (SFN) for funding the research project through grant SFN-135241. C.H. wishes to thank the National Science Foundation for support through grant EAR-1344664. The authors wish to thank two anonymous reviewers for their insightful comments on an earlier version of this Commentary.