Characteristic mega-basin water storage behavior using GRACE

[1] A long-standing challenge for hydrologists has been a lack of observational data on global-scale basin hydrological behavior. With observations from NASA’s Gravity Recovery and Climate Experiment (GRACE) mission, hydrologists are now able to study terrestrial water storage for large river basins (>200,000 km2), with monthly time resolution. Here we provide results of a time series model of basin-averaged GRACE terrestrial water storage anomaly and Global Precipitation Climatology Project precipitation for the world’s largest basins. We address the short (10 year) length of the GRACE record by adopting a parametric spectral method to calculate frequency-domain transfer functions of storage response to precipitation forcing and then generalize these transfer functions based on large-scale basin characteristics, such as percent forest cover and basin temperature. Among the parameters tested, results show that temperature, soil water-holding capacity, and percent forest cover are important controls on relative storage variability, while basin area and mean terrain slope are less important. The derived empirical relationships were accurate (0.54 ≤ Ef ≤ 0.84) in modeling global-scale water storage anomaly time series for the study basins using only precipitation, average basin temperature, and two land-surface variables, offering the potential for synthesis of basin storage time series beyond the GRACE observational period. Such an approach could be applied toward gap filling between current and future GRACE missions and for predicting basin storage given predictions of future precipitation.


Introduction and Background
[2] The terrestrial water balance describes the partitioning of precipitation (P) into evapotranspiration (ET) and runoff (R). It is commonly expressed as where the term dS/dt represents the change in water storage in an explicit region or control volume.
[3] Historically, various models and assumptions have been postulated to facilitate an operational relationship between precipitation forcing and runoff response within large hydrological basins. However, there is still little understanding of the primary state variable in models-terrestrial water storage-and the hydrology components of many global models are calibrated based on parameter optimization with discharge data alone [Bonan et al., 2002;Liang et al., 1994]. Quantitative observations of basin storage behavior are a key tool in dissecting the ''black box, '' in what has often been treated as an input-output relationship. However, field campaigns for storage observations are rare and never over a global domain, and the heterogeneity of land-surface properties and the complexity of landatmosphere coupling introduce a tremendous potential for error in moving from the observational scale to the resolution of current climate models.
[4] To fulfill the objective of operational prediction skill, empirical conceptualizations of basin behavior have been necessary. Many traditional analyses have assumed that at longer timescales and over large regions, dS/dt in equation (1) can be approximated as zero. For example, Budyko's [1974] characterization of the relationship between catchment water balance terms offers a simple framework for understanding basin behavior. On timescales longer than annual, Budyko assumes dS/dt ¼ 0, reducing equation (1) to P ¼ ET þ R. The resulting relationship, known as the Budyko curve, partitions precipitation between runoff and evapotranspiration based on the relative ''dryness'' of the basin. Considerable work has been done to explain deviations around this conceptual model, attributing error to variability and seasonality in climate, soil characteristics, vegetation type, and the scale of study [Donohue et al., 2007;Farmer et al., 2003;Atkinson et al., 2002;Milly, 1994].
[5] As our ability to observe large-scale Earth processes improves, we can show that approaches such as Budyko's [1974] formulation have important limitations. For example, terrestrial water storage S(t) has a significant interannual signal regionally, as shown by recent remote sensing campaigns [Chen et al., 2009;Leblanc et al., 2009;Ramillien et al., 2008;Syed et al., 2008;Rodell et al., 2007;Syed et al., 2007]. Interannual water storage signals can be caused by interannual temperature and precipitation variability, or by direct human activity [Famiglietti et al., 2011;Rodell et al., 2009]. This fact undoubtedly has implications for climate, ecology, and water resources availability at multiyear timescales, and natural storage processes need to be quantified based on available observations.
[6] Since terrestrial water storage in large basins can vary at interannual timescales, we might infer that these storage fluctuations are induced primarily by interannual precipitation variability. However, because of regional differences in land-surface properties, different hydrologic basins have a greater or lesser ability to buffer the effects of interannual signals in precipitation, as the land surface affects infiltration, runoff, and evaporation [Cherkauer and Lettenmaier, 1999;Milly and Dunne, 1994]. Also, there are likely to be upper limits on basin storage and regional storage capacities that, when exceeded, may be linked to regional flooding [Reager and Famiglietti, 2009;Crowley et al., 2006].
[7] For instance, in large basins, vegetation can influence precipitation patterns by affecting moisture and energy fluxes between the surface and atmosphere [Spracken et al., 2012;Bonan, 2008]. When forests decline, evapotranspiration of moisture from soil and vegetation can be diminished, leading to reduced atmospheric humidity and potentially suppressing precipitation [Eltahir, 1996;Eltahir and Bras, 1994;Shukla and Mintz, 1982]. In contrast, observational studies from within regional catchments (>1000 km 2 ) such as those in Peel et al. [2010] have concluded that vegetation has a negligible impact on water balance. The discrepancy in these studies hints at the difficulties in scaling observed hydrological behavior for a global-scale understanding of the land surface and climate.
[8] The world's largest river basins (those with an area >200,000 km 2 ) account for the majority of global land runoff to the oceans [Dai and Trenberth, 2002]. Because of the large influence of these basins on the Earth system, it is critical to work toward an accurate representation of their behavior in global climate models. For modern land-surface models operating at 2.5 or lower resolution, or to capture the major dynamics of the global water cycle, the results of traditional catchment-scale analyses may not be easily applied [Beven, 1995;Bloschl and Sivapalan, 1995;Wood, 1994, 1995;Gupta et al., 1986;Dooge, 1982]. In basins of such immense size, large-scale land-atmosphere interactions may play a critical role in the retention of water and changes in storage during the wet season [Trenberth et al., 2011;Makarieva and Gorshkov, 2007;Eltahir and Bras, 1996].
[9] NASA's Gravity Recovery and Climate Experiment (GRACE) mission [Tapley et al., 2004] now provides the opportunity to observe the dynamics of water storage for large river basins. With monthly global coverage, GRACE data are well suited to contribute to better understanding of hydrology at the larger temporal and spatial scales that are important for climate studies [Lettenmaier and Famiglietti, 2006]. For the current research, we focused on developing new methods to explore the information contained in the GRACE data at its intrinsic large-basin scale, in hopes of improving our conceptual understanding of the global water cycle.
[10] In this study, we begin with the hypothesis that precipitation alone does not drive large-scale storage variability, but that the influence of precipitation forcing is affected significantly by the land surface with which the water interacts. We test this with an analysis of the impact of land-surface variables on large-basin water balance, represented by the propagation of variability from Global Precipitation Climatology Project (GPCP) precipitation time series [Adler et al., 2003] to GRACE water storage time series. We then use land-cover, soil properties, and terrain information to draw broad conclusions about the large-scale land-surface impact on water balance.
[11] First, basin-averaged time series are estimated for 21 of the world's largest catchments ( Figure 1) and analyzed for the propagation of variability at two timescales, based on frequency-domain transfer functions. These transfer functions are then parameterized to understand the influence of basin-mean temperature (from the National Centers for Environmental Prediction (NCEP) [Kalnay et al., 1996], percent forest cover within a basin derived from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite observations [DeFries et al., 2000], soil waterholding capacity (WHC) [Dunne and Willmott, 2000], and basin terrain characteristics. Finally, we validate generalized parameterizations by modeling storage time series and comparing with observations from GRACE.
[12] In addition to contributing to understanding of large-scale hydrology, this study enables the prediction of future water storage variations given climate model predictions of precipitation . There is also the potential for a multiyear gap between the failure of the current GRACE mission and the launch of its successor. The methods described here could be used to estimate water storage variations within this intermission time period. Although the current GRACE record is short and causes difficulty in the application of time series analysis methods, it is still critically important to offer a statistical model of the data with the best methodology available, given the critical need to understand water storage variations in the future.

Methods' Overview
[13] In order to investigate storage behavior across basins, we use an empirical approach to analyze and model the GRACE observations. Rather than solving equation (1) for dS/dt, we instead seek an estimate of storage response S(t) that is normalized for precipitation forcing. We achieve this objective through the following steps: First, a frequency-domain analysis is performed, including a dynamical systems model (single input/output cross-spectral analysis) for basinaveraged precipitation-storage transfer functions. Second, we select a small list of probable parameters to create a functional relationship between measured storage response and lumped basin variables. Third, we apply modeled transfer functions to observed precipitation spectra and return to the time domain with a prediction of storage anomaly.

GRACE
[14] Because of the inherent spatial limitations of the GRACE data , we use scaled basin-averaged time series [Swenson and Wahr, 2002] for basins with a drainage area greater than 200,000 km 2 and a drainage volume of greater than 40 km 3 /yr. Because of two significant and distinct causes of error in the GRACE data, GRACE basin time series must be corrected for loss of signal due to measurement error (based on the GRACE footprint) and for ''leakage'' error (the contamination of a signal by adjacent stronger signal). Using uncorrected (''unscaled'') estimates of basin storage will lead to an erroneous representation of GRACE-derived water balance. The error correction is performed by creating a linear scaling operator for the GRACE basin-averaged data Wahr et al., 2004]. In summary, synthetic basin variability is converted into a global spherical harmonic solution to degree and order 60 and smoothed spatially with a Gaussian filter at 300 km radius. The processed output is then compared to the raw input to create a scaling parameter that will be applied to similarly processed GRACE observations. Results have been validated with in situ observations in several studies [Famiglietti et al., 2011;Yeh et al., 2006].
[15] Following the processing for basin averages, the GRACE data are best suited for application in the following large basins (Table 1 and Figure 1): Amazon, Amur Congo, Danube, Dniepr, Don, Ganges and Brahmaputra, Lena, Mackenzie, Mississippi, Murray, Niger, Nile, Ob, Orinoco, Parana, Volga, Yangtze, Yenisei, Zambezi, and Mekong. The Murray is included even though it has a drainage volume of only 9 km 3 /yr. The effects of this are discussed in the analysis.
[16] The inherent temporal limitations of the GRACE data (one robust solution per month) effectively limit the usable resolution of the other data sets to monthly as well. Note that the GRACE data are not sampled once monthly in each location, but that several samples (three to four) approximate to the area of interest are taken within the month. These are combined to create a single global solution for GRACE water storage anomaly at a monthly interval, which we assume for this study to represent a monthly average.

Precipitation
[17] Global precipitation observations are taken from the GPCP [Adler et al., 2003]. Global grids of 2.5 monthly mean precipitation were downloaded from www.esrl.noaa.gov, and masks for each of the study basins were applied to estimate basin-averaged time series. The basin-averaged precipitation time series (mm/d) were converted to a cumulative precipitation anomaly by first removing the mean and then integrating with respect to time. This creates an estimate of the variability of the total precipitation input (cm) during the GRACE record and results in units of height anomaly that match those of storage ( Figure 2).

Temperature Data
[18] Global gridded 1 temperature data come from the NCEP reanalysis [Kalnay et al., 1996]. These data were used to calculate basin-averaged temperature to compare against precipitation-storage behavior. This comparison is shown as a map (Figure 3, top) and as a plot (Figure 3, bottom).

Land-Surface Variables
[19] We investigated several candidate land-surface variables in order to explore which had the greatest influence on controlling terrestrial water storage at the large scales observed by GRACE. In modern land-surface models, water balance losses are controlled by a small list of parameters: soil matric potential, soil porosity, and soil depth; land-cover type; and topography. We represent those parameters here in a gross fashion: one parameter for soil (soil w.h.c.), one parameter for vegetation (% forest cover), and one for topography (surface slope). We also add basin size, which is relevant in a realistic characterization of basin behavior that includes surface water storage. Each is described later.

Percent Forest Cover
[20] Percent forest-cover data were downloaded from University of Maryland Global Land Cover Facility [DeFries et al., 2000] at 1 resolution with global coverage and are shown in Figure 4a. Unlike land-cover type classification, forest cover is a continuous variable and can be averaged within a basin to achieve a basin-mean estimate. This approach assumes that a linear average of subbasin land cover will approximate the total basin land-cover effect on storage response.

Soil Water-Holding Capacity
[21] Soil WHC [Dunne and Willmott, 2000] data were downloaded from the NASA Global Change Master Directory at http://gcmd.nasa.gov/ and are shown in Figure 4b. These data are calculated empirically based on plant-extractable water capacity of soil-the maximum amount of water that can be extracted from the soil to fulfill evapotranspiration demands-and serve as a proxy for relative soil water storage capacity across the study basins.

Mean Terrain Slope
[22] Global terrain slope data at 1 resolution were downloaded from the ISLSCP II data set website (http:// daac.ornl.gov/). These data are derived from the Hydro1k digital elevation model [Verdin, 2011]. Hydro1k has a native spatial resolution of 1 km, the highest resolution database with global coverage of standard elevation-based derivatives. Figure 4c shows the global distribution of derived terrain slopes from which we constructed basin averages for comparison with GRACE data.

Basin Area and Drainage Volume
[23] Basin area and drainage volume [Dai and Trenberth, 2002] were also investigated as potential controls on relative storage response and are listed in Table 1. 2.6. Fourier Analysis 2.6.1. Basin-Averaged Water Storage and Precipitation Spectra [24] The goal of spectral analysis is to describe the distribution (in frequency) of the variability contained in a signal during a finite observation period. The basin-averaged GRACE time series were transformed into frequency domain to investigate the dominant frequencies in the GRACE signal. Because the GRACE record is short (late 2002-present), and because we are limited to a monthly sampling frequency, our resulting data series contain a low number of samples for standard frequency-domain analysis. Using a parametric spectral method instead of a nonparametric method, we can resolve undersampled frequencies with a single complete instance in the time series and sufficient energy, to reduce the impact of noise in a short time series that could cloud results in traditional spectral approaches. We apply the Yule-Walker autoregressive method of spectral estimation [Emery and Thomson, 2004] across the entire time series to detect significant variability down to 1 cycle/5 yr. This parametric method is better for short time series than traditional power spectra, but interpretation of results requires some a priori knowledge of the dominant spectral periods. Also, the GRACE observational error, and annual period and interannual period admittance with 95% confidence intervals from transfer function solutions. Finally, the squared correlation coefficient and Nash-Sutcliffe efficiency between predicted and observed storage.
[25] In order to estimate confidence in the results, we also calculate the Yule-Walker spectrum for a white-noise time series that possesses twice the standard deviation (95% confidence) of the basin-averaged GRACE storage. The scaled confidence spectra are shown in Figure 5 as a dashed line. Based on these results, the statistically significant spectral energy was grouped into two frequency ranges for the modeled storage response in section 2.6.4: annual (1 year period) and interannual (2.5-5 year period).

Basin-Averaged Transfer Functions
[26] Here we define a transfer function as a unit-response function from precipitation input to storage output, which, over a range of inputs, describes the response of the land surface to precipitation variability. It is limited in that it only quantifies periodic signals, not trends or biases. It is also limited by considerations of ensemble size and averaging window width, which vary based on available record length. Transfer functions for a precipitation input and a storage output, assuming wide-sense stationary and stochastic processes, were calculated using where G PP represents the two-sided autospectra of basinaveraged precipitation anomaly, G PS represents the twosided cross-spectra of basin-averaged precipitation anomaly and storage anomaly, and E is an error term for unexplained variance. Two-sided spectra are defined such that the frequency range of integration ranges from negative to positive infinity. The result occupies the whole Nyquist interval [to 1/dt] with a symmetrical spectrum in the second half, as opposed to the single-sided spectrum that goes to the Nyquist frequency [1/(2dt)]. Figure 6 offers a conceptual depiction of the transfer function operation. Transfer functions were calculated here using the Welch periodogram method [Emery and Thompson, 2004] for cross-spectra and autospectra. This allows the calculation of confidence intervals based on record length.
[27] A two-sided transfer function yields phase data in the imaginary component of the solution. To simplify transfer function interpretation, we assume that there is no time lag possible between precipitation forcing and storage response in snow-free basins. If rain falls in a basin, the basin storage changes simultaneously, like water in a bucket responding to filling by a hose. In order to apply this rule, we have allowed for only zero-time-lag propagation of precipitation variability to storage variability, by applying the imaginary portion of the transfer function correctively. Because the transfer function depends on the cross-spectra in its calculation, it contains the phase information for the two time series. We can discard the orthogonal, imaginary portion of the transfer function from the total magnitude, to remove the nonzero-time-lag portion of the signal. We are then left with only the instantaneous (monthly) basin storage response per unit precipitation forcing at each discrete frequency interval.
[28] For this analysis, we consider the higher-frequency signal in the transfer functions-beyond a frequency of 2 cycles/yr (6 month periodicity)-as noise. This is in order to simplify our results to the major periodic frequency ranges that appear in the GRACE data. Since the transfer functions represent an ''admittance'' of signal between input and output, even a small signal due to processing noise can appear in the result. Hence, we discount weaker frequencies as insignificant and concentrate on the significant discrete frequencies from the spectral results: the annual period and a combination of interannual (2.5-5 year) periods. We then regard the transfer function estimates with their corresponding confidence ranges.

Parameterized Transfer Functions
[29] The correlations between transfer function admittance (the propagated variability from precipitation to storage) and a range of values in certain basin-averaged parameters allow the formulation of a least squares solution. We applied a logarithmic fit for transfer function admittance as a function of the correlated land-surface variables. This allowed us to reconstruct a synthesized transfer function based on those parameters and apply this modeled transfer function to a precipitation time series as described in section 2.6.4.

Reconstructed Storage Time Series
[30] To empirically reconstruct a storage time series, we estimate the discretized transfer function at the annual and interannual (2-5 year) ranges based on the empirical model equations from equation (2). We then convert a precipitation forcing time series into the frequency domain using a fast-Fourier transform. The smoothed transfer function is multiplied by the precipitation spectrum to create an output storage spectrum. The inverse fast-Fourier transform is applied to the results to recreate a time domain series of basin-averaged storage. Transfer function confidence intervals are carried through the process.

Fourier Analysis
[31] Results of the spectral estimates of the GRACE and GPCP data are shown in Figure 5. The first noticeable congruency among all of the spectral series is a general lack of high-frequency variability with significant peaks at the annual and interannual periods. All of the study basins have significant energy at the interannual period (less than 1 cycle/yr) for both time series, except for the Amazon, Mackenzie, Niger, Orinoco, and Volga.
[32] The fact that the highest peaks occur at the annual period in almost all of GRACE storage time series (the blue lines in Figure 5)-even those that do not have significant spectral energy at the annual period in precipitationdeserves discussion. While the single greatest influence on shaping storage spectra is precipitation, we see in Figure 3 (bottom) that temperature has the ability to decorrelate the precipitation and storage time series and also their spectra. Several spectra with no significant energy at the annual period in precipitation have significant energy in the annual period for storage (Ob, Danube, Dnieper, Don, Volga). This is caused by the action of temperature to accumulate snowfall during the winter and discharge it during the spring, integrating the faster timescale precipitation variability into the 1/yr frequency. [33] Because of the dominant ability of freezing temperatures to modulate annual storage, in the rest of the analysis, we chose to separate the basins into two categories : ''warm'' basins (those with a mean temperature greater than 15 C), and ''cold'' basins (those with a mean temperature less than 15 C). The 10 warm basins are those clustered toward the right of Figure 3 (top), showing a good correlation with storage. The map in Figure 3 (bottom) gives spatial detail on the relationship between temperature and storage, which shows even negative correlations for precipitation and storage at higher latitudes.
[34] Figure 7 shows transfer function solutions for four warm basins (Amazon, Congo, Parana, and Ganges-Brahmaputra), as a function of frequency. The Amazon has the transfer function with the most admittance at the annual period, and the Parana with the least. However, there is a critical threshold moving into the low-frequency range beyond which the Parana becomes the most active transfer function and the Amazon the least. This highlights an important result: not only do basins have a different response to precipitation based on their land-cover and soil characteristics, but these responses also vary, or show some nonlinearity, across timescales.

Uncertainty and Assumptions
[35] The error estimates in GRACE data are constant in time for a region, as they exist primarily as a function of the  GRACE footprint and limit resolution defined by the orbital configuration of the satellites. They are therefore assumed not to affect the storage response in different frequency ranges differently (i.e., leakage and smoothing errors are assumed constant in time for a region). In other words, while significant error exists in the GRACE data, it is assumed to have negligible frequency content and not to propagate into the frequency domain. The time-constant observational errors included in Table 1 should be used as a proxy for observation error in the predicted time series of GRACE terrestrial water storage for each of the study basins.
[36] Frequencies with statistically significant spectral energy in the GRACE and GPCP time series were included in the transfer function results. The confidence estimates of the transfer function solutions were calculated assuming a 2 distribution of random error for the quality of the spectral solution relative to the length of the observational record and corrected for windowing. The 95% confidence intervals for the transfer functions are plotted as error bars in Figures 8-10.

Suggested Controls on Rainfall-Storage Relationship
[37] Among the variables tested, temperature appears as the single most important characteristic controlling storage across basins in our study, and the correlation between precipitation and storage decays quickly as basin-mean temperature decreases. In summation, zero-lag transfer function admittance from precipitation to storage varied widely across cold basins. Storage in these basins was highly anticorrelated with temperature time series at 120 day lag times (temperature leading storage), indicating a strong seasonal response of storage to temperature forcing, and as mentioned earlier, often uncorrelated or anticorrelated with precipitation seasonality.
[38] In Figure 8, we see the effects of percent forest cover on transfer function admittance for two different timescales, for each of the warm study basins. Land-cover density has the effect of preserving relatively more precipitation variability in storage at the annual period and less precipitation variability in storage at longer timescales. For the cold basins, there was little to no admittance at the annual period due to the uncorrelated time series, and the zero-time-lag rule applied here. However, for the cold basins at the interannual timescale, the effect of vegetation density is generally the same as for the warm basins.
[39] The plots of transfer function admittance versus available soil water are shown in Figure 9. The effect of soil WHC is very similar to that of forest cover. As soil WHC increases, the admittance of variance from precipitation to storage at the annual period also increases. For interannual timescales, this effect reverses, and basins with more WHC show a reduced storage response to precipitation forcing.
[40] Basin storage response relative to basin size and topography is shown in Figure 10 for annual period admittance only. Basin slope and basin drainage area had no significant correlation with transfer function admittance at either the annual or interannual period.

Modeling Storage Response to Precipitation Forcing
[41] We use our generalized transfer functions at two frequency ranges to create new time series of terrestrial water storage for warm and cold regions, based on land-cover type, forest cover, soil WHC, and given a precipitation input time series. For extended time periods, these predicted storage time series neglect the effects of any significant feedbacks on storage from ecosystem change and assume a stable vegetation range for an ecosystem during the GRACE record.
[42] We first fit empirical functions with an assumed logarithmic shape to the transfer function results across two correlated variables : basin-mean percent forest cover and basin-mean available soil depth. The model is applied to the warm basins only, since warm basins are more clearly correlated with land-surface parameters for their storage response across all frequency ranges, except for the Murray (due to the anomalously low correlation with precipitation at the annual frequency). Once the parameters were estimated, the model was applied to all basins. The best fit Figure 7. Basin-mean transfer function admittance for precipitation input to storage output for four basins, plotted as a function of frequency (cycles/yr).
functions are shown in Figures 8 and 9. We use a logarithmic function of the form: where H PS is the admittance of variability from precipitation to storage, FC and WHC are the values of the land-surface parameter (forest cover and WHC), and a, b, and c are model parameters. This model was applied to the annual and low-frequency storage response.
[43] The model predicted storage time series are shown in Figure 11 for warm basins and Figure 12 for cold basins. For cold basins, only the interannual storage variability was modeled, since only the low-period transfer functions were correlated with land-surface parameters. The model results (red) with 95% confidence interval (gray shaded) are compared to the GRACE observations (black). Validation metrics were calculated against a GRACE signal smoothed for coherence with the designed model (i.e., considering only the modeled frequencies in both the synthetic and observed GRACE data). The correlation coefficients (r 2 ) and model efficiency (E f ) [Nash and Sutcliffe, 1970] between modeled and observed storage anomaly are listed in Table 1. Figure 8. Basin-mean storage response to precipitation forcing (transfer function admittance) as a function of percent forest cover. Shown for the annual and low-frequency timescales. The best fit model function is also plotted.
[44] In warm basins, the generalized model was able to correctly reproduce the magnitude of the annual variability in GRACE observations and a portion of the interannual period variability. For warm basins, model results were generally well correlated (0.71 r 2 0.92) with observations and accurate (0. 54 E f 0.84). For cold basins, in comparison to the GRACE interannual signal, model results were moderately correlated (0.14 r 2 0.82) and had mixed accuracy (À0.08 E f 0.66).

Summary and Discussion
[45] When viewed across the spectrum of global river basins, the rainfall-storage relationship is quite complex. While two areas may receive similar amounts of precipitation, there can be differences in the resulting storage response due to temperature and the characteristics of the land surface. The basins used in this study are extremely large and give a general representation of land-cover types, across climates, globally.
[46] To summarize, we list the order of importance of controls on basin terrestrial water storage variability at mega-basin scales as follows: (1) temperature : in regions colder than 15 C, temperature-driven storage acts to generally decouple storage variability from precipitation forcing because of snow accumulation. For periods longer than annual, temperature becomes a less significant factor in storage response. (2) Land cover: Denser forest cover is correlated with less long-period storage variability and a more consistent annual dynamic storage range. (3) Soil: soil depth and porosity, combined here as soil WHC, are Figure 9. Storage response to precipitation (transfer function admittance) as a function of basin-mean soil WHC. Shown for the annual and low-period timescales. Best fit model is also plotted. correlated with less long-period storage and a more consistent annual dynamic storage range. (4) Topography : terrain slope and basin area were found to have an insignificant correlation with storage behavior at the large-basin scale.
[47] At the lowest order, temperature acts in a grossly binary fashion on basin storage : either to accelerate water loss through ET or to prevent water loss through snow accumulation. Our results show that in colder basins, such as the Mackenzie, temperature causes a decorrelation or even an anticorrelation between precipitation and storage time series. In the winter, higher-frequency precipitation variability is integrated and redistributed temporally into a cumulative seasonal signal, and gravity-based storage observations reflect this well-known effect. With the onset of springtime temperature increases, runoff and evaporative losses resume and drive decreasing storage, sometimes concurrent with (but often outpacing) rainy season precipitation increases. This effect acts at the annual period, while at interannual periods storage variability is better correlated with interannual precipitation variability. This effect makes it difficult to decompose GRACE storage spectra into its component pieces and difficult to model storage time series in cold regions from precipitation input. An interesting continuation of our study would involve representing temperature influence on storage empirically, or alternatively, using storage observations to tune ET losses in a land-surface model.
[48] Over a typical annual cycle, more water remains in storage per unit of precipitation with more forest cover and with more soil capacity. We find this effect more significant than that of basin area on storage response. These results suggest, for example, that a relatively small but well-forested basin like the Orinoco can maintain water in storage (per unit precipitation) during a typical rainy season that a larger, less forested basin such as the Zambezi or Parana cannot.
[49] The discrepancy between interannual behavior and annual behavior in the transfer functions suggests a difference in basin storage capacities across water-limited and Figure 10. Annual period transfer function admittance as a function of (top) basin drainage area and (bottom) basin-mean terrain slope.
nonwater-limited ecosystems. An interannual storage signal is not truly independent from an annual storage signal and in reality, is occuring with the annual storage signal for each monthly data point. For example, the lack of transfer of an interannual signal from precipitation to storage may indicate a full storage capacity in a given month. Despite an extreme in precipitation, some finite range of storage variability has been met, and now further storage gains are limited by the flooding of water out of the basin or by increased evaporation. In the tropics, interannual Figure 11. Basin-modeled storage anomaly time series for warm basins (red) and 95% confidence (gray shaded), based on cumulative precipitation anomaly (blue), compared with observed storage anomaly from GRACE (black dashed). Storage is in centimeters equivalent of water over the basin.
precipitation events (like those linked to El Niño-Southern Oscillation) may drive interannual increases in runoff or evapotranspiration, resulting in little interannual change in storage. A relatively larger interannual storage signal indicates the opposite mechanism : a basin that is often waterlimited.
[50] Certainly, it is well known that basin slope has an impact on river runoff and therefore on storage. For instance, in the Mekong basin [McGuire et al., 2005], as well as in voluminous catchment-scale studies, terrain slopes are important in runoff modeling. The absence of influence in these results could be due either to the very Figure 12. Basin-modeled storage anomaly time series for cold basins (red) and 95% confidence (gray shaded), based on cumulative precipitation anomaly (blue), compared with observed storage anomaly from GRACE (black dashed). Storage is in centimeters equivalent of water over the basin. large scale of the basins studied (i.e., that terrain variability in the mega-basins is simply not great enough to impact large-scale storage variations at the timescales investigated here); to the relatively low resolution of the data sets required to conduct a global study such as this one (so that some important terrain attributes were not well resolved) ; or to the fact that a broader sample of terrain attributes was not included in the research. It is also likely that basin slope has the most immediate effect on runoff generation, and in steep basins, precipitation extremes are matched by runoff extremes, resulting in little storage effect.
[51] There is a critical basin size and discharge volume range in which this model of storage variability applies. For example, the transfer function of the Murray basin is a particularly poor fit at the annual period resulting in a large confidence interval on admittance. This is likely due to the low flow volume of the Murray (Table 1) and the high degree of water management within the basin (Australia diverts river flow for agriculture and was in a severe drought from 1995 to 2009) [Ummenhofer et al., 2009]. As such, the storage signal does not possess much of the natural forcing variability from the precipitation input. It is interesting to note though that at the interannual period, the observed Murray basin response falls in line accurately with the parameterized transfer function fit. One has to consider that while precipitation seasonality can be expected and managed, the prediction of interannual precipitation variability is more difficult, and reservoir storage is limited by capacity in time. For these reasons, management may have its strongest impact on storage variability at the seasonal timescale.
[52] Our results showing the importance of land-atmosphere interaction in large basins are supported by previous work. Eltahir and Bras [1994] had shown precipitation recycling estimates of 25%-35% in the Amazon, and Makarieva and Gorshkov [2007] and Makarieva et al. [2009] have shown that for large basins, precipitation recycling is a necessary mechanism to bring water deeper than 600 km into a basin. It is likely that large-scale recycling is a mechanism which retains more water in storage in relatively wet regions and that this effect accounts for the larger annual period response in well-forested basins. This would signify that large-scale basin water dynamics are entirely distinct from field-scale or even regional-catchment-scale dynamics.
[53] While estimation of basin residence times based on these data sets was desired, it is not possible to do so quantitatively due to the fact that GRACE observes only anomalies of water storage, never an absolute amount, and the application of even a simple linear reservoir model requires more information. It is possible however to come to qualitative inferences about the relative residence tendencies for water storage across the study basins. The dynamics of the land-atmosphere link and the ability of large, well-forested basins to retain water in storage for long periods through recycling are important in large-scale hydrology, and suggest longer residence times in those basins. While a catchment-scale study may not consider atmospheric dynamics, land-atmosphere interactions, or precipitation recycling, for global-scale basins, these processes are probably of critical importance.
[54] Based on our results, we can now predict a monthly storage anomaly time series using GPCP precipitation time series and two land-surface variables (percent forest cover and soil water holding capacity) in the world's megabasins. Storage time series for large basins are uncommon and of use to water managers for understanding total water availability, and to large-scale hydrologic or land-surface modelers as a state variable for calibration and validation. While the simple model we have presented here does not account for nonlinear transitions in land-cover type or the evolution of terrestrial ecosystems, it offers the first means for predicting typical storage variability time series within the limits of a stable ecosystem.