Volume 59, Issue 11 e2022WR034310
Review Article
Open Access

A Decade of Data-Driven Water Budgets: Synthesis and Bibliometric Review

Kelley Moyers

Department of Civil and Environmental Engineering, University of California, Merced, Merced, CA, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - original draft, Writing - review & editing, Visualization

Robert Sabie

New Mexico Water Resources Research Institute, New Mexico State University, Las Cruces, NM, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - review & editing, Visualization

Emily Waring

Department of Civil and Environmental Engineering, University of California, Merced, Merced, CA, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - review & editing, Visualization

Jorge Preciado

New Mexico Water Resources Research Institute, New Mexico State University, Las Cruces, NM, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - review & editing

Colleen C. Naughton

Department of Civil and Environmental Engineering, University of California, Merced, Merced, CA, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - review & editing

Thomas Harmon

Department of Civil and Environmental Engineering, University of California, Merced, Merced, CA, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - review & editing

Mohammad Safeeq

Department of Civil and Environmental Engineering, University of California, Merced, Merced, CA, USA

Contribution: Conceptualization, Writing - review & editing, Visualization

Alfonso Torres-Rua

Civil and Environmental Engineering Department, Utah State University, Logan, UT, USA

Contribution: Conceptualization, Writing - review & editing

Alexander Fernald

New Mexico Water Resources Research Institute, New Mexico State University, Las Cruces, NM, USA

Contribution: Conceptualization, Writing - review & editing

Joshua H. Viers

Corresponding Author

Department of Civil and Environmental Engineering, University of California, Merced, Merced, CA, USA

Correspondence to: J. H. Viers, [email protected]

Contribution: Conceptualization, Methodology, Resources, Writing - review & editing, Visualization, Supervision, Project administration, Funding acquisition

First published: 16 November 2023

Abstract

Scarce water resources across the globe have prompted the development of data-driven water budgets to account for and distribute limited water more effectively across various land uses and purposes. Data-driven approaches for estimating individual water budget components have been extensively developed and subsequently reviewed (e.g., evapotranspiration, precipitation, groundwater, surface water, runoff), but the state of the art of data-driven approaches for estimating and integrating complete water budgets has not been the subject of a review paper to our knowledge. In this review, we fill this void by reviewing 81 systematically identified publications from the last decade (2012–2022) on data-driven water budget approaches. We describe the current state of measurements and data products for data-driven water budgets for various spatiotemporal scales. Our analysis suggests that spatiotemporal parameters drive the approach for data-driven water budgets, with larger spatiotemporal scales relying more on satellite remote sensing data products and smaller spatiotemporal scales relying more on ground-based monitoring. The incorporation of satellite remote sensing data products and ground-based monitoring was common across various spatiotemporal scales and enabled the estimation of complete water budgets in areas of limited data availability. We conclude that improved reporting of simplifying assumptions, uncertainty analysis methods, and data sources are required for the alignment of water budget estimations between resource managers at varied spatiotemporal scales. Our review calls for the standardization of data-driven water budget reporting protocols to improve the interpretability of data-driven water budgets across decision-makers working at various spatiotemporal scales.

Key Points

  • The frequency and spatial scale of data-driven water budgets differ in the integration of ground-based and satellite remote sensing data

  • The simplifying assumptions, uncertainty, and data sources should be reported in a data-driven water budget

  • The standardization of data-driven water budget reporting protocols is needed for decision-makers working at various spatiotemporal scales

Plain Language Summary

Scarce water resources across the globe have prompted the monitoring of water budgets using data, which we refer to as a data-driven water budget. We review the current state of measurements and data products for data-driven water budgets of various scales—from field to region and over minutes to decades. We conclude that satellite remote sensing data products are suitable for data-driven water budgets over large areas and long periods, but ground-based data are more suitable over small areas and shorter periods. We identify challenges for data-driven water budgets, which call for standardized reporting protocols for data-driven water budgets.

1 Introduction

Scarce water resources globally call for the development and adoption of water budget accounting tools by decision-makers—that is, individuals or organizations who use water budgets for decision-making—to achieve sustainable water management for agriculture, ecosystems, and communities. The emergence of new sensors for water component measurements and sources of publicly available water data sets has opened opportunities for developing data-driven water budgets (DDWB)—the monitoring of water budgets using empirical data. To our knowledge, there is limited literature that has synthesized or conducted a bibliometric review of methodological approaches for integrating multiple data sources into a DDWB.

Water budgets account for the storage and fluxes of water within a defined system, or control volume. A water budget is an estimate of the inflows, outflows, and changes in water storage of a system over a specified period for a region of interest (Singh, 2016). The water balance, or “continuity,” equation states that the difference between the inflow to a system and the outflow from the system equals the change in storage of the system over a given time interval:
$\text{Inflow}-\text{Outflow}=\frac{dS}{dt}\pm \boldsymbol{\varepsilon}$ (1)
where $dS$ is the change in storage over a time interval $dt$ and $\boldsymbol{\varepsilon}$ is the residual term, which includes any water not accounted for elsewhere in the continuity equation. The continuity equation, a statement of conservation of mass, holds at any temporal or spatial scale. Inflows include but are not limited to precipitation, irrigation, upward flow from the water table, subsurface lateral inflow, surface run-on, and surface streamflow. Outflows include but are not limited to evapotranspiration, subsurface lateral outflow, surface runoff, vertical drainage, and diversion. All components are typically expressed as volume per unit time, or as depth per unit time if normalized by the system area.

The residual (or remainder) term, $\boldsymbol{\varepsilon}$, quantifies the water balance error by combining measurement errors, uncertainties, and model assumptions; it also indicates the non-closure of the measured water budget. Commonly, the continuity equation is simplified to fewer terms. The simplified version can be used at various spatiotemporal scales to calculate an unknown water budget component as the residual (or remainder) when the other components are measured. Although $\boldsymbol{\varepsilon}$ may be assumed negligible under certain circumstances, the water budget error is then included within the calculated water budget component.
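To make the role of the residual concrete, the following minimal sketch evaluates Equation 1 for a single time step and then estimates one outflow as the remainder. The component names, the unit (mm per time step), and all numeric values are hypothetical and chosen only for illustration; they are not taken from any reviewed study.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class WaterBudget:
    """One time step of a water budget; every term in mm per time step (assumed unit)."""

    inflows: dict[str, float]
    outflows: dict[str, float]
    storage_change: float  # dS over the time step

    def residual(self) -> float:
        """Non-closure term: (inflow - outflow) - storage change."""
        return (
            sum(self.inflows.values())
            - sum(self.outflows.values())
            - self.storage_change
        )


def remainder_outflow(inflows, known_outflows, storage_change):
    """Estimate one unmeasured outflow as the remainder, assuming the residual is zero."""
    return sum(inflows.values()) - sum(known_outflows.values()) - storage_change


# Hypothetical field-scale values (mm per month).
budget = WaterBudget(
    inflows={"precipitation": 80.0, "irrigation": 40.0},
    outflows={"evapotranspiration": 95.0, "runoff": 10.0, "drainage": 5.0},
    storage_change=6.0,
)
closure_error = budget.residual()

# If ET were not measured, solving for it as the remainder hides the closure error in ET.
et_estimate = remainder_outflow(
    budget.inflows, {"runoff": 10.0, "drainage": 5.0}, budget.storage_change
)
print(closure_error, et_estimate)
```

In this sketch the remainder-based ET estimate differs from the measured ET by exactly the closure error, which is the situation described above when the residual is assumed negligible.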

Climate change is impacting the hydroclimatic system. For example, climate change has resulted in changes to the soil moisture supply and atmospheric demand (Novick et al., 2016) and alterations in the spatiotemporal pattern and magnitude of droughts and flooding (Madsen et al., 2014). Furthermore, climate warming and its impact on the hydroclimatic system present an emerging challenge to water resources planning and management objectives because non-stationary behaviors are not well addressed in most water budgets (Milly et al., 2008). These climate impacts complicate water resource planning and management through the increased frequency of less predictable inter- and intra-annual hydroclimatic whiplash events (Swain et al., 2018). Shifts in the hydroclimatic system due to climate change call for the development of DDWB to aid water resources planning and management in maintaining water security.

New water budget approaches are needed to address critical changes in climate conditions, water management objectives, and land use decisions that are upending historic water use and water supply relationships. Innovation in measurements for improved accuracy and precision and the capability of long-term data records are needed for DDWB to work toward a secure water future for agriculture, ecosystems, and communities in the face of climate change and water scarcity.

Water is a shared resource, so it is important to align the management objectives of resource managers who work at various spatiotemporal scales in any water resources planning and management. For example, water budgeting done at the field-scale by private property owners does not typically consider water management at regional scales that fall outside property lines. This lack of alignment across spatial scales has led to field-scale water budgets that ignore the impact of groundwater-dependent, daily irrigation on larger-scale water budgets, leading to negative consequences such as regional groundwater overdraft, as in the United States High Plains and California's Central Valley (Scanlon et al., 2012). Aligning water management decisions by entities at the field-scale with regional-scale climate and water policies is necessary for the long-term equitable distribution of water across various decision-making domains (Figure 1). This alignment can be assisted by developing DDWB at all spatiotemporal scales that are relevant to water management. Sensors for measuring various water budget components in a DDWB are designed for use under specific protocols, such as for specific spatiotemporal scales. Therefore, the data sources (and the underlying sensors collecting the data) limit the spatiotemporal scale of water management for which the data can be used and have implications for the uncertainty of the water budget, meriting a literature review on the state-of-the-art data used for DDWB.

Figure 1. Conceptual diagram showing different water management objectives at field-scale and regional-scale and the relevant measurement scales.

The primary objective of this synthesis and bibliometric review is to evaluate the scientific literature on DDWB from the last decade and answer the following specific questions:
  1. What types of measurements and data products have been used for various spatiotemporal scales of DDWB (Section 3)?

  2. What are the types of DDWB and their suitable applications (Section 4)?

  3. What are the challenges of quantifying uncertainty in DDWB (Section 5)?

  4. What research opportunities exist for DDWB (Section 6)?

2 Methodology

We applied a systematic review methodology for identifying representative literature that examines the measurements and data products used for data-driven water budgets in the scientific literature. Our intent was not to present an exhaustive review of water budgets, for which the number of relevant publications is in the thousands, but rather to focus on publications that help answer our four research questions. We also did not intend to scope all potential data products, but rather to assess the types of data products used for water budgets from a systematic sample of publications, with the intent of gaining insight into the state-of-the-art of DDWB and recommendations for future research on DDWB. Our approach is an adaptation of existing scoping review methodologies (Arksey & O’Malley, 2005; Peters et al., 2015; Whipple & Viers, 2019) and incorporates suggestions for systematic reviews (Sugg et al., 2020; Xiao & Watson, 2019).

We performed a primary search query in Web of Science to identify 200 initial publications (Figure 2a) and then followed the systematic method shown in Figure 2b. The query used terms selected from an initial screening of literature topics, titles, and keywords. Our aim was to focus on water budgets that employed measurements, so the query required “measurement” in the title, keywords, or abstract, which also helped limit the publications to a reviewable number. Water budgets go by other names, such as water accounting, water balance, hydrologic budget, and water footprint, so these terms were searched in the title. We further limited the query to publications that were in English, published between 2012 and 2022, and within the Web of Science Water Resources category. An initial scan of the abstracts of the 200 publications indicated that the query yielded mostly publications focused on DDWB. We then screened the results to retain only the publications focused on DDWB, as follows. Query results were evaluated for publications that focused on water accounting conducted from field to regional scales, excluding global studies, which left 105 publications. One publication was eliminated for using outdated measurements and data products from 1940 to 1977, and one publication was eliminated for estimating a water budget not built on the physics-based continuity equation. Twenty-two publications were eliminated after reading them because they primarily used data for parametrizing or validating hydrological models, thus narrowing the publications included in our critical analysis to 81 peer-reviewed studies. A post-review search was conducted to find potentially missed literature with high citation counts in journals with high impact factors. These articles were included in our discussion section but were not included as part of the systematic review.
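The screening counts reported above can be checked with a few lines of arithmetic. This sketch only restates the numbers given in the text; the count removed at the scale screen is derived here by subtraction and is not stated in the source.

```python
# Publication counts as reported in the screening description.
query_results = 200
kept_after_scale_screen = 105  # field- to regional-scale studies

exclusions = {
    "outdated measurements and data products (1940-1977)": 1,
    "not built on the continuity equation": 1,
    "data used mainly to parametrize or validate models": 22,
}

included = kept_after_scale_screen - sum(exclusions.values())

# Derived value, not stated in the text: publications dropped at the scale screen.
removed_at_scale_screen = query_results - kept_after_scale_screen

print(included, removed_at_scale_screen)
```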

Figure 2. (a) The Web of Science query gathered the publications with topic equal to “measurement” and language equal to “English” and Web of Science category equal to “Water Resources” and publication date in the period 2012 to 2022 (up until 27 July 2022) and titles including the words “water accounting,” “water budget,” “water balance,” “hydrologic budget,” or “water footprint.” (b) Flow diagram reporting the number of publications identified through the Web of Science query, the total publications screened, the exclusion criteria applied to the group of publications reviewed in this manuscript, the total publications assessed for eligibility, and the total publications included in the critical analysis.

Using the 81 studies we identified through our search as the foundation of this review, the remainder of this paper is structured as follows: Section 3 reviews the current state of measurements and data products for DDWB; Section 4 reviews data-driven water budget approaches and their suitable applications; Section 5 reviews the uncertainty challenges for DDWB; and Section 6 discusses future research opportunities in DDWB. We provide concluding remarks and suggested improvements in Section 7.

3 Current State of Measurements and Data Products for Data-Driven Water Budgets

In this section, we briefly describe the commonly used measurements and/or data product types for the major water budget components that are used in DDWB. We observed that precipitation (Section 3.1) and evapotranspiration (Section 3.2) were heavily studied in the literature we reviewed, so we included separate sub-sections on the measurements and data products for those major water budget components. Then we describe the measurements and data products for any other inflows and outflows (Section 3.3) and the changes in water storage (Section 3.4) discussed in the literature. We explain the recommended spatial and temporal scales for each type of measurement and/or data product.

3.1 Precipitation Data

Precipitation for DDWB generally relies on either ground-based precipitation measurements or gridded precipitation data products. For regional scales, gridded data products are useful and are most often derived from remote sensing, which can include data from radar, microwave, infrared, and thermal imagers. Other gridded precipitation products can be derived from spatially interpolated weather station networks interfaced with precipitation gauges. At the field-scale, an individual precipitation gauge, often a tipping bucket rain gauge or weighing precipitation gauge, is deployed. Gridded data products are sufficient for water budget studies spanning months to decades, whereas individual precipitation gauges can be suitable for finer temporal scales down to hourly.

Several regional-scale studies have used globally or provincially available gridded precipitation data products. For example, data from the Tropical Rainfall Measuring Mission (TRMM), which uses active and passive microwave instruments, have been used for regional-scale water budgets in tropical and subtropical areas with heavy to moderate precipitation (Armanios & Fisher, 2014; Moreira et al., 2019; Rusli et al., 2021; Soltani et al., 2020; D. Zhang et al., 2016). Provincially available radar, such as Next-Generation Radar (NEXRAD), which is based on a network of high-resolution S-band Doppler weather radars operated by the National Weather Service, can be viewed as a mosaic map and has been used in a daily water budget for a wetland in Florida, United States (Polatel, 2015). Many gridded data products are available through government agencies, but some global regions do not have their own gridded data products and rely on open-source data products from the United States, European Union, and others.

Where gridded data products are not readily available, a network of precipitation gauges can provide the data needed to develop a precipitation map. Various monitoring networks rely on multiple point-based weather stations interfaced with precipitation gauges for regional-scale water budgets (Henn et al., 2018; Safeeq et al., 2021; Zheng et al., 2022). Some precipitation gauge networks use spatial interpolation methods, such as the Thiessen polygon approach (Rusli et al., 2021). However, where spatial data availability is low, a single gauge is sometimes extrapolated uniformly to the regional scale.
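As an illustration of the Thiessen polygon idea, the sketch below approximates each gauge's area fraction by assigning the cells of a regular grid to their nearest gauge, then forms an area-weighted mean precipitation. The rectangular domain, gauge coordinates, and precipitation depths are hypothetical, and the grid approximation is an assumption made for brevity rather than the procedure of any reviewed study.

```python
import math


def thiessen_weights(gauges, x_range, y_range, n=200):
    """Approximate Thiessen area fractions on an n x n grid of cell centers.

    gauges maps a gauge name to its (x, y) location; each cell is credited to
    the nearest gauge, which is the defining property of a Thiessen polygon.
    """
    counts = {name: 0 for name in gauges}
    x0, x1 = x_range
    y0, y1 = y_range
    for i in range(n):
        x = x0 + (i + 0.5) * (x1 - x0) / n
        for j in range(n):
            y = y0 + (j + 0.5) * (y1 - y0) / n
            nearest = min(
                gauges,
                key=lambda g: math.hypot(x - gauges[g][0], y - gauges[g][1]),
            )
            counts[nearest] += 1
    return {name: count / (n * n) for name, count in counts.items()}


def areal_mean(weights, depths):
    """Area-weighted mean precipitation depth."""
    return sum(weights[name] * depths[name] for name in weights)


# Hypothetical 10 km x 10 km region with two gauges (coordinates in km).
gauges = {"A": (0.0, 5.0), "B": (5.0, 5.0)}
weights = thiessen_weights(gauges, x_range=(0.0, 10.0), y_range=(0.0, 10.0))
mean_precip_mm = areal_mean(weights, {"A": 10.0, "B": 20.0})
print(weights, mean_precip_mm)
```

With these two gauges the dividing line falls at x = 2.5 km, so gauge A represents a quarter of the region and gauge B the remainder. Extrapolating a single gauge uniformly corresponds to a weight of 1 for that gauge.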

Field-scale water budgets sometimes use a single on-site precipitation gauge (Fouli et al., 2012) or precipitation gauge data from a nearby weather station in a government network, such as the Australian Bureau of Meteorology (Dean et al., 2016) or the California Irrigation Management Information System (Kisekka et al., 2019). Relying on one precipitation gauge assumes uniform precipitation throughout the entire study area, although precipitation is collected only over the area of the gauge orifice and is subject to undercatch in windy conditions (Liljedahl et al., 2017).

In areas with limited temporal availability of data, precipitation gauges can be installed for a short period and the data can be linearly regressed with a gridded precipitation data product that has a longer period of data available (such as the satellite-based Climate Hazards Group Infrared Precipitation v.2) extracted for the site area to extend data over the full time period of the DDWB (Alemu et al., 2020).
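A minimal sketch of this record-extension idea follows: an ordinary least-squares line is fitted between the gridded product and the gauge over their overlapping period and then applied to the longer gridded record. All values are hypothetical monthly totals in mm, and the function names are illustrative only.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extend_record(gridded_series, slope, intercept):
    """Predict gauge-equivalent precipitation, clipped at zero, from the gridded product."""
    return [max(slope * value + intercept, 0.0) for value in gridded_series]


# Overlap period: gridded product extracted at the site vs. the short gauge record.
gridded_overlap = [10.0, 20.0, 30.0, 40.0]
gauge_overlap = [13.0, 25.0, 37.0, 49.0]
slope, intercept = fit_line(gridded_overlap, gauge_overlap)

# Earlier months for which only the gridded product exists.
gridded_history = [50.0, 5.0]
gauge_equivalent = extend_record(gridded_history, slope, intercept)
print(slope, intercept, gauge_equivalent)
```

Clipping at zero is a design choice of this sketch, made so that a negative intercept cannot produce negative precipitation; the goodness of fit over the overlap period should be checked before the extended series is used.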

Precipitation as snow is usually only considered at regional scales at monthly to annual time scales, and several types of data products exist. Lidar-derived snow data such as the National Aeronautics and Space Administration (NASA) Airborne Snow Observatory (ASO) provide gridded estimates of snow depth from the time of peak snow water equivalent (SWE) through the snow ablation season (Henn et al., 2018). Snow depth can be estimated using spatially distributed ultrasonic snow-depth sensors and co-located soil moisture sensors for monthly and annual water budgets (Saksa et al., 2017). Gridded snow-covered area data are available for daily water budgets from Moderate Resolution Imaging Spectroradiometer (MODIS) snow products (MOD10A2) using a visible to thermal-infrared spectroradiometer and mapping algorithms (Savean et al., 2015). SWE expresses the amount of water available in snow and could be used for water budgets at monthly and annual time scales (Zheng et al., 2022). The depth of snow accumulation on a board can be recorded at the field-scale at monthly intervals or as often as site visits for manual readings are practical (Fouli et al., 2012). Snow is not usually considered in field-scale water budgets because it is labor-intensive to measure, and snow is sometimes nonexistent in irrigated agricultural areas where water budgets are monitored or, alternatively, snow effects are overridden by the first irrigation event of the season.
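For readers converting snow depth to a water budget term, the sketch below applies the standard relation that SWE equals snow depth multiplied by the ratio of snow density to water density. The snow density used here is a hypothetical value; in practice it varies with snowpack age and conditions and would need to be measured or estimated.

```python
WATER_DENSITY_KG_M3 = 1000.0


def swe_mm(snow_depth_m, snow_density_kg_m3):
    """Snow water equivalent in mm: depth x (snow density / water density)."""
    return snow_depth_m * (snow_density_kg_m3 / WATER_DENSITY_KG_M3) * 1000.0


# Hypothetical manual snow-board readings (m) and an assumed bulk density (kg/m^3).
depths_m = [0.2, 0.5, 0.3]
assumed_density = 300.0
swe_series_mm = [swe_mm(depth, assumed_density) for depth in depths_m]
print(swe_series_mm)
```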

3.2 Evapotranspiration Data

Evapotranspiration (ET) data generally relies on gridded data products, soil water balance, flux station measurements, isotope mass balance, or weather station data for empirically derived reference ET. In our review, water budgets that focus heavily on ET measurements were mostly in agricultural or rangeland settings where root water uptake and evaporative demand strongly increase ET, making it an important component to measure in the water budget.

Field-scale water budgets rely on ground-based evapotranspiration measurements. Lysimeters can measure the water budget of a small plant area for deriving actual ET (ETa) and are often assumed to represent ETa for a whole field (Abou Zakhem et al., 2019; Darzi-Naftchali et al., 2013). Eddy covariance has been widely used for determining ETa at the field-scale for data-driven water budgets at various temporal aggregations ranging from daily to annually (Campos et al., 2016; Dean et al., 2016; Denager et al., 2020; Kozii et al., 2020; Liljedahl et al., 2017; Pan et al., 2017; Schreiner-McGraw et al., 2016; Scott & Biederman, 2019; Webb et al., 2017; Xie et al., 2014). The crop coefficient approach multiplies a reference ET (ETo) by a crop coefficient which is defined as the ratio of observed crop evapotranspiration (ETc) to ETo (Allen et al., 1998). This approach can be used for estimating ETc under well-watered conditions and only requires simple meteorological data (Fouli et al., 2012; Garrido-Rubio et al., 2020).
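The crop coefficient approach reduces to a single multiplication, ETc = Kc × ETo, which the sketch below applies day by day. The growth stages, Kc values, and ETo values are hypothetical placeholders; actual coefficients depend on the crop and are tabulated in sources such as Allen et al. (1998).

```python
# Hypothetical crop coefficients by growth stage (dimensionless).
KC_BY_STAGE = {"initial": 0.5, "mid": 1.0, "late": 0.75}


def crop_et(reference_et_mm, stage, kc_by_stage=KC_BY_STAGE):
    """Crop evapotranspiration under well-watered conditions: ETc = Kc * ETo."""
    return kc_by_stage[stage] * reference_et_mm


# Daily records as (growth stage, reference ET in mm/day); values are illustrative.
daily_records = [("initial", 4.0), ("mid", 6.0), ("late", 4.0)]
daily_etc_mm = [crop_et(eto, stage) for stage, eto in daily_records]
total_etc_mm = sum(daily_etc_mm)
print(daily_etc_mm, total_etc_mm)
```

Because Kc is defined for well-watered conditions, this calculation yields ETc rather than ETa; under water stress the two can differ.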

Eddy covariance-based ET can be extrapolated to the basin-scale using remote sensing derived vegetation parameters. A gridded ET product in an undisturbed tropical woodland in Brazil was developed to spatially extrapolate from eddy covariance flux tower measurements over nearby regions using a best-fit equation between ET and MODIS-based enhanced vegetation index (EVI) and grass-based reference ET (ETo) derived from ground-based meteorological data (Oliveira et al., 2015). In the southern Sierra Nevada, annual catchment and river basin ET values were derived using a linear regression between ground-based ET from 10 eddy covariance towers and Normalized Difference Vegetation Index (NDVI) from Landsat 5, 7, and 8 (Safeeq et al., 2021).

Gridded ET products derived from satellite-based remote sensing are suitable for regional scale water budgets over long study periods of years to decades. In a comparison of seven gridded ET products for sub-basins in Thailand, a global 8-day interval data set at 0.5 km spatial resolution (MODIS MOD16A2) predicted the annual and monthly temporal variability of storage changes in a water balance framework with coefficients of variation of 0.97 and 0.70 respectively, although there was a bias that needed to be corrected (Sriwongsitanon et al., 2020). The Global Land Data Assimilation System (GLDAS) ET data product (https://ldas.gsfc.nasa.gov/gldas) has also been used in water budget studies (Lv et al., 2017; Zheng et al., 2022). A method for the basin-mean GLDAS ET data product reconstruction was tested to correct for precipitation underestimation, runoff under/overestimation, and the impact of irrigation on ET, which was not well considered in the raw ET product (Lv et al., 2017).

While gridded ET data products other than the products mentioned here exist, not all have been used for water budgets, and we limit our description to the ones identified in our systematic review. Gridded ET products used in water budgets from publications in this review involved pixel sizes ranging from 0.5 to 25 km and are too large for most field-scale water budgets. Careful attention should be given to whether a satellite remote sensing ET data product provides an ETa or an ETc estimate due to the potential effect of the local environmental and management conditions.

3.3 Other Inflow and Outflow Data

Field-scale DDWB approaches commonly include other inflows and outflows, such as soil flux measurements, runoff, capillary flow, and irrigation. Soil flux measurement methods include capillary wicks (Fouli et al., 2012), ceramic cups, and weighing lysimeters in drip irrigated agricultural areas (Abou Zakhem et al., 2019). Drainage was measured using tipping bucket rain gauges in potted plants (Jimenez-Buendia et al., 2015), a stopwatch and buckets from artificial subsurface drainage (Barnard et al., 2017), and without-end lysimeters (Darzi-Naftchali et al., 2013). Infiltration was measured using tracers and infiltration tests in an arctic coastal plain lake (Koch, 2016). Field-scale runoff was measured directly in research plots via collection systems such as gutters, flumes, or weirs (Fouli et al., 2012). Capillary flow was estimated using measurements of micrometeorological variables, soil moisture content profiles, water-table levels, sap flow velocities, and stem diameter variations to determine the groundwater supply for transpiration in a remnant urban reserve in Australia (Marchionni et al., 2019). Irrigation can be measured using flow meters. Field-scale soil flux measurements, runoff, capillary flow, and irrigation require ground-based measurements.

Regional-scale DDWB approaches commonly include other inflows or outflows, such as streamflow, irrigation, and interception. Streamflow was estimated using pressure transducers combined with flumes or weirs in conjunction with stage-discharge rating curves (Dean et al., 2016; Henn et al., 2018; Safeeq et al., 2021; Schreiner-McGraw et al., 2016), an acoustic Doppler current profiler (Xing et al., 2012), V-notch weirs (Dean et al., 2016), and dam gate readings and pumping records (Xing et al., 2012). Governments and hydroelectric utilities sometimes make historical and active streamflow data publicly available. Regional-scale irrigation measurements using flow meters are costly and labor-intensive; thus, a mapped remote sensing-based soil water balance using the FAO-56 dual crop coefficient method was used for irrigation water accounting over 100,000 ha (Garrido-Rubio et al., 2020). Irrigation records from local farmers are sometimes available. In groundwater-dependent irrigated areas, irrigation is sometimes assumed to be equal to the groundwater depletion, such as in the Ogallala aquifer (Ouapo et al., 2014). Interception can be measured using troughs, as in the eucalyptus forest study by Mitchell et al. (2012). At the regional-scale, satellite remote sensing data products can provide estimates of irrigation, but ground-based measurements are needed for streamflow and interception.
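To show how a stage record from a pressure transducer becomes streamflow, the sketch below applies a power-law rating curve, Q = a(h − h0)^b, which is one commonly used functional form. The reviewed studies do not specify their rating equations here, so the form, the parameter values, and the stage readings are all assumptions for illustration.

```python
def rating_curve_discharge(stage_m, a, h0_m, b):
    """Discharge (m^3/s) from stage (m) using Q = a * (h - h0) ** b.

    h0_m is the stage of zero flow; stages at or below it give zero discharge.
    """
    effective_head = max(stage_m - h0_m, 0.0)
    return a * effective_head ** b


# Hypothetical rating parameters, normally fitted to paired stage-discharge gaugings.
A_COEFF, H0_M, B_EXP = 2.0, 0.1, 2.0

stage_record_m = [0.05, 0.1, 0.6, 1.1]
discharge_m3s = [
    rating_curve_discharge(h, A_COEFF, H0_M, B_EXP) for h in stage_record_m
]
print(discharge_m3s)
```

A rating curve is only valid within the range of stages covered by the gaugings used to fit it, so extrapolated high-flow values carry additional uncertainty in the water budget.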

3.4 Changes in Water Storage Data

Changes in soil water storage are usually considered in field-scale water budgets. For field-scale water budgets, changes in soil water storage are derived from soil water content measurements using a neutron probe (Abou Zakhem et al., 2019; Mitchell et al., 2012), time-domain reflectometry (S. Han et al., 2021; Mitchell et al., 2012), capacitance sensors (Barnard et al., 2017), and cosmic ray neutron sensing (Schreiner-McGraw et al., 2016). Distributed soil water content sensor networks can be spatially interpolated to cover wider areas (Graf et al., 2014), such as using soil dielectric probes and Thiessen polygon interpolation (Schreiner-McGraw et al., 2016), but the area covered is usually not large enough for regional-scale DDWB. Groundwater levels are sometimes monitored to validate deep percolation and seepage estimates, but groundwater storage is not usually considered in field-scale water budgets even though some agricultural areas depend on groundwater for irrigation.

Regional-scale DDWB approaches often consider surface water storage, groundwater storage, or terrestrial water storage (which combines water storage in the ground, the soil, and the plant biomass). Changes in lake storage were estimated using pressure transducers, water level monitoring equipment combined with bathymetry to estimate lake volumes, or an isotope mass balance approach to determine contributions from floodwaters to evaporation and groundwater (Masse-Dufresne et al., 2021). Groundwater storage and its changes are often neglected in short-term water budgets because they can be challenging to measure precisely. Level loggers (pressure transducers) and groundwater extraction data are the most direct assessments of groundwater storage. Multi-year regional-scale DDWB have employed the Gravity Recovery and Climate Experiment (GRACE) global terrestrial water storage anomaly product, which combines all water on the land surface and subsurface including soil water content, groundwater, surface water, and water stored in plant biomass (Armanios & Fisher, 2014; Lv et al., 2017; Rusli et al., 2021; Soltani et al., 2020; Sriwongsitanon et al., 2020; S. Wang et al., 2014; D. Zhang et al., 2016; Zheng et al., 2022). At relatively small (catchment) scales and for shallow groundwater, ground-based time-lapse gravimetry has shown some promise for assessing groundwater storage changes in terms of water table surface fluctuations (Arnoux et al., 2020). Groundwater discharge flux was measured via continuous radon measurements (Webb et al., 2017). For groundwater recharge, measurement methods include pressure transducers in a well for measuring water table fluctuations (Oliveira et al., 2015) and manual water table fluctuation readings (Barnard et al., 2017).

In summary, changes in water storage are measured using various ground-based measurements at all spatiotemporal scales, but GRACE can estimate changes in terrestrial water storage over years and decades at regional scales.
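As one example of turning a ground-based storage measurement into a budget term, the sketch below estimates recharge from well water levels using the water table fluctuation relation, recharge = specific yield × water table rise. It is deliberately simplified: it sums every observed rise and does not extrapolate the antecedent recession, and the specific yield and water levels are hypothetical values.

```python
def wtf_recharge_m(water_levels_m, specific_yield):
    """Simplified water table fluctuation estimate of recharge (m of water).

    Sums all rises between consecutive readings and multiplies by specific
    yield; declines are ignored. No recession extrapolation is applied.
    """
    total_rise = sum(
        max(later - earlier, 0.0)
        for earlier, later in zip(water_levels_m, water_levels_m[1:])
    )
    return specific_yield * total_rise


# Hypothetical water table elevations (m) from a logger or manual readings.
levels_m = [10.0, 10.2, 10.1, 10.5]
assumed_specific_yield = 0.1
recharge_m = wtf_recharge_m(levels_m, assumed_specific_yield)
print(recharge_m)
```

The result scales linearly with the assumed specific yield, so uncertainty in that parameter passes directly into the recharge term of the water budget.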

4 Classification of Data-Driven Water Budget Approaches and Their Applications

4.1 Approaches to Data-Driven Water Budgets and Their Applications

A critical analysis of data products and measurements discussed in Section 3 led to the following main classes of DDWB: solely employing ground-based data products (Section 4.2), solely employing satellite remote sensing data products (Section 4.3), and employing a hybrid of satellite remote sensing and ground-based data products (Section 4.4).

The spatial and temporal scales of a study are important for determining the suitable class of data products (Figure 3). Figure 3 is a conceptual diagram showing the classifications of measurements and data products for DDWB. Entirely ground-based DDWB are well suited for plant-scale and field-scale studies but can be extended to regional scales with an adequate sensor network. They are usually used for periods spanning days to years; sensor maintenance issues make them difficult to sustain over decades, and sampling issues make them difficult to apply over periods of hours. Satellite remote sensing products are most commonly used for regional-scale water budgets over study periods of years to decades. Combining ground-based measurements with satellite remote sensing can make it possible to derive field-scale and regional-scale water budgets at temporal scales of days to years, which are finer than those achieved by entirely remote sensing DDWB.


Figure 3. Conceptual diagram of time and length scales that are suitable for various data-driven water budget approaches: (1) solely employing ground-based data products (Section 4.2), (2) solely employing remote sensing data products (Section 4.3), and (3) employing a hybrid of remote sensing and ground-based data products (Section 4.4). n indicates the number of studies in each of these categories from this systematic review.

The most common applications of DDWB include estimating or partitioning the water budget, decision-making, using the water budget for validating another data source, using the water budget to calculate something difficult or impossible to measure, and evaluating water budget closure and/or uncertainty (Table 1). Ground-based measurements are commonly used for estimating and partitioning the water budget (20 publications) and for decision-making (16 publications). A full list of the details of the classifications of DDWB approaches and their applications can be found in Table S1.

Table 1. Contingency Table Showing the Primary (but Not Only) Application of Data-Driven Water Budgeting and the Types of Data-Driven Water Budgets for the 81 Publications in This Review

| Application | Remote sensing | Ground-based | Hybrid |
| --- | --- | --- | --- |
| Estimating/Partitioning | 4% (3) | 25% (20) | 11% (9) |
| Decision-Making | 1% (1) | 20% (16) | 2% (2) |
| Validating | 0% (0) | 5% (4) | 4% (3) |
| Difficult/Impossible | 0% (0) | 4% (3) | 6% (5) |
| Uncertainty/Closure | 1% (1) | 10% (8) | 7% (6) |

Note. Each cell gives the percentage of the 81 publications, with the number of publications in parentheses. The types of applications are estimating or partitioning the water budget (“Estimating/Partitioning”), using the water budget for decision-making (“Decision-Making”), using the water budget for validating another data source (“Validating”), using the water budget to calculate something difficult or impossible to measure (“Difficult/Impossible”), and evaluating water budget closure and/or uncertainty (“Uncertainty/Closure”). The types of data-driven water budgets are entirely ground-based data, a hybrid of ground-based data and satellite remote sensing data, and entirely satellite remote sensing.
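The arithmetic behind Table 1 can be checked with a short script. The sketch below transcribes the publication counts from the table and recomputes the column totals and the rounded percentages of the 81 publications; the variable and function names are hypothetical.

```python
# Publication counts transcribed from Table 1
# (rows: primary application, columns: type of data-driven water budget).
COUNTS = {
    "Estimating/Partitioning": {"Remote sensing": 3, "Ground-based": 20, "Hybrid": 9},
    "Decision-Making": {"Remote sensing": 1, "Ground-based": 16, "Hybrid": 2},
    "Validating": {"Remote sensing": 0, "Ground-based": 4, "Hybrid": 3},
    "Difficult/Impossible": {"Remote sensing": 0, "Ground-based": 3, "Hybrid": 5},
    "Uncertainty/Closure": {"Remote sensing": 1, "Ground-based": 8, "Hybrid": 6},
}


def column_totals(counts):
    """Number of publications per type of data-driven water budget."""
    totals = {}
    for row in counts.values():
        for column, n in row.items():
            totals[column] = totals.get(column, 0) + n
    return totals


def percentages(counts):
    """Each cell as a whole-number percentage of all publications."""
    grand_total = sum(column_totals(counts).values())
    return {
        application: {col: round(100 * n / grand_total) for col, n in row.items()}
        for application, row in counts.items()
    }


totals = column_totals(COUNTS)
shares = percentages(COUNTS)
```

The column totals recovered this way (5 entirely remote sensing, 51 entirely ground-based, and 25 hybrid publications) agree with the per-class counts quoted in Sections 4.3 and 4.4.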

4.2 Entirely Ground-Based Approach to DDWB and Its Suitable Applications

Several DDWB that relied on ground-based measurements focused primarily on estimating or partitioning the water budget into components (20 publications) (Table 1). Networks of ground-based sensor stations across heterogeneous landscapes can illuminate the spatiotemporal distribution of water budget components across a region, such as the critical zone of a semiarid savanna region in Arizona, United States over multiple years (Scott & Biederman, 2019), or over multiple diverse agricultural regions of the United States (Baffaut et al., 2020). Ground-based measurements can be suitable for understanding the contribution and partitioning of ET and its interception and transpiration components on the water budget, such as in a forested catchment in Sweden (Kozii et al., 2020) and in pine trees in a Mediterranean catchment (Mollema et al., 2013). Various ground-based measurements can be used to better understand groundwater-surface water interactions, such as in a groundwater-fed lake in Germany (Rudnick et al., 2015), between stream channels and groundwater recharge in an aquifer in Texas (Hauwert, 2016), and in a pond complex (Brannen et al., 2015). Surface water radon measurements were sensitive to rapid changes between surface water and groundwater in a drained agricultural floodplain in Australia (Webb et al., 2017). Isotope mass balance techniques can estimate groundwater-surface water interactions (Gaj et al., 2016; Gibson & Reid, 2014; Masse-Dufresne et al., 2021). 
Other applications of ground-based water budgets include estimating the rainfall/runoff process in tropical regions (Shiraki et al., 2017), evaluating the water budget in a tropical reservoir (Xing et al., 2012), determining the contribution of dewfall to the water budget in wet versus dry years in a coastal steppe ecosystem (Ucles et al., 2014), estimating the variation by topographic position and impacts of drought on water budget components in a mountainous catchment (Mitchell et al., 2012), and partitioning the water budget in urban areas (Marchionni et al., 2019).

DDWB can rely on ground-based measurements for the primary purpose of decision making (16 publications) (Table 1). Some agricultural decision-making examples include comparing the water budget components between different cropping systems to help decide which practice to follow (Fouli et al., 2012; Jahani et al., 2017; Li et al., 2019), comparing the effects of various irrigation systems on deep percolation (Darzi-Naftchali et al., 2013; LaHue & Linquist, 2021), determining the soil management effect on the water budget in dryland soil (D. Zhang et al., 2016; S. Zhang et al., 2016), and comparing water use efficiencies under different irrigation schemes (Barnard et al., 2017). Ecosystem decision-making examples include evaluating the projected water budget under various climate change scenarios using a Budyko framework (Csaki et al., 2020), assessing forest thinning impacts on the water balance in the Sierra Nevada mixed-conifer headwater basins to understand the extent to which biomass reductions increase runoff (Saksa et al., 2017), assessing the importance of regenerative forest management practices on the water budget (Munoz-Villers et al., 2012), and identifying three main phases of ecological development of a creek catchment (Schaaf et al., 2017). Ground-based measurements have also been used to quantify the water balance under various land use scenarios, such as converting from pasture to eucalyptus trees (Dean et al., 2016) or converting from grassland to forest plantations for production (Silveira et al., 2016), specifically focusing on the effects on groundwater resources. An urban development decision-making example included analyzing the water balance in the Tonle Sap Lake basin to aid in water resources planning in an area that is expected to expand with urban development (Kummu et al., 2014).

Ground-based measurements for DDWB can be used for uncertainty and water budget closure analysis (8 publications) (Table 1). Uncertainty analysis was performed using ground-based measurements in river water balance estimates (Adams et al., 2013), micro-irrigated agricultural fields (Kisekka et al., 2019), and canal seepage estimates using acoustic Doppler devices (Martin & Gates, 2014). Water budget closure analyses were performed using ground-based measurements in semi-arid watersheds using cosmic ray soil moisture sensing (Schreiner-McGraw et al., 2016) and at the field scale under seasonally frozen conditions (Pan et al., 2017). Error analyses using ground-based measurements were performed for a tundra water budget, focusing on precipitation underestimation (Liljedahl et al., 2017), and for the Tahoe basin (Trask et al., 2017).

Other publications used a ground-based measurement to estimate a water budget component and validated it against another data source (4 publications) (Table 1). Well data were evaluated against an agronomic mass balance approach for estimating water storage as the residual of the continuity equation in the Ogallala aquifer (Ouapo et al., 2014). ET from eddy covariance was validated against ET calculated as the residual of the continuity equation (Denager et al., 2020).

Ground-based measurements for DDWB can be used for estimating a water budget component that is difficult or impossible to measure (3 publications) (Table 1). ET was estimated using the soil water balance based on a lysimeter and neutron probe measurements in drip-irrigated orchards in arid and semi-arid areas and used for calculating the water productivity in a maize field and apple orchard (Abou Zakhem et al., 2019). Seasonal groundwater storage variation was quantified in a small mountain catchment using time-lapse gravimetry and stable isotope measurements (Arnoux et al., 2020). Soil water deficit was estimated using a combination of canopy temperature measurements and a soil water balance model (M. Han et al., 2018).

4.3 Entirely Remote Sensing Approach to DDWB and Its Suitable Applications

An entirely satellite remote sensing-based approach to DDWB is suitable for regional-scale water budgets (5 publications) (Table 1). Across our sampling of publications, entirely remote sensing-based regional-scale water budgets have better accuracy and closure at annual time scales than at sub-annual time scales. The disadvantage of entirely remote sensing-based water budgets is the inability to accurately account for surface water inflows and outflows, such as runoff or streamflow. Four of the five reviewed publications in this category used GRACE for estimating the change in water storage (Armanios & Fisher, 2014; Moreira et al., 2019; Soltani et al., 2020; D. Zhang et al., 2016); the remaining publication calculated the change in water storage using the continuity equation (Karimi et al., 2015).

An entirely remote sensing approach to DDWB is suitable for estimating or partitioning the water budget (3 publications) (Table 1). A DDWB was developed for assessing water budgets and estimating runoff in Tanzania using TRMM for precipitation, GRACE for change in water storage, and vegetation indices from MODIS, the surface radiation budget, and the Atmospheric Infrared Sounder (AIRS) for ET data (Armanios & Fisher, 2014). A water balance in the Yangtze River Basin employed TRMM, the MODIS ET product, and GRACE to estimate streamflow at monthly and annual time steps, with reasonable accuracy achieved only at annual time steps (D. Zhang et al., 2016). At monthly time steps, the largest error in the water balance was the change in water storage from GRACE, but at annual time steps, TRMM precipitation had the largest error. A terrestrial water budget in various basins in South America used TRMM, MOD16, the Global Land Evaporation Amsterdam Model (GLEAM), and GRACE (Moreira et al., 2019).
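The residual calculation behind such satellite-only budgets can be sketched as follows. This is an illustration only: it assumes a budget of the form Q = P - ET - ΔS, takes the storage change as a simple difference between consecutive storage anomalies, and uses made-up monthly values in millimetres. The function names are hypothetical, and the reviewed studies may have processed the products differently.

```python
def storage_change_from_anomalies(tws_anomaly_mm):
    """Storage change per time step (mm), taken as the difference between
    consecutive terrestrial water storage anomalies (backward difference)."""
    return [later - earlier
            for earlier, later in zip(tws_anomaly_mm, tws_anomaly_mm[1:])]


def residual_runoff(precip_mm, et_mm, delta_s_mm):
    """Runoff (mm) as the residual Q = P - ET - dS for each time step."""
    if not (len(precip_mm) == len(et_mm) == len(delta_s_mm)):
        raise ValueError("all series must cover the same time steps")
    return [p - et - ds for p, et, ds in zip(precip_mm, et_mm, delta_s_mm)]


# Hypothetical monthly values (mm); not taken from any reviewed study.
tws_anomaly = [10, 25, 5]      # three months of storage anomalies
delta_s = storage_change_from_anomalies(tws_anomaly)  # two monthly changes
precip = [100, 80]             # aligned with the two storage changes
et = [60, 70]
runoff = residual_runoff(precip, et, delta_s)
```

Because the runoff is a residual, any error in the precipitation, ET, or storage inputs passes directly into it, which is consistent with the better performance reported at annual than at monthly time steps.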

Remote sensing-based water budget uncertainty and non-closure were the subject of one of the 81 publications we reviewed. A probabilistic framework was used for a water budget in low-runoff regions of the groundwater-dependent, irrigated agricultural Central Basin of Iran using TRMM, ET from the Water Productivity Open Access Portal (WaPOR), and GRACE, with the First Order Reliability Method (FORM) addressing the mismatch in spatiotemporal resolutions of the satellite data products (Soltani et al., 2020). The publication concludes that water budget estimation using remote sensing data has reasonable closure at seasonal and annual time scales and that non-closure was most correlated with precipitation.

An entirely remote sensing approach to DDWB can be used for decision-making (1 publication). The Water Accounting Plus (WA+) framework was used in Ethiopia to predict famine using the two-layer ETLook surface energy balance model, input data from MODIS, daily rainfall maps from the US Agency for International Development (USAID) Famine Early Warning Systems Network (FEWS NET), and other remote sensing products (Karimi et al., 2015). The WA+ framework reports the hydrological processes and management issues in river basins using four reporting sheets and is flexible enough to incorporate satellite-based measurements of land and water processes in place of ground-based hydrological data sets when administrations will not share data (Karimi et al., 2015).

4.4 Hybrid Approach of Remote Sensing Data Products and Ground-Based Measurements and Its Suitable Applications

Some DDWB relied on a hybrid of remote sensing data and ground measurements (25 publications) (Table 1). A hybrid approach to DDWB was employed for a variety of suitable applications, including estimating or partitioning the water budget (9 publications), decision-making (2 publications), validating another data source (3 publications), calculating something difficult or impossible to measure (5 publications), and water budget uncertainty or non-closure analysis (6 publications).

When ground-based measurements of ET are not available, satellite remote sensing can be used to estimate ET from satellite data sources using algorithms such as the Surface Energy Balance Algorithm for Land (SEBAL) algorithm (van der Laan et al., 2019), the Mapping EvapoTranspiration at high Resolution with Internalized Calibration (METRIC) algorithm (Bjorneberg et al., 2020), GLDAS (Lv et al., 2017; Zheng et al., 2022), or the MODIS ET product (Falalakis & Gemitzi, 2020). Some publications relied on remote sensing for NDVI to derive vegetation parameters, such as leaf area index, fractional canopy cover, or vegetation height (Guo & Shen, 2015). Other publications relied on NDVI derived from remote sensing for determining the basal crop coefficient, Kcb, to estimate ETc using the crop coefficient approach (Garrido-Rubio et al., 2020; Hassan-Esfahani et al., 2015), or for determining basin-scale spatially distributed maps of ET from eddy covariance measurements (Avanzi et al., 2020; Safeeq et al., 2021). The EVI from MODIS has been used with an empirical model for estimating ETa using eddy covariance and reference ET (Oliveira et al., 2015). We conclude that the predominant reliance on remote sensing in hybrid studies is the result of ground-based data gaps in plant biophysical parameters and ET.
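The crop coefficient approach mentioned above can be sketched in a few lines. The example assumes the dual crop coefficient form ETc = (Kcb + Ke) × ET0 and a linear relation between NDVI and Kcb. The slope, intercept, and upper bound in `kcb_from_ndvi` are placeholder values chosen for illustration and are not taken from the cited publications, where such relations are crop- and site-specific.

```python
def kcb_from_ndvi(ndvi, slope=1.2, intercept=-0.1, kcb_max=1.15):
    """Basal crop coefficient from NDVI using a hypothetical linear relation.

    slope, intercept, and kcb_max are placeholder values for illustration;
    real applications calibrate them per crop and site.
    """
    kcb = slope * ndvi + intercept
    return min(max(kcb, 0.0), kcb_max)  # clip to a physically plausible range


def crop_et(et0_mm, kcb, ke=0.0):
    """Crop ET (mm) from reference ET: ETc = (Kcb + Ke) * ET0.

    ke is the soil evaporation coefficient; it defaults to zero here,
    which reduces the calculation to the basal (transpiration) component.
    """
    return (kcb + ke) * et0_mm


# Hypothetical daily inputs: NDVI from a satellite scene, ET0 from a station.
daily_etc = crop_et(et0_mm=6.0, kcb=kcb_from_ndvi(0.5))
```

In a hybrid DDWB of this kind, the NDVI would come from the remote sensing product and the reference ET from ground-based weather data.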

When ground-based measurements of terrestrial water storage change are not available, satellite remote sensing can estimate the terrestrial water storage change using GRACE in sufficiently large watersheds (Lv et al., 2017; Sriwongsitanon et al., 2020; Zheng et al., 2022). When point-based ground-based measurements of precipitation are not sufficient to cover large spatial scales, gridded products for precipitation can be used like the Parameter elevation Regression on Independent Slopes Model (PRISM) (Avanzi et al., 2020; Henn et al., 2018; Safeeq et al., 2021), the remote sensing based NEXRAD (Polatel, 2015), or the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) (Alemu et al., 2020; Rusli et al., 2021).

5 Uncertainty Challenges for Data-Driven Water Budgets

This section describes uncertainty challenges that are specific to DDWB observed in selected studies from the 81 studies we reviewed. The most important uncertainty challenges include (a) water budget closure simplifying assumptions, (b) calculating a water budget component using the continuity equation, and (c) spatiotemporal data scarcity. Justifiably negligible water budget components without available data are sometimes eliminated for convenience. If measurements or estimates of a particular water budget component cannot be made, it may be possible to make a simplifying assumption of eliminating the water budget component or calculating it as the residual of the continuity equation. Water budget analyses which assume water budget closure are highly sensitive to errors due to eliminated water budget components and propagated uncertainty of measured water budget components. For those reasons, an open water balance may be more informative (Kampf et al., 2020), but most publications in our systematic review assumed a closed water balance.

5.1 Water Budget Closure Simplifying Assumptions

Water budget closure simplifying assumptions that eliminate certain water budget components (perhaps due to convenience or lack of available data) can result in water budget imbalance. The assumptions of negligible net groundwater exchange and negligible deep-water storage possibly explained the water budget imbalance in a 36-year demonstration in the southern Sierra Nevada that closed the water balance within 10% of precipitation in river basins and within approximately 25% of precipitation in smaller headwater catchments (Safeeq et al., 2021). Distributed soil moisture measurements in the top 1 m suggested the possibility of groundwater exchange (contrary to the assumptions), which could result in the overestimation of fluxes derived from the water balance (when calculated as the residual). Three Alaskan arctic lake water budgets that included only post-snowmelt surface water fluxes could not close the budget for two lakes even when a Monte Carlo approach integrated uncertainty in the input parameters, but measurements and simulations indicated that unaccounted-for lateral inflows and outflows contributed to the budget non-closure (Koch, 2016). The assumptions of negligible fluxes of blowing snow and negligible upward soil moisture redistribution (which were not measured) possibly explained the inability to close the water balance in a seasonally frozen prairie pasture field site in Canada (Pan et al., 2017). A well-timed snow survey to reveal peak SWE before melt is recommended to avoid needing to measure blowing snow and sublimation. A site-specific measurement survey at the appropriate timing can control and/or confirm the validity of simplifying assumptions (Pan et al., 2017; Safeeq et al., 2021).
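Closure "within 10% of precipitation" can be made concrete with a small calculation. The sketch below assumes a generic budget of the form P = ET + Q + ΔS + ε, which may differ from the terms a given study retains, and expresses the imbalance ε as a percentage of precipitation. The function names and the annual values are hypothetical.

```python
def closure_error(precip, et, runoff, delta_storage):
    """Water budget imbalance (same units as the inputs):
    epsilon = P - ET - Q - dS."""
    return precip - et - runoff - delta_storage


def closure_error_percent(precip, et, runoff, delta_storage):
    """Imbalance expressed as a percentage of precipitation."""
    if precip <= 0:
        raise ValueError("precipitation must be positive")
    return 100.0 * closure_error(precip, et, runoff, delta_storage) / precip


# Hypothetical annual values (mm): storage change assumed negligible,
# as in the simplifying assumptions discussed above.
imbalance_pct = closure_error_percent(
    precip=1000.0, et=550.0, runoff=380.0, delta_storage=0.0
)
```

In this made-up case the budget closes within 7% of precipitation. If an eliminated component such as groundwater exchange were in fact non-negligible, it would appear in this imbalance rather than in any measured term.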

A lack of available data can necessitate simplifying water budget closure assumptions that eliminate components. Snowmelt and ice melt are sometimes eliminated due to a lack of available data such as in a water budget of the Dudh Koshi River basin in Nepal (Savean et al., 2015) and in a water budget within Yosemite National Park in the United States (Henn et al., 2018). Eliminating snow and ice-related water budget components could lead to improper modeling of subsequent processes, such as runoff or groundwater percolation that may be important to adjacent land use areas. Lack of available data on surface water inflows and outflows, such as runoff or streamflow in entirely remote sensing-based water budgets could contribute to water budget non-closure (Armanios & Fisher, 2014; Soltani et al., 2020; D. Zhang et al., 2016; S. Zhang et al., 2016). Eliminating components without data can be justifiable at specific scales, such as the assumptions of negligible surface runoff at annual time scales at regional scales such as in a large watershed (Armanios & Fisher, 2014; Soltani et al., 2020; D. Zhang et al., 2016; S. Zhang et al., 2016). Unknown input of water through fog interception can also affect the water budget closure at field-scale in micro-irrigated almond orchards, but the lack of available data on fog interception led to the elimination of this water budget component (Kisekka et al., 2019). Possible sources of non-closure in an entirely satellite remote sensing DDWB included the inconsistency of the spatiotemporal resolutions of the remote sensing data products, and the assumption that runoff was negligible (Soltani et al., 2020). Publications should include detailed reporting of the simplifying water budget closure assumptions and justification for the elimination of specific water budget components that were not measured.

Climate change shifts the hydroclimate, which could result in certain water budget components becoming significant and necessary to include in the continuity equation. For example, the common assumption of net-zero changes in groundwater storage in a watershed presumes stationarity of groundwater recharge, which is tied to precipitation and likely nonstationary in the changing climate. Most studies started with prior knowledge about the site, enabling the authors to use a simplified water budget model based on the continuity equation shown by Equation 1. The partitioning of precipitation can shift with climate change during drought, altering the precipitation-runoff relationship and altering groundwater recharge. The predictive skill of precipitation-runoff modeling can decrease during drought and result in the underestimation of the response time of ET to inter-annual climate changes (defined as the climate elasticity of ET) (Avanzi et al., 2020). Careful attention should be given to the simplifications of the continuity equation used to develop water budget accounting frameworks, as contemporary climate change shifts can confound hydroclimate parameterization.

In the last decade, novel approaches have been developed for estimating difficult-to-measure components that are sometimes eliminated from water budgets. Dewfall is often ignored in water budgets but can have an important contribution to the water balance in arid ecosystems (Hill et al., 2020; Ucles et al., 2014). Increased dewfall contribution to precipitation calls for the adoption of dewfall estimation methods to determine total precipitation. The Combined Dewfall Estimation Method combined the single-source Penman-Monteith evaporation model with data including leaf wetness, precipitation, and soil surface and dew point temperatures in a Mediterranean steppe ecosystem in southeast Spain (Ucles et al., 2014). Dewfall was monitored continuously over 3 years using a modified Hiltner dew balance model and a digital data logger in the Negev Desert (Hill et al., 2020). Measurements of micrometeorological variables, soil moisture content profiles, water-table levels, sap flow velocities, and stem diameter variations were used to determine that groundwater supplied 30%–40% of total water for transpiration during the driest part of the year in addition to supplying water for transpiration at night in a water balance in remnant urban reserves in Australia (Marchionni et al., 2019). Estimating difficult-to-measure components may require specialized equipment that may be costly or time-consuming to implement in water budgets, as in the cases of dew and capillary flow, leading to simplifying water budget closure assumptions.

Several studies suggest that longer time scales result in better water budget closure (Graf et al., 2014; Polatel, 2015; Safeeq et al., 2021). The annual timescale was the finest time scale possible for a water budget in the Yellow River Basin and Changjiang River Basin in China because of the lack of available data on naturalized streamflow, irrigation water, and water diversion at finer time scales (Lv et al., 2017). These studies suggest that comparing implementations of DDWB under more than one timescale could provide helpful information for evaluating issues with water budget non-closure.
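One way to compare closure across time scales is to aggregate the same residual series to coarser steps and recompute the imbalance relative to precipitation. The sketch below does this with synthetic monthly values constructed so that residuals of opposite sign partly cancel over a year. That cancellation is an assumption made for illustration, not a finding from the reviewed studies, and the function names are hypothetical.

```python
def aggregate(values, block):
    """Sum consecutive blocks of `block` values (e.g., 12 months into 1 year)."""
    if block <= 0 or len(values) % block != 0:
        raise ValueError("series length must be a multiple of the block size")
    return [sum(values[i:i + block]) for i in range(0, len(values), block)]


def mean_abs_residual_percent(precip, residual, block=1):
    """Mean absolute budget residual as a percentage of precipitation,
    after aggregating both series to the chosen time scale."""
    p_agg = aggregate(precip, block)
    r_agg = aggregate(residual, block)
    return sum(100.0 * abs(r) / p for r, p in zip(r_agg, p_agg)) / len(p_agg)


# Synthetic monthly series (mm) for one year; illustrative only.
monthly_precip = [50.0] * 12
monthly_residual = [10.0, -8.0] * 6   # alternating over- and under-estimates

monthly_error = mean_abs_residual_percent(monthly_precip, monthly_residual)
annual_error = mean_abs_residual_percent(monthly_precip, monthly_residual, block=12)
```

For these synthetic values the mean absolute residual falls from 18% of precipitation at the monthly scale to 2% at the annual scale.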

5.2 Calculating a Water Budget Component Using the Continuity Equation

A variety of water budget components have been calculated using the continuity equation when all other components are measured, usually when a component lacked available data at the study site. ET is commonly calculated as the residual of the continuity equation and may be compared to other methods of deriving ET such as satellite remote sensing algorithms (Denager et al., 2020; Sriwongsitanon et al., 2020). Sometimes ET and changes in water storage are lumped into the same calculated term due to the inability to separate them (Henn et al., 2018; Marchionni et al., 2019). The complexities of understanding groundwater flow may lead to calculating groundwater inflow (Trask et al., 2017), groundwater outflow (Alemu et al., 2020), or groundwater storage changes (Zheng et al., 2022) as the residual of the continuity equation. Surface water flows can be calculated as the residual of the continuity equation when there is limited ground data such as streamflow (D. Zhang et al., 2016; S. Zhang et al., 2016) or runoff (Armanios & Fisher, 2014). At regional scales, the component calculated as the residual is often the one that is most difficult to measure because of spatial data scarcity and lack of gauging, such as the total basin discharge (Lv et al., 2017), total terrestrial storage change (Sriwongsitanon et al., 2020), basin-wide subsurface storage (Avanzi et al., 2020), or large spatial scale soil moisture changes (Guo & Shen, 2015). Because so many different water budget components can be calculated as the residual of the continuity equation, it is important to explicitly communicate which water budget components (if any) were calculated as a residual; this information cannot be assumed and may have important implications for the accuracy assessment of the measured water budget.
Also, it is important to communicate the associated uncertainties of the measured components to understand the possible error propagation into the component calculated as the residual of the continuity equation. Measurement uncertainty of individual components (e.g., soil water content, precipitation, and recharge) contributed differently to the propagated uncertainty of ET estimated as the residual of the water budget (Denager et al., 2020).
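A common first-order way to see how measurement uncertainties reach a residual term is to combine them in quadrature. The sketch below assumes that the measured components have independent errors and that the residual is a simple sum or difference of them. The component names and standard uncertainties are hypothetical, and the reviewed studies may have used other propagation methods.

```python
import math


def residual_uncertainty(sigmas):
    """Standard uncertainty of a residual formed by adding or subtracting
    measured components with independent errors (root sum of squares)."""
    return math.sqrt(sum(s ** 2 for s in sigmas.values()))


def variance_shares(sigmas):
    """Fraction of the residual's variance contributed by each component."""
    total = sum(s ** 2 for s in sigmas.values())
    if total == 0:
        raise ValueError("at least one component must have non-zero uncertainty")
    return {name: s ** 2 / total for name, s in sigmas.items()}


# Hypothetical annual standard uncertainties (mm) of the measured components
# in a budget where ET is the residual.
sigmas_mm = {
    "precipitation": 30.0,
    "recharge": 40.0,
    "soil_water_storage_change": 120.0,
}

sigma_et = residual_uncertainty(sigmas_mm)
shares = variance_shares(sigmas_mm)
```

Under these assumptions the residual ET carries a standard uncertainty of 130 mm, and the variance shares show which measured component dominates it.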

Components calculated using the continuity equation should be evaluated against ground-truthed data. The net stream-groundwater fluxes calculated using the continuity equation were compared to the temporal variability of stream-groundwater head difference data in a regulated river channel in Australia for evaluation (Adams et al., 2013). The total basin discharge calculated using the continuity equation was compared to the observed total basin discharge to evaluate the water budget closure and used to determine that reconstructed ET using the GLDAS-1 land surface models reduced the water budget non-closure in two river basins in China (Lv et al., 2017). Although calculating a water budget component using the continuity equation is convenient, a disadvantage of this method is that the total water budget error ($\boldsymbol{\varepsilon}$) is propagated into the calculated water budget component because 100% accuracy is assumed in all the measured components.

Sometimes the water budget error is calculated using the continuity equation (Pan et al., 2017; Safeeq et al., 2021; S. Wang et al., 2014; Xing et al., 2012). The remainder of the water balance can be calculated to evaluate where the remaining unmeasured water came from by looking at correlations with measured water budget components. For example, this approach was taken in a watershed water balance to evaluate the effect of the water budget partitioning due to changing from furrow irrigation to sprinkler irrigation in southern Idaho (Bjorneberg et al., 2020).

Some water budget components are estimated using the continuity equation and used to quantify an indicator such as water productivity or water footprint (van der Laan et al., 2019; D. Wang et al., 2021). The decreased confidence in the component calculated using the continuity equation can result in decreased confidence in the derived indicator and makes it difficult to compare indicators that are derived from data of varying levels of uncertainty. For example, ET can be estimated using the continuity equation to later calculate water productivity, which is a commonly calculated metric in agricultural settings (Abou Zakhem et al., 2019). However, any error in the measured water budget components will be propagated into the calculated ET, and consequently, also transfer into the calculated water productivity. This same problem can be present in water footprint calculations.
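The effect on a derived indicator can be illustrated with water productivity, taken here as crop yield per unit volume of water consumed as ET. That definition, the function names, and all numbers below are assumptions for illustration. The sketch recomputes the indicator at ET plus and minus one standard uncertainty while treating yield as exact.

```python
def water_productivity(yield_kg_per_ha, et_mm):
    """Water productivity (kg of yield per m^3 of water consumed as ET)."""
    if et_mm <= 0:
        raise ValueError("ET must be positive")
    et_m3_per_ha = et_mm * 10.0  # 1 mm of water over 1 ha equals 10 m^3
    return yield_kg_per_ha / et_m3_per_ha


def water_productivity_range(yield_kg_per_ha, et_mm, et_sigma_mm):
    """Return (low, central, high) water productivity for ET +/- one sigma.

    Yield is treated as exact, so the spread reflects only the uncertainty
    inherited from the residual ET.
    """
    return (
        water_productivity(yield_kg_per_ha, et_mm + et_sigma_mm),
        water_productivity(yield_kg_per_ha, et_mm),
        water_productivity(yield_kg_per_ha, et_mm - et_sigma_mm),
    )


# Hypothetical seasonal values: 8,000 kg/ha yield, 500 mm residual ET,
# 50 mm standard uncertainty on that ET.
low, central, high = water_productivity_range(8000.0, 500.0, 50.0)
```

With these made-up inputs a 10% uncertainty in residual ET gives a water productivity between roughly 1.45 and 1.78 kg/m³ around a central value of 1.6 kg/m³. Indicators derived from budgets with different uncertainty levels are therefore hard to compare directly.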

5.3 Spatiotemporal Data Scarcity

Several studies mentioned the role of spatiotemporal data scarcity of specific components propagating into the rest of the water budget. While the contribution of spatiotemporal data scarcity to the error of individual estimated water budget components is one challenge, this sub-section focuses on its contribution to the error of the overall estimated water budget.

Spatiotemporal scarcity of precipitation data contributes to the overall water budget error. A water budget on the Dudh Koshi River (Nepal) used two observed data sets of spatialized precipitation data interpolated using a co-kriging method and indicated an unbalanced water budget, which was suspected to be the result of precipitation underestimation typical of high mountain areas (Savean et al., 2015). After a precipitation correction was applied, it was concluded that 40% of the precipitation underestimation came from uncertainties in the representation of precipitation spatial variation. Basin-wide precipitation correction might change with topographic and geographic parameters and seasons, but this variance cannot be investigated without knowledge of the precipitation spatial variability (Savean et al., 2015). At the individual basin scale, a study showed poor agreement (R² ranging from 0.04 to 0.65) between two annual precipitation data sets (spatially interpolated precipitation gauge measurements and the global observation-based NCEP/NCAR Reanalysis data set developed by the National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR)) (S. Wang et al., 2014). Water balance studies that are too short, have too few measurement sites, or rely on unreliable precipitation measurements can contribute to uncertainty in the catchment-scale water balance, such as in the Sierra Nevada (Saksa et al., 2017). More measurement sites with strategic placements and precipitation corrections can alleviate the spatiotemporal data scarcity in precipitation measurements that contributes to the overall water budget error.
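Agreement between two precipitation data sets of this kind is often summarized with a coefficient of determination. The sketch below computes it as the squared Pearson correlation between two annual series, which equals the R² of a simple linear regression between them. Whether the cited study computed it in exactly this way is an assumption, and the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)
```

For example, `r_squared(gauge_annual_mm, reanalysis_annual_mm)` would take two hypothetical lists of annual basin precipitation, one from interpolated gauges and one from a reanalysis product.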

The lack of spatial representativeness of water storage estimations can contribute to errors in the rest of the water budget. Spatiotemporal data scarcity of soil moisture measurements affects the soil water storage estimates, and the resulting error is transmitted into the rest of the water budget. A water balance that assessed leaching in a micro-irrigated almond orchard showed that changes in estimated field-scale soil water storage are highly impacted by the locations, depths, and number of monitoring points (Kisekka et al., 2019). A 3-year water budget in a forested tributary catchment in Germany closed on an annual basis, but daily and weekly residuals were correlated with the soil water content that was measured using a distributed soil moisture network of 109 points with 3 measurement depths each (Graf et al., 2014). Thus, even with point-based soil moisture sensors placed at many locations, total water storage changes are difficult to capture. One workaround is to use an intermediate-scale soil moisture measurement technique such as cosmic ray neutron probe sensing (CRNS). One study validated CRNS as a soil moisture monitoring method (which has a spatial footprint similar to that of eddy covariance) against a soil moisture sensor network with good agreement and assessed the water balance closure in an area where fluctuations in plant water uptake were important (Schreiner-McGraw et al., 2016).
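The sensitivity to monitoring points can be illustrated with a simple storage calculation. The sketch below integrates volumetric water content over layer thicknesses at each monitoring point and averages the points to a field-scale value. The layer thicknesses, water contents, and function names are hypothetical, and an unweighted mean across points is an assumption.

```python
def profile_storage_mm(theta, layer_thickness_mm):
    """Soil water storage (mm) at one point: the sum over layers of
    volumetric water content (m^3/m^3) times layer thickness (mm)."""
    if len(theta) != len(layer_thickness_mm):
        raise ValueError("one water content value is needed per layer")
    return sum(t * dz for t, dz in zip(theta, layer_thickness_mm))


def field_storage_mm(profiles, layer_thickness_mm):
    """Field-scale storage (mm) as the unweighted mean over monitoring points."""
    if not profiles:
        raise ValueError("at least one monitoring point is required")
    return sum(profile_storage_mm(p, layer_thickness_mm) for p in profiles) / len(profiles)


# Hypothetical sensors at three depths representing 300, 300, and 400 mm layers.
layers_mm = [300.0, 300.0, 400.0]
point_a = [0.20, 0.25, 0.30]
point_b = [0.10, 0.15, 0.20]
point_c = [0.30, 0.30, 0.30]

all_points = field_storage_mm([point_a, point_b, point_c], layers_mm)
two_points = field_storage_mm([point_a, point_b], layers_mm)
```

Dropping one of the three made-up points shifts the field-scale estimate from about 237 mm to 205 mm, a difference that would pass directly into any storage change term computed from it.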

Groundwater observation wells are another sparse data type that contributes to an error that propagates into the overall water budget. The primary problem with the change in water storage measurements is the limited spatial resolution, which is exacerbated at larger spatial scales with heterogeneous hydrogeology (S. Wang et al., 2014). Cones of depression and variability in water storage between well sites can increase errors in agronomic water balances where groundwater depletion is important (Ouapo et al., 2014). Water level measurements at midpoints between well sites are recommended, but these would be expensive (Ouapo et al., 2014). More measurement sites at closer time steps could address spatiotemporal data scarcity for water storage estimates, reducing the overall water budget error.

Lack of surface water inflow gauging can increase uncertainty in surface water contributions to the water budget, but modeling coupled with measurements can be appropriate for dealing with this type of data scarcity in data-scarce regions. Spatiotemporal data scarcity of streamflow measurements limits the ability to capture ephemerality, streamflow in ungauged tributaries, and portions of reaches (Adams et al., 2013; Alemu et al., 2020; Mitchell et al., 2012; Ward et al., 2013). The spatial representativeness of available gauged data for characterizing the flow regime influenced the estimation of ungauged inflow through rainfall-runoff modeling in a regulated river channel in Australia (Adams et al., 2013). An isotopic mass balance approach evaluated the relative importance of floodwater inputs and temporary subsurface storage of floodwater to lake water budgets in ungauged areas (Masse-Dufresne et al., 2021). In regions with little surface water gauging, empirical relationships between ET and groundwater have been derived from remote sensing data products, and surface flows such as runoff have been calculated using the continuity equation (Falalakis & Gemitzi, 2020). Incorporating modeling with measurements can help alleviate the contribution of spatiotemporal data scarcity in surface water inflows and outflows to the overall water budget error.

Scarce availability of ground-based data can be a major limitation in large-scale basins. For example, an entirely satellite remote sensing based DDWB was developed in the Central Basin of Iran due to the lack of available ground-based data (Soltani et al., 2020). National agencies provide global satellite remote sensing products such as TRMM, GRACE, MODIS, and CHIRPS for basin-scale water budgets in areas with spatiotemporal data scarcity such as Africa, Middle East, and South America (Alemu et al., 2020; Armanios & Fisher, 2014; Moreira et al., 2019; Soltani et al., 2020). The United States Agency for International Development (USAID) Famine Early Warning Systems Network (FEWS NET) provides daily rainfall maps based on an interpolation method that combines Meteosat and global telecommunication system (GTS) data that has been used to predict famine in Africa in conjunction with the WA+ framework (Karimi et al., 2015). Networks of weather stations can estimate the spatiotemporal distribution across regions but require several stations across a wide region and are subject to spatial interpolation errors (Baffaut et al., 2020).

5.4 Uncertainty Analysis

Because the challenges of data-driven water budgets outlined in this section can lead to error, it is important to quantify uncertainty. Traditional performance metrics, such as root mean square error, coefficient of determination, Pearson correlation coefficient, standard deviation, confidence intervals, bias, absolute bias, and Nash-Sutcliffe efficiency, can quantify the uncertainty in data-driven water budgets. These performance metrics and others are reviewed in Moriasi et al. (2007). Alternatively, many non-traditional methods and metrics exist. For example, precipitation-decorrelation and residual-redistribution are two novel approaches that were introduced in the water balance of the Tahoe basin (Trask et al., 2017). The closed water budget uncertainty was evaluated by conducting an error propagation analysis using the Taylor series (Polatel, 2015). Monte Carlo simulation was used for uncertainty quantification in lake water budgets (Koch, 2016), in areas with canal seepage (Martin & Gates, 2014), and in a regulated river channel in Australia (Adams et al., 2013). In the latter case, short-term monitoring of the ungauged tributaries was used to test the confidence in the error and variance estimates for the major water budget components (Adams et al., 2013). The Extended Triple Collocation (ETC) method was used to show high uncertainty in estimating subsurface hydrological processes in a 14-year water budget of the groundwater-dependent Upper Citarum basin (Rusli et al., 2021). As required by ETC, multiple data sources were compared, including various global satellite data products, ground-based measurements, and groundwater abstraction volume estimates based on population records.
Stochastic computation provides the uncertainty bandwidth for every Water Accounting Plus output and was proposed as a standard procedure for uncertainty analysis to provide consistent information for various levels of decision-making (Karimi et al., 2015).
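As a minimal illustration of two of the approaches named above, the following Python sketch compares first-order (Taylor series) error propagation with a Monte Carlo simulation for the closure residual of a water budget. It is not taken from any of the reviewed studies. The simplified budget P − ET − Q − ΔS, the component values, their standard deviations, and the assumption of independent, normally distributed errors are all hypothetical choices made for the example.

```python
import math
import random


def residual(p, et, q, ds):
    """Closure residual of an assumed simplified budget: P - ET - Q - dS."""
    return p - et - q - ds


def taylor_sigma(sigmas):
    """First-order (Taylor series) standard deviation of the residual.

    Every partial derivative of the residual is +1 or -1, so for
    independent errors the component variances simply add.
    """
    return math.sqrt(sum(s ** 2 for s in sigmas))


def monte_carlo_sigma(means, sigmas, n=20000, seed=42):
    """Monte Carlo estimate of the residual mean and standard deviation."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Draw each component independently from a normal distribution.
        sample = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        draws.append(residual(*sample))
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return mean, math.sqrt(var)


# Hypothetical annual values in mm (P, ET, Q, dS) and their 1-sigma errors.
means = [800.0, 500.0, 250.0, 20.0]
sigmas = [40.0, 75.0, 12.5, 30.0]

analytic = taylor_sigma(sigmas)
mc_mean, mc_sigma = monte_carlo_sigma(means, sigmas)
print(f"residual = {residual(*means):.1f} mm")
print(f"Taylor sigma = {analytic:.1f} mm, Monte Carlo sigma = {mc_sigma:.1f} mm")
```

Because the assumed budget is linear in its components, the two estimates should agree closely. Monte Carlo simulation becomes more informative when components are correlated, non-normally distributed, or derived through nonlinear relationships, none of which this sketch attempts to represent.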

A water footprint study on maize compared various approaches for estimating ET, including remote sensing, eddy covariance, and crop models (SWB, CROPWAT, SAPWAT), and found 15%–42% variation in seasonal ET estimates, which translates to maize consuming between 4.4% and 8.3% of the Orange River in South Africa (van der Laan et al., 2019). The variation in water budget component estimates across various data sources emphasizes the need to quantify water budget uncertainty for better water management.

There were no studies in the 81 publications we reviewed at the regional or plant scales that focused on uncertainty or water budget closure analysis, suggesting room for future research.

6 Discussion and Future Research Opportunities

We reviewed 81 publications published between 2012 and 2022. Although this body of literature suggests that considerable advancement in data-driven water budgets has been made during that period, challenges remain that call for further research and development. In this section, we discuss the importance of considering the different management objectives of decision-makers working at various spatiotemporal scales and the need for standardized reporting protocols for data-driven water budgets.

Regional water management plans and field-scale individual decision objectives might differ in water budget accounting. However, conservation of mass (Equation 1) must always be true, so decision-makers working at various spatiotemporal scales must follow DDWB that obey conservation of mass when employed together. Simplifying assumptions that ignore specific components at specific scales and not at other scales, along with ignoring the uncertainty in water budget component measurements, can result in DDWB that do not align at various spatiotemporal scales. An exemplary approach for aligning management objectives of different spatiotemporal scales was discussed in Garrido-Rubio et al. (2020). Their approach was a shared mapping platform based on a satellite remote sensing FAO-56 soil water balance for irrigation water accounting to align the decision-making between field-scale and water user association management scale to better comply with agro-environmental laws in the groundwater-dependent Júcar River Basin in Spain (Garrido-Rubio et al., 2020). DDWB were used as a tool to encourage participation and the promotion of joint water use at the river basin level in Thailand (Supriyasilp & Pongput, 2021). Perhaps a solution to improve alignment is employing participatory approaches similar to the value judgment Analytical Hierarchical Process used in Turkey when developing water planning and decision-making solutions (Sanli et al., 2022). Shared mapping platforms and tools, along with participatory approaches, can make DDWB accessible to decision-makers at various scales to improve the compatibility of various practices involving limited water resources.

Standardized reporting protocols for DDWB are important for improving the comparability of DDWB calculated by decision-makers and streamlining water management decision-making. As discussed in Section 5, including the simplified continuity equation used in the DDWB is helpful for understanding the assumptions. Explicitly reporting that a water budget component is calculated as the residual of the continuity equation is essential for understanding how water budget closure was handled. Standardized reporting of the simplified continuity equation and how water budget closure was handled is important so water budget users can better understand possible issues with water budget non-closure and error propagation. The recommended reporting protocols for evapotranspiration measurements published in Allen et al. (2011a, 2011b) are exemplars of the type of standardized reporting protocols needed for DDWB.

Publications with the primary focus of quantifying uncertainty used different approaches as described in Section 5, making it difficult to compare water budget approaches across studies and water budget applications. Although some methods of uncertainty quantification are more suitable than others in specific contexts, the standardization of uncertainty quantification where possible might improve the comparability of DDWB conducted by water decision-makers focused on water budget applications of different land use types (e.g., agriculture, forestland, rangeland, water, urban, wetland, barren, mixed).

Standardization of DDWB methods will not completely alleviate uncertainties working across sites because water budget imbalance can highly depend on site characteristics, independent of the methods. Measurements of water budget components and uncertainties associated with each component can vary widely across adjacent sites even when the same water budget method is used. For example, the monthly water budget imbalance varied from 7.0 mm per month to 21 mm per month across 16 drainage basins in Canada (on average 30% of the corresponding monthly precipitation) (S. Wang et al., 2014). Water budget non-closure was a problem in two of the three studied lakes in a water budget study in Alaska, despite consistent methods (Koch, 2016). Due to the site-specific dependence of water budget non-closure, it is difficult to generalize the source of non-closure for any water budget, and detailed knowledge of the site is required to understand the origin of DDWB errors. In some cases, not enforcing the water balance closure may provide greater insight on watershed functioning and can eventually lead to better measurements (Kampf et al., 2020).
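To make the idea of a site-dependent imbalance concrete, the sketch below computes a monthly closure imbalance and expresses its mean absolute value as a percentage of mean monthly precipitation for two sites. The simplified budget P − ET − Q − ΔS, the site values, and this particular definition of relative imbalance are assumptions made for illustration; the cited studies may define and compute imbalance differently.

```python
def monthly_imbalance(p, et, q, ds):
    """Imbalance of an assumed budget P - ET - Q - dS, in mm per month."""
    return [pi - eti - qi - dsi for pi, eti, qi, dsi in zip(p, et, q, ds)]


def mean_abs_imbalance(imbalance):
    """Mean absolute imbalance in mm per month."""
    return sum(abs(x) for x in imbalance) / len(imbalance)


def relative_imbalance(imbalance, p):
    """Mean absolute imbalance as a percentage of mean precipitation."""
    mean_p = sum(p) / len(p)
    return 100.0 * mean_abs_imbalance(imbalance) / mean_p


# Hypothetical three-month records (mm per month) for two sites that share
# the same precipitation and the same budget method.
precip = [60.0, 80.0, 40.0]
site_a = monthly_imbalance(precip, [30.0, 45.0, 35.0],
                           [20.0, 25.0, 10.0], [5.0, 0.0, -10.0])
site_b = monthly_imbalance(precip, [20.0, 30.0, 25.0],
                           [15.0, 20.0, 10.0], [5.0, 5.0, -10.0])

for name, imb in (("A", site_a), ("B", site_b)):
    print(f"site {name}: {mean_abs_imbalance(imb):.1f} mm/month, "
          f"{relative_imbalance(imb, precip):.1f}% of precipitation")
```

Reporting the imbalance both in absolute terms and relative to precipitation is one way to compare sites with different climates, although it does not by itself identify which component drives the non-closure.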

We do not recommend a specific set of data products or measurements for DDWB accuracy maximization because each site is different. However, we can say that the study site characteristics (spatial and temporal scale, climate, land use, water availability, etc.) should be carefully considered when selecting data products and measurements to maximize accuracy. Some data products are only recommended for specific conditions. For example, the gridded precipitation product TRMM is only recommended for subtropical and tropical precipitation (Armanios & Fisher, 2014; Rusli et al., 2021; Soltani et al., 2020; D. Zhang et al., 2016; S. Zhang et al., 2016). This limitation of TRMM is why NASA launched the Global Precipitation Measurement (GPM) mission, which can measure light rain (<0.5 mm hr−1) and snow (https://gpm.nasa.gov/). Furthermore, data products are only suitable for specific scales. For example, the MODIS global terrestrial ET product has a coarse temporal and spatial resolution, limiting its use to regional scale water budgets at the monthly or annual time scales (Armanios & Fisher, 2014; Falalakis & Gemitzi, 2020; D. Zhang et al., 2016; S. Zhang et al., 2016). A site assessment (which may or may not include a site visit) is required to consider the relevant spatial and temporal scales, climate, land use, and water availability that would point toward a specific data product or measurement for deriving a data-driven water budget.

6.1 Systematic Review Limitations

Systematic bibliographic reviews, by their structured and predefined nature, inherently possess limitations. While they aim to provide a comprehensive but repeatable overview of existing scientific literature on a specified topic, recent or adjacent but important contributions are inevitably missed. For example, Cook et al. (2020) delve deeply into drought projections using the CMIP6 forcing scenarios and quantify the importance of climate change in increasing the frequency or severity of drought events, but they also provide nuanced insights into various components of water budgets, including precipitation, soil moisture, and runoff. These components are crucial for a holistic understanding of data-driven water budgets, but this contribution was not included in our systematic screening. It is worth noting, however, that modeling future conditions, as demonstrated by Cook et al. (2020), has its own set of limitations. Predictive models are based on a set of assumptions and scenarios, and while they offer valuable foresight, they may not always capture the full complexity or unpredictability of real-world conditions. Such omissions and inherent uncertainties underscore the need for integration of emerging research in parallel to systematic reviews.

To this end, we acknowledge that the 81 publications in this review may have missed other types of measurements and data products that could be used in a DDWB. For example, other remote sensing hydrological data products might be used in a DDWB, including GPM precipitation data, the European Soil Moisture and Ocean Salinity (SMOS) data, and the Surface Water and Ocean Topography (SWOT) product, which measures river discharge and surface water storage (Lettenmaier et al., 2015). Other examples of satellite remote sensing field-scale ET data products include the OpenET data set covering the Western United States (Melton et al., 2021) and the global ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) data set (Fisher et al., 2020). Other soil moisture data products include the United States Soil Moisture Active Passive (SMAP) data product (Lettenmaier et al., 2015) and the COsmic-ray Soil Moisture Observing System (COSMOS) project, which is focused on deploying cosmic-ray soil moisture probes mainly in the United States for measuring area-average soil moisture at the hectometer horizontal scale (Zreda et al., 2012). Surely, other hydrological data products exist that could be used in a DDWB, so future research could focus on documenting a complete list of all data products for DDWB along with the relevant spatiotemporal parameters. We recommend that further research focus on searching databases other than Web of Science, such as Science Direct, and non-scholarly publications that employed a DDWB, such as reports from government water departments. We also recommend attempting other database search strategies, perhaps using different keywords, to capture additional publications that focus on a DDWB. Instead of searching the title for keywords, the search strategy could search the abstract to yield more publications.

We also acknowledge that our review had some gaps in specific topics of DDWB due to our strict search criteria that yielded only 81 publications. For example, our review had a gap in discussing DDWB that relied on satellite remote sensing, such as the topics of remote sensing-based water budget uncertainty and non-closure, and the use of satellite remote sensing based DDWB for validating another data source and calculating something difficult or impossible to measure. Our review also did not include publications that predominantly focused on the uncertainty or water budget closure analysis at the regional or plant scales, although there are likely publications dedicated to this topic.

Global, national, and state policies surrounding climate and water are increasing the importance of data-driven water budgets that incorporate climate change. Most of the studies we reviewed focused on a water budget from a past research site, perhaps mentioning the importance of climate change (e.g., Hughes et al., 2022), but did not incorporate bias-corrected regional climate projections into the data-driven water budgets. One study that was screened by our search criteria evaluated the sensitivity of the Lake Babati system in East Africa to changes in key hydro-climatic variables using the Coupled Model Intercomparison Project, Phase 5 (CMIP5) output to understand future risks in water security and flooding (Mbanguka et al., 2016). Dynamic water budget accounting should integrate bias-corrected regional climate projections to account for the fundamental processes of climate impact on water balance conditions. Recent advances in artificial intelligence-enabled water availability forecasts (Kalyanaraman et al., 2022) may improve the incorporation of climate change into DDWB. The disconnection between climate change and water budget research suggests the need for interdisciplinary research to advance data-driven water budgets under climate change.

A common limitation for DDWB can be described as the trade-off between spatial and temporal resolution, where satellite data with higher temporal resolution typically have lower spatial resolution (e.g., GLDAS data produced daily at 2.5° to 1 km) and higher spatial resolution data typically have lower temporal resolution (e.g., Landsat-derived products with 30 m spatial resolution at between 8- and 16-day temporal resolution). However, new tools are taking advantage of advances in data harmonization and the computational capacity of cloud computing to provide data products with both high spatial and high temporal resolution. For example, Xue et al. (2021) produced daily evapotranspiration rates at the sub-field scale using Harmonized Landsat and Sentinel-2 (HLS) data to sharpen Visible Infrared Imaging Radiometer Suite (VIIRS) imagery with a MAE of 0.49 mm d−1. The recently released OpenET website uses Google Earth Engine to process six different satellite ET models and an ensemble of those models to provide monthly and daily ET data at the field-scale in the western United States (Melton et al., 2021). Deep learning is being applied to satellite and in situ data to produce daily soil moisture measurements at 9 km spatial resolution (Liu et al., 2022). Recent advances in unmanned aerial vehicles may also improve the scalability of DDWB methods (Kalua et al., 2020) such as soil moisture mapping by unpiloted aerial systems (Araya et al., 2021). Future research into how these new data sets can be incorporated into a DDWB, along with artificial intelligence and machine learning, is at the current frontier of tools for improving water management at all scales.

7 Conclusions and Opportunities Ahead

The development of data-driven water budget approaches is an ongoing and relevant research area that addresses a path toward a secure water future. While data and monitoring of individual water budget components (e.g., evapotranspiration, streamflow, precipitation) have been extensively studied, the topic of data-driven water budget approaches that consider all water budget components of a system deserves further development. For this paper, we systematically reviewed 81 publications from 2012 to 2022 focused on data-driven water budgets.

From our systematic review, we conclude that data-driven water budget approaches have traditionally been distinguished by spatiotemporal parameters. Larger spatial scales tend to rely more on satellite remote sensing products, whereas smaller spatial scales tend to rely on ground-based monitoring. The combination of satellite remote sensing data products and ground-based monitoring has enabled complete data-driven water budget estimates across various land uses and applications across the globe. Furthermore, we conclude that aligning the management objectives of decision-makers working at various spatiotemporal scales and developing standardized reporting protocols are needed for DDWB.

A pressing global concern is that climate change alters the water budget, calling for the development of dynamic water budget accounting frameworks that can be used for forecasting. The United Nations Member States adopted the 2030 Agenda for Sustainable Development in 2015, which includes Sustainable Development Goal (SDG) 6 on sustainable water management for all, and several other SDGs depend on sustainable water management (United Nations Department of Economic and Social Affairs Sustainable Development, 2015). Meeting ambitious global sustainable water development goals requires incorporating climate change forecasts into data-driven water budgets.

Based on our systematic review, we conclude that the following could be a general framework for a standardized approach to data-driven water budgets. First, the application of the water budget should be determined. Then a site assessment should be done to determine the relevant spatiotemporal scales. The continuity equation can be simplified by considering which water budget components should be included at the relevant spatiotemporal scales, considering the site assessment and application. Then, the available data products and measurements should be scoped that are relevant to the site and simplified continuity equation. If there are any water budget components with data scarcity or the inability to measure at the site, the component can be calculated as the residual of the continuity equation (if there is only one unknown) or modeling can be employed to estimate the component (through hydrological modeling, empirical modeling, or artificial intelligence). Lastly, an uncertainty and closure analysis should be done to assess the error of the data-driven water budget to guide the interpretations and risks associated with the estimated water budget. Refining a standardized approach into data-driven water budget protocols compatible with all scales of water resource management decision-making is an opportunity for further research and development.
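The residual step of this framework can be sketched in a few lines of Python. The example assumes a simplified continuity equation of the form P − ET − Q − ΔS = 0, which is one possible simplification and not necessarily the one appropriate for a given site; the component names, sign convention, and values are hypothetical.

```python
# Sign of each term in an assumed simplified continuity equation:
#   P - ET - Q - dS = 0   (all terms in mm over the same period)
SIGNS = {"P": 1.0, "ET": -1.0, "Q": -1.0, "dS": -1.0}


def close_by_residual(components):
    """Fill in the single unknown component (value None) as the residual.

    Raises ValueError unless exactly one component is unknown, because the
    continuity equation can only be solved for one term at a time.
    """
    if set(components) != set(SIGNS):
        raise ValueError(f"expected components {sorted(SIGNS)}")
    missing = [name for name, value in components.items() if value is None]
    if len(missing) != 1:
        raise ValueError("exactly one component must be unknown (None)")
    name = missing[0]
    known_sum = sum(SIGNS[k] * v for k, v in components.items() if v is not None)
    # SIGNS[name] * x + known_sum = 0  ->  x = -known_sum / SIGNS[name]
    solved = dict(components)
    solved[name] = -known_sum / SIGNS[name]
    return solved


# Hypothetical annual budget in mm with ET unmeasured at the site.
budget = close_by_residual({"P": 800.0, "ET": None, "Q": 250.0, "dS": 20.0})
print(budget)
```

A component obtained this way absorbs the measurement errors of every other term, which is one reason the final uncertainty and closure analysis step matters, and why reporting which component was computed as the residual is useful.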

Acknowledgments

This work was supported by Agriculture and Food Research Initiative Competitive Grant 2021-69012-35916 from the USDA National Institute of Food and Agriculture. JHV was partially supported by the AI Research Institutes program supported by NSF and USDA-NIFA under the AI Institute: Agricultural AI for Transforming Workforce and Decision Support (AgAID) award No. 2021-67021-35344. We thank the SecureWaterFuture.net program for discussions on this review topic, and Sarah Naumes for invaluable team science coordination.

Data Availability Statement

The Web of Science (https://www.webofscience.com/wos) search query that generated the list of publications used in this review paper can be found in Figure S1 of the supporting information. Registration is required to access Web of Science, but the publications are also available by the DOI links. The supporting information contains a full list of the references (including the DOI link) of the publications from this systematic review. The data supporting the results in Table 1 can be found in Table S1. No software or other research objects were generated from this review article.