On Parameterizing Soil Evaporation in a Direct Remote Sensing Model of ET: PT‐JPL

Remote sensing models that measure evapotranspiration directly from the Penman‐Monteith or Priestley‐Taylor equations typically estimate the soil evaporation component over large areas using coarse spatial resolution relative humidity (RH) from geospatial climate datasets. As a result, the models tend to underperform in dry areas at local scales where moisture status is not well represented by surrounding areas. Earth observation sensors that monitor large‐scale global dynamics (e.g., MODIS) afford comparable spatial coverage and temporal frequency, but at a higher spatial resolution than geospatial climate datasets. In this study, we compared soil evaporation parameterized with optical and thermal indices derived from MODIS to RH‐based soil evaporation as implemented in the Priestley Taylor‐Jet Propulsion Laboratory (PT‐JPL) model. We evaluated the parameterizations by subtracting PT‐JPL transpiration from observation‐based flux tower evapotranspiration in agricultural fields across the contiguous United States. We compared the apparent thermal inertia (ATI) index, land surface water index (LSWI), normalized difference water index (NDWI), and a new index derived from red and shortwave infrared bands (soil moisture divergence index [SMDI]). Relationships were significant at the 95% confidence band. LSWI and SMDI explained 18–33% of variance in 8‐day soil evaporation. This led to a 3–11% increase in explained ET variance. LSWI and SMDI tended to perform better at the irrigated sites than RH. LSWI and SMDI led to markedly better performance over other indices at a seasonal time step. L‐band microwave backscatter can penetrate clouds and can distinguish soil from canopy moisture content. We are presently fusing red‐SWIR‐RADAR to improve soil evaporation estimation.


Introduction
Earth system scientists increasingly use Earth observation (remote sensing) to estimate evapotranspiration (ET) for water resources management, food security analysis, drought monitoring, assessing the impact of land use/cover feedbacks on climate, and greenhouse gas emissions accounting (Fisher et al., 2017). How we partition ET is an emerging area of research-a recent review can be found in Stoy et al. (2019). We can partition evapotranspiration into three components: water movement through the canopy or transpiration, evaporation from the soil and waterbodies, and evaporation from the canopy or other surfaces following a wetting event. Transpiration accounts for approximately 57.2 ± 6.8% of global terrestrial ET, making it the largest and most researched component of ET (Wei et al., 2017). Earth system scientists should further investigate soil evaporation for three reasons. First, soil evaporation is considerable in agroecosystems and other moisture-limited ecosystems (Zhang et al., 2017). Second, studies estimate positive trends in ET globally that are partially offset by declining trends in soil evaporation in moisture-limited areas due to rainfall variability and change (Jung et al., 2010;Zhang et al., 2016). Finally, studies link much of the uncertainty in simulations used to estimate ET and ET trends to the partitioning of ET between transpiration and soil evaporation (McCabe et al., 2019;Vereecken et al., 2015;Vinukollu et al., 2012).
Scientists use Earth observation to decompose ET into transpiration and soil evaporation using two distinct approaches: indirect and direct (Biggs et al., 2015). Indirect methods estimate latent heat flux or the energy equivalent of ET as the difference between available energy (net radiation) and sensible heat flux. The sensible heat flux is determined as a function of land surface temperature (LST) (Norman et al., 1995). Examples of this approach include the ALEXI (Anderson et al., 1997) and SEBS (Su, 2002) models. The basic concept behind these models is that as LST and therefore the sensible heat flux between the surface and atmosphere increases, less energy is available at the surface for ET . Indirect methods therefore are sensitive to the moisture status of the entire evaporating surface regardless of whether it is vegetation or soil. Soil evaporation is assumed to be a fraction of total LE using a remote sensing vegetation index, such as the normalized difference vegetation index (NDVI: Tucker, 1977Tucker, , 1979. Direct methods on the other hand estimate latent heat flux as a fraction of potential evapotranspiration (PET), which is a function of net radiation, temperature, humidity, and wind speed (Cleugh et al., 2007). Examples of this approach include MOD16 (Mu et al., , 2011 and Priestley Taylor-Jet Propulsion Laboratory (PT-JPL) (Fisher et al., 2009(Fisher et al., , 2008. MOD16 uses the Penman-Monteith equation, and PT-JPL uses the Priestley-Taylor approximation for PET. Direct methods inform the capacity of the canopy for transpiration only, which is indirectly related to the moisture status of soil in which the plants are rooted (Polhamus et al., 2013). They are therefore sensitive to the vegetation component of the surface only. Soil evaporation is estimated as a function of the moisture content of the overlying atmosphere that regulates the canopy.
Direct methods are widely used for large area studies, because they require fewer inputs than indirect methods (Jiménez et al., 2011;Vinukollu et al., 2012;Vinukollu, Meynadier, et al., 2011;. In particular, PT-JPL performs well if not better than other methods across different ecosystem types and environments (Ershadi et al., 2014;McCabe et al., 2016;Miralles et al., 2016). Even so, the performance of these methods suffers when applied locally to irrigated agriculture and other moisture-limited areas where the soil moisture status is not well represented by the surrounding region and the overlying atmosphere Yilmaz et al., 2014). This is because researchers developed direct methods for regional-scale application where we can assume quasi surface-atmosphere equilibrium.
We can attribute the poorer performance of direct methods when applied locally to moisture-limited areas to the formulation and sensitivity of soil evaporation to atmospheric moisture status Zhang et al., 2019). In these models, soil evaporation is a function of the relative humidity (RH) of air immediately above the surface following the complementary hypothesis (Bouchet, 1963). Relative humidity is the ratio of actual to saturation water vapor pressure of the overlying air of the surface. Evaporative demand of the atmosphere drives evaporation from the surface and increases RH in the overlying air. As evaporation proceeds, the air becomes more humid, which decreases the evaporative demand. Eventually, the surface and atmosphere are in a state of equilibrium, and the evaporative demand of the atmosphere is the complement to the evaporative supply at the surface. The major limitation of this method is that equilibrium conditions are rarely ever met exactly. The closest they come occurs at synoptic spatial scales and long enough time scales sufficient for full accommodation between the surface and the atmosphere (Monteith, 1995). The drier the surface, the longer it takes to reach equilibrium, and the less likely for atmospheric moisture to be diagnostic of surface moisture status. Consistent with this hypothesis, PT-JPL performed well for predicting ET from an irrigated field in the desert of Saudi Arabia when the humidity was measured downwind of the irrigated field, but did not work well when humidity was measured in air coming off the surrounding desert (Aragon et al., 2018). While the canopy transpiration component is well constrained in direct methods particularly under unstressed conditions, we need alternative methods of localizing the soil evaporation estimates when such models are applied at scales or in areas that are not in equilibrium with the larger regional atmospheric environment. This physical bias is exacerbated when RH is derived from coarse spatial resolution climate reanalysis instead of ground-based weather data (Marshall et al., 2013;Badgley et al., 2015).
A number of studies aim to improve upon atmospheric-driven soil evaporation in PT-JPL. The alternatives more directly measure moisture supply, use comparatively higher spatial resolution remote sensing, and/or reduce model dependence on climate reanalysis. Famiglietti et al. (2018) proposed a methodology for deriving near-surface air temperature and dew point temperature from the MODIS sensor to improve the spatial resolution of RH estimation globally. Perhaps the most prominent alternative to the RH-based soil evaporation formulation due to the widespread use of indirect methods in agroecosystems involves the apparent thermal inertia index (ATI: García et al., 2013;Yao et al., 2013Yao et al., , 2017. MODIS or geostationary weather satellite thermal bands drive ATI. ATI showed better performance in rainfed areas when substituted in for the PT-JPL RH formulation for soil evaporation. In irrigated areas, however, soil evaporation derived from ground-based soil moisture performed better than RH and ATI. A recent study used the topsoil (5 cm) moisture content detected by SMAP L-band microwave emissivity to replace RH formulation for soil evaporation in PT-JPL . The results were promising in moisture-limited regions, leading to a 29.9% and 22.7% reduction in ET mean bias and RMSE, respectively. The shortwave infrared (SWIR) region of the electromagnetic spectrum is sensitive to changes in canopy and soil moisture content-water absorbs strongly in the SWIR (Gao, 1996). Yao et al. (2018) used the MODIS-derived land surface water index (LSWI) and a soil moisture index derived exclusively with MODIS SWIR bands to estimate transpiration and soil evaporation in PT-JPL, respectively (Yao et al., 2018). The combined indices resulted in 7% to 23.9% lower RMSE in LE compared to ground-based soil evaporation derived from soil moisture, but tended to underperform in ecosystems with weak seasonality. Lastly, Marshall et al. (2016) compared several hyperspectral narrowband and multispectral broadband indices derived from ground spectroscopy with soil evaporation as a residual of eddy covariance flux tower ET and PT-JPL transpiration. They identified important spectral features in soil evaporation centered on 428 nm (visible blue) and 1,518 nm (SWIR). Canopies and soil absorb strongly in the visible blue and SWIR, but around 1,450 nm (near 1,518 nm), soil water absorbs more strongly than canopy water (Penuelas et al., 1995).
We have not performed a comprehensive analysis of the alternatives to atmospheric-driven soil evaporation in direct remote sensing models of ET such as PT-JPL, particularly in irrigated croplands where atmospheric-driven soil evaporation tends to underperform. In this study, we compared the performance of atmospheric-driven soil evaporation as implemented in PT-JPL to alternatives derived from MODIS optical and thermal band combinations identified in the literate at 10 eddy covariance flux towers across important rainfed and irrigated agroecosystems of the contiguous United States. Although microwave-based formulations are a viable alternative to RH for soil evaporation (e.g., GLEAM; Miralles et al., 2011), our aim was to better understand the opportunities for leveraging currently available Earth observations that are already provided alongside the vegetation indices (e.g., NDVI) used by direct methods such as PT-JPL. Not to compare different satellite platforms. We therefore focus the comparison on MODIS optical and thermal bands, because they offer a number of presently used and proposed alternatives to the current implementation of soil evaporation in PT-JPL.

Data Acquisition and Processing
Observation-based and simulated soil evaporation consisted of 72 site years of micrometeorological and Earth observation data collected at 10 eddy covariance flux towers across major agroecosystems of the contiguous United States. The flux towers are part of the Ameriflux network (https://ameriflux.lbl.gov/) and represent various crop types, climates, and watering regimes (Table 1). Each flux tower had at least 3 years of data, which is sufficiently long enough to capture interannual variability in soil evaporation. We acquired Earth observation data using the United States Geological Survey's Application for Extracting and Exploring Analysis Ready Samples download data tool (AρρEEARS: https://lpdaacsvc.cr.usgs.gov/appeears/).

Micrometeorological Data
The micrometeorological data consisted of half-hourly or hourly energy terms (net radiation, sensible heat flux, soil ground heat flux, and latent heat) measured in W m −2 and weather terms (air temperature and vapor pressure deficit) measured in°C and kPa. We filled the data gaps in the energy and weather terms using the marginal distribution sampling method (Reichstein et al., 2005) and linear interpolation with a locally weighted regression. Following Fisher et al. (2008), we aggregated half-hourly and hourly data to daytime averages in units of MJ m −2 d −1 . Convection is high, and the correlation between energy and weather terms tends to be strongest during the day. Nighttime values were masked out using an incoming shortwave radiation threshold of ≤5 W m −2 , which is outside of instrument error. We applied an 8-day exponential moving average to daytime averages to approximate the compositing period of the Earth observation data. Lastly, we omitted values outside the growing seasons using planting and harvesting dates provided by the growers or farm managers via the flux tower investigators. Flux towers are considered the gold standard of field-level energy balance estimation, but can produce underestimates of 10-30% in surface energy fluxes (Wilson et al., 2002). Two approaches are generally used to close the gap: Bowen ratio (Twine et al., 2000) and energy balance residual (Amiro, 2009). The former distributes the imbalance between sensible and latent heat, while the latter distributes the imbalance to latent heat only. We closed the fluxes using both approaches, but in the end, we present the results of the uncorrected fluxes. We did this for three reasons. First, the performance of each approach can vary considerably due to instrumentation and climate, so the flux community generally accepts neither approach (Pan et al., 2017). Second, both approaches resulted in large systematic errors between simulated and observed latent heat. Lastly, neither correction changed the relative performance of the parameterization.

Earth Observation Data
The Earth observation data consisted of three MODIS Aqua Version-6 products: 8-day 500 m Surface Reflectance Bands 1-7 (MYD09A), 8-day 1 km Land Surface Temperature & Emissivity (MYD11A2), and daily 1 km Ocean Reflectance Bands 8-16 (MYDOCGA) ( Table 2). We used MYD09A and MYD11A2 to estimate optical and thermal indices, such as LSWI and ATI, respectively. We also used MYD09A to estimate NDVI and the enhanced vegetation index (EVI: Huete et al., 2002). PT-JPL does not use these indices to estimate soil evaporation, but to constrain transpiration. Until recently, MYDOCGA masked out the land, because the bands are mainly for ocean reflectance applications. The product is now available for the land. It includes blue bands (bands 8-10) that we used to develop additional optical indices for soil evaporation estimation.
We downloaded the time series of each product for the pixel overlapping half the footprint distance of each tower. The footprint distance is defined by the upwind area collected by the flux tower detectors. In this study, we define the average upwind area by the dominant wind direction during the primary growing season. We resampled the three products where necessary to an 8-day time step and 1 km spatial resolution for comparison purposes. An 8-day time step approximates the cumulative response of plants over several days to changing growth conditions (Pearcy & Sims, 1994). Further, the focus of the comparison was on the spectral resolution of the products and aggregating to the coarsest resolution resolves spatial discrepancies between the products.
The MYD09A product is generated by ranking daily reflectance data over an 8-day compositing period (Vermote et al., 2011). We use this ranking to align the dates of the MODIS and climate data over each 8-day composite. Pixels that are cloudy or impacted by other sources of noise receive a low rank, while pixels that are less impacted by noise (e.g., aerosols or haze) receive a high rank. Reflectance from the day with the highest rank and lowest view angle are selected for each 8-day composite. We used the selected dates in each composite to extract 8-day composites for MYDOCGA and the micrometeorological data. We retained pixels for each product ranked "highest quality" as defined in the MODIS quality assessment approach (Roy et al., 2002) and labeled in the AρρEEARS data files, while we dropped pixels tagged with a poorer quality flag. We filled remaining gaps by linear interpolation with locally weighted regression.

Methods
We estimated soil evaporation as a residual of the energy balance after partitioning evapotranspiration with the transpiration component of the PT-JPL model. Flux towers do not measure soil evaporation directly. The transpiration component of PT-JPL was selected to partition evapotranspiration for three reasons: (i) the overall performance of PT-JPL is superior to other vegetation-based remote sensing models and has been widely used to analyze atmospheric-based alternatives for soil evaporation (García et al., 2013;Yao et al., 2013Yao et al., , 2017; (ii) the evapotranspiration-CO 2 correlation approach, which can readily be employed to partition evapotranspiration with flux tower data, remains in the testing phase (Kool et al., 2014); and (iii) the performance of the transpiration component of PT-JPL shows much lower bias with reference data than the soil evaporation component . We compared the observed ET and PT-JPL transpiration residuals to the original atmospheric-based formulation and three Earth observation indices commonly used to evaluate plant-soil-water status (ATI, LSWI, and NDWI). We then expanded the comparison to include indices suggested in Marshall et al. (2016), which we derived from MODIS blue band combinations (bands 3 and 8-10). We evaluated each band combination on an 8-day basis using standard model performance metrics. Some applications, such as crop water productivity (Zwart & Bastiaanssen, 2004), assess cumulative ET effects, so we also evaluated the indices on a seasonal basis as well.

Observation-Based Soil Evaporation
We estimated soil evaporation on an 8-day basis for each flux tower and corresponding MODIS pixels using the following energy balance equation: where LE s,resp is the soil evaporation response defined by the residual of the energy terms (net radiation, R n , sensible heat flux, H, and soil ground heat flux, G) and transpiration as simulated by PT-JPL (LE c,ptjpl ).
The Priestley-Taylor Jet Propulsion Laboratory model (Fisher et al., 2008) estimates evapotranspiration in energy terms (latent heat). Latent heat is the sum of three components: LE c , LE s , and evaporation from the canopy itself following a wetting event. We set the surface wetness constraint that accounts for evaporation from precipitation intercepted by the canopy to zero according to García et al. (2013). Intercepted evaporation can be high in humid and densely vegetated ecosystems, but it is generally negligible in drier ecosystems such as those analyzed in this study (Jasechko et al., 2013). The model diagram in Figure 1 defines PT-JPL in terms of its inputs (EVI, NDVI, average daily air temperature, T av , net radiation for the canopy, R nc , net radiation for the soil, R ns , and soil ground heat flux, G), environmental constraints (green canopy fraction, F g , temperature, F t , moisture, F sm , and leaf age, F a ), and outputs (LE c and LE s ). Readers can find an exhaustive list of model equations and other pertinent details in Fisher et al. (2008) and García et al. (2013).
Potential evapotranspiration in PT-JPL is defined in terms of the Priestley-Taylor equilibrium equation: where α is the Priestley-Taylor coefficient (1.26°C kPa −1 ), Δ is the slope of the saturation vapor pressure curve, and γ is the psychometric constant (0.066 kPa −1°C−1 ). Net radiation is partitioned into R nc and (R ns -G) according to an exponential decay function of leaf area index (LAI) and the Beer-Lambert extinction coefficient (Ross, 1976). LAI is estimated as a function of the NDVI-driven vegetation fraction. The normalized difference vegetation index is sensitive to changes in vegetation abundance via the absorption and scattering characteristics of canopies in the visible red and NIR (Tucker, 1977(Tucker, , 1979. As canopies become more photoactive, the absorption of chlorophyll in the visible red increases. At the same time, NIR is scattered due to the internal structure of the canopy. We derived it from MYD09A red and NIR bands. Evapotranspiration is downregulated by the four environmental constraints to maintain equilibrium with PET. Plants adjust stomatal conductance according to changing light levels and therefore available energy for photosynthesis, which the model constrains by EVI (Huete et al., 2002) via F g . The Enhanced Vegetation Index is more sensitive to the chlorophyll content of the canopy then NDVI, because it makes corrections for canopy background and the atmosphere. We derived it from MYD09A blue, red, and near-infrared (NIR) bands. As with other chemical reactions, plants tend to photosynthesize at an optimal temperature. The reaction slows (speeds up) when air temperatures cool (warm) according to an asymptotic curve (F t : June et al., 2004). When air temperatures exceed this optimum and/or conditions become too dry, plants reduce stomatal conductance to prevent an excessive loss of moisture. The leaf age constraint (F a ) is defined in terms of relative changes in canopy light absorptance during senescence. When plants transition into the senescence phase after peak productivity, stomatal conductance declines as the plants no longer photosynthesize. The seasonal effect in PT-JPL is estimated as a function of peak productivity or maximum seasonal NDVI.
Soil evaporation is constrained by F sm , so that F sm is defined in terms of relative humidity and the vapor pressure deficit (VPD): RH (VPD/β) . Vapor pressure deficit is essentially the reciprocal of RH. It is defined as the difference between the vapor pressure of an air parcel above the surface that is saturated with moisture and the current vapor pressure. β is a constant equal to 1.0 kPa. Figure 1. Schematic representation of the Priestley-Taylor Jet Propulsion Laboratory assuming evaporation from precipitation intercepted by the canopy is zero (i. e., evapotranspiration is the sum of transpiration and soil evaporation). Transpiration is a function of canopy net radiation (R nc ) constrained by the green canopy (F g ), temperature (F t ), and leaf age (F a ) fractions. Soil evaporation is a function of soil net radiation (R ns -G) constrained by the soil moisture (F sm ) fraction. F g , F t , F a , and F sm are derived from the enhanced vegetation index (EVI), average daily temperature (T av ), the normalized difference vegetation index (NDVI), and the vapor pressure deficit (VPD).

Band Selection for Soil Evaporation Prediction
The apparent thermal index capitalizes on the response of the diurnal temperature range to changes in plant and soil moisture status. Water has a higher heat capacity than air, so the diurnal temperature range tends to be lower (higher) under moist (dry) canopy conditions all other factors (e.g., solar irradiance and declination) being equal (García et al., 2013).
where DT is the diurnal temperature range, defined as daytime-nighttime land surface temperature, and DT max is the maximum diurnal temperature range (60°C). We took daytime and nighttime thermal values from MYD11A2. Yao et al. (2013) first presented this version of ATI, which is a simplification of García et al. (2013) as it assumes a linear relationship between DT and F sm . We selected the Yao et al. (2013) version, because neither version has been compared to one another. In addition, the Yao et al. (2013) version is derived from MODIS thermal bands. ATI is a scalar (0-1) like the original F sm , so we substituted it directly into equation 4 for the analysis.
The normalized difference water index and LSWI are essentially the same index, but each capitalizes on two different regions of the shortwave infrared (SWIR). The SWIR region at approximately 1,240 nm (Gao, 1996) and 2,130 nm (Chandrasekar et al., 2010) is sensitive to liquid water molecules in the canopy, while NIR serves to normalize the relationship across different canopies, because this region of the electromagnetic spectrum is less sensitive to changes in water content (Chen et al., 2005). Previous studies featured another index with SWIR at 1,640 nm (MODIS band 6), but this region is adversely affected by water vapor and aerosols in the atmosphere, so we did not consider it for our analysis.
where SWIR in the case of NDWI and LSWI is taken from MYD09A bands 5 and 7, respectively. Unlike F sm and ATI, NDWI and LSWI are not scalars. We therefore converted them to scalars according to Xiao et al. (2004): where NDWI norm and LSWI norm are NDWI and LSWI scaled to the maximum seasonal values NDWI max and LSWI max .
We also scaled all possible band combinations from MYD09A and MYDOCGA to seasonal maxima analogous to NDWI and LSWI to evaluate the performance of potentially new indices. We only included the blue bands (8-10) from MYDOCGA, because bands 11-16 were consistently of poor quality.

Model Evaluation
We evaluated the indices differently for the 8-day and seasonal time steps. For the 8-day time step, we used the coefficient of determination (R 2 ), cross-validated root mean square error (RMSE), normalized RMSE, and two error metrics that decompose RMSE into systematic error or bias (RMSE s ) and unsystematic or random error (RMSE u ) (Willmott, 1981). All relationships were statistically significant at the 95% confidence band (p < 0.05). We used λ-λ plots, which are essentially cross correlation plots for R 2 , to pinpoint potentially new indices that perform better then indices identified in the literature. To isolate model performance of LE s, resp from overall performance of simulated LE, we also evaluated the performance of LE c,ptjpl with observed LE. We compared seasonal estimates with a two-sided t test for independent samples of unequal variance. We used the two-sample t test in this case to determine if the difference in seasonal means of observed and simulated LE was zero (null hypothesis) at the 95% confidence band. A rejection of the null hypothesis meant that significant differences in the means could affect interpretation of cumulative effects in ET over the growing season.

Indices From the Literature
Three of the indices (F sm , LSWI, and ATI) exhibited similar seasonality while NDWI showed little to no seasonality ( Figure 2). LSWI showed less variation from 8-day period to 8-day period, while ATI was much more sporadic. The original F sm , LSWI, and ATI tended to have two major oscillations throughout the year. The indices were high outside the analysis window (i.e., pre-planting and post-harvest) when little vegetation was on the surface and LE s → LE. Another minor peak occurred during the period of maximum crop productivity when the canopy is closed and LE s → 0. The troughs in between these peaks corresponded to the green-up and brown-down phases of crop development. Two general patterns in the seasonality of the original F sm , LSWI, and ATI with relation to LE s emerged across the sites. US-Ib1 (Figures 2a and 2c) is rainfed, because it has a continental climate and receives ample rainfall during the growing season. Conversely, US-Twt (Figures 2b and 2d) has a Mediterranean climate, receives little rainfall during the growing season, and consists of rice. The rice fields are flooded throughout the growing season. The gap between LE c and LE s during the growing season was therefore much wider (narrower) for US-Ib1 (US-Twt). LSWI tended to correspond better to LE s than other indices for irrigated sites, because it remained comparatively high during the growing season. The original F sm formulation on the other hand dropped during the growing season and therefore tended to correspond better to LE s than other indices for rainfed sites.
These patterns were confirmed by the performance statistics at irrigated/flooded sites (US-Ne1, US-Ne2, and US-Twt). Neither the RH-based F sm or LSWI were particularly strong performers, since they only explained 19-31% in LE s,resp variance (Table 3). Their integration with LE c did however lead to an overall improvement in LE estimation over LE c alone, explaining an additional 3-11% in observed LE variance. LSWI tended to explain more variance (i.e., higher R 2 ) and be less prone to over-fitting (i.e., lower RMSE) than the original F sm . As illustrated in Figures 3a and 3b, the original F sm produced a negative LE bias during the growing season around maximum productivity when LE is high. If either the original F sm or LSWI are used to estimate LE s , bias-correction over irrigated/flooded fields is feasible, since both had more than double the systematic error as unsystematic error. The performance of ATI was slightly worse than the original F sm and LSWI, followed by NDWI, which was clearly the worst performing index. ATI explained an additional 1% in LE s,obs variance at US-Ne1 than LSWI and the original F sm , but had much higher RMSE. The performance of LSWI at the flooded site (US-Twt) was markedly stronger than the other indices. LSWI explained an additional 5% and 15% of LE s,obs variance compared to the original F sm and ATI, respectively.
The performance of the original F sm and LSWI was more mixed at the rainfed sites (Table 3). The original F sm performed better than LSWI for four sites (US-Ne3, US-Ro1, US-Ib1, and US-Crt) explaining 20-34% in LE s,obs variance. LSWI performed better than the original F sm for the other two sites (US-Ro3 and US-Bo1) explaining 20-23% in LE s,obs variance. At these two sites, the original F sm performed particularly poorly (R 2 = 0.01, RMSE = 2.49 MJ m −2 d −1 and R 2 = 0.16, RMSE = 1.53 MJ m −2 d −1 ). Both indices performed poorly at US-Arm. As with the irrigated sites, the LSWI LE s parameterization combined with LE c,ptjpl led to an overall model improvement, explaining an additional 3-10% in observed LE variance. As illustrated in Figures 3c and 3d, the bias in the original F sm and LSWI tended to lead to underestimates of LE throughout the growing season. Although systematic errors were closer to unsystematic errors, some gains could be made with bias-correction. NDWI performed considerably better at US-Bo1 and other rainfed sites, but was prone to leveraging (over-fitting) as shown by the high RMSE. ATI was the worst performing of the indices at rainfed sites.

Other Two-Band Combinations
The additional bands afforded by MYDOCGA in one case (US-Ib1) led to higher correlations with LE s,obs than other band combinations (Figure 4a), but as shown at US-Twt (Figure 4b), MYD09A bands in general were comparable or superior. As with indices from the literature, the new band combinations performed very poorly at US-Arm. The MODIS red band (b1) was consistently the most important band and explained 18-35% in LE s,obs variance in combination with the MODIS NIR (b2), green (b4), and SWIR (b5) bands. MODIS bands 1 and 5 were the top-performing combination for only one site (US-Crt), but were selected for comparison with indices from the literature, because the combination was consistently among the top performers at each site. We refer to the band combination as the soil moisture divergence index (SMDI).
The soil moisture divergence index responded similarly to LSWI (Table 1). It tended to underpredict during the growing season at rainfed sites and was closer to observation-based values during the growing season at irrigated/flood sites than the RH-based F sm . It outperformed LSWI for four of the sites: US-Ro1, US-Ro3, US-Twt, and US-Crt. The most significant improvement was at US-Twt where SMDI increased explained variance by 2% with a 0.04 MJ m −2 d −1 lower RMSE ( Figure 5).
Much of the variability in the indices was averaged out on a seasonal basis. As a result, the distinction in performance between rainfed and irrigated crops was less apparent. The soil moisture divergence index clearly produced mean estimates closer to observed seasonal means than the other indices ( Figure 6). With the exception of US-Arm, SMDI had a mean t-statistic ranging from −1.09 to 0.23, meaning it was not statistically different from the observed mean at each site (Table 4). This was followed by the SWIR-based  Note. SMDI is a newly proposed soil moisture index derived MODIS bands one and five. Relationships are significant at the 95% confidence band.

10.1029/2019WR026290
Water Resources Research indices, LSWI and NDWI, with a mean t-statistic ranging from −1.42 to 0.06 and −1.36 to 0.14, respectively. These indices on average deviated from the observed mean one season at each site. ATI was again the worst performing index with a mean t-statistic ranging from −2.87 to −0.41 and an average deviation from the observed LE mean of two seasons at each site. The original F sm on average was closer to the observed means than ATI, with an average t-statistic ranging from −2.39 to −0.09. It was statistically different from the observed mean at each site on average approximately 1.5 seasons.

Discussion
Global ET is trending upward in response to global heating induced increases in atmospheric moisture demand (Jung et al., 2010). Scientists attribute these trends to increases in transpiration in humid regions of the world. Soil evaporation is much more important and partially offsets transpiration rates in dry regions of the world, leading to some dispute on the magnitude and direction of ET trends (Zhang et al., 2016). Accurate simulation of trends in these regions is important, because they are generally the most vulnerable to water shortages and food insecurity. Further, some regions (e.g., the Sahel) are major zones of transition between monsoon and dry interior climates that have undergone significant land use/cover shifts over the past several decades. Researchers have linked these shifts to regional moisture feedbacks that if sufficiently quantified could inform water pricing and other initiatives aimed at reducing the amount of water we consume for the amount of food we produce (Ellison et al., 2017). The substitution of RH-based  F sm with the new index presented here (SDMI) or LSWI could lead to modest improvements in soil evaporation estimates and therefore ET partitioning and trend analysis in dry regions of the world. Not only are they a more direct measure of moisture supply, but they are derived from the MOD09A product, which has a native spatial resolution of 500 m-much higher than climate reanalysis (typically ≥9 km).

Reformulated PT-JPL Soil Evaporation Performance
Shortwave infrared in combination with the visible red (SMDI) or NIR (LSWI) MODIS bands captured shortterm variability in and seasonal mean soil evaporation for irrigated/flooded agricultural fields more than MODIS thermal bands or ground-based estimates of RH as implemented in F sm of PT-JPL. The performance of these band combinations was lower at rainfed sites, but comparable to the RH-based F sm . Yao et al. (2018) produced a ΔR 2 of 0.07 to 0.09 with SWIR bands over the RH-based formulation for flux towers in croplands in China, which is higher than the improvements made in this study. It is important to note that initial PT-JPL performance was lower in the China study and they considered SWIR in both the formulation of transpiration and soil evaporation. Lastly, they did not make a seasonal comparison, which showed some very clear distinctions among soil evaporation formulations. The fields that SMDI and LSWI performed better than the original F sm represent moisture-limited agroecosystems for which direct remote sensing models, such as PT-JPL, tend to underperform compared to indirect methods. The shortwave infrared region of the electromagnetic spectrum (900-2,500 nm) contains several wavelengths that absorb strongly due to the moisture content of the Earth's surface. MODIS bands 5 and 7 proved particularly useful for estimating soil evaporation in irrigated/flooded fields during the growing season(s) where the divergence between transpiration and soil evaporation was low. Until now, the NIR region was used in SWIR-based indices to normalize the response of the surface to changes in moisture content, but this may result in contradictory index response to soil and canopy water (Gao, 1996). NDWI is particularly susceptible and tended to perform poorly in this study.
Scientists typically do not use the visible red region for normalization, but it clearly showed a clear advantage in this study via the new SMDI. SDMI outperformed NDWI and LSWI on an 8-day and seasonal basis. Furthermore, the visible red MODIS band was the most important for estimating soil evaporation in most band combinations. Photoactive canopies absorb strongly in the visible red, because the photosynthetic pigment chlorophyll preferentially absorbs in this region of the electromagnetic spectrum (Ollinger, 2011). As the photoactivity of canopies increase over the season, carbon assimilation and reciprocally transpiration increase. Red and NIR in PT-JPL therefore track phenological changes in transpiration. Soil reflectance in the red increases proportionally to its albedo (wet → dry) (Huete, 1988). This variation is smaller than canopy absorption, so perhaps the red region is more effective at normalizing for soil than NIR. Further, the results indicate that the visible red has additional explanatory power decoupled from phenology. Researchers should consider addressing the issue of data redundancy in PT-JPL to prove this hypothesis.
The poor performance of ATI was surprising, since it is gaining popularity in the Earth system science community to improve the soil evaporation component of direct remote sensing models of ET. The apparent thermal index was highly variable and this variability led to low performance at both 8-day and seasonal time steps. The index is sensitive to changing weather conditions (air temperature, wind velocity, and air humidity) particularly in vegetated areas (Van doninck et al., 2011). These conditions affect daytime and nighttime LSTs, which are in the denominator of and therefore greatly affect ATI. Weather satellite data, which was used to derive ATI in García et al. (2013), has a much higher temporal frequency than MODIS. High-frequency data may help to reduce noise from changing weather conditions. Researchers should compare ATI derived from weather satellites and MODIS to see if there are clear advantages of one derivation from the other.

Original PT-JPL Soil Evaporation Performance
The lower performance of RH-based F sm compared to the SWIR-based indices in irrigated/flooded fields can be attributed to two factors: (i) poor response of soil evaporation to overlying atmospheric moisture and (ii) sensitivity of the RH formulation to changing weather conditions. We would expect that irrigation/flooding would humidify the air. The RHbased F sm according to Bouchet's hypothesis should decrease and lead to higher soil evaporation rates. This was often not the case. PT-JPL does not include aerodynamic and surface resistance terms, as advection is subsumed by the α term. This reduces the propagation of error in the model and is well justified over large areas (Ershadi et al., 2014) but may also 10.1029/2019WR026290 present a limitation under certain local conditions. The canopy at US-Twt consisted of rice cultivated in flooded fields below sea level within a series of levees in the San Joaquin/Sacramento River Delta of California. The site is particularly windy and has very low surface roughness. In this case, the advection of moisture from the surface could be considerable, leading to drying of the atmosphere over the field. We did not consider windspeed in this study, because PT-JPL does not include it explicitly. The canopies at the other two irrigated sites (US Ne-1 and Ne-2) were taller/rougher and the performance of atmosphericdriven soil evaporation was better. At these sites, we suspect that model bias and uncertainties were due mainly to the structure of RH-based F sm than physical processes as with US-Twt. Like ATI, RH-based F sm is sensitive to weather conditions. It is a power relationship, so we would expect as with ATI that any changes in RH would be amplified. On an 8-day basis, the sensitivity did not greatly affect its performance at the two sites. On a seasonal basis however, we saw clear differences in performance compared to SWIRbased indices. The sensitivity may be particularly problematic when considering climate reanalysis data with its inherent bias and coarse spatial resolution.

Knowledge Gaps
We are currently expanding our analysis to include eddy covariance flux towers in croplands and grasslands across the globe to address some gaps in the analysis presented here. First, we are comparing red-SWIR-RADAR fusion to estimate F sm . Red and SWIR bands are sensitive to both the moisture content of the canopy and soil background. MODIS reflectance is at a higher temporal resolution than other spaceborne products, but its signal still suffers from cloud cover, especially in the tropics. Scientists typically use MODIS blue bands 8-16 to analyze ocean processes, so the land until recently with the release of MYDOCGA was masked out. Marshall et al. (2016) revealed via ground spectroscopy the value of the visible blue region of the electromagnetic spectrum for estimating soil evaporation, so we hoped MYDOCGA could improve soil evaporation estimation, but unfortunately, it did not add much value to the analysis. This region of the spectrum has a particularly narrow atmospheric window that greatly interferes with satellites, but has little impact on ground spectroscopy. Microwave surface (L-band) emissivity can better distinguish soil from canopy moisture content than red-SWIR, and cloud cover does not affect it. A red-SWIR-RADAR index for F sm could therefore be a powerful new tool to estimate soil evaporation. Since the fate of the L-band SMAP is still unknown, we are looking at other microwave L-band sensors, such as Argentina's space agency CONAE SAOCOM to perform the fusion. Second, we did not handle "mixed pixel" effects explicitly in our analysis. In some cases, the flux tower footprints were quite small (~100 m), so 1 km spatial resolution may not ably capture such localized processes. That said, the fields we analyzed were in heavily cultivated areas, and 1 km remote sensing red-SWIR in most cases did better than ground-based RH. Perhaps course resolution data in this case is able to pick up general patterns in soil moisture status across neighboring fields. We analyzed the MODIS data at 1 km spatial resolution, because the thermal and ocean MODIS products are only available at that resolution. Disaggregating them therefore would add no explanatory power and risk apples to oranges comparison. Since the 500 m MODIS bands were the most valuable, we will use these at their native resolution for our red-SWIR-RADAR fusion. We will also perform the analysis for flux towers in relatively more heterogeneous landscapes. Third, we did not analyze evaporation from precipitation intercepted by the canopy, because it is generally low in the agroecosystems we analyzed. In the future, we will consider modeling intercepted evaporation separately as in the original PT-JPL formulation to see if this component does in fact confirm our results. Lastly, we intend to see if the performance of RH-based F sm in indeed greatly affected by wind and moisture advection.
A couple of avenues worth exploring to improve upon optical-based F sm go beyond our current work. The U. S. National Academy of Sciences Decadal Survey's Surface Biology & Geology (SBG) proposes fusing 60 m thermal with 30 m hyperspectral remote sensing in next generation NASA missions to better understand plant processes on the Earth. Satellite-based hyperspectral image analysis is undergoing a renaissance with the recent and eminent launch of the Italian Space Agency's PRISMA and DLR Space Administration's ENMAP hyperspectral satellites, respectively. Similarly, the extended ECOSTRESS mission includes a thermal infrared radiometer onboard the international space station that measures canopy temperature at 70 m resolution. The data are used to estimate evapotranspiration and its components for monitoring drought (Fisher et al., 2020). Scientists can use these missions to perform a comprehensive analysis of hyperspectral narrowband and thermal band combinations sensitive to soil evaporation, which would be invaluable information for NASA's upcoming Surface Biology and Geology (formerly HyspIRI) hyperspectral mission. The mission promises to be the first long-term mission to provide consistent and continuous global hyperspectral-thermal coverage.

Conclusion
Evapotranspiration is an important land surface process and a decision-support tool. We can largely estimate it with biophysical parameters derived from Earth observation. Soil evaporation is an important component of evapotranspiration, particularly in dry regions of the world, but has proved to be more difficult to estimate than transpiration with Earth observation data. This could be due to the comparatively loose coupling of soil evaporation to biophysical parameters. Visible blue, shortwave infrared, thermal, and microwave Earth observation have been used to improve the simulation of soil evaporation, because they estimate soil moisture supply more directly and are at a higher spatial resolution than more traditional climate-based parameterizations. We have not made a broad comparison of Earth observation alternatives until now. Our study evaluated the performance of visible blue, shortwave, thermal, and other regions of the electromagnetic spectrum as detected by MODIS for estimating soil evaporation. It showed that shortwave infrared bands normalized by the visible red band are particularly strong at estimating soil evaporation in irrigated/flooded agricultural fields where the difference between transpiration and soil evaporation over the growing season is low. Performance of these band combinations in rainfed croplands was lower, but comparable to climatebased parameterizations. We hypothesize the relatively low short-term variation in the red-shortwave infrared led to more representative seasonal estimates of evapotranspiration across the rainfed and irrigated/ flooded agricultural fields analyzed. Given these findings, we recommend red and shortwave infrared band combinations be used to estimate soil evaporation in direct remote sensing models, such as PT-JPL. We did not include the microwave region in our analysis, but we are currently fusing microwave (L-band) emissivity with red and shortwave infrared reflectance. An index that combines red, shortwave infrared, and microwave could finally close the gap between transpiration and soil evaporation performance in direct remote sensing models, such as PT-JPL.