A Bayesian analysis of sensible heat flux estimation: Quantifying uncertainty in meteorological forcing to improve model prediction
Abstract
[1] The influence of uncertainty in land surface temperature, air temperature, and wind speed on the estimation of sensible heat flux is analyzed using a Bayesian inference technique applied to the Surface Energy Balance System (SEBS) model. The Bayesian approach allows for an explicit quantification of the uncertainties in input variables: a source of error generally ignored in surface heat flux estimation. An application using field measurements from the Soil Moisture Experiment 2002 is presented. The spatial variability of selected input meteorological variables in a multitower site is used to formulate the prior estimates for the sampling uncertainties, and the likelihood function is formulated assuming Gaussian errors in the SEBS model. Land surface temperature, air temperature, and wind speed were estimated by sampling their posterior distribution using a Markov chain Monte Carlo algorithm. Results verify that Bayesian-inferred air temperature and wind speed were generally consistent with those observed at the towers, suggesting that local observations of these variables were spatially representative. Uncertainties in the land surface temperature appear to have the strongest effect on the estimated sensible heat flux, with Bayesian-inferred values differing by up to ±5°C from the observed data. These differences suggest that the footprint of the in situ measured land surface temperature is not representative of the larger-scale variability. As such, these measurements should be used with caution in the calculation of surface heat fluxes and highlight the importance of capturing the spatial variability in the land surface temperature: particularly, for remote sensing retrieval algorithms that use this variable for flux estimation.
Key Points
- Bayesian inference with prior info is well suited for input uncertainty analysis
- The land surface temperature has large uncertainties in flux estimation
- Scaling of surface temperature is required to capture spatial variability
1. Introduction
[2] Evapotranspiration (ET) is a major component of the hydrological cycle [Brutsaert, 2005] and can account for more than 90% of the precipitation in semiarid and arid regions [Wang et al., 2012]. Accurate estimation of ET is required to better constrain and understand hydrometeorological behavior across a range of systems and scales: locally, regionally, and globally. ET is usually represented as the latent heat flux ($\lambda E$) from the land surface to some level in the overlying atmosphere. Although there are a number of techniques available to estimate the land surface fluxes of heat and water [Kalma et al., 2008; Wang and Dickinson, 2012], a common approach is via evaluation of the energy balance at the surface. In models using this approach, ET (or latent heat flux, $\lambda E$) is usually derived as the residual term of the energy budget, i.e., $\lambda E = R_n - G_0 - H$, where $R_n$ is the net radiation, $G_0$ is the ground heat flux, and $H$ is the sensible heat flux. In such instances, it is the calculation of $H$ that is of key importance in the estimation of ET. One example from this family of models is the Surface Energy Balance System (SEBS) model [Su, 2002], an energy balance method that is widely used to estimate actual ET via a combination of remote sensing and in situ meteorological observations [McCabe and Wood, 2006; Su et al., 2005].
[3] Model simplifications, natural variability in system response, and issues of measurement or sampling errors in the input forcing cause mismatches between the modeled and observed responses, in both physically based (e.g., SEBS) and empirical models [Kalma et al., 2008; Samanta et al., 2007]. Probabilistic (stochastic) modeling methodologies are hence of particular interest because they allow an explicit examination of data and modeling uncertainties using probability distributions [Kavetski et al., 2006a; Luo et al., 2007] or empirical ensembles [Pan et al., 2008; Peters-Lidard et al., 2011]. Probabilistic approaches have been used previously in groundwater models [Dagan, 1985], conceptual rainfall-runoff models [Kuczera et al., 2006], and integrated water resources systems [Castelletti and Soncini-Sessa, 2007]. However, there are limited cases detailing the use of probabilistic frameworks in heat flux modeling. In a recent contribution, van der Tol et al. [2009] developed a Bayesian approach for the estimation of heat fluxes over vegetated land surfaces and showed that the integration of different prior information within a land surface modeling scheme improved the estimation of model parameters. Samanta et al. [2007] and Li et al. [2010] used a Bayesian approach to fit the Penman-Monteith model to half-hourly transpiration rates for a sugar maple stand in different regions, finding considerable uncertainties in predicted transpiration. In general, the nonlinearity of the model equations, process complexity and the difficulties in specifying realistic uncertainty models represent challenging research problems for developing and applying probabilistic techniques in heat flux models.
[4] In energy balance methods (including SEBS), the estimation of the sensible heat flux presents greater difficulties than the estimation of the available energy flux (i.e., $R_n - G_0$). The sensible heat flux is the transfer of heat from the land surface to the atmosphere, represented conceptually as a temperature gradient, $H \propto (T_s - T_a)/r_a$, where $T_s$ is the land surface temperature, $T_a$ is the air temperature, and $r_a$ is the aerodynamic resistance to heat transfer. Note that $r_a$ is itself a function of the wind speed ($u$) and of the aerodynamic roughness of the land surface. Given this expression, the main uncertainties in the estimation of the sensible heat flux in SEBS arise due to uncertainties in the input meteorological variables ($T_s$, $T_a$, $u$) and in the aerodynamic roughness parameterization.
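As a minimal numerical sketch of the bulk relation described above, the following assumes the common form $H = \rho c_p (T_s - T_a)/r_a$ with illustrative constants for air density and specific heat. The function name, constants, and input values are hypothetical and are not taken from the SEBS code; the point is only to show how an error in the land surface temperature propagates linearly into the flux.

```python
RHO_AIR = 1.2     # assumed air density, kg m^-3
CP_AIR = 1004.0   # assumed specific heat of air at constant pressure, J kg^-1 K^-1


def sensible_heat_flux(t_s, t_a, r_a):
    """Bulk sensible heat flux (W m^-2) from a surface-air temperature difference.

    t_s, t_a: land surface and air temperature (same units, K or deg C)
    r_a: aerodynamic resistance to heat transfer (s m^-1), must be positive
    """
    if r_a <= 0:
        raise ValueError("aerodynamic resistance must be positive")
    return RHO_AIR * CP_AIR * (t_s - t_a) / r_a


# Illustrative values only: a 1 K change in t_s shifts H by rho * cp / r_a
base = sensible_heat_flux(30.0, 25.0, 50.0)
shifted = sensible_heat_flux(31.0, 25.0, 50.0)
print(base, shifted - base)
```

Because the flux scales with the temperature difference rather than with either temperature alone, a fixed error in $T_s$ matters most when $T_s - T_a$ is small.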
[5] Timmermans et al. [2011] found that uncertainties in the estimation of via SEBS were likely due to the incorrect parameterization of the roughness height for heat. On the other hand, van der Kwast et al. [2009] found that SEBS is more sensitive to the surface temperature errors than to the surface aerodynamic parameters. In another study, Gibson et al. [2011] found that SEBS is sensitive to and , depending on the land cover and wet limit criteria. As can be seen, identifying the true nature of the uncertainties resulting from these input variables remains a challenging and unresolved task.
[6] The aim of this paper is to provide insights into the spatial representativeness of the key input meteorological variables by quantifying the uncertainties in their actual measurements. More specifically, our research questions are as follows: (a) which meteorological forcing set (land surface temperature, air temperature, or wind speed) has the greatest influence on the uncertainty in sensible heat flux values estimated using the SEBS model? and (b) what are the likely reasons for such uncertainties?
[7] These questions are investigated using a Bayesian inference technique (BIT), where uncertainties in the observed input and response data are represented using probability distribution functions (PDFs), and the SEBS model is used to describe the physics of the sensible heat flux process. Inferred values of the input variables are then used to quantitatively estimate the errors in their measurements. The likely causes of these uncertainties are then discussed. One of the major differences between the current study and previous investigations is the use of Bayesian uncertainty analysis instead of a sensitivity analysis to quantify the errors in sensible heat flux estimation. Moreover, instead of associating all uncertainties to the parameterization of the models [Samanta et al., 2007, 2008; van der Tol et al., 2009], this paper examines the uncertainties inherent within the input variables used in heat flux estimation.
2. Field Measurements and Site Description
[8] This investigation is based on data from the Walnut Creek (WC) watershed, centered at 41.96°N, 93.6°W and located near Ames, Iowa, in the USA. Meteorological and flux data for the study area were measured across 12 eddy covariance towers, collected as part of the Soil Moisture-Atmospheric Coupling Experiment and the Soil Moisture Experiment 2002 (SMEX02) campaigns [Kustas et al., 2005; Prueger et al., 2009] during June and July 2002. The locations of the towers within and around the study area are shown in Figure 1.
[9] The land cover of the region is comprised primarily of either corn (Zea mays L.) or soybean (Glycine max L. Merr.). Nearly 95% of the region and watershed is used for row crop agriculture, with 80% of that being corn and soybean in equal proportions. The climate is humid, with an average annual rainfall of 835 mm/yr. The topography is characterized by low relief and poor surface drainage. Dominant soil types of the study area are clay and silty clay loams, with generally low permeability [Hatfield et al., 1999].
[10] Meteorological data along with surface heat flux and vegetation measurements are available for 20 days from 20 June to 9 July 2002 (day-of-year 171–190). During this time period, the vegetation grew rapidly and surface soil moisture changed from dry to wet due to rainfall events in early July. The eddy covariance flux towers provided measurements of the friction velocity ($u_*$), sensible heat flux ($H$), and latent heat flux ($\lambda E$). Air temperature and humidity were measured using Vaisala HMP-45C instruments, while sonic temperature and wind speed fluctuations were measured using Campbell Scientific CSAT3 sonic anemometers. Radiometric temperatures were measured using Apogee thermal-infrared radiometers (model IRTS-P) with a nominal 60° field of view. The Apogee sensor height was kept at 2.5 m above soybean and 5 m above corn canopies at all corresponding towers. The effective canopy level footprint area for the land surface temperature sensor was approximately 7 m^{2} for soybean towers and 26 m^{2} for corn towers. All data for rain periods were removed from the analysis, as the CSAT3 sonic instrument does not provide reliable results during such conditions. In addition, sporadic spikes and out-of-range values were removed. During the field campaign, the vegetation height ($h_c$), leaf area index (LAI), and fractional vegetation cover ($f_c$) varied with crop growth stage [Anderson et al., 2004], with ranges shown in Table 1.
Crop | Vegetation height (m) | Leaf area index (m^{2}/m^{2}) | Fractional vegetation cover |
---|---|---|---|
Soybean | 0.2–0.6 | 0.4–3.7 | 0.2–0.9 |
Corn | 0.7–2.2 | 1.1–5.6 | 0.5–1.0 |
[11] Meteorological and heat flux data are averaged to 30 min. The measured sensible heat flux data are used without any closure correction. All records are filtered for rain events and limited to the daytime period from 7:30 A.M. to 6:00 P.M. local time. More detailed site information and a description of the experiments can be found in Kustas et al. [2005] and Prueger et al. [2005].
3. Modeling Approach
3.1. Surface Energy Balance System
[14] The functions proposed by Beljaars and Holtslag [1991] and evaluated by van den Hurk and Holtslag [1997] for stable conditions and the functions proposed by Brutsaert [2005] for unstable conditions are used for atmospheric stability corrections in the atmospheric surface layer. The roughness heights for momentum and heat transfer ($z_{0m}$ and $z_{0h}$) are important parameters used in the Monin-Obukhov similarity theory (MOST) and bulk atmospheric similarity theory (BAST) equations and are functions of the biometeorological conditions of the land surface. These two key parameters are estimated in SEBS using the methodology developed by Su et al. [2001], which is based on vegetation phenology, air temperature, and wind speed.
[15] After the estimation of $H$, SEBS uses a scaling method to scale the derived $H$ between hypothetical dry and wet limits based on the relative evaporation concept. Finally, this scaled $H$ can be used to calculate the latent heat flux as a residual term in the general energy balance equation, i.e., as $\lambda E = R_n - G_0 - H$. Figure 2 provides a schematic representation of the model as employed in this application (see Su [2002] for further details on the model description and formulation).
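The two steps in this paragraph can be sketched as follows. This is an illustration under stated assumptions rather than the SEBS implementation: the relative evaporation is written in the form given by Su [2002], $\Lambda_r = 1 - (H - H_{wet})/(H_{dry} - H_{wet})$, the clipping of $H$ to the two limits is an assumption made here for robustness, and all function names and numbers are hypothetical.

```python
def relative_evaporation(h, h_wet, h_dry):
    """Scale H between the wet and dry limits; returns a value in [0, 1]."""
    if h_dry <= h_wet:
        raise ValueError("dry-limit H must exceed wet-limit H")
    # Assumption: clip H to the physically meaningful range before scaling
    h_clipped = min(max(h, h_wet), h_dry)
    return 1.0 - (h_clipped - h_wet) / (h_dry - h_wet)


def latent_heat_residual(r_n, g_0, h):
    """Latent heat flux as the residual of the surface energy balance (W m^-2)."""
    return r_n - g_0 - h


# Illustrative midday values in W m^-2 (not SMEX02 data)
print(relative_evaporation(h=150.0, h_wet=50.0, h_dry=450.0))
print(latent_heat_residual(r_n=500.0, g_0=50.0, h=150.0))
```

Since the latent heat flux is obtained by subtraction, any error in $H$ carries over to it with the opposite sign and the same magnitude.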
3.2. Bayesian Inference Technique
[16] In standard deterministic applications of the SEBS model, all input variables are fixed and constant at each simulation time step. In contrast, in a stochastic application, inputs and response variables can be considered as probability distributions or empirical ensembles of values, the envelope of which represents the range of plausible values. This allows for an accounting of uncertainties such as input variations across a heterogeneous site.
[17] For stochastic application of the SEBS model in this study, a Bayesian inference technique (BIT) is developed and linked with the SEBS model. The approach is partially analogous to the Bayesian total error analysis (BATEA) model [Kavetski et al., 2003, 2006a] and focuses on the uncertainty in the SEBS input forcings. In the terminology and notation adopted here, observed variables are indicated with a tilde (e.g., $\tilde{x}$), while their posterior estimates are indicated with a hat (e.g., $\hat{x}$).
[21] In the hierarchical Bayesian framework detailed earlier, $x$ is the “latent variable” and corresponds to estimates of the true inputs; these are not directly observed but are rather inferred as part of the BIT-SEBS procedure. The input and response error model parameters describe the statistical properties (e.g., mean and variance) of the input and response variables, respectively [Renard et al., 2011]. In this application, the values of these error model parameters are estimated and fixed prior to the BIT-SEBS inference using a separate data analysis procedure detailed later in this section.
[25] In equation 9, we set the prior mean of $x$ to the observed value $\tilde{x}$, which is equivalent to ignoring systematic errors in the observations. The prior standard deviation is specified by analyzing the spatial variability of the observed forcing field, thus corresponding to sampling uncertainty. This variability can be expressed as an absolute quantity, or as a fraction of $\tilde{x}$, or as a range based on expert knowledge of the input uncertainty.
[26] In the context of the inference equation 7, which is conditioned on the observed response data, the error model in equation 9 plays the role of a prior on $x$ before the response data are analyzed. Note that formulating the input error model as $p(x \mid \tilde{x})$, rather than $p(\tilde{x} \mid x)$, corresponds to using Bayes' identity in combination with a noninformative prior $p(x)$. It is also possible to use additional information, such as the average climatology, to define an informative prior $p(x)$ [Huard and Mailhot, 2006].
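Written out, the identity referred to in this paragraph takes the following form. The notation ($x$ for the true input, $\tilde{x}$ for its observation, $\sigma_x$ for the input error standard deviation) is assumed here for illustration, and the Gaussian observation error in the second line is an assumption chosen to be consistent with the Gaussian priors described in section 3.4.1.

```latex
\begin{align}
  p(x \mid \tilde{x}) &\propto p(\tilde{x} \mid x)\, p(x), \\
  p(x) \propto 1
  \ \text{and}\
  \tilde{x} \mid x \sim \mathcal{N}\!\left(x, \sigma_x^2\right)
  \quad &\Longrightarrow \quad
  x \mid \tilde{x} \sim \mathcal{N}\!\left(\tilde{x}, \sigma_x^2\right).
\end{align}
```

Under a flat $p(x)$ the Gaussian density is symmetric in its argument and its mean, so conditioning on the observation simply centers the distribution of the true input on the observed value.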
3.3. Markov Chain Monte Carlo Sampling of the BIT-SEBS Posterior Distribution
[28] The posterior distribution can be approximated using a Monte Carlo or Markov chain Monte Carlo (MCMC) sampling scheme. Due to the high dimensionality of the posterior PDF in this work, the slice sampling MCMC method of Neal [2003] is used. This method uses the prior as a proposal distribution and avoids requiring the user to specify a high-dimensional proposal distribution [Noh et al., 2010].
[29] A flowchart of the computational algorithm is shown in Figure 3. At each step of the MCMC simulation, the slice sampling algorithm draws a candidate value from the prior distribution (equation 9), runs the SEBS model with the candidate inputs, and evaluates the likelihood function (equation 10). This procedure is then repeated until the MCMC iterations converge. Other Monte Carlo methods for sampling from the posterior include standard Metropolis methods [Kavetski et al., 2006a, 2006b], which in some cases can be adapted to exploit the time dependence of the model [Kuczera et al., 2010]. To ensure that the MCMC algorithm explored all parts of the prior distributions, convergence diagnostics are applied as detailed in section 4.1.
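To make the sampling loop concrete, the sketch below implements a generic univariate slice sampler with the stepping-out and shrinkage procedures of Neal [2003]. It is not the BIT-SEBS implementation: the variant described above draws candidates from the prior, whereas this textbook version samples the unnormalized posterior directly. The one-line bulk formula standing in for SEBS, the constants, and all numerical values are hypothetical.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, max_steps=50, rng=None):
    """Univariate slice sampler with stepping-out and shrinkage."""
    if rng is None:
        rng = random.Random(0)
    samples, x = [], x0
    for _ in range(n_samples):
        # Draw the slice height under the density (in log space)
        log_y = log_density(x) + math.log(1.0 - rng.random())
        # Place an interval of the given width at random around x
        left = x - width * rng.random()
        right = left + width
        # Step out until each end lies outside the slice
        steps = max_steps
        while steps > 0 and log_density(left) > log_y:
            left -= width
            steps -= 1
        steps = max_steps
        while steps > 0 and log_density(right) > log_y:
            right += width
            steps -= 1
        # Shrink the interval toward x until a draw falls inside the slice
        while True:
            candidate = left + (right - left) * rng.random()
            if log_density(candidate) >= log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        samples.append(x)
    return samples


def make_log_posterior(ts_obs, sigma_ts, t_a, r_a, h_obs, sigma_h):
    """Gaussian prior on T_s times Gaussian likelihood on H (toy forward model)."""
    def log_posterior(t_s):
        log_prior = -0.5 * ((t_s - ts_obs) / sigma_ts) ** 2
        h_model = 1.2 * 1004.0 * (t_s - t_a) / r_a  # stand-in for SEBS
        log_like = -0.5 * ((h_obs - h_model) / sigma_h) ** 2
        return log_prior + log_like
    return log_posterior


log_post = make_log_posterior(ts_obs=30.0, sigma_ts=2.0, t_a=25.0,
                              r_a=50.0, h_obs=100.0, sigma_h=10.0)
draws = slice_sample(log_post, x0=30.0, n_samples=4000, rng=random.Random(1))
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 2))
```

In this toy setting the observed flux is informative about the surface temperature, so the posterior mean moves away from the observed 30.0 toward the value that reproduces the observed flux.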
3.4. BIT-SEBS Methodology for Analyzing SMEX02 Tower Data
3.4.1. Prior Uncertainty Analysis of Input Variables
[30] The “uncertain” input meteorological variables of the SEBS model used in this study include the air temperature ($T_a$), land surface temperature ($T_s$), and wind speed ($u$). For each of the uncertain input meteorological variables, a Gaussian prior PDF is specified, with a mean equal to the measured value and a standard deviation proportional to the spatial variability of the observed values. Hence, for each time step, the standard deviations of $T_a$, $T_s$, and $u$ are calculated as the standard deviation of observations across all 12 towers within the study area. In the case of wind speed, the Gaussian prior distribution was truncated at zero to avoid negative wind speeds being sampled when the observed values are small relative to their potential variability.
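The prior construction described here can be sketched in a few lines. The tower values below are invented for illustration, the use of the sample (n − 1) standard deviation is an assumption since the text does not specify the estimator, and truncation by rejection is one simple option among several.

```python
import random
import statistics


def prior_sigma(values_across_towers):
    """Spread of one variable across towers at one time step (sample std. dev.)."""
    return statistics.stdev(values_across_towers)


def sample_prior(mean, sigma, lower=None, rng=None, max_tries=10000):
    """Draw from N(mean, sigma); reject draws below `lower` when it is given."""
    if rng is None:
        rng = random.Random(0)
    for _ in range(max_tries):
        draw = rng.gauss(mean, sigma)
        if lower is None or draw >= lower:
            return draw
    raise RuntimeError("no draw found above the truncation bound")


# Hypothetical wind speeds (m/s) at 12 towers for a single 30 min time step
u_towers = [0.4, 0.9, 1.3, 0.2, 0.7, 1.1, 0.5, 1.6, 0.8, 0.3, 1.0, 0.6]
sigma_u = prior_sigma(u_towers)

# Prior for the first tower: centered on its own observation, truncated at zero
rng = random.Random(42)
u_draws = [sample_prior(u_towers[0], sigma_u, lower=0.0, rng=rng)
           for _ in range(1000)]
print(round(sigma_u, 3), min(u_draws) >= 0.0)
```

Note that truncation shifts the mean of the sampled prior slightly above the observed value when the observation is close to zero.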
[31] Other input variables (e.g., humidity) are assumed constant and equal to the observed value at the tower. SEBS model parameters are also calculated deterministically for each time step at each tower using the corresponding measured vegetation height and density and meteorological variables. Due to careful in situ observations of the vegetation parameters at each tower [Anderson, 2003; Kustas et al., 2005], the dynamics in aerodynamic roughness of the surface are preserved, and uncertainties in parameterization of the roughness height are expected to be reduced.
[32] Figure 4 presents measured values of precipitation, land surface temperature, air temperature, and wind speed during the study period across all towers. A rain event on day-of-year 172 was followed by a 12 day dry period, causing the soil moisture to decrease from field capacity to relatively dry conditions. Subsequently, some rain events during day-of-year 185–188 increased the soil moisture. Figure 4 shows that relative to the corn towers, soybean towers measure higher land surface temperature, air temperature, and wind speeds.
[33] As described in section 3.2, the Bayesian inference for each of the meteorological variables ($T_s$, $T_a$, $u$) requires the construction of a prior for each variable, at each time step, and for each tower. Here each meteorological variable at each simulation time step at each of the 12 towers is given its own Gaussian prior PDF, with mean given by the observed value at that tower and time step, and a standard deviation estimated from the range of observed values across the 12 towers at the same time step. As the eddy covariance towers within the SMEX02 domain provide a reasonable coverage of the study area (see Figure 1), the range of the observed meteorological values across these towers is assumed to be indicative of the spatial variability.
[34] Based on the values of all towers, the standard deviations for each time step are calculated for $T_s$, $T_a$, and $u$ and shown in Figure 5. As the standard deviations of $T_s$ are larger than those of $T_a$ and $u$, its priors are wider. The width of the prior controls the uncertainty bound of each input variable and hence directly affects the inference (see section 4.2).
[35] To appraise the assumption of Gaussian priors, Figure 6 shows quantile-quantile (QQ) plots of the tower data used to construct the priors (results for two representative time steps are shown). Land surface and air temperatures appear reasonably Gaussian, while the wind speed distributions exhibit heavier tails, representing a limitation of the Gaussian assumption.
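The quantities behind a QQ plot of this kind can be computed as in the following sketch, which pairs standardized sample values with standard normal quantiles. The plotting position $(k - 0.5)/n$ is one common convention and is assumed here, and the 12 temperature values are hypothetical rather than SMEX02 observations.

```python
from statistics import NormalDist, fmean, stdev


def qq_points(sample):
    """Return (theoretical normal quantile, standardized sample value) pairs."""
    n = len(sample)
    ordered = sorted(sample)
    mu, sigma = fmean(ordered), stdev(ordered)
    std_normal = NormalDist()
    points = []
    for k, value in enumerate(ordered, start=1):
        p = (k - 0.5) / n  # assumed plotting position
        points.append((std_normal.inv_cdf(p), (value - mu) / sigma))
    return points


# Hypothetical land surface temperatures (deg C) at 12 towers for one time step
ts_towers = [29.1, 30.4, 31.2, 28.7, 30.0, 32.5, 29.8, 30.9, 31.6, 28.2, 30.3, 29.5]
for theoretical, observed in qq_points(ts_towers):
    print(f"{theoretical:6.2f} {observed:6.2f}")
```

Points lying close to the 1:1 line support a Gaussian prior, while systematic curvature at the ends indicates heavier or lighter tails than the Gaussian assumption allows.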
3.4.2. Prior Uncertainty Analysis of Response Variable
[36] The response variable in this Bayesian investigation is the sensible heat flux observed at each of the eddy covariance towers. A number of recent studies [e.g., Foken, 2008; Foken et al., 2012; Hollinger and Richardson, 2005; Mauder et al., 2008; Meyers and Baldocchi, 2005; Richardson et al., 2012] have highlighted the uncertainties in eddy covariance estimations of turbulent heat fluxes. In addition to standard data quality controls (e.g., coordinate rotation and density correction) that need to be performed on the high-frequency eddy covariance measurements, there are issues related to inadequacy of fetch, heterogeneity of the footprint, improper averaging times, and noncapture of large eddies that add to the uncertainties in the eddy covariance estimates [Allen et al., 2011].
[37] To include the uncertainties of sensible heat flux observations in the Bayesian inference of the input variables, prior PDFs of $H$ are developed, with the observed sensible heat flux considered as the mean of the prior PDF. The standard deviation of the PDF, $\sigma_H$, is expressed as a fraction $r$ of the observed sensible heat flux, $\sigma_H = r\tilde{H}$. The choice of $r$ has a direct influence on the inference of the input variables. Smaller values of $r$ correspond to a lower uncertainty in the observations of the sensible heat flux, which causes larger deviations of the inferred values of input forcing from their observed values. In contrast, larger values of $r$ correspond to higher uncertainty in the observations of the sensible heat flux and cause smaller deviations of the inferred input values.
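The role of the fraction r can be illustrated with a Gaussian log-likelihood whose standard deviation scales with the observed flux. This is a sketch under assumptions: the function name and the values are hypothetical, the default of 0.1 is only a placeholder, and taking the absolute value of the observed flux (so that negative fluxes still give a positive standard deviation) is a choice made here, not something stated in the text.

```python
import math


def log_likelihood(h_obs, h_model, r=0.1):
    """Gaussian log-likelihood of modeled H given observed H, with sigma = r * |H_obs|."""
    sigma_h = r * abs(h_obs)
    if sigma_h == 0.0:
        raise ValueError("sigma is zero; r and h_obs must both be nonzero")
    z = (h_obs - h_model) / sigma_h
    return -0.5 * z * z - math.log(sigma_h * math.sqrt(2.0 * math.pi))


# Same 20 W m^-2 misfit evaluated under three assumed values of r
for r in (0.05, 0.10, 0.15):
    print(r, round(log_likelihood(h_obs=200.0, h_model=180.0, r=r), 3))
```

A smaller r penalizes the same misfit more heavily, which is why the sampler must move the inputs further from their observed values to reproduce the observed flux.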
[38] Determination of the best (or optimum) value of $r$ is not possible, as the uncertainty in sensible heat flux observations is poorly described. Also, the spatial variability of $H$ cannot be used to develop PDFs of $H$ due to the difference in the extent and heterogeneity of the footprints among towers. Allen et al. [2011] identified that the errors in the estimation of the latent heat flux using eddy covariance systems for a well-maintained site, in terms of standard deviation from the mean, are in the range of 10%–15%. Based on these measures, we estimate that the standard deviation for sensible heat flux is around 10% of the measured value (i.e., $r = 0.1$), as sensible heat flux estimations are often more reliable than latent heat flux estimations in eddy covariance towers [Foken, 2008; Mauder et al., 2008; Richardson et al., 2012].
[39] To evaluate the sensitivity of the inference to the value of $r$, we examined three cases: $r = 0.05$, $r = 0.1$, and $r = 0.15$. The sensitivity analysis was based on the residuals $\Delta$ between the inferred and observed values of each input variable, denoted $\Delta T_s$, $\Delta T_a$, and $\Delta u$. Results showed that in all three cases of $r$, the relative variation in the range and magnitude of $\Delta T_s$, $\Delta T_a$, and $\Delta u$ was similar (i.e., $\Delta T_s$ is an order of magnitude higher than $\Delta T_a$ and $\Delta u$, see supporting information). Consequently, variation of $r$ among the selected values has no significant influence in identifying the most uncertain variable. Therefore, $r = 0.1$ is adopted in the computation of results in the following sections.
3.4.3. Posterior Estimation and Inference Using BIT-SEBS
[40] Figure 7 shows the overall procedure for estimating the posterior values of the input variables. For each time step and at each tower, prior analysis of data uncertainty was carried out as described earlier. MCMC simulations were then performed using the slice sampling method (section 3.3). The results of the Bayesian simulations can then be represented as time series of the posterior values for $T_s$, $T_a$, and $u$ for each tower record. Following an MCMC convergence assessment, the time series of posterior estimates of input variables were then used as estimates of the meteorological input variables (section 4.2) and also to provide insights into their uncertainties (section 4.3).
4. Results
4.1. Convergence Analysis of the MCMC Iterations
[41] A convergence study of the MCMC samples was undertaken as follows. The number of iterations necessary for MCMC chain convergence was estimated visually by plotting traces of the MCMC samples against the number of iterations for all chains [Kass et al., 1998]. Figure 8 shows the MCMC chain traces and their cumulative mean for 3000 samples (iterations), with a thinning factor of 10 and a burn-in period of 1000 samples for the 12:00 P.M. time stamp of tower WC162 (soybean) for day-of-year 173. Here a thinning factor of 10 means that a total of 30,000 samples were generated, but only every 10th sample was retained (this reduces the effects of serial correlation of the MCMC samples). A burn-in period of 1000 samples means that the first 1000 samples were discarded. From Figure 8 it can be seen that the cumulative means of the posterior traces are stationary after approximately 1000 iterations, suggesting adequate convergence of the MCMC samples.
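The bookkeeping in this paragraph can be checked with a short sketch. The order of operations is an assumption: here thinning is applied first and the burn-in of 1000 is counted in retained (thinned) samples, which is one reading of the description; counting burn-in on the raw chain would leave a different number of samples. The helper names are hypothetical.

```python
def burn_and_thin(raw_chain, thin=10, burn_in=1000):
    """Keep every `thin`-th sample, then drop the first `burn_in` retained samples."""
    thinned = raw_chain[::thin]
    return thinned[burn_in:]


def cumulative_mean(values):
    """Running mean of a chain, as used for a visual stationarity check."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


raw = list(range(30000))     # stand-in for 30,000 raw MCMC draws
kept = burn_and_thin(raw)    # 3000 retained by thinning, 2000 after burn-in
print(len(raw[::10]), len(kept))
print(cumulative_mean([1.0, 2.0, 3.0]))
```

A flat cumulative mean after the burn-in period is a necessary but not sufficient sign of convergence, which is why a quantitative multi-chain diagnostic is also useful.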
[42] For quantitative evaluation of the MCMC convergence and assessment of the adequacy of the chain numbers, the potential scale reduction factor of Gelman and Rubin [1992] is used. As recommended by Brooks and Gelman [1998], the criterion for acceptance of the Bayesian modeling is that . Any MCMC chain that did not meet this criterion was rejected and was not considered in the inference.
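For reference, the basic form of the potential scale reduction factor can be computed from several chains as sketched below. This is the simple between-chain and within-chain variance version, without the degrees-of-freedom correction discussed by Brooks and Gelman [1998], and the synthetic chains are hypothetical stand-ins for BIT-SEBS output.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Four chains sampling the same distribution: statistic should be close to 1
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Chains stuck around different values: statistic should be well above 1
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)]
         for shift in (0.0, 0.0, 5.0, 5.0)]
rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
print(round(rhat_mixed, 3), round(rhat_stuck, 3))
```

Values close to 1 indicate that the chains agree with one another; values well above 1 indicate that at least one chain has not yet reached the common distribution.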
[43] Histograms of the MCMC samples from the posterior are shown in Figure 9. The histograms have symmetric shapes and are well approximated by Gaussian distributions. In addition, Figure 9 shows that BIT-SEBS has refined the estimates of $T_s$ compared to their prior estimates, whereas for $T_a$ and $u$ the data were noninformative, and BIT-SEBS did not result in any refinement of the prior estimates.
[44] As the posterior distributions of each input variable are approximately Gaussian, their mean values (which also correspond to the most likely values) are taken as the point estimates of that variable. These inferred values are then used in evaluation of the performance of the Bayesian inference (section 4.2) and quantification of the uncertainties (section 4.3).
4.2. Bayesian Uncertainty Analysis of SEBS Inputs
[45] The SEBS model is used to estimate the sensible heat flux in both “deterministic” and Bayesian “stochastic” estimation schemes, with Figure 10 presenting a schematic of the overall procedure. In deterministic estimation, the observed values of the meteorological variables were used for direct estimation of the sensible heat flux (the traditional flux estimation approach). However, in stochastic estimation, the inferred values of $T_s$, $T_a$, and $u$ are used.
[46] Figure 11 presents a scatterplot of both the deterministic and stochastic estimates of sensible heat flux values against measured eddy covariance data for daytime half-hourly records for all soybean (top) and all corn towers (bottom). Linear regression statistics for each scatterplot are also shown in Figure 11. The relative error measure is defined as the RMSE normalized by the observed sensible heat flux, where RMSE is the root-mean-squared error between observed and simulated sensible heat flux. As is apparent from Figure 11, stochastic simulation of sensible heat flux using Bayesian-inferred values of $T_s$, $T_a$, and $u$ improves the correlations for both corn and soybean towers, with an increase from 0.68 to 0.99 for soybean and from 0.62 to 0.98 for corn. In addition, the relative error decreases from around 10% to 1% for both soybean and corn towers.
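Statistics of the kind reported for these scatterplots can be computed as in the sketch below. Two choices are assumptions made here: the simulated flux is regressed on the observed flux, and the relative error is taken as the RMSE divided by the mean observed flux. The sample values are invented for illustration.

```python
import math
import statistics


def regression_stats(observed, simulated):
    """Slope, squared correlation, RMSE, and relative error of simulated vs observed."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_o = statistics.fmean(observed)
    mean_s = statistics.fmean(simulated)
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((s - mean_s) ** 2 for s in simulated)
    sxy = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    rmse = math.sqrt(statistics.fmean(
        [(o - s) ** 2 for o, s in zip(observed, simulated)]))
    return {
        "slope": sxy / sxx,
        "r2": sxy ** 2 / (sxx * syy),
        "rmse": rmse,
        "relative_error": rmse / mean_o,  # assumed normalization
    }


# Hypothetical half-hourly sensible heat fluxes (W m^-2)
h_obs = [100.0, 200.0, 300.0]
h_sim = [110.0, 190.0, 310.0]
stats = regression_stats(h_obs, h_sim)
print({name: round(value, 3) for name, value in stats.items()})
```

Reporting the slope alongside the squared correlation matters because a simulation can be highly correlated with the observations while still being systematically biased.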
[47] Time series of the observed, deterministic simulated, and stochastic simulated sensible heat flux for six selected towers (with fewest data gaps) are presented in Figure 12, with summary statistics shown for both deterministic and stochastic simulations. The deterministic simulated sensible heat flux is in agreement with the observed values for the majority of towers, with $R^2$ values between 0.4 (tower WC162) and 0.8 (tower WC13). However, a clear underestimation of sensible heat flux in deterministic results is evident for WC13 and WC161. Also, deterministic simulated sensible heat fluxes of WC162 have clear forward diurnal shifts. In contrast, the stochastic simulated values are in better agreement with the observed values, showing improved $R^2$ values of 0.96–0.99.
[48] It is apparent that by using the inferred values of $T_s$, $T_a$, and $u$, the performance of linear regressions of half-hourly results improves significantly, with $R^2$ and slope values close to 1 and a considerable decrease in relative errors. The improved model performance in the stochastic simulations is due to the inference of the input variables from the observed responses (section 3.2) and should not be viewed as indicative of the performance in predictive applications. Instead, our aim here is to use the inferred values of the input variables to gain further insights into the errors and uncertainties associated with them, and to gain insights into which input variables are likely to be contributing to the predictive uncertainty. In particular, the following sections examine and discuss which inferred inputs differ most from their observed values.
[49] It should be emphasized that the specification of the priors (in particular, their standard deviations) has a significant influence on the performance of the inference in BIT-SEBS. The importance of the choice of priors is illustrated in the example presented in the supporting information. In this case, BIT-SEBS simulations are performed for tower WC13 (soybean) assuming that each variable shares the same, larger standard deviation. Results show that when using such a set of noninformative priors, the efficiency in the stochastic estimation of the sensible heat flux is greatly reduced, and the time series of the differences between inferred and observed values for all three variables are in the same approximate range. Assigning a realistic and representative range of uncertainty to the input variables is known to be of key importance to the fidelity of such hierarchical Bayesian approaches (e.g., as discussed by Renard et al. [2010, 2011]). In this study, this is pursued by considering the spatial variability of the inputs as a proxy for the sampling errors in these quantities.
[50] In summary, although the true values of the selected input meteorological variables are unknown, the inferred values using BIT-SEBS can be considered as an accurate estimate of such true values for the following reasons: (1) The prior distributions of the input variables are based on the spatial variability of the measurements within a relatively dense network of towers; (2) The likelihood function contains a physically based model with established relations between input data and estimated sensible heat flux; (3) Errors in the parameterization of the SEBS model are likely to be relatively small (due to the quality of the field observations of the vegetation characteristics); (4) The MCMC analysis of the posterior distributions appears to have converged, according to the diagnostics employed; (5) The posterior distributions are well behaved and approximately Gaussian, and there is no evidence of incompatibility with the corresponding prior distributions; (6) Stochastic simulations of the SEBS model using the inferred input variables resulted in consistent estimates of the response variable (sensible heat flux).
[51] Therefore, differences between the inferred and observed values of the input variables are likely to be primarily comprised of observational errors. Further examination of the inferred input observations is undertaken in the following section.
4.3. Inferred Values of Meteorological Variables
[52] To evaluate the performance of the Bayesian inference, results are first examined for a sample day for both a soybean and a corn tower. To evaluate the approach more closely, the differences between inferred and observed meteorological values for all towers are also presented. Figure 13 plots the inferred values of $T_s$, $T_a$, and $u$ for day-of-year 173 from a representative soybean (WC162) and corn tower (WC152). Gray lines in Figures 13a–13h indicate the observed values from among the additional 11 soybean and corn towers, which can be used to establish whether the range and trend in observed and inferred values are in accord with the other measurements across the study domain.
[53] As can be seen from Figure 13a, the observed $T_s$ has a different diurnal cycle than is present in the other towers, due perhaps to sensor time delay, alignment, or geometric configuration. If the observed values of $T_s$, $T_a$, and $u$ from this tower were to be used in SEBS in deterministic simulations, the resulting sensible heat flux would be very different from the observed ($R^2$ of 0.22 and RMSE of 52 W/m^{2}; see Figure 13g). On the other hand, the Bayesian estimated values of $H$ match well with the observed sensible heat flux and improve $R^2$ to 0.99 and RMSE to 0.86 W/m^{2}. To achieve this, the Bayesian inference approach identifies alternative values of $T_s$ that provide a better match to the diurnal variations represented across the other towers. Given that the inferred values of $T_a$ and $u$ are close to the observed values (see Figures 13c and 13e), it seems that for this tower at least, the main uncertainty in flux estimation results from the $T_s$ observations, with absolute differences between observed and inferred values of up to 3°C. This difference is well within the expected spatial variability observed within the in situ surface temperature measurements over agricultural fields [McCabe et al., 2008].
[54] For the corn tower, the inferred values of the land surface temperature are up to 2°C lower than the observed values (see Figure 13b). Similar to the soybean tower example earlier, the Bayesian-inferred values of air temperature and wind speed remain quite close to the observed values, indicating that these seem to be spatially representative. Figure 13h shows that the deterministic estimate of via standard application of SEBS is considerably higher than the observed flux estimate. Through use of the inferred land surface temperature values, a significantly improved simulation of the observed sensible heat flux is achieved, with increased from 0.86 to 0.99 and RMSE reduced from 41 to 1.3 W/m^{2}.
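The goodness-of-fit statistics quoted for Figures 13g and 13h can be computed along the lines of the following minimal sketch. It assumes that the statistic reported alongside the RMSE is a coefficient of determination computed as the squared Pearson correlation (the text does not state which definition is used). The function names and the sample flux values are hypothetical and serve only as an illustration.

```python
import math


def rmse(observed, modeled):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(observed, modeled)) / n)


def r_squared(observed, modeled):
    """Squared Pearson correlation (one common definition of R^2; assumed here)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_m = sum(modeled) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modeled))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_m = sum((m - mean_m) ** 2 for m in modeled)
    return cov ** 2 / (var_o * var_m)


# Hypothetical sensible heat fluxes (W/m^2), not data from the study.
h_observed = [20.0, 60.0, 110.0, 150.0, 120.0, 70.0]
h_modeled = [25.0, 55.0, 120.0, 140.0, 125.0, 65.0]

print("RMSE:", rmse(h_observed, h_modeled))
print("R^2:", r_squared(h_observed, h_modeled))
```

Note that a squared correlation can be high even when the modeled flux is biased, which is why it is reported together with the RMSE.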
[55] To evaluate the performance of the Bayesian inference for all towers, the differences between observed and inferred values of land surface temperature, air temperature, and wind speed are calculated as Δx = x_{o} − x_{i}, where x is the variable of interest (land surface temperature, air temperature, or wind speed), and subscripts o and i represent the observed and inferred values of the variable, respectively. Time series of these differences for the three variables are shown in Figure 14. A bar plot of the all-tower-averaged precipitation is also shown to support interpretation of the results. For all plots (except precipitation), gray lines represent the time series for the soybean towers, and black lines represent the corn towers.
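As a minimal sketch, the per-time-step differences for one tower could be computed as follows. The sign convention (observed minus inferred) is an assumption, and the function names and temperature values are hypothetical.

```python
def differences(observed, inferred):
    """Element-wise observed-minus-inferred differences (sign convention assumed)."""
    if len(observed) != len(inferred):
        raise ValueError("observed and inferred series must have equal length")
    return [o - i for o, i in zip(observed, inferred)]


def max_abs_difference(observed, inferred):
    """Largest absolute difference over the series."""
    return max(abs(d) for d in differences(observed, inferred))


# Hypothetical land surface temperatures (deg C) at three time steps.
ts_observed = [30.0, 31.5, 33.0]
ts_inferred = [28.0, 32.0, 30.0]

print(differences(ts_observed, ts_inferred))         # per-time-step differences
print(max_abs_difference(ts_observed, ts_inferred))  # largest absolute difference
```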
[56] For all towers, differences between Bayesian-inferred and observed values are larger for the land surface temperature (differences of up to ±5°C) than for either the air temperature or wind speed. One possible reason for this difference is the disparity between the footprint of the in situ Apogee land surface temperature sensors and that of the CSAT sonic anemometer used to derive the sensible heat flux at the eddy covariance tower. The effective footprint of the Apogee sensors used in this study is on the order of a few square meters (approximated as circles with areas of 26.2 m^{2} over corn and 6.5 m^{2} over soybean), while the sonic anemometer measures eddies that originate from a nonlocal (relative to the in situ sensor) distance upwind of the tower [Schmid, 2002], representing a source area of several hundreds of square meters.
[57] Figure 14 indicates that the differences between the Bayesian-inferred and observed land surface temperature at the soybean towers are larger (and more frequent) than those at the corn towers, possibly due to the lower fractional vegetation cover and the effect of bare soil on the locally observed land surface temperature [McCabe et al., 2008]. For the corn towers, the fractional vegetation cover is higher than for soybean towers, and hence, the footprint of surface temperature is more likely to be spatially stable and representative. In contrast to the land surface temperature differences, the air temperature differences have lower variability, and their magnitude is within the range of the sensor accuracy (±0.3°C). The lower air temperature differences suggest that, due to atmospheric mixing and turbulence in the air, the footprint of the in situ air temperature sensor (HMP-45C) is more representative of the footprint of the sonic instrument. However, this reasoning cannot be extended to the wind speed differences, as the wind speed in this study derives directly from the CSAT sonic anemometer rather than from independent measurements. Nevertheless, for the majority of cases, the range of the wind speed differences is less than 0.5 m/s, which indicates that observations of the wind speed at each individual tower are likely to be representative of the domain average (apart from a number of clearly identifiable periods). The few days with higher values (e.g., day-of-year 178) are days with lower wind speed in the area (see Figure 4), which indicates that when wind speed is lower, the spatial variability in its value is larger.
5. Discussion
[58] Sources of uncertainty in Earth system models are varied and can include errors due to simplifications in the model structure, errors in the observations of the input forcing, uncertainties in parameterization of the model, or errors in the observations of the response variables (sensible heat flux in this study). In general, understanding and quantifying the uncertainty in such modeling schemes is nontrivial, due to the complexity of the interactions between the land surface and the atmosphere and the combined effects of all sources of error [Kalma et al., 2008].
[59] In the present study, errors associated with the measurements of meteorological input forcing were estimated based on their application in determining sensible heat flux using the SEBS model over a number of eddy covariance towers. Input forcing included air temperature, land surface temperature, and wind speed. Results indicate that the main uncertainty in flux prediction arises from uncertainties in the local observations of the land surface temperature, with differences between inferred and observed values of up to ±5°C. A number of previous studies have identified that errors in the land surface temperature can have a direct and significant effect on the estimation of the sensible heat flux. For example, van der Kwast et al. [2009] found that in well-irrigated fields, the sensible heat flux estimated by SEBS can deviate by up to 70% with only a 0.5°C difference in surface temperature. It is worth noting that this behavior is not unique to SEBS: similar sensitivities appear in other energy balance models and represent a considerable problem for energy balance based approaches that require the use of an infrared surface temperature [Kalma et al., 2008]. For instance, Timmermans et al. [2007] identified that a 3°C deviation in surface temperature can cause errors in the sensible heat flux estimation of up to 75% in the two-source model (TSM) [Norman et al., 2000] and 45% in the Surface Energy Balance Algorithm for Land (SEBAL) model [Bastiaanssen et al., 1998a].
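The origin of this sensitivity can be illustrated with a simplified bulk-transfer relation, H = ρ c_p (T_s − T_a) / r_ah, in which the flux is proportional to the surface-to-air temperature difference. The sketch below is not SEBS, TSM, or SEBAL: it assumes a fixed aerodynamic resistance (ignoring stability feedbacks), and the constants, function names, and input values are hypothetical. It only shows that a given surface temperature error produces a larger relative flux error when the temperature gradient is small, as is typical over well-watered surfaces.

```python
RHO_AIR = 1.2    # air density, kg/m^3 (assumed typical value)
CP_AIR = 1005.0  # specific heat of air at constant pressure, J/(kg K)


def bulk_sensible_heat(t_surface, t_air, r_ah):
    """Bulk-transfer sensible heat flux (W/m^2) for resistance r_ah (s/m)."""
    return RHO_AIR * CP_AIR * (t_surface - t_air) / r_ah


def relative_flux_error(t_surface, t_air, r_ah, ts_error):
    """Fractional change in H caused by an error ts_error in surface temperature."""
    h_ref = bulk_sensible_heat(t_surface, t_air, r_ah)
    h_perturbed = bulk_sensible_heat(t_surface + ts_error, t_air, r_ah)
    return (h_perturbed - h_ref) / h_ref


# Hypothetical cases: a small gradient (1 K) and a larger gradient (5 K).
print(relative_flux_error(26.0, 25.0, 50.0, 0.5))  # small gradient, large relative error
print(relative_flux_error(30.0, 25.0, 50.0, 0.5))  # larger gradient, smaller relative error
```

In this simplified form the relative error reduces to the temperature error divided by the surface-to-air temperature difference, independent of the resistance.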
[60] Looking beyond sensitivity analysis of modeling schemes to surface temperature, we suggest that the main reason for the differences between observed and Bayesian-inferred values of the land surface temperature is the disparity in the spatial representativeness (or footprint) of the sensors. In particular, the local-scale footprint of in situ measurements of the land surface temperature (using Apogee sensors) is unlikely to correspond with the footprint scale of flux observations made with eddy covariance systems [Kljun et al., 2004; Kustas et al., 2006; Su et al., 2005; Vickers et al., 2010]. Due to atmospheric turbulence and mixing, air temperatures and wind speeds will have lower spatial variability than the land surface temperature. As a result, the footprint of the locally measured air temperature and wind speed will more closely match the footprint of the observed eddy covariance based fluxes.
[61] Use of the locally observed land surface temperature without spatial scaling and footprint correction has significant implications for the validation of heat flux models. For example, Su et al. [2005] showed that errors in the land surface temperature are the main reason for discrepancies between modeled and observed heat fluxes at the SMEX02 towers. However, they partially corrected such errors by modification and adjustment of the emissivity. In image-scale applications, footprint models [Leclerc and Thurtell, 1990; Schmid, 2002; Schuepp et al., 1990] have been used for correction of the in situ observed land surface temperature using remote sensing images [Kustas et al., 2006; Li et al., 2008; Timmermans et al., 2009]. In a footprint model, the observed sensible heat flux is related to the orientation and length of the footprint of a source area located in the upwind direction of the eddy covariance tower. Footprint models can characterize this source area (as a distance or region) based on the measurement height, aerodynamic surface roughness, and atmospheric stability [Bastiaanssen et al., 1998b]. However, the length and orientation of the source area cannot be used quantitatively to adjust the local land surface temperature observations unless a remote sensing image is available. Hence, with suitable refinement, the methodology developed in this paper could serve as a practical tool for quality control and evaluation of the tower-based land surface temperature observations and their spatial scaling.
[62] Although this study focused on the uncertainties of the meteorological variables, other uncertainties in the model structure and parameterization may exist. As such, the Bayesian-inferred values of the land surface temperature might be partly contaminated by the effect of such uncertainties. The importance of model structure and parameterization uncertainties is highlighted in a number of recent studies. Zhang et al. [2010] observed that the choice of formula and the MOST function for temperature significantly influenced the agreement between sensible heat flux calculated for a scintillometer and an eddy covariance system. Also, van der Kwast et al. [2009] observed that the roughness parameters can cause large deviations in the modeling of sensible heat flux. In addition, Verstraeten et al. [2008] found that the estimation error due to the uncertainty of roughness length for heat transfer is important: even more so than the uncertainty in temperature, wind speed, and stability correction. The Bayesian model of this study is sensitive to the number of priors and their interdependencies, and as such it is not practical to include uncertainty of the model roughness parameters. In particular, the Su et al. [2001] method employed in SEBS for estimation of the roughness parameters uses wind speed and air temperature as input variables. Hence, introducing roughness parameters as priors is likely to be problematic. However, careful measurement of the vegetation height and density during the SMEX02 field campaign suggests that uncertainties in the roughness parameterization are not likely to be significant in this study.
[63] Preliminary evaluations indicate that BIT-SEBS is sensitive to the parameters of the prior PDFs. In the case of the Gaussian prior PDFs, the definition of the standard deviation of the input variables has a direct influence on the inference performance and convergence of the MCMC simulations. Likewise, the performance of the Bayesian technique in the estimation of the input variables depends upon the accuracy and validity of the prior information. This is especially important in hierarchical Bayesian inference, where the use of nonrepresentative priors can result in poor or meaningless posterior estimates. Therefore, in order to provide a quantitative measure of the spatial variability within these variables, we recommend that new installations of field-based eddy covariance measurements provide a few additional spatially distributed instruments that measure the key meteorological variables such as land surface temperature, air temperature, and wind speed. Such instrumentation might include traditional point-based infrared and air temperature sensors located within the footprint of the eddy covariance tower, or more spatially representative devices such as the recently developed fiber-optic distributed temperature sensing networks [Selker et al., 2006]. For existing data sets with single tower observations, it is important to quantify the bounds of uncertainty (i.e., standard deviation of the prior PDF) for each time stamp of observation. A first approximation might be to assume that the footprints of air temperature and wind speed are similar to that of the sonic instrument, while the footprint of land surface temperature is different from (i.e., smaller than) the sonic instrument footprint (i.e., low prior standard deviations for air temperature and wind speed and a high prior standard deviation for land surface temperature).
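The role of the prior standard deviations can be illustrated with a minimal random-walk Metropolis sketch. This is not BIT-SEBS: a toy bulk-transfer flux with a constant, hypothetical transfer coefficient stands in for SEBS, the Gaussian priors are assumed to be centered on the tower observations, and all names and numerical values are illustrative assumptions. A wide prior on land surface temperature and narrow priors on air temperature and wind speed follow the first approximation suggested above.

```python
import math
import random

RHO_CP = 1.2 * 1005.0  # volumetric heat capacity of air, J/(m^3 K) (assumed)
C_H = 0.005            # bulk transfer coefficient (hypothetical constant)


def toy_flux(ts, ta, u):
    """Toy stand-in for SEBS: H = rho*cp*C_H*u*(Ts - Ta), in W/m^2."""
    return RHO_CP * C_H * u * (ts - ta)


def log_posterior(theta, obs, prior_sd, h_obs, h_sd):
    """Gaussian priors centered on observations plus a Gaussian flux likelihood."""
    ts, ta, u = theta
    if u <= 0.0:
        return -math.inf  # wind speed must be positive
    log_prior = sum(-0.5 * ((x - m) / s) ** 2
                    for x, m, s in zip(theta, obs, prior_sd))
    log_like = -0.5 * ((h_obs - toy_flux(ts, ta, u)) / h_sd) ** 2
    return log_prior + log_like


def metropolis(obs, prior_sd, h_obs, h_sd, steps,
               n_iter=20000, burn_in=5000, seed=0):
    """Random-walk Metropolis sampler over (Ts, Ta, u); returns post-burn-in samples."""
    rng = random.Random(seed)
    theta = list(obs)
    logp = log_posterior(theta, obs, prior_sd, h_obs, h_sd)
    samples = []
    for it in range(n_iter):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(theta, steps)]
        logp_new = log_posterior(proposal, obs, prior_sd, h_obs, h_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            theta, logp = proposal, logp_new
        if it >= burn_in:
            samples.append(tuple(theta))
    return samples


# Hypothetical observations: Ts (deg C), Ta (deg C), u (m/s), and flux (W/m^2).
obs = (32.0, 25.0, 3.0)
prior_sd = (2.0, 0.3, 0.3)  # wide prior for Ts, narrow priors for Ta and u
samples = metropolis(obs, prior_sd, h_obs=90.0, h_sd=5.0, steps=(0.5, 0.1, 0.1))

ts_mean = sum(s[0] for s in samples) / len(samples)
h_mean = sum(toy_flux(*s) for s in samples) / len(samples)
print("deterministic H:", toy_flux(*obs))
print("posterior mean Ts:", ts_mean, "posterior mean H:", h_mean)
```

Because the land surface temperature prior is the widest, the sampler reconciles the modeled and observed flux mainly by adjusting that variable, while the air temperature and wind speed stay close to their observed values. Narrowing the land surface temperature prior would force more of the mismatch into the other variables or leave it unexplained.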
6. Conclusions
[64] In this study, the uncertainties associated with input meteorological variables over a multitower site were quantitatively evaluated using a Bayesian inference scheme coupled with the SEBS model. Results confirm that the performance of physically based energy balance methods in heat flux estimation strongly depends upon the representativeness of the input meteorological variables. In particular, uncertainties in local observations of the land surface temperature have a considerable effect on the mismatch between the observed and modeled sensible heat fluxes over both soybean and corn fields. As such, the land surface temperature cannot be assumed to provide spatially representative values in the computation of the sensible heat flux observed at the tower scale: at least not without some prior spatial scaling. Characterizing this spatial variability of surface temperature using high-resolution remote sensing retrievals and exploiting stand-alone tower data to inform the prior distributions of forcing uncertainty are two directions for further investigation, development, and application of the approach developed here.
Acknowledgments
[65] Funding for this research is provided jointly by an Australian Research Council (ARC) Linkage Project (LP0989441), Discovery Project (DP120104718) and top-up scholarship support for Ali Ershadi from the National Centre for Groundwater Research and Training (NCGRT) in Australia. This work was also supported by the NCI National Facility at the ANU via the provision of computing resources to the ARC Centre of Excellence for Climate System Science.