Volume 49, Issue 5 p. 2343-2358
Regular Article

A Bayesian analysis of sensible heat flux estimation: Quantifying uncertainty in meteorological forcing to improve model prediction

Ali Ershadi
School of Civil & Environmental Engineering, University of New South Wales, Sydney, New South Wales, Australia
Corresponding author: A. Ershadi, School of Civil & Environmental Engineering, University of New South Wales, Sydney, New South Wales 2032, Australia. ([email protected]; [email protected])

Matthew F. McCabe
Water Desalination and Reuse Centre, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

Jason P. Evans
Climate Change Research Centre, University of New South Wales, Sydney, New South Wales, Australia

Gregoire Mariethoz
School of Civil & Environmental Engineering, University of New South Wales, Sydney, New South Wales, Australia

Dmitri Kavetski
School of Civil, Environmental and Mining Engineering, University of Adelaide, South Australia, Australia

First published: 04 April 2013


[1] The influence of uncertainty in land surface temperature, air temperature, and wind speed on the estimation of sensible heat flux is analyzed using a Bayesian inference technique applied to the Surface Energy Balance System (SEBS) model. The Bayesian approach allows for an explicit quantification of the uncertainties in input variables: a source of error generally ignored in surface heat flux estimation. An application using field measurements from the Soil Moisture Experiment 2002 is presented. The spatial variability of selected input meteorological variables in a multitower site is used to formulate the prior estimates for the sampling uncertainties, and the likelihood function is formulated assuming Gaussian errors in the SEBS model. Land surface temperature, air temperature, and wind speed were estimated by sampling their posterior distribution using a Markov chain Monte Carlo algorithm. Results verify that Bayesian-inferred air temperature and wind speed were generally consistent with those observed at the towers, suggesting that local observations of these variables were spatially representative. Uncertainties in the land surface temperature appear to have the strongest effect on the estimated sensible heat flux, with Bayesian-inferred values differing by up to ±5°C from the observed data. These differences suggest that the footprint of the in situ measured land surface temperature is not representative of the larger-scale variability. As such, these measurements should be used with caution in the calculation of surface heat fluxes and highlight the importance of capturing the spatial variability in the land surface temperature: particularly, for remote sensing retrieval algorithms that use this variable for flux estimation.

Key Points

  • Bayesian inference with prior info is well suited for input uncertainty analysis
  • The land surface temperature has large uncertainties in flux estimation
  • Scaling of surface temperature is required to capture spatial variability

1. Introduction

[2] Evapotranspiration (ET) is a major component of the hydrological cycle [Brutsaert, 2005] and can account for more than 90% of the precipitation in semiarid and arid regions [Wang et al., 2012]. Accurate estimation of ET is required to better constrain and understand hydrometeorological behavior across a range of systems and scales: locally, regionally, and globally. ET is usually represented as the latent heat flux ($\lambda E$) from the land surface to some level in the overlying atmosphere. Although there are a number of techniques available to estimate the land surface fluxes of heat and water [Kalma et al., 2008; Wang and Dickinson, 2012], a common approach is via evaluation of the energy balance at the surface. In models using this approach, ET (or latent heat flux, $\lambda E$) is usually derived as the residual term of the energy budget, i.e., $\lambda E = R_n - G_0 - H$, where $R_n$ is the net radiation, $G_0$ is the ground heat flux, and $H$ is the sensible heat flux. In such instances, it is the calculation of $H$ that is of key importance in the estimation of ET. One example from this family of models is the Surface Energy Balance System (SEBS) model [Su, 2002], an energy balance method that is widely used to estimate actual ET via a combination of remote sensing and in situ meteorological observations [McCabe and Wood, 2006; Su et al., 2005].

[3] Model simplifications, natural variability in system response, and issues of measurement or sampling errors in the input forcing cause mismatches between the modeled and observed responses, in both physically based (e.g., SEBS) and empirical models [Kalma et al., 2008; Samanta et al., 2007]. Probabilistic (stochastic) modeling methodologies are hence of particular interest because they allow an explicit examination of data and modeling uncertainties using probability distributions [Kavetski et al., 2006a; Luo et al., 2007] or empirical ensembles [Pan et al., 2008; Peters-Lidard et al., 2011]. Probabilistic approaches have been used previously in groundwater models [Dagan, 1985], conceptual rainfall-runoff models [Kuczera et al., 2006], and integrated water resources systems [Castelletti and Soncini-Sessa, 2007]. However, there are limited cases detailing the use of probabilistic frameworks in heat flux modeling. In a recent contribution, van der Tol et al. [2009] developed a Bayesian approach for the estimation of heat fluxes over vegetated land surfaces and showed that the integration of different prior information within a land surface modeling scheme improved the estimation of model parameters. Samanta et al. [2007] and Li et al. [2010] used a Bayesian approach to fit the Penman-Monteith model to half-hourly transpiration rates for a sugar maple stand in different regions, finding considerable uncertainties in predicted transpiration. In general, the nonlinearity of the model equations, process complexity and the difficulties in specifying realistic uncertainty models represent challenging research problems for developing and applying probabilistic techniques in heat flux models.

[4] In energy balance methods (including SEBS), the estimation of the sensible heat flux presents greater difficulties than the estimation of the available energy flux (i.e., $R_n - G_0$). The sensible heat flux $H$ is the transfer of heat from the land surface to the atmosphere, represented conceptually as a temperature gradient, $H \propto (T_0 - T_a)/r_a$, where $T_0$ is the land surface temperature, $T_a$ is the air temperature, and $r_a$ is the aerodynamic resistance to heat transfer. Note that $r_a$ is itself a function of the wind speed $u$ and of the aerodynamic roughness of the land surface. Given this expression, the main uncertainties in the estimation of the sensible heat flux in SEBS arise due to uncertainties in the input meteorological variables ($T_0$, $T_a$, $u$) and in the aerodynamic roughness parameterization.
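To make the conceptual expression concrete, the following minimal sketch evaluates the standard bulk-transfer form $H = \rho c_p (T_0 - T_a)/r_a$ and shows how a 1°C change in land surface temperature propagates to the flux. The air density, heat capacity, and resistance values are illustrative assumptions, not values taken from the SMEX02 data, and the function name is hypothetical.

```python
RHO_AIR = 1.2     # air density (kg/m^3); assumed typical near-surface value
CP_AIR = 1005.0   # specific heat capacity of air (J/(kg K)); assumed constant


def sensible_heat_flux(t_surface, t_air, r_a):
    """Bulk-transfer estimate H = rho * c_p * (T0 - Ta) / r_a, in W/m^2."""
    if r_a <= 0:
        raise ValueError("aerodynamic resistance must be positive")
    return RHO_AIR * CP_AIR * (t_surface - t_air) / r_a


# Hypothetical inputs: T0 = 30 C, Ta = 25 C, r_a = 50 s/m.
base = sensible_heat_flux(30.0, 25.0, 50.0)
# Same conditions with the land surface temperature raised by 1 C.
warm = sensible_heat_flux(31.0, 25.0, 50.0)
sensitivity = warm - base  # change in H per 1 C change in T0
```

Because the flux is linear in the temperature difference for a fixed resistance, an error in the land surface temperature maps directly into an error in the flux, which is one reason the representativeness of that input matters.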

[5] Timmermans et al. [2011] found that uncertainties in the estimation of $H$ via SEBS were likely due to the incorrect parameterization of the roughness height for heat. On the other hand, van der Kwast et al. [2009] found that SEBS is more sensitive to the surface temperature errors than to the surface aerodynamic parameters. In another study, Gibson et al. [2011] found that SEBS is sensitive to $T_0$ and $T_a$, depending on the land cover and wet limit criteria. As can be seen, identifying the true nature of the uncertainties resulting from these input variables remains a challenging and unresolved task.

[6] The aim of this paper is to provide insights into the spatial representativeness of the key input meteorological variables by quantifying the uncertainties in their actual measurements. More specifically, our research questions are as follows: (a) which meteorological forcing set (land surface temperature, air temperature, or wind speed) has the greatest influence on the uncertainty in sensible heat flux values estimated using the SEBS model? and (b) what are the likely reasons for such uncertainties?

[7] These questions are investigated using a Bayesian inference technique (BIT), where uncertainties in the observed input and response data are represented using probability distribution functions (PDFs), and the SEBS model is used to describe the physics of the sensible heat flux process. Inferred values of the input variables are then used to quantitatively estimate the errors in their measurements. The likely causes of these uncertainties are then discussed. One of the major differences between the current study and previous investigations is the use of Bayesian uncertainty analysis instead of a sensitivity analysis to quantify the errors in sensible heat flux estimation. Moreover, instead of attributing all uncertainties to the parameterization of the models [Samanta et al., 2007, 2008; van der Tol et al., 2009], this paper examines the uncertainties inherent within the input variables used in heat flux estimation.

2. Field Measurements and Site Description

[8] This investigation is based on data from the Walnut Creek (WC) watershed, centered at 41.96°N, 93.6°W and located near Ames, Iowa, in the USA. Meteorological and flux data for the study area were measured across 12 eddy covariance towers, collected as part of the Soil Moisture-Atmospheric Coupling Experiment and the Soil Moisture Experiment 2002 (SMEX02) campaigns [Kustas et al., 2005; Prueger et al., 2009] during June and July 2002. The locations of the towers within and around the study area are shown in Figure 1.

Figure 1. WC basin (thick black line) and location of soybean and corn towers. The land use map of the region is shown in the background.

[9] The land cover of the region consists primarily of either corn (Zea mays L.) or soybean (Glycine max L. Merr.). Nearly 95% of the region and watershed is used for row crop agriculture, with 80% of that being corn and soybean in equal proportions. The climate is humid, with an average annual rainfall of 835 mm/yr. The topography is characterized by low relief and poor surface drainage. Dominant soil types of the study area are clay and silty clay loams, with generally low permeability [Hatfield et al., 1999].

[10] Meteorological data along with surface heat flux and vegetation measurements are available for 20 days from 20 June to 9 July 2002 (day-of-year 171–190). During this time period, the vegetation grew rapidly and surface soil moisture changed from dry to wet due to rainfall events in early July. The eddy covariance flux towers provided measurements of the friction velocity ($u_*$), sensible heat flux ($H$), and latent heat flux ($\lambda E$). Air temperature and humidity were measured using Vaisala HMP-45C instruments, and sonic temperature and wind speed fluctuations were measured using Campbell Scientific CSAT3 sonic anemometers. Radiometric temperatures were measured using Apogee thermal-infrared radiometers (model IRTS-P) with a nominal 60° field of view. The Apogee sensor height was kept at 2.5 m above soybean and 5 m above corn canopies in all corresponding towers. The effective canopy level footprint area for the land surface temperature sensor was approximately 7 m² for soybean towers and 26 m² for corn towers. All data for rain periods are removed from the analysis, as the CSAT sonic instrument does not provide reliable results during such conditions. In addition, sporadic spikes and out-of-range values are removed. During the field campaign, the vegetation height ($h_c$), leaf area index (LAI), and fractional vegetation cover ($f_c$) varied with crop growth stage [Anderson et al., 2004], with ranges shown in Table 1.

Table 1. Range of Vegetation Height ($h_c$), Leaf Area Index (LAI), and Fractional Vegetation Cover ($f_c$) During the Study Period

Crop      $h_c$ (m)   LAI (m²/m²)   $f_c$
Soybean   0.2–0.6     0.4–3.7       0.2–0.9
Corn      0.7–2.2     1.1–5.6       0.5–1.0

[11] Meteorological and heat flux data are averaged to 30 min. The measured sensible heat flux data are used without any closure correction. All records are filtered for rain events and limited to the daytime period from 7:30 A.M. to 6:00 P.M. local time. More detailed site information and a description of the experiments can be found in Kustas et al. [2005] and Prueger et al. [2005].

3. Modeling Approach

3.1. Surface Energy Balance System

[12] SEBS [Su, 2002] is a physically based modeling approach that uses a combination of remote sensing and in situ observations to derive the land surface variables, radiative heat fluxes, and roughness parameters required for the calculation of turbulent heat fluxes at the land surface. The main inputs to the SEBS model include land surface temperature, vegetation height and density, air temperature, humidity, and wind speed, along with surface radiation components. The key aspect of SEBS is its robust formulation for the estimation of the sensible heat flux using either the Monin-Obukhov similarity theory (MOST) equations [Monin and Obukhov, 1945] for the atmospheric surface layer or the bulk atmospheric similarity theory (BAST) [Brutsaert, 1999] for the mixed layer of the atmospheric boundary layer. In the majority of cases, the MOST equations are used unless the roughness of the surface is high or the atmospheric surface layer height is low. The MOST equations used in SEBS include stability-dependent flux-gradient functions for momentum and heat transfer, as described below:

$$u = \frac{u_*}{k}\left[\ln\left(\frac{z - d_0}{z_{0m}}\right) - \Psi_m\left(\frac{z - d_0}{L}\right) + \Psi_m\left(\frac{z_{0m}}{L}\right)\right] \tag{1}$$

$$\theta_0 - \theta_a = \frac{H}{k u_* \rho c_p}\left[\ln\left(\frac{z - d_0}{z_{0h}}\right) - \Psi_h\left(\frac{z - d_0}{L}\right) + \Psi_h\left(\frac{z_{0h}}{L}\right)\right] \tag{2}$$

where $z$ is the reference height (m) above the land surface for measurement of the meteorological variables, $u_*$ is the friction velocity (m/s), $\rho$ is the density of the air (kg/m³), $k$ is the von Karman constant, $c_p$ is the specific heat capacity of air (J/(kg K)), $\theta_0$ is the potential land surface temperature (K), $\theta_a$ is the potential air temperature (K) at height $z$, $d_0$ is the zero-plane displacement height (m), $z_{0m}$ is the roughness height for momentum transfer (m), $z_{0h}$ is the roughness height for heat transfer (m), and $\Psi_m$ and $\Psi_h$ are the stability correction functions for momentum and heat transfer, respectively.

[13] The quantity $L$ is the Obukhov length (m), defined as

$$L = -\frac{\rho c_p u_*^3 \theta_v}{k g H} \tag{3}$$

where $g$ is the acceleration due to gravity (m/s²), and $\theta_v$ is the atmospheric virtual potential temperature (K).

[14] The functions proposed by Beljaars and Holtslag [1991] and evaluated by van den Hurk and Holtslag [1997] for stable conditions and the functions proposed by Brutsaert [2005] for unstable conditions are used for atmospheric stability corrections in the atmospheric surface layer. The roughness heights for momentum and heat transfer ($z_{0m}$ and $z_{0h}$) are important parameters used in the MOST and BAST equations and are functions of the biometeorological conditions of the land surface. These two key parameters are estimated in SEBS using the methodology developed by Su et al. [2001], which is based on vegetation phenology, air temperature, and wind speed.
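Because the friction velocity, the sensible heat flux, and the Obukhov length depend on one another, the MOST flux-gradient relations are usually solved iteratively. The sketch below illustrates one such fixed-point iteration. It is not the SEBS implementation: it substitutes the widely used Businger-Dyer stability functions for the Beljaars-Holtslag and Brutsaert functions named above, approximates the virtual potential temperature by the potential air temperature, and uses assumed constants and hypothetical input values.

```python
import math

K = 0.4       # von Karman constant
G = 9.81      # acceleration due to gravity (m/s^2)
RHO = 1.2     # air density (kg/m^3); assumed constant
CP = 1005.0   # specific heat capacity of air (J/(kg K))


def psi_m(zeta):
    """Momentum stability correction (Businger-Dyer form; assumed stand-in)."""
    if zeta >= 0.0:
        return -5.0 * min(zeta, 1.0)  # stable branch, capped at zeta = 1
    x = (1.0 - 16.0 * zeta) ** 0.25
    return (2.0 * math.log((1.0 + x) / 2.0) + math.log((1.0 + x * x) / 2.0)
            - 2.0 * math.atan(x) + math.pi / 2.0)


def psi_h(zeta):
    """Heat stability correction (Businger-Dyer form; assumed stand-in)."""
    if zeta >= 0.0:
        return -5.0 * min(zeta, 1.0)
    x = (1.0 - 16.0 * zeta) ** 0.25
    return 2.0 * math.log((1.0 + x * x) / 2.0)


def most_flux(u, theta_0, theta_a, z, d0, z0m, z0h, n_iter=50):
    """Iterate the wind and temperature profiles; return (u_star, H, L)."""
    inv_l = 0.0          # start from neutral stratification (1/L = 0)
    zm = z - d0          # height above the displacement plane
    for _ in range(n_iter):
        u_star = K * u / (math.log(zm / z0m)
                          - psi_m(zm * inv_l) + psi_m(z0m * inv_l))
        h = (K * u_star * RHO * CP * (theta_0 - theta_a)
             / (math.log(zm / z0h) - psi_h(zm * inv_l) + psi_h(z0h * inv_l)))
        if abs(h) < 1e-9:
            return u_star, 0.0, None   # neutral: Obukhov length undefined
        # Assumption: theta_v is approximated by theta_a (humidity neglected).
        inv_l = -(K * G * h) / (RHO * CP * u_star ** 3 * theta_a)
    return u_star, h, 1.0 / inv_l


# Hypothetical daytime case: u = 3 m/s at z = 2 m, surface 5 K warmer than air.
args = (3.0, 303.0, 298.0, 2.0, 0.4, 0.06, 0.006)
u_star_neutral, h_neutral, _ = most_flux(*args, n_iter=1)  # first pass is neutral
u_star, h, obukhov_length = most_flux(*args)
```

In this unstable example the stability corrections increase both the friction velocity and the flux relative to the neutral first pass, and the Obukhov length is negative, as expected when the surface is warmer than the air.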

[15] After the estimation of $H$, SEBS uses a scaling method to scale the derived $H$ between hypothetical dry and wet limits based on the relative evaporation concept. Finally, this scaled $H$ can be used to calculate the latent heat flux $\lambda E$ as a residual term in the general energy balance equation, i.e., as $\lambda E = R_n - G_0 - H$. Figure 2 provides a schematic representation of the model as employed in this application (see Su [2002] for further details on the model description and formulation).

Figure 2. Schematic of data flow for estimating the sensible heat flux in the SEBS model.

3.2. Bayesian Inference Technique

[16] In standard deterministic applications of the SEBS model, all input variables are fixed and constant at each simulation time step. In contrast, in a stochastic application, inputs and response variables can be considered as probability distributions or empirical ensembles of values, the envelope of which represents the range of plausible values. This allows for an accounting of uncertainties such as input variations across a heterogeneous site.

[17] For stochastic application of the SEBS model in this study, a Bayesian inference technique (BIT) is developed and linked with the SEBS model. The approach is partially analogous to the Bayesian total error analysis (BATEA) model [Kavetski et al., 2003, 2006a] and focuses on the uncertainty in the SEBS input forcings. In the terminology and notation adopted here, observed variables are indicated with a tilde ($\tilde{x}$), while their posterior estimates are indicated with a hat ($\hat{x}$).

[18] Let us assume a deterministic model $M$ that maps the forcing $x$ into the response $y$,

$$y = M(x, \theta) \tag{4}$$

where $\theta$ is the vector of model parameters, which in this study is kept fixed at pre-estimated values, including the roughness height parameters ($d_0$, $z_{0m}$, $z_{0h}$). These parameters are pre-estimated deterministically using the Su et al. [2001] model for each half-hourly time step at each tower.
[19] Following Kavetski et al. [2003], the observed input data $\tilde{x}$ are assumed to be corrupted by errors (e.g., due to measurement and sampling). A prior distribution of the true inputs, denoted by $x$, is constructed as follows:

$$x \sim p(x \mid \tilde{x}, \Phi_x) \tag{5}$$

where $\Phi_x$ are parameters of the input error model $p(x \mid \tilde{x}, \Phi_x)$.
[20] The observed response data $\tilde{y}$ are also assumed to be corrupted by errors:

$$\tilde{y} \sim p(\tilde{y} \mid y, \Phi_y) \tag{6}$$

where $\tilde{y}$ is the observed response (e.g., sensible heat flux), and $p(\tilde{y} \mid y, \Phi_y)$ describes the response errors given the true response $y$ and response error parameters $\Phi_y$.

[21] In the hierarchical Bayesian framework detailed earlier, $x$ is the “latent variable” and corresponds to estimates of the true inputs; they are not directly observed but are rather inferred as part of the BIT-SEBS procedure. The error model parameters $\Phi_x$ and $\Phi_y$ describe the statistical properties (e.g., mean and variance) of input and response variables, respectively [Renard et al., 2011]. In this application, the values of $\Phi_x$ and $\Phi_y$ are estimated and fixed prior to the BIT-SEBS inference using a separate data analysis procedure detailed later in this section.

[22] In this study, the key objective of the BIT-SEBS scheme is to estimate $x$ given the observed meteorological forcing $\tilde{x}$ and the observed response $\tilde{y}$ using prior information on the magnitude and distribution of the data errors (specified using $\Phi_x$ and $\Phi_y$). The Bayesian posterior for this quantity, conditioned on the observed data, is as follows:

$$p(x \mid \tilde{x}, \tilde{y}, \theta, \Phi_x, \Phi_y) = \frac{p(\tilde{y} \mid x, \theta, \Phi_y)\, p(x \mid \tilde{x}, \Phi_x)}{p(\tilde{y} \mid \tilde{x}, \theta, \Phi_x, \Phi_y)} \tag{7}$$

where the likelihood function $p(\tilde{y} \mid x, \theta, \Phi_y)$ represents the probability of observing the data $\tilde{y}$ given the “estimated” true input $x$, the model parameter $\theta$, the response error parameter $\Phi_y$, and the deterministic model hypothesis (SEBS).

[23] Since the denominator $p(\tilde{y} \mid \tilde{x}, \theta, \Phi_x, \Phi_y)$ is a normalization factor independent of $x$, the following expression of proportionality can be used:

$$p(x \mid \tilde{x}, \tilde{y}, \theta, \Phi_x, \Phi_y) \propto p(\tilde{y} \mid x, \theta, \Phi_y)\, p(x \mid \tilde{x}, \Phi_x) \tag{8}$$

[24] The input error model $p(x \mid \tilde{x}, \Phi_x)$ reflects any independent estimates of $x$, e.g., based on observed input data $\tilde{x}$, available prior to the analysis of the observed response data $\tilde{y}$ (hence, it is also independent from the model parameters $\theta$). In physically based models such as SEBS, input variables are often measurable and have physical meaning and valid ranges that can be used to formulate informative priors based on the independent data analysis. In this study, we represent our prior knowledge of $x$ as follows:

$$p(x \mid \tilde{x}, \Phi_x) = N(x \mid \tilde{x}, \sigma_x^2) \tag{9}$$

where $N(z \mid \mu, \sigma^2)$ denotes the Gaussian PDF of a random variable $z$ with mean $\mu$ and standard deviation $\sigma$.

[25] In equation 9, we set the prior mean of $x$ to $\tilde{x}$, which is equivalent to ignoring systematic errors in the observations. The prior standard deviation $\sigma_x$ is specified by analyzing the spatial variability of the observed forcing field, thus corresponding to sampling uncertainty. This variability can be expressed as an absolute quantity, or as a fraction of $\tilde{x}$, or as a range based on expert knowledge of the input uncertainty.

[26] In the context of the inference equation 7, which is conditioned on the observed response data $\tilde{y}$, the error model in equation 9 plays the role of a prior on $x$ before $\tilde{y}$ is analyzed. Note that formulating the input error model as $p(x \mid \tilde{x}, \Phi_x)$, rather than $p(\tilde{x} \mid x, \Phi_x)$, corresponds to using Bayes' identity $p(x \mid \tilde{x}) \propto p(\tilde{x} \mid x)\, p(x)$ in combination with a noninformative prior $p(x)$. It is also possible to use additional information, such as the average climatology, to define an informative prior $p(x)$ [Huard and Mailhot, 2006].

[27] The likelihood function is formulated by assuming that the differences between the observed responses and the SEBS predictions (i.e., the residual errors) are approximately Gaussian:

$$p(\tilde{y} \mid x, \theta, \Phi_y) = N(\tilde{y} \mid M(x, \theta), \sigma_y^2) \tag{10}$$

where $M(x, \theta)$ is the SEBS response produced using the input $x$ and the SEBS parameters $\theta$, $\tilde{y}$ is the observed response variable, and $\sigma_y$ is the standard deviation of the errors in the response variable (which may include errors in the response data and in the model structure).
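The combination of a Gaussian input prior centered on the observations and a Gaussian likelihood on the model residual can be sketched as an unnormalized log posterior. The code below is an illustration under stated assumptions: `toy_flux_model` is a hypothetical bulk-transfer stand-in for SEBS (with an assumed transfer coefficient), and all numerical values are invented for the example rather than drawn from the tower data.

```python
import math


def gaussian_logpdf(value, mean, sd):
    """Log density of a Gaussian with the given mean and standard deviation."""
    return (-0.5 * math.log(2.0 * math.pi * sd * sd)
            - 0.5 * ((value - mean) / sd) ** 2)


def log_posterior(x, x_obs, sigma_x, y_obs, sigma_y, model):
    """Unnormalized log posterior: Gaussian input prior plus Gaussian likelihood."""
    log_prior = sum(gaussian_logpdf(xi, xo, sx)
                    for xi, xo, sx in zip(x, x_obs, sigma_x))
    log_like = gaussian_logpdf(y_obs, model(x), sigma_y)
    return log_prior + log_like


def toy_flux_model(x):
    """Hypothetical stand-in for SEBS: H = rho * c_p * C_H * u * (T0 - Ta)."""
    t0, ta, u = x
    return 1.2 * 1005.0 * 0.005 * u * (t0 - ta)  # C_H = 0.005 is assumed


# Hypothetical observations: T0 = 30 C, Ta = 25 C, u = 3 m/s, H = 150 W/m^2.
x_obs = (30.0, 25.0, 3.0)
sigma_x = (2.0, 0.5, 0.5)   # assumed prior standard deviations
y_obs, sigma_y = 150.0, 20.0

at_observed = log_posterior(x_obs, x_obs, sigma_x, y_obs, sigma_y, toy_flux_model)
at_shifted = log_posterior((33.0, 25.0, 3.0), x_obs, sigma_x,
                           y_obs, sigma_y, toy_flux_model)
```

In this invented case the posterior favors a land surface temperature about 3°C above the observed value, because the gain in likelihood from matching the observed flux outweighs the prior penalty for departing from the measurement.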

3.3. Markov Chain Monte Carlo Sampling of the BIT-SEBS Posterior Distribution

[28] The posterior distribution $p(x \mid \tilde{x}, \tilde{y}, \theta, \Phi_x, \Phi_y)$ can be approximated using a Monte Carlo or Markov chain Monte Carlo (MCMC) sampling scheme. Due to the high dimensionality of the posterior PDF in this work, the slice sampling MCMC method of Neal [2003] is used. This method uses the prior as a proposal distribution and avoids requiring the user to specify a high-dimensional proposal distribution [Noh et al., 2010].

[29] A flowchart of the computational algorithm is shown in Figure 3. At each step of the MCMC simulation, the slice sampling algorithm draws a candidate value $x^*$ from the prior distribution (equation 9), runs the SEBS model with the candidate inputs $x^*$, and evaluates the likelihood function (equation 10). This procedure is then repeated until the MCMC iterations converge. Other Monte Carlo methods for sampling from the posterior include standard Metropolis methods [Kavetski et al., 2006a, 2006b], which in some cases can be adapted to exploit the time dependence of the model [Kuczera et al., 2010]. To ensure that the MCMC algorithm explored all parts of the prior distributions, convergence diagnostics are applied as detailed in section 4.1.

Figure 3. Computational flowchart of the Bayesian inference procedure in BIT-SEBS using the slice sampling MCMC. Input data and parameters are highlighted in gray squares.
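For readers unfamiliar with slice sampling, the following minimal sketch shows a univariate slice sampler with the stepping-out and shrinkage procedures described by Neal [2003]. It is a generic textbook version applied to a toy one-dimensional log posterior, not the BIT-SEBS code: the target density, step width, seed, and sample count are all assumptions chosen for illustration.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, rng=None):
    """Univariate slice sampler using stepping-out and shrinkage."""
    rng = rng or random.Random(0)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Draw the slice level uniformly under the density at x (in log space).
        log_level = log_density(x) + math.log(1.0 - rng.random())
        # Place an interval of the given width at random around x.
        left = x - width * rng.random()
        right = left + width
        # Step out until both ends fall outside the slice.
        while log_density(left) > log_level:
            left -= width
        while log_density(right) > log_level:
            right += width
        # Sample within the interval, shrinking it toward x on each rejection.
        while True:
            candidate = rng.uniform(left, right)
            if log_density(candidate) >= log_level:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        samples.append(x)
    return samples


def toy_log_posterior(x):
    """Toy target: prior N(1, 1) on x, one observation 4 of the model 2x with unit error."""
    return -0.5 * (x - 1.0) ** 2 - 0.5 * (4.0 - 2.0 * x) ** 2


draws = slice_sample(toy_log_posterior, x0=1.0, n_samples=4000,
                     rng=random.Random(42))
posterior_mean = sum(draws) / len(draws)
posterior_var = sum((d - posterior_mean) ** 2 for d in draws) / len(draws)
```

For this toy target the exact posterior is Gaussian with mean 1.8 and variance 0.2, which gives a simple check on the sampler. A multivariate posterior can be handled by applying the same update to one coordinate at a time, and a truncated prior (such as a nonnegative wind speed) by returning negative infinity from the log density outside the allowed range.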

3.4. BIT-SEBS Methodology for Analyzing SMEX02 Tower Data

3.4.1. Prior Uncertainty Analysis of Input Variables

[30] The “uncertain” input meteorological variables of the SEBS model used in this study include the air temperature ($T_a$), land surface temperature ($T_0$), and wind speed ($u$). For each of the uncertain input meteorological variables, a Gaussian prior PDF is specified, with a mean equal to the measured value and a standard deviation $\sigma_x$ proportional to the spatial variability of the observed values. Hence, for each time step, the standard deviations of $T_0$, $T_a$, and $u$ are calculated as the standard deviation of observations across all 12 towers within the study area. In the case of wind speed, the Gaussian prior distribution was truncated at zero to avoid negative wind speeds being sampled when the observed values are small relative to their potential variability.
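The prior construction described above can be sketched as follows: the prior standard deviation at each time step is the spread of the simultaneous readings across towers, and wind speed draws are restricted to nonnegative values. The use of the sample (n − 1) standard deviation, the rejection approach to truncation, and the readings themselves are assumptions made for this illustration; only four hypothetical towers are used to keep the example short.

```python
import math
import random
import statistics


def prior_sd_per_step(readings_by_step):
    """Standard deviation across towers for each time step (sample form, n - 1)."""
    return [statistics.stdev(step) for step in readings_by_step]


def sample_truncated_gaussian(mean, sd, rng, lower=0.0):
    """Draw from a Gaussian truncated below at `lower` by simple rejection."""
    while True:
        draw = rng.gauss(mean, sd)
        if draw >= lower:
            return draw


# Hypothetical land surface temperatures (C): two time steps, four towers each.
lst_readings = [[20.0, 22.0, 24.0, 26.0], [27.0, 31.0, 29.0, 33.0]]
lst_sd = prior_sd_per_step(lst_readings)

# Hypothetical light-wind time step (m/s): the prior is centered on one tower's
# reading, with a spread taken from all towers, and truncated at zero.
wind_readings = [0.2, 0.5, 1.1, 0.8]
wind_sd = statistics.stdev(wind_readings)
rng = random.Random(1)
wind_draws = [sample_truncated_gaussian(wind_readings[0], wind_sd, rng)
              for _ in range(1000)]
```

Rejection is adequate here because the truncation point is close to the prior mean; if the mean were far below the bound, an inverse-CDF approach would be more efficient.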

[31] Other input variables (e.g., humidity) are assumed constant and equal to the observed value in the tower. SEBS model parameters ($d_0$, $z_{0m}$, $z_{0h}$) are also calculated deterministically for each time step at each tower using the corresponding measured vegetation height and density and meteorological variables. Due to careful in situ observations of the vegetation parameters at each tower [Anderson, 2003; Kustas et al., 2005], the dynamics in aerodynamic roughness of the surface are preserved, and uncertainties in parameterization of the roughness height are expected to be reduced.

[32] Figure 4 presents measured values of precipitation, land surface temperature, air temperature, and wind speed during the study period across all towers. A rain event on day-of-year 172 was followed by a 12 day dry period, causing the soil moisture to decrease from field capacity to relatively dry conditions. Subsequently, some rain events during day-of-year 185–188 increased the soil moisture. Figure 4 shows that relative to the corn towers, soybean towers measure higher land surface temperature, air temperature, and wind speeds.

Figure 4. Time series of the land surface temperature $T_0$, air temperature $T_a$, and wind speed $u$ for all 12 towers (6 over soybean and 6 over corn) of the SMEX02 campaign during the daytime. Gray lines represent soybean towers, and black lines represent the corn towers. The tower-averaged precipitation is shown in the upper plot. Gaps in the data record reflect the removal of rain periods from the analysis, given the influence that these have on flux observations.

[33] As described in section 3.2, the Bayesian inference for each of the meteorological variables ($T_0$, $T_a$, $u$) requires the construction of a prior for each variable, at each time step, and for each tower. Here each meteorological variable at each simulation time step at each of the 12 towers is given its own Gaussian prior PDF, with mean given by the observed value at tower $i$, at time $t$, and a standard deviation estimated from the spread of observed values across the 12 towers at time $t$. As the eddy covariance towers within the SMEX02 domain provide a reasonable coverage of the study area (see Figure 1), the range of the observed meteorological values across these towers is assumed to be indicative of the spatial variability.

[34] Based on the values of all towers, the standard deviations $\sigma$ for each time step are calculated for $T_s$, $T_a$, and $u$ and shown in Figure 5. As $\sigma$ is larger for $T_s$ than for $T_a$ and $u$, the priors for $T_s$ are wider. The width of the prior controls the uncertainty bound of each input variable and hence directly affects the inference (see section 4.2).
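
The prior construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not specify the exact estimator used to turn the cross-tower range into a standard deviation, so the sample standard deviation is assumed here, and the function name `build_gaussian_priors` and the temperature values are hypothetical.

```python
import numpy as np

def build_gaussian_priors(obs):
    """Return (means, sigmas) of Gaussian priors for one variable.

    obs: array of shape (n_towers, n_times); NaN marks a gap in a record.
    means has the same shape as obs (each tower keeps its own observation).
    sigmas has shape (n_times,): one spread per time step, shared by all towers.
    """
    obs = np.asarray(obs, dtype=float)
    means = obs.copy()
    # Assumed estimator: sample standard deviation across towers, ignoring gaps.
    sigmas = np.nanstd(obs, axis=0, ddof=1)
    return means, sigmas

# Hypothetical land surface temperatures (deg C): 3 towers, 2 time steps.
ts_obs = np.array([[30.0, 32.0],
                   [31.0, 35.0],
                   [32.0, 38.0]])
means, sigmas = build_gaussian_priors(ts_obs)
print(sigmas)  # spreads of 1 and 3 deg C for the two time steps
```

A larger spread across towers at a given time step gives every tower a wider prior at that time step, which is how the spatial variability enters the inference.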

Figure 5. Time series of the standard deviation ($\sigma$) for $T_s$ (°C), $T_a$ (°C), and $u$ (m/s) derived from all of the SMEX02 towers at each time step during daytime.

[35] To appraise the assumption of Gaussian priors, Figure 6 shows quantile-quantile (QQ) plots of the tower data used to construct the priors (results for two representative time steps are shown). Land surface and air temperatures appear reasonably Gaussian, while the wind speed distributions exhibit heavier tails, representing a limitation of the Gaussian assumption.

Figure 6. QQ plots of the land surface temperature ($T_s$), air temperature ($T_a$), and wind speed ($u$) for all towers at 12:00 P.M. (local time) on day-of-year 173 and 174.

3.4.2. Prior Uncertainty Analysis of Response Variable

[36] The response variable in this Bayesian investigation is the sensible heat flux observed at each of the eddy covariance towers. A number of recent studies [e.g., Foken, 2008; Foken et al., 2012; Hollinger and Richardson, 2005; Mauder et al., 2008; Meyers and Baldocchi, 2005; Richardson et al., 2012] have highlighted the uncertainties in eddy covariance estimations of turbulent heat fluxes. In addition to standard data quality controls (e.g., coordinate rotation and density correction) that need to be performed on the high-frequency eddy covariance measurements, there are issues related to inadequacy of fetch, heterogeneity of the footprint, improper averaging times, and noncapture of large eddies that add to the uncertainties in the eddy covariance estimates [Allen et al., 2011].

[37] To include the uncertainties of sensible heat flux observations in the Bayesian inference of the input variables, prior PDFs of $H$ are developed, with the observed sensible heat flux $H_o$ considered as the mean of the prior PDF. The standard deviation of the PDF, $\sigma_H$, is expressed as a fraction $r$ of the observed sensible heat flux, $\sigma_H = r H_o$. The choice of $r$ has a direct influence on the inference of the input variables. Smaller values of $\sigma_H$ (i.e., smaller $r$) correspond to a lower uncertainty in the observations of the sensible heat flux, which causes larger deviations of the inferred values of input forcing from their observed values. In contrast, larger values of $\sigma_H$ (i.e., larger $r$) correspond to higher uncertainty in the observations of the sensible heat flux and cause smaller deviations of the inferred input values.
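
The effect of $r$ described above can be illustrated with a one-variable conjugate Gaussian example. The sketch below is not SEBS: the linear map `h = a * x`, the function name `posterior_mean_linear`, and all numerical values are hypothetical stand-ins chosen so that the posterior mean has a closed form.

```python
def posterior_mean_linear(x_obs, sigma_x, h_obs, r, a):
    """Posterior mean of x for prior N(x_obs, sigma_x^2) and
    likelihood h_obs ~ N(a * x, sigma_h^2), with sigma_h = r * |h_obs|.

    The linear model h = a * x is a hypothetical stand-in for SEBS.
    """
    sigma_h = r * abs(h_obs)
    precision = 1.0 / sigma_x**2 + a**2 / sigma_h**2
    return (x_obs / sigma_x**2 + a * h_obs / sigma_h**2) / precision

# Observed input 30 (prior spread 2); the toy model would need x = 32
# to reproduce the observed flux of 160 exactly.
for r in (0.05, 0.10, 0.15):
    x_inf = posterior_mean_linear(x_obs=30.0, sigma_x=2.0, h_obs=160.0, r=r, a=5.0)
    print(f"r = {r:.2f}: inferred x = {x_inf:.2f}, shift = {x_inf - 30.0:.2f}")
```

In this toy setting the inferred input moves further from its observed value as $r$ decreases, because a tighter flux prior forces the input to explain more of the mismatch.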

[38] Determination of the best (or optimum) value of $r$ is not possible, as the uncertainty in sensible heat flux observations is poorly described. Also, the spatial variability of $H$ cannot be used to develop PDFs of $H$ due to the difference in the extent and heterogeneity of the footprints among towers. Allen et al. [2011] identified that the errors in the estimation of the latent heat flux using eddy covariance systems for a well-maintained site, in terms of standard deviation from the mean, are in the range of 10%–15%. Based on these measures, we estimate that the standard deviation for sensible heat flux is around 10% of the measured value (i.e., $r = 0.1$), as sensible heat flux estimations are often more reliable than latent heat flux estimations in eddy covariance towers [Foken, 2008; Mauder et al., 2008; Richardson et al., 2012].

[39] To evaluate the sensitivity of the inference to the value of $r$, we examined three cases: $r = 0.05$, $r = 0.1$, and $r = 0.15$. The sensitivity analysis was based on the residuals $\Delta x$ between the inferred and observed values $x_i$ and $x_o$, where $x$ can be $T_s$, $T_a$, or $u$ and subscripts $i$ and $o$ refer to inferred and observed values, respectively. Results showed that in all three cases of $r$, the relative variation in the range and magnitude of $\Delta T_s$, $\Delta T_a$, and $\Delta u$ was similar (i.e., $\Delta T_s$ is an order of magnitude higher than $\Delta T_a$ and $\Delta u$; see supporting information). Consequently, variation of $r$ among the selected values has no significant influence on identifying the most uncertain variable. Therefore, $r = 0.1$ is adopted in the computation of results in the following sections.

3.4.3. Posterior Estimation and Inference Using BIT-SEBS

[40] Figure 7 shows the overall procedure in estimation of the posterior values of the input variables. For each time step and at each tower, prior analysis of data uncertainty was carried out as described earlier. MCMC simulations were then performed using the slice sampling method (section 3.2). The results of the Bayesian simulations can then be represented as time series of the posterior values for $T_s$, $T_a$, and $u$ for each tower record. Following an MCMC convergence assessment, the time series of posterior estimates of input variables were then used as estimates of the meteorological input variables (section 4.2) and also to provide insights into their uncertainties (section 4.3).
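
To make the sampling step more concrete, the following is a minimal univariate slice sampler (stepping-out and shrinkage, in the style of Neal's slice sampling) applied to a toy posterior. It is a sketch under strong assumptions and not the BIT-SEBS implementation: the study infers three inputs with the SEBS model inside the likelihood, whereas here a single input is inferred and the linear map `h = a * x` is a hypothetical stand-in for SEBS. All function names and numerical values are illustrative.

```python
import numpy as np

def make_log_posterior(x_obs, sigma_x, h_obs, r, a):
    """Log posterior (up to a constant) for one input x.

    Gaussian prior centred on the observed input and Gaussian likelihood
    for the observed flux; h = a * x is a hypothetical stand-in for SEBS.
    """
    sigma_h = r * abs(h_obs)

    def log_post(x):
        log_prior = -0.5 * ((x - x_obs) / sigma_x) ** 2
        log_like = -0.5 * ((h_obs - a * x) / sigma_h) ** 2
        return log_prior + log_like

    return log_post

def slice_sample(log_density, x0, n_samples, width=1.0, seed=0):
    """Univariate slice sampler with stepping-out and shrinkage."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x = float(x0)
    for k in range(n_samples):
        # Slice level: log(v * f(x)) with v ~ Uniform(0, 1).
        log_y = log_density(x) - rng.exponential(1.0)
        # Random interval of size `width` containing x, then step out.
        left = x - width * rng.uniform()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Draw from the interval, shrinking it towards x after each rejection.
        while True:
            x_new = rng.uniform(left, right)
            if log_density(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        samples[k] = x
    return samples

log_post = make_log_posterior(x_obs=30.0, sigma_x=2.0, h_obs=160.0, r=0.1, a=5.0)
draws = slice_sample(log_post, x0=30.0, n_samples=5000)
posterior_mean = draws[1000:].mean()  # discard an initial burn-in
print(round(posterior_mean, 2))
```

For this toy problem the posterior is Gaussian, so the sample mean can be checked against the closed-form posterior mean (about 30.56 for the values used here).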

Figure 7. Overall procedure used in BIT-SEBS to estimate input meteorological variables. Input data and parameters are shown in gray squares. The procedure is applied at each time step of each tower.

4. Results

4.1. Convergence Analysis of the MCMC Iterations

[41] A convergence study of the MCMC samples was undertaken as follows. The number of iterations necessary for MCMC chain convergence was estimated visually by plotting traces of the MCMC samples against the number of iterations for all chains [Kass et al., 1998]. Figure 8 shows the MCMC chain traces and their cumulative mean for 3000 samples (iterations), with a thinning factor of 10 and a burn-in period of 1000 samples for the 12:00 P.M. time stamp of tower WC162 (soybean) for day-of-year 173. Here a thinning factor of 10 means that a total of 30,000 samples were generated, but only every 10th sample was retained (this reduces the effects of serial correlation of the MCMC samples). A burn-in period of 1000 samples means that the first 1000 samples were discarded. From Figure 8 it can be seen that the cumulative means of the posterior traces are stationary after approximately 1000 iterations, suggesting adequate convergence of the MCMC samples.
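
The thinning, burn-in, and cumulative-mean diagnostics described above can be sketched as follows. The order of operations is an assumption (thinning first, then discarding the first 1000 retained samples), since the text does not state whether the burn-in is counted before or after thinning; the random draws are a stand-in for a real MCMC chain.

```python
import numpy as np

def thin_and_burn(chain, thin=10, burn_in=1000):
    """Keep every `thin`-th sample, then drop the first `burn_in` retained ones."""
    thinned = np.asarray(chain)[::thin]
    return thinned[burn_in:]

def cumulative_mean(samples):
    """Running mean of the samples, as plotted in a convergence trace."""
    samples = np.asarray(samples, dtype=float)
    return np.cumsum(samples) / np.arange(1, samples.size + 1)

rng = np.random.default_rng(0)
raw = rng.normal(loc=30.0, scale=1.0, size=30_000)  # stand-in for an MCMC chain
kept = thin_and_burn(raw)
print(kept.size)  # 2000 samples remain under the assumed ordering
running = cumulative_mean(kept)
```

A cumulative mean that flattens out, as in the right-hand panels described above, is a visual indication of stationarity rather than a proof of convergence.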

Figure 8. (left) Traces of the posterior input meteorological variables in the Markov chains and (right) their corresponding cumulative mean (with x axis in logarithmic scale). Results represent a single 12:00 P.M. time stamp for day-of-year 173 at tower WC162 (soybean). The means of all variables appear stationary after about 1000 iterations.

[42] For quantitative evaluation of the MCMC convergence and assessment of the adequacy of the chain numbers, the potential scale reduction factor $\hat{R}$ of Gelman and Rubin [1992] is used. The criterion for acceptance of the Bayesian modeling is that $\hat{R}$ falls below the threshold recommended by Brooks and Gelman [1998]. Any MCMC chain that did not meet this criterion was rejected and was not considered in the inference.
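
For reference, a commonly used form of the Gelman and Rubin potential scale reduction factor can be computed as below. This is a sketch of the standard between-chain and within-chain variance formula, not necessarily the exact variant used in the study; the acceptance threshold is left to the caller, and the synthetic chains are illustrative only.

```python
import numpy as np

def potential_scale_reduction(chains):
    """Gelman-Rubin R-hat for one scalar quantity.

    chains: array of shape (m_chains, n_samples), burn-in already removed.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return np.sqrt(pooled / within)

rng = np.random.default_rng(1)
mixed = rng.normal(0.0, 1.0, size=(4, 2000))        # chains sampling one target
stuck = mixed + np.array([[0.0], [3.0], [0.0], [3.0]])  # chains that disagree
rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
print(rhat_mixed, rhat_stuck)
```

Values close to 1 indicate that the chains agree with one another; chains that have settled in different regions give values well above 1.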

[43] Histograms of the MCMC samples from the posterior are shown in Figure 9. The histograms have symmetric shapes and are well approximated by Gaussian distributions. In addition, Figure 9 shows that BIT-SEBS has refined the estimates of $T_s$ compared to their prior estimates, whereas for $T_a$ and $u$ the data were noninformative, and BIT-SEBS did not result in any refinement of the prior estimates.

Figure 9. Prior and posterior distributions of input meteorological variables for the 12:00 P.M. time stamp of tower WC162 (soybean) for day-of-year 173. The thin line is the prior distribution (Gaussian PDF), the histogram represents the MCMC samples from the posterior distribution, and the thick line is a Gaussian PDF fitted to the histogram of posteriors.

[44] As the posterior distributions of each input variable are approximately Gaussian, their mean values (which also correspond to the most likely values) are taken as the point estimates of that variable. These inferred values are then used in evaluation of the performance of the Bayesian inference (section 4.2) and quantification of the uncertainties (section 4.3).

4.2. Bayesian Uncertainty Analysis of SEBS Inputs

[45] The SEBS model is used to estimate the sensible heat flux in both “deterministic” and Bayesian “stochastic” estimation schemes, with Figure 10 presenting a schematic of the overall procedure. In deterministic estimation, the observed values of the meteorological variables were used for direct estimation of the sensible heat flux (the traditional flux estimation approach). However, in stochastic estimation, the inferred values of $T_s$, $T_a$, and $u$ are used.

Figure 10. Overall procedure to simulate sensible heat fluxes in SEBS using both stochastic and deterministic forms. The input forcing comprises $T_s$, $T_a$, and $u$. The procedure is applied at each time step of each tower.

[46] Figure 11 presents a scatterplot of both the deterministic and stochastic estimates of sensible heat flux values against measured eddy covariance data for daytime half-hourly records for all soybean (top) and all corn towers (bottom). Linear regression statistics for each scatterplot are also shown in Figure 11. The RE term refers to a relative error measure defined as $\mathrm{RE} = \mathrm{RMSE} / [\max(H_o) - \min(H_o)]$, where RMSE is the root-mean-squared error between observed and simulated sensible heat flux, and $H_o$ is the observed sensible heat flux. As is apparent from Figure 11, stochastic simulation of sensible heat flux using Bayesian-inferred values of $T_s$, $T_a$, and $u$ improves the correlations for both corn and soybean towers, with an $R^2$ increase from 0.68 to 0.99 for soybean and from 0.62 to 0.98 for corn. In addition, the relative error decreases from around 10% to 1% for both soybean and corn towers.
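
The relative error measure, described in the Figure 11 caption as the RMSE divided by the range of observations, can be computed as in the short sketch below. The function name and the flux values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def relative_error(h_obs, h_sim):
    """RMSE between simulated and observed flux, divided by the observed range."""
    h_obs = np.asarray(h_obs, dtype=float)
    h_sim = np.asarray(h_sim, dtype=float)
    rmse = np.sqrt(np.mean((h_sim - h_obs) ** 2))
    return rmse / (h_obs.max() - h_obs.min())

# Hypothetical half-hourly fluxes (W/m^2): every simulated value is off by 10.
h_obs = [50.0, 100.0, 150.0, 250.0]
h_sim = [60.0, 90.0, 160.0, 240.0]
re = relative_error(h_obs, h_sim)
print(f"{100 * re:.0f}%")  # RMSE of 10 over a range of 200
```

Normalizing by the observed range makes the error comparable across towers whose fluxes span different magnitudes.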

Figure 11. Scatterplots of observed sensible heat flux ($H_o$; x axis) versus deterministic and stochastic simulated sensible heat flux for all daytime (top) soybean and (bottom) corn half-hourly tower values. Linear regression statistics of both simulations are also shown. The relative error is defined as the RMSE divided by the range of observations. The 1:1 line is also shown.

[47] Time series of the observed, deterministic simulated, and stochastic simulated sensible heat flux for six selected towers (with fewest data gaps) are presented in Figure 12, with regression statistics shown for both deterministic and stochastic simulations. The deterministic simulated sensible heat flux is in agreement with the observed values for the majority of towers, with $R^2$ values between 0.4 (tower WC162) and 0.8 (tower WC13). However, a clear underestimation of sensible heat flux in deterministic results is evident for WC13 and WC161. Also, deterministic simulated sensible heat fluxes of WC162 have clear forward diurnal shifts. In contrast, the stochastic simulated values are in better agreement with the observed values, showing improved $R^2$ values of 0.96–0.99.

Figure 12. Time series of the observed sensible heat flux ($H_o$) and of the stochastic and deterministic simulated sensible heat flux for six selected towers. For each plot, regression statistics are given for deterministic (Det.) and stochastic (Sto.) linear regressions.

[48] It is apparent that by using the inferred values of $T_s$, $T_a$, and $u$, the performance of linear regressions of half-hourly results improves significantly, with $R^2$ and slope values close to 1 and a considerable decrease in relative errors. The improved model performance in the stochastic simulations is due to the inference of the input variables from the observed responses (section 3.2) and should not be viewed as indicative of the performance in predictive applications. Instead, our aim here is to use the inferred values of the input variables to gain further insights into the errors and uncertainties associated with them, and to gain insights into which input variables are likely to be contributing to the predictive uncertainty. In particular, the following sections examine and discuss which inferred inputs differ most from their observed values.

[49] It should be emphasized that the specification of the priors (in particular, their standard deviations) has a significant influence on the performance of the inference in BIT-SEBS. The importance of the choice of priors is illustrated in the example presented in the supporting information. In this case, BIT-SEBS simulations are performed for tower WC13 (soybean) assuming that all three variables share the same, larger prior standard deviation. Results show that when using such a set of noninformative priors, the efficiency in the stochastic estimation of the sensible heat flux is greatly reduced, and the time series of the differences between inferred and observed values for all three variables are in the same approximate range. Assigning a realistic and representative range of uncertainty to the input variables is known to be of key importance to the fidelity of such hierarchical Bayesian approaches (e.g., as discussed by Renard et al. [2010, 2011]). In this study, this is pursued by considering the spatial variability of the inputs as a proxy for the sampling errors in these quantities.

[50] In summary, although the true values of the selected input meteorological variables are unknown, the inferred values using BIT-SEBS can be considered accurate estimates of those true values for the following reasons: (1) the prior distributions of the input variables are based on the spatial variability of the measurements within a relatively dense network of towers; (2) the likelihood function contains a physically based model with established relations between input data and estimated sensible heat flux; (3) errors in the parameterization of the SEBS model are likely to be relatively small (due to the quality of the field observations of the vegetation characteristics); (4) the MCMC analysis of the posterior distributions appears to have converged, according to the diagnostics employed; (5) the posterior distributions are well behaved and approximately Gaussian, and there is no evidence of incompatibility with the corresponding prior distributions; and (6) stochastic simulations of the SEBS model using the inferred input variables resulted in consistent estimates of the response variable (sensible heat flux).

[51] Therefore, differences between the inferred and observed values of the input variables are likely to consist primarily of observational errors. Further examination of the inferred input observations is undertaken in the following section.

4.3. Inferred Values of Meteorological Variables

[52] To evaluate the performance of the Bayesian inference, results are first examined for a sample day for both a soybean and a corn tower. To evaluate the approach more closely, the differences between inferred and observed meteorological values for all towers are also presented. Figure 13 plots the inferred values of $T_s$, $T_a$, and $u$ for day-of-year 173 from a representative soybean (WC162) and corn tower (WC152). Gray lines in Figures 13a–13f indicate the observed values from the other 11 soybean and corn towers, which can be used to establish whether the range and trend in observed and inferred values are in accord with the other measurements across the study domain.

Figure 13. Observed and Bayesian-inferred values of meteorological variables for a (a, c, e, and g) soybean tower (WC162) and (b, d, f, and h) corn tower (WC152). The gray lines in the top three rows represent the observed values for the other 11 flux towers. (bottom) The observed, deterministic calculated, and stochastic generated sensible heat fluxes are shown. The x axis for all plots indicates hour of the day (local time) for day-of-year 173.

[53] As can be seen from Figure 13a, the observed $T_s$ has a different diurnal cycle than is present in the other towers, due perhaps to sensor time delay, alignment, or geometric configuration. If the observed values of $T_s$, $T_a$, and $u$ from this tower were to be used in SEBS in deterministic simulations, the resulting sensible heat flux would be very different from the observed $H$ ($R^2$ of 0.22 and RMSE of 52 W/m²; see Figure 13g). On the other hand, the Bayesian estimated values of $H$ match well with the observed sensible heat flux and improve $R^2$ to 0.99 and RMSE to 0.86 W/m². To achieve this, the Bayesian inference approach identifies alternative values of $T_s$ that provide a better match to the diurnal variations represented across the other towers. Given that the inferred values of $T_a$ and $u$ are close to the observed values (see Figures 13c and 13e), it seems that for this tower at least, the main uncertainty in flux estimation results from the $T_s$ observations, with absolute differences between observed and inferred values of up to 3°C. This difference is well within the expected spatial variability observed within the in situ surface temperature measurements over agricultural fields [McCabe et al., 2008].

[54] For the corn tower, the inferred values of the land surface temperature are up to 2°C lower than the observed values (see Figure 13b). Similar to the soybean tower example earlier, the Bayesian-inferred values of air temperature and wind speed remain quite close to the observed values, indicating that these seem to be spatially representative. Figure 13h shows that the deterministic estimate of $H$ via standard application of SEBS is considerably higher than the observed flux estimate. Through use of the inferred land surface temperature values, a significantly improved simulation of the observed sensible heat flux is achieved, with $R^2$ increased from 0.86 to 0.99 and RMSE reduced from 41 to 1.3 W/m².

[55] To evaluate the performance of the Bayesian inference for all towers, the differences between observed and inferred values of $T_s$, $T_a$, and $u$ are calculated as the difference $\Delta x$ between $x_i$ and $x_o$, where $x$ is the variable of interest ($T_s$, $T_a$, or $u$), and subscripts $o$ and $i$ represent the observed and inferred values of the variable, respectively. Time series of $\Delta T_s$, $\Delta T_a$, and $\Delta u$ are shown in Figure 14. A bar plot of the all-tower-averaged precipitation is also shown to support interpretation of the results. For all plots (except precipitation), gray lines represent the time series of the soybean towers, and black lines represent the corn towers.

Figure 14. Differences between Bayesian-inferred and observed values for land surface temperature, air temperature, and wind speed. Soybean towers are shown in gray, and corn towers are shown in black. (top) Average precipitation occurring during the field campaign.

[56] For all towers, differences between Bayesian-inferred and observed values are larger for the land surface temperature ($\Delta T_s$ values of up to ±5°C) than for either the air temperature or wind speed. One possible reason for this difference is the disparity between the footprint of the in situ Apogee land surface temperature sensors and the CSAT sonic anemometer that is used to derive the sensible heat flux at the eddy covariance tower. The effective footprint of the Apogee sensors used in this study is on the order of a few square meters (approximated as circles with areas of 26.2 m² over corn and 6.5 m² over soybean), while the sonic anemometer measures eddies that originate from a nonlocal (relative to the in situ sensor) distance upwind of the tower [Schmid, 2002], representing a source area of several hundreds of square meters.

[57] Figure 14 indicates that the differences between the Bayesian-inferred and observed $T_s$ at the soybean towers are more significant (and frequent) than those at the corn towers, possibly due to the lower fractional vegetation cover and the effect of bare soil on the locally observed $T_s$ [McCabe et al., 2008]. For the corn towers, the fractional vegetation cover is higher than for soybean towers, and hence, the footprint of surface temperature is more likely to be spatially stable and spatially representative. In contrast to $\Delta T_s$, values of $\Delta T_a$ have lower variability and their magnitude is within the range of the sensor accuracy (±0.3°C). The lower values of $\Delta T_a$ suggest that due to atmospheric mixing and turbulence in the air, the footprint of the in situ air temperature sensor (HMP-45C) is more representative of the footprint of the sonic instrument. However, this reasoning cannot be extended to $\Delta u$, as the wind speed in this study derives directly from the CSAT sonic anemometer rather than from independent measurements. Nevertheless, for the majority of cases, the range of $\Delta u$ is less than 0.5 m/s, which indicates that observations of the wind speed in each individual tower are likely to be representative of the domain average (apart from a number of clearly identifiable periods). The few days with higher $\Delta u$ values (e.g., day-of-year 178) are days with lower values of wind speed in the area (see Figure 4), which indicates that when wind speed is lower, the spatial variability in its value is larger.

5. Discussion

[58] Sources of uncertainty in Earth system models are varied and can include errors due to simplifications in the model structure, errors in the observations of the input forcing, uncertainties in parameterization of the model, or errors in the observations of the response variables (sensible heat flux in this study). In general, understanding and quantifying the uncertainty in such modeling schemes is nontrivial, due to the complexity of the interactions between the land surface and the atmosphere and the combined effects of all sources of error [Kalma et al., 2008].

[59] In the present study, errors associated with the measurements of meteorological input forcing were estimated based on their application in determining sensible heat flux using the SEBS model over a number of eddy covariance towers. Input forcing included air temperature, land surface temperature, and wind speed. Results indicate that the main uncertainty contributing to flux prediction arises due to uncertainties in the local observations of the land surface temperature, with differences between inferred and observed values of up to ±5°C. A number of previous studies have identified that errors in the land surface temperature can have a direct and significant effect on the estimation of the sensible heat flux. For example, van der Kwast et al. [2009] found that in well-irrigated fields, $H$ estimated by SEBS can deviate up to 70% with only a 0.5°C difference in surface temperature. It is worth noting that this behavior is not distinct to SEBS alone: similar sensitivities appear in other energy balance models and represent a considerable problem for energy balance based approaches that require the use of an infrared surface temperature [Kalma et al., 2008]. For instance, Timmermans et al. [2007] identified that a 3°C deviation in surface temperature can cause errors in the sensible heat flux estimation of up to 75% in the two-source model (TSM) [Norman et al., 2000] and 45% in the Surface Energy Balance Algorithm for Land (SEBAL) model [Bastiaanssen et al., 1998a].

[60] Looking beyond sensitivity analysis of modeling schemes to surface temperature, we suggest that the main reason for the differences between observed and Bayesian-inferred values of the land surface temperature is the disparity in the spatial representativeness (or footprint) of the sensors. In particular, the local-scale footprint of in situ measurements of the land surface temperature (using Apogee sensors) is unlikely to correspond with the footprint scale of flux observations made with eddy covariance systems [Kljun et al., 2004; Kustas et al., 2006; Su et al., 2005; Vickers et al., 2010]. Due to atmospheric turbulence and mixing, air temperatures and wind speeds will have lower spatial variability than the land surface temperature. Likewise, the footprint of the locally measured $T_a$ and $u$ will more closely match the footprint of the observed eddy covariance based fluxes.

[61] Use of the locally observed land surface temperature without spatial scaling and footprint correction has significant implications for the validation of heat flux models. For example, Su et al. [2005] showed that errors in the land surface temperature are the main reason for discrepancies between modeled and simulated heat fluxes in the SMEX02 towers. However, they partially corrected such errors by modification and adjustment of the emissivity. In image-scale applications, footprint models [Leclerc and Thurtell, 1990; Schmid, 2002; Schuepp et al., 1990] have been used for correction of the in situ observed land surface temperature using remote sensing images [Kustas et al., 2006; Li et al., 2008; Timmermans et al., 2009]. In a footprint model, the observed sensible heat flux is related to the orientation and length of the footprint of a source area located in the upwind direction of the eddy covariance tower. Footprint models can characterize this source area (as a distance or region) based on the measurement height, aerodynamic surface roughness ( urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0277 and urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0278), and atmospheric stability [Bastiaanssen et al., 1998b]. However, length and orientation of the source area cannot be quantitatively used in adjustment of the local land surface temperature observations, unless a remote sensing image is available. Hence, with suitable refinement, the methodology developed in this paper could serve as a practical tool for quality control and evaluation of the tower-based land surface temperature observations and their spatial scaling.

[62] Although this study focused on the uncertainties of the meteorological variables, other uncertainties in the model structure and parameterization may exist. As such, the Bayesian-inferred values of the land surface temperature might be partly contaminated by the effect of such uncertainties. The importance of model structure and parameterization uncertainties is highlighted in a number of recent studies. Zhang et al. [2010] observed that the choice of urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0279 formula and the MOST function for temperature (urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0280) significantly influenced the agreement between sensible heat flux calculated for a scintillometer and an eddy covariance system. Also, van der Kwast et al. [2009] observed that the roughness parameters (urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0281, urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0282, urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0283) can cause large deviations in the modeling of sensible heat flux. In addition, Verstraeten et al. [2008] found that the estimation error due to the uncertainty of the roughness length for heat transfer is important: even more so than the uncertainty in temperature, wind speed, and stability correction. The Bayesian model of this study is sensitive to the number of priors and their interdependencies, and as such it is not practical to include the uncertainty of the model roughness parameters. In particular, the Su et al. [2001] method employed in SEBS for estimation of the roughness parameters (urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0284 and urn:x-wiley:00431397:media:wrcr20231:wrcr20231-math-0285) uses wind speed and air temperature as input variables. Hence, introducing roughness parameters as priors is likely to be problematic. However, careful measurement of the vegetation height and density during the SMEX02 field campaign suggests that uncertainties in the roughness parameterization are not likely to be significant in this study.
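The sensitivity of sensible heat flux to the roughness length for heat transfer can be illustrated with a neutral-stability bulk transfer sketch. This is a simplified stand-in and not the SEBS formulation, which also applies stability corrections: the function name, the constants for air density and specific heat, and all input values are assumptions chosen for illustration. The roughness length for heat is derived from the roughness length for momentum through the standard definition kB^-1 = ln(z0m / z0h).

```python
import math

VON_KARMAN = 0.4   # von Karman constant (dimensionless)
RHO_AIR = 1.2      # air density, kg m-3 (assumed constant)
CP_AIR = 1004.0    # specific heat of air, J kg-1 K-1 (assumed constant)


def neutral_sensible_heat_flux(ts, ta, u, z_ref, d0, z0m, kb_inv):
    """Sensible heat flux (W m-2) from a neutral bulk transfer relation.

    ts, ta : land surface and air temperature (same units, degC or K)
    u      : wind speed at the reference height z_ref (m s-1)
    d0     : displacement height (m)
    z0m    : roughness length for momentum (m)
    kb_inv : kB^-1 parameter, so that z0h = z0m / exp(kb_inv)
    """
    z0h = z0m / math.exp(kb_inv)
    z_eff = z_ref - d0
    # Aerodynamic resistance to heat transfer under neutral conditions (s m-1).
    r_ah = math.log(z_eff / z0m) * math.log(z_eff / z0h) / (VON_KARMAN ** 2 * u)
    return RHO_AIR * CP_AIR * (ts - ta) / r_ah


if __name__ == "__main__":
    # Hypothetical forcing and canopy values; not taken from SMEX02.
    for kb_inv in (2.0, 4.0, 6.0):
        h = neutral_sensible_heat_flux(30.0, 25.0, 3.0, 5.0, 1.3, 0.25, kb_inv)
        print(f"kB^-1 = {kb_inv:.0f}: H = {h:.0f} W m-2")
```

In this sketch a larger kB^-1 (a smaller roughness length for heat) increases the aerodynamic resistance and lowers the flux for the same temperature gradient, so an error in the roughness parameterization can be compensated by an error in the inferred surface temperature, which is the contamination effect noted above.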

[63] Preliminary evaluations indicate that BIT-SEBS is sensitive to the parameters of the prior PDFs. In the case of the Gaussian prior PDFs, the definition of the standard deviation of the input variables has a direct influence on the inference performance and convergence of the MCMC simulations. Likewise, the performance of the Bayesian technique in the estimation of the input variables depends upon the accuracy and validity of the prior information. This is especially important in hierarchical Bayesian inference, where the use of nonrepresentative priors can result in poor or meaningless posterior estimates. Therefore, in order to provide a quantitative measure of the spatial variability within these variables, we recommend that new installations of field-based eddy covariance measurements include a few additional spatially distributed instruments that measure the key meteorological variables such as land surface temperature, air temperature, and wind speed. Such instrumentation might include traditional point-based infrared and air temperature sensors located within the footprint of the eddy covariance tower, or more spatially representative devices such as the recently developed fiber-optic distributed temperature sensing networks [Selker et al., 2006]. For existing data sets with single tower observations, it is important to quantify the bounds of uncertainty (i.e., the standard deviation of the prior PDF) for each time stamp of observation. A first approximation might be to assume that the footprints of air temperature and wind speed are similar to that of the sonic instrument, while the footprint of land surface temperature is smaller than the sonic instrument footprint (i.e., low values for the prior standard deviations of air temperature and wind speed and a high value for the prior standard deviation of land surface temperature).
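The role of the prior standard deviations can be illustrated with a deliberately simplified, closed-form example. Here the nonlinear SEBS model and the MCMC sampler are replaced by a linear bulk relation H = c (Ts - Ta) with a fixed aerodynamic resistance, so that Gaussian priors and a Gaussian likelihood give an exact Gaussian posterior mean. Wind speed is omitted, and the function name and all numerical values are assumptions made for illustration only.

```python
RHO_AIR = 1.2      # air density, kg m-3 (assumed constant)
CP_AIR = 1004.0    # specific heat of air, J kg-1 K-1 (assumed constant)


def gaussian_update(ts_obs, ta_obs, sigma_ts, sigma_ta, h_obs, sigma_h, r_ah):
    """Posterior means of Ts and Ta for the linear model H = c (Ts - Ta).

    Priors: Ts ~ N(ts_obs, sigma_ts^2), Ta ~ N(ta_obs, sigma_ta^2), independent.
    Likelihood: h_obs ~ N(c (Ts - Ta), sigma_h^2), with c = rho cp / r_ah.
    r_ah is a fixed, assumed aerodynamic resistance (s m-1).
    """
    c = RHO_AIR * CP_AIR / r_ah
    var_ts, var_ta = sigma_ts ** 2, sigma_ta ** 2
    # Mismatch between the observed flux and the flux implied by the prior means.
    innovation = h_obs - c * (ts_obs - ta_obs)
    # Variance of the predicted flux plus the flux observation variance.
    s = c ** 2 * (var_ts + var_ta) + sigma_h ** 2
    ts_post = ts_obs + var_ts * c * innovation / s
    ta_post = ta_obs - var_ta * c * innovation / s
    return ts_post, ta_post


if __name__ == "__main__":
    # Hypothetical values: wide prior on Ts, narrow prior on Ta.
    ts_post, ta_post = gaussian_update(
        ts_obs=32.0, ta_obs=25.0, sigma_ts=2.0, sigma_ta=0.2,
        h_obs=150.0, sigma_h=20.0, r_ah=40.0,
    )
    print("Ts shift:", round(ts_post - 32.0, 3))
    print("Ta shift:", round(ta_post - 25.0, 3))
```

In this toy setting, each variable absorbs the flux mismatch in proportion to its prior variance, so a wide prior on land surface temperature and a narrow prior on air temperature direct almost all of the adjustment to the surface temperature. A nonrepresentative choice of prior standard deviations would redistribute the adjustment in the same way, which is why the priors need to reflect the actual spatial variability.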

6. Conclusions

[64] In this study, the uncertainties associated with input meteorological variables over a multitower site were quantitatively evaluated using a Bayesian inference scheme coupled with the SEBS model. Results confirm that the performance of physically based energy balance methods in heat flux estimation strongly depends upon the representativeness of the input meteorological variables. In particular, uncertainties in local observations of the land surface temperature have a considerable effect on the mismatch between the observed and modeled sensible heat fluxes over both soybean and corn fields. As such, the locally observed land surface temperature cannot be assumed to be spatially representative in the computation of the sensible heat flux observed at the tower scale: at least not without some prior spatial scaling. Characterizing this spatial variability of surface temperature using high-resolution remote sensing retrievals, and exploiting stand-alone tower data to inform the prior distributions of forcing uncertainty, are directions for further investigation, development, and application of the approach developed here.


Acknowledgments

[65] Funding for this research was provided jointly by an Australian Research Council (ARC) Linkage Project (LP0989441), a Discovery Project (DP120104718), and top-up scholarship support for Ali Ershadi from the National Centre for Groundwater Research and Training (NCGRT) in Australia. This work was also supported by the NCI National Facility at the ANU via the provision of computing resources to the ARC Centre of Excellence for Climate System Science.