Quantifying Contributions of Uncertainties in Physical Parameterization Schemes and Model Parameters to Overall Errors in Noah‐MP Dynamic Vegetation Modeling

Quantifying contributions of errors in model structure and parameters to biases in a land surface model (LSM) is critical for model improvement. This paper investigated the uncertainties in parameterizations and parameters in the Noah with multiparameterization (Noah‐MP) LSM with dynamic vegetation using eddy flux data. First, we conducted full factorial experiments of eight land subprocesses, followed by sensitivity analysis (SA) to identify the subprocesses for which possible parameterizations made significant difference. Then, based on the full factorial experiments and SA results, we selected the statistically optimal parameterizations combination and the most biased parameterizations combination. Lastly, we calibrated the parameters in two selected parameterizations combinations. The results showed that five subprocesses—surface exchange coefficient, soil moisture β threshold, radiation transfer, runoff and groundwater, and surface resistance to evaporation—had significant influence on the model performances, and the interactions were generally low but contributed up to 80% of the variation in the performance at some sites. In the optimization period, following the criterion by Moriasi et al. (2007, https://doi.org/10.13031/2013.23153 ), parameter optimization improved the performance of both parameterizations combinations at most sites to be satisfactory, and the superiority between two parameterizations combinations was preserved; in the validation period, adjusting the parameterizations was more robust than parameter optimization in improving LSMs. Finally, we found that uncertainty in soil parameters was much higher than that in vegetation parameters because the optimal soil parameters were significantly different among sites with the same soil types and recommended that spatially calibrating the soil parameters was a major issue for Noah‐MP dynamic vegetation modeling.


Introduction
Reliable simulation of the biological, physical, and hydrological processes in a land surface model (LSM) is critical for improving the accuracy of both numerical weather forecasts and climate predictions Gao et al., 2016). Most published studies suggest that transpiration accounts for the majority of water vapor flux from terrestrial land to the atmosphere despite significant uncertainties (Good et al., 2015;Jasechko et al., 2013;Miralles et al., 2011;Wei et al., 2017). Therefore, accurately representing the seasonality of vegetation is crucial for modeling the seasonal and annual exchanges of energy, water, and carbon between the land surface and the atmosphere (Arora, 2002;Prentice et al., 2007Prentice et al., , 2015. Implementing a dynamic vegetation model can potentially be beneficial to a more physically realistic and bioclimatically consistent model framework (Alo & Wang, 2010) and predicting the responses of vegetation to climate change at various temporal and spatial scales (Tang & Bartlein, 2008;Woodward & Lomas, 2004). In the 1990s, some LSMs predicted canopy amount and leaf area by adopting a submodel of vegetation dynamics, which consisted of a semiprocess-based photosynthesis model and an empirical respiration model (e.g., Dickinson et al., 1998;Foley et al., 1996;Thornton et al., 2002). Jiang et al. (2009) suggested that incorporating vegetation dynamics into the Noah LSM via a regional climate model is especially beneficial for seasonal precipitation forecasting in transition zones. Van den Hoof et al. (2011) developed a version of the joint UK land environment simulator (JULES) with a dynamic crop growth submodel, which had been shown to significantly improve the simulated land fluxes over cropland and well capture the spatial and temporal variabilities in growth conditions in Europe. However, because the dynamic vegetation models were developed with limited information, their predictions now are highly uncertain. In fact, vegetation dynamics has already been one of the largest sources of uncertainty in Earth system models (Purves & Pacala, 2008;Sitch et al., 2008). Several studies on LSMs reported that vegetation dynamics still need to be improved, especially regarding vegetation seasonality and long-term changes at regional and local scales (e.g., Anav et al., 2013;Bonan & Levis, 2006;Cadule et al., 2010;Murray-Tortarolo et al., 2013;Randerson et al., 2009;Richardson et al., 2012;Sitch et al., 2015).
The Noah with multiparameterization (Noah-MP) LSM has a significant advantage of taking into account model uncertainties because it incorporates different parameterization schemes (PSs) for various physical subprocesses in the same model dynamic structure Yang et al., 2011), and therefore, with a specific combination of PSs, it can be regarded as a surrogate of another LSM. Noah-MP has been widely used as a stand-alone LSM, and it was also employed as one of the available LSM options in the Weather Research and Forecasting (WRF) model (Skamarock et al., 2008), which has a large worldwide community of registered users. A number of researches have showed the advantage of the Noah-MP LSM in reproducing the water and energy fluxes between the land surface and atmosphere (Barlage et al., 2015;Cai et al., 2014;Chen et al., 2014;Gao et al., 2015;Salamanca et al., 2018;Zhuo et al., 2019). However, among these studies, the Noah-MP LSM used default parameterizations in which the dynamic vegetation was turned off due to its significant error. For example, Ma et al. (2017) found that the Noah-MP LSM with dynamic vegetation (Dickinson et al., 1998;Yang & Niu, 2003) overestimated gross primary productivity (GPP) by 40% and evapotranspiration by 22% over the whole CONUS domain. Therefore, the main goal of this study is to quantify and attribute the errors in the Noah-MP LSM with dynamic vegetation for its future improvement and development.
There are three primary causes of errors in LSMs (Clark et al., 2011;Duan et al., 2006;Wang et al., 2009): (1) an imperfect model structure; (2) inappropriate model parameter values (PVs); and (3) incorrect model inputs, including meteorological forcing data and initial surface conditions. Obviously, parts of the errors in the Noah-MP LSM with dynamic vegetation are model structural errors caused by missing physical features. For example, Wang et al. (2018) recently implemented a vegetation optimality model of root dynamics in the Noah-MP LSM to model the extraction of groundwater through phreatophytic roots, which substantially improved the simulation of surface energy and water fluxes at a hyperarid site. Gim et al. (2017) developed a new PS to characterize the allocation of assimilated carbon to plant parts, including leaves and fine roots. In particular, interactive nutrient cycling has not been included in most dynamic vegetation models, although it was found that the long-term plant response to elevated CO 2 is likely affected by nutrients and its impact on plant carbon allocation (Zaehle et al., 2014).
In addition, the land model errors caused by the first two sources can also be reduced by tuning the PSs  and PVs (Wang et al., 2001, respectively. For example, Hong et al. (2014) and Zhang et al. (2016) used a natural selection approach to obtain the optimal combination of PSs in the Noah-MP LSM for the Han River basin in South Korea and a cropland site on the Tibetan Plateau in China, respectively.  found that up to 65% of model errors can be reduced by parameter optimization for most of the plant functional types (PFTs) in two LSMs. Before optimizing the PSs or PVs, sensitivity analysis (SA), that is, the study of how a given model depends on the factors fed into it (Saltelli et al., 2008), is typically used to screen out the most sensitive subprocesses Zheng et al., 2019) or parameters (Gan et al., 2015;Li et al., 2013Li et al., , 2016Lu et al., 2013) in LSMs.
However, the uncertainties in the PSs and PVs always interacting has largely been ignored, which causes the optimal values of parameters, especially key parameters, to vary with the model structures. Although the Noah-MP LSM was frequently used to study the sensitivity of the simulated surface fluxes to particular subprocesses Li et al., 2019;Zhang et al., 2016;Zheng et al., 2019), few studies focused on parameter optimization for the Noah-MP LSM in which all the PSs for each subprocess share the same PVs. Therefore, the main goal of this study is to assess and reduce the errors in the Noah-MP LSM with dynamic vegetation relative to the background structural and parametric uncertainties. The objectives are to (1) quantify the impacts of the PS uncertainties on model performance; (2) investigate the optimal combination of PSs for the Noah-MP LSM with dynamic vegetation; and (3) calibrate the parameters in different combinations of PSs.
Here, as a follow-up study by Li et al. (2019), we conducted physics-ensemble simulations of eight subprocesses, including soil moisture β threshold (BTR), runoff and groundwater (RUN), surface exchange coefficient (SFC), supercooled liquid water in frozen soil (SUP), frozen soil permeability (FRO), radiation transfer (RAD), snow albedo (SNO), and surface resistance to evaporation (SRE) at multiple FLUXNET sites. We computed and evaluated the variance-based sensitivity indices (Homma & Saltelli, 1996;Sobol', 2001), including the total effects and interactions, of these eight subprocesses for simulated sensible heat fluxes (SH), latent heat fluxes (LH), net absorbed radiation (Rnet), and GPP. Furthermore, we intercompared the PSs to determine not only the optimal combination of model parameterizations but also the combination of PSs that has a significant difference from the optimal combination. The parameters in these two combinations of PSs were then calibrated by a popular automatic optimization method, the shuffled complex evolution (SCE-UA) algorithm (Duan et al., 1993;Naeini et al., 2019). The model errors caused by different sources and the robustness of model improvement were examined and compared.
The remainder of this paper is organized as follows. Section 2 briefly summarizes the Noah-MP LSM with dynamic vegetation, selected PSs, and eddy covariance flux data; then, section 3 describes the various methods used in this study. Section 4 presents the results, and section 5 shows some associated discussions and conclusions.

Model and Data
As a follow-up on our previous work, this study used almost identical LSM and data sets as Li et al. (2019). Brief descriptions were given as follows.

Eddy Covariance Flux Data
This study selected the Tier-1 sites from the FLUXNET2015 data release (Baldocchi et al., 2001). The eddy covariance flux data at these sites contained hourly meteorological variables, including precipitation, surface atmospheric pressure, air temperature, air specific humidity, wind speed, downward shortwave radiation, and downward longwave radiation, which were used to drive the Noah-MP LSM, along with hourly SH, LH, Rnet, and GPP, which were used to evaluate model performance.
Two criteria were used to select the flux sites: (1) The measurements should cover at least two consecutive years. For each site, we chose the first year as the optimization period in which the PSs or PVs were tuned to reduce the model errors, and the remaining years were defined as the validation period in which the Noah-MP LSM was run with the optimal PSs or PVs to examine the robustness of the improvement. (2) In any given year, the proportion of missing data should be less than 15%. The meteorological forcing variables were gap filled using ERA-Interim (Dee et al., 2011;Vuichard & Papale, 2015) estimates provided as part of the FLUXNET2015 data set.

The Noah-MP LSM With Dynamic Vegetation
The Noah-MP LSM (version 4.0.1) was developed using multiple parameterization options for key land-atmosphere interaction processes, such as vegetation phenology, canopy stomatal resistance, runoff, and groundwater (Chen et al., , 1997Chen & Dudhia, 2001;Ek et al., 2003;Koren et al., 1999;Schaake et al., 1996). Details concerning the PSs for each subprocess can be found in Niu et al. (2011). The PSs selected to conduct physical ensemble simulations in this study are shown in Table 1.
The dynamic vegetation parameterization in the Noah-MP LSM required the use of the Ball-Berry scheme (Ball et al., 1987). It incorporated the leaf area index (LAI) growth model of Dickinson et al. (1998). This LAI growth model has two major parts: one is a stomatal conductance-photosynthesis part, which is based on Collatz et al. (1991), with photosynthesis being calculated for both sunlit and shaded leaves, and a dynamic leaf part. Another is concerned with the changes in carbon allocation, respiration, and vegetation phenology. The Noah-MP LSM added a stem-carbon-balance equation for simulating stem-rich plants (Yang & Niu, 2003). Thus, the Noah-MP LSM accounts for biochemical processes, including carbon assimilation through photosynthesis, allocation of assimilated carbon to various carbon pools (i.e., leaf, stem, wood, root, and soil), and respiration from each of the carbon pools. The leaf carbon mass,C leaf (g m −2 ), is computed from the following: where A is the total carbon assimilation rate of the sunlit and shaded leaves (g m −2 s −1 ); F leaf is the fraction of the assimilated carbon allocated to the leaf; S cd is the death rate due to cold and drought stresses; T leaf is the rate of leaf turnover due to senescence, herbivory, or mechanical loss; and R leaf is the leaf respiration rate, including maintenance and growth respiration. F leaf is parameterized as a function of the LAI, and the LAI is converted from C leaf using a specific leaf area (m 2 g −1 ), which is a vegetationtype-dependent parameter. The vegetation greenness fraction is then simply converted from the LAI:  (Niu et al., 2007) 3: The Noah type  4: The biosphere atmosphere transfer scheme (BATS) type (Dickinson & Kennedy, 1993;Yang & Dickinson, 1996) SFC 1: The Monin-Obukhov scheme (Brutsaert, 1982) 2: The Noah type (Chen et al., 1997) SUP 1: General form of the freezing-point depression equation (Niu & Yang, 2006) 2: Koren's iteration (Koren et al., 1999) FRO 1: Hydraulic properties from total soil water and ice (Niu & Yang, 2006) 2: Nonlinear effects, less permeable (Koren et al., 1999) RAD 1: Canopy gaps from 3-D structure and solar zenith angle (Niu & Yang, 2004) 2: No canopy gap (Niu & Yang, 2004) 3: Gaps from vegetated fraction (Niu & Yang, 2004) SNO 1: The BATS type (Dickinson & Kennedy, 1993) 2: The Canadian land surface scheme (CLASS) type (Verseghy, 1991) SRE 1: Sakaguchi and Zeng's scheme (Sakaguchi & Zeng, 2009) 2: Sellers's scheme (Sellers et al., 1992) 4: Different surface resistance for non-snow and snow This study designed a full factorial experiment taking on all possible combinations of these schemes to conduct a physical ensemble simulation, so that the total ensemble size for each site was 1,296. The ensembles allowed for the calculations of the effect of each subprocess and their interactions.
The parameters to be optimized in this study are shown in Table 2. It has been reported that for LSMs, the key parameters of simulated carbon and water fluxes are consistent (Li et al., 2016). Therefore, this study did not carry out the SA for the model parameters but selected the key parameters and defined their ranges based on previous studies. For example, Arsenault et al. (2018) found that the SH, LH, and net ecosystem exchange were sensitive to the parameters BB (Clapp-Hornberger b parameter), MAXSMC (saturated value of soil moisture), VCMX25 (maximum rate of carboxylation at 25°C), LTOVRC (leaf turnover), RMF25 (leaf maintenance respiration at 25°C), SLA (single-side leaf area per kg) and QE25 (quantum efficiency at 25°C) for the Noah-MP LSM with dynamic vegetation, which were generally consistent with Lu et al. (2013), Li et al. (2016), and Cuntz et al. (2016).

RMSE-Observations Standard Deviation Ratio
The root-mean-square error (RMSE)-observations standard deviation ratio (RSR) has been widely used to evaluate the performance of hydrological models (Ang & Oeurng, 2018;Golmohammadi et al., 2014) and LSMs . It was developed by Moriasi et al. (2007) to combine both the RMSE error index and the additional observational information recommended by Legates and McCabe (1999).
For a land flux Y at each site, the RSR is defined as the ratio of the RMSE to standard deviation of observations: where Y sim i and Y obs i are the simulated and observed Y at the ith time step, respectively; Y obs is the mean of the observed Y; and n is the number of time steps.
To evaluate multiple variables simultaneously, the weighted mean values of the RSRs were calculated, which were RSR K for K model outputs Y 1 ,…,Y K , and RSR M for K model outputs at N sites:

Journal of Advances in Modeling Earth Systems
LI ET AL. and where the weight ω j was set to be 1/4 for RSR K and ω i was set to be 1/92 for RSR M , as there were four variables (i.e., SH, LH, Rnet, and GPP) and 92 flux sites.
The RSR index is very similar to the well-known Nash-Sutcliffe efficiency coefficient (Nash & Sutcliffe, 1970), and therefore, many literatures can be referred to reflect model performance (e.g., Gupta & Kling, 2011). This study followed the criteria given by Moriasi et al. (2007) to evaluate the model performance, which were as follows: The model prediction was very good if 0 ≤ RSR ≤ 0.5, good if 0.5 < RSR ≤ 0.6, satisfactory if 0.6 < RSR ≤ 0.7, and unsatisfactory if RSR > 0.7.

Variance-Based SA and Subprocess Screening
The field of SA contains a broad family of methods. Overall, they can be categorized into local SA that explores the changes of model response by varying one factor while keeping other factors constant and global SA that seeks to characterize the properties of a response surface across the entire factor space (Gan et al., 2014;Razavi & Gupta, 2015). Obviously, the global SA methods are more appropriate for this study than the local ones, because for a land subprocess, each of the PSs and all its possible combinations with PSs for other subprocesses should be comprehensively examined to confirm whether it is significantly different from the other PSs. This is also why we designed a full factorial experiment taking on all possible combinations of these PSs to conduct a physical ensemble simulation. Razavi and Gupta (2015) reviewed that the global SA approaches were evolved from model-based methods that assumed particular model forms (typically linear or polynomial) for the underlying response surfaces, to model-free methods that make minimal assumptions regarding the functional form of the response surface. Due to the characteristics of high dimension and nonlinearity for land surface modeling, this study adopted a model-free SA method to screen out the key land surface subprocesses from the insensitive ones.
The derivative-based Morris (Campolongo et al., 2007;Morris, 1991), the variance-based Sobol' (Sobol', 1990(Sobol', , 2001, and the variogram-based analysis of response surfaces (VARS) (Razavi & Gupta, 2016a, 2016b are model-free global SA methods and have been widely used for complex dynamical system models. Among them, the Sobol' method quantifies the contributions of each factor and each possible two or higher-order interaction to the total variance as the sensitivity indices. The only shortcoming of the Sobol' method is possibly that it suffers from poor computational efficiency, particularly as the problem dimension grows. For example, Rosolem et al. (2012) used 45,000 model runs to assess the Sobol' sensitivities of 42 parameters in the Simple Biosphere 3 (SiB3) model.
As each subprocess in the Noah-MP LSM contained three PS options at most, it was affordable for us to carry out the Sobol' SA based on a full factorial experiment to identify the key subprocesses and investigate the interactions of these subprocesses on LSM behavior.
Here, the use of the SA aims to analyze the contributions of the uncertainties in the PSs of eight physical subprocesses, that is, to the RSRs of model output Y. For any set of subprocesses X, the variance in Y, V Y , can be written as where V X [E(Y| X)] is the variance in the conditional expectation and E X [V(Y| X)] is the residual part (Saltelli et al., 2008).
First, we calculated the sensitivity indices, including (1) the first-order sensitivity, which is the main effect of X i on the output (the fractional contribution of X i to the variance in Y); (2) the second-order sensitivity,

Journal of Advances in Modeling Earth Systems
which is the interactive effect (the proportion of the variation in Y due to X i and X j but cannot be explained by the sum of the first-order effects of X i and X j ); and (3) the total effect, which is the sum of all sensitivity indices (first and higher orders) involving X i , where X −i is all subprocesses other than X i (Homma & Saltelli, 1996).
Having obtained the total effects, we then used the Pareto rank-based approach introduced by Rosolem et al. (2012) to differentiate the most sensitive subprocesses from the insensitive ones for the RSR K defined by Equation 4. This approach assigns a Pareto rank r (Gupta et al., 1998) to each subprocess on the basis of a simultaneous maximization of the total effect indices for all of the objective functions. Further details can be found in Rosolem et al. (2012) and Li et al. (2019).

The SCE-UA Method
The SCE-UA method is a globally based automatic search algorithm proposed by Duan et al. (1993) that was designed to combine the strengths of global and local search methods, such as genetic algorithms and the simplex method, with the concepts of complex partitioning and complex shuffling. Duan et al. (1994) conducted a detailed investigation on the selection of the SCE-UA algorithmic parameters and found that the most important parameter in SCE-UA was the number of complexes, which was dependent on the complexity of the problem. As suggested by Duan et al. (1994), this study set a value of two for the complex number to calibrate the seven parameters in the Noah-MP LSM. The SCE-UA algorithm has three independent convergence criteria. The first one is the maximum number of model evaluations during optimization (set to 3,000). The second one is the percentage of objective function improvement during the last 10 evolution loops (set to be 0.1). The third is the percentage of the normalized geometric range of the population (set to be 0.001). The searching process will stop if the population distribution range is within a small percentage such that all of the population members do not have significant differences.

Journal of Advances in Modeling Earth Systems
Numerous researchers have investigated the use of the SCE-UA for watershed model calibration purposes (Hogue et al., 2000;Luce & Cundy, 1994). The SCE-UA has been shown to be consistently more efficient and robust when it was compared against a variety of search methods (Franchini et al., 1998;Gan & Biftu, 1996;Kuczera, 1997;Thyer et al., 1999). Naeini et al. (2019) reviewed the history of the development and application of the SCE-UA method and showed that it had great potential to become more prominent in the field of optimization. Figure 1 shows the total effects of the PS uncertainty in each subprocess on the RSRs of SH, LH, Rnet, and GPP at the sites, as sorted by PFTs. It can be found that, for each site, the performance of each simulated flux was dominated by one or two subprocesses, with values of total effects higher than 0.5. However, the total effects and the dominant subprocesses may be different across the sites, even for those within the same PFTs. Generally, the results showed that (1) the RSRs of SH and Rnet were the most sensitive to the SFC and RAD subprocesses;

Impacts of PS Uncertainty
(2) the RSRs of LH and GPP were the most sensitive to the BTR, RUN, SFC, and RAD subprocesses; (3) the SNO and FRO subprocesses appeared to be sensitive only at those sites with high elevations (e.g., US-GLE, CN-Dan, IT-MBo, and CN-Ha2), while the SUP subprocess was generally insensitive at all sites; and (4) for LH and GPP, the SFC subprocess was more sensitive at the forest-type sites, and for SH and Rnet, the RAD subprocess was more sensitive at the forest-type sites, especially the ENF sites.
As shown in Figure 2, for multiple surface fluxes at the flux sites, the percentages of the subprocesses in each Pareto rank group for different seasons were calculated. The results of the rank assignment were different among seasons, especially between the warm and cold seasons. For example, the SUP, FRO, and SNO subprocesses were assigned to rank 1 at much more sites in spring and winter than in summer and autumn. However, at most sites, SFC, BTR, and RAD were assigned to the rank 1 for all seasons, while SUP, FRO, and SNO were rarely in rank 1, indicating that the selection of these PSs has less impact on the performance of the Noah-MP LSM with respect to the global SH, LH, Rnet, and GPP. Figure 3 presents the interactive effects among BTR, RUN, SFC, RAD, and SRE on the performances of the surface fluxes. Generally, the interactions of the key subprocesses were low for the flux sites, with values less than 10%. Relatively, the interactions between SFC and RAD (SFC_RAD) had higher impacts on the RSRs of SH, LH, and Rnet, and the interaction between BTR and RAD (BTR_RAD) had a higher impact on the RSR of the simulated GPP. However, for some sites, these interactions can be quite significant and much higher than the main effects of each subprocess. For example, the interactive effect of SFC_RAD at the IT-Lav ENFsite on the RSR of SH was higher than 80%, and the interactive effect of BTR_SFC at the IT-SRo ENF-site on the RSR of GPP was higher than 70%.

Selection of the PSs
Li et al. (2019) found that two interactive subprocesses should be jointly analyzed to identify the optimal PS. Therefore, we selected the optimal PS for the sensitive subprocesses by comparing the conditional box plots of the mean RSR at all sites with respect to the two given schemes for

BTR1 performed better than the other options of BTR (see
From the results, the performances of the schemes for each subprocess were closely interacted with the choices of the schemes for some other subprocesses. For example, SRE2 performed best if option 1 was chosen for SFC but performed worst if SFC2 was selected. Since BTR1 was shown to be the best choice and the BTR subprocess had higher sensitivity for the mean RSR at all sites, with the shortest lengths of the box plots, the optimal PSs for both the RAD and SRE subprocesses were selected as option 1 from Figures 4c and 4d.
Therefore, the recommended PSs-namely, global optimal schemes-for the Noah-MP LSM with dynamic vegetation were BTR1, RUN3, SFC1, SUP1, FRO2, RAD1, SNO2, and SRE1, which is known as "combination A." To calibrate the parameters in different model structures, we also selected the PS combination with significant differences from the optimal combination, that is, BTR3, RUN1, SFC2, SUP2, FRO1, RAD2, SNO1, and SRE2, which is known as "combination B."

Parameter Optimization for Different PS Combinations
For combinations A and B, the model parameters were calibrated by the SCE-UA method to minimize the mean RSRs of the four surface fluxes at the selected sites. Figure 5 presents a kernel density (estimation of the probability density function) plot of the RSRs for the physics ensemble simulations at each site, as associated with the RSRs calculated from combinations A and B with the default and optimized PVs.
Following the criterion by Moriasi et al. (2007) that model prediction is unsatisfactory if the RSR is greater than 0.7, it was found from the kernel density plots that for 66.3% of the selected sites, tuning the PSs can make the model performance satisfactory or better than satisfactory. For 85.9% of the sites, tuning the PVs for both combinations A and B can make the model performance satisfactory or better than satisfactory. For most sites, the superiority of combination A over combination B is preserved after parameter optimization. Even for combination B, parameter optimization is an effective way to improve the model performance from satisfactory or good to very good. Then, we examined whether the optimal PVs significantly varied with different model structures. Figure 6 shows the differences between the optimal and default values of the vegetation (a) and soil (b) parameters for combinations A and B. In Figure 6, the difference was calculated as where v opt , v def , v max , and v min were the optimized, default, maximum, and minimum values of one

Journal of Advances in Modeling Earth Systems
parameter, respectively; the sites in Figure 6a was sorted by PFTs, and those in Figure 6b was sorted by STTs; Each subplot contains two bars: The top bar represents the difference between the optimal and default PVs for combination A, and the bottom bar represents that for combination B. It can be found that (1) For the vegetation parameters, the optimal values for combination A were close to those for combination B; moreover, for each PFT, the optimal vegetation PVs were generally close (e.g., the parameter, QE25, was underestimated by the default values for the PFTs except for ENF), indicating that it was practical to use the same PFT-specified vegetation PVs for different model structures.
(2) For the soil parameters, the difference between the optimal values for combinations A and B was significant, and for almost every STT, the obtained soil PVs from optimization were significantly different, indicating that the uncertainty in soil parameters caused a large portion of the parametric uncertainty and that the use of the same values of soil parameters among diverse model structures may increase the parameter errors.

Comparison of Improvements From Tuning PSs and PVs
In this section, we focus on comparing the improved model performance from tuning the PSs and PVs. For each site, the reference simulation is that using the combination of PSs with low performances (e.g., combination B for the ES-LgS site and combination A for the US-Myb site) without parameter optimization, which was then compared to the simulations with adjusted PSs and PVs at the seasonal scale to examine the detailed seasonal model improvement from these two simulations (see Figure 7 and Table 3) and during the optimization and validation periods to verify robustness (see Figure 8).
First, the sites for which tuning the PSs and PVs can make equivalent reductions in the annual RSR of multiple fluxes were selected. Then, for these sites, Figure 7 shows the reduced mean RSRs in different seasons by tuning the PSs, PVs, and both the PSs and PVs. It was clear that tuning both the PSs and PVs (i.e., calibrating the model parameters along with choosing better physics parameterizations) was the most effective in improving the seasonal land simulations. We also noticed that tuning only the PSs or PVs could hardly improve the seasonal simulations for some sites. For example, tuning the PSs cannot improve the winter simulations for the US-Me6 site, and tuning the PVs cannot improve the summer simulation for the AU-Das site.
Therefore, we counted the frequencies of the events where little improvement was made when only the PSs or PVs were tuned (Table 3). Here, we calculated the percentages of the reduced RSR, RSR 0 − RSR 1 RSR 0 × 100%, where RSR 0 is the RSR calculated from the reference experiment, and RSR 1 is the corresponding RSR after the PSs or PVs were tuned. It was assumed that the percentages of the reduced RSRs with a value of less than or equal to 10% meant that the model improvement made by tuning PSs or PVs was not significant. Table 3 shows that (1) for most sites, either approach can improve the seasonal performance significantly; (2) generally, by tuning the PVs, the performance at more sites can be improved than by tuning the PSs between combinations A and B; (3) it can be inferred that a nonnegligible part of the error in the winter simulations was attributed to missing physical features in the Noah-MP LSM, as both tuning the PSs and PVs lost efficacy at 34.78% of the sites. Figure 8 can be divided into three areas: The first area is where the reduced RSRs during the validation period are greater than both the value of 0 and the corresponding values during the optimization period; the second area is where the reduced RSRs during the

Journal of Advances in Modeling Earth Systems
validation period are greater than 0 but smaller than the corresponding values during the optimization period; and the third area is where the reduced RSRs during the validation period are smaller than 0. Thus, we found that (1) most of the sites were in the second area, indicating that by tuning both the PSs and PVs, the model improvement tended to degrade during the validation period; (2) generally, tuning the PSs made a more robust model improvement. By tuning the PVs, the degradations in the improvements were more significant than those by tuning the PSs, and there was one site at which tuning the parameter reduced the RSR by up to 1.1 in the optimization period, but use of the optimized PVs increased the RSR during the validation period.

Discussion and Conclusions
To understand and reduce the errors in the Noah-MP LSM with dynamic vegetation, we first assessed the impacts of uncertainties in the PSs of the land surface subprocesses on model performance and then selected the statistically optimal PS combination and the combination with significant differences from the optimal combination, that is, combinations A and B, respectively. Finally, we calibrated the key parameters in combinations A and B with the SCE-UA method and compared the model improvement made by tuning the PSs and PVs. The main findings can be summarized as follows: 1. Uncertainties in five subprocesses, SFC, BTR, RAD, RUN, and SRE, had the most significant impacts on the performances of the simulated SH, LH, Rnet, and GPP by the Noah-MP LSM with dynamic vegetation; 2. Generally, the interactions of the key subprocesses were low, with values less than 10%. However, for some sites, the interactions can be quite significant with a value of greater than 80%; 3. Combining the adjustments of both PSs and PVs provides the most effective way to reduce modeling errors in the Noah-MP LSM with dynamic vegetation, particularly at the seasonal scale. Compared to tuning the PVs, tuning the PSs made more robust model improvements. More detailed findings obtained from jointly exploring the uncertainties in the PSs and PVs can be found in the discussions.
In contrast to Li et al. (2019) focusing on the reasons for model differences, this study stressed the errors in the Noah-MP LSM with dynamic vegetation, but generally, the results of the parameterization sensitivities were consistent with previous studies. For example, both this study and Li et al. (2019) agreed that the uncertainty in SFC greatly impacted the energy and water balance. The same conclusion can be found in Niu et al. (2011), Zhang et al. (2016), and Zheng et al. (2019, although the quantitative sensitivity of SFC varied with region due to diverse climatic conditions. Hong et al. (2014) and Gan et al. (2019) noted that evapotranspiration simulated by the Noah-MP LSM with dynamic vegetation is sensitive to RAD, SFC, RUN, and BTR, which is consistent with our results. Similar to the fact that the key parameters are generally the same for different LSMs (Li et al., 2016), the results of the parameterization sensitivities are not model dependent because, to a certain extent, the Noah-MP LSM with a specific combination of PSs can be regarded as a surrogate of another LSM. Therefore, finding the sensitive subprocesses of the Noah-MP LSM can provide scientific references for other LSMs with dynamic vegetation.
However, caution must be taken for the results of PS sensitivities. Gupta and Razavi (2018) argued that performance metric-based methods for global SA should be recognized as actually being identifiability analyses but not SA. In this sense, this study told more about the PSs for which subprocesses can be perturbed to improve model fit to the observed data rather than finding out the most important land subprocesses. Gupta and Razavi (2018) also showed that, due to temporal variations in the strengths of the climatic factors controlling the system drivers, use of a strictly ranked order of importance approach to characterizing sensitivity of model response to parameters is suspect, which was consistent with Prihodko et al. (2008) and . Although it was also shown that the rank assignments of land subprocesses were different among seasons, it should be admitted that the RSR used in this study integrated the daily response in a whole year onto a single metric, which unavoidably distorted important information of the Noah-MP LSM about PS sensitivities. Razavi and Gupta (2019) developed a multimethod generalized global sensitivity matrix approach that enabled a comprehensive multivariate investigation of the time-varying influence of parameters and forcings on different modeled state variables and responses, which can be an ideal choice for further SA of PSs and PVs for the Noah-MP LSM.

10.1029/2019MS001914
The results of the optimal set of PSs may differ from previous studies. For example, Zhang et al. (2016) found that for the simulated SH and LH at a crop site in Tibet, RUN3 performed worse compared to RUN1 and RUN2, and SFC2 performed better than SFC1, which are completely contrary to the conclusions of this study. Two possible explanations are that the optimal sets of PSs vary with region  and that the optimal sets of PSs are different for different land fluxes. Hong et al. (2014) found that over the Han River basin, SFC2, RUN3, and RAD1 comprised the optimal scheme combination for evapotranspiration, but SFC1, RUN4, and RAD2 comprised the best scheme combination for runoff. As this study focused on multiple land fluxes from FLUXNET, the obtained conclusion was an average result depending on which variables and observed sites were chosen. It should be admitted that, given that the approaches we used depended on availability of observational data, the provided analysis was necessarily incomplete .
To date, few studies have explored or quantified the interaction effects of land surface subprocesses (Jiang et al., 2009;Koster & Milly, 1997;Li et al., 2019). Although only 92 sites were selected, it was found that the impacts of these interactions on the performances of SH and GPP can be greater than 80% at some sites, which should be taken seriously because for a region where two land surface subprocesses are highly interacted, only modifying one of the two subprocesses is insufficient for model improvement.
One of the characteristics of this study was that it jointly took the uncertainties of the PSs and PVs into consideration. Several interesting findings were obtained here. First, the superiority of either scheme combinations A or B over another was generally preserved after the process of parameter optimization. This finding indicates that to improve the land model performance, it is reasonable to tune the PS first and then calibrate the parameters in the corresponding PS. Parameter optimization should be regarded as a tool to compensate not only the model structural differences (Bonan et al., 2011;Chen et al., 2011) but also the structural error caused by missing physical features.
Second, modifying the PSs associated with calibrating the PVs can generally reduce the errors in the Noah-MP LSM with dynamic vegetation to a satisfactory level. However, for some sites (e.g., US-Los and US-Myb), tuning PSs or PVs was insufficient to improve the model due to missing physical features; inappropriate descriptions of surface conditions, such as land cover and soil types (Li, Duan, et al., 2018); or errors in the forcing data sets (Xia et al., 2005), and observations (Wilson et al., 2002;Xin et al., 2018). Attributing the errors of global LSMs to various sources is still a challenge. This study found a significant difference in the attribution of land model errors across the selected sites so that there are no proportions to approximately estimate the attributions of land model errors. This is the reason why some previous studies obtained opposite conclusions. For example, Abramowitz et al. (2007) suggested that efforts to improve the representation of fundamental processes in LSMs, rather than parameter optimization, were key to the development of LSM capability, while Li et al. (2016) showed that 60% of the model errors can be reduced by tuning the most sensitive parameters.
Third, although parameter optimization is an effective approach to reduce the errors in the Noah-MP LSM with dynamic vegetation, the optimal values of the parameters, especially the soil parameters, are still quite uncertain. It was found that for almost each STT, the obtained optimal soil PVs for the studied sites were significantly different, which is consistent with Kishné et al. (2017), who examined the Noah default soil hydraulic parameters and compared them to soil measurements across Texas. They noted that 95% of the default soil parameters were significantly different from the region-specific measured values and that it was vital to account for the spatial variability in soil parameters rather than combine these parameters into texture types alone. Moreover, it is a signal of overfitting by optimization that the optimal soil PVs differed significantly for different combinations of PSs. This can also be inferred from the finding that tuning PVs resulted in less robust model improvements than tuning PSs in this study. Brynjarsdóttir and Hagan (2014)) demonstrated that if the model structural errors were not sufficiently considered, a science-based model may be overfitted by optimization, but they also noted that only realistic priors on the model structural errors revealed the true PVs. This study inferred that the structural error in combination A was generally smaller than that in combination B, but because missing physical features made the Noah-MP LSM with combination A not a "truth" model, we did not adopt the methods by Zhang et al. (2015) to calibrate the LSMs with biased physical features. Some previous studies assumed that the structural error was normally distributed and then ruled out input parameter settings (McNeall et al., 2016;Williamson et al., 2013) or calibrated the parameters (Hutton & Kapelan, 2015;Xu et al., 2017) based on the Bayesian theory. These novel methods provide a solution to real-world modeling applications for which the model structural error exists but cannot be quantified or corrected, but to the best of our knowledge, their superiority to traditional optimization methods has not been fully demonstrated yet, and acceptable prior information of model structural errors is still necessary.

Journal of Advances in Modeling Earth Systems Data Availability Statement
The FLUXNET2015 data set is available at this site (http://fluxnet.fluxdata.org/data/fluxnet2015-Dataset). The global soil particle-size data set is available at this site (http://globalchange.bnu.edu.cn/research/soilw).

Appendix A: Supporting Table
In this appendix, we provide information on the FLUXNET sites used in this study (Table A1).