Volume 59, Issue 12 e2023WR035513
Research Article
Open Access

Synergistic Effects of High-Resolution Factors for Improving Soil Moisture Simulations Over China

Peng Ji

Key Laboratory of Hydrometeorological Disaster Mechanism and Warning of Ministry of Water Resources, Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing, China

State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing, China

School of Hydrology and Water Resources, Nanjing University of Information Science and Technology, Nanjing, China

Contribution: Conceptualization, Methodology, Software, Formal analysis, Writing - original draft

Xing Yuan

Corresponding Author

Key Laboratory of Hydrometeorological Disaster Mechanism and Warning of Ministry of Water Resources, Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing, China

School of Hydrology and Water Resources, Nanjing University of Information Science and Technology, Nanjing, China

Correspondence to:

X. Yuan,

[email protected]

Contribution: Conceptualization, Resources, Writing - review & editing, Supervision, Funding acquisition

Yang Jiao

Key Laboratory of Hydrometeorological Disaster Mechanism and Warning of Ministry of Water Resources, Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing, China

School of Hydrology and Water Resources, Nanjing University of Information Science and Technology, Nanjing, China

Contribution: Software, Writing - review & editing

First published: 15 December 2023

Abstract

Understanding how advanced land surface models and high-resolution model inputs (e.g., meteorological forcings and soil parameters) contribute to root-zone soil moisture (RSM) simulations has critical implications for both model and data development. Previous works have investigated the influences of these factors separately, without considering the interdependence between multiple factors (e.g., the positive impacts of high-resolution forcings may be reduced by coarse-resolution parameters). To date, how to quantify this interdependence, and how important it is relative to the independent effects, remains to be investigated. Here, we propose a framework to quantify the independent and interdependent (synergistic) effects of forcings, parameters and models on high-resolution RSM modeling using ensemble simulations. Forty-eight RSM simulations with different high-resolution factors that are superior in both spatial resolution and data accuracy are performed over China during 2013–2017, and observations from 1,553 stations across different climate zones are used for evaluation. Results show that the increase in Kling-Gupta efficiency (KGE) after combining different high-resolution factors is larger than the sum of the increases obtained with individual factors. Such synergistic effects dominate the improvement of high-resolution modeling at national and regional scales, and contribute to consistent improvements in the simulated mean state and variability of RSM. At the station scale, although the independent effect increases over western China, the synergistic effect contributes 42%–60% of the improved KGE over eastern China. The positive effects of an individual high-resolution factor on RSM modeling could be reduced by 25%–80% without synergistic effects, indicating that the synergistic development of models, meteorological forcings and soil parameters can facilitate high-resolution RSM modeling more efficiently than focusing on a single factor.

Key Points

  • Independent and synergistic effects of model, forcing and parameter on high-resolution soil moisture (SM) simulations are estimated

  • Using only one high-resolution factor has limited improvement in SM simulations due to uncertainties from other coarse-resolution factors

  • The synergistic effect of multiple high-resolution factors explains 25%–80% of the improvement in high-resolution SM modeling over China

Plain Language Summary

Facilitated by high-resolution meteorological forcings, soil parameters and numerical models, land surface modeling is an efficient way to provide locally relevant root-zone soil moisture (RSM) for agricultural management and drought monitoring. Previous works have quantified the influences of different high-resolution factors using sensitivity experiments under an independence assumption (e.g., the added value of high-resolution forcing stays constant whether or not high-resolution parameters are used), so as to find the factor that improves RSM modeling most efficiently. However, whether the independence assumption is appropriate remains to be investigated. This study develops a new framework to separate the interdependent influences among multiple high-resolution factors from the independent effects of individual factors through ensemble simulations. We show that the impacts of different high-resolution factors are strongly interdependent. A large part of the positive effect of an individual high-resolution factor on RSM modeling cannot be achieved when the other factors have coarse resolution and large uncertainties. Such interdependence is identified as the synergistic effect, and it dominates the improvement of high-resolution RSM modeling over East China. Therefore, effort should be directed toward the collaborative development of high-resolution models, forcings and parameters for high-resolution RSM simulations.

1 Introduction

Root-zone soil moisture (RSM) is a crucial and highly heterogeneous variable in the earth system (Seneviratne et al., 2010). It influences vegetation transpiration and photosynthesis, modulates the interactions between the ecology-hydrology-atmosphere systems, and provides an important source of weather and climate predictability (Humphrey et al., 2021; Koster et al., 2004; Merryfield et al., 2020; C. Wang et al., 2018). Due to sparse in-situ observations and measuring gaps and uncertainties of satellite sensors (McCabe et al., 2017), high-resolution RSM modeling with land surface models (LSMs) has been widely used to provide spatiotemporally continuous and locally relevant information (Beven et al., 2014; Bierkens et al., 2015; Ji et al., 2017; Rouf et al., 2021; Wood et al., 2011).

RSM simulations are influenced by the LSMs' structure, accuracy of meteorological forcings and land surface parameters (Clark et al., 2011; Gupta & Nearing, 2014; Montanari & Koutsoyiannis, 2012; Nearing et al., 2016; Nearing & Gupta, 2015). Model structures are descriptions of physical processes and solvers for the soil water system (e.g., parameterizations of soil water transport, evapotranspiration and runoff). Land surface parameters (e.g., soil parameters for hydraulic properties) are required in LSMs to solve the water transport equation, whereas meteorological forcings (e.g., precipitation and temperature) are time-dependent boundary conditions of land surface processes. Instead of simply running the LSMs at high spatial resolution, high-resolution RSM modeling necessitates ongoing efforts to develop/optimize LSMs, meteorological forcings and soil parameters (Beven et al., 2014; Wood et al., 2011).

Numerous works have assessed the added value of newly developed high-resolution meteorological forcings and soil parameters to RSM modeling. High-resolution meteorological forcings have crucial impacts on simulating the mean and variability of RSM (Liu et al., 2023; Meng et al., 2019; Rouf et al., 2021; Zeng et al., 2021), while high-resolution soil parameters are noted to reduce RSM biases by providing much more precise soil hydraulic properties than coarse-resolution data sets (De Lannoy et al., 2014; Livneh et al., 2015; Maertens et al., 2021; Singh et al., 2015). Because high-resolution forcings/parameters typically utilize a larger number of observations and employ more sophisticated data fusion techniques during their production, their added value derives not only from their finer spatial resolution but also from the enhanced accuracy of the data (Beven et al., 2014; Singh et al., 2015). Therefore, in the remainder of this paper, a “high-resolution product” refers to a data set with both higher spatial resolution and higher data accuracy. Moreover, RSM modeling improvements are often observed after optimizing the model structure, developing new parameterizations, or assimilating high-resolution observations (Chaney et al., 2016; Ji et al., 2017; Niu et al., 2014; Shellito et al., 2020; Vergopolan et al., 2020). Although a series of works have demonstrated the positive effects of high-resolution forcings, parameters and advanced LSMs on RSM modeling, their relative importance is still under debate (Fu et al., 2022; Liu et al., 2023; Zeng et al., 2021). For example, Liu et al. (2023) found that high-resolution precipitation is critical, while Zeng et al. (2021) suggested that advanced LSMs with improved physical parameterizations have larger impacts on RSM than high-resolution meteorological forcings.

Previous works usually quantify the effect of a specific high-resolution factor individually, without considering the influences of the other, coarse-resolution factors. However, recent works have revealed that high-resolution meteorological forcings do not necessarily improve RSM simulation accuracy when the soil hydraulic properties are misrepresented (J. Sun et al., 2021; Xie et al., 2016). This indicates that the quantified influence of high-resolution forcings depends on whether the default coarse-resolution or the updated high-resolution soil parameters are used. In addition, there is evidence that the positive influences of high-resolution forcings/parameters on RSM modeling are affected by the LSMs (Alavi et al., 2016; De Lannoy et al., 2014; Maertens et al., 2021). For example, high-resolution soil texture data sets do not significantly reduce the SM simulation bias when applied to the Catchment LSM unless the soil property parameterizations are optimized (De Lannoy et al., 2014). Therefore, the added value of high-resolution factors in improving RSM simulation accuracy can be reduced or even offset by the other coarse-resolution factors. Such interdependence between different factors necessitates analyzing their roles in increasing RSM simulation accuracy from an integrated perspective.

To date, few works have quantitatively investigated the independence/interdependence between different factors in improving RSM modeling, and the relative importance of the independent and interdependent effects remains unknown. How much uncertainty arises from neglecting the interdependent effects also needs further investigation. To resolve these issues, this study proposes a method for distinguishing the independent and synergistic (or interdependent) contributions of multiple factors to RSM modeling based on ensemble simulations. The method was applied to mainland China, where 1,553 in-situ RSM stations are available. The independent and synergistic effects of a state-of-the-art LSM (a model that has advanced physical parameterizations and a suitable structure for RSM modeling), high-resolution meteorological forcings and soil parameters were analyzed at regional to station scales, and regional differences were investigated.

The remainder of this paper is organized as follows: Section 2 introduces the study domain, observational data, LSMs, and experimental design. The results are presented in Section 3, and the discussion and conclusions are outlined in Sections 4 and 5 respectively.

2 Material and Methods

2.1 Study Area and Observational Data

China has a large variety of climates and landscapes. The complex climate, vegetation, and soil properties result in highly heterogeneous RSM. The country was divided into nine subregions for regional analysis according to climatic characteristics and conventions in geographical division (Figure 1) (Wu et al., 2011, 2021). RSM is relatively low in northwestern China, which has a semi-arid and arid climate, and higher in humid regions, including eastern and southern China (A. Wang & Shi, 2019).

Figure 1. Study domain and soil moisture station observation information. Different colors represent the percentages of record length during the growing season (April–September) in 2013–2017 at the stations.

Daily SM observations from more than 2,500 automatic soil moisture observation (ASMO) stations during 2013–2017 were provided by the China Meteorological Administration (CMA). The ASMO sensors measure SM using frequency domain reflectometry (FDR). The raw data are collected in 10 layers from the surface (0.1 m) to 1 m depth at an interval of 0.1 m, and the 10 layers were averaged to derive the RSM data. If a station had missing data for fewer than three layers on a given day (i.e., ≤20% missing data), the missing values were filled by linear interpolation. Otherwise, the RSM value was marked as missing. Because the ASMO sensors have large uncertainties during the freeze-thaw period (L. Li et al., 2021), we focused on the growing season (April–September) in this study. Quality control was applied to the daily RSM data following previous research (Chen & Yuan, 2020; Zeng et al., 2021). First, values larger than 0.5 m³ m⁻³ were removed, according to the working range of the FDR sensors and the observation specification of the CMA (Chen & Yuan, 2020). Second, stations with valid records for less than 80% of the growing season of 2013–2017 were removed. Finally, 1,553 stations were selected and their locations are shown in Figure 1. Generally, observation stations are denser in Northeast China (160), North China (495), East China (278), South China (97), and Southwest China (419), and sparse in Xinjiang (89), Inner Mongolia (15), and Tibet (0).
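The layer averaging and quality-control rules above can be sketched in a few lines. This is a minimal illustration, not the authors' processing code: the function names are hypothetical, the gap filling is assumed to be linear interpolation along depth (the text does not say whether interpolation is in depth or time), and `np.interp` holds the nearest valid value when the gap is at the top or bottom of the profile.

```python
import numpy as np

# Sensor depths of the 10 ASMO layers (0.1 m to 1.0 m), as stated in the text.
DEPTHS = np.arange(1, 11) * 0.1

def daily_rsm(layers, max_missing=2, upper_limit=0.5):
    """Average 10 layer values into one daily RSM value (m3 m-3).

    Assumption: gaps are filled by linear interpolation along depth.
    Returns NaN if three or more layers are missing or if the
    resulting RSM exceeds the 0.5 m3 m-3 quality-control limit.
    """
    layers = np.asarray(layers, dtype=float).copy()
    missing = np.isnan(layers)
    if missing.sum() > max_missing:
        return np.nan
    if missing.any():
        layers[missing] = np.interp(
            DEPTHS[missing], DEPTHS[~missing], layers[~missing]
        )
    rsm = layers.mean()
    return np.nan if rsm > upper_limit else float(rsm)

def keep_station(rsm_series, min_valid=0.8):
    """Keep a station if at least 80% of its growing-season days are valid."""
    rsm_series = np.asarray(rsm_series, dtype=float)
    return bool(np.mean(~np.isnan(rsm_series)) >= min_valid)
```

Applying `daily_rsm` to each day and then `keep_station` to each station's growing-season series reproduces the order of operations described in the paragraph.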

2.2 Methodology

The methodological framework to quantify independent and synergistic contributions of multiple high-resolution factors to RSM modeling based on ensemble simulations is shown in Figure 2, and is described below.

Figure 2. The framework to quantify independent and synergistic (or interdependent) effects of meteorological forcings, soil parameters and models on high-resolution root-zone soil moisture modeling. Detailed information of the 48 experiments is given in Figure 3.

Figure 3. Flowchart of the ensemble simulations. Simulations with different combinations of meteorological forcings, soil parameters, and land surface models (LSMs) were performed (steps ①–⑧). Each step contains several simulation members (e.g., step ① includes 18 members). The Kling-Gupta efficiency (KGE) was used as the evaluation metric. “Ref” represents the reference simulation, and “M,” “F,” and “S” refer to the Conjunctive Surface-Subsurface Process model (CSSPv2), the CMA Land Data Assimilation System (CLDASv2) high-resolution forcings, and the China Data set of Soil Properties (CDSP) high-resolution soil parameters, respectively. The contribution of the factors was determined by calculating the difference in the KGE (Equations 2–4). All meteorological forcing and soil parameter data were interpolated to 6-km spatial resolution before driving the LSMs.

2.2.1 Model Descriptions

Four LSMs were chosen to model RSM: the Common Land Model (CoLM) (Dai et al., 2003, 2004), two different settings of the Noah LSM with Multi-Parameterization options (Noah-MP1, Noah-MP2) (Niu et al., 2011), and the second version of the Conjunctive Surface-Subsurface Process model (CSSPv2) (Yuan et al., 2018).

The CoLM contains detailed representations of biogeophysical and biogeochemical processes. The soil column has a depth of 3.43 m with 10 soil layers, each with its own thermal and hydraulic properties (the layer structure is shown in Figure S1a in Supporting Information S1). Soil moisture dynamics are solved using the continuity equation and Darcy's law, and the source (e.g., surface infiltration) and sink (e.g., subsurface runoff, evapotranspiration) terms are considered. Soil hydrological properties are determined according to the soil texture (e.g., clay fraction and sand fraction) using pedotransfer functions. There are eight soil layers in CoLM above a depth of 1 m, and the thickness-weighted mean SM in these layers is considered the RSM in the CoLM following previous studies (Mu et al., 2021; Yuan & Liang, 2011).

The Noah-MP model is developed from the community Noah LSM by introducing multiple parameterization options (Niu et al., 2011). Noah-MP has four soil layers (0–0.1 m, 0.1–0.4 m, 0.4–1 m, and 1–2 m, Figure S1b in Supporting Information S1). All soil layers share the same thermal and hydraulic properties determined by 16 soil types. The numerical solution to obtain SM is similar to that in the CoLM, but the detailed parameterizations of runoff, evapotranspiration, and other related processes are different. Here, we used two different settings of the Noah-MP model to perform experiments. Noah-MP1 uses the Jarvis stomatal resistance scheme, free drainage surface and subsurface runoff, and Koren's iteration for supercooled liquid water, the same as the Noah LSM. In contrast, the Noah-MP2 uses the Ball-Berry stomatal resistance model, the TOPMODEL based surface and subsurface runoff parameterization with a simple groundwater model, and a more general form of the freezing-point depression equation without iteration. Other parameterizations of the Noah-MP1 and Noah-MP2 are the options suggested in the User's Guide (https://www.jsg.utexas.edu/noah-mp/). The thickness-weighted mean SM in the top three layers is considered the RSM in the Noah-MP LSM.
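As a small illustration of the thickness-weighted mean used to define RSM, the sketch below applies it to the Noah-MP layer structure given above (thicknesses 0.1, 0.3, 0.6, and 1.0 m, with the top three layers spanning 0–1 m). The function name and the soil moisture values are hypothetical.

```python
import numpy as np

def thickness_weighted_mean(sm, thickness):
    """Thickness-weighted mean soil moisture over the supplied layers."""
    sm = np.asarray(sm, dtype=float)
    thickness = np.asarray(thickness, dtype=float)
    return float(np.sum(sm * thickness) / np.sum(thickness))

# Noah-MP layer thicknesses (m) from the layer bounds quoted in the text.
noahmp_thickness = [0.1, 0.3, 0.6, 1.0]
# Hypothetical volumetric soil moisture per layer (m3 m-3), for illustration only.
sm_example = [0.20, 0.25, 0.30, 0.35]

# RSM uses only the top three layers (0-1 m).
rsm = thickness_weighted_mean(sm_example[:3], noahmp_thickness[:3])
```

The same function applies to the CoLM and CSSPv2 definitions by passing their top eight layers, given the layer thicknesses from Figure S1.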

The CSSPv2 model is rooted in the CoLM, with improvements in hydrological processes including a quasi-three-dimensional volume average soil water transport (VAST) module (Choi et al., 2007), a one-dimensional groundwater module (Yuan & Liang, 2011), a variable infiltration capacity runoff scheme (Yuan et al., 2018), and an urban module (Ji et al., 2021). In contrast to the CoLM and Noah-MP, the VAST module in CSSPv2 decomposes the SM into averaged and perturbative terms to account for subgrid SM variations. The soil column is extended to 5.67 m to improve the representation of subsurface hydrology (Figure S1c in Supporting Information S1). The parameterization of soil hydrological properties is similar to that of the CoLM, but the hydraulic influences of soil organic matter are considered (Yuan et al., 2018). The CSSPv2 model has been applied to land surface simulations at resolutions from 30 km to 100 m over the southwestern United States (Ji et al., 2017), the Tibetan Plateau (Ji et al., 2020; Yuan et al., 2018), the Beijing metropolitan area (Ji et al., 2021), and continental China (Ji et al., 2023; Zeng et al., 2021). The CSSPv2 accurately reproduces RSM dynamics and outperforms state-of-the-art LSMs (Zeng et al., 2021). The thickness-weighted mean SM in the top eight layers is considered the RSM in the CSSPv2 LSM.

Compared with Noah-MP and CoLM, the CSSPv2 model has the following advantages that may facilitate RSM modeling. First, compared to Noah-MP's soil classification scheme with a limited number of broad classes, the pedotransfer functions in the CSSPv2 can better utilize soil texture and soil properties in parameterizing soil hydraulic properties (De Lannoy et al., 2014). Second, Kollet and Maxwell (2008) suggested a critical depth range (1–5 m) within which groundwater dynamics have substantial impacts on surface SM and RSM. The CSSPv2's deep soil column and groundwater module enable it to model groundwater and soil water interactions at this critical depth. Third, solving the soil water equation with a higher vertical resolution in the CSSPv2 model ensures better vertical water diffusion in the soil and provides a more reasonable response to precipitation (Husain et al., 2016; Mu et al., 2021). In addition, CSSPv2 shows a higher Kling-Gupta efficiency (KGE) (Gupta et al., 2009) than CoLM, Noah-MP1, and Noah-MP2 in modeling RSM when forced by the same high-resolution forcings and parameters (according to the evaluation results in Section 3). We also compared the LSMs based on the Bayesian inference index (Schwarz, 1978), but the result is quite sensitive to the number of parameters considered (see Text S1 in Supporting Information S1). Considering that LSMs usually have hundreds of parameters and are often overparameterized (Cuntz et al., 2016; Gan et al., 2015; J. Li et al., 2013), it is difficult to determine, for each LSM, the precise number of model parameters that have critical influences on RSM modeling. Therefore, the models are simply ranked according to the KGE, following previous model comparison projects (Wood et al., 1998), and the CSSPv2 model is regarded as an advanced LSM in this study.

2.2.2 Meteorological Forcings and Soil Parameter Data Sets

Hourly or 3-hourly near-surface meteorological forcings (e.g., precipitation, temperature, wind speed, shortwave radiation, longwave radiation, specific humidity and surface pressure) from the CMA Land Data Assimilation System (CLDAS) (Shi et al., 2011), the fifth generation of European ReAnalysis (ERA5) (Hersbach et al., 2020), the Global Land Data Assimilation System version 2.1 (GLDASv2.1) (Rodell et al., 2004), and the second Modern-Era Retrospective analysis for Research and Applications (MERRA2) (Gelaro et al., 2017) were used to drive the LSMs (upper right in Figure 2). These products are state-of-the-art land-atmosphere coupled reanalyses or land data assimilation systems that provide continuous and consistent near-surface meteorological forcings (Gualtieri, 2022; Xia et al., 2019). The spatial resolutions of CLDAS, ERA5, GLDASv2.1 and MERRA2 are 0.0625°, 0.25°, 0.25°, and 0.5°, respectively. The CLDAS precipitation fuses observations from 30,000 to 60,000 regional meteorological stations in China and outperforms other reanalysis products and satellite-based precipitation products (e.g., the Global Precipitation Measurement precipitation fusion data set) (S. Sun et al., 2020; Yang et al., 2017). Note that the specific humidity of ERA5 was derived from dew point temperature, temperature and surface pressure using the method introduced in the Community Land Model (Oleson et al., 2013).

The soil parameters (e.g., the “b” parameter in hydraulic functions, soil porosity, saturated hydraulic conductivity, matric potential, etc.) are derived from three different data sets, including the China Data set of Soil Properties (CDSP) (Shangguan et al., 2013), ERA5, and GLDASv2.1 (upper left in Figure 2). The CDSP provides multi-layer (eight layers to the depth of 2.3 m) soil texture at 1-km spatial resolution based on 8,979 soil profiles and the soil map of China. The ERA5 derives 6 soil types from the 5-min FAO/UNESCO Digital Soil Map of the World (DSMW) (FAO, 2003), and each soil type is assigned a set of hydraulic properties (Balsamo et al., 2009). Similarly, the GLDASv2.1 derives 16-category soil texture class (default soil types in the Noah-MP model) from the DSMW for regions outside the contiguous United States (https://ldas.gsfc.nasa.gov/gldas/soils). However, both the soil type and its corresponding properties in GLDASv2.1 are different from ERA5. Because the soil parameters in the Noah-MP LSM do not have vertical variation, the multi-layer soil texture in the CDSP was averaged based on the layer thickness and converted to the 16 soil types used in the Noah-MP model's soil texture classification. The ERA5 soil types were applied to the Noah-MP model by changing the configuration of soil type and properties. For the CSSPv2 and CoLM, the soil properties were derived from each data set directly.

2.2.3 Experimental Design

Forty-eight simulations were performed at 6-km resolution using different combinations of LSMs, meteorological forcings, and soil parameters (lower left in Figure 2), and the simulations are depicted in Figure 3. The bilinear interpolation method, which is a straightforward remapping algorithm widely used in generating meteorological forcings (Ji et al., 2023; Rouf et al., 2021), was used to remap all forcings and soil parameters to 6-km spatial resolution. The discrepancies in the raw data sets before interpolation caused the difference in the evaluation metrics between simulations using high- and coarse-resolution products. The CLDAS and CDSP are considered high-resolution forcings (HR_For) and high-resolution soil parameters (HR_Par), respectively, because of their high spatial resolutions and high data accuracy. Considering that this study focuses on the interdependence between different high-resolution factors, both high spatial resolution and data accuracy are considered as high-resolution factors' advantages that could potentially cause a better modeling result. The other parameters in the LSMs remain default, except for the runoff parameters in the CSSPv2 model. The runoff parameters in CSSPv2 have regional variations over different river basins, which were calibrated against monthly streamflow observations (Ji et al., 2023). The CSSPv2 simulation results were compared to the CoLM and Noah-MP (Noah-MP1 and Noah-MP2) simulation results to quantify the added values of the CSSPv2 model.

Similar to previous research (Chen & Yuan, 2020; Zeng et al., 2021), we used the KGE as an evaluation metric. The KGE is an integrated measure of the correlation, bias and variability, and is calculated as:
$\mathrm{KGE}=1-\sqrt{(\mathrm{CC}-1)^{2}+(\beta -1)^{2}+(\alpha -1)^{2}},\quad \text{where}\ \beta =\mu_{m}/\mu_{0},\ \alpha =\sigma_{m}/\sigma_{0}$ (1)
where CC is the correlation coefficient, $\beta$ is the ratio between the simulated ($\mu_{m}$) and observed ($\mu_{0}$) mean values of RSM, and $\alpha$ is the ratio between the simulated ($\sigma_{m}$) and observed ($\sigma_{0}$) standard deviations of RSM.
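Equation 1 translates directly into code. The following is a minimal sketch rather than the authors' evaluation script: the function name `kge` is illustrative, and dropping days where either series is missing is an assumption about how gaps are handled.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency following Equation 1.

    Assumption: days where either series is NaN are excluded
    before the three components are computed.
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~(np.isnan(sim) | np.isnan(obs))
    sim, obs = sim[valid], obs[valid]

    cc = np.corrcoef(sim, obs)[0, 1]   # correlation coefficient
    beta = sim.mean() / obs.mean()     # ratio of mean values
    alpha = sim.std() / obs.std()      # ratio of standard deviations
    return float(1.0 - np.sqrt((cc - 1) ** 2 + (beta - 1) ** 2 + (alpha - 1) ** 2))
```

A perfect simulation gives KGE = 1. A simulation that is exactly twice the observations has CC = 1 but β = α = 2, so KGE = 1 − √2, which shows how bias and variability errors lower the score even when the timing is perfect.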
The simulation procedures in Figure 3 were as follows:
  1. Reference simulations (“Ref”). Coarse-resolution forcings and soil parameters were interpolated to 6-km resolution to drive the CoLM, Noah-MP1 and Noah-MP2 LSMs. In this scenario, RSM modeling was performed at high spatial resolution without any improvements in forcings, parameters, or LSMs. The evaluation result is represented by $\mathrm{KGE}_{\mathrm{Ref}}$. The “Ref” experiment contains 18 members from the different combinations of forcings (ERA5, GLDASv2.1, and MERRA2), parameters (ERA5 and GLDASv2.1) and models (CoLM, Noah-MP1 and Noah-MP2) (Table S1 in Supporting Information S1), so as to account for sampling uncertainties.

  2. Using advanced LSMs (“M”). Step 2 was the same as step 1 but used the CSSPv2 LSM. The difference between $\mathrm{KGE}_{\mathrm{M}}$ and $\mathrm{KGE}_{\mathrm{Ref}}$ (i.e., $\Delta\mathrm{KGE}_{\mathrm{M}} = \mathrm{KGE}_{\mathrm{M}} - \mathrm{KGE}_{\mathrm{Ref}}$) was attributed to the difference between CSSPv2 and the other three models, and $\Delta\mathrm{KGE}_{\mathrm{M}}$ represented the independent influence of the CSSPv2 model because no additional information from high-resolution forcings and parameters was introduced in the simulation.

  3. Using high-resolution meteorological forcings (“F”). The CLDAS forcings and coarse-resolution parameters were used to drive the CoLM, Noah-MP1, and Noah-MP2 LSMs. The difference between $\mathrm{KGE}_{\mathrm{F}}$ and $\mathrm{KGE}_{\mathrm{Ref}}$ (i.e., $\Delta\mathrm{KGE}_{\mathrm{F}} = \mathrm{KGE}_{\mathrm{F}} - \mathrm{KGE}_{\mathrm{Ref}}$) represented the independent influence of the high-resolution forcings.

  4. Using high-resolution meteorological forcings and advanced LSMs (“M + F”). The RSM simulation was performed using the high-resolution CLDAS forcings and the CSSPv2 LSM. In most cases, the KGE improvement (i.e., $\mathrm{KGE}_{\mathrm{M,F}} - \mathrm{KGE}_{\mathrm{Ref}}$) was not equal to the sum of $\Delta\mathrm{KGE}_{\mathrm{M}}$ and $\Delta\mathrm{KGE}_{\mathrm{F}}$, indicating that the added values of the high-resolution forcings and the CSSPv2 LSM to RSM modeling are not independent. The additional improvement (i.e., $\Delta\mathrm{KGE}_{\mathrm{M,F}} = \mathrm{KGE}_{\mathrm{M,F}} - \mathrm{KGE}_{\mathrm{Ref}} - \Delta\mathrm{KGE}_{\mathrm{F}} - \Delta\mathrm{KGE}_{\mathrm{M}}$) was taken as the synergistic effect of the advanced LSM and high-resolution meteorological forcings, revealing the interdependence between the LSM and the meteorological forcings in producing a reasonably good RSM simulation.

Figure 4 shows an example to clarify the independent and synergistic influences. The CSSPv2 produces a better RSM simulation than the Noah-MP1 model with the same coarse-resolution forcings and parameters, showing a higher CC, a smaller wet bias (revealed by the β factor), and a better standard deviation (cyan and blue bars in Figure 4). Therefore, the KGE increases, and the independent impact of the CSSPv2 model ($\Delta\mathrm{KGE}_{\mathrm{M}}$) is quantified as 0.19 (i.e., 0.34 − 0.15). Using the CLDAS forcing to drive the Noah-MP1 model does not produce a better standard deviation, but results in a lower bias and a much higher CC than the reference simulation (pink bars in Figure 4). Therefore, the KGE also increases, and the independent impact of the CLDAS forcing ($\Delta\mathrm{KGE}_{\mathrm{F}}$) is 0.16. When the CLDAS forcings and CSSPv2 model are combined, the increase in KGE is 0.12 higher than the sum of the increases obtained with each individual factor (red bar in Figure 4b). This additional increase in the KGE can be explained by the interdependence between the CSSPv2 model and the CLDASv2 forcings in modeling the variability of RSM. The GLDASv2.1 coarse-resolution precipitation shows some inconsistencies with the CLDASv2 precipitation during late April, August, and early September (green and red bars in Figure 4a), inducing unrealistic RSM perturbations (blue lines in Figure 4a) and thus limiting the CSSPv2's positive effect on modeling RSM dynamics (e.g., the CC increases little from the cyan to the blue bars in Figure 4c). The Noah-MP1 model responds quickly to rainless periods in August, causing a significant decreasing trend in RSM that does not occur in the observations (pink line in Figure 4a). This in turn reduces the positive effect of the CLDAS forcing on the CC (e.g., the CC only increases by 0.17 from the cyan to the pink bars in Figure 4c). Combining the CLDAS forcing and the CSSPv2 model, however, reduces the uncertainties related to both model and forcings and results in an RSM variability similar to that observed (red bars in Figure 4c).
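The numbers quoted for this station can be combined into a short worked calculation. The values of $\mathrm{KGE}_{\mathrm{F}}$ and $\mathrm{KGE}_{\mathrm{M,F}}$ are not stated in the text; they are implied by the quoted increments and are therefore approximate to the rounding of those increments.

```latex
\begin{align*}
\Delta\mathrm{KGE}_{\mathrm{M}} &= \mathrm{KGE}_{\mathrm{M}}-\mathrm{KGE}_{\mathrm{Ref}} = 0.34-0.15 = 0.19,\\
\Delta\mathrm{KGE}_{\mathrm{F}} &= \mathrm{KGE}_{\mathrm{F}}-\mathrm{KGE}_{\mathrm{Ref}} = 0.16
  \;\Rightarrow\; \mathrm{KGE}_{\mathrm{F}} \approx 0.31,\\
\mathrm{KGE}_{\mathrm{M,F}}-\mathrm{KGE}_{\mathrm{Ref}}
  &= \Delta\mathrm{KGE}_{\mathrm{M}}+\Delta\mathrm{KGE}_{\mathrm{F}}+\Delta\mathrm{KGE}_{\mathrm{M,F}}
   = 0.19+0.16+0.12 = 0.47
  \;\Rightarrow\; \mathrm{KGE}_{\mathrm{M,F}} \approx 0.62.
\end{align*}
```

In this example the synergistic term accounts for roughly 0.12/0.47, or about a quarter, of the total KGE improvement at the station.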

Figure 4. An example of the independent and synergistic influences of high-resolution meteorological forcings and the Conjunctive Surface-Subsurface Process model (CSSPv2) land surface model (LSM). (a) Observed and simulated soil moisture at a station in East China from April to September 2013. The colored lines represent the simulation results using different meteorological forcings and LSMs (e.g., “F: GLDASv2” and “M: Noah-MP1” refer to the GLDASv2 meteorological forcings and Noah-MP1 model, respectively). The bars represent daily precipitation from different data sets. All simulations use the soil parameters from the GLDASv2 data set. (b) The Kling-Gupta efficiency (KGE) between simulations and observations. (c–e) The three metrics used to calculate the KGE, including the correlation coefficient (CC), the ratio of variation (α), and the ratio of mean state (β).

Steps ⑤–⑧ are similar to steps ①–④, but with different combinations of meteorological forcings and soil parameters. The independent and synergistic impacts of high-resolution forcings, high-resolution parameters and the CSSPv2 LSM were calculated in steps ⑨ and ⑩ using the method introduced in Section 2.2.4.
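The grouping of the 48 simulations into experiments can be illustrated by enumerating the combinations. This sketch assumes a full factorial design (four LSMs × four forcings × three soil data sets), which is consistent with the stated total of 48 simulations and the 18 “Ref” members; the authoritative member list is Table S1. The `label` helper and the group names are illustrative.

```python
from collections import Counter
from itertools import product

MODELS = ["CoLM", "Noah-MP1", "Noah-MP2", "CSSPv2"]
FORCINGS = ["ERA5", "GLDASv2.1", "MERRA2", "CLDAS"]
SOILS = ["ERA5", "GLDASv2.1", "CDSP"]

def label(model, forcing, soil):
    """Name the experiment by which high-resolution factors it uses."""
    tags = []
    if model == "CSSPv2":
        tags.append("M")   # advanced LSM
    if forcing == "CLDAS":
        tags.append("F")   # high-resolution forcings
    if soil == "CDSP":
        tags.append("S")   # high-resolution soil parameters
    return "+".join(tags) or "Ref"

# Count the ensemble members that fall in each experiment.
groups = Counter(label(m, f, s) for m, f, s in product(MODELS, FORCINGS, SOILS))
```

Under this assumption the members per experiment are 18 (Ref), 6 (M), 6 (F), 9 (S), 2 (M+F), 3 (M+S), 3 (F+S), and 1 (M+F+S), and the KGE of each experiment is the mean over its members.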

All the simulations were performed for two cycles from 2011 to 2017 at half-hourly time steps, with the land surface conditions at the end of the first cycle used as the initial conditions for the second cycle. The first cycle and the 2011–2012 portion of the second cycle were regarded as the spin-up period, which is needed for the water and energy cycles in the LSM to reach equilibrium given the same meteorological forcings. This is a widely used procedure to obtain reasonable initial conditions of the land surface, because the actual initial conditions are usually unknown due to data scarcity and process nonlinearity (Kabir et al., 2022). The 3-hourly/hourly forcings were interpolated linearly to half-hourly data, except for precipitation and shortwave radiation. The 3-hourly/hourly precipitation was distributed uniformly over the half-hourly steps, while the shortwave radiation was disaggregated according to the solar zenith angle (Ji et al., 2023). To ensure computational efficiency, only the grids containing observational RSM stations were simulated. This strategy is reasonable because the CoLM, Noah-MP1 and Noah-MP2 LSMs do not consider water or energy flows between grids. Lateral subsurface flows in the CSSPv2 LSM were disabled, because they have a negligible influence on the results when the spatial resolution is coarser than 1 km (Ji et al., 2017).
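The two simpler disaggregation rules (linear for state-like variables, uniform for precipitation) can be sketched as follows. The function names are hypothetical, and the precipitation function assumes the input is an accumulated amount per source interval; if the product stores a rate instead, the division should be dropped. The zenith-angle weighting of shortwave radiation is not shown.

```python
import numpy as np

def disaggregate_linear(values, step_hours, target_hours=0.5):
    """Linearly interpolate instantaneous values (e.g., temperature)
    from a coarse time step to the half-hourly model time step."""
    values = np.asarray(values, dtype=float)
    t_src = np.arange(len(values)) * step_hours
    t_new = np.arange(0.0, t_src[-1] + 1e-9, target_hours)
    return np.interp(t_new, t_src, values)

def disaggregate_precip_uniform(totals, step_hours, target_hours=0.5):
    """Spread each precipitation total evenly over its half-hourly steps.

    Assumption: `totals` are accumulations per source interval, so
    dividing by the number of sub-steps conserves the total amount.
    """
    n = int(round(step_hours / target_hours))
    return np.repeat(np.asarray(totals, dtype=float) / n, n)
```

For example, a 3-hourly total of 6 mm becomes six half-hourly values of 1 mm each, so the accumulated precipitation is unchanged.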

2.2.4 Distinguishing Between Independent and Synergistic Contributions of Different Factors

The independent effects of the CSSPv2 LSM (ΔKGE_M), CLDASv2 forcing (ΔKGE_F), and CDSP soil parameters (ΔKGE_S) on RSM modeling can be derived as:
\begin{align*}
\Delta\mathrm{KGE}_{\mathrm{M}} &= \mathrm{KGE}_{\mathrm{M}} - \mathrm{KGE}_{\mathrm{Ref}},\\
\Delta\mathrm{KGE}_{\mathrm{S}} &= \mathrm{KGE}_{\mathrm{S}} - \mathrm{KGE}_{\mathrm{Ref}},\\
\Delta\mathrm{KGE}_{\mathrm{F}} &= \mathrm{KGE}_{\mathrm{F}} - \mathrm{KGE}_{\mathrm{Ref}}, \tag{2}
\end{align*}
where KGE_Ref represents the mean KGE value for the reference experiment, while KGE_M, KGE_S, and KGE_F are the mean KGE values for the “M,” “S,” and “F” experiments, respectively. Detailed settings of the different experiments are summarized in Table S1 in Supporting Information S1. The synergistic influences of different combinations of the factors are calculated as:
\begin{align*}
\Delta\mathrm{KGE}_{\mathrm{M,S}} &= \mathrm{KGE}_{\mathrm{M,S}} - \mathrm{KGE}_{\mathrm{Ref}} - \Delta\mathrm{KGE}_{\mathrm{M}} - \Delta\mathrm{KGE}_{\mathrm{S}},\\
\Delta\mathrm{KGE}_{\mathrm{M,F}} &= \mathrm{KGE}_{\mathrm{M,F}} - \mathrm{KGE}_{\mathrm{Ref}} - \Delta\mathrm{KGE}_{\mathrm{M}} - \Delta\mathrm{KGE}_{\mathrm{F}},\\
\Delta\mathrm{KGE}_{\mathrm{F,S}} &= \mathrm{KGE}_{\mathrm{F,S}} - \mathrm{KGE}_{\mathrm{Ref}} - \Delta\mathrm{KGE}_{\mathrm{F}} - \Delta\mathrm{KGE}_{\mathrm{S}},\\
\Delta\mathrm{KGE}_{\mathrm{M,S,F}} &= \mathrm{KGE}_{\mathrm{M,F,S}} - \mathrm{KGE}_{\mathrm{Ref}} - \Delta\mathrm{KGE}_{\mathrm{M}} - \Delta\mathrm{KGE}_{\mathrm{S}} - \Delta\mathrm{KGE}_{\mathrm{F}}\\
&\quad - \Delta\mathrm{KGE}_{\mathrm{M,S}} - \Delta\mathrm{KGE}_{\mathrm{M,F}} - \Delta\mathrm{KGE}_{\mathrm{F,S}}, \tag{3}
\end{align*}
where ΔKGE_{M,S}, ΔKGE_{M,F}, and ΔKGE_{F,S} indicate the synergistic influences between the CSSPv2 model and high-resolution soil parameters, the CSSPv2 model and high-resolution forcings, and high-resolution soil parameters and forcings, respectively. ΔKGE_{M,S,F} is the synergistic influence that can only be achieved when the CSSPv2 model, high-resolution soil parameters, and high-resolution forcings are used together. The independent and synergistic contributions are calculated using Equation 4. Note that only positive contributions are counted in Equation 4. This approach is reasonable because, for example, when only ΔKGE_M is positive, it can be concluded that the improved performance of the CSSPv2 high-resolution simulation is caused by the CSSPv2 model.
\begin{align*}
\mathrm{CTR}_{\mathrm{M}} &= \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{M}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\quad
\mathrm{CTR}_{\mathrm{S}} = \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{S}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\quad
\mathrm{CTR}_{\mathrm{F}} = \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{F}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\\
\mathrm{CTR}_{\mathrm{M,S}} &= \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{M,S}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\quad
\mathrm{CTR}_{\mathrm{M,F}} = \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{M,F}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\\
\mathrm{CTR}_{\mathrm{S,F}} &= \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{F,S}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\quad
\mathrm{CTR}_{\mathrm{M,S,F}} = \frac{\max\left(0, \Delta\mathrm{KGE}_{\mathrm{M,S,F}}\right)}{\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime}};\\
\Delta\mathrm{KGE}_{\mathrm{tot}}^{\prime} &= \max\left(0, \Delta\mathrm{KGE}_{\mathrm{M}}\right) + \max\left(0, \Delta\mathrm{KGE}_{\mathrm{S}}\right) + \max\left(0, \Delta\mathrm{KGE}_{\mathrm{F}}\right) + \max\left(0, \Delta\mathrm{KGE}_{\mathrm{M,S}}\right)\\
&\quad + \max\left(0, \Delta\mathrm{KGE}_{\mathrm{M,F}}\right) + \max\left(0, \Delta\mathrm{KGE}_{\mathrm{F,S}}\right) + \max\left(0, \Delta\mathrm{KGE}_{\mathrm{M,S,F}}\right) \tag{4}
\end{align*}

There are 48 ensemble simulations in total with different combinations of four LSMs, four sets of forcings and three sets of parameters (Table S1 in Supporting Information S1). The 48 ensemble members were then divided into 18 groups according to different reference experiments. Equations 2-4 were applied to each set of comparative simulations, and the ensemble mean was calculated. One standard deviation of the ensemble members was regarded as uncertainty.
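The decomposition in Equations 2–4 can be illustrated with a short sketch for one set of comparative simulations. The function name, dictionary keys, and KGE values below are hypothetical and chosen only for illustration; they are not results from this study.

```python
def decompose_kge(kge):
    """Apply Equations 2-4 to one set of comparative simulations.

    kge: dict of mean KGE values keyed by 'Ref', 'M', 'S', 'F',
         'MS', 'MF', 'FS', 'MSF' (hypothetical key names).
    Returns (delta, ctr): the independent/synergistic effects and
    their contribution rates.
    """
    ref = kge["Ref"]
    # Equation 2: independent effects
    delta = {k: kge[k] - ref for k in ("M", "S", "F")}
    # Equation 3: two-factor and three-factor synergistic effects
    delta["MS"] = kge["MS"] - ref - delta["M"] - delta["S"]
    delta["MF"] = kge["MF"] - ref - delta["M"] - delta["F"]
    delta["FS"] = kge["FS"] - ref - delta["F"] - delta["S"]
    delta["MSF"] = (kge["MSF"] - ref - delta["M"] - delta["S"] - delta["F"]
                    - delta["MS"] - delta["MF"] - delta["FS"])
    # Equation 4: only positive effects contribute
    positive = {k: max(0.0, v) for k, v in delta.items()}
    total = sum(positive.values())
    ctr = {k: (v / total if total > 0 else 0.0) for k, v in positive.items()}
    return delta, ctr

# Illustrative (made-up) mean KGE values for one group of experiments
example = {"Ref": 0.30, "M": 0.35, "S": 0.33, "F": 0.28,
           "MS": 0.55, "MF": 0.36, "FS": 0.34, "MSF": 0.70}
delta, ctr = decompose_kge(example)
```

By construction, the seven effects in `delta` sum to the total improvement KGE_{M,F,S} − KGE_Ref, whereas the contribution rates in `ctr` sum to one over the positive effects only; in this made-up example the forcing effect is negative and therefore receives a zero contribution rate.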

3 Results

3.1 Comparisons Between High- and Coarse-Resolution Meteorological and Surface Data Sets

Figure 5 shows the comparison of different forcing data sets at 1,553 stations during the growing seasons of 2013–2017. The CLDAS forcing data set was used as the reference. MERRA2 has a large positive bias when the precipitation exceeds 1,000 mm (Figure 5a), due to the overestimation of precipitation in southern and eastern China (C. X. Li et al., 2020). ERA5 and GLDASv2.1 systematically underestimate the daily surface solar radiation flux compared with CLDAS, whereas MERRA2 has a positive bias (Figure 5c). All three coarse-resolution data sets have higher surface wind speeds than CLDAS (Figure 5e), which is consistent with previous findings that coarse-resolution reanalysis data sets tend to overestimate surface wind speed (Sharp et al., 2015). Figures 5b, 5d, and 5f show the CCs between coarse- and high-resolution forcings. In general, ERA5 shows higher consistency with CLDAS precipitation than GLDASv2.1 and MERRA2, with a median CC of 0.7 (Figure 5b). The CCs between CLDAS and the coarse-resolution forcing data sets are much larger (0.8–0.9) for shortwave radiation and surface wind speed than for precipitation (Figures 5d and 5f). Small differences are observed between the four data sets for the other meteorological forcing variables, including near-surface temperature, surface downward longwave radiation and specific humidity (Figure S2 in Supporting Information S1).

Comparisons of meteorological forcings between CMA Land Data Assimilation System (CLDAS) and coarse resolution (CR) products during the growing seasons of 2013–2017. (a) Scatter plot for mean precipitation during the growing seasons of 2013–2017 at soil moisture stations compiled from different products. The solid lines represent the fitted linear regressions. (b) Box plot of the correlation coefficient (CC) of daily precipitation between CLDAS and other products. (c, d) and (e, f) are the same as (a, b), but for solar radiation and wind speed, respectively.

Figure 6 shows the spatial distributions of soil porosity (maximum water holding capacity of soil), b factor (the parameter for the calculation of soil matric potentials), and soil saturated hydraulic conductivity from different data sets. All parameters are averaged over a depth of 0–1 m. The soil parameters from the high-resolution CDSP data set are the most heterogeneous, whereas the parameters from ERA5 and GLDASv2.1 only capture the difference between the southern and northern parts of China. The relative absolute differences between the CDSP and the two coarse-resolution data sets over the grids containing RSM stations are 1%–14%, 3%–63%, and 4%–123% (5%–95% percentiles of 1,553 stations) for porosity, b factor, and hydraulic conductivity, respectively (the CDSP was used as a reference). These large disparities will result in differences in the RSM simulation. For example, the higher porosity and b factor in southern and eastern China in ERA5 will result in higher soil water holding capacity and stronger soil water retaining ability than the CDSP, which tends to increase the SM when other conditions are the same.
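The relative absolute differences quoted above can be computed along the following lines. This is only a sketch under stated assumptions: the function name is hypothetical, the inputs are taken to be one parameter value per station grid, and the CDSP values serve as the reference in the denominator.

```python
import numpy as np

def relative_abs_diff_range(coarse, reference, lower=5, upper=95):
    """Percentile range (in %) of |coarse - reference| / |reference| across stations."""
    coarse = np.asarray(coarse, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rel = 100.0 * np.abs(coarse - reference) / np.abs(reference)
    # np.percentile uses linear interpolation between order statistics by default
    return np.percentile(rel, [lower, upper])
```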

Spatial distributions of soil porosity, b factor and soil saturated hydraulic conductivity for different products at a soil depth of 0–1 m. Soil porosity indicates the maximum water-holding capacity of soil, the b factor is used to calculate the soil matric potentials (a higher b indicates higher soil water retaining ability), and soil saturated hydraulic conductivity reflects the soil water movement speed. Note that the results are based on the thickness-weighted mean of the values in the top 1 m soil layers.

3.2 Independent and Synergistic Effects of Models, Forcings and Soil Parameters

3.2.1 National and Regional Scales

Figure 7 shows the observed and simulated national mean daily RSM during the growing season of 2013–2017, together with the evaluation metrics and their uncertainties. The “Ref” ensemble simulation, where coarse-resolution meteorological forcings and soil parameters were used to drive the CoLM and Noah-MP LSMs, shows a wet bias (β = 1.1), an overestimated variability (α = 1.4), a low CC (CC = 0.51), and a small KGE (0.32) (blue lines and bars in Figure 7). When high-resolution soil parameters are used (the “S” experiment), the wet bias is significantly reduced, with the β factor close to 1 (brown bars in Figure 7e). This finding is consistent with previous studies (De Lannoy et al., 2014; Livneh et al., 2015; Maertens et al., 2021; Singh et al., 2015). However, the “S” experiment does not exhibit a large improvement in the CC and α, so its KGE is only 0.03 higher than that of the “Ref” experiment, an insignificant difference (brown bars in Figures 7b–7d). Similarly, using only high-resolution meteorological forcings (the “F” experiment in Figure 7) or only the CSSPv2 model (the “M” experiment in Figure 7) does not significantly improve the KGE (light brown and orange bars in Figure 7b), due to inconsistent improvements in the CC, β and α. For example, the “F” experiment provides a higher CC but worse β and α factors than “Ref,” resulting in a negligible improvement in KGE.

Evaluations of national mean root-zone soil moisture (RSM) for different simulations. (a) Observed and simulated national mean daily RSM during the growing season of 2013–2017. The national mean was calculated by averaging the mean RSM of the sub-regions (Figure 1) to prevent disproportionate weighting caused by the heterogeneous distribution of the stations. (b) The Kling-Gupta efficiency (KGE) between simulations and observations. (c–e) The three metrics used to calculate the KGE, including the correlation coefficient (CC), the ratio of variation (α), and the ratio of mean state (β). The colored bars are ensemble means, and the error bars represent ±1 standard deviation of the individual members. “Ref” is the simulation without using the Conjunctive Surface-Subsurface Process model (CSSPv2), CMA Land Data Assimilation System (CLDAS) high-resolution meteorological forcings, and China Data set of Soil Properties (CDSP) high-resolution soil parameters, while “M,” “S,” and “F” denote the use of the CSSPv2 model, CDSP soil parameters, and CLDAS forcings, respectively.

However, when the CSSPv2 model and high-resolution soil parameters are combined (the “M + S” experiment), the KGE increases to 0.62 and exceeds the error bars of the “Ref,” “M,” and “S” experiments (Figure 7b), indicating a significant improvement. Figures 7c–7e show that the “M + S” experiment has a lower wet bias (represented by β factor) than the “M” experiment, a higher CC than the “S” experiment, and a lower variation bias (represented by α factor) than both the “M” and “S” experiments, suggesting the synergistic effects of the CSSPv2 model and high-resolution soil parameters on the RSM modeling. Although the physical process responsible for such a synergistic effect is very complicated and highly nonlinear, the effect is evident from the model's structure and the CDSP high-resolution soil parameter data set. The Noah-MP model employs homogeneous soil profiles with the hydraulic properties specified by limited soil categories, and hence cannot fully use the additional information contained in the CDSP data set (e.g., vertical and spatial variations, and data accuracy) to represent the spatial heterogeneity. Therefore, the Noah-MP model inhibits parts of the favorable effects of CDSP soil parameters on RSM modeling. The CSSPv2 model estimates hydraulic properties using continuous pedotransfer functions and considers the vertical variation, but these advantages are not incorporated in RSM modeling without high-resolution soil parameters. Combining the CSSPv2 model and the CDSP product reduces the uncertainties related to both model and soil parameter, resulting in a considerable improvement in RSM modeling.

When the high-resolution meteorological forcing is further incorporated, the “M + S + F” simulation exhibits dynamics (CC = 0.79), variations (α = 1.07), and mean values (β = 0.99) similar to those of the observations. The KGE also increases from 0.62 in the “M + S” experiment to 0.78 in the “M + S + F” experiment (red lines and bars in Figure 7). Such an obvious increase in KGE does not occur in the “M + F” and “S + F” experiments, suggesting that advanced LSMs and high-resolution parameters are needed to realize the added value of CLDAS high-resolution forcings in RSM modeling.
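As a consistency check on the numbers above, the KGE can be recomputed from its three components. The sketch assumes the standard definition of the KGE as one minus the Euclidean distance of (CC, α, β) from the ideal point (1, 1, 1); the function name is hypothetical.

```python
import math

def kge_from_components(cc, alpha, beta):
    """KGE from the correlation, variability ratio, and mean ratio (assumed standard form)."""
    return 1.0 - math.sqrt((cc - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

# Components reported for the "M + S + F" national-mean simulation
kge_msf = kge_from_components(0.79, 1.07, 0.99)
print(round(kge_msf, 2))  # prints 0.78
```

The result agrees with the KGE of 0.78 reported for the “M + S + F” experiment. Such agreement is not guaranteed in general, because an ensemble-mean KGE need not equal the KGE computed from ensemble-mean components.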

Figure 8 depicts the independent and synergistic contributions of multiple factors. The synergistic effect is dominant in improving the simulation of the national mean RSM (total areas of the (M, F), (M, S), (S, F) and (M, S, F) for the pie chart in the center of Figure 8), which is consistent with the result in Figure 7. Different from the national mean RSM, the improvement in regional mean RSM exhibits some independent positive effects of the advanced LSM and high-resolution model inputs. For example, a large independent contribution of high-resolution meteorological forcings is observed in western China (orange pie charts in Figure 8). This finding is not surprising because the coarse-resolution meteorological forcings (e.g., ERA5, GLDASv2.1 and MERRA2) have large uncertainty in western China (C. X. Li et al., 2020; Yang et al., 2017). In contrast, the CLDAS high-resolution forcings are based on numerous station-based observations, reducing the uncertainty. Nevertheless, the independent contribution of the CSSPv2 model is large (53%–66%) in North and South China (light yellow pie charts in Figure 8), suggesting that the CSSPv2 model provides substantial added value in simulating RSM in these regions, even forced by coarse-resolution meteorological forcings and soil parameters (the detailed reasons will be discussed in Section 4.2). Meanwhile, the synergistic effect is still dominant in improving the high-resolution RSM simulation accuracy in East China and the Xinjiang area, contributing 68%–90% of the improved KGE.

Independent contributions of the Conjunctive Surface-Subsurface Process model (CSSPv2), CMA Land Data Assimilation System (CLDAS) high-resolution forcings, China Data set of Soil Properties (CDSP) high-resolution soil parameters, and their synergistic effect on improving the Kling-Gupta efficiency (KGE) at the national and regional scales. The results are based on the regional or national mean root-zone soil moisture, and the national mean was calculated from the regional means. The mean KGE improvements from the “Ref” to the “M + F + S” experiment are shown in the brackets below the region names. The results for SouthWest and NorthEast China are not shown due to insignificant improvement. Only factors with positive effects on the KGE improvements are shown in the pie charts. The “F,” “M,” and “S” represent independent contributions, while the “(M, F),” “(M, S),” “(S, F),” and “(M, S, F)” are synergistic contributions from two and three factors.

3.2.2 Station Scale

The evaluation results (Figures S3a–S3c in Supporting Information S1) show that the “M + S + F” experiment still performs significantly better than the “Ref” simulations at the station scale. Figure 9 shows the regional mean ∆KGE between the “Ref” and the other experiments, with error bars indicating the uncertainties. Significant improvements are observed in the “M + S + F” simulation in all regions, and most regional mean ∆KGEs are larger than 0.1 (about 70% of the mean KGE of the “Ref” simulation), except for Xinjiang and NorthEast China (red bars in Figure 9). Similar to the regional results, the CSSPv2 model shows a large independent positive effect on RSM modeling in North and South China (the ∆KGE of the “M” experiment in Figures 9d and 9g, Figure S3d in Supporting Information S1). The high-resolution soil parameters provide an independent added value to the RSM simulation at most stations (the ∆KGE of the “S” experiment in Figure 9 and Figure S3e in Supporting Information S1), with significant KGE increases occurring over NorthWest and SouthWest China. The high-resolution meteorological forcings can also cause significant KGE improvements independently (the ∆KGE of the “F” experiment in Figure S3f in Supporting Information S1), especially over Xinjiang, NorthWest, SouthWest and NorthEast China (Figures 9b, 9e, and 9f). In addition to the independent impacts, synergistic effects between different factors are still observed at the station scale. For example, the ∆KGE in the “M + S” experiment is larger than the sum of the ∆KGE in the “M” and “S” experiments over North China (Figure 9d), and the ∆KGE in “M + F + S” is greater than the sum of the ∆KGE in the “M,” “S,” and “F” experiments over North and East China (Figures 9d and 9h).

Regional means of ∆KGE between the reference simulation and the other simulations using advanced land surface model or/and high-resolution forcings and parameters. The regional mean was derived from the ΔKGE for each station in Figure S3 in Supporting Information S1. The mean values (bar plots) of 18 paired experiments and the ±1 standard deviations (error bars) are shown.

Figure 10 quantifies the independent and synergistic contributions of different factors. In contrast to the regional results (Figure 8), independent effects of high-resolution meteorological forcings and soil parameters occur in all regions at the station scale, and contribute 6%–61% and 8%–33% of the KGE improvement, respectively. This result reveals that the independent contributions of high-resolution forcings and soil parameters are larger and more robust at small scales than at regional scales. Figure S4 in Supporting Information S1 depicts the lagged CC between precipitation and SM, and the high-resolution precipitation data have significantly higher CCs with RSM than coarse-resolution products at the station scale. Therefore, the heterogeneity of high-resolution forcings and their correlations with RSM are not smoothed at the station scale, leading to a higher independent contribution of high-resolution meteorological forcings than that at regional and national scales. The CSSPv2 model provides critical, independent contributions in North and South China.
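A lagged correlation between precipitation and soil moisture of the kind referred to above could be computed as in the sketch below. The lag convention (precipitation leading soil moisture by a whole number of time steps), the function name, and the synthetic series are assumptions made for illustration; the exact procedure behind Figure S4 is not specified here.

```python
import numpy as np

def lagged_cc(precip, sm, max_lag):
    """Pearson correlation between precip[t] and sm[t + lag] for lag = 0..max_lag."""
    precip = np.asarray(precip, dtype=float)
    sm = np.asarray(sm, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        x = precip[: precip.size - lag]  # precipitation leads by `lag` steps
        y = sm[lag:]
        result[lag] = float(np.corrcoef(x, y)[0, 1])
    return result

# Synthetic example: soil moisture responds to precipitation two steps later
precip = np.array([0, 5, 0, 0, 3, 0, 0, 1, 0, 0], dtype=float)
sm = np.array([0, 0, 0, 5, 0, 0, 3, 0, 0, 1], dtype=float)
cc_by_lag = lagged_cc(precip, sm, max_lag=3)
best_lag = max(cc_by_lag, key=cc_by_lag.get)
```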

Independent contributions of the Conjunctive Surface-Subsurface Process model (CSSPv2), CMA Land Data Assimilation System (CLDAS) high-resolution forcings, China Data set of Soil Properties (CDSP) high-resolution soil parameters, and their synergistic effects on improving the Kling-Gupta efficiency (KGE) at the station scale. All results were calculated at each station and then averaged over different regions. The mean KGE improvements are shown in the brackets below the region names. Only factors with positive effects on the KGE improvements are shown in the pie charts. The “F,” “M,” and “S” represent independent contributions, while the “(M, F),” “(M, S),” “(S, F),” and “(M, S, F)” are synergistic contributions from two and three factors.

Although the synergistic contributions are smaller at the station scale than at the regional scale, they still have a large influence in East, North, and Northeast China (Figure 10), where the synergistic effect accounts for 42%–60% of the KGE improvement. In western China, synergistic effects still cannot be neglected, as they explain about 7%–25% of the improvement. As described in Section 3.2.1, the synergistic effect is attributed to the interdependence between multiple factors in improving RSM simulation. This interdependence is more pronounced in southern China because the synergistic effect can only be achieved when combining the advanced LSM, high-resolution forcings and parameters (e.g., red pie charts over SouthWest, South and East China in Figure 10). Nevertheless, the advanced model and high-resolution forcings (or the high-resolution surface parameters and high-resolution forcings) can work together to improve the RSM simulation in central and northern China (e.g., light green and dark green sectors of the pie charts in NorthWest, North and NorthEast China).

4 Discussion

4.1 Sensitivity of the Results to the Use of RSM Anomaly Data

This study focuses on the absolute RSM time series, because both the temporal dynamics and the mean state of RSM are of concern to the modeling community (Nicolai-Shaw et al., 2015; Niu et al., 2011; Wood et al., 1998). However, some studies suggest that using RSM anomaly data is more appropriate for analyzing the spatial-temporal variability of RSM (Brocca et al., 2014; Mittelbach & Seneviratne, 2012). Therefore, we also repeated our analysis using the RSM anomaly in order to determine whether the results were induced solely by the improved modeling of RSM mean states, which are significantly affected by soil parameters such as porosity. Figures S5–S8 in Supporting Information S1 are the same as Figures 7–10, but use the RSM anomaly time series. The mean KGE improvements (∆KGE shown in the brackets in Figures 7 and 10, Figures S5 and S8 in Supporting Information S1) decrease after applying the RSM anomaly data (e.g., ∆KGE decreases from 0.46 to 0.45 for the national scale evaluation from Figure 7 to Figure S5 in Supporting Information S1), because the positive effects of high-resolution factors on correcting the RSM modeling biases are not considered. However, the decreases of ∆KGE are minor (0.01–0.02) over most regions. An obvious improvement of KGE after using high-resolution factors and the CSSPv2 model still exists (Figures S5 and S7 in Supporting Information S1), which is consistent with that in Figures 7 and 9. The synergistic effect is still important and contributes 19%–96% of the KGE improvement at regional and station scales (total area of “M + S,” “M + F,” “F + S,” and “M + F + S” in Figures S6 and S8 in Supporting Information S1).

Table S2 in Supporting Information S1 shows the KGE metric and its three components when evaluating the national mean RSM (Figure 7 and Figure S5 in Supporting Information S1). The bias term (β) is around 1, and |β − 1| ranges from 0 to 0.15 across different experiments. On the other hand, the CC and α terms exhibit a larger spread, with |CC − 1| and |α − 1| ranging from 0.21 to 0.49 and from 0.1 to 0.53, respectively. The same pattern holds true when evaluating the regional and station scale RSM (not shown). Therefore, the insensitivity of the results to the use of absolute or anomalous RSM data is due to much larger improvements in RSM dynamics and variability than in the mean state.

4.2 Possible Reasons for Independent Contributions of CSSPv2 Model

The independent contribution of the CSSPv2 model, which is significant and critical over North and South China, is attributed to a better representation of the drying/wetting processes of RSM. Figure 11 presents the RSM simulations in North and South China derived from four LSMs, while the evaluation metrics are given in Table S3 in Supporting Information S1. The Noah-MP1 and Noah-MP2 models show more rapid drying during dry periods in 2014, 2016, and 2017 in North China and respond slowly to precipitation events after long term drying, resulting in a low CC with the observations. Although the CoLM performs much better, its RSM still keeps drying during the rainy season in 2014 and increases slowly during the rainy season in 2017. In South China, however, the Noah-MP1, Noah-MP2 and CoLM simulation results exhibit faster drying and wetting speed than the observations, causing high variability (60%–100% higher than the observations) and a small KGE. In contrast, the CSSPv2 model captures the drying and wetting process in North China and shows a reasonable RSM variability in South China. Note that although CSSPv2 shows a better simulation of the mean state of RSM, its improvements in modeling RSM dynamics (the CC term) or variability (the α term) are much larger and dominate the increase in KGE (Table S3 in Supporting Information S1).

Time series of regional mean root-zone soil moisture (solid lines) and precipitation (bars) in North and South China during the growing season of 2013–2017. All simulations used the CMA Land Data Assimilation System high-resolution meteorological forcing and China Data set of Soil Properties high-resolution soil parameters.

We conducted a similar analysis using the CSSPv2 model with default runoff parameters (CSSPv2_Default) to examine whether the improved performance of CSSPv2 is attributable to the calibrated runoff parameters. Figure S9 in Supporting Information S1 shows that the primary difference between CSSPv2 and CSSPv2_Default is the regional mean RSM state. The mean difference between the KGE metrics based on the CSSPv2 and CSSPv2_Default simulations at all RSM stations was 0.02 ± 0.2 (not shown), indicating an insignificant difference between CSSPv2 and CSSPv2_Default at the station scale. The contribution result based on the CSSPv2_Default simulation is also similar to that based on the CSSPv2 simulation (Figure 10 and Figure S10 in Supporting Information S1), and the CSSPv2 model still exhibits large independent contributions to RSM modeling over North and South China. Therefore, the advantages of CSSPv2 for SM modeling are mainly caused by model structure or physical parameterizations instead of the calibrated runoff parameters.

Niu et al. (2011) found that the shallow soil column used by the Noah model does not capture the critical zone (down to 5 m), which is vital to the SM dynamics. In addition, the free drainage runoff scheme in the Noah model (which is used in Noah-MP1) will underestimate the memory of RSM, causing rapid wetting or drying of the RSM. Shellito et al. (2020) also found that reducing the layer thickness in the Noah-MP model caused more rapid and thorough drying of the soil during dry-down periods. In contrast, the CSSPv2 and CoLM consider much deeper soil columns (5.67 m for CSSPv2 and 3.43 m for CoLM), resulting in a slower drying speed during dry periods in North China. Moreover, the parameterizations of vegetation evapotranspiration in the CSSPv2 and CoLM differ from those of Noah-MP1/Noah-MP2, leading to different responses in vegetation transpiration to drying soil and influencing the drying speed. As described in Section 2.2.1, the CSSPv2 has an improved SM transport module and improved runoff and groundwater schemes relative to the CoLM, leading to better performance of the CSSPv2 than the CoLM in North and South China. However, it is difficult to determine the dominant process causing differences between the CSSPv2 and other models since the model structure, parameterizations, and parameters are all different.

4.3 Contributions of Spatial Resolution and Data Accuracy to the Effect of High-Resolution Forcings and Parameters

Because the high-resolution meteorological forcings and soil parameter products use more observations than the coarse-resolution data sets, their contributions to RSM modeling are related to both spatial resolution and data accuracy. We repeated the 48 sets of simulations at 30 km (∼0.25°) resolution to distinguish the influences of the spatial resolution and data accuracy. The high-resolution forcings and parameters were upscaled to 30 km using the conservative interpolation method, so as to eliminate their advantages in spatial resolution. It is worth noting that the high-resolution CDSP soil texture data were first upscaled to 30 km and then used to calculate the soil hydraulic properties, following the method described in Section 2.2.2.
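For intuition, upscaling a 6 km field to 30 km can be sketched as a block average over 5 × 5 groups of cells. This is a simplified special case of conservative interpolation that assumes equal-area cells on a regular grid whose dimensions are multiples of the aggregation factor; the function name is hypothetical and the interpolation tool actually used in the study is not specified here.

```python
import numpy as np

def upscale_block_mean(field, factor):
    """Average non-overlapping factor x factor blocks of a 2-D field (NaNs ignored)."""
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    # Split each axis into (coarse index, position within block), then average the blocks
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))

# Illustrative 10 x 10 fine grid aggregated by a factor of 5 (e.g., 6 km to 30 km)
fine = np.arange(100, dtype=float).reshape(10, 10)
coarse = upscale_block_mean(fine, 5)
```

With equal-area cells and no missing values, the block mean preserves the domain average of the field, which is the property that makes the upscaling conservative.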

Figures 12a and 12b show that the independent effects of the CLDAS meteorological forcings and CDSP soil parameters (indicated by ∆KGE) are similar for the 6 and 30 km resolution experiments. Similar results can be found for the synergistic effects (Figures 12c–12f), suggesting that high data accuracy is primarily responsible for the positive contribution of high-resolution forcings and parameters. On the other hand, some regions with relatively dense RSM observation networks (e.g., Northeast, Southwest, and East China in Figure 12a) still exhibit higher KGEs at 6 km resolution than at 30 km resolution, indicating that high spatial resolution also contributes. However, the main objective of this study is not to reiterate the well-known notion that incorporating high-resolution forcings/parameters can improve the accuracy of RSM modeling due to increased spatial resolution and data accuracy (Beven et al., 2014; Ji et al., 2017; Singh et al., 2015). Instead, this study suggests that a substantial portion of the positive effects of a high-resolution factor (e.g., forcings) on RSM modeling, resulting from higher spatial resolution and data accuracy, are influenced by other factors (e.g., parameters) with coarse spatial resolution and low data accuracy (see Section 4.4 for detailed discussion).

The influence of spatial resolution on the changes of Kling-Gupta efficiency (KGE) in different experiments. All the statistics are calculated at each station and averaged over the national and regional scales. “NW,” “XJ,” “N,” “NE,” “SW,” “S,” and “E” denote Northwest China, Xinjiang, North China, Northeast China, Southwest China, South China and East China, respectively.

4.4 Comparison of Results With and Without Synergistic Contribution

The synergistic contribution is the focus of this research. A large synergistic contribution represents a strong interdependence between the LSMs, forcings and parameters in improving RSM modeling. Ignoring the synergistic contribution causes uncertainty in quantifying the effect of a single factor. Figure 13 shows the effects of the CSSPv2 LSM, high-resolution forcings, and soil parameters on ∆KGE obtained from model comparisons with and without synergistic effects. The CLDAS high-resolution meteorological forcings result in a 0.065 higher mean KGE at the station scale (“F” − “Ref” in Figure 13) when the model and soil parameters are not improved. However, when the CSSPv2 LSM and CDSP soil parameters are used, the ∆KGE increases to 0.09 (“M + F + S” − “M + S” in Figure 13). The higher ∆KGE in the latter case is caused by the reduced uncertainties in the LSM and soil parameters. Similar results are observed for the CSSPv2 model and CDSP soil parameters. In general, the beneficial effect of a single factor on RSM modeling, measured by ∆KGE, is 25%–80% larger after considering the synergistic effect (comparing the purple bars with the green bars in Figure 13). In other words, ignoring the synergistic effect underestimates the added value of the improved forcings/models/parameters by 25%–80%.
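For the forcing example above, the relative increase can be worked out directly from the two ∆KGE values quoted in the text (the subscripts “with” and “without” are labels introduced here for the comparisons with and without the synergistic effect):
\begin{align*}
\frac{\Delta\mathrm{KGE}_{\text{with}} - \Delta\mathrm{KGE}_{\text{without}}}{\Delta\mathrm{KGE}_{\text{without}}} = \frac{0.09 - 0.065}{0.065} \approx 0.38
\end{align*}
That is, the attributed effect of the forcings is about 38% larger when the synergistic effect is included, which lies within the 25%–80% range reported for the three factors.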


Figure 13. Attributed effects of the Conjunctive Surface-Subsurface Process model (CSSPv2; “M”), CMA Land Data Assimilation System (CLDAS) high-resolution meteorological forcings (“F”), and China Data set of Soil Properties (CDSP) high-resolution soil parameters (“S”) on the Kling-Gupta efficiency (KGE) of root-zone soil moisture simulation under different scenarios. The purple and dark-green bars represent the ensemble results with and without the synergistic effect, respectively. The error bars indicate ±1 standard deviation of the individual members. All statistics were calculated at the station scale and averaged nationally. The names of the comparative experiments are shown in the figure, where, for example, “M + S + F” − “M + F” refers to the difference between the “M + S + F” and “M + F” experiments.

In addition, ignoring the synergistic contribution may provide misleading information on the dominant factor. For example, when we compare the effects of the CSSPv2 model, high-resolution meteorological forcings, and high-resolution soil parameters in East China without considering the synergistic effects (dark-green bars in Figure S11 in Supporting Information S1), the soil parameters would be regarded as the dominant factor. However, when the synergistic effects are considered (purple bars in Figure S11 in Supporting Information S1), the effect of the CSSPv2 model is the largest. These contrasting results arise because the positive influence of the CSSPv2 model on RSM modeling in East China is highly dependent on the accuracy of the forcings and parameters (red and dark-green in the pie charts in Figure 10).

4.5 Implications of the Independent and Synergistic Contributions

The independent contributions can shed light on other hydrological modeling work. For example, future research could be conducted to improve Noah-MP and CoLM regarding the sensitivity of RSM to precipitation and the RSM drying speed during dry periods, especially in North and South China. The high-resolution meteorological forcings make significant independent contributions to improving RSM accuracy at the station scale in Northwest and Southwest China. Therefore, obtaining high-resolution meteorological forcings that merge numerous observations and have high data accuracy should be the first priority when performing high-resolution RSM simulations in western China. In addition, high-resolution soil parameters make a clear independent contribution (8%–33%) at the station scale, suggesting the necessity of using high-resolution soil parameters.

The synergistic contribution represents the interdependence between multiple factors in improving RSM modeling, because the high-resolution forcings (or advanced LSM or high-resolution parameters) do not necessarily improve the RSM simulation due to the constraints from other coarse-resolution factors with large uncertainties. Since the synergistic contributions are particularly large in eastern and southern China, where the frequency of flash drought is increasing rapidly (Yuan et al., 2019), more attention should be paid to the collaborative development of models, forcings, and parameters to provide precise RSM simulations in these regions. In addition, as discussed in Section 4.4, the large synergistic contribution also indicates the necessity of investigating the collaborative influence of different high-resolution factors on RSM modeling instead of only considering individual factors from an independent perspective.
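One simple way to picture the split between independent and synergistic contributions is a residual-based decomposition, sketched below. This is an illustrative reading only: the attribution framework of this study is defined in its methods section, works with ensemble simulations, and may differ in detail. The function name `decompose_gain` and all KGE values in the example are hypothetical and are not results from the paper.

```python
def decompose_gain(kge_ref, kge_single, kge_all):
    """Split the total KGE gain into independent parts and a synergistic residual.

    kge_ref:    KGE of the reference experiment (no factor upgraded)
    kge_single: dict mapping a factor name to the KGE when only that factor is upgraded
    kge_all:    KGE when all factors are upgraded together
    """
    total = kge_all - kge_ref
    independent = {name: value - kge_ref for name, value in kge_single.items()}
    # Whatever the single-factor gains cannot explain is attributed to synergy
    synergy = total - sum(independent.values())
    return total, independent, synergy

# Hypothetical KGE values, for illustration only
total, independent, synergy = decompose_gain(
    kge_ref=0.30,
    kge_single={"M": 0.32, "F": 0.34, "S": 0.31},
    kge_all=0.50,
)
synergy_share = synergy / total  # fraction of the total gain attributed to synergy
```

With these made-up inputs the single-factor gains add up to less than the combined gain, so most of the improvement falls into the synergistic residual, which is the qualitative pattern described for eastern and southern China.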

4.6 Uncertainties and Limitations

Some uncertainties could have affected this study. First, although the CLDAS meteorological forcings and CDSP soil parameters are based on numerous observations, they are far from perfect and contain uncertainties. As a result, there are uncertainties in the independent and synergistic effects of high-resolution forcings and soil parameters. Second, although an unprecedented number of RSM stations were used to evaluate the RSM simulations, the stations remain sparse in western China, including Tibet, Xinjiang, and Inner Mongolia. Therefore, the current results may not represent the actual conditions in western China well. Continuous efforts are needed to establish more stations in these regions. Third, interpolating different forcings and soil parameter products to the same resolution may cause uncertainties in the RSM simulation (Chen & Yuan, 2020). However, this does not influence the significance of the results because we repeated the simulation and analysis at 30 km resolution, and the independent and synergistic influences did not change significantly (Figures S12 and S7 in Supporting Information S1).

This study conducted simulations at the highest spatial resolution of the meteorological forcings (6 km). Figure S13 in Supporting Information S1 illustrates that 80% of the RSM stations are situated in grids with relatively low subgrid variations (<30%) for variables that are important for SM heterogeneity, such as topography, soil texture, and land cover type (Chaney et al., 2016). Therefore, the majority of RSM stations have good spatial representativeness at a 6 km resolution. However, the remaining 20% of stations may suffer from uncertainties due to the mismatch in spatial representation between observations and simulations. In addition, although four LSMs, four meteorological forcing products, and three soil parameter data sets were used in this study to minimize sampling uncertainties, there are still many other LSMs (e.g., the Community Land Model and the Variable Infiltration Capacity model) and products (e.g., satellite-based precipitation and machine-learning-based soil parameters). Moreover, this study focused on the soil hydraulic parameters that can be observed (or at least parameterized according to observations) and have a critical influence on RSM modeling (Arsenault et al., 2018; Gan et al., 2015; J. Li et al., 2013), whereas the other hydrological parameters that cannot be observed and need calibration were set to their default values. Further work is required to extend the framework to higher spatial resolution and larger regions with more data sets/models/parameters to obtain a full picture of the independent/interdependent relationships between models, forcings, and parameters in the RSM simulation.

5 Conclusions

The advanced LSMs, newly developed high-resolution meteorological forcings, and soil parameters are expected to facilitate high-resolution RSM modeling. However, a single factor may not improve the RSM simulation accuracy due to the limitations of other factors with large uncertainties. To date, the independent contribution of one factor to RSM modeling accuracy and the interdependent relationships between multiple factors have been unknown. In this study, we proposed a framework to distinguish the independent and synergistic contributions based on ensemble simulations. The framework was applied to the mainland of China, and the ensemble simulations were performed using four LSMs, four meteorological forcing products, and three soil parameter data sets. The following results were observed.

Simulations using all the high-resolution factors (including forcings, parameters, and the CSSPv2 LSM with improved physical parameterizations) significantly improve the RSM modeling accuracy at national to station scales, resulting in a 0.1–0.42 higher KGE than the reference product (simulations from other LSMs forced with coarse-resolution forcings and parameters). The improved KGE is much larger than the sum of the improvements due to the use of each individual high-resolution factor alone, suggesting a strong interdependence between these high-resolution factors in facilitating the RSM modeling. At national and regional scales, the interdependent (or synergistic) effect dominates the improvement in RSM simulation by contributing 68%–95% of the increase in KGE score. At the station scale, the independent contribution increases, but the synergistic contribution is still dominant in eastern China. The independent effect of forcings/parameters is due to the higher accuracy of high-resolution data sets, while the independent effect of the CSSPv2 LSM is attributed to its superiority in capturing a reasonable RSM drying/wetting speed. The synergistic effect is caused by consistent improvements in the magnitude and variability of the simulated RSM, which can hardly be achieved when using only an individual high-resolution factor. Ignoring the synergistic contributions will underestimate the influence of advanced models, high-resolution meteorological forcings, and high-resolution parameters by 25%–80% and provide misleading information on the dominant factor affecting high-resolution RSM modeling accuracy. Thus, we highlight the necessity of investigating the effects of advanced LSMs, high-resolution forcings, and high-resolution parameters on RSM simulation from an interdependent perspective.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (U22A20556), the Natural Science Foundation of Jiangsu Province (BK20220460), the Natural Science Foundation of Jiangsu Province for Distinguished Young Scholars (BK20211540), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (22KJB170004), and the Major Science and Technology Program of the Ministry of Water Resources of China (SKS-2022019).

    Conflict of Interest

    The authors declare no conflicts of interest relevant to this study.

    Data Availability Statement

    The GLDASv2.1 and ERA5 data sets are available from the NASA Land Data Assimilation System (https://ldas.gsfc.nasa.gov/gldas) and the Climate Data Store (https://cds.climate.copernicus.eu/cdsapp#!/home), respectively. The CLDASv2.1 meteorological forcing and soil moisture data sets are provided by the China Meteorological Data Service Center (http://data.cma.cn/en). The China Dataset of Soil Properties and the CoLM source code are available from the repository of the land-atmosphere interaction research group at Sun Yat-sen University (http://globalchange.bnu.edu.cn/research/soil2). The Noah-MP model is available from the National Center for Atmospheric Research (https://ral.ucar.edu/model/noah-multiparameterization-land-surface-model-noah-mp-lsm).