A Framework to Quantify the Uncertainty Contribution of GCMs Over Multiple Sources in Hydrological Impacts of Climate Change

The quantification of climate change impacts on hydrology is subjected to multiple uncertainty sources. Large ensembles of hydrological simulations based on multimodel ensembles (MMEs) have been commonly applied to represent overall uncertainty of hydrological impacts. However, as increasing numbers of global climate models (GCMs) are being developed, how many GCMs in MMEs are sufficient to characterize overall uncertainty is not clear. Therefore, this study investigates the influences of GCM quantity on quantifying overall uncertainty and uncertainty contributions of multiple sources in hydrological impacts. Large ensembles of hydrological simulations are obtained through the permutation of 3 greenhouse gas emission scenarios, 22 GCMs, 6 downscaling techniques, 5 hydrological models (HMs), and 5 sets of HM parameters, which enables to decompose uncertainty components using analysis of variance. The influences of GCM quantity are investigated by repeatedly conducting uncertainty decomposition for hydrological simulations from subsets with different numbers of GCMs. The results show that GCMs are the leading uncertainty sources in evaluating changes in annual and peak streamflows, while for changes in low flow, other uncertainty sources except HM parameters also have large contributions to overall uncertainty. Furthermore, on the condition of using no more than five GCMs, there are large possibilities that the overall uncertainty and GCMs' uncertainty contribution are underestimated. Using around 10 GCMs can ensure that the median of different combinations generates similar uncertainty components as the whole ensemble. Therefore, it is recommended to use at least 10 GCMs in studies of climate change impacts on hydrology to thoroughly quantify uncertainty.


Introduction
Designing mitigation and adaptation strategies in response to climate change relies on robust simulations of climate change scenarios (Eyring et al., 2019). Although global climate models (GCMs) are state-of-art tools for global climate change analyses, they are still subject to uncertainty from multiple sources. For example, due to unknown future emissions of anthropogenic forcing, predictions in future climate rely on the projections of greenhouse gases and aerosols, which introduce scenario uncertainty (van Vuuren et al., 2011;Yip et al., 2011). Scenario uncertainty is often accounted for through predefined scenarios of societal development (e.g., representative concentration pathways, RCPs). Besides, since different GCMs have their own simplification, parameterization, and numeric approximations about climate systems, there is model uncertainty in long-term climate simulations (Knutti et al., 2010(Knutti et al., , 2019Murphy et al., 2004).
In studies of climate change impacts, additional sources of uncertainty are also introduced when climate simulations are used to obtain projections in the impact fields. For example, there are generally two steps to obtain projections about future hydrological impacts in watersheds: (1) the outputs of GCMs are downscaled to the scale of watersheds and (2) the downscaled data are then input into hydrological models to simulate hydrological responses under climate change scenarios (Chen et al., 2013;Wang et al., 2019;Wilby & Harris, 2006). The choices of these steps influence the simulations of hydrological responses under climate change (Bastola et al., 2011;Jung et al., 2012;Poulin et al., 2011;Rana & Moradkhani, 2015;Shen et al., 2018;Velázquez et al., 2013). For example,  quantified the uncertainty of downscaling techniques in studies of hydrological impacts and highlighted the necessity of using multiple downscaling methods. Najafi et al. (2011) evaluated the uncertainty resulting from the selection of hydrological models in the quantification of hydrological responses under climate change and found that hydrological models contribute more uncertainty during the dry season. Her et al. (2019) considered the uncertainty of hydrological model parameterization and addressed its significant influences on simulating slow components of streamflow.
In general, the choices on each process of climate modeling and impact modeling affect the hydrological projections under climate change. Many studies have named this series of modeling steps as the impact modeling chain, in which the uncertainty propagates as different choices are made for each process (Addor et al., 2014;Bosshard et al., 2013;Chegwidden et al., 2019;Chen, Brissette, Poulin, & Leconte, 2011;Dobler et al., 2012;Vidal et al., 2016;Wilby & Harris, 2006). Figure 1 shows an example of the impact modeling chain for studies of hydrological climate change impacts, which is used in this paper. Since a failure to consider uncertainty from multiple sources may lead to a biased assessment of climate change impacts, it is important to cover the overall uncertainty in the impact modeling chain.
One common way to investigate the overall uncertainty of hydrological projections is to design a factorial experiment where multiple options are implemented for each step in the impact modeling chain. For example, climate model ensembles formed by multiple GCMs are often used to assess the model uncertainty from GCM structures (Chen, Brissette, Poulin, & Leconte, 2011;Knutti et al., 2010;Wang et al., 2018;Wilby & Harris, 2006). Velázquez et al. (2013) investigated the uncertainty of hydrological modeling in the evaluation of climate change impacts on water resources by considering multiple types of hydrological models. Besides, the large ensembles based on the factorial experiment make it possible to analyze the relative contributions of each source to the overall uncertainty. In addition to the identification of major uncertainty components, quantifying uncertainty contributions of different sources may also help to highlight the potential ways ahead to reduce uncertainty in impact assessment. Many studies have employed different methods to analyze the uncertainty contributions from different sources. For example, Chen, Brissette, Poulin, and Leconte (2011) grouped the results of hydrological simulations according to modeling options and compared the range of the means of each group among different modeling steps. Dobler et al. (2012) compared the uncertainty contributions among global or regional climate models, bias correction methods, and parameters of a hydrological model by only varying the modeling step under focus and keeping others fixed. Bosshard et al. (2013) used the analysis of variance method (ANOVA) to obtain the uncertainty contributions of GCMs, downscaling techniques, hydrological modeling, and their interactions.
However, in the context of factorial experiments, only finite numbers of options can be considered in each step of the impact modeling chain. For example, Tian et al. (2016) allocated sources of uncertainty for high flow projections by employing 4 greenhouse gas emission scenarios, 3 GCMs, 10 parameter sets for downscaling, and 3 hydrological models. They found that the hydrological models contribute the largest share of uncertainty. Mandal and Simonovic (2017) used 3 emission scenarios, 4 GCMs, and 6 downscaling methods to model future streamflow predictions for a Canadian watershed, and downscaling methods are found to be the largest source to contribute uncertainty. Chegwidden et al. (2019) used 10 GCMs with better performances in reproducing observed metrics and found that emission scenarios and GCMs are the most important contributor for the variance of annual streamflow and timing while hydrological models are the largest contributor for low flows. The limited size of options may affect the uncertainty quantification of each source. Using small numbers of modeling options may lead to insufficient uncertainty quantification, especially for GCMs, which are usually estimated as the major uncertainty source in many studies (Bosshard et al., 2013;Chen, Brissette, Poulin, & Leconte, 2011;Dobler et al., 2012).
On the other hand, with the substantial development of climate modeling, an increased number of GCMs has been developed (Eyring et al., 2019). To be specific, 61 different GCMs are archived in the Coupled Model Inter-comparison Project Phase 5 (CMIP5), and the ongoing CMIP6 encompasses more than 80 GCMs (Eyring et al., 2016;Taylor et al., 2012). Although it is recommended to use as many GCMs as possible in order to fully evaluate GCMs' contribution to uncertainty (Knutti et al., 2010;Santer et al., 2009), using all available climate simulations will result in large computational costs in the quantification of hydrological impacts from climate change. Thus, analyzing the necessary number of climate models can shed light on the way to employ the ensemble of GCMs in the studies of hydrological impacts for researchers and practitioners.
Many studies have investigated the selection methods of GCMs for the quantification of hydrological impacts of climate change, which can be generally separated into two types. The first type of approaches involves the ranking of GCMs based on their performances with respect to reproducing the observed characteristics (Evans et al., 2013;Raju & Nagesh Kumar, 2014). For example, Ahmadalipour et al., (2017) ranked 20 GCMs based on their skills in simulating multiple aspects of daily observations in Columbia River Basin, which could provide an objective reference for selecting more reliable GCMs for regional climate change impact assessment. However, the ranks of GCMs may be altered due to their nonstationary performances in simulating climates (Hui et al., 2019). Another type of approaches seeks to preserve the uncertainty of climate change signals related to the choice of GCMs (Cannon, 2015;Warszawski et al., 2014). For example, Wang et al. (2018) used two envelope-based selection methods of GCMs in quantifying the hydrological impacts and their results showed that the selection of about 10 climate simulations can cover the majority of uncertainty range. This type of approaches matches the motivation of using multimodel ensembles (MMEs), which is to consider the uncertainty resulting from different sources.
Despite underlying methods, most of selection approaches only focused on the uncertainty related to the choice of GCMs. However, the assessment of hydrological impacts involves multiple uncertainty sources in the impact modeling chain, so the selection based on GCM uncertainty may not sufficiently reflect overall uncertainty of hydrological impacts. In addition, although some studies have applied the factorial experiments to evaluate uncertainty contributions from multiple sources, the influences of the numbers of GCMs employed on the uncertainty quantification in the impact modeling chain are not investigated. Investigating the necessary number of GCMs can help to reduce the computational burden of applying multiple GCM outputs in the quantification of hydrological impacts of climate change.
Accordingly, the objectives of this study are to (1) analyze the uncertainty contributions from five sources in the modeling chain of hydrological impacts of climate change and (2) investigate the influences of the number of GCMs on the quantification of uncertainty contribution from GCMs. In this study, we conduct factorial experiments in two watersheds with contrast hydroclimatic characteristics. The multivariate ANOVA method is then used to partition the total uncertainty into respective sources. In order to investigate the influences of the number of GCMs on quantifying uncertainty of hydrological impacts, the uncertainty

10.1029/2020EF001602
Earth's Future decomposition is repeatedly conducted for GCM combinations that comprise different numbers of GCMs and is compared with the uncertainty quantification of using all GCMs.

Study Area
This study was conducted over the snowfall-dominated Manicouagan-5 watershed  in Canada and the rainfall-dominated Xiangjiang watershed in China ( Figure 2). The Manic-5 watershed with a drainage area about 24,610 km 2 locates at the center of Quebec, Canada (Figure 2a). It is characterized by a continental subarctic climate with half of the year below 0°C. The average air temperature over the watershed is about −3°C and the average monthly temperature ranges from −22°C (January) to 13°C (July). The average annual precipitation is about 950 mm, which is relatively evenly distributed throughout the year. The average streamflow at the outlet of the Manic-5 watershed (the Daniel-Johnson Dam) is about 530 m 3 s −1 , and the streamflow process is mainly dominated by the spring snowmelt runoff.
The Xiangjiang watershed with a drainage area of about 94,660 km 2 locates at central-south China and has a humid subtropical climate ( Figure 2b). The subcatchment above the Hengyang gauge in the middle stream was used (drainage area is about 52,150 km 2 ). The mean air temperature over the subcatchment is about 17°C with the monthly mean ranging from 7°C (January) to 29°C (July). The mean annual precipitation is about 1,483 mm, most of which results from the monsoon in summer. The mean streamflow at the outlet, Hengyang gauge, is around 1,400 m 3 s −1 , and the peak streamflow is mainly generated by summer extreme rainfalls.

Historical Observation
Observed daily meteorological (maximum and minimum temperatures and precipitation) and hydrological (streamflow) data were collected for both watersheds and cover the historical reference period . For the Manic-5 watershed, the meteorological data were collected from a gridded data set, which is interpolated from station observation (Hutchinson et al., 2009). The streamflow data were reconstructed from the Daniel-Johnson Dam through mass balance calculations. For the Xiangjiang watershed, the meteorological data were collected from 97 precipitation gauges and 8 temperature gauges. The streamflow data were collected from the outlet of the subcatchment, Hengyang gauge.

Factorial Experiment and Uncertainty Decomposition
Five uncertainty sources in the impact modeling chain were considered in this study: greenhouse gas emission scenarios, GCMs, downscaling methods, hydrological models, and parameters of hydrological models ( Figure 1). Three emission scenarios, 22 GCMs, 6 downscaling methods, 5 hydrological models, and 5 sets of hydrological model parameters were implemented in the factorial experiment, which enables the quantification of uncertainty for each of five sources. In sum, a large ensemble of 9,900 hydrological projections was obtained over each watershed for both the reference  and future (2070-2099) periods. Then, the multivariate ANOVA was used to partition the overall uncertainty of changes in hydrological indicators. ANOVA is repeatedly conducted for all possible combinations of GCMs to investigate the impacts of the numbers of GCMs considered on the quantification of GCMs' contribution to overall uncertainty.

Greenhouse Gas Emission Scenarios
The climate simulations in our study are extracted from the CMIP5 archive, which employs RCPs as the climate forcing. In this study, three emission scenarios (RCP2.6, RCP4.5, and RCP8.5) were used to cover the range of greenhouse gas emissions, resulting from different climate policies in the 21st century. They respectively represent a very low forcing scenario, a medium stabilization scenario and a very high forcing scenario.

GCMs
For the climate simulations, daily maximum and minimum temperatures and precipitation simulated by 22 GCMs were extracted from the CMIP5 archive (Taylor et al., 2012) for both watersheds (Table 1). These 22 GCMs provide the outputs of climate variables under the historical scenario for the reference period  and the three RCP scenarios used in this study for the future period (2070-2099). The outputs of GCM 10.1029/2020EF001602 were extracted from the ensemble run of r1i1p1, except EC-EARTH, whose outputs were extracted from the ensemble run of r8i1p1.

Downscaling Methods
Six statistical downscaling techniques were used to generate climate change scenarios at the watershed scale. The six downscaling techniques include one change factor method (daily scaling method, DS), two  Earth's Future univariate bias correction methods (daily bias correction method, DBC; and quantile delta mapping method, QDM), two multivariate bias correction methods (two-stage quantile mapping, TSQM; and multivariate bias correction based on N-dimensional probability density function transform, MBCn) and one stochastic weather generator-based method (generator for point climate change, GPCC). All six methods consider different biases or climate change signals for different levels of precipitation by using the cumulative distribution function of precipitation amounts. The characteristics of these downscaling methods are briefly introduced here, and more detailed information can be found in corresponding references.
The DS method (Mpelasoka & Chiew, 2009) assumes that climate change signals simulated by GCMs are reliable and calculates the climate change signals based on the differences (or ratios) in the simulated distributions of temperature (or precipitation amounts) between the future and reference periods. Future climate scenarios are generated by superimposing the climate change signals to the observed series.
Two univariate bias correction methods assume that the biases of GCMs in the future period are consistent with those in the reference period. The DBC method (Chen et al., 2013) involves two steps: (1) the biases in precipitation occurrence are corrected by determining the precipitation threshold for simulated precipitation amounts and (2) the biases in temperature (or wet-day precipitation amounts) are calculated based on the differences (or ratios) in the distributions of temperature (or wet-day precipitation amounts) between the reference simulation and observation, and the same biases are removed for the future simulation. The QDM method  can explicitly preserve simulated climate change signals by quantile when correcting the biases of climate simulations. The QDM method firstly detrends the future simulation by quantile and corrects its biases by using the quantile mapping method; then, the climate change signals in quantiles are added upon the bias-corrected series to generate future climate change scenarios in the watershed scale.
Two multivariate bias correction methods do not only correct biases in the distributions of temperature and precipitation but also the intervariable correlations. The TSQM method  firstly uses the DBC method to correct biases for each climate variable, separately. Then, a distribution-free shuffle approach is used to construct the intervariable correlations by rearranging the sequences of bias-corrected series. The MBCn method (Cannon, 2017) uses the QDM method in the correction of biases for each climate variable and constructs the intervariable correlations by repeatedly corrects the random linear combinations of climate variables.
The GPCC method (Chen et al., 2014) generates climate scenarios based on the combination of the quantile mapping method and a stochastic weather generator. The climate generator (CLIGEN) incorporated into GPCC uses the normal distribution to simulate the temperature series, the first-order two-state Markov chain to simulate precipitation occurrence, and the Pearson Type-III distribution to simulate precipitation amounts. In the first step, the quantile mapping method is used to correct the biases of the GCM-simulated monthly series. In the second step, the parameters in CLIGEN are estimated based on observations during the reference period. For the simulation during the future period, the parameters in CLIGEN are scaled based on the bias-corrected series, which are obtained from the first step.

Hydrological Models
Five lumped hydrological models were employed to consider the influences of hydrological model structures on hydrological simulations: GR4J-6, HBV, HMETS, SIMHYD, and XAJ. All the hydrological models used in this study require minimum and maximum temperatures and precipitation as the inputs, and the potential evaporation is estimated by the Oudin equation (Oudin et al., 2005).
The GR4J-6 hydrological model is a conceptual lumped hydrological model, which consists of CemaNeige snow accumulation and snowmelt module and GR4J rainfall-runoff module (Arsenault et al., 2015). GR4J module is an empirical four-parameter rainfall-runoff model and simulates runoff generation and routing processes through two nonlinear reservoirs and two linear unit hydrographs (Perrin et al., 2003). Since no snow component in GR4J limits its applicability in snow-dominated watersheds such as Manic-5, the CemaNeige module is introduced to simulate the snow accumulation and snowmelt processes. The CemaNeige module separates precipitation into rainfall and snowfall and calculates the snowmelt based on a degree-day method, which has two parameters to be calibrated (Valéry et al., 2014). In GR4J-6 model,

10.1029/2020EF001602
Earth's Future the rainfall and snowmelt simulated by CemaNeige module are used as the inputs to the GR4J module.
The HBV model is a conceptual lumped hydrological model that includes 14 parameters to be calibrated (Seibert & Vis, 2012). Its snow accumulation and snowmelt module considers the processes of snowmelt, the retention of meltwater and the refreezing of meltwater and has five free parameters. Effective rainfall that generates runoff is simulated through a soil moisture reservoir and is separated into three sources of runoff: peak flow, intermediate flow, and baseflow. The runoff routing is simulated by a triangular weighting function.
The HMETS model is a conceptual 21-parameter hydrological model, which includes four components: snow accumulation and snowmelt module, evapotranspiration module, vertical water balance, and horizontal transport (Martel et al., 2017). Compared to previous hydrological models, the snowmelt module in HMETS also simulates snowmelt and refreezing processes but considers finer processes with 10 parameters. The actual evapotranspiration is calculated as a fraction (a parameter to be calibrated) of potential evapotranspiration. The vertical water balance separates the effective rainfall into four sources of runoff (surface runoff, delayed runoff, hypodermic flow, and groundwater flow) based on two linear reservoirs and contains six free parameters. The first two runoff components are routed through two gamma distributions in the horizontal transport, which has four free parameters.
The SIMHYD model is a conceptual rainfall-runoff model, which has nine free parameters and considers three mechanisms for the runoff generation: infiltration excess runoff, saturation excess runoff and baseflow (Chiew et al., 2002). Infiltration excess runoff is simulated through an interception store. Infiltration is separated into saturation excess runoff and baseflow through two reservoirs. Streamflow routing is simulated with the use of a unit hydrograph.
The XAJ model is a conceptual 15-parameter rainfall-runoff model (Zhao et al., 1980). Different from other hydrological models, it separates the watershed area into pervious and impervious areas and considers a three-layer soil moisture store to simulate actual evapotranspiration. In the impervious area, all of the effective rainfall transfers to surface runoff. In the previous area, effective rainfall is separated into three sources of runoff: surface runoff, interflow, and groundwater runoff. The surface runoff routes through a unit hydrograph to the outlet of watersheds. Similar to GR4J, CemaNeige is added to SIMHYD and XAJ models to simulate the snow accumulation and snowmelt processes.

Parameters of Hydrological Models
The observed minimum and maximum temperatures, precipitation, and streamflow series were used to calibrate free parameters for five hydrological models. The observed data used for calibrating parameters cover a length of 25 years as shown in Table 2, and the odd years are used as the calibration period whereas the even years are used as the validation period. The optimal values for all free parameters in hydrological models were selected based on the algorithm of evolution strategy with covariance matrix adaptation (CMA-ES) (Hansen & Kern, 2004). CMA-ES was used since it outperforms many other algorithms in calibrating hydrological models and it is especially efficient in handling higher-dimension parameter space (Arsenault et al., 2014). In terms of the objective function, Pushpalatha et al. (2012) revealed that Nash-Sutcliffe efficiency (NSE) only focuses on larger flows but shows no sensitivity to low flow while NSE of natural logarithm streamflow (LNSE) tends to focus on lower flows. Therefore, in order to obtain good performances in simulating both peak streamflow and low flow, the objective function used in this study is the mean of NSE and LNSE. In the CMA-ES, the recommended default configuration of Hansen and Kern (2004) was used for the population size, N, and the max iteration, I. The population size N is 4+⌊3ln(p)⌋, where p is the number of free parameters. The max iteration I is N þ 5 ð Þ 2 × 10 3 = ffiffiffiffi N p . The stopping rule is set as the difference in optimal functions between two iterations smaller than 10 −13 .
The uncertainty of hydrological model parameters is bracketed by repeatedly performing the calibration 5 times with different initial random seeds in CMA-ES (i.e., for each hydrological model, five sets of parameters are obtained). The ranges of objective functions corresponding to optimally selected parameter sets are shown in Table 2, and the hydrographs of streamflows simulated by calibrated hydrological models are presented in Figure 3. The results of calibration show that all five hydrological models yield efficiencies greater than 0.80, which means they are applicable in simulating streamflows for both watersheds. Figure 3 10.1029/2020EF001602 Earth's Future presents that calibrated models have good performances in simulating both peak streamflow and low flow. In addition, the HMETS and XAJ models show the best performances among five hydrological models, followed by the HBV model. The performances of GR4J-6 and SIMHYD models are slightly inferior to Table 2 The

10.1029/2020EF001602
Earth's Future other models. The hydrological simulations under climate change scenarios were simulated by inputting the downscaled GCM outputs into the calibrated hydrological models.

Uncertainty Decomposition
By implementing multiple options for each step in the impact modeling chain in Figure 1, we can obtain a large ensemble of 9,900 hydrological simulations for the estimation of overall uncertainty and uncertainty compositions. In order to evaluate the uncertainty of hydrological impacts in terms of different aspects, this study compared the simulated streamflow in the future period to that in the reference period in terms of three different hydrological indicators: annual mean streamflow, annual peak streamflow, and annual 7-day low flow.
In this study, overall uncertainty is estimated by the variance of changes in hydrological indicators, and the overall uncertainty is decomposed to the uncertainty contribution of different sources and the interaction of sources by the multivariate ANOVA method. This method has been applied by some researchers to partition the uncertainty components in studies of hydrological impacts (Addor et al., 2014;Bosshard et al., 2013;Chegwidden et al., 2019). Change in a specific hydrological indicator of one hydrological simulation in the ensemble, Δx ijkmn , is assumed to follow the model as Equation 1.
where μ represents the mean change of the whole ensemble in the hydrological indicator; E i , G j , D k , H m , and P n respectively represent the effects on the hydrological indicator of ith emission scenario, jth GCM, kth downscaling method, mth hydrological model, and nth set of parameters. I ijkmn represents the sum of effects of interaction between different sources. Based on the ANOVA method, the total variance (overall uncertainty), VT, can be separated into contributions from different sources as Equation 2.
where VE, VG, VD, VH, VP, and VI respectively represent the variance contributed by effects of emission scenarios, GCMs, downscaling methods, hydrological models, parameter sets of hydrological models, and the interactions between different sources. By dividing the variance from different sources by the total variance, the fractional contributions of different sources to the overall uncertainty are obtained.
The major focus of this study is on the impacts of the quantity of GCMs on the uncertainty quantification. This was evaluated by comparing the overall uncertainty and uncertainty components of using subsets of GCMs to those of using all GCMs. To be specific, hydrological simulations from subsets of GCMs were used to quantify overall uncertainty and uncertainty components in the ANOVA method and were compared with the results of using all 22 GCMs. For example, when using 2 GCMs as the subset, there were 22 2 ¼ 231 combinations for the subset, and for each combination, the 900 hydrological simulations from the two GCMs in the combination were used to quantify uncertainty. The overall uncertainty and uncertainty contributions based on the subsets of GCMs were compared with the results of using all 22 GCMs. The uncertainty quantification of using GCM subsets is better when it is more similar to the results of using all GCMs.

Overall Uncertainty and Uncertainty Contributions
The overall uncertainty of changes in three hydrological indicators based on the streamflows simulated by all 22 GCMs is shown in Table 3 in the form of standard deviation. The results show that for both watersheds, the uncertainty of changes in mean streamflow is the smallest, whereas the uncertainty of changes in peak streamflow and low flow is greater. It highlights that there are larger variances for changes in extreme streamflow than mean streamflow, and thus, it is important to cover the overall uncertainty when evaluating climate change impacts on extreme hydrological events.
Based on the ANOVA method, the overall uncertainty was decomposed to uncertainty components of different sources, and the relative contributions of uncertainty sources to the overall uncertainty are also shown in Table 3. For changes in mean streamflow and peak streamflow, choices of GCMs explain most of

10.1029/2020EF001602
Earth's Future uncertainty. This highlights the necessity to use multiple GCMs to fully reflect the influences of the choices of GCMs on the simulation of hydrological changes under climate change. Besides GCMs, the interaction of different sources is also an important source of uncertainty. It should be noted that the emission scenario contributes parts of uncertainty of changes in annual streamflow over Manic-5 watershed. This may be related to the fact that the temperature rises resulting from different emission scenarios have great influences on the snow accumulation and snowmelt processes. However, for changes in low flow, the uncertainty contributions of GCMs are reduced. For example, in Manic-5 watershed, the interaction of different sources, emission scenarios, and hydrological models are the main sources of uncertainty. In the Xiangjiang watershed, GCMs, interaction, and downscaling methods are the major sources of uncertainty. Hence, when evaluating climate change impacts on low flow, it is necessary to consider the influences of other uncertainty sources in addition to GCMs. It should be noted that for both watersheds and all hydrological indicators, the parameters of hydrological models always contribute the least uncertainty.

Impacts of GCM Quantity on the Overall Uncertainty
One of the main purposes of this study is to investigate the influences of the numbers of GCMs on the uncertainty quantification of climate change impacts on hydrology. Accordingly, the combinations encompassing different numbers of GCMs were chosen from the whole ensemble of 22 GCMs and, then, the overall uncertainty was recalculated for all combinations of GCMs to be compared with the overall uncertainty estimated by using 22 GCMs. Figure 4 presents the median and the range of overall uncertainty of changes in three hydrological indicators when different numbers of GCMs are used. If the difference between the median estimated by GCM subsets and that estimated by all 22 GCMs is smaller and the range of overall uncertainty estimated by different combinations is smaller, it represents the overall uncertainty estimated by the GCM subsets is more similar to the case when all GCMs are used. In terms of the median overall uncertainty estimated from different combinations (Figures 4a and 4b), when the GCM number is smaller than 5, the overall uncertainty is largely underestimated compared with that of using all GCMs for all hydrological indicators and both watersheds. When the number of GCMs reaches 10, the median value of overall uncertainty is similar to that of using all GCMs. As to the range of overall uncertainty when different GCM combinations are used (Figures 4c and 4d), it decreases as more GCMs are used, which means the overall uncertainty estimated by GCM subsets converges to the result by using all GCMs. Specifically, when the GCM number reaches 10, the ranges reduce to almost half of the range of using only two GCMs for all three hydrological indicators and both watersheds.

Impacts of GCM Quantity on Uncertainty Components
Since GCMs are the major source of uncertainty and the numbers of GCMs directly affect the quantification of uncertainty contribution of GCMs, this study analyzed the uncertainty contribution of GCMs when different numbers of GCMs are used, as shown in Figure 5. The dark and light orange panels in Figure 5 respectively represent the thresholds of ±2.5% and ±5% when using the contribution estimated from simulations of 22 GCMs as the standard. Similar to the results of overall uncertainty, when the numbers of GCMs are small, there are large possibilities that the contribution of GCMs to overall uncertainty is underestimated. With the increases in GCM quantity, the uncertainty contributions estimated by different combinations of GCMs Earth's Future converge to the result of using all GCMs. To be specific, for changes in annual mean streamflow over two watersheds, when the numbers of GCMs are not more than five, the estimations of GCMs' contribution to uncertainty from more than half of GCM combinations are below the threshold of −5%. When the numbers of GCMs are not more than eight, more than half of GCM combinations bring about contribution estimations smaller than the threshold of −2.5%. For changes in annual peak streamflow, using the simulations from no more than five GCMs results in the estimation of uncertainty contribution below the threshold of −2.5% for more than half of GCM combinations. In terms of changes in low flow, since the uncertainty contribution of GCMs is relatively small in the Manic-5 watershed, the estimations of uncertainty contribution by GCM combinations are evidently underestimated only when using less than four GCMs. In the Xiangjiang watershed, uncertainty contributions of GCMs are evidently underestimated when GCMs are less five. In addition, using 10 GCMs can make the upper quartile of uncertainty contributions falls within the threshold of +5% and the lower quartile approaches the threshold of −5% for all hydrological indicators. When the numbers of GCMs reach 14 and 15 for the Manic-5 and Xiangjiang watersheds, respectively, the upper and lower quartiles of estimated uncertainty contribution fall within the threshold of ±5% for all three hydrological indicators.
The abovementioned analysis only considers the impacts of GCM quantity on the quantification of uncertainty contribution of GCMs. In order to further investigate its impacts on the quantification of other uncertainty components, some examples (10th, 25th, 50th, 75th, and 90th percentiles) are selected from GCM combinations based on their distributions in estimating the uncertainty components of GCMs as in Figure 5. The uncertainty contributions of all sources for the selected samples are displayed in Figure 6. For example, when two GCMs are used, among the 231 combinations, the 23rd, 58th, 116th, 173rd, and 208th in the distribution of GCMs' contributions are used to analyze uncertainty contributions of all sources. Figure 6 shows the results when 2, 5, 10, 15, 20, and all GCMs are used. The results manifest that using only

10.1029/2020EF001602
Earth's Future two or five GCMs results in large distinctions among the uncertainty contributions estimated by different GCM combinations. For example, in terms of changes in low flow over the Xiangjiang watershed when 5 GCMs are used, the combination in the 25th percentile manifests that the downscaling method is the largest source of uncertainty while it is GCMs for the 75th combination. Over both watersheds, when 10 GCMs are used, the results of uncertainty decomposition based on 25th, 50th, and 75th percentiles of GCM combinations are similar to each other for three hydrological indicators and the maximum differences among the uncertainty contributions in these combinations are smaller than 13%. When the hydrological simulations from 15 GCMs are used, combinations of all quantiles generate similar results of uncertainty contributions, whose maximum differences are smaller than 15%.

Influences of Uncertainty Decomposition Method
In terms of the uncertainty decomposition method, this study only uses the ANOVA method in the previous part, which is common in relevant research, but other methods are also proposed by some studies (Fatichi et al., 2016;Kim et al., 2019;Vidal et al., 2016;Yip et al., 2011). Using different uncertainty decomposition methods may influence the analysis of uncertainty contributions. Therefore, this study also uses the uncertainty decomposition method proposed by Kim et al. (2019) to examine the robustness of results. This method follows the concept of uncertainty cascade and assumes that the uncertainty components are stepwise cumulative (this method is named as the cumulative uncertainty decomposition method (CUD) in this

10.1029/2020EF001602
Earth's Future study). Figure 7 shows the GCMs' contribution to uncertainty when CUD method is used as the uncertainty decomposition method. The characteristics of results, in this case, are similar to the results based on the ANOVA method. On the condition of no more than five GCMs, more than half of GCM combinations evidently underestimate the uncertainty contribution of GCMs (i.e. smaller than the −5% threshold). When the GCM quantity reaches 10, the median estimated uncertainty contributions are close to the results of using all 22 GCMs.

Discussion
In order to consider the overall uncertainty resulting from the choice of options in the impact modeling chain, it is a common practice to use the large ensemble of simulations from the permutations of multiple options for each step in studies of hydrological impacts (Chen, Brissette, Poulin, & Leconte, 2011;Mandal & Simonovic, 2017;Wilby & Harris, 2006). Some studies have proposed to decompose the overall uncertainty into the contributions of sources to investigate the main origination of uncertainty (Addor et al., 2014;Bosshard et al., 2013;Chegwidden et al., 2019;Dobler et al., 2012;Mandal & Simonovic, 2017). However, more and more climate simulations are available for the studies of hydrological impacts. Only limited GCMs considered in the impact simulations may cause unstable analysis of uncertainty decomposition.

Earth's Future
Thus, inadequate numbers of GCMs may challenge the robustness of overall uncertainty and uncertainty components, and rare research investigates the influences of GCM quantity. Accordingly, this study proposes to evaluate the impacts of the number of GCMs on the quantification of overall uncertainty and uncertainty components. This study reveals that when only small numbers of GCMs are considered, there are large possibilities that the overall uncertainty and the uncertainty components differ to the results of using all GCMs. For example, when only five GCMs are used, the standard deviation of changes in annual mean streamflow over the Manic-5 watershed covers a wide range (14-86%) and the uncertainty contribution of GCMs ranges from 1% to 75% for the ANOVA method. When the GCM number reaches 10, the differences are relatively small in terms of the uncertainty compositions. In addition, the robustness of results was examined by using a different uncertainty decomposition method (CUD). In terms of the influences of GCM numbers on the quantification of GCMs' uncertainty contribution, similar characteristics are observed in this case. Therefore, in studies of climate change impacts on hydrology, in order to reasonably evaluate the uncertainty contributions, it is recommended to use the outputs of at least 10 GCMs on the condition of inadequate computational resources.
In this study, all GCMs are assigned equal weights to simulate the hydrological impacts of climate change. However, there may exist some problems with this assumption. The performances of GCMs vary with locations and variables. Besides, the equal weights assume that GCMs are independent of each other, which may 10.1029/2020EF001602 not be true due to the common modules, codes or parameterizations among some GCMs (Jun et al., 2012). Thus, the equal weighting may not be the optimal weighting method in dealing with the MMEs. Chen et al. (2017) and Wang et al. (2019) evaluated the performances of unequal weighting methods and concluded that when downscaling/bias correction is performed, the unequal weighting method does not bring obvious influences on the quantification of hydrological impacts. However, recent work suggests that further wise weighting or selection of GCMs may accrue from evaluating processes relevant to the impact field in the watersheds of interest or from the analysis of emergent constraints (Eyring et al., 2019;Klein & Hall, 2015). These methods may help to reduce the uncertainty and improve the robustness for the quantification of hydrological climate change impacts. Thus, how the unequal weighting affects the uncertainty contributions from different sources deserves further study.
It is a limitation that this study only employed lumped hydrological models with different structures and parameters. The spatial differences among climate simulations and spatial heterogeneity within a watershed are not directly reflected in the processes of hydrological modeling (such as runoff generation and streamflow routing). Some studies have evaluated how the choice of the types of hydrological models (including lumped and distributed models) contributes to the uncertainty of hydrological impacts (e.g., Mandal & Simonovic, 2017;Velázquez et al., 2013). They showed that the hydrological model uncertainty is considerably smaller than the GCM uncertainty except for the low flow, which is similar to the results of uncertainty components in this study, so the introduction of distributed hydrological models may not change the main conclusions of the study. In addition, the factorial experiment in this study involves large ensembles of climate simulations to be inputted into hydrological models, so using distributed models may pose a large computational burden. However, the specific influences of distributed hydrological models still deserve further examination.

Conclusions
This study evaluates the overall uncertainty and uncertainty contributions from five sources in the quantification of hydrological impacts of climate change based on the ANOVA method over two watersheds. In order to further investigate the impacts of GCM quantity on the evaluation of uncertainty, this study reconducts the uncertainty decomposition for the subsets of hydrological simulations from different numbers of GCMs. The main conclusions are as follows.
1. Among the three hydrological indicators, the overall uncertainty of changes in extreme indicators (peak streamflow and low flow) is larger than that of changes in the mean indicator. Changes in hydrological extremes are more vulnerable to the methodological choices in the impact modeling chain. 2. For the changes in mean annual streamflow and peak streamflow, GCMs are the leading source of uncertainty. Thus, in evaluating climate change impacts on annual or peak streamflows, it is a priority to consider the influences of GCM. In contrast, for the changes in mean low flow, other sources besides GCMs also have large contributions to the overall uncertainty, which means that multiple uncertainty sources deserve consideration in this case. 3. On the condition of no more than five GCMs, there exist large possibilities that the overall uncertainty and GCMs' contribution to uncertainty in hydrological impacts are underestimated. Using at least 10 GCMs can make the estimation of the overall uncertainty and uncertainty components similar to the results of using all 22 GCMs. Therefore, if the computation resources are limited in studies of climate change impacts on hydrology, at least 10 GCMs should be used to adequately reflect the overall uncertainty and uncertainty contribution of GCMs.