Quantifying sources of inter‐model diversity in the cloud albedo effect

There is a large diversity in simulated aerosol forcing among models that participated in the fifth Coupled Model Intercomparison Project, particularly related to aerosol interactions with clouds. Here we use the reported model data and fitted aerosol‐cloud relations to separate the main sources of inter‐model diversity in the magnitude of the cloud albedo effect. There is a large diversity in the global load and spatial distribution of sulfate aerosol, as well as in global mean cloud top effective radius. The use of different parameterizations of aerosol‐cloud interactions makes the largest contribution to diversity in modeled radiative forcing (−39%, +48% about the mean estimate). Uncertainty in preindustrial sulfate load also makes a substantial contribution (−15%, +61% about the mean estimate), with smaller contributions from inter‐model differences in the historical change in sulfate load and in mean cloud fraction.


Introduction
The interaction of aerosols with clouds causes a significant part of the total aerosol radiative forcing over the industrial era [Zelinka et al., 2014; Shindell et al., 2013; Lohmann et al., 2010; Quaas et al., 2009] and was attributed a low level of confidence in the Intergovernmental Panel on Climate Change assessment [Boucher et al., 2013]. The best estimate of the net effective radiative forcing due to aerosol-cloud interactions is −0.45 W m$^{-2}$ (2010 versus 1750), with a 90% confidence interval of −1.2 to 0 W m$^{-2}$.
Although there is a large uncertainty in the magnitude of aerosol-cloud forcing [Boucher et al., 2013; Zelinka et al., 2014], its inclusion in climate models improves simulations of historical temperature [Wilcox et al., 2013]. Forster et al. [2013] showed that this improvement is not because aerosol-cloud interactions are tuned in models to agree with past temperature variations. Indeed, Shindell et al. [2013] showed that there is little relationship between the magnitude of the aerosol indirect effect and the climate sensitivity, which is in contrast with the third Coupled Model Intercomparison Project (CMIP3) generation of models [Kiehl, 2007].
The majority of CMIP5 models use emissions of anthropogenic aerosols and their precursors from the Lamarque et al. [2010] inventory, and many include at least the cloud albedo effect [Twomey et al., 1984]. Although the models use the same emissions, there is considerable spread in their estimates of the magnitude of the global mean radiative forcing caused by historical changes in aerosols [Zelinka et al., 2014].
Here we aim to understand the causes of the inter-model diversity in aerosol-cloud radiative forcing and identify the model processes that need to be improved to reduce uncertainty.

Diversity in CMIP5 Aerosol and Cloud Properties
CMIP5 models generally have improved representation of aerosol-cloud interactions compared to CMIP3, with 29 out of 45 models having a representation of at least the cloud albedo effect. We use historical experiments from 12 CMIP5 models (see supporting information Table S1) that include at least a representation of the cloud albedo effect and which also archived aerosol data. Detailed analysis of the mechanisms focuses on four models for which cloud top effective radius ($r_e$) was archived and the underlying parameterizations have been published: HadGEM2-ES [Bellouin et al., 2007; Collins et al., 2011], CSIRO-Mk3.6.0 [Rotstayn et al., 2012], IPSL-CM5A-LR [Dufresne et al., 2013], and NorESM1-M [Iversen et al., 2012].
Figure 1. Change over the historical period of (a) sulfate load for 12 CMIP5 models and (b) $r_e$ for nine CMIP5 models, compared to preindustrial values. Models in Figure 1a are numbered in order of increasing preindustrial load. Focus models for the remainder of the study are highlighted.
We relate the diversity in radiative forcing due to changes in cloud albedo to diversity in modeled aerosol, differences in the assumed relationship between $r_e$ and cloud droplet number concentration ($N_d$), and differences in the modeled cloud fraction. Because very few model diagnostics have been archived, we consider the relationships between aerosols and clouds in terms of the reported vertically integrated sulfate load. In reality, aerosol properties at cloud base control $N_d$, and these properties vary among models. However, we show in section 3 that vertically integrated sulfate load and $N_d$ are strongly correlated. Furthermore, changes in sulfate load account for 80% of the change in $r_e$ in NorESM1-M [Kirkevåg et al., 2013]. Given the strong correlation of sulfate load and $N_d$ and the lack of other available model diagnostics, we use this relationship to relate changes in aerosols to forcing among the models.
Figure 1a shows the change in vertically integrated sulfate load over the historical period versus the 1860 global mean load for 12 models. There is a factor of 15 spread in the global mean sulfate in 1860, reflecting differences in meteorology, aerosol transport and deposition, and chemical processes. However, the correlation between 1860 sulfate and the historical change in sulfate is only $r^2 = 0.36$, suggesting that the causes of model diversity in 1860 are likely to be different to those that account for diversity in anthropogenic sulfate changes. For example, additional diversity in 1860 sulfate comes from the inclusion of dimethyl sulfide in some models and different representations of continuously degassing volcanoes. It is not possible to quantify the contribution of each model-specific process to this diversity based on the reported diagnostics. There are also large differences in the spatial distribution of sulfate (supporting information Figures S1 and S2), which will affect the colocation of aerosols and clouds.
Furthermore, although all the models agree on the sign of the sulfate trends, there are large differences in the absolute sulfate load and its rate of change (supporting information Figure S3).
Effective radius ($r_e$) is a metric of the cloud albedo effect and shows considerable inter-model diversity among the few models for which it is archived (Figure 1b). Global mean $r_e$ spans a factor of 16 in 1860 (nine models), although most models predict values between 10 and 12 μm. The decrease in global mean annual mean $r_e$ between 1860 and 2004 ranges from ≈ −0.15 μm in the IPSL family of models to almost −0.7 μm in the HadGEM2 family (Figure 1b and supporting information Figure S3).
Here we investigate how differences in preindustrial sulfate, changes in sulfate load over the industrial era, and differences in the parameterization of the $r_e$-$N_d$ relationship contribute to inter-model diversity in modeled $r_e$ and hence the magnitude of the cloud albedo effect. Meteorological Research Institute-CGCM3 is excluded because sulfur from explosive volcanic eruptions is modeled prognostically, which precludes a like-for-like comparison of the time evolution of sulfate load related to $N_d$ in the troposphere.

Simple Functional Forms of Cloud Top Effective Radius
Four simple functional forms are presented, which enable the quantification of contributions to diversity in the time evolution of $r_e$ and cloud albedo in the absence of all necessary diagnostics from the CMIP5 archive. The functional forms capture the dependence of $r_e$ on the vertically integrated sulfate load and are based on the underlying equations of the four models that provided sufficient aerosol diagnostics to the CMIP5 archive.
Differences in the way that $N_d$ and $r_e$ are calculated in models are two potential sources of inter-model diversity in albedo forcing. We focus on the calculation of $r_e$ from $N_d$. The parameterization of $N_d$ from aerosol mass or number concentration has previously been shown to be an important source of inter-model diversity [e.g., Kiehl et al., 2000; Penner et al., 2006; Storelvmo et al., 2009]. However, two of the four models considered here share a parameterization scheme, so there is not enough diversity of approach to address this issue. Hence, the total inter-model diversity from the choice of different parameterization schemes is likely to be larger than the value we report here.
Each model contains a prognostic equation for $r_e$ in terms of $N_d$, e.g., (from HadGEM2-ES):
$$r_e = \left(\frac{3 \rho_0 q_c}{4 \pi \rho_w k N_d}\right)^{1/3} \qquad (1)$$
where $q_c$ is the cloud liquid water content, $\rho_0$ and $\rho_w$ are the densities of air and water, respectively, and $k$ is a constant that depends on whether the clouds are over land or sea [Jones et al., 2001]. We find a linear correlation between global annual mean sulfate load and vertically integrated $N_d$ with $r \geq 0.98$ for HadGEM2-ES, CSIRO-Mk3.6.0, and IPSL-CM5A-LR (data were not available for NorESM1-M). Using this linear relationship and the functional form of the equivalent equations for $r_e$ from each model, we derive equations for $r_e$ in terms of sulfate load of the form
$$r_e = a \cdot \mathrm{load}^{c} + b \qquad (2)$$
where the values of the constants $a$ and $b$ are found by linear least squares regression of global mean time series of $\mathrm{load}^{c}$ onto global mean time series of $r_e$. In the model parameterization scheme, $c$ is the exponent of $N_d$, e.g., for HadGEM2-ES this would be $-1/3$ following equation (1). This approach gives a very good approximation for the multidecadal time evolution of $r_e$ in the full models (supporting information Figure S4). Further information on the parameterization schemes and the empirical constants is given in the supporting information.
Figure 3b. Eleven-year running means of global mean annual mean sulfate load from HadGEM2-ES and temperature anomalies from HadCRUT4. The period in the mid-twentieth century where temperature has previously been shown to be strongly influenced by the coincident increase in anthropogenic aerosols is highlighted.
Figure 2a shows the relationship between $r_e$ and sulfate load. The thin lines span the whole CMIP5 range of historical sulfate loads, and each model's own sulfate range is shown as thick lines. Differences between the functional forms reflect differences in both the underlying sensitivity of the parameterization of the relationship between $r_e$ and $N_d$ and in temporal trends in cloud liquid water content.
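The regression step can be sketched as follows, assuming the functional form is $r_e = a \cdot \mathrm{load}^{c} + b$ with the exponent $c$ fixed in advance and only $a$ and $b$ fitted. The function name `fit_functional_form`, the synthetic series, and the constants below are hypothetical and are not CMIP5 data or the published empirical constants.

```python
import numpy as np

def fit_functional_form(load, r_e, c):
    """Least-squares fit of r_e = a * load**c + b for a fixed exponent c."""
    load = np.asarray(load, dtype=float)
    r_e = np.asarray(r_e, dtype=float)
    # Design matrix: one column for load**c, one column of ones for the intercept b.
    design = np.column_stack([load**c, np.ones_like(load)])
    (a, b), *_ = np.linalg.lstsq(design, r_e, rcond=None)
    return a, b

# Synthetic, purely illustrative series (assumption: arbitrary load units).
load = np.linspace(0.5, 3.0, 30)
true_a, true_b, c = 2.0, 9.0, -1.0 / 3.0   # hypothetical constants
r_e = true_a * load**c + true_b            # effective radius in micrometres

a, b = fit_functional_form(load, r_e, c)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Because $c$ is fixed, the problem is linear in $a$ and $b$, which is why an ordinary linear least squares solve is sufficient in this sketch.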
Global mean $r_e$ is most sensitive to sulfate changes in the HadGEM2-ES parameterization, and least sensitive in the IPSL-CM5A-LR parameterization (Figure 2a). However, different models also have different sulfate loads: the four models with their native loads predict $r_e$ changes over the historical period that are more similar than if a standardized sulfate load were used, i.e., the inter-model diversity in historical effective radius changes is less than would be expected from the diversity in parameterization alone. Figure 2a shows that in all models $r_e$ is more sensitive to changes in sulfate load when the sulfate load is low [e.g., Carslaw et al., 2013], which means that for a given change in anthropogenic sulfate load, the magnitude of the indirect effect will be larger for models with low preindustrial load. To illustrate this sensitivity, Figure 3a shows how the historical change in $r_e$ in the HadGEM2-ES functional form changes when its own preindustrial sulfate load is replaced by those of 11 CMIP5 models. This produces a broad range of $r_e$ changes, particularly in the early part of the historical period. However, by 2004 the impact of preindustrial sulfate loads on further changes in $r_e$ is small. The higher loads in the present day result in greater buffering and therefore a reduced sensitivity of $r_e$ to further load changes, as suggested by Figure 2a.
Although the uncertainty in the preindustrial aerosol state has a large influence on the change in $r_e$ over the industrial era, changes in anthropogenic aerosol emissions are still important for multidecadal variability in modeled twentieth century cloud properties. In particular, $r_e$ changes rapidly in the mid-twentieth century when there is a pronounced increase in global mean sulfate loads (Figure 3b), which has a strong influence on global mean temperature [Wilcox et al., 2013].
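The stronger sensitivity at low load follows directly if $r_e$ is assumed to take a power-law form in sulfate load, $r_e = a \cdot \mathrm{load}^{c} + b$ with constants $a$, $b$, and $c$. This form is an assumption of the sketch below, not a statement about any individual model.

```latex
\frac{\mathrm{d} r_e}{\mathrm{d}\,\mathrm{load}} = a\,c\,\mathrm{load}^{\,c-1},
\qquad
\left|\frac{\mathrm{d} r_e}{\mathrm{d}\,\mathrm{load}}\right|_{c=-1/3}
  = \frac{a}{3}\,\mathrm{load}^{-4/3}
```

Under this assumption, with $c = -1/3$ and $a > 0$, doubling the baseline load scales the local sensitivity by $2^{-4/3} \approx 0.4$, so the same load increment produces a much smaller $r_e$ change in a model that starts from a higher preindustrial load.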

Causes of Inter-Model Diversity
To estimate the contributions to inter-model diversity in the cloud albedo effect, we vary, in turn, (i) preindustrial sulfate load, (ii) the absolute change in sulfate load over the historical period, (iii) the modeled relationship between $N_d$ and $r_e$, and (iv) modeled cloud fraction. We use shortwave radiative forcing (sRF) as a metric for comparing the relative contributions to inter-model diversity. An approximate sRF is calculated from $r_e$ assuming
$$\tau = \frac{1.5 L}{\rho_w r_e}, \qquad A = \frac{(1-g)\,\tau}{2 + (1-g)\,\tau}$$
based on Platnick and Twomey [1994], where $\tau$ is optical depth, $L$ is historical mean liquid water path, $\rho_w$ is the density of water, $A$ is albedo, and $g = 0.85$ is the asymmetry parameter, and following Carslaw et al. [2013],
$$\mathrm{sRF} = -F_0\, t_a^{2}\, C\, \Delta A$$
where $F_0$ is the top of atmosphere radiative flux, $t_a = 0.75$ is the transmission of the atmosphere, and $C$ is the historical mean total cloud fraction.
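A minimal sketch of this calculation is given below. It assumes the commonly used expressions $\tau = 1.5L/(\rho_w r_e)$, a two-stream cloud albedo $A = (1-g)\tau/(2+(1-g)\tau)$, and $\mathrm{sRF} = -F_0 t_a^2 C \Delta A$. The function names and all numerical inputs other than $g = 0.85$ and $t_a = 0.75$ are hypothetical illustration values, not model output.

```python
RHO_W = 1000.0   # density of water, kg m^-3
G = 0.85         # asymmetry parameter (value quoted in the text)
T_A = 0.75       # atmospheric transmission (value quoted in the text)

def optical_depth(lwp, r_e):
    """tau = 1.5 * L / (rho_w * r_e); lwp in kg m^-2, r_e in metres."""
    return 1.5 * lwp / (RHO_W * r_e)

def cloud_albedo(tau, g=G):
    """Two-stream cloud albedo (assumed form for this sketch)."""
    return (1.0 - g) * tau / (2.0 + (1.0 - g) * tau)

def shortwave_forcing(r_e_pi, r_e_pd, lwp, cloud_fraction, f0):
    """sRF = -F0 * t_a**2 * C * (A_present - A_preindustrial) (assumed form)."""
    a_pi = cloud_albedo(optical_depth(lwp, r_e_pi))
    a_pd = cloud_albedo(optical_depth(lwp, r_e_pd))
    return -f0 * T_A**2 * cloud_fraction * (a_pd - a_pi)

# Hypothetical inputs: r_e falls from 11.0 to 10.5 micrometres at fixed liquid water path.
srf = shortwave_forcing(r_e_pi=11.0e-6, r_e_pd=10.5e-6,
                        lwp=0.08, cloud_fraction=0.6, f0=340.0)
print(f"sRF = {srf:.2f} W m^-2")
```

In this form the forcing is negative whenever $r_e$ decreases at fixed liquid water path, and it scales linearly with the cloud fraction passed in.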
We assume that cloud cover does not change over the historical period. Zelinka et al. [2014] show that aerosol-cloud interactions have little impact on cloud amount in a similar set of models. We also assume that the total cloud fraction of all clouds is representative of low cloud, which means that our estimate of sRF is an upper bound. The multimodel mean functional form estimate of global climate model sRF is −1 W m$^{-2}$, which is comparable to the Zelinka et al. [2014] estimate of −1.04 ± 0.67 W m$^{-2}$, based on nine CMIP5 models. However, we do not aim to quantify the sRF due to the cloud albedo effect, only its relative sensitivity to different sources of uncertainty. Figure 4a shows the functional form estimates as a percentage of the multimodel mean. As can also be seen in Figure 2d, three of the models produce very similar estimates, while the NorESM1-M functional form produces a smaller forcing.
Radiative forcing depends on the modeled mean liquid water paths, top of atmosphere shortwave radiation, and mean cloud fractions. The inter-model differences in these fields are sufficient that the relative sensitivities of modeled sRF to sulfate differ from the relative sensitivities of $r_e$ shown in Figure 2a. This change in the relative positions of the models when moving from $r_e$ change to sRF can be seen in Figures 2a-2c, which show equivalent schematics for global mean $r_e$, cloud albedo, and sRF. The change in the relative positions of the models in this figure demonstrates that the sensitivity of the model response to sulfate changes cannot be predicted by considering the underlying equations in isolation: the magnitude of the indirect effect is determined by a combination of the underlying sensitivity of the parameterization and the model climatology, e.g., cloud fraction and the colocation of aerosol and cloud. The additional consideration of liquid water path in albedo (Figure 2b) and cloud fraction in sRF (Figure 2c) reduces the inter-model spread in the maximum differences relative to the preindustrial in the functional forms (Figure 2d).
The sensitivity of sRF to the parameterization of $r_e$ versus $N_d$ is evaluated by driving the four functional forms with the same sulfate time series so that the only difference between them is the parameterization. Each of the four equations is driven with each of the four model sulfate time series, corresponding to the four points shown for each functional form in Figure 4b. The results are qualitatively similar irrespective of the sulfate time series used, although there are small quantitative differences because of the nonlinear relationship between sulfate load and cloud albedo change. The different parameterizations produce sRF estimates that are between 39% less and 48% more than the mean estimate (Figure 4b). The inter-model spread in sRF estimates is larger when sulfate loads are standardized than in the unperturbed functional forms (Figure 4b versus Figure 4a). This suggests that the actual differences in sRF are less than would be expected from differences in the parameterization of the relationship between $N_d$ and $r_e$ alone. This compensation between model sensitivity and model state can be seen in Figure 2, where CSIRO-Mk3.6.0 $r_e$ is inherently more sensitive to sulfate changes than NorESM1-M (Figure 2a), but the higher preindustrial load in CSIRO-Mk3.6.0 suppresses the $r_e$ change over the historical period (Figure 2d).
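The cross-driving experiment can be sketched as a small matrix calculation in which every functional form is evaluated with every sulfate series. For brevity the sketch compares changes in $r_e$ rather than sRF, and it assumes power-law forms $r_e = a \cdot \mathrm{load}^{c} + b$. All constants, loads, and names (`forms`, `series`, `delta_r_e`) are hypothetical.

```python
import numpy as np

# Hypothetical (a, b, c) constants for four functional forms r_e = a * load**c + b.
forms = {"form_A": (2.0, 9.0, -1 / 3), "form_B": (1.2, 9.8, -1 / 3),
         "form_C": (0.6, 10.5, -1 / 3), "form_D": (1.5, 9.5, -0.25)}
# Hypothetical (preindustrial, present-day) sulfate loads for four driving series.
series = {"series_1": (0.3, 1.2), "series_2": (0.5, 1.6),
          "series_3": (0.8, 2.0), "series_4": (1.0, 2.4)}

def delta_r_e(params, loads):
    """Present-day minus preindustrial r_e for one form and one driving series."""
    a, b, c = params
    pre, present = loads
    return (a * present**c + b) - (a * pre**c + b)

# Rows: functional forms; columns: driving sulfate series.
changes = np.array([[delta_r_e(p, s) for s in series.values()]
                    for p in forms.values()])
# Percentage departure of each form from the across-form mean, per driving series.
percent_of_mean = 100.0 * (changes / changes.mean(axis=0) - 1.0)
print(np.round(percent_of_mean, 1))
```

Holding the driving series fixed within each column isolates the contribution of the parameterization, since the forms then differ only in their constants.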
If preindustrial sulfate load and the change in load over the historical period were independent, and therefore governed by different physical processes, their contributions to inter-model diversity in the cloud albedo effect could be quantified by varying each of them within the 95% confidence interval of CMIP5 values. However, the weak correlation between preindustrial sulfate load and the historical change in load ($r^2 = 0.36$, Figure 1) suggests that there are likely to be some shared sources of diversity. Therefore, to quantify the effect of uncertainty in preindustrial load, we scale the variance of the load values by 0.64 (that is, $1 - r^2$) and use this reduced variance to find a reduced 95% confidence interval. Since our small sample limits our knowledge of the distribution of preindustrial loads, we vary loads within the bounds of the actual model values that lie within the 95% confidence interval, rather than our calculated limits. This is accounted for by the central nine of the CMIP5 values. The same methodology is used to quantify the sensitivity of the indirect effect to the absolute change in historical load. Supporting information Figure S5 shows a graphical representation of load perturbations made in this section.
Perturbing preindustrial sulfate load within the range of the central nine CMIP5 values results in large changes to the multimodel mean sRF. Using the lower bound of preindustrial sulfate load in each model results in a 61% increase in the multimodel mean sRF (20 to 129% for individual models), while the upper bound results in a 15% decrease in the mean sRF (maximum 40% for the individual models) (Figure 4c). The nonsymmetric effect of the upper and lower sulfate loads is due to the buffering effect of $r_e$ at higher sulfate loads (Figure 2a).
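The variance scaling can be sketched as below. The sketch assumes a normal approximation (a multiplier of 1.96 for the 95% interval) and a sample variance with one degree of freedom removed; neither choice is stated in the text. The function name `reduced_interval` and the load values are hypothetical.

```python
import numpy as np

def reduced_interval(loads, r_squared=0.36, z=1.96):
    """Reduced 95% interval after removing the shared fraction r^2 of the variance.

    Returns the calculated interval and the range of actual values inside it.
    """
    loads = np.asarray(loads, dtype=float)
    mean = loads.mean()
    # Keep only the unshared variance: (1 - r^2) * sample variance.
    reduced_sd = np.sqrt((1.0 - r_squared) * loads.var(ddof=1))
    lower, upper = mean - z * reduced_sd, mean + z * reduced_sd
    # Perturbation bounds are taken from actual model values inside the interval.
    inside = loads[(loads >= lower) & (loads <= upper)]
    return (lower, upper), (inside.min(), inside.max())

# Hypothetical preindustrial loads for 11 models (arbitrary units, not CMIP5 values).
loads = [0.1, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 2.5]
(lower, upper), (model_lo, model_hi) = reduced_interval(loads)
print(f"interval: {lower:.2f} to {upper:.2f}; model bounds: {model_lo} to {model_hi}")
```

Taking the bounds from actual model values rather than from the calculated limits avoids perturbing a functional form with a load that no model produced.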
The influence of differences in the absolute load change over the industrial era on sRF is smaller than that of differences in the preindustrial sulfate load (Figure 4c). Imposing the minimum load change from the central nine CMIP5 values on the functional forms results in a mean reduction in sRF of 24% (10 to 39% in individual models). When the equations are driven by the load time series with the greatest change from the central nine CMIP5 models over the industrial era, the average sRF increases by 5% (up to 23% in HadGEM2-ES) (Figure 4c).
Perturbing the cloud fraction used to calculate sRF results in a linear scaling. Using the minimum cloud fraction from 11 CMIP5 models reduces the mean estimate by 15% (7 to 28% in individual models) (Figure 4c). Using the maximum cloud fraction increases the mean estimate by 23% (5 to 35% in individual models).

Conclusions
In order to reduce the uncertainty in modeled aerosol-cloud interactions, it is important to understand the sources of that uncertainty. Although the available diagnostic data from CMIP5 models are very limited, we have shown that functional forms of the response of cloud top effective radius to changes in vertically integrated sulfate load can be used to test the sensitivity of the magnitude of the cloud albedo effect to (i) preindustrial sulfate load, (ii) absolute changes in sulfate load over the industrial era, (iii) the parameterization of the relationship between effective radius and $N_d$, and (iv) modeled cloud fraction.

10.1002/2015GL063301
The parameterization of the relationship between cloud top effective radius and $N_d$ is the largest potential source of inter-model diversity, resulting in a multimodel mean range of shortwave radiative forcing estimates between −39% and +48% about the baseline estimate of −1 W m$^{-2}$. The actual differences between the full models are less than would be expected based on differences resulting from the parameterizations alone (Figure 4). Differences in sulfate load and cloud states mitigate some of the diversity caused by different parameterizations (Figure 2). Hence, shortcomings in our understanding of the physical processes involved may be obscured by compensating errors. It is therefore important to consider models as a whole in order to anticipate the magnitude of the indirect effect, not just their underlying equations.
Differences in meteorology and chemistry lead to pronounced differences in aerosol distribution and regional loads, despite the models being driven with the same anthropogenic emissions. The resultant diversity in the modeled preindustrial state has a large influence on the uncertainty in modeled shortwave radiative forcing from the cloud albedo effect. Driving the functional forms with the central nine preindustrial sulfate loads from 11 CMIP5 models results in a range of shortwave radiative forcing estimates between −15% and +61% about the baseline. Perturbing the absolute change in sulfate load during the industrial era results in a range of −24% to +5% about the baseline, making this the smallest contributor to inter-model diversity of the processes we consider here. This result is broadly consistent with Carslaw et al. [2013], who showed that the albedo forcing is more sensitive to uncertainties in natural aerosol emissions (hence preindustrial conditions) than to uncertainties in anthropogenic emissions. However, the influence of the preindustrial load on the rate of change in effective radius decreases with time, and anthropogenic perturbations to sulfate load still have an important influence on multidecadal variability (Figure 3).
Most of the inter-model spread in the cloud albedo effect results from differences in the response of cloud albedo to changes in aerosol, rather than differences in mean state cloud fraction. However, the influence of the mean cloud state is not insubstantial: perturbing cloud fraction between the maximum and minimum from 11 models changes the multimodel mean shortwave radiative forcing by between −15% and +23% (Figure 4c), which is greater than the effect of perturbing the absolute change in sulfate load during the industrial era. This is consistent with Zelinka et al. [2014] who found that over 20% of the inter-model spread in the cloud scattering component of sRF due to aerosol-cloud interactions was due to differences in mean state total cloud fraction in their subset of CMIP5 models.
The large differences between sulfate loads in models that use the same emissions, the resulting large spread in cloud top effective radii, and the importance of cloud mean states to the magnitude of the cloud albedo effect indicate a need to improve basic model and aerosol fields to improve estimates of the cloud albedo effect. As in previous generations of models, one of the principal sources of model diversity is associated with the calculation of aerosol loading [e.g., Pan et al., 1998; Penner et al., 2006; Liu et al., 2007; Quaas et al., 2009]. Consistent with Carslaw et al. [2013], it is the uncertainty in preindustrial rather than present-day aerosol that has the greatest influence on estimates of radiative forcing.
Our results suggest that the greatest reductions in model uncertainty are likely to be made by resolving the differences in the parameterization of cloud top effective radius and reducing uncertainty in preindustrial aerosol load. Improvement of modeled aerosol fields relies on the availability of global observations, not just in polluted regions but also in pristine regions that are representative of preindustrial conditions [Hamilton et al., 2014].