Aquaplanets as a Framework for Examination of Aerosol Effects

Although fundamental to the planetary radiative balance, aerosol impacts are highly uncertain in climate simulations because of the uneven distribution of aerosol sources and the complex interactions with radiation and clouds that are difficult to represent in climate models. This study proposes that aquaplanet configurations represent an idealized framework to investigate aerosol effects. As a simple demonstration, a series of aquaplanet simulations with the Community Atmosphere Model version 6 shows that the spatial distribution of aerosol emissions changes the aerosol effective radiative forcing even with unchanged total emissions. Some statistical properties of the simulations are presented to show that relatively short model integrations yield robust results. Much of the aerosol effect is shown to arise from aerosol‐cloud interactions, especially through rapid adjustments associated with the aerosol lifetime effect that alter the cloud optical thickness.


Introduction
Aerosols (particles suspended in the air) impact the flow of energy through the climate system both directly and indirectly (Kaufman et al., 2002). Directly, aerosols both scatter and absorb radiation, with the partitioning determined by the composition (aerosol-radiation interaction, ari). Indirectly, they can act as nucleation sites for condensation and exert an influence on cloud processes and optical properties (aerosol-cloud interaction, aci). Both effects impact the evolution of atmospheric flows via the thermodynamic equation, providing a forcing term in the equations of motion (Chemke & Dagan, 2018; Fan et al., 2016). Aerosols also act upon the hydrologic cycle, altering the life cycle of water as it moves and transforms following evaporation from the surface through the atmosphere to eventually return to the surface as precipitation (Lohmann & Feichter, 2005).
Modeling the effects of aerosols on climate has progressed in recent years as sophisticated aerosol schemes have been incorporated into comprehensive climate models. Despite progress, aerosol effects on climate remain highly uncertain in climate simulations (Boucher et al., 2013). One reason is the sheer complexity of aci, as aerosols and clouds interact across many scales, from chemical processes that alter the cloud droplet distribution (e.g., Schmeller & Geresdi, 2019) to the impact of such processes on precipitation, which influences mesoscale circulations (e.g., Wang & Feingold, 2009); the myriad interactions and scales make representing such systems in coarse-resolution models challenging (Stevens & Feingold, 2009).
With the uncertainties in aerosol effects in climate models well acknowledged, further development of aerosol-relevant parameterizations is necessary. With a lack of suitable alternatives, models that use these deficient schemes are employed for studies of climatic impacts of aerosol effects. One might hope that ever-improving observations would provide avenues for progress toward more trustworthy models. These avenues are neither smooth nor direct, often because it is exceedingly difficult to make comparisons that meaningfully evaluate aci (Malavelle et al., 2017). Methodologies for evaluating aci in models are still being developed. The complexity of aci makes this challenging because even the heuristics for conceptualizing aci are incomplete. One approach that appears to be gaining in popularity is to consider the radiative impact of aerosols as being composed of forcing terms and adjustments, with the total effect being considered "effective radiative forcing" (ERF; see Bellouin et al., 2020; Boucher et al., 2013). One advantageous characteristic of ERF is that it can be easily calculated from a pair of climate model integrations; the typical experiment is to prescribe sea-surface temperature (SST) to force an atmospheric model over the past ≈30 years in two realizations: one with "present-day" (PD) aerosol emissions and one with "preindustrial" (PI) emissions. For improved error characteristics, the Radiative Forcing Model Intercomparison Project (RFMIP) protocol improves upon this by using a preindustrial SST climatology instead of a time-varying SST data set (Pincus et al., 2016). ERF is diagnosed as the difference in the net top-of-atmosphere (TOA) radiative flux: $\mathrm{ERF} = R_{\mathrm{PD}} - R_{\mathrm{PI}}$, where $R$ is the net TOA radiative flux. Using climate forcers other than aerosols also produces an ERF (e.g., Forster et al., 2016), so the aerosol ERF is also denoted as $\mathrm{ERF}_{\mathrm{aer}}$.
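The difference of net TOA fluxes described above can be sketched in a few lines. This is a minimal illustration, not the diagnostic code used for the simulations: the function names, the coarse grid, the cosine-of-latitude area weights for a regular latitude-longitude grid, and the flux values are all assumptions made for the example.

```python
import math

def global_mean(field, lats_deg):
    """Area-weighted global mean of field[lat][lon] on a regular lat-lon grid."""
    # Assumption: grid-cell area is proportional to cos(latitude).
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    zonal_means = [sum(row) / len(row) for row in field]
    return sum(w * z for w, z in zip(weights, zonal_means)) / sum(weights)

def erf(r_perturbed, r_reference, lats_deg):
    """ERF as the difference in global-mean net TOA flux between two runs."""
    return global_mean(r_perturbed, lats_deg) - global_mean(r_reference, lats_deg)

# Hypothetical time-mean net TOA flux maps (W m^-2) on a coarse 4 x 3 grid.
lats = [-67.5, -22.5, 22.5, 67.5]
r_pi = [[1.0, 1.0, 1.0] for _ in lats]     # "preindustrial" emissions run
r_pd = [[-0.5, -0.5, -0.5] for _ in lats]  # "present-day" emissions run

erf_aer = erf(r_pd, r_pi, lats)
print(f"ERF_aer = {erf_aer:.2f} W m^-2")
```

With real model output, the two fields would be the time-mean net TOA fluxes from the paired fixed-SST integrations.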
The aerosol ERF can be divided into contributions from ari and aci: $\mathrm{ERF}_{\mathrm{aer}} = \mathrm{ERF}_{\mathrm{ari}} + \mathrm{ERF}_{\mathrm{aci}}$. Both ERF components arise from the combination of forcing and adjustments (Bellouin et al., 2020). The case of ari is relatively straightforward, as the forcing, $\mathrm{RF}_{\mathrm{ari}}$, is the direct, instantaneous radiative forcing due to a change in aerosol while the adjustments can be considered as the radiative response induced by heating perturbations and subsequent dynamical adjustment (this has been described as the semidirect effect in the past; Ackerman et al., 2000; Bond et al., 2013). The aci component of $\mathrm{ERF}_{\mathrm{aer}}$ is sometimes decomposed into a radiative forcing associated with the albedo change induced by an aerosol perturbation that leads to more, smaller cloud droplets (i.e., the Twomey effect; Twomey, 1977) and adjustments that are associated with changes in the liquid water content and the cloud fraction (see Mülmenstädt et al., 2019 for an informative discussion).
There are a number of challenges for representing and investigating aerosol effects using climate models. One is, as has just been described, that diagnosing aci in climate models in ways that are meaningful for improving models is fraught, but there appear to be some emerging concepts and methodologies that could help. These center on the notion of ERF but rely on decomposing ERF into constituent forcing and adjustments, with ongoing work especially focused on how to relate the adjustments to individual processes. A second challenge is that current models are necessarily used for exploring how aerosols impact various aspects of the climate (e.g., Bollasina et al., 2011) despite the known deficiencies of current parameterizations. This endeavor is critical for gaining confidence in projections of future climate; for example, understanding aerosol effects is necessary for predicting the transient warming as greenhouse gas forcing dominates over aerosol effects (Andreae et al., 2005; Mitchell et al., 1995). One other point should be made: Newcomers to aerosol effects are easily and understandably overwhelmed by the topic. The barrier to entry for the field has grown as models have become more complex and computationally expensive while the literature has expanded in multiple dimensions. Considering all three of these points (needs for model development, tools for numerical experimentation, and aids for initiates), there appears to be a need for modeling approaches that can abstract problems and illuminate processes in the current aerosol effects research tableau (see Mülmenstädt & Feingold, 2018 for a similar suggestion).
One option that may be attractive is to adopt an aquaplanet configuration as an idealized framework for aerosol effects. Additional configuration details will be provided later, but let us define the aquaplanet as an idealized configuration of a global atmosphere model in which the lower boundary is entirely water covered and typically there is no sea ice, no topography, and no seasonality (e.g., Blackburn & Hoskins, 2013; Hoskins et al., 1999). These configurations retain the dynamics and physics of the realistic configuration.
The utility of aquaplanets is supported by a long history that continues to the present. The aquaplanet was the framework used for the first dynamical atmosphere models (see Egger & Pelkowski, 2008 and discussion thereof in Blackburn & Hoskins, 2013), and its importance was noted by Lorenz (1967): "within the collection of possible planetary atmospheres, one which is devoid of irregularities occupies a more central and fundamental position than one with any specific arrangement of irregularities." Hayashi and Sumi (1986) provided the first aquaplanet experiments with an atmospheric general circulation model, showing the self-organization of eastward propagating convective clusters in the deep tropics and split intertropical convergence zones straddling the equator. The AquaPlanet Experiment (Blackburn & Hoskins, 2013) was a coordinated model comparison of climate and weather models all configured as aquaplanets; among other results, the comparison showed disparate representations of convective organization across the models, suggesting that parameterized physical processes play a decisive role in tropical variability. The Cloud-Feedback Model Intercomparison Project (Webb et al., 2017) continues the tradition by including aquaplanet experiments in analogy to idealized climate change experiments conducted with realistic geography in contemporary models. While much of the early work with aquaplanets focused on the general circulation, attention has also been given to climate feedbacks (especially cloud feedbacks), both in multimodel assessments (e.g., Medeiros et al., 2015) and process-oriented studies (e.g., Brient & Bony, 2013; Ceppi et al., 2016).
Despite their widespread application in other contexts, aquaplanets have not been much used for investigations of aerosol effects. Many aspects of aerosol effects on climate are intimately connected to the land surface, where most emissions originate (both natural and anthropogenic), and therefore aquaplanets may not be suitable for some important problems regarding aerosol effects. Much of the ongoing discussion of $\mathrm{ERF}_{\mathrm{aer}}$, however, is concerned with impacts on warm clouds over the oceans (see also Mülmenstädt & Feingold, 2018; Zhang et al., 2016). A few recent studies have used aquaplanets to study aerosol effects. McCoy et al. (2018) used a high-resolution aquaplanet to show that increased cloud-drop number concentration increases liquid water path and albedo in midlatitude cyclones. Dagan et al. (2019) used aquaplanet experiments to show that tropical and extratropical precipitation respond differently to aerosol perturbations. Gettelman et al. (2016) reported similar results for aquaplanet experiments that varied the specified cloud-drop number concentration. These examples suggest there is potential in using aquaplanets as a conceptual framework to investigate aerosol effects. All three examples, however, simplify the aerosol representation (specified drop number concentration in both McCoy et al., 2018 and Gettelman et al., 2016, and specified aerosol optical properties in Dagan et al., 2019); such differences from the standard model configurations leave a gap between results from the idealized and realistic models.
Several approaches can be construed as constituting an informal aerosol-focused hierarchy of models. Global models can be simplified by extracting a single column and running just the model physics. Indeed, single-column models (SCMs) are commonly used for parameterization development and are computationally efficient compared to running the global model. SCMs can be prone to "grid locking" to particular states that differ from those found in the fully interactive model, and that can be problematic for analysis. In the case of aerosol effects, there can be ambiguities regarding sources and sinks of aerosol, and, by construction, advective effects are neglected (Lebassi-Habtezion & Caldwell, 2015). Reduced forms of atmospheric models allow advective effects but usually do not include detailed physical representations of aerosols (rather, aerosol-like forcing is applied; e.g., Chemke & Dagan, 2018; Wilcox et al., 2018). Nudged simulations are a common compromise for aerosol studies; nudging constrains the large-scale flow (and often the thermodynamic state) to some reference and can be used to robustly demonstrate small effects but at the expense of strongly damping some interactions (Kooperman et al., 2012; Malavelle et al., 2017; Zhang et al., 2014). Another approach is to directly simplify the representation of aerosol effects, which can be an effective method of model comparison (Fiedler et al., 2018; Stevens et al., 2017; Voigt et al., 2017).
This work proposes that the aquaplanet configuration using the same aerosol representation as realistic configurations should be included in an aerosol modeling hierarchy. The motivation is to provide a configuration that contains the same essential physical processes as the realistic configuration, including emergent features that result from interactions between the physics and dynamics, but with idealized boundary conditions that allow for robust statistics in relatively short integrations and facilitate rapid numerical experimentation. In section 2, we describe the model being used and present an initial experiment using realistic emissions. A simple toy problem is posed in section 3: Does aerosol radiative forcing depend on the spatial distribution of aerosol emissions? A series of simulations is introduced to address this problem. Section 4 shows that results from these simulations are statistically robust. Section 5 provides some analysis of the reasons for the geographic dependence of $\mathrm{ERF}_{\mathrm{aer}}$, and we discuss our findings in section 6.

Model and Experiments
All of the simulations described here use the Community Atmosphere Model, version 6 (CAM6), the atmospheric component of the Community Earth System Model, Version 2 (CESM2; Danabasoglu et al., 2020). The physics of CAM6 are outlined by Bogenschutz et al. (2018). The model uses a four-mode aerosol scheme described by Liu et al. (2016). The model is configured with a nominal 1° horizontal resolution and 32 vertical levels using the finite-volume dynamical core (Lin & Rood, 1997; Lin, 2004).
The aquaplanet is the same numerical model as the Earth configuration but with idealized boundary conditions: The surface is covered by water with prescribed temperature that varies only with latitude, there is neither sea ice nor topography, and the orbit is altered so every day is an equinox (i.e., seasonality is neglected) (Neale & Hoskins, 2000). Relatively short simulations provide stable statistics because of the aquaplanet's symmetry (see Medeiros et al., 2016 and section 4); simulations presented below are 12 years long, initialized from an arbitrary previous aquaplanet state. Despite their idealized nature, these aquaplanets retain salient features of Earth's climate, including the global mean surface temperature (288 K), carbon dioxide concentration (348 ppm), and globally averaged insolation (341 W m$^{-2}$). Land-surface feedbacks are absent, thus avoiding circulation changes induced by land-surface temperature changes; such changes have been demonstrated in fixed-SST experiments with realistic geography (Dong et al., 2014). The absence of such memory effects from the land surface allows precise diagnosis of the system's rapid adjustment to a perturbation (quantified here using ERF).
The typical approach to estimating the ERF of aerosols is to compare fixed-SST simulations with present-day versus preindustrial emissions. The upper panel of Figure 1 shows this comparison for CAM6; one simulation uses emissions from year 2000 repeated over 12 years with monthly SST also from 2000, and the other simulation uses emissions from 1850 while retaining the same SST from 2000 (these will be referred to as amip simulations for convenience). The global average change in TOA radiative flux provides an ERF estimate of −1.65 W m$^{-2}$. This value is more negative than any of the CMIP5 models presented in Rotstayn et al. (2015) using similar experiments or the estimated −0.9 W m$^{-2}$ for 2011 relative to 1750 (Myhre et al., 2013). Part of the discrepancy with the Rotstayn et al. (2015) results is due to the ERF dependence on the background state: Using preindustrial SST derived from a coupled CESM2 simulation (thereby using a different global surface temperature as well as a different pattern of SST) and doing a similar experiment following the RFMIP protocol (this experiment will be denoted rfmip; not all necessary fields are available from these simulations, so they are omitted from some comparisons), the ERF weakens to −1.37 W m$^{-2}$, which is within the range reported by Rotstayn et al. (2015). The lower panel of Figure 1 shows a similar experiment with CAM6, comparing year 2000 emissions to 1850 emissions, resulting in an aerosol ERF of −2.08 W m$^{-2}$. The difference between the upper and lower panels of the figure is that the model is configured as an aquaplanet in the lower panel; this pair of simulations will be denoted as aqua-pd and aqua-pi. Usually, aquaplanet configurations of climate models remove aerosol effects (Williamson et al., 2012), and that is also the case with the default CESM2 aquaplanet (which follows the description in Medeiros et al., 2016).
The aquaplanet simulations in Figure 1 are altered from the default CAM6 aquaplanet by reverting to the same nucleation scheme used in the standard, realistic configuration and applying the same emissions boundary conditions as for the realistic configuration. The aquaplanet SST is the default pattern (called QOBS in Neale & Hoskins, 2000).
Considering the difference between the standard CAM6 configuration and the aquaplanet, less than 1 W m$^{-2}$ difference in ERF is remarkably small; it is less than the range across CMIP5 models. Figure 1 indicates that there are substantial regional differences. Most of these differences are clearly related to topography and changes of surface characteristics. Since there is no land, the aquaplanet has no source of mineral dust aerosol; an experiment in the standard configuration that removes dust emission confirmed that this is a very small (0.03 W m$^{-2}$) contribution to the ERF (not shown). What the aquaplanet lacks in dust, however, it more than makes up in sea salt, which is emitted from the surface as a function of low-level wind speed. The global average sea salt burden in the aquaplanet experiment is around 29 mg m$^{-2}$ compared to around 18 mg m$^{-2}$ for the more realistic configurations (see also Table 1); this change is commensurate with increased ocean area in the aquaplanet. The difference between the experiments is almost entirely due to differences in the locations of land; masking the land locations from both experiments and averaging the "ocean" ERF results in −1.50 W m$^{-2}$ for the amip experiment and −1.46 W m$^{-2}$ for the aquaplanet experiment.
To simplify the aquaplanet with interactive aci, a simulation with only sea salt emissions was conducted (salt). In this case, sea salt takes up water and has a direct radiative effect and also impacts the formation of clouds. This simulation, like most of the aquaplanet simulations, is far from radiative balance (Table 1), highlighting the dramatic impact aerosol effects have on the global energy balance (see Appendix A for additional details of the CAM6 aquaplanet configuration). While it represents a very low-aerosol environment, salt provides a good starting point for evaluation of aerosol effects in this idealized setting.

Example Application: ERF from Regional Aerosol Emissions
CESM2 has a strong ERF when changing between present-day and preindustrial aerosol emissions, as shown by the amip and rfmip experiments with year 2000 and preindustrial SST, respectively. The aquaplanet experiment with realistic emissions (aqua-pd vs. aqua-pi) shows even stronger ERF. In this section, we apply the aquaplanet configuration to ask whether ERF is dependent on the spatial pattern of emissions. This question is relevant when considering preindustrial (or prehistorical) time periods that may have had different natural emissions patterns but also when considering potential changes in the distribution of anthropogenic emissions. It is also crucially connected to climate intervention strategies that rely on aci to increase planetary albedo (Jones et al., 2009; Latham et al., 2012). Some estimates of aerosol forcing make assumptions that directly relate the forcing to the global average emission rate (Andreae et al., 2005; Smith & Bond, 2014; Stevens, 2015), but there is modeling evidence that regional emissions changes are associated with changes in the forcing (Persad & Caldeira, 2018; Shindell & Faluvegi, 2009). Persad and Caldeira (2018) recently reported that regional aerosol emissions perturbations of the same magnitude induce radiative forcing and global temperature responses that vary when applied in different geographic areas. Kasoar et al. (2018) removed sulfate emissions from selected areas and showed differing ERF for different regions, but that study also changed total emissions across experiments. We will consider this question in the highly idealized aquaplanet setting as an example of a problem that can be abstracted into a simpler setting that allows for robust statistics across several experiments for relatively little computational expense.
In the following series of experiments, we use the same preindustrial emissions as were used for Figure 1 (but neglecting elevated and volcanic sources) as the basis for the aquaplanet emissions. All the simulations share global average emissions but differ in the spatial distribution of the emissions. The first experiment distributes the emissions evenly across the globe (glb), while the others restrict the emissions to the northern hemisphere (hemi), a quadrant of the northern hemisphere (quad), the tropics (±30° latitude; trop), the extratropics (poleward of ±30° latitude; xtrop), and a pole-to-pole wedge that extends 30° in longitude (wedge). The experiments and some global average quantities are reported in Table 1. The patterns of aerosol emission are apparent in the burden of non-sea-salt aerosols, shown by Figure 2; sea salt burden is nearly identical across the simulations. Figures S1-S11 in the supporting information show climatological patterns for several additional quantities.
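To make the emission geometries concrete, the following sketch computes the fraction of the sphere covered by each region and the factor by which a uniform emission flux would need to be scaled to keep the global total unchanged. It is an illustration only: the helper name `band_fraction` is hypothetical, and the 90° longitude width assumed for quad is an interpretation of "a quadrant of the northern hemisphere" rather than a documented setting.

```python
import math

def band_fraction(lat_south, lat_north, lon_width_deg=360.0):
    """Fraction of the sphere's area in a latitude band of given longitude width."""
    meridional = (math.sin(math.radians(lat_north))
                  - math.sin(math.radians(lat_south))) / 2.0
    return meridional * lon_width_deg / 360.0

# Region definitions; the 90-degree longitude width for quad is an assumption.
area_fraction = {
    "glb": band_fraction(-90, 90),
    "hemi": band_fraction(0, 90),
    "quad": band_fraction(0, 90, 90.0),
    "trop": band_fraction(-30, 30),
    "xtrop": band_fraction(-90, -30) + band_fraction(30, 90),
    "wedge": band_fraction(-90, 90, 30.0),
}

# Multiplier on the uniform (glb) emission flux that conserves the global total.
flux_scale = {name: 1.0 / frac for name, frac in area_fraction.items()}

for name in area_fraction:
    print(f"{name:6s} area = {area_fraction[name]:.3f}  flux x {flux_scale[name]:.1f}")
```

Under these assumptions, hemi, trop, and xtrop each cover half the sphere, while wedge covers less area than quad.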
Adding aerosol emissions leads to increased cloud cover and albedo, resulting in negative ERF with respect to salt; adding aerosol to the system induces a cooling influence. The TOA balances are reported in Table 1, and the ERF between each simulation pair is shown in Figure 3. We will use the terminology ERF for these differences in $R$, but note that $\mathrm{ERF}_{\mathrm{aer}}$ is often taken to be defined as the TOA change associated with present-day versus preindustrial aerosol emissions. Rather than introduce an additional term (such as relative radiative forcing) or revert to some form of $\Delta R$, we use ERF and $\mathrm{ERF}_{\mathrm{aer}}$ in analogy with the usual use. As the area of emissions decreases, the tendency is for the TOA imbalance to increase toward that of the salt simulation. The planet becomes increasingly similar to the salt simulation because aerosols are efficiently removed from the atmosphere, mostly by being rained out near their origin (Koch et al., 2007; Kristiansen et al., 2016; Stjern et al., 2016). This is corroborated by the changes in the cloud fields (shown later and in the supporting information); as the regional changes get larger for the more concentrated emissions, the global means tend toward the salt value (see also Table 1). These experiments make plain that aerosol ERF is dependent not only on the total aerosol emissions but also on the spatial distribution of the emissions.
These experiments also show that the convergence toward salt is not only a function of the surface area of emissions. The wedge experiment has emissions over a smaller surface area than quad but has a slightly stronger ERF (relative to salt). Three of the simulations (hemi, trop, and xtrop) have emissions distributed over half the planet, but the TOA imbalance varies across the simulations. The ERF among them varies from 1.17 to 2.83 W m$^{-2}$. These results hint that tropical aci is more impactful in CAM6 than extratropical aci.

Statistical Features
When making estimates of ERF, there are two common approaches: apply linear regression to a coupled simulation (as in Gregory et al., 2004) or use fixed-SST simulations and take simple differences. Pincus et al. (2016) adopt the fixed-SST method for RFMIP following Forster et al. (2016). The two main advantages of the fixed-SST approach are closely linked: The error characteristics are improved by removing the variability associated with a changing SST, and this allows shorter and less costly simulations to provide a good estimate of ERF. Specifically, Forster et al. (2016) show that the 5-95% confidence interval for ERF can be reduced to 0.1 W m$^{-2}$ with simulations of about 30 years. The confidence interval used by Forster et al. (2016) reduces with the square root of the number of simulated years because it is calculated using the critical values from the t distribution: $\mathrm{CI} = t_{\mathrm{crit}} \times \mathrm{SE}$, where SE is the standard error ($\mathrm{SE} = \sigma N^{-0.5}$, with $\sigma$ the standard deviation and $N$ the number of samples).
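As a rough worked illustration of the square-root scaling, suppose (purely as an assumption for this example) that the standard deviation of the annual means, written $\sigma$ here, is the same for a 12-year and a 30-year simulation, and neglect the weak dependence of the critical t value on the number of samples $N$. Then the confidence interval scales as

\[
\frac{\mathrm{CI}(N_2)}{\mathrm{CI}(N_1)} \approx \sqrt{\frac{N_1}{N_2}},
\qquad
\mathrm{CI}(12) \approx 0.1 \times \sqrt{\frac{30}{12}} \approx 0.16\ \mathrm{W\,m^{-2}} .
\]

A shorter simulation can therefore match the 30-year interval only if its year-to-year variability is correspondingly smaller.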
The aquaplanet simulations shown here are shorter than the recommended 30 years. Are statistics from such short simulations robust? In Table 2, we show the global mean with 5-95% confidence interval for shortwave TOA fluxes in all-sky, clear-sky, and clean-sky conditions. The global means are well estimated; the confidence intervals are small compared to the mean values (<1%). The values in Table 2 use the time series of annual mean values; this is usually done in order to average over the annual cycle. In all the aquaplanet simulations considered here, however, there is no annual cycle. Using monthly means increases the sample size and narrows the confidence interval (the amip and rfmip confidence intervals increase). This effect is slight for the aquaplanet simulations because the mean is robust with a small number of samples.
Following Forster et al. (2016), we can construct confidence intervals for estimates of ERF with different lengths of simulation. For any pair of simulations, the ERF can be calculated as a function of averaging time, $\tau$, as $\mathrm{ERF}(\tau) = \frac{1}{\tau}\sum_{t=1}^{\tau}\left[R_1(t) - R_2(t)\right]$, where $R$ is the net TOA flux and the subscripts denote the two simulations; a series of these ERF estimates is achieved by differencing cumulative sums of $R$ and dividing by the number of temporal samples for each element. The confidence interval can be calculated as before using the derived series of ERF estimates. Using monthly mean values, we show the ERF estimates with the 5-95% confidence interval in Figure 4 for several selected simulation pairs. All the experiments show that the confidence interval around the estimated ERF is small within just a few years; surprisingly, the amip experiment with realistic geography shows the narrowest confidence interval, suggesting that the ERF estimate is very robust.
Figure 4. ERF estimates with increasing simulation length. Shading shows 5-95% confidence interval using t distribution. Error bars show 5-95% confidence interval using bootstrap method; monthly means are used at integer numbers of years from 3 to 10 (thinner error bars), and annual means are used for 11- and 12-year samples (thicker error bars).
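The cumulative-sum construction of ERF estimates can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the short flux series are hypothetical, and the inputs stand in for global-mean net TOA flux time series from a pair of simulations.

```python
from itertools import accumulate

def erf_vs_averaging_time(r_perturbed, r_reference):
    """Return ERF estimates for averaging lengths 1..N from two flux time series."""
    cum_p = accumulate(r_perturbed)   # running sums of R for the perturbed run
    cum_r = accumulate(r_reference)   # running sums of R for the reference run
    # Difference the cumulative sums and divide by the number of samples so far.
    return [(p - r) / n for n, (p, r) in enumerate(zip(cum_p, cum_r), start=1)]

# Hypothetical monthly global-mean net TOA fluxes (W m^-2) for two runs.
r_a = [0.4, 0.9, 0.2, 0.7, 0.5, 0.3]
r_b = [1.9, 2.6, 1.8, 2.0, 2.2, 1.5]

estimates = erf_vs_averaging_time(r_a, r_b)
print([round(e, 3) for e in estimates])
```

The last element equals the difference of the full-length means, so the series shows how quickly the estimate settles as more samples are included.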
The interpretation of the 5-95% confidence interval in Figure 4 is less clear than the values of Table 2. In Table 2, it is a traditional estimate of sampling error; each sample represents a member of the larger population, and, as the sample size increases, the mean of the sample approaches the population mean. In Figure 4, the samples are made in a cumulative manner, altering the asymptotic behavior of the standard deviation with sample size. As an alternative estimate, Figure 4 also shows bootstrapped 5-95% confidence intervals for several averaging lengths. These are calculated by taking the time series of $R$ for each simulation up to the averaging time $\tau$, sampling each series with replacement $10^5$ times, calculating the difference of the means of the resampled data, and calculating the 5th and 95th percentiles of the distribution of differences. We use monthly means for data up to 10 years (thin error bars in the figure) and annual means for 11- and 12-year averaging intervals (when bootstrapping can produce a large enough number of combinations of data). The bootstrapped confidence intervals are wider than when using the t distribution. As sample size increases, the intervals narrow; qualitatively, the aquaplanets appear to produce robust estimates of ERF for $\tau \approx$ 4-5 years. The amip and rfmip experiments show wide confidence intervals when using monthly means, as expected, but they collapse when switching to annual means, consistent with the t-distribution-based result. When changing to annual averages, the bootstrapped confidence intervals for the aquaplanets increase because the sample size falls dramatically. For both estimates of sampling error, the apparently larger values for aquaplanet experiments are due to large values of ERF, which are sensitive to relatively small fluctuations in $R$; this would be true for realistic configurations as well if the ERF were large.
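The bootstrap procedure described above can be sketched with the standard library alone. This is an illustrative sketch, not the analysis code: the function name is hypothetical, the input series are synthetic Gaussian noise around assumed means, and the default of 2,000 resamples is chosen only to keep the example fast (the text uses a much larger number).

```python
import random

def bootstrap_erf_ci(r_perturbed, r_reference, n_resamples=2000, seed=0):
    """5th and 95th percentiles of the bootstrapped difference of means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each series independently, with replacement.
        sample_p = rng.choices(r_perturbed, k=len(r_perturbed))
        sample_r = rng.choices(r_reference, k=len(r_reference))
        diffs.append(sum(sample_p) / len(sample_p) - sum(sample_r) / len(sample_r))
    diffs.sort()
    lower = diffs[int(0.05 * (n_resamples - 1))]
    upper = diffs[int(0.95 * (n_resamples - 1))]
    return lower, upper

# Synthetic monthly series (W m^-2): assumed means differ by -1.5, plus noise.
noise = random.Random(42)
r_ref = [2.0 + noise.gauss(0.0, 1.0) for _ in range(60)]
r_pert = [0.5 + noise.gauss(0.0, 1.0) for _ in range(60)]

low, high = bootstrap_erf_ci(r_pert, r_ref)
print(f"ERF 5-95% interval: [{low:.2f}, {high:.2f}] W m^-2")
```

Truncating both series to a shorter averaging time before calling the function gives the interval as a function of simulation length.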
Given these results, one might question whether the aquaplanets contain variability that substantially impacts statistics. The presence of low-frequency variability, for example, could lead to autocorrelation and reduce the effective sample size. Figure 5 tests this for R for all the simulations, showing the autocorrelation for lags up to 6 months. While the amip and rfmip simulations show the expected seasonal signal, the aquaplanets show nearly no autocorrelation in the net radiative flux, dropping below 1/e within 2 months. Higher frequency variability could impact the monthly means, and that is detected in the aquaplanets.  The sampling error of global means is small for all simulations (Table 2), but there is substantial intramonth variability evidenced by Figure 6 showing distributions of all-sky TOA fluxes and CRE. The aquaplanets show larger variability than the realistic configurations, mostly from the shortwave component. Since the insolation and SST are fixed, this variability must arise from cloud variability, confirmed by CRE distributions in the lower row of Figure 6. Separating the CRE into tropical and extratropical regions confirms that most of the variability originates from the tropics (not shown). Clean-sky CRE is very similar to the all-sky CRE: aerosol effects weaken (i.e., make less negative) the SWCRE by about 2 W m −2 but have little impact on the variability, implying that the enhanced aquaplanet variability is circulation driven.
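A lagged autocorrelation test of this kind can be sketched as below. The function names are hypothetical, and the input is a synthetic series with a pure 12-month cycle, used here only as a stand-in for a simulation with a seasonal signal.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a time series at a given lag (in samples)."""
    n = len(series)
    mean = sum(series) / n
    anomalies = [x - mean for x in series]
    variance = sum(a * a for a in anomalies)
    covariance = sum(anomalies[i] * anomalies[i + lag] for i in range(n - lag))
    return covariance / variance

def efolding_lag(series, max_lag):
    """First lag at which the autocorrelation drops below 1/e, or None."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < 1.0 / math.e:
            return lag
    return None

# Synthetic 12-year monthly series containing only an annual cycle.
seasonal = [math.sin(2.0 * math.pi * month / 12.0) for month in range(144)]

for lag in range(0, 7):
    print(lag, round(autocorrelation(seasonal, lag), 2))
print("e-folding lag (months):", efolding_lag(seasonal, 6))
```

A series with little memory would fall below 1/e at a short lag, whereas the seasonal series swings to strongly negative values at a lag of 6 months.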
While global means appear to be robust in both the realistic and aquaplanet simulations, there can be very different noise characteristics at individual grid points that can adversely impact regional analyses. To illustrate, Figure 7 shows maps of the ERF between pairs of simulations. In each map, grid points where the difference in net TOA flux is not statistically different from zero at the 5% level are removed. The top row shows the simulations with earth-like emissions; for the realistic configurations, annual averages are used (12 years for amip and 30 years for rfmip). In all these experiments, large regions fail the significance test. In the aquaplanet experiment with realistic emissions, similar results are obtained. Most of the insignificant changes occur where ERF is small. In the idealized emissions experiments, the changes are significant nearly everywhere. Where they are insignificant, it is almost always because the ERF value is very small. It is worth reiterating that the global emissions of the regional aquaplanet experiments are all identical, and for hemi, trop, and xtrop (all in the bottom row of Figure 7), the emissions are spread over 50% of the global surface area and represent a doubling in those regions compared to glb. A doubling of emissions for a specified region is not an extreme perturbation compared to other experiments; examples include a global fivefold increase in sulfate emissions for the PDRMIP protocol (Myhre et al., 2017), regionally removing all sulfate (Kasoar et al., 2018), and applying year 2000 emissions from China as perturbations to several other regions (Persad & Caldeira, 2018).

Contributions to ERF
As mentioned in section 1, decomposing ERF aer is nontrivial, and there does not appear to be a unique or generally accepted method for doing so at this time. One useful approach is to supplement the prognostic radiative transfer calculation with diagnostic calculations: one that excludes clouds to calculate clear-sky fluxes, another that excludes aerosols to calculate clean-sky fluxes, and a third that excludes both clouds and aerosols to calculate the clear- and clean-sky fluxes. Ghan (2013) reports that a reasonable estimate of the direct radiative forcing is given by RF = Δ(F − F clean ), where F is the net shortwave flux at TOA. Table 2 suggests that this direct radiative forcing is small. Ghan (2013) uses ΔSWCRE clean = Δ(F clean − F clear,clean ) as a measure of the aerosol effects on cloud radiative effect. This response in shortwave CRE is much stronger than the direct effects; Figure 8 shows the spatial patterns, in which changes in emissions are evident. In the aquaplanet experiments, the regions where emissions are concentrated show strong negative ΔSWCRE clean .
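The two diagnostics quoted from Ghan (2013) amount to simple differences of fluxes between a pair of simulations, as the following Python sketch illustrates. The class, function, and field names are hypothetical, and the flux values are made-up numbers chosen only to exercise the arithmetic. The residual clear- and clean-sky term, ΔF clear,clean, is included only so that the three terms sum to the total change ΔF; it is not discussed in the text above.

```python
from dataclasses import dataclass


@dataclass
class ToaShortwave:
    """Global-mean net TOA shortwave fluxes (W m-2) from one simulation."""
    all_sky: float      # F: prognostic calculation with clouds and aerosols
    clean: float        # F_clean: diagnostic calculation without aerosols
    clear_clean: float  # F_clear,clean: diagnostic calculation without clouds or aerosols


def ghan_decomposition(perturbed, baseline):
    """Split the change in F between two simulations into three additive terms."""
    def delta(quantity):
        return quantity(perturbed) - quantity(baseline)

    return {
        "direct": delta(lambda s: s.all_sky - s.clean),           # Δ(F − F_clean)
        "swcre_clean": delta(lambda s: s.clean - s.clear_clean),  # Δ(F_clean − F_clear,clean)
        "residual": delta(lambda s: s.clear_clean),               # ΔF_clear,clean
    }


# Hypothetical flux values for illustration only (not taken from Table 2).
base = ToaShortwave(all_sky=240.0, clean=241.0, clear_clean=290.0)
pert = ToaShortwave(all_sky=238.5, clean=239.8, clear_clean=290.1)
terms = ghan_decomposition(pert, base)
print(terms)
```

By construction the three terms add up to the total change in all-sky flux, which provides a convenient consistency check when the diagnostic fluxes come from separate output fields.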
The impact of aci is associated with numerous processes, and decomposing ΔSWCRE clean is a challenging, ongoing topic of research. Ghan et al. (2016), for example, shows a decomposition that relates changes in ΔSWCRE clean to the product of ratios of changes in large-scale averages of several quantities. That study used nudged simulations with 3-hourly outputs and carefully filtered results to focus on low-level, warm clouds. Attempts to replicate that analysis with monthly means have not yielded robust results; both the difference in model configuration and details of the derived quantities may contribute. The equations in Ghan et al. (2016), however, show that ΔSWCRE clean is related to changes in cloud fraction (c), liquid water path (LWP), and effective radius of cloud drops (r e ). Figure 9 shows results of using these quantities to construct a linear model of the time series of global average ΔSWCRE clean . For the aquaplanet experiments, the coefficient of determination is very high, but the realistic amip experiment is not fit as well. The model uses ridge regression, which accounts for collinearity among the predictors. In all aquaplanet cases, the coefficient for the LWP term is much larger than for cloud fraction or effective radius; for the amip experiment, the cloud fraction and LWP coefficients are nearly equal. This result does not appear to be sensitive to parameters used in the algorithm or to different methods of standardization; other combinations of predictor variables result in worse fits. Only subtle changes result when modeling changes in the all-sky CRE, ΔSWCRE. Similar regressions using standard multiple linear regression also suggest that the changes in LWP are dominant. These results suggest that the LWP response (i.e., the "lifetime effect") is the primary control on the TOA response to aerosol perturbations in this model.
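A ridge regression on standardized predictors, of the kind described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the penalty `alpha`, the function names, and the synthetic time series are hypothetical, and the library and settings used for Figure 9 are not specified in the text.

```python
import random
import statistics


def standardize(values):
    """Remove the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def ridge_fit(predictors, target, alpha):
    """Return ridge coefficients and R^2, solving (X'X + alpha*I) b = X'y on standardized data."""
    columns = [standardize(series) for series in predictors]
    y = standardize(target)
    n_pred = len(columns)
    gram = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(n_pred)]
            for i in range(n_pred)]
    for i in range(n_pred):
        gram[i][i] += alpha  # ridge penalty added to the diagonal
    xty = [sum(a * b for a, b in zip(col, y)) for col in columns]
    coefs = solve(gram, xty)
    fitted = [sum(c * col[t] for c, col in zip(coefs, columns)) for t in range(len(y))]
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    ss_tot = sum(obs ** 2 for obs in y)  # y has zero mean after standardization
    return coefs, 1.0 - ss_res / ss_tot


# Synthetic monthly time series (NOT model output); cloud fraction is made collinear with LWP.
rng = random.Random(1)
lwp = [rng.gauss(0.0, 1.0) for _ in range(120)]
cfrac = [0.6 * value + 0.8 * rng.gauss(0.0, 1.0) for value in lwp]
reff = [rng.gauss(0.0, 1.0) for _ in range(120)]
dswcre = [-2.0 * l - 0.3 * c + 0.1 * r + 0.2 * rng.gauss(0.0, 1.0)
          for l, c, r in zip(lwp, cfrac, reff)]

coefs, r2 = ridge_fit([cfrac, lwp, reff], dswcre, alpha=1.0)
print("coefficients (c, LWP, r_e):", [round(c, 3) for c in coefs], "R^2:", round(r2, 3))
```

Because all variables are standardized, the coefficients are directly comparable, which is what allows the relative size of the LWP, cloud fraction, and effective radius terms to be interpreted.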
Figure 9. Actual versus modeled ΔSWCRE clean for a set of pairs of simulations (upper left of each panel). The coefficient of determination is included in each panel. The predictor variables are total cloud fraction (c), in-cloud LWP (estimated from grid-cell mean LWP divided by total cloud fraction), and effective radius of cloud drops (r e , taken as the column maximum value). All variables are standardized by removing the mean and dividing by the standard deviation. All fields are globally averaged to construct time series, and then the simulations are differenced. Linear fits to the time series of ΔSWCRE clean are determined using ridge regression and displayed in each panel.
There are many examples of diagnostics for the lifetime effect, or rapid adjustments, associated with aci. Sato et al. (2018), for example, suggests the aerosol-induced modulation of LWP can be diagnosed via regression (i.e., the slope of log(LWP) against log(N d )), where N d is the column drop number concentration (see Table 1). Calculating that slope either spatially using the climatologies presented or temporally using monthly averages, all the experiments produce positive values that are consistent with other global models but inconsistent with the negative values in observations and reported for a global cloud-system resolving model (Sato et al., 2018). Bender et al. (2019) took a similar approach, calculating correlations between cloud albedo and N d as an indication of the strength of the Twomey effect (or the radiative forcing of aci) and between LWP and N d as suggestive of the lifetime effect (or adjustments). Figures 10 and 11 show such correlations for the two realistic configurations (both with preindustrial emissions) and two aquaplanet simulations (glb and quad). The correlations tend to be strong and significant, indicating both brightening and moistening with increasing drop number. The other aquaplanet simulations are omitted because they look very similar to those shown; regions of emissions differences are hardly noticeable. The strong positive correlation of LWP and N d is similar to the positive regressions obtained when following the method of Sato et al. (2018). Correlations, however, do not indicate the relative strength of the effects, nor do they account for correlation between LWP and N d . Yet another measure of the response of LWP to aerosol perturbations is $\lambda = \frac{\Delta \ln \mathrm{LWP}}{\ln \mathrm{LWP}_0}\,\frac{\ln \mathrm{CCN}_0}{\Delta \ln \mathrm{CCN}}$, where the subscript 0 denotes the value in the baseline climate. Zhang et al. (2016) used long-term averages to compute global values of λ for present-day versus preindustrial aerosol and argued that using CCN instead of N d is preferable because it is easier to compare with observations and is a more direct measure of the cloud response to aerosol perturbations (Figures S12 and S13 show similar spatial patterns for ΔCCN and ΔN d ). Figure 11 shows this response of LWP for select simulation pairs. Global values (see upper right of each panel) are similar to values reported by Zhang et al. (2016), but the values here are negative globally. The negative values obtained here are due to reduced LWP in experiments comparing preindustrial emissions to present-day emissions (first three panels) but are due to reduced CCN in the regional aquaplanet experiments relative to the glb simulation with uniform emissions. Although there are not strong regional features in λ, large values tend to occur at the margins of the emissions changes in the aquaplanet experiments.
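The regression-based diagnostic attributed above to Sato et al. (2018) is the least-squares slope of the logarithm of liquid water path against the logarithm of drop number. The Python sketch below shows that calculation under simple assumptions: the function name is hypothetical, and the input values are synthetic, constructed to follow a power law with exponent 0.3 so that the expected slope is known.

```python
import math


def loglog_slope(drop_number, liquid_water_path):
    """Least-squares slope of ln(liquid water path) against ln(drop number)."""
    x = [math.log(v) for v in drop_number]
    y = [math.log(v) for v in liquid_water_path]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx


# Synthetic samples (NOT model output): liquid water path scales as drop number to the power 0.3.
nd = [20.0, 40.0, 80.0, 160.0, 320.0]
lwp = [0.05 * value ** 0.3 for value in nd]
slope = loglog_slope(nd, lwp)
print(round(slope, 6))
```

A positive slope corresponds to liquid water path increasing with drop number, as found for all the experiments here; a negative slope would indicate the opposite response reported for observations. The same function can be applied to spatial samples from a climatology or to a time series of monthly means.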
These diagnostics appear to suggest that in CAM6 ERF aci is mainly due to adjustments associated with the LWP response (and to a lesser extent the cloud fraction response), but none of them is conclusive. More advanced techniques, such as partial radiative perturbation, may be necessary to provide a confident estimate of these effects. The circumstantial evidence, however, should not be ignored. Adding to it, note that the spatial patterns of ERF (Figure 1 Figures 10 and 11 show signs of similar banded structure. The structure in subtropical latitudes is especially pronounced. These are regions where warm boundary layer clouds predominate in the aquaplanets. As noted previously, the aci in the tropics appears to strongly impact the TOA balance, and this is also seen in ΔSWCRE clean ; for example, the banding in the trop-glb experiment is greatly diminished compared to the other experiments. This reiterates the point that aci in CAM6 is regime dependent, with low-level clouds in the subtropics appearing to respond sensitively to aerosol perturbations.

Discussion
This work proposes that aquaplanet configurations using the same aerosol representation as the realistically configured model provide a useful framework for studying climatic effects of aerosols. Using CESM2, a series of aquaplanet simulations have been introduced to explore how ERF changes under different distributions of emissions. Although there are quantitative differences from the realistic setting, the aquaplanet experiments presented here evince aerosol effects that are qualitatively consistent with the realistic configuration. The abstraction provided by the idealized setting, therefore, provides an opportunity to develop insights with bearing on more realistic studies of aerosol effects and could provide strategies for future experiments.
By spatially redistributing preindustrial emissions, the aquaplanet experiments show that aerosol ERF is not proportional to the global emissions, thus providing a generalization to previous results that have identified sensitivity of ERF to the distribution of emissions (Kasoar et al., 2018;Persad & Caldeira, 2018;Shindell & Faluvegi, 2009). Beyond showing that ERF varies with the distribution of emissions, the aquaplanet experiments expose the regime-dependence of aerosol effects. Aerosols emitted in the tropics, at least in CESM2, are more likely to contribute to enhanced cloud albedo and provide a stronger cooling influence than aerosols emitted in the extratropics. This finding hints that CESM2 (CAM6) is too sensitive to aerosol perturbations, as seen in previous generations of models including CESM1 (CAM5) (e.g., Malavelle et al., 2017).
With evidence that aquaplanets capture the essential mechanisms of aci in the model, they provide a testbed for further exploration of the cloud sensitivity to aerosol changes. Results in section 5 suggest that clouds in CESM2 are sensitive to aerosol perturbations mainly through changes of LWP, often termed the lifetime effect. As can be seen by comparing Figures 8 and 10, LWP increases and CRE decreases with increased aerosol emissions (further quantified on the global scale by Figure 9). Conversely, reductions in aerosol have substantial consequences for LWP and CRE, seen in the convergence toward the emissions-free salt configuration as the region of emissions is reduced. Much of the signal appears to arise from the response of low-level clouds in the tropics and subtropics. Future inquiry may leverage aquaplanets to home in on regime-specific effects, for example, by altering the SST pattern and emissions to promote particular cloud regimes and focus on their response.
Figure 13. Changes in grid-cell average LWP. Regions where the difference is not significantly different from zero at the 5% level are removed. Annual means are used for the significance test for the realistic configuration.
The aquaplanet framework proposed here expands the hierarchy of aerosol modeling. Recent aerosol-focused studies have used aquaplanets with simplified aerosol representations (e.g., Dagan et al., 2019;Gettelman et al., 2016;McCoy et al., 2018), while other approaches have used simplified circulation models with forcing to mimic aerosol effects (e.g., Wilcox et al., 2018). The idealized experiments presented here use the full complexity of CESM2 atmospheric physics to bridge the gap between the realistic model and simplified configurations. Using the aquaplanet allows for relatively short integrations, even for regional effects (section 4), while preserving interactions between atmospheric physics and dynamics that lead to emergent behavior. Allowing that complexity while producing robust statistics for reasonable computational cost complements other approaches, such as using nudging or SCMs. Similarly, the aquaplanet configuration complements other approaches that facilitate model comparison, such as simplified aerosol schemes (Fiedler et al., 2018;Stevens et al., 2017;Voigt et al., 2017). Collectively, these approaches begin to outline a model hierarchy (Held, 2005;Maher et al., 2019) for model development and evaluation and hypothesis testing regarding aerosol effects on climate.

Appendix A: Radiative Impact of Interactive Aerosols in CESM2 Aquaplanet
This appendix provides a brief account of the modification of the default CESM2 aquaplanet configuration to incorporate interactive aerosols.
As mentioned in section 2, the default aquaplanet configuration of CESM2 neglects aci. It accomplishes this by assuming constant cloud drop and ice crystal number concentrations in the microphysical calculation. It also removes aerosol emissions to remove ari but includes diagnostic sea salt emissions, leading to a (direct) radiative impact. To assess the overall radiative impact of changing from prescribed to predicted hydrometeor number concentration, an additional experiment was performed that removed the sea salt emissions (noari); that showed a 2.5 W m −2 change in the TOA net radiative flux relative to the default aquaplanet and a clear-sky change of 4.2 W m −2 . A recommendation is to correct this omission in the default configuration when running CESM2 aquaplanet simulations; to remove sea salt emission, a runtime parameter that controls the sea salt emission scaling factor can be set to zero. It might also be of interest that all the simulations presented here were run with CESM2 without any source code modifications; only runtime parameters were changed.
With the noari simulation as a reference, the impact of interactive aci from the aquaplanet simulations used for Figure 1 can be ascertained by comparing the net TOA radiative flux. Perhaps unsurprisingly, the impact of aerosol effects is very large; the net radiative imbalance of the noari simulation is −11.4 W m −2 compared to 9.7 W m −2 in the aquaplanet with preindustrial emissions (aqua-pi), a difference of about 20 W m −2 . The difference is much smaller in the clear-sky fluxes (47.9 compared to 44.8 W m −2 ), confirming that changes in clouds are the main mechanism affecting the TOA fluxes in these idealized experiments.
An additional simulation was performed in which both the prescribed aerosol emissions and the diagnostic sea salt flux were removed from the aquaplanet. This pristine simulation therefore has no aerosol present. Surprisingly, the model successfully runs in the pristine setting and even produces some liquid clouds. This unphysical result exposes an inconsistency in the model's physics; in reality, homogeneous nucleation of liquid water drops is effectively precluded for the range of conditions present in the simulations (Pruppacher & Klett, 1997), but the model's deep convection scheme produces and detrains liquid independent of cloud condensation nuclei. Homogeneous nucleation of ice crystals is efficient in the upper troposphere, and melting ice may also contribute to the liquid clouds in this simulation. No further results from this simulation are reported here; it is mentioned only to point out that these kinds of inconsistencies exist in climate models, and performing unconventional experiments can help to expose and understand them.
The aquaplanet framework raises the issue that establishing a reference climate impacts estimates of ERF. The convention of defining ERF between "present-day" and "preindustrial" climates accents the focus of many studies on the impact of anthropogenic aerosols (Rotstayn et al., 2015). Both present and past states have substantial uncertainties in natural and anthropogenic aerosols. Different baselines across studies will produce different estimates of forcing (Carslaw et al., 2013;Smith & Bond, 2014); we have a simple demonstration of this when comparing the amip and rfmip experiments that differ only in their SST and result in different ERF (section 2). Since there is no "preindustrial" reference state for the aquaplanet configuration, it was not clear how to approach the problem of ERF. The result was the process of building from the default aquaplanet configuration to the regional experiments presented here. For many of the comparisons presented here, the differences are between the regional emissions experiments and the uniform emissions simulation (glb). This comparison worked best for the toy problem of section 3 because the simulations have the same global emissions, so the differences show how the spatial distribution impacts the TOA balance. Comparisons with salt are also of interest because it represents the simplest aquaplanet configuration with full aci, and differences with it show how the TOA balance changes as aerosol emissions are added (similar to the experiments of Persad & Caldeira, 2018 in a more realistic setting). Other comparisons highlight other features, which motivated the inclusion of all the comparisons shown in Figure 3.

Data Availability Statement
Data necessary to reproduce the analysis are available via Zenodo (doi: https://doi.org/10.5281/zenodo.3780047). Interested researchers can contact the author for other output or details of the simulations.