LGM Paleoclimate Constraints Inform Cloud Parameterizations and Equilibrium Climate Sensitivity in CESM2
Abstract
The Community Earth System Model version 2 (CESM2) simulates a high equilibrium climate sensitivity (ECS > 5°C) and a Last Glacial Maximum (LGM) that is substantially colder than proxy temperatures. In this study, we examine the role of cloud parameterizations in simulating the LGM cooling in CESM2. Through substituting different versions of cloud schemes in the atmosphere model, we attribute the excessive LGM cooling to the new CESM2 schemes of cloud microphysics and ice nucleation. Further exploration suggests that removing an inappropriate limiter on cloud ice number (NoNimax) and decreasing the time-step size (substepping) in cloud microphysics largely eliminate the excessive LGM cooling. NoNimax produces a more physically consistent treatment of mixed-phase clouds, which leads to an increase in cloud ice content and a weaker shortwave cloud feedback over mid-to-high latitudes and the Southern Hemisphere subtropics. Microphysical substepping further weakens the shortwave cloud feedback. Based on NoNimax and microphysical substepping, we have developed a paleoclimate-calibrated CESM2 (PaleoCalibr), which simulates well the observed twentieth century warming and spatial characteristics of key cloud and climate variables. PaleoCalibr has a lower ECS (∼4°C) and a 20% weaker aerosol-cloud interaction than CESM2. PaleoCalibr represents a physically more consistent treatment of cloud microphysics than CESM2 and is a valuable tool in climate change studies, especially when a large climate forcing is involved. Our study highlights the unique value of paleoclimate constraints in informing the cloud parameterizations and ultimately the future climate projection.
Key Points
- Excessive Last Glacial Maximum (LGM) cooling and an ECS > 5°C in Community Earth System Model version 2 are attributed to cloud microphysical processes including ice nucleation
- A new configuration (PaleoCalibr) is developed that removes an inappropriate cloud-ice-number limiter and decreases the microphysical time step
- PaleoCalibr simulates realistic LGM and modern climates, a lower ECS (3.9°C), and a weaker shortwave cloud feedback
Plain Language Summary
The Community Earth System Model version 2 (CESM2) shows a much higher equilibrium climate sensitivity (ECS > 5°C) than its predecessor models (≤4°C), which, if true, implies a greater future warming than previously thought and a more severe challenge for climate adaptation and mitigation. It is critical to determine whether the high ECS is realistic and what causes its increase. In a previous study, we suggested that the high ECS is likely unrealistic because CESM2 simulates excessive cooling for an ice age climate—the Last Glacial Maximum (LGM; ∼21,000 years ago). In this study, we investigate which aspects of CESM2 are responsible for the extreme LGM cooling and the high ECS. We find that the simulated LGM climate is very sensitive to treatments of cloud microphysical processes, and that removing an inappropriate limiter on cloud ice number and using a smaller time-step size in the microphysics largely eliminate the excessive LGM cooling. With these microphysical modifications, CESM2 simulates a much lower ECS (∼4°C) and matches present-day observations well. Our study suggests that an ECS > 5°C is likely unrealistic and highlights the importance of using past climates to inform and validate the model development including the treatment of clouds.
1 Introduction
The Community Earth System Model version 2 (CESM2) is the newest and most comprehensive model of the CESM family and is a participant in the Coupled Model Intercomparison Project Phase 6 (CMIP6; Bacmeister et al., 2020; Danabasoglu et al., 2020; Meehl, Arblaster, et al., 2020). A conspicuous difference between CESM2 and its predecessor models is its high equilibrium climate sensitivity (ECS; Bacmeister et al., 2020; Bitz et al., 2011; Gettelman et al., 2012, 2019; Kiehl et al., 2006). In the early versions of CESM (the Climate System Model version 1, the Community Climate System Model versions 2–4, and CESM1), ECS ranges from 2.0°C to 4.0°C, increasing with the model version and spanning the likely (66%) range from multiple synthesis reports (Figure 1; Charney et al., 1979; IPCC, 2013; Sherwood et al., 2020). In CESM2, ECS has risen to 5.6°C (calculated using a 1° atmosphere coupled to a slab ocean; Zhu et al., 2021), which is well beyond the likely range in different synthesis reports including the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC, 2021). These increases of ECS with CESM model versions have been attributed to increases of model resolution and improvements of physical parameterizations, in particular those of clouds (Bacmeister et al., 2020; Bitz et al., 2011; Gettelman et al., 2012, 2019; Kiehl et al., 2006). Specifically, the higher ECS in CESM2, configured with the Community Atmosphere Model version 6 (CAM6), than that in CESM1 with CAM5 (hereafter CESM1) is attributed to changes in the atmospheric parameterizations of stratiform cloud microphysics, unified turbulence, ice nucleation, and convection, as well as the adjustment of aerosol-cloud interactions (ACIs) to match the twentieth century temperature record (Gettelman et al., 2019). A high ECS similar to that of CESM2 has been reported in other CMIP6 models and similarly attributed to the simulation of cloud processes (Zelinka et al., 2020).

Model simulated Last Glacial Maximum (LGM) global cooling and equilibrium climate sensitivity (ECS) in different versions of Community Earth System Model (CESM). ECS is estimated using coupled simulation with a slab ocean (see text for references). CESM2 PaleoCalibr is developed in this study. Vertical patch indicates the 95% confidence interval of proxy estimation of the LGM global cooling (−6.8°C to −4.4°C) from Tierney et al. (2020). Horizontal patches denote the 66% confidence intervals of ECS from the IPCC Assessment Report 5 (IPCC, 2013; light gray), Assessment Report 6 (IPCC, 2021; medium gray), and Sherwood et al. (2020; dark gray).
Whether the high ECS of CESM2 and many other CMIP6 models is realistic remains uncertain and is difficult to address using evidence from present-day observations. CESM2 reproduces well the magnitude of the twentieth century global warming in instrumental records and outperforms CESM1 in many observation-based climate metrics (Danabasoglu et al., 2020). In particular, CESM2 simulates a more realistic cloud phase distribution with more supercooled liquid water over the Southern Ocean, largely correcting a major model deficiency in CESM1 and many other CMIP5-class models (Bjordal et al., 2020; Gettelman et al., 2020; Kay et al., 2012, 2016). An increase in the mean state supercooled liquid water is attributed to the updated ice nucleation and cloud microphysical schemes in the atmosphere model and should lead to a weaker (less negative) cloud-phase feedback than in CESM1 (Gettelman et al., 2020). Thus, the net stronger cloud feedback and subsequent increase in ECS in CESM2 is an expected outcome of model improvements (Bjordal et al., 2020; Frey & Kay, 2018; Tan et al., 2016). On the other hand, process understanding from satellite observations suggests that high-ECS models including CESM2 overestimate the cloud feedback over tropical shallow cumulus regions (Cesana & Del Genio, 2021; Myers et al., 2021), which is consistent with a recent work showing that models (including CESM2) underestimate a negative cloud feedback from cloud lifetime changes (Mülmenstädt et al., 2021). The representation of cloud feedbacks remains a large source of uncertainty in climate model projections. It is therefore possible that CESM2's successful simulation of the twentieth century warming results from coexisting and compensating model biases due to excessive sensitivities to both aerosol and greenhouse gas (GHG) increases. In this case, the resultant cooling and warming during the historical period offset each other (C. Wang et al., 2021; Kiehl, 2007; Meehl, Senior, et al., 2020).
Paleoclimate constraints represent a unique and independent way to assess the climate sensitivity of models and consist of performing paleoclimate simulations that incorporate reconstructed climate forcings and assessing them against proxy reconstructions of paleotemperature (e.g., Manabe & Broccoli, 1985). Simulations of the Last Glacial Maximum (LGM; an extreme ice-age climate of ∼21,000 years ago) have been performed using many versions of the CESM models and exhibit a close relationship between global cooling and ECS (Figure 1; correlation coefficient = −0.96; Brady et al., 2013; Otto-Bliesner et al., 2006; Shin et al., 2003; Zhu & Poulsen, 2021; Zhu et al., 2017, 2021). CESM2, for instance, has the highest ECS (5.6°C) and also simulates the coldest LGM global temperature among the CESM models, a temperature that is at least 5°C lower than a recent proxy based estimate and the CESM1 LGM global temperature (Tierney et al., 2020; Zhu et al., 2021). CESM2 also overestimates global and regional temperature responses for past warm climates including the Early Eocene (an extreme greenhouse climate of ∼50 million years ago) and the Pliocene (the most recent warm climate of ∼3.2 million years ago with atmospheric CO2 comparable to today's; Feng et al., 2020; Zhu et al., 2020). Taken together, these paleoclimate simulations suggest that CESM2 is too sensitive to large external forcings and that its high ECS and strong cloud feedbacks are likely unrealistic. The excessive cooling in the CESM2 LGM simulation has been attributed to the strong shortwave cloud feedback in the Southern Hemisphere (SH) subtropics and mid-to-high latitudes (Zhu et al., 2021). However, it remains unclear which aspects of the cloud feedback processes (such as processes related to stratiform cloud microphysics, unified turbulence, ice nucleation, and convection) in CESM2 are causing the unrealistic climate sensitivity.
In this study, we use LGM constraints to examine details of the cloud feedback processes in CESM2 and to develop a paleoclimate-calibrated version of CESM2 that has a realistic sensitivity to LGM forcings. We adopt the fully coupled LGM configuration in Zhu et al. (2021) and utilize the fact that CESM2 with CAM5 simulates a much more realistic LGM global surface temperature than with CAM6. We evaluate the impact of individual CAM6 cloud schemes on simulated LGM global cooling through simulations in which CAM6 schemes are replaced, one at a time, with older CAM5 schemes. Additionally, we explore physical and numerical aspects of key cloud parameterizations. Finally, we compare the paleoclimate-calibrated version of CESM2 to present-day observations including the scale-aware and definition-aware diagnostics available in satellite simulators. Our study demonstrates that paleoclimate information provides unique constraints on the cloud parameterizations, which critically determine climate sensitivity.
2 Models and Experiments
CESM2 consists of state-of-the-art models of the atmosphere, ocean, land, sea ice, and river and has the capability to simulate ice-sheet dynamics (Danabasoglu et al., 2020). Among the substantial science and infrastructure improvements from CESM1 to CESM2, updates to the cloud-related parameterizations in CAM6 are the primary reason for the high sensitivity to external forcings (Gettelman et al., 2019; Zhu et al., 2021). Specifically, CAM6 uses an updated cloud microphysics scheme (MG2) that predicts rather than diagnoses the mass and number concentration of rain and snow (Gettelman & Morrison, 2015). Of key significance in this study, CAM6 introduced a classical-theory-based heterogeneous ice nucleation scheme that links the mixed-phase ice nucleation directly to temperature and aerosols (HetFrz; Hoose et al., 2010; Y. Wang et al., 2014). Alongside the microphysics revisions, CESM1's separate schemes of the moist turbulence in the planetary boundary layer, shallow convection, and cloud macrophysical quantities have been replaced with a unified treatment, the Cloud Layers Unified by Binormals (CLUBB; Bogenschutz et al., 2013; Larson & Golaz, 2005). CLUBB is a higher-order turbulence closure scheme that uses a double-Gaussian probability density function to provide a self-consistent closure treatment of higher-order turbulence moments of vertical velocity, temperature, and moisture, as well as boundary layer cloud properties of both stratocumulus and cumulus. Additional updates and modifications have been implemented to schemes of aerosols, deep convection, orographic gravity waves, and boundary layer form drag (Danabasoglu et al., 2020).
We employ the same LGM initial and boundary conditions as in Zhu et al. (2021). GHGs are 190 ppm, 375 ppb, and 200 ppb for CO2, CH4, and N2O, respectively. Ice sheets are from the ICE-6G reconstruction at 21 ka (thousand years before present) with changes in land surface properties, surface topography, and land-sea mask (Peltier et al., 2015). Earth orbital parameters are fixed at the 21-ka values. Preindustrial aerosol emissions and vegetation cover are used in all the LGM simulations. Similar to Zhu et al. (2021), coupled preindustrial (PI) and LGM simulations are run with prescribed satellite vegetation phenology (unless noted), which allows us to focus on the radiative climate feedback without the need to be concerned about the vegetation phenology feedback. Different from Zhu et al. (2021), a lower horizontal resolution of the atmosphere and land is used to save computing resources (1.9 × 2.5° instead of 0.9 × 1.25°; referred to as FV2 and FV1, respectively). CESM2 FV2 differs from FV1 in the cloud tuning parameters, which are required to achieve an overall top-of-atmosphere (TOA) energy balance for the preindustrial simulation. Specifically, the microphysical autoconversion size threshold for ice to snow (micro_mg_dcs) is decreased from 500 × 10⁻⁶ to 200 × 10⁻⁶ m, and the constant controlling the width of the probability density function of vertical velocity (clubb_gamma_coef) is decreased from 0.308 to 0.28.
We perform paired PI and LGM simulations using different configurations of the atmosphere model within the fully coupled CESM2 framework (Table 1). We use the LGM proxy sea-surface temperature (SST)-derived global cooling of 5.6°C (4.4°C–6.8°C; 95% confidence interval; Tierney et al., 2020) as a benchmark to evaluate these configurations. The first two configurations use CAM6 and CAM5 as the atmosphere component model, respectively (referred to as CAM6 and CAM5; hereafter italic font is used for a specific CESM2 configuration). To explore the reason for the greater LGM cooling in CAM6 than in CAM5, additional sensitivity configurations are tested with one cloud scheme in CAM6 either replaced with the older CAM5 version or altered from the default setting (cf., Gettelman et al., 2019). In HetFrzOff, we use the CAM6 configuration, except that the new heterogeneous ice nucleation scheme (HetFrz) is replaced with the older scheme in CAM5. In ClubbOff, we replace the unified moist turbulence scheme (CLUBB) in CAM6 with the corresponding CAM5 schemes. In Mg2Off, we replace the new cloud microphysics scheme (MG2) with the older version (MG1). Given the overall importance of the cloud microphysics, we developed additional configurations (NoNimax and Mg2Sub8) to further examine its details (see Sections 3.2 and 3.3 for the rationale for these sensitivity configurations). In NoNimax, a limiter on the cloud ice number concentration is removed in MG2. In Mg2Sub8, a microphysical substep of 8 is used (the default value being 1), which decreases the MG2 time-step size from 600 to 75 s. An additional configuration (NnSub8) that combines NoNimax and Mg2Sub8 is also tested (simulations with substep numbers of 4 and 16 are also performed but only briefly discussed in this paper). We emphasize that no parameter tuning is performed in any of the configurations, so the difference between CAM6 and a sensitivity configuration is due to the cloud scheme or modification in question.
These fully coupled simulations with various configurations are performed for 100 model years after initializing from the same PI or LGM state. Although many of the simulations have not reached equilibrium in surface climate after 100 model years (Table 1), they are sufficiently integrated to demonstrate the sensitivity of the simulated LGM cooling and cloud feedback to individual cloud schemes and modifications (see results below). Averages of the last 30 years of each simulation are used for analysis.
Configurations | PI ΔN (W m−2) | LGM ΔN (W m−2) | ΔTLGM (°C) | λsw_cld_LGM (W m−2 K−1) | ECSSOM (°C) | λsw_cld_2× (W m−2 K−1) |
---|---|---|---|---|---|---|
CAM6 | ‒0.18 | ‒1.01 | ‒9.0 | 0.81 | 6.1 | 0.95 |
CAM5 | 0.30 | 0.27 | ‒6.3 | 0.29 | 3.7 | 0.32 |
HetFrzOff | 0.42 | 0.15 | ‒5.9 | 0.37 | 3.8 | 0.47 |
ClubbOff | ‒0.48 | ‒1.1 | ‒8.9 | 0.64 | 6.2 | 0.86 |
Mg2Off | 0.41 | 0.01 | ‒6.3 | 0.49 | 4.3 | 0.54 |
NoNimax | 0.13 | ‒0.29 | ‒6.9 | 0.64 | 5.0 | 0.79 |
Mg2Sub8 | ‒0.21 | ‒0.81 | ‒8.2 | 0.72 | 4.8 | 0.74 |
NnSub8 | 0.09 | ‒0.14 | ‒6.4 | 0.49 | 4.0 | 0.59 |
Note. Results from the last 30 years of simulations with various configurations of the atmosphere model are shown. CAM6 uses the default CESM2(CAM6); CAM5 uses the old CAM5 cloud parameterizations; ClubbOff uses the CAM5 shallow convection and boundary layer schemes; HetFrzOff uses the CAM5 ice nucleation scheme; Mg2Off uses the CAM5 cloud microphysics; NoNimax removes the “nimax” limiter; Mg2Sub8 uses 8 substeps in the microphysical scheme; NnSub8 removes the “nimax” limiter and uses 8 substeps in the microphysics. See text for details of these configurations.
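As a minimal illustration of how the proxy benchmark can be applied to Table 1, the sketch below checks which configurations have a 100-year LGM ΔGMST inside the 95% interval quoted in the text (−6.8°C to −4.4°C). The values are transcribed from Table 1; the function and variable names are ours and purely illustrative, and the check says nothing about the additional cooling expected in simulations that are still out of balance.

```python
# LGM minus PI global mean surface temperature (degC), transcribed from Table 1.
DELTA_T_LGM = {
    "CAM6": -9.0, "CAM5": -6.3, "HetFrzOff": -5.9, "ClubbOff": -8.9,
    "Mg2Off": -6.3, "NoNimax": -6.9, "Mg2Sub8": -8.2, "NnSub8": -6.4,
}

# 95% interval of the proxy-derived LGM cooling (Tierney et al., 2020), as quoted in the text.
PROXY_COLD_BOUND = -6.8
PROXY_WARM_BOUND = -4.4


def within_proxy_range(delta_t, cold=PROXY_COLD_BOUND, warm=PROXY_WARM_BOUND):
    """Return True if a simulated LGM temperature change lies inside the proxy interval."""
    return cold <= delta_t <= warm


# Names of the configurations whose 100-year cooling is inside the proxy interval.
inside = sorted(name for name, dt in DELTA_T_LGM.items() if within_proxy_range(dt))
print(inside)
```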
To directly show the impacts of each configuration on ECS and to link the cloud feedbacks in paleoclimate and present-day climate simulations, paired PI and 2 × CO2 simulations with each CESM2 configuration are performed using a slab ocean model (SOM). The same mixed layer depth and heat transport convergence (“q-flux” hereafter; derived from the coupled CMIP6 PI simulation using CESM2 FV2) are prescribed in each SOM simulation. No parameter tuning is performed for the SOM simulations except for ClubbOff, in which the relative humidity threshold for low clouds (rhminl) is increased from 0.95 to 0.99. This tuning of ClubbOff SOM simulations decreases the low-cloud fraction, which is necessary to prevent the model from drifting into a cold climate. Each SOM simulation is carried out for 80 years and has reached equilibrium (|TOA net radiation| < 0.1 W m−2), with the last 30 years used for calculation of ECS (denoted as ECSSOM) and the shortwave cloud feedback.
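The ECS calculation described above can be sketched as follows, assuming annual global mean time series are already available as arrays. The function name, argument names, and the synthetic series are hypothetical and only illustrate the averaging window (last 30 years) and the equilibrium criterion (|TOA net radiation| < 0.1 W m−2); they are not model output.

```python
import numpy as np


def ecs_from_som(gmst_pi, gmst_2x, toa_pi, toa_2x, n_last=30, tol=0.1):
    """Estimate ECS as the 2xCO2 minus PI GMST difference over the last n_last years.

    Also reports whether both runs meet the |TOA net radiation| < tol criterion
    over the same window.
    """
    last = slice(-n_last, None)
    ecs = float(np.mean(gmst_2x[last]) - np.mean(gmst_pi[last]))
    equilibrated = bool(
        abs(np.mean(toa_pi[last])) < tol and abs(np.mean(toa_2x[last])) < tol
    )
    return ecs, equilibrated


# Synthetic 80-year series (made up for illustration only).
years = np.arange(80)
gmst_pi = np.full(80, 287.0)                          # K, steady control run
gmst_2x = 287.0 + 4.0 * (1.0 - np.exp(-years / 5.0))  # K, relaxes toward +4 K
toa_pi = np.zeros(80)                                 # W m-2
toa_2x = 1.0 * np.exp(-years / 5.0)                   # W m-2, decays toward zero

ecs, equilibrated = ecs_from_som(gmst_pi, gmst_2x, toa_pi, toa_2x)
print(round(ecs, 2), equilibrated)
```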
We use the approximate partial radiative perturbation method (APRP) to quantify the shortwave cloud feedback (Taylor et al., 2007). APRP uses monthly model output of radiation fields to build a simplified radiation model and quantify the shortwave feedbacks. The shortwave cloud feedback parameters in the paired PI and LGM in a fully coupled configuration and the paired PI and 2 × CO2 in a SOM configuration are denoted as λsw_cld_LGM and λsw_cld_2×, respectively. The longwave feedback in the simulations is not quantified because it is not a major driver for the differences in ECS and the LGM temperature response between CESM2 configurations (Gettelman et al., 2019; Zhu et al., 2021).
After the individual cloud schemes and changes are evaluated against the proxy-derived LGM global cooling, a paleoclimate-calibrated CESM2 configuration (PaleoCalibr) in FV2 is developed. A suite of DECK (Diagnostic, Evaluation and Characterization of Klima) simulations and a CMIP6 historical simulation are performed (Eyring et al., 2016), following the experimental setup of the simulations using the standard CESM2. Results from the PaleoCalibr preindustrial, historical Atmospheric Model Intercomparison Project (AMIP), historical, and abrupt 4 × CO2 simulations are discussed. The historical AMIP simulation is run with the satellite simulator to facilitate a direct comparison with satellite observations (Swales et al., 2018). We also perform additional atmosphere-only simulations with prescribed SST and sea ice from observations at 2000 CE, which are used to quantify the ACI and to test the sensitivity to some parameters (see details below).
3 Sensitivity of LGM Global Temperature to Cloud Microphysical Processes
3.1 Role of Individual Cloud Schemes
CESM2(CAM6) with a ∼2° atmosphere significantly overestimates the LGM global cooling, which is consistent with the results with a ∼1° atmosphere in Zhu et al. (2021). The LGM global mean surface temperature change (ΔGMST) in CAM6 reaches −9.0°C with a large TOA imbalance of approximately −1.0 W m−2 after 100 model years, suggesting that if the simulation were extended further, additional cooling would be expected (red in Figure 2; Table 1). In contrast, ΔGMST in CAM5 is −6.3°C (brown) and falls within the proxy suggested range of LGM global cooling (gray patch). Similar to CAM5, Mg2Off (green) and HetFrzOff (orange) have LGM ΔGMSTs of −6.3°C and −5.9°C after 100 years, respectively, which also fall within the proxy range. In contrast, ClubbOff has a ΔGMST of −8.9°C that is comparable to the CAM6 value (blue vs. red).

Time series of global mean surface temperature (GMST) in (a) the preindustrial (PI) and (b) the Last Glacial Maximum (LGM) simulations using various atmosphere model configurations within the coupled Community Earth System Model version 2 framework. (c) Changes in GMST between paired LGM and PI simulations. (d) Changes in top-of-atmosphere radiation (ΔN) versus GMST (ΔGMST) in paired simulations. ΔN and ΔGMST (markers) are the 5-year running mean of the LGM time series with the last-30-year averages of the PI simulation subtracted. A linear regression between ΔN and ΔGMST is shown as dashed line for each configuration. In panels (c and d), the LGM ΔGMST and the 95% uncertainty interval from Tierney et al. (2020) are shown. See text and Table 1 for details of the model configurations.
The different LGM ΔGMSTs in these configurations are linked to the strength of shortwave cloud feedback. λsw_cld_LGM in CAM6 is 0.81 W m−2 K−1, more than double the CAM5 value of 0.29 W m−2 K−1 (Figure 3 red vs. brown; Table 1). λsw_cld_LGM in CAM6 is larger than in CAM5 over all latitudes, especially over the SH subtropics and the Southern Ocean (SO). λsw_cld_LGM is 0.49, 0.37, and 0.64 W m−2 K−1 in Mg2Off, HetFrzOff, and ClubbOff, respectively. In the subtropics, Mg2Off produces a λsw_cld_LGM comparable to CAM5 (green vs. brown), indicating that the stronger subtropical λsw_cld_LGM in CAM6 than in CAM5 is largely due to the new cloud microphysics scheme (MG2). Over the SO, HetFrzOff produces a λsw_cld_LGM comparable to CAM5 (orange vs. brown), indicating that the new ice nucleation scheme (HetFrz) explains the stronger SO λsw_cld_LGM in CAM6. Over the SH subtropics, λsw_cld_LGM in HetFrzOff is also weaker than in CAM6, although not as weak as in CAM5. HetFrzOff simulates more ice nucleating particles and cloud ice than CAM6 (Figure 4b). We speculate that this larger mean-state cloud ice allows a larger cloud phase transition in response to warming and thereby produces stronger negative cloud phase and lifetime feedbacks (Mülmenstädt et al., 2021; Tan et al., 2016).

Zonal mean shortwave cloud feedback (λsw_cld; units: W m−2 K−1) for various atmosphere model configurations in (a) the paired preindustrial and Last Glacial Maximum (LGM) simulations using fully coupled Community Earth System Model version 2 (CESM2) and (b) the paired preindustrial and 2 × CO2 simulations using CESM2 slab ocean model. (c) Differences in λsw_cld between the LGM and 2 × CO2 simulations using the same atmosphere model configuration. (d) Scatter plot of the global mean λsw_cld in the LGM and 2 × CO2 simulations. See text and Table 1 for details of the model configurations.

(a) Zonal mean cloud liquid water path in the preindustrial simulations with various atmosphere model configurations in the coupled Community Earth System Model version 2 framework. (b) Same as panel (a), but for the cloud ice water path (IWP). Note that CAM5 IWP in panel (b) has been multiplied by 0.5 for illustrative purposes (shown as dashed brown line). See text and Table 1 for details of the model configurations.
The tests of individual cloud schemes suggest that the cloud microphysical processes, including those related to mixed-phase and liquid clouds, are important in driving the strong cloud feedback in CAM6 (Gettelman et al., 2019) and are likely responsible for the unrealistically high CESM2 ECS. Nevertheless, the new cloud schemes were developed according to theory and process-level understanding and were found to be critical to the improved simulation of the SO cloud phase distribution (Gettelman et al., 2020). Given that the new schemes in CAM6 are “better physics,” we next examine details of the cloud microphysical processes while using these more advanced cloud schemes.
3.2 Role of a Cloud-Ice-Number Limiter
A cloud-ice-number limiter (named “nimax” in MG2) sets the maximum allowed number of cloud ice particles to a sum of terms representing each source of ice crystals. The MG2 ice nucleation in mixed-phase clouds was replaced with a more process-based scheme (Hoose et al., 2010; Y. Wang et al., 2014), yet “nimax” was not re-coded to account for the new source terms. Shaw et al. (2022) reported the “nimax” issue and noted that, without a correction, the heterogeneous ice nucleation processes can increase the mass of cloud ice but not raise the number concentration, that is, artificially increasing ice crystal size and sedimentation. Investigating Arctic clouds, Shaw et al. (2022) found that “nimax” suppressed the formation of stable ice clouds and affected cloud feedbacks. Additionally, “nimax” prevents secondary ice number production through the Hallett-Mossop process. With “nimax,” the cloud ice number also has less freedom to adjust to internal or forced variations. Here we examine the role of “nimax” on the climate sensitivity and cloud feedbacks through a suite of simulations (NoNimax), in which “nimax” is removed to improve the physical consistency in mixed-phase clouds. “nimax” was designed in MG1 to avoid excessive nucleation with the old ice nucleation scheme and long microphysical time step. In consideration of MG2's much shorter microphysical time step (600 s in MG2 vs. 1,800 s in MG1) and more advanced ice nucleation scheme, we propose that this ice number limiter is no longer fit for purpose.
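The consequence of an incomplete limiter can be illustrated with a toy calculation. The sketch below is not the MG2 code: the update rule, source names, and numbers are hypothetical and chosen only to show how a cap built from a subset of the ice sources lets ice mass grow while the number is clipped, which raises the mean mass per crystal.

```python
def step_ice(number, mass, sources, known_sources, use_limiter=True):
    """One toy update of cloud ice number and mass (arbitrary units).

    `sources` maps a source name to (number_increment, mass_increment).
    `known_sources` lists the sources the limiter was coded to account for.
    """
    new_number = number + sum(dn for dn, _ in sources.values())
    new_mass = mass + sum(dm for _, dm in sources.values())
    if use_limiter:
        # The cap only sums the number terms of the sources it knows about.
        cap = sum(sources[name][0] for name in known_sources)
        new_number = min(new_number, cap)
    return new_number, new_mass


# Hypothetical increments: an older source and a newer heterogeneous freezing source.
sources = {"deposition": (10.0, 1.0), "het_freezing": (40.0, 4.0)}
known = ["deposition"]  # the limiter was not updated for the newer source

n_lim, m_lim = step_ice(0.0, 0.0, sources, known, use_limiter=True)
n_free, m_free = step_ice(0.0, 0.0, sources, known, use_limiter=False)

# Same ice mass in both cases, but fewer and therefore heavier crystals with the limiter.
print(m_lim / n_lim, m_free / n_free)
```

In this toy setting the limited case carries the same mass on one fifth of the crystals, which is the sense in which the limiter artificially increases crystal size and sedimentation.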
NoNimax has a minor impact on the preindustrial GMST but warms the LGM by about 2°C in 100 years, leading to a much-improved LGM ΔGMST of −6.8°C (yellow vs. red in Figure 2). As expected, NoNimax changes the response to LGM forcing by affecting the cloud feedback. The global mean λsw_cld_LGM is 0.64 W m−2 K−1 in NoNimax, 20% smaller than the CAM6 value (Table 1). Zonal mean λsw_cld_LGM in NoNimax is weaker over the SH subtropics and the mid-to-high latitudes in both hemispheres (yellow vs. red in Figure 3). To some degree, NoNimax impacts cloud properties and feedbacks over the SH subtropics in a similar way as HetFrzOff, likely indicating a similar mechanism through increasing cloud ice content (Figure 4b) and strengthening the (negative) cloud phase and lifetime feedback. We note that the projected LGM ΔGMST for NoNimax is ∼‒8°C, estimated by extrapolation using TOA radiation and GMST (yellow in Figure 2d), which indicates that NoNimax would likely overestimate the LGM ΔGMST if the simulation were extended beyond 100 years.
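The extrapolation of ΔGMST from TOA radiation can be sketched as a linear regression of ΔN on ΔGMST, with the projected equilibrium value taken where the fitted line reaches ΔN = 0. This is a minimal sketch under that assumption; the function name is ours, and the data points are synthetic values constructed to lie on a line with an intercept at −8°C, not model output.

```python
import numpy as np


def project_equilibrium_dgmst(delta_n, delta_gmst):
    """Fit delta_n = intercept + slope * delta_gmst and return delta_gmst at delta_n = 0."""
    slope, intercept = np.polyfit(delta_gmst, delta_n, 1)
    return float(-intercept / slope)


# Synthetic, exactly linear data (illustration only): the imbalance shrinks
# toward zero as the cooling approaches -8 degC.
delta_gmst = np.array([-5.0, -6.0, -6.5, -6.8])   # degC
delta_n = 0.25 * (-8.0 - delta_gmst)              # W m-2, negative while still cooling

projected = project_equilibrium_dgmst(delta_n, delta_gmst)
print(round(projected, 2))
```

With real output, the inputs would be the 5-year running means of the LGM series minus the PI averages, as in Figure 2d, and the scatter around the line would make the projection uncertain.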
NoNimax produces an unrealistic simulation of cloud ice number concentration (Figure 5b). The zonal-mean in-cloud ice number concentration in CAM6 is generally less than 50 L−1 below ∼400 hPa with maximum centers in the middle troposphere at the mid-latitudes and in the lower troposphere at polar regions. In the stratosphere, the zonal-mean in-cloud ice number reaches values greater than 900 L−1. The high values over the stratosphere likely reflect a model bias, while values over the middle and lower troposphere are roughly of the same order as observations (e.g., DeMott et al., 2010; Patnaude et al., 2021). NoNimax simulates a zonal mean in-cloud ice number greater than CAM6 almost everywhere. Over the Northern Hemisphere (NH) mid-latitudes, the zonal-mean in-cloud ice number reaches values >900 L−1 at ∼400 hPa and >300 L−1 below; these values are roughly an order of magnitude larger than observations (e.g., DeMott et al., 2010).

Pressure-latitude section of the zonal mean in-cloud cloud ice number in (a) the default CAM6 simulation and (b–g) simulations with the “nimax” limiter removed and with substep of 1, 2, 4, 8, 16, and 32 in the microphysical scheme, respectively. Pressure is the y-axis with units of hPa. Results are from atmosphere-only simulations forced by the observed present-day sea surface temperature and sea ice (results are similar if coupled simulations are used). The in-cloud ice number is constructed by averaging monthly cloud ice numbers for grid points with a cloud ice mixing ratio greater than 0.01 part-per-million by mass.
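The in-cloud averaging described in the caption can be sketched as a masked mean. We assume here that monthly fields are stored as arrays with dimensions (time, level, latitude, longitude), that the mixing ratio is in kg/kg (so that 0.01 part-per-million by mass corresponds to 1e-8), and that months and longitudes are pooled before averaging; these choices and the function name are our assumptions, not a description of the authors' script.

```python
import numpy as np


def in_cloud_zonal_mean(ice_number, ice_mmr, threshold=1e-8):
    """Zonal, time-pooled mean of ice number over points with ice_mmr > threshold.

    Inputs have shape (time, lev, lat, lon); the result has shape (lev, lat)
    and is NaN where no point exceeds the threshold.
    """
    mask = ice_mmr > threshold
    total = np.where(mask, ice_number, 0.0).sum(axis=(0, 3))
    count = mask.sum(axis=(0, 3))
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)


# Tiny made-up example: one month, one level, one latitude, three longitudes.
number = np.array([100.0, 200.0, 900.0]).reshape(1, 1, 1, 3)   # L-1
mmr = np.array([2e-8, 0.0, 5e-8]).reshape(1, 1, 1, 3)          # kg/kg

result = in_cloud_zonal_mean(number, mmr)
print(result)  # only the first and third points count as in-cloud
```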
3.3 Role of Substepping in Microphysics
The large overestimation of in-cloud ice number in NoNimax motivates us to explore whether substepping in the microphysics helps to improve the simulation. Microphysical substepping decreases the time-step size through increasing the substep number of microphysical calculations per calculation of the other model physical parameterizations. For simplicity, we perform a suite of atmosphere-only simulations forced by the observed climatological SST and sea ice from 2000 CE. These simulations are run with NoNimax and with an increased microphysical substep of 2, 4, 8, 16, and 32, respectively (Figure 5). At the NH mid-latitudes, the zonal mean in-cloud ice number at 400 hPa decreases from ∼900 to ∼300 L−1 with 2 MG2 substeps (NnSub2; microphysical time step of 300 s) and to ∼100 L−1 with 4 substeps (NnSub4; microphysical time step of 150 s). At ∼700 hPa, the cloud ice number decreases from ∼300 to <50 L−1 with substeps greater than 8. Overall, the simulated cloud ice number is converging after 8–16 MG2 substeps (a microphysical time step ≤75 s).
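The relation between the substep number and the microphysical time step is simple division of the default 600 s MG2 step quoted in the text. The short sketch below tabulates it for the substep numbers tested; the function name is illustrative.

```python
BASE_MICROPHYSICS_DT = 600.0  # s, default MG2 time step quoted in the text


def microphysics_dt(substeps, base_dt=BASE_MICROPHYSICS_DT):
    """Return the microphysical time step (s) for a given number of substeps."""
    return base_dt / substeps


time_steps = {n: microphysics_dt(n) for n in (1, 2, 4, 8, 16, 32)}
for n, dt in time_steps.items():
    print(f"{n:2d} substeps -> {dt:g} s")
```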
Employing microphysical substepping together with NoNimax further decreases the shortwave cloud feedback and the simulated LGM global cooling, in addition to improving the simulation of cloud ice number. Three pairs of coupled PI and LGM simulations are performed with NoNimax and microphysical substeps of 4, 8, and 16 (referred to as NnSub4, NnSub8, and NnSub16, respectively). The LGM ΔGMST values are −6.4°C, −6.4°C, and −6.5°C after 100 model years in NnSub4, NnSub8, and NnSub16, respectively. The global mean λsw_cld_LGM values are 0.53, 0.49, and 0.49 W m−2 K−1, respectively. The decrease of λsw_cld_LGM with the microphysical substepping reaches saturation at 8 substeps: successive substep increases from 1 to 4, 8, and 16 decrease λsw_cld_LGM by 0.11, 0.04, and 0.00 W m−2 K−1, respectively. Consistent with the global mean, the zonal mean λsw_cld_LGM also exhibits convergence with an MG2 substep of eight or higher (Figure 6). Although the decrease of the LGM global cooling is small from NoNimax (with a substep of 1) to NnSub8 (ΔGMST of −6.8°C vs. −6.4°C after 100 model years), the difference in the projected LGM ΔGMST is much larger (−8.0°C vs. −7.0°C; yellow vs. black in Figure 2d), which is consistent with the large impact on the shortwave cloud feedback (Figure 3) and TOA radiation (Table 1; see also the ECSSOM).
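The saturation of the feedback with substepping can be checked directly from the global mean values quoted above. The sketch below recomputes the successive decreases and identifies the substep number beyond which the change falls below a tolerance; the tolerance of 0.01 W m−2 K−1 and the function name are our choices for illustration.

```python
# Global mean shortwave cloud feedback (W m-2 K-1) for NoNimax with 1, 4, 8, and 16 substeps.
FEEDBACK_BY_SUBSTEP = {1: 0.64, 4: 0.53, 8: 0.49, 16: 0.49}


def successive_decreases(feedback_by_substep):
    """Return [(from_substep, to_substep, decrease)] for consecutive substep numbers."""
    steps = sorted(feedback_by_substep)
    return [
        (a, b, round(feedback_by_substep[a] - feedback_by_substep[b], 2))
        for a, b in zip(steps[:-1], steps[1:])
    ]


def saturation_substep(feedback_by_substep, tol=0.01):
    """Smallest substep number after which a further increase changes the feedback by < tol."""
    for a, _, decrease in successive_decreases(feedback_by_substep):
        if abs(decrease) < tol:
            return a
    return None


decreases = successive_decreases(FEEDBACK_BY_SUBSTEP)
saturated_at = saturation_substep(FEEDBACK_BY_SUBSTEP)
print(decreases, saturated_at)
```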

Zonal mean shortwave cloud feedback (λsw_cld) calculated in the paired preindustrial and Last Glacial Maximum simulations with various configurations. See text and Table 1 for details of the model configurations. NnSub4 and NnSub16 are the same as NnSub8 except for the 4 and 16 substeps in the microphysics, respectively.
To understand the processes that weaken the cloud feedback with an increase in microphysical substeps, we perform an additional pair of coupled PI and LGM simulations with CAM6 and 8 microphysical substeps (referred to as Mg2Sub8). Mg2Sub8 has the active “nimax” limiter and simulates a low tropospheric cloud ice number like CAM6 (not shown). Compared to CAM6, the global mean λsw_cld_LGM in Mg2Sub8 decreases by 0.09 W m−2 K−1 from 0.81 to 0.72 W m−2 K−1 (Table 1). In comparison, λsw_cld_LGM decreases by 0.15 W m−2 K−1 between NoNimax and NnSub8 with “nimax” removed. Both configurations with (CAM6 and Mg2Sub8) and without “nimax” (NoNimax and NnSub8) consistently show decreases of λsw_cld_LGM in the subtropics and SH mid-latitudes (Figure 6). These results suggest that a large part of the weakening of cloud feedback with microphysical substepping is through pathways other than changing cloud ice number. For example, the rain evaporation and self-collection processes in MG2 are found to exhibit large timestep dependence in a set of preindustrial simulations (Santos et al., 2020), but the exact reasons for the timestep-dependent cloud feedback need further investigation.
3.4 Connected Cloud Feedback Between LGM and 2 × CO2 Simulations
A strong correlation between λsw_cld_LGM and λsw_cld_2× is found across the major configurations that are explored in this study (Figure 3). In the global mean, the correlation coefficient between λsw_cld_LGM and λsw_cld_2× is 0.95 (Figure 3d). A similar strong correlation (−0.93) is also found between LGM ΔGMST and ECSSOM among these CESM2 configurations (Table 1). Averaged across all the configurations, λsw_cld_2× is larger than λsw_cld_LGM by 0.11 W m−2 K−1, which is consistent with previous findings that the shortwave cloud feedback increases with GMST in CESM models (Zhu & Poulsen, 2020; Zhu et al., 2019). In the zonal mean, λsw_cld_2× is larger than λsw_cld_LGM over the middle-to-high latitudes (Figure 3c), likely linked to the more positive cloud-phase feedback in response to warming than to cooling (Zhu & Poulsen, 2020). λsw_cld_LGM is larger than λsw_cld_2× in the tropics, which could be linked to the stronger glacial trade winds and the impact on low clouds through increasing latent heat flux (Zhu et al., 2021). This high correspondence between global and regional shortwave cloud feedback in paleoclimate and present-day simulations (as well as between the LGM ΔGMST and ECSSOM) supports the notion that paleoclimate information can be used to constrain the cloud feedback and ECS (Zhu et al., 2021). Moreover, λsw_cld_LGM is obtained with paired, short PI and LGM simulations of 100 model years, which may still have large GMST trends and TOA energy imbalances (Figure 2; Table 1). The high correlation between λsw_cld_LGM and λsw_cld_2× (obtained in equilibrated SOM simulations) suggests that our major findings on the shortwave cloud feedback depend little on the equilibration state of the coupled simulations. This is further supported by the high correlation (0.95) between λsw_cld_LGM and λsw_cld_2× in shorter coupled PI and LGM simulations of 50 years.
4 A Paleoclimate-Calibrated Configuration of CESM2
As in Zhu et al. (2020, 2021), we find that CESM2 with CAM6 performs poorly in climates with large radiative forcings that exceed that of the historical record. To mitigate this shortcoming, we develop a paleoclimate-calibrated CESM2 configuration (PaleoCalibr) based on NnSub8. Whilst other configurations of CESM2, including CAM5, HetFrzOff, and Mg2Off, also produce acceptably realistic LGM global cooling (Figure 2), NnSub8 uses the advanced cloud schemes in CAM6, in particular the ice nucleation and microphysical schemes that are based more on theory or process-level understanding. NnSub8 also represents a minimal departure in model code from CAM6 and probably its future versions. Moreover, NnSub8 simulates a slightly positive shortwave cloud feedback over the SO at ∼50°–60°S (Figure 3b), which is more consistent with satellite observations (Myers et al., 2021) than HetFrzOff and CAM5. Based on NnSub8, PaleoCalibr incorporates additional minor tuning. The CLUBB gamma parameter is lowered from 0.280 to 0.275 to decrease the TOA radiation imbalance in the preindustrial simulation. The dust emission scaling factor (dust_emis_fact) is lowered from 0.70 to 0.55 to ensure a more realistic global mean dust aerosol optical depth. Additionally, a new and simple limiter on cloud ice number (Ni < 1,000 L−1) is added at the end of the microphysical calculations to ensure a realistic simulation of cloud ice number in the stratosphere. This additional model tuning and cloud-ice-number limiter have no impact on the cloud feedback and LGM cooling, which has been confirmed in additional test simulations (not shown).
We perform 500-year simulations for both the preindustrial and LGM using PaleoCalibr and the same experimental setups as the standard CESM2 runs. PaleoCalibr PI has a GMST similar to that of CESM2 (13.9°C vs. 14.1°C) and a small TOA energy imbalance (0.03 W m−2) at the end of the simulation. PaleoCalibr LGM has a ΔGMST of −6.7°C and a TOA radiation imbalance of −0.08 W m−2 (Figure 7). The projected LGM ΔGMST, using a linear regression between LGM GMST and TOA radiation, is approximately −7.3°C in PaleoCalibr, which is marginally too cold when compared with the proxy estimate (Tierney et al., 2020). We contend that the PaleoCalibr LGM is acceptably realistic and suitable for glacial climate research, considering the uncertainty in the ice sheet forcing and the absence of LGM dust forcing in our simulations (Abe-Ouchi et al., 2015; Ohgaito et al., 2018). We note that the land biogeochemistry (BGC) model is inactive in the PaleoCalibr LGM simulation but is active in PaleoCalibr PI. The PI simulation (and the associated historical and abrupt 4 × CO2 simulations) with land BGC is consistent with the available standard CESM2 DECK simulations. We decided not to include land BGC in the PaleoCalibr LGM because it produces an extra LGM cooling of >1°C after 100 simulation years (not shown) due to vegetation phenology feedbacks. This vegetation phenology-induced LGM cooling is consistent with results in CESM1.2 (Zhu & Poulsen, 2021), but we do not know how realistic it is, given that the land vegetation processes are highly parameterized for the present climate and may not work well under a much colder environment with a much lower CO2 (Lawrence et al., 2019).
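One way to read the projection step, written as a sketch under the assumption of a simple linear fit between annual-mean TOA net radiation N and LGM GMST T over the simulation (the symbols a, b, T_proj, and T_PI are introduced here for illustration only and do not come from the original analysis):

```latex
N(t) \approx a + b\,T(t), \qquad
T_{\mathrm{proj}} = -\frac{a}{b}, \qquad
\Delta\mathrm{GMST}_{\mathrm{proj}} = T_{\mathrm{proj}} - T_{\mathrm{PI}}
```

Here T_proj is the GMST at which the fitted TOA imbalance vanishes, and T_PI is the preindustrial reference GMST. Because the LGM run still has a negative imbalance, the projected cooling is larger in magnitude than the cooling at the end of the simulation.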

Time series of the global mean surface temperature anomaly in the Last Glacial Maximum (LGM) simulations with Community Earth System Model version 2 and the paleoclimate-calibrated (PaleoCalibr) configurations. Black dashed line with the gray patch denotes the 95% uncertainty interval from Tierney et al. (2020) for the LGM global cooling.
4.1 A Realistic Simulation of the Present-Day Climate
We first evaluate the cloud simulation of PaleoCalibr in an AMIP historical simulation with an active Cloud Feedback Model Intercomparison Project Observational Simulator Package. We use the Taylor diagram (Taylor, 2001) for a compact visualization of the model performance (Figure 8). A suite of model variables was compared against observations using multiple metrics including the area-weighted pattern correlation and normalized root-mean-squared differences (RMSDs), as well as the relative bias. Cloud observations that are used in the model evaluation include the climatology of TOA cloud radiative forcing from Clouds and the Earth's Radiant Energy System Energy Balanced and Filled Edition-4.1 (CERES-EBAF; Loeb et al., 2018) and the cloud fraction products from the International Satellite Cloud Climatology Project (ISCCP; Pincus et al., 2012), the Multiangle Imaging Spectro-Radiometer (MISR; Marchand et al., 2010), and the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO; Chepfer et al., 2010), as well as the liquid and ice cloud fraction from CALIPSO. Averages between 2000 and 2014 CE are used in the model-data comparison, except that CALIPSO cloud fraction between 2008 and 2020 CE is used.
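The metrics named above (area-weighted pattern correlation, normalized centered differences, and relative bias) can be sketched as follows. This is a minimal illustration, assuming complete two-dimensional latitude-longitude fields and cos(latitude) area weights; the function name, the normalization by the observed standard deviation, and the synthetic fields are assumptions made for this example and are not taken from the analysis code of this study.

```python
import numpy as np


def taylor_statistics(model, obs, lat_deg):
    """Area-weighted Taylor-diagram statistics for 2-D (lat, lon) fields.

    Returns the pattern correlation, normalized standard deviation,
    normalized centered RMSD, and relative bias. Weights are cos(latitude).
    Missing values are not handled in this sketch.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # cos(lat) weights broadcast across longitude and normalized to sum to 1
    w = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))[:, None] * np.ones_like(obs)
    w = w / w.sum()

    m_mean = np.sum(w * model)
    o_mean = np.sum(w * obs)
    m_anom = model - m_mean
    o_anom = obs - o_mean

    m_std = np.sqrt(np.sum(w * m_anom**2))
    o_std = np.sqrt(np.sum(w * o_anom**2))
    corr = np.sum(w * m_anom * o_anom) / (m_std * o_std)
    crmsd = np.sqrt(np.sum(w * (m_anom - o_anom) ** 2))

    return {
        "pattern_correlation": corr,
        "normalized_std": m_std / o_std,
        "normalized_crmsd": crmsd / o_std,
        "relative_bias": (m_mean - o_mean) / o_mean,
    }


# Synthetic fields for demonstration only (NOT observations or model output)
lat = np.linspace(-88.0, 88.0, 45)
lon = np.linspace(0.0, 357.5, 144)
lon2d, lat2d = np.meshgrid(lon, lat)
obs = 60.0 + 20.0 * np.cos(np.deg2rad(2.0 * lat2d)) + 5.0 * np.sin(np.deg2rad(lon2d))
model = obs + 3.0 * np.sin(np.deg2rad(3.0 * lat2d))

stats = taylor_statistics(model, obs, lat)
print(stats)
```

The three centered statistics are linked by the geometric relation that underlies the Taylor diagram (Taylor, 2001): the squared normalized centered RMSD equals 1 + σ² − 2σR, where σ is the normalized standard deviation and R is the pattern correlation.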

Taylor diagram evaluating key cloud variables in Community Earth System Model version 2 and PaleoCalibr with satellite observations. Model variables are from Atmospheric Model Intercomparison Project historical simulations with an active Cloud Feedback Model Intercomparison Project Observational Simulator Package. Cloud observations are the total cloud fraction from International Satellite Cloud Climatology Project (ISCCP; 60°S–60°N), Multiangle Imaging Spectro-Radiometer (MISR; 60°S–60°N and ocean-only), and Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO), the cloud phase partition between liquid and ice from CALIPSO, and the shortwave and longwave cloud radiative forcing from Clouds and the Earth's Radiant Energy System Energy Balanced and Filled Edition-4.1. Averages between 2000 and 2014 are used for the model-data comparison, except for the CALIPSO cloud fraction that is averaged between 2008 and 2020.
PaleoCalibr improves the simulations of cloud fraction and its liquid-ice partition over the standard CESM2 but has degradations in cloud radiative forcing. In the total cloud fraction, PaleoCalibr shows smaller centered pattern errors than CESM2, that is, normalized RMSDs that are closer to one when compared with ISCCP (1.33 vs. 1.37; labeled “1”), MISR (1.43 vs. 1.50; labeled “2”), and CALIPSO (1.40 vs. 1.47; labeled “3”). PaleoCalibr cloud fraction also has a greater pattern correlation with CALIPSO (0.90 vs. 0.85) than CESM2. The phase partition of cloud fraction in PaleoCalibr shows large improvements over the standard CESM2 with greater pattern correlation with the CALIPSO liquid clouds (0.86 vs. 0.76; label “4”) and smaller centered pattern error in both liquid (1.31 vs. 1.38 in the normalized RMSDs) and ice (1.47 vs. 1.61 in the normalized RMSDs; label “5”) clouds. In shortwave cloud forcing (SWCF), PaleoCalibr exhibits a slightly smaller centered pattern error (1.08 vs. 1.11; label “6”) than CESM2 but has degradations in the pattern correlation with observations (0.85 vs. 0.88). The longwave cloud forcing (LWCF) in PaleoCalibr degrades slightly from CESM2 with a larger centered pattern error (0.91 vs. 0.99 in the normalized RMSD from observation; label “7”). For all the metrics that are examined in PaleoCalibr and CESM2, the relative biases from observations (marker size in the Taylor diagram) fall within the same category, indicating that the improvements/degradations in PaleoCalibr come from a redistribution of cloud properties across the globe rather than a uniform shift.
From a spatial view, PaleoCalibr improves the cloud simulation in the Arctic but shows mixed results elsewhere (Figure 9). The standard CESM2 overestimates the Arctic cloud fraction in CALIPSO by as much as 20%, which results primarily from a larger modeled liquid cloud fraction. PaleoCalibr largely removes the model bias in CESM2 by simulating a smaller liquid cloud fraction that agrees much better with observations. In the subtropics and mid-latitudes, PaleoCalibr simulates a greater cloud fraction that agrees better with satellite observations (Figures 9a and 9b), but at the expense of degradations in the SWCF (Figure 9d), reflecting a stubborn “too-few-and-too-bright” model bias (Nam et al., 2012). In the deep tropics, PaleoCalibr simulates a smaller cloud fraction than CESM2, which agrees less with observations. Over the SO, PaleoCalibr cloud fraction and its phase partition are similar to the standard CESM2, suggesting that the improvement from CESM1 to CESM2 in the SO clouds is largely preserved in PaleoCalibr. On average, PaleoCalibr has a more positive SWCF over middle-to-high latitudes and more negative SWCF over the lower latitudes than CESM2.

Comparison of model simulations against (a) the zonal mean total cloud fraction in Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO), International Satellite Cloud Climatology Project (ISCCP), and Multiangle Imaging Spectro-Radiometer (MISR), (b) the CALIPSO liquid cloud fraction, (c) the CALIPSO ice cloud fraction, (d) the Clouds and the Earth's Radiant Energy System Energy Balanced and Filled Edition-4.1 (CERES-EBAF) shortwave cloud forcing (SWCF), and (e) the CERES-EBAF longwave cloud forcing (LWCF). Model variables are from Atmospheric Model Intercomparison Project historical simulations with an active Cloud Feedback Model Intercomparison Project Observational Simulator Package. Averages between 2000 and 2014 are used for the model-data comparison, except for the CALIPSO cloud fraction and phase partition that are averaged between 2008 and 2020. ISCCP and MISR cloud fractions are plotted between 60°S and 60°N. MISR cloud fraction values are over ocean only.
We next evaluate the coupled simulation of PaleoCalibr in a CMIP historical simulation. PaleoCalibr reproduces the magnitude of the global warming (∼1.1°C; Figure 10) from 1850 to 2014 CE in the Hadley Centre-Climate Research Unit Temperature Anomalies (HADCRU4) and the Goddard Institute for Space Studies Surface Temperature Analysis (GISTEMP; Jones et al., 2012; Lenssen et al., 2019). The large internal variability relative to the forced response prevents us from a more quantitative evaluation of the temporal characteristics of the historical simulation (green in Figure 10; Kay et al., 2015), but a visual examination suggests that the performance of PaleoCalibr is as good as the ensemble of three CESM2 historical simulations (red, orange, and pink in Figure 10).

Time series of the global mean surface temperature anomaly during the historical period from observations (black), the Community Earth System Model version 2 historical simulations using Community Atmosphere Model version 6 (three members in red, orange, and pink), and the paleoclimate-calibrated configuration (PaleoCalibr; blue). Results from the CESM1 Large Ensemble (CESM1LE; green) are shown as a reference for a possible range of internal variability. The observations are from the HADCRU4 and Goddard Institute for Space Studies Surface Temperature Analysis. Temperature anomalies are calculated from the respective 1850–1900 averages.
The spatial characteristics of the PaleoCalibr historical simulation match observations and reanalysis reasonably well, with skill largely similar to the standard CESM2, as summarized in a Taylor diagram (Figure 11). The model performance is evaluated against observations of SST from the Extended Reconstructed Sea Surface Temperature version 5 (ERSST; Huang et al., 2017), precipitation from the Global Precipitation Climatology Project version 2.3 (GPCP; Adler et al., 2018), TOA cloud radiative forcing from CERES-EBAF (Loeb et al., 2018), and the surface air temperature, sea-level pressure, and zonal wind at 300 hPa from the ERA5 (Hersbach et al., 2020). All the metrics are calculated for the mean fields averaged between 1979 and 2014, except for the CERES-EBAF cloud radiative forcing that is averaged between 2000 and 2014. The mean bias of all the fields examined is in the same category between PaleoCalibr and CESM2 (marker size in the Taylor diagram), except for precipitation that has a larger relative bias in PaleoCalibr (11.2% vs. 9.6%). Statistics for surface air temperature (labeled “2”) and longwave cloud forcing (labeled “5”) are very similar between PaleoCalibr and CESM2. The shortwave cloud forcing (SWCF; labeled “4”) shows a larger difference, which is consistent with results from the AMIP simulations. As a result of the more positive SWCF over middle-to-high latitudes and more negative SWCF over the lower latitudes, SST in PaleoCalibr is warmer by <1°C over mid-to-high latitudes and colder by <0.5°C over lower latitudes (not shown) with a larger centered pattern error than in CESM2 (label “3” in Figure 11). Small degradation in the normalized RMSD is also found in zonal wind at 300 hPa (label “6”) and precipitation (label “7”), with the former having insufficient spatial variance and the latter having too much spatial variance.

A Taylor diagram evaluating key climate variables in Community Earth System Model version 2 and PaleoCalibr coupled historical simulations using observational and reanalysis data sets. Model simulated sea level pressure, surface air temperature, and zonal wind at 300 hPa are compared with averages between 1979 and 2014 from ERA5, shortwave and longwave cloud radiative forcing are compared with averages between 2000 and 2014 from Clouds and the Earth's Radiant Energy System Energy Balanced and Filled Edition-4.1, precipitation is compared with averages between 1979 and 2014 from Global Precipitation Climatology Project version 2.3, and sea surface temperature is compared with averages between 1979 and 2014 from Extended Reconstructed Sea Surface Temperature version 5.
Based on the above analysis, we conclude that PaleoCalibr performs as well as the standard CESM2 in the simulation of key cloud and climate observations. We note that some aspects of PaleoCalibr simulations could be improved through additional parameter tuning that may have little net impact on climate sensitivity, but an extensive re-tuning of the model is beyond the scope of this study.
4.2 A Lower ECS and Weaker Cloud-Aerosol Interactions
ECS is quantified to be 3.9°C in PaleoCalibr by regressing GMST and TOA radiation in an abrupt 4 × CO2 simulation of 150 years, which is much lower than the 5.3°C in CESM2 (Figure 12). If the first 20 years are used in the regression, the difference in the estimated ECS between PaleoCalibr and CESM2 is much smaller (ΔECS = 0.2°C; 3.2°C vs. 3.4°C), suggesting that the effect from PaleoCalibr changes manifests mostly at timescales longer than 1‒2 decades. With additional SOM simulations with an abrupt 2 × CO2, ECSSOM are estimated to be 4.0°C and 6.1°C in PaleoCalibr and CESM2, respectively. The lower ECS in PaleoCalibr is consistent with the much smaller magnitude of LGM global cooling (Figure 7). The shortwave cloud feedback averaged over the last 20 years of the 4 × CO2 simulations are 0.52 and 0.74 W m−2 K−1 in PaleoCalibr and CESM2, respectively. The shortwave cloud feedback shows lower values over the mid-latitudes and the SH subtropics (Figure 13a). Compared to CESM2, the reduced shortwave cloud feedback in PaleoCalibr is more consistent with the observation-based estimates (Ceppi & Nowack, 2021; Cesana & Del Genio, 2021; Myers et al., 2021).
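The regression method used here can be sketched as follows: regress the annual-mean TOA net radiation anomaly on the GMST anomaly of the abrupt 4 × CO2 run, take the x-axis intercept, and halve it. This is a minimal illustration under stated assumptions; the function name is hypothetical, and the time series below is synthetic (an idealized linear response plus noise), not output from CESM2 or PaleoCalibr.

```python
import numpy as np


def regression_ecs(delta_gmst, delta_n, n_doublings=2.0):
    """Estimate ECS by regressing TOA imbalance on surface warming.

    delta_gmst  : annual-mean GMST anomalies (K) relative to the control run
    delta_n     : annual-mean TOA net radiation anomalies (W m-2)
    n_doublings : 2 for an abrupt 4xCO2 run, so ECS is half the x-intercept
    """
    # Fit delta_n = intercept + slope * delta_gmst (polyfit returns slope first)
    slope, intercept = np.polyfit(delta_gmst, delta_n, 1)
    x_intercept = -intercept / slope  # warming at which the TOA imbalance vanishes
    return x_intercept / n_doublings, slope, intercept


# Synthetic 150-year series for demonstration only (made-up forcing and feedback)
rng = np.random.default_rng(0)
true_forcing = 8.0    # W m-2, hypothetical 4xCO2 forcing
true_feedback = -1.0  # W m-2 K-1, hypothetical net feedback parameter
dT = 8.0 * (1.0 - np.exp(-np.arange(1, 151) / 30.0))
dN = true_forcing + true_feedback * dT + rng.normal(0.0, 0.2, dT.size)

ecs, slope, intercept = regression_ecs(dT, dN)
print(f"ECS ~ {ecs:.2f} K, feedback ~ {slope:.2f} W m-2 K-1, forcing ~ {intercept:.2f} W m-2")
```

Restricting the inputs to the first 20 years (for example, `regression_ecs(dT[:20], dN[:20])`) mirrors the short-window estimate discussed above. In this idealized series the feedback is constant, so the two windows agree to within noise, unlike in the model runs.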

Scatter plot of the global mean surface temperature anomaly (ΔGMST) and the top-of-atmosphere net radiation (ΔN) in the abrupt 4 × CO2 simulations using Community Earth System Model version 2 and PaleoCalibr. 4 × CO2 simulations are run for 150 model years. Equilibrium climate sensitivity is estimated using the regression method, that is, half of the x-axis intercept.

(a) Zonal mean shortwave cloud feedback (λsw_cld; units: W m−2 K−1) calculated in the abrupt 4 × CO2 simulations using Community Earth System Model version 2 and PaleoCalibr. Panel (b) same as panel (a), but for the aerosol cloud interaction (ACI; units: W m−2). The global mean λsw_cld and ACI are shown in the figure legend. Note. The y-axis is reversed in panel (b).
PaleoCalibr simulates an ACI that is 20% weaker than CESM2. ACI is quantified as the change in the net cloud radiative forcing between a pair of atmosphere-only simulations with aerosol emissions at 2000 CE and 1850 CE, forced with the same observational SST and sea ice (IPCC, 2013). ACI are −1.3 and −1.7 W m−2 in PaleoCalibr and CESM2, respectively. ACI weakening is mostly found at mid-to-high latitudes, where we also observe decreases in the shortwave feedback (Figure 13). This highlights the fact that aerosol forcing and cloud feedback are not independent variables (Gettelman et al., 2019; Kiehl, 2007). A weaker GHG-induced warming and a weaker aerosol-induced cooling may explain the comparable historical warming between PaleoCalibr and CESM2 (C. Wang et al., 2021; Meehl, Senior, et al., 2020).
5 Conclusions and Discussion
In this study, we have investigated the impact of key cloud parameterizations of CESM2 on the simulated LGM global temperature through coupled simulations with individual CAM6 schemes substituted one-at-a-time by older CAM5 schemes. Our investigation takes advantage of the fact that CESM2(CAM6; referred to as CAM6) simulates an excessive LGM cooling but the CESM2(CAM5; referred to as CAM5) LGM simulation falls within the proxy-suggested range (4.6°C–6.8°C; Tierney et al., 2020). The different performances of the LGM simulations imply that changes in the cloud parameterizations between CAM5 and CAM6 are responsible for the excessive LGM cooling and therefore the high climate sensitivity of CESM2. Our simulations show that the substitution of the CAM6 ice nucleation or cloud microphysics scheme with the CAM5 version (HetFrzOff or Mg2Off) produces a much more realistic LGM than the default CAM6. In contrast, substituting the moist turbulence scheme with the CAM5 version (ClubbOff) has a small impact. Specifically, the LGM ΔGMST after 100 model years are −9.0°C, −6.3°C, −6.3°C, −5.9°C, and −8.9°C in the LGM simulations using CAM6, CAM5, Mg2Off, HetFrzOff, and ClubbOff, respectively. The different magnitude of LGM cooling in these simulations is primarily caused by variations of the shortwave cloud feedback, which are 0.81, 0.29, 0.49, 0.37, and 0.64 W m−2 K−1, respectively. These sensitivity tests suggest that the increased climate sensitivity in CESM2 is largely determined by cloud microphysical processes, which has guided our further examination.
Further exploration suggests that a combination of two changes in cloud microphysics (NoNimax and Mg2Sub8) reduces the excessive LGM cooling in CESM2 to a value that is consistent with the proxy reconstruction. NoNimax improves the physical consistency of mixed-phase clouds through removing an inappropriate limiter (“nimax”) on cloud ice number. NoNimax simulates a greater cloud ice mass and a weaker shortwave cloud feedback but produces excessive numbers of cloud ice particles (with the “nimax” limiter removed). To ensure a realistic simulation of cloud ice particle number, we perform microphysical substepping (Mg2Sub8; 8 substeps in MG2), which reduces the default microphysical timestep from 600 to 75 s. Mg2Sub8, when combined with NoNimax, further weakens the shortwave cloud feedback and simulates a realistic LGM global cooling. In our test simulations, 8 substeps in MG2 are sufficient to produce converging solutions in cloud ice particle number and the shortwave cloud feedback.
A paleoclimate-calibrated CESM2 configuration (PaleoCalibr) is developed, which consists of NoNimax and Mg2Sub8 in the cloud microphysics, as well as minimal model tuning. A historical simulation using PaleoCalibr reproduces the observed twentieth century warming. PaleoCalibr also simulates the spatial characteristics of key cloud and climate variables very well with improvements in the cloud fraction and its phase partition. PaleoCalibr has a lower ECS (∼4°C) than the standard CESM2 (∼5°C–6°C) and realistic LGM global cooling (∼7°C). PaleoCalibr simulates a ∼30% weaker shortwave cloud feedback (0.52 vs. 0.74 W m−2 K−1 in the 4 × CO2 simulations) and a 20% smaller aerosol cloud interaction.
We believe PaleoCalibr is a valuable tool in climate change studies, especially when a large climate forcing is involved. Removing the cloud-ice-number limiter represents a more physically consistent treatment of the cloud ice nucleation process than the standard CESM2. The use of a smaller microphysical timestep is supported by Santos et al. (2020), who find that multiple microphysical processes in atmosphere-only preindustrial simulations using MG2 are poorly resolved with a microphysical timestep of 300 s. Our results further show that a shorter microphysical timestep decreases the shortwave cloud feedback and climate sensitivity and that a timestep of 75 s seems to produce a convergent solution in a configuration with the ∼2° atmosphere. Further study is needed to examine which microphysical processes are responsible for the timestep dependence, including the rain evaporation and self-collection processes (Santos et al., 2020).
We note that all the test simulations and the paleoclimate-calibrated configuration in this study use the CESM2 with a ∼2° atmosphere. We expect that the overall impact of removing the “nimax” limiter and applying microphysical substepping is largely independent of model resolution, but some details including the tuning parameters need to be examined if the ∼1° atmosphere model is used. For example, the exact microphysical substep number that produces a converging cloud feedback could be different due to the different model resolution and parameters. We note further that we have intentionally not performed parameter tuning for each CESM2 sensitivity configuration, which leads to a warmer preindustrial GMST in coupled and SOM simulations in some configurations (e.g., GMST is 17.8°C in the PI SOM simulation with Mg2Off). As a consequence and caveat, part of the differences in ECS and cloud feedback between CESM2 configurations could be caused by their state dependence, instead of changes in cloud treatment. However, we believe that the impact of state dependence on the global mean shortwave cloud feedback is small (<<0.1 W m−2 K−1) in simulations presented here, considering that the shortwave cloud feedback in CESM2 increases from 0.97 to 1.07 W m−2 K−1 when the background GMST increases from 15.2°C to 20.7°C (Zhu & Poulsen, 2020). Nevertheless, the large trend in the CESM2 test simulations (Table 1) prevents us from a meaningful examination of the simulated regional temperatures and the comparison with proxy SSTs in these test simulations.
Our study highlights the unique value of paleoclimate constraints in informing the cloud parameterizations and ultimately future climate projections. Among the CESM2 configurations that are explored in this study, a close correlation is found in cloud feedbacks and temperature responses between the increased-CO2 and paleoclimate simulations (Table 1 and Figures 1 and 3d), which indicates that a common set of physical processes is active in past and future climates and serves as the physical basis for a paleo-constraint on clouds and climate sensitivity (e.g., Hargreaves et al., 2012; Schmittner et al., 2011; Zhu et al., 2020). Although the paleoclimate forcing and global temperature response do not provide process-level constraints on cloud feedback processes, they serve as a critical “out-of-sample” test for the cloud parameterizations that are usually developed to match the present-day observations. We encourage the use of paleoclimate constraints as an important tool in future model development and validation, as our knowledge of past climates continues to improve and climate models become more complex. We have ongoing work to evaluate the performance of the paleoclimate-calibrated CESM2 in simulating past extreme warm climates, such as the Early Eocene (Zhu et al., 2020).
Acknowledgments
The authors thank G. Danabasoglu, V. Larson, and X. Zhao for helpful discussions and two anonymous reviewers for their constructive comments leading to improvement of the manuscript. The CESM project is supported primarily by the National Science Foundation (NSF). This material is based upon work supported by the National Center for Atmospheric Research (NCAR), which is a major facility sponsored by the NSF under Cooperative Agreement No. 1852977. This work was supported by the National Science Foundation grant 2002397 to C. Poulsen and J. Zhu. Computing and data storage resources, including the Cheyenne supercomputer (https://doi.org/10.5065/D6RX99HX), were provided by the Computational and Information Systems Laboratory (CISL) at the NCAR.
Open Research
Data Availability Statement
CESM2 model code is available at https://doi.org/10.5281/zenodo.3895315. CESM2 code modifications and simulation data developed in this study are available via the Digital Asset Services Hub (https://doi.org/10.5065/bdr7-wt42).