Assessment of Equilibrium Climate Sensitivity of the Community Earth System Model Version 2 Through Simulation of the Last Glacial Maximum

The upper end of the equilibrium climate sensitivity (ECS) has increased substantially in the latest Coupled Model Intercomparison Projects phase 6 with eight models (as of this writing) reporting an ECS > 5°C. The Community Earth System Model version 2 (CESM2) is one such high‐ECS model. Here we perform paleoclimate simulations of the Last Glacial Maximum (LGM) using CESM2 to examine whether its high ECS is realistic. We find that the simulated LGM global mean temperature decrease exceeds 11°C, greater than both the cooling estimated from proxies and simulated by an earlier model version (CESM1). The large LGM cooling in CESM2 is attributed to a strong shortwave cloud feedback in the newest atmosphere model. Our results indicate that the high ECS of CESM2 is incompatible with LGM constraints and that the projected future warming in CESM2, and models with a similarly high ECS, is thus likely too large.

record that has a narrow range of variations in climate forcing and response (Hourdin et al., 2017;Schmidt et al., 2017;Tierney et al., 2020a;Zhu et al., 2020). The Last Glacial Maximum (LGM; 21,000 before present [21 ka]) offers a prime opportunity to inform ECS, as it represents a quasiequilibrium climate state much different from today and has relatively well-known climate forcings and a high spatial coverage of proxy temperatures (CLIMAP Project Members, 1976;Kageyama et al., 2018;Tierney et al., 2020b;Waelbroeck et al., 2009). The LGM has been used to validate climate models and to constrain climate feedback processes since the 1980s (e.g. Manabe & Broccoli, 1985).
In this study, we assess whether the high ECS (>5°C) in CESM2 is supported by LGM constraints. We compare CESM2 simulated LGM cooling with proxy reconstructions and a simulation using its predecessor, the CESM1 (also referred as CESM1(CAM5)). We then use a forcing-feedback analysis to diagnose the origin of the differences between the CESM2 and CESM1 simulations.

Models and Experiments
CESM2 is the latest and most comprehensive Earth system model in the CESM series and is participating in both the CMIP6 and the Paleoclimate Modeling Intercomparison Project phase 4 (PMIP4) Danabasoglu et al., 2020;Feng et al., 2020;Gettelman et al., 2019;Meehl et al., 2020;Otto-Bliesner et al., 2020). CESM2 consists of component models of the atmosphere, ocean, land, sea-ice, and rivers. Substantial science and infrastructure improvements have been made from CESM1 to CESM2 (see Danabasoglu et al. (2020) for details), including updates to the cloud-related parameterizations in the Community Atmosphere Model version 6 (CAM6). These include a new higher-order turbulence closure scheme with a unified description of processes in the cloudy turbulent layers, changes to the two-moment stratiform microphysics scheme including a new capability to predict the mass and number concentrations of rain and snow, modifications of the mixed phase ice nucleation scheme, and many others (Gettelman et al., 2019). These changes in cloud parameterizations from CAM5 to CAM6 are the primary reasons for the increased ECS in CESM2 from CESM1 (∼5.3°C vs. ∼4.0°C) (Gettelman et al., 2012(Gettelman et al., , 2019.

The CESM2
LGM simulation in this study was designed to follow, as closely as possible, the experimental configurations for the "low-top" (also referred as CESM2(CAM6)) simulations of preindustrial and future climates for the CMIP6 Danabasoglu et al., 2020). Similar to the CESM2 CMIP6 simulations, the LGM simulation has a horizontal resolution of 0.9° × 1.25° (latitude × longitude) in the atmosphere and land models, and a nominal 1° horizontal resolution in the ocean and sea ice models. Unlike the CESM2 CMIP6 simulations, the LGM was run without active biogeochemical modules in the land or ocean due to the lack of boundary and initial conditions for the LGM vegetation and marine biogeochemical processes. Similarly, we used an older version of the river transport module due to the insufficient knowledge of the LGM river hydrography that is required by the new river model. To examine the impact of this configuration, we performed a new preindustrial simulation (PI) with these customizations and found that the mean climate state and ECS changed little from the standard CESM2 CMIP6 simulations (see Text S1 and Table S1).
The LGM boundary conditions were implemented following the PMIP4 protocol (Kageyama et al., 2017). Specifically, greenhouse gas (GHG) concentrations were 190 ppmv, 375 ppb, and 200 ppb for CO 2 , CH 4 , and N 2 O, respectively. Earth orbital parameters were fixed at 21 ka values. Land ice sheets (LISs) were prescribed using the ICE-6G reconstruction at 21 ka (Peltier et al., 2015) and comprised of changes in land surface properties (e.g., albedo), surface elevation, and land-sea distribution due to the presence of LISs and the associated effect of a lowered sea level. Surface elevation changes over the LIS and shelf-exposure regions included a resolved component of the grid-box mean elevation and an "unresolved" subgrid-scale component that was used for surface drag parameterizations (Lauritzen et al., 2015; see also Text S1). The LGM simulation used preindustrial aerosol emissions and vegetation cover, as well as a present-day vegetation phenology from satellite observations (Text S1).
Ocean temperature and salinity of the CESM2 LGM simulation were branched from an equilibrated LGM simulation using CESM1 (Table 1; DiNezio et al., 2016;Zhu & Poulsen, 2020a;Zhu et al., 2017), which in turn was initialized from a CCSM4 LGM simulation (Brady et al., 2013). The CESM2 simulation was integrated for an additional 500 years. The simulation has not reached equilibrium and has a small cooling trend of 0.07°C per 100 years in GMST and a top-of-atmosphere (TOA) radiation imbalance of approximately -0.2 W m -2 over the last 100 years of the simulation. Nonetheless for the purpose of this study, in which we show that the LGM simulation is too cold, the trends do not change our major results and conclusions (see below). To better understand the CESM2 LGM cooling, we performed additional simulations with both atmosphere-only and slab ocean configurations to quantify the LGM effective radiative forcing following the methodology in Zhu and Poulsen (2020a) (Text S2). The effective radiative forcing and feedback in CESM2 LGM simulation are compared with those in CESM1, as well as those in abrupt 4×CO 2 simulations using both models under the present-day climate Zhu et al., 2019). To illustrate the role of the new physical parameterizations in CAM6 in changing the climate response to the LGM forcing, we performed a parallel set of PI and LGM simulations using the CAM5 physical parameterizations within CESM2 (Text S3 and Table S1).
To benchmark the LGM simulations, we employed a recent synthesis of LGM sea-surface temperatures (SSTs) inferred from geochemical proxies (Mg/Ca, 37 K U  , and TEX 86 ), which used Bayesian proxy system models to systematically accommodate uncertainties in calibration and various environmental influences (Tierney & Tingley, 2014Tierney et al., 2019Tierney et al., , 2020b. Foraminiferal δ 18 O SSTs relied on unrealistic assumptions regarding the δ 18 O composition of surface seawater (Tierney et al., 2020b) and therefore were not used in this study. In addition, we compare the model simulated LGM cooling in GMST and tropical SST to the proxy-only estimates in Tierney et al., 2020b, which are 5.6°C (4.6°C-6.8°C; 95% confidence interval) and 2.6°C (2.3°C-3.0°C, 95% CI; averaged over 30°S-30°N), respectively. We note that these estimates of global and regional LGM cooling agree with several other independent studies (Bereiter et al., 2018;Friedrich & Timmermann, 2020;Holden et al., 2010;Snyder, 2016;Von Deimling et al., 2006) but are greater than some early estimates (Annan & Hargreaves, 2013;Schmittner et al., 2011;Shakun et al., 2012).

Excessive LGM Cooling in CESM2
The CESM2 LGM simulation cooled rapidly from a starting ΔGMST (LGM-PI) of approximately −6°C to more than −11°C after 500 model years ( Figure 1). CESM2 GMST is at least 5°C colder than the median of proxy estimates and at least 4°C colder than the upper limit of proxy uncertainty. The tropical mean ΔSST in CESM2 is -7°C, also much larger in magnitude than the proxy estimate of -2.6°C. In comparison, CESM1 ΔGMST and tropical mean ΔSST are -6.8°C and -3.6°C, respectively, which are near the upper end of the proxy uncertainty range.
The large discrepancy between CESM2 and proxy temperatures is also evident at individual proxy-SST sites ( Figure 2  U  records, respectively. In contrast, CESM1 ΔSSTs are much smaller in magnitude and have values that fall within the uncertainty range for approximately 60% of the proxy records. RMSE in CESM1 ΔSST (∼1.7°C-2°C) is much smaller than that in CESM2 and only slightly larger than the mean standard error in the proxies (∼1.5°C; arithmetic mean of the standard error of all available SST records).
The cooling of the LGM climate in CESM2 compared to CESM1 is primarily attributed to the update of the atmosphere component from CAM5 to CAM6. We demonstrate this with an additional set of LGM and ZHU ET AL.
10.1029/2020GL091220 4 of 10  PI simulations (CESM2(CAM5)), in which the same boundary conditions and model configurations as those in the CESM2 simulations are used with the exception that the CAM5 physical parameterization package, the default option in CESM1, is implemented. In CESM2(CAM5), the LGM ΔGMST decreases from -6.0°C to -6.8°C during the first 90 years, and then increases gradually to approximately -6.5°C after 500 model years (Figure 1). The CESM2(CAM5) ΔGMST is thus smaller in magnitude than that in CESM2 and agrees better with CESM1 (and proxy estimates), suggesting that differences between CAM6 and CAM5 predominantly contribute to the large magnitude of LGM cooling in CESM2. At individual sites, CESM2(CAM5) ΔSST also agrees better with proxy data compared to CESM2 (RMSE = ∼1.5°C; not shown). The smaller magnitude of the LGM ΔGMST in CESM2(CAM5) compared to CESM1 (-6.5°C vs. -6.8°C) is primarily due to lower snow and ice albedos in the new version of the land model (Lawrence et al., 2019).
We note that the above comparison of proxy and CESM2(CAM6) temperatures is for the average of the last 50 years (year 451-500) of the simulation, during which the TOA radiation imbalance is approximately -0.20 W m -2 (Table 1) and the model surface temperature is still decreasing. The model-data disagreement would be even more prominent if the CESM2 LGM simulation was extended further.

Effective Forcing and Feedbacks in CESM2
To understand the LGM cooling in CESM2, we first quantify the effective radiative forcing (F eff ) following the methodology in Zhu and Poulsen (2020a). F eff includes the instantaneous radiative forcing and adjustments from the atmosphere and surface, and is more closely related to surface warming in the forcing-feedback framework (Sherwood et al., 2015).
LGM F eff comprises contributions from GHG and LIS, which are obtained using both "fixed-SST" simulations in an atmosphere-only configuration and slab ocean simulations (SOM) with thermodynamic atmosphere-ocean coupling but inactive ocean dynamics (Text S2 and  Table S2). Simulations with prescribed sea-surface conditions provide changes in TOA radiation (F fsst ) and land surface temperature (ΔT lnd ) in response to a forcing agent (GHG or LIS); SOM simulations provide estimates of the climate sensitivity parameter associated with the forcing. The effective radiative forcing of GHG or LIS is estimated by correcting the effects of ΔT lnd on F fsst using the climate sensitivity parameter (Text S2).
The LGM effective radiative forcing is -5.2 W m -2 ( Table 1) with contributions of -3.1 and -2.1 W m -2 from GHG and LIS, respectively (Table S2). The magnitude of F eff in CESM2 is smaller than in CESM1 (-6.0 W m -2 ) due primarily to the lower albedo of snow and ice in the new land model (Lawrence et al., 2019). A weaker LGM forcing in CESM2 than in CESM1 implies that climate feedbacks in CESM2 must be stronger in order to explain the larger temperature response.
Accordingly, the effective climate feedback parameter (λ eff ) in CESM2 is -0.48 W m -2 K -1 and larger than that in CESM1 by 0.40 W m -2 K -1 (Table 1). Here λ eff is calculated as where ΔN and ΔGMST are the final 50-year mean TOA radiation imbalance and LGM global cooling in the coupled simulation (Table 1). A less negative λ eff in CESM2 indicates that the net climate feedback is less efficient at damping the surface cooling; that is, CESM2 is more sensitive to the LGM forcing than CESM1. We note that the LGM simulations are not initialized from the unperturbed preindustrial state and that ice sheets have nonradiative pathways to change surface temperature (Zhu & Poulsen, 2020a), which prevent us from implementing the traditional "Gregory method" (Gregory et al., 2004) to estimate the effective radiative forcing and feedback.
We attribute the larger climate feedback in CESM2 to a stronger shortwave cloud feedback than in CESM1. This is consistent with previous studies that show that the shortwave cloud feedback is mostly responsible for the ECS increase from CESM1 to CESM2 and that other climate feedbacks (such as water vapor and albedo feedbacks) depend less on atmospheric physical parameterizations and stay largely unchanged between model versions (Gettelman et al., 2019;. To illustrate this for the LGM, we calculate the shortwave cloud feedback parameter ( sw_cld  ) in the LGM simulations using the approximate partial radiative perturbation method (Taylor et al., 2007). Global mean sw_cld  is 0.74 W m -2 K -1 in CESM2, which is 0.38 W m -2 K -1 larger than that in CESM1 (Table 1 and Figure 3). This result suggests that the stronger shortwave cloud feedback explains the vast majority of the larger effective climate feedback (0.40 W m -2 K -1 ) in CESM2 and therefore the greater LGM cooling.
In response to the LGM forcing, CESM2 sw_cld  exhibits a similar spatial pattern as CESM1 but with more positive values at almost all latitudes (Figure 3). Both CESM2 and CESM1 show maximum cloud feedbacks over the stratocumulus decks in the eastern subtropical ocean basins with larger values and spatial extent in CESM2. At ∼15°S, the zonal mean sw_cld  in CESM2 is ∼2 W m -2 K -1 larger than in CESM1 (Figure 3e). In the Southern Ocean (∼60°S), both models show minimum sw_cld  with values close to zero in CESM2 and -1 W m -2 K -1 in CESM1. Over the Laurentide and Eurasian ice sheets, both models have negative sw_cld  of similar magnitude, which primarily reflects the presence of LGM ice sheets that occupy the space containing air masses and clouds in the PI simulation (Zhu & Poulsen, 2020a). In mid-latitudes (∼30°N/S-40°N/S), sw_cld  in CESM2 is smaller than in CESM1, likely reflecting differences in the extent of equatorward sea-ice expansion and shifts of storm tracks. An equatorward shift of storm tracks associated with global cooling moves cloud systems toward lower latitudes with higher insolation, which causes additional cooling (a positive feedback). A stronger global cooling in CESM2 (ΔGMST of -11.3°C vs. -6.8°C in CESM1) leads to ZHU ET AL.

Discussion: The Connection to 4×CO 2 Simulations
The sw_cld  in LGM simulations is closely connected to that in abrupt 4×CO 2 simulations, which largely determines intermodel differences in ECS under the present-day climate Gettelman et al., 2019). Compared to CESM1, CESM2 global mean sw_cld  is larger by ∼0.4 W m -2 K -1 in both the LGM and 4×CO 2 simulations (Table 1). For both simulations, the intermodel difference in sw_cld  is most prominent over the Southern Ocean and the Southern Hemisphere subtropics (red vs. blue lines in Figure 3e), likely reflecting a Southern Ocean origin of the different model behavior  that impacts the lower latitudes through ocean and atmosphere processes (Zhu & Poulsen, 2020a). These results suggest that the cloud processes that give rise to the higher ECS in CESM2 (in the 4×CO 2 simulation) act in a similar way to produce the larger LGM cooling.
Comparing LGM and 4×CO 2 simulations, both CESM2 and CESM1 exhibit broadly consistent regional differences in sw_cld  ( Figure 3f). Over the northern subtropical Pacific and Atlantic, sw_cld  is larger in LGM than 4×CO 2 simulations (Figures 3a-3d), which is a characteristic feature associated with the southward shift of storm tracks (Lofverstrom, 2020;Lofverstrom et al., 2016) and cloud systems (Zhu & Poulsen, 2020a) due to LGM ice-sheet forcing. In the southern subtropical oceans, sw_cld  is also stronger in LGM simulations, likely due to a strengthening of the tropical easterlies in a cold climate (e.g. Williams & Bryan, 2006). A stronger surface wind increases the boundary layer turbulence and the associated latent heat flux, leading to more low clouds with surface cooling and therefore a stronger sw_cld  than simulations with a warmer climate and weaker winds (Bretherton et al., 2013). This wind-driven mechanism could be further amplified by the SST pattern effect through changing the lower tropospheric stability and low clouds (Armour et al., 2013;Rose et al., 2014;Zhou et al., 2016). Over the middle to high latitudes, sw_cld  is weaker in LGM than 4×CO 2 simulations, likely due to an increase of cloud ice fraction with the background cooling and the more negative feedback associated with the phase transitioning between cloud ice and liquid (Frey & Kay, 2018;Mitchell et al., 1989;Tan et al., 2016;. The close connection in global and regional sw_cld  between LGM and 4×CO 2 simulations support the use of LGM constraints to inform cloud feedbacks and ECS. The LGM constraints, therefore, suggest that some of the new cloud-related parameterizations, including schemes of the stratiform microphysics, shallow convection and moist turbulence, and the mixed phase ice nucleation (Gettelman et al., 2019), likely produces unrealistic response to large variations in external forcings. Examining each of these new schemes in the LGM context is an ongoing study and beyond the scope of this study.

Conclusions
In this study, we performed a CESM2 simulation of the LGM and found that the simulated global cooling exceeds -11°C, which is at least 5°C colder than the median of the latest proxy-based estimates (Tierney et al., 2020b).
LGM tropical mean ΔSST is -7°C in CESM2, which is also much colder than the proxy data estimate (-2°C --3°C). At individual proxy SST sites, CESM2 LGM is on average 4°C-5°C colder than the proxy. In comparison, global and regional cooling in CESM1 (predecessor to CESM2) is much smaller and agrees better with proxies. The larger LGM cooling in CESM2 is attributed to the new physical parameterizations in the atmosphere model (CAM6), which simulates a stronger shortwave cloud feedback than its predecessor (CAM5 in CESM1).
The discrepancy between CESM2 simulated LGM cooling and proxy data cannot be explained by uncertainties in either LGM temperature reconstructions or climate forcings. If earlier proxy syntheses are used for the comparison (CLIMAP Project Members, 1976;Waelbroeck et al., 2009), the model-data discrepancy would be more prominent, as these earlier studies suggested more modest LGM cooling (Tierney et al., 2020b). If realistic distributions of dust aerosols and vegetation cover were available and implement-ed, the CESM2 simulated LGM would be colder and even more difficult to reconcile with the proxy data (Köhler et al., 2010;Sherwood et al., 2020;Tierney et al., 2020b).
Our study suggests that the ECS in CESM2 is too high and closely related to the large model-data discrepancy in the LGM cooling. This result indicates either that the shortwave cloud feedback in CESM2 is too strong or that current climate models are missing important physical processes that counter the shortwave cloud feedback and stabilize the climate. Whichever is true, the projected future warming using high-ECS models, such as CESM2, is likely overestimated. This conclusion from constraints using a past cold climate is the same as our previous studies using past extreme warm climate of the Early Eocene (Zhu et al., 2019(Zhu et al., , 2020. Our findings are also consistent with independent studies, which show that the warming trends in high-ECS models including CESM2 are too large than that in the instrumental record (Brunner et al., 2020;Nijsse et al., 2020;Tokarska et al., 2020). Here, we emphasize the unique strength of the paleoclimate constraint as a true "out-of-sample" test and its advantage of being independent on the transient climate change and internal variability.

Data Availability Statement
Computing and data storage resources, including the Cheyenne supercomputer (doi:10.5065/D6RX99HX), were provided by the Computational and Information Systems Laboratory (CISL) at NCAR. Relevant CESM simulation data and results on the cloud feedback analysis are available in the Zenodo repository (https:// doi.org/10.5281/zenodo.4075597).