Simulations for CMIP6 With the AWI Climate Model AWI‐CM‐1‐1

The Alfred Wegener Institute Climate Model (AWI‐CM) participates for the first time in the Coupled Model Intercomparison Project (CMIP), CMIP6. The sea ice‐ocean component, FESOM, runs on an unstructured mesh with horizontal resolutions ranging from 8 to 80 km. FESOM is coupled to the Max Planck Institute atmospheric model ECHAM 6.3 at a horizontal resolution of about 100 km. Using objective performance indices, it is shown that AWI‐CM performs better than the average of CMIP5 models. AWI‐CM shows an equilibrium climate sensitivity of 3.2°C, which is similar to the CMIP5 average, and a transient climate response of 2.1°C, which is slightly higher than the CMIP5 average. The negative trend of Arctic sea‐ice extent in September over the past 30 years is 20–30% weaker in our simulations compared to observations. With the strongest emission scenario, the AMOC decreases by 25% by the end of the century, which is less than the CMIP5 average of 40%. Patterns and even magnitude of simulated temperature and precipitation changes at the end of this century compared to present‐day climate under the strong emission scenario SSP585 are similar to the multi‐model CMIP5 mean. The simulations show an 11°C warming north of the Barents Sea and around 2°C to 3°C over most parts of the ocean as well as a wetting of the Arctic, subpolar, tropical, and Southern Ocean. Furthermore, in the northern middle latitudes in boreal summer and autumn as well as in the southern middle latitudes, a more zonal atmospheric flow is projected throughout the year.


Introduction
Around 50 institutions worldwide are participating in the current sixth phase of the Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016). The Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research in Germany contributes for the first time to CMIP with the novel Finite Element Sea Ice-Ocean Model (FESOM) coupled to the atmosphere model ECHAM6 developed at Max Planck Institute (MPI) for Meteorology in Hamburg. The novelty of FESOM lies in the use of global unstructured meshes that only a few institutions worldwide are employing at this stage (e.g., Korn, 2017; Petersen et al., 2019). The unstructured-mesh approach allows putting a particular focus on dynamically active regions such as the North Atlantic Current, the Southern Ocean, and the tropics while using relatively coarse resolution elsewhere. For the set of "Diagnostic, Evaluation and Characterization of Klima" (DECK) and ScenarioMIP experiments, a mesh with local refinement of up to 8 km in the North Atlantic Current and the Southern Ocean is used. Coupling the unstructured ocean model FESOM to ECHAM6, which is also used for the MPI-ESM contribution to CMIP6, offers the unique opportunity to investigate the influence of an alternative ocean model formulation on the results, which will be exploited in further research.
Many models that participated in CMIP3 and CMIP5 have common descent and share ideas and code with each other (Knutti et al., 2013; Masson & Knutti, 2011). This leads to a clustering of results based on model "genealogy" and challenges the assumption of model independence. The ocean part of the AWI-CM is a new unstructured mesh model. It is thus based on a different dynamical core compared to most of the models contributing to CMIP6. Although many parameterizations in FESOM are similar to conventional structured-grid ocean models, and although the ECHAM model has participated in CMIP since CMIP3 (Stevens et al., 2013), it can be argued that the use of an unstructured-mesh sea ice-ocean model is an important contribution to the diversity of the CMIP6 ensemble. Large-scale characteristics dominated by the formulation of the atmosphere, such as the equilibrium climate sensitivity, are not expected to be strongly influenced by the ocean formulation. In contrast, the ocean has the potential to modulate the transient evolution and regional patterns of the response considerably. This can lead to differences in projected changes of coupled phenomena such as the El Niño-Southern Oscillation (ENSO) as well as sea ice in polar regions.
The aim of this paper is to present the main characteristics of the AWI-CM in the context of the CMIP6 project based on an evaluation of selected atmosphere, ocean, and sea ice parameters for present-day climate as well as for future climate. The evaluation of the unstructured mesh ocean component compared to the traditional mesh ocean component of Max Planck Institute for Meteorology (MPIM) is beyond the scope of this study and will be the topic of a collaborative publication with the MPIM.
In section 2, a brief model description is given along with a summary of the performed DECK and ScenarioMIP simulations, following the CMIP protocol. In section 3, remaining model drift and imbalances are analyzed. Section 4 describes biases in our present-day simulations for some important atmosphere, sea ice, and ocean variables. The climate change signal is analyzed in detail in section 5. Finally, a discussion of the results and conclusions are presented in sections 6 and 7.

Model Description
The sea ice-ocean component of AWI-CM is the Finite Element Sea Ice-Ocean Model (FESOM; see Danilov et al. (2004) for the sea ice component and Wang, Danilov, et al. (2014) for the ocean component). It uses unstructured meshes that allow simulations of ocean and sea ice dynamics with variable grid resolution. This also enables refinement in resolution for areas where small-scale dynamics are prevalent (e.g., narrow straits and strongly eddying regions; Sein et al., 2016, 2017). Tools have been developed to enable users of FESOM data to perform analysis efficiently (see Appendix A1). Furthermore, selected variables are also available on regular latitude-longitude meshes.
The atmospheric component of AWI-CM is the spectral atmospheric model ECHAM6.3.04p1 from MPIM (Stevens et al., 2013) which is used here without any additional modifications or tuning. This version of ECHAM is also used in the MPIM contribution to CMIP6. Having these two setups thus will allow future intercomparisons of the coupled systems that share the same atmosphere model but use different sea ice-ocean models.
A more detailed description of the AWI-CM components and an evaluation of its mean state and climate variability are provided in Sidorenko et al. (2015) and Rackow et al. (2018), respectively. AWI-CM realistically simulates many aspects of the modern climate, showing an overall performance that is generally better than the most realistic climate models participating in CMIP5.
The CMIP6 version of the code encompasses several changes compared to that described in Sidorenko et al. (2015) and Rackow et al. (2018). The major technical improvement involves the removal of the regular exchange mesh, which in earlier versions was used as an interface between FESOM and the OASIS3-MCT coupler. In the CMIP6 version, the interpolation between unstructured FESOM and structured ECHAM meshes is done by the coupler. Furthermore, the coupling between ocean and atmosphere has been sped up remarkably through the use of the parallel support built in OASIS3-MCT.
Updates of physical parameterizations in the sea ice-ocean component comprise the inclusion of (1) a salt plume parameterization, which improves the simulated sea surface salinity in the Arctic Ocean, (2) modified background diffusivities, as suggested by Wang, Danilov, et al. (2014), and (3) a K-Profile Parameterization (KPP) for vertical mixing (Large et al., 1994) in the ocean model, which has solved shortcomings related to the North Atlantic circulation pointed out in Rackow et al. (2018) and Sidorenko et al. (2015). Those previous publications were based on simulations on different meshes and with constant rather than transient forcing. This, and the fact that the present simulations were performed within the CMIP6 framework according to a common protocol, calls for the documentation of the CMIP6 version of the model and its results presented in this paper.

CMIP6 Simulations
In this paper, the focus is on the DECK and ScenarioMIP simulations, which were defined in the CMIP6 overview paper and are summarized in Table 1. Before starting the 500-year coupled piControl-spinup simulation with constant pre-industrial forcing, a 10-year long ocean-only simulation initialized from the EN4 ocean reanalysis (Good et al., 2013) averaged over 1950-1954 has been performed. In these 10 years of ocean-only simulation, the initial adjustment of the ocean state takes place. This pre-spinup helps to ensure a numerically stable adjustment phase of the coupled system. The piControl simulation is a continuation of the piControl-spinup simulation. From the piControl simulation, the idealized greenhouse gas forcing simulations 1pctCO2 and abrupt-4xCO2 as well as the historical forcing simulations are branched off at specific years (branch-off point(s); see Table 1). At the end of the historical forcing simulations, that is, at the end of the year 2014, the scenario simulations are continued with forcing prescribed from the anthropogenic forcing scenarios. These scenarios are derived from Shared Socioeconomic Pathways (SSP) (Meinshausen et al., 2019).
The idealized and historical forcing simulations have been branched off sufficiently long before the end of the piControl simulation to ensure that every year of the sensitivity simulations (idealized, historical, and scenario simulations) has a corresponding year in the piControl simulation. The climate change signal is always computed following the delta approach (e.g., Lenderink et al., 2007), that is, as the difference between the sensitivity simulation and the corresponding year(s) of the piControl simulation, to account for possible model drift.
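The delta approach described above can be illustrated with a minimal sketch. The function name, the dictionary layout, and all numbers below are hypothetical and only serve to show the year-by-year pairing of a sensitivity run with the piControl run; they are not model output.

```python
def delta_signal(sensitivity, control, branch_year):
    """Climate change signal as sensitivity run minus the matching piControl year.

    sensitivity: dict mapping simulation year -> value in the sensitivity run
    control:     dict mapping piControl year -> value in the control run
    branch_year: piControl year paired with the first sensitivity year
                 (hypothetical here; the real branch-off points are in Table 1)
    """
    first_year = min(sensitivity)
    return {
        year: value - control[branch_year + (year - first_year)]
        for year, value in sensitivity.items()
    }


# Illustrative global-mean 2 m temperatures in °C (made-up values)
control = {250: 13.80, 251: 13.82, 252: 13.81}
historical = {1850: 13.85, 1851: 13.90, 1852: 13.95}

signal = delta_signal(historical, control, branch_year=250)
```

Because each sensitivity year is paired with its own control year, any slow drift common to both runs cancels in the difference, to the extent that the drift adds linearly to the forced response.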
The ECHAM model is run at a spectral resolution of T127L95, where T127 denotes a spectral truncation at total wavenumber 127, which corresponds to about 100 km horizontal resolution in the tropics and higher horizontal (zonal) resolution toward the poles, for example, about 25 km at 75° latitude. L95 stands for 95 unevenly spaced model levels with high vertical resolution close to the surface (60 to 300 m in the atmospheric boundary layer) and reaching up to 0.01 hPa corresponding to 80 km (i.e., high-top model version).
Note. The forcing of the ScenarioMIP simulations is described in more detail in Meinshausen et al. (2019).
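The quoted grid spacings can be checked with a short back-of-the-envelope calculation. The sketch below assumes that T127 is paired with a Gaussian grid of 384 longitudes and uses a mean Earth radius of 6,371 km; both values are assumptions not stated in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius
N_LON_T127 = 384          # assumption: longitudes of the grid paired with T127


def zonal_spacing_km(n_lon, lat_deg):
    """Zonal distance between neighbouring grid points at a given latitude."""
    circle_km = 2.0 * math.pi * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))
    return circle_km / n_lon


spacing_equator = zonal_spacing_km(N_LON_T127, 0.0)
spacing_75 = zonal_spacing_km(N_LON_T127, 75.0)
```

Under these assumptions the zonal spacing is slightly above 100 km at the equator and a little above 25 km at 75° latitude, in line with the approximate values given in the text.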
In our AWI-CM-1-1-MR CMIP6 contribution, the FESOM model is run on a medium-resolution "MR" mesh that follows the mesh design strategy proposed by Sein et al. (2016, 2017) (Figure 1): The main approach is to locally increase the resolution over areas of high sea surface height (SSH) variability as obtained from satellite data. The horizontal resolution of the mesh varies from 8 km over energetically active areas such as the North Atlantic Current region to 80 km over areas with low SSH variability. The number of surface grid points of the MR mesh is close to the number of grid points in conventional regular model grids of ¼° resolution. The performance of the "MR-type" meshes in a climate configuration with AWI-CM in comparison to several other FESOM meshes is evaluated in Rackow et al. (2019), Sein et al. (2018), and de la Vara et al. (2020).

Cmorization and Data Publication
CMIP6 is a community project, and sharing our experiment result data is an important aspect of the project.
To be able to utilize data from other groups, a large set of output data has been defined where the attributes and detailed description for each dataset are put in place as a reference. These are called the CMIP6 CMOR data request (DR) tables (cmip6-cmor-tables, 2019). The tables have evolved to a great extent over the past 3 years. All CMIP6 data, including the AWI-CM CMIP6 data, are being published through the Earth System Grid Federation (ESGF) (Juckes et al., 2020).
From a technical point of view, we first chose which variables to generate during our model runs, as re-running the simulations is usually not feasible due to time and resource constraints. We currently produce around 150 variables matching the recent CMIP6 CMOR DR tables. The model has been optimized to output the data in a very resource-efficient manner; this enables us to use fewer computing resources and complete the simulations more quickly. Because the CMIP6 CMOR DR tables have undergone many changes regarding the required output contents and metadata, we had to develop a flexible strategy to transform the simulation output into the required publishable format.
As a result, we now have a post processing software in place, which can directly be fed with the aforementioned DR tables to produce the output accordingly (Hegewald, 2019). More details on the procedure and an explanation of how to use unstructured mesh data from the ESGF can be found in Appendix A1.

Remaining Drift and Imbalances in the Pre-Industrial Control Simulation
In the pre-industrial control simulation, AWI-CM is in quasi-equilibrium: The 2 m temperature drift from Year 150 to Year 400 of the piControl simulation (the time period to which most of the historical, scenario, and idealized CO2 increase experiments need to be compared) amounts to 0.00022°C/year. Furthermore, sea ice trends are ranging from −6.9 × 10² to −2.7 × 10² km²/year for the Arctic and from −4.4 × 10² to −2.6 × 10² km²/year for the Antarctic computed for the Years 150 to 400 during March and September, respectively. This suggests that any residual drift of 2 m temperature and sea-ice extent in the coupled system is much smaller than the changes anticipated in a warming world. Figure 2 shows the Hovmöller diagrams for the global average profiles of oceanic potential temperature and salinity for the last 400 years of the control simulation. The amplitude of the drift is less than 0.15°C for temperature and 0.05 psu for salinity, respectively, indicating that the system is close to its quasi-equilibrium state. The drift in temperature is concentrated at depths of 500, 1,500, 3,000, and 4,500 m, while the drift in salinity happens mainly at depths of 500 and 2,000 m. From inspecting the spatial distribution of the drift (not shown), we conclude that the upper drift zone at 500 m stems primarily from the overall cooling and freshening of the ocean. The drift between 1,500 and 2,000 m is partly linked to the Mediterranean outflow which spreads into the southern North Atlantic. The simulated outflow is too warm and too salty. At 3,000 m, we observe that the Atlantic and Pacific Oceans become cooler while Indian and Southern Oceans show positive trends in temperature. Simultaneously, salinity in the North Atlantic shows a negative trend at this depth, partly compensating the warming signal there in terms of density. Everywhere else at this depth there is a positive drift in salinity, most pronounced in the Indian Ocean.
Finally, the deepest zone of temperature increase at 4,500 m stems from a warming trend in the Southern Ocean. Although the spatial pattern of non-zero temperature changes implies a small remaining redistribution of heat and salinity, we overall conclude that the system is close to a quasi-equilibrium state. Simulated changes in response to greenhouse gas increases are clearly stronger than this residual drift as shown in section 5.3.
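A drift such as the one quoted for 2 m temperature is commonly diagnosed as the slope of a linear least-squares fit over the analysis period. The sketch below assumes this approach; the fitting method is not stated in the text, and the time series is synthetic, constructed only to carry the quoted drift of 0.00022°C/year.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope, in units of `values` per year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


# Synthetic piControl series for Years 150 to 400 (not model output)
years = list(range(150, 401))
t2m = [13.8 + 0.00022 * (year - 150) for year in years]

drift_per_year = linear_trend(years, t2m)
total_drift = drift_per_year * (years[-1] - years[0])
```

Accumulated over the 250 years considered, a drift of this size amounts to roughly 0.06°C, which is small compared to the warming of several degrees discussed for the scenario simulations.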
In the last 100 years of the 500-year piControl simulation, which followed the 500-year spinup simulation, there are still imbalances in the top-of-atmosphere (TOA) and net surface radiation. Averaged over these 100 years, the TOA radiation imbalance amounts to 0.34 W/m², whereas the net surface energy flux consisting of radiation and turbulent heat fluxes amounts to 0.84 W/m². Given that changes in the atmospheric energy content on this time scale are much smaller, the discrepancy implies an unphysical atmospheric energy non-conservation of about 0.5 W/m². By using the delta approach in the evaluation of the climate change signal as briefly introduced in section 2.2, this non-conservation is canceled out although one needs to keep in mind the non-linearity of the system. The gradual energy loss of the ocean over the same time period, diagnosed from changes in the 3D ocean temperature (and sea-ice mass changes), corresponds to a global surface energy flux of −0.01 W/m². The deviation from the atmospheric surface flux imbalance by 0.85 W/m² cannot be explained by changes in the continental heat content but points to further deviations from energy conservation that can be related to mismatching grids and coastlines between the model components, inconsistent treatment of temperature, precipitation, and runoff (Mauritsen et al., 2012), or other inconsistencies. The atmosphere-related and the surface-related non-conserving energy terms partly compensate each other, resulting in an overall unphysical energy sink of −0.35 W/m², and both of them are relatively constant over all simulations (when averaged over decades and longer; not shown).
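The energy budget numbers in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the three fluxes quoted in the text; the variable names and the sign convention (positive downward, positive residual meaning energy is created) are choices made for this illustration.

```python
# Fluxes quoted in the text, in W/m² (positive downward)
toa_imbalance = 0.34       # net radiation at the top of the atmosphere
surface_flux = 0.84        # net surface energy flux seen by the atmosphere
ocean_heat_uptake = -0.01  # flux implied by ocean heat content and sea-ice changes

# Energy leaving the atmosphere at the surface exceeds what enters at the TOA,
# so the atmosphere acts as an unphysical energy source.
atmosphere_residual = surface_flux - toa_imbalance

# The ocean and sea ice receive less energy than the atmosphere hands over,
# so the surface coupling acts as an unphysical energy sink.
surface_residual = ocean_heat_uptake - surface_flux

# Both terms partly compensate, leaving the overall sink of the coupled system.
net_residual = atmosphere_residual + surface_residual
```

With these inputs the atmospheric term is 0.50 W/m², the surface term is −0.85 W/m², and their sum is −0.35 W/m², matching the values stated in the text.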

Performance Indices
In order to objectively characterize the performance of the historical simulations compared to observations, we use modified performance indices by Reichler and Kim (2008) as described in Sidorenko et al. (2015) for the atmosphere and in Rackow et al. (2019) for the ocean. The referenced reanalysis and observation data the model is compared to and a description of the computation of the index are given in Appendix A2.
The index measures model error compared to observations relative to the average model error of CMIP5 models. A performance index of 0.5 would indicate an excellent performance as the mean absolute error is halved compared to the CMIP5 models, while a performance index of 2 would indicate a doubling of the mean absolute error compared to the CMIP5 models. Table 2 shows the atmosphere performance indices of the first ensemble member of the historical simulations. For the other four ensemble members of the historical simulations, the results are very similar (not shown). While the performance indices are first computed for each season individually, here, for brevity, we show the annual average. Globally, AWI-CM shows a good performance in all considered variables and is better than the CMIP5 multi-model mean. Antarctic large-scale circulation and sea ice concentration in particular are very well represented compared to the average of the CMIP5 models. However, there are a few variables such as precipitation, 500 hPa geopotential, and Arctic sea ice which are not represented better than by the CMIP5 models in all regions (only global mean, Arctic, and Antarctic shown for brevity). As pointed out in section 5.2.2, the Arctic sea-ice extent is very well represented both in terms of the mean value and in terms of the trend over the past 3 decades. The sea ice concentration is underestimated in boreal summer and autumn in the interior Arctic (see section 4.5), but the sea-ice extent is not affected by this since values are generally between 50% and 90% and therefore are still well above the threshold of 15%.
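The basic idea of such an index, a model's mean absolute error divided by the average mean absolute error of a reference ensemble, can be sketched as follows. This is a simplified illustration only: the actual computation is described in Appendix A2 and may include area weighting, seasonal treatment, and normalization steps not reproduced here. All function names and values are hypothetical.

```python
def mean_absolute_error(model, obs):
    """Unweighted mean absolute difference between model and observations."""
    return sum(abs(m - o) for m, o in zip(model, obs)) / len(obs)


def performance_index(model, obs, reference_models):
    """Model error relative to the average error of a reference ensemble."""
    ref_errors = [mean_absolute_error(ref, obs) for ref in reference_models]
    mean_ref_error = sum(ref_errors) / len(ref_errors)
    return mean_absolute_error(model, obs) / mean_ref_error


# Made-up anomaly fields on four grid points
obs = [0.0, 0.0, 0.0, 0.0]
candidate = [0.5, -0.5, 0.5, -0.5]
reference_ensemble = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]

index = performance_index(candidate, obs, reference_ensemble)
```

In this toy case the candidate has half the mean absolute error of the reference ensemble, so the index is 0.5, the situation the text describes as an excellent performance.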
From the performance indices for the ocean (Table 3), we can conclude that potential temperature is better represented than in CMIP5. However, this is not the case for salinity. Salinity in the Pacific Ocean as well as in the North Atlantic Ocean deviates more from observations compared to the average of CMIP5 models.
While the performance indices give a quick and objective overview of how a model performs compared to other CMIP5 models, it is necessary to carry out more detailed analysis to investigate if typical errors of climate models such as the Southern Ocean warm bias or the cold bias in the North Atlantic subpolar gyre persist. Regarding the errors in potential temperature and salinity, more analysis is provided in section 4.4.

Atmospheric Circulation
AWI-CM shows a too strong westerly flow above the Southern Ocean especially in austral summer, indicated by too low mean sea level pressure (MSLP) over the southern high latitudes and too high MSLP over the southern middle latitudes (Figure 3). In the Euro-Atlantic sector, there is evidence for a southward shift of the jet stream, resulting in a too strong westerly flow over Southern Europe and a too weak westerly flow over Northern Europe in boreal winter and spring. This bias has been found in numerous CMIP5 models (Zappa et al., 2013), and it can be associated with an underestimation of Euro-Atlantic blocking (Jung et al., 2012). Especially in boreal winter, the Aleutian low is too weak. This feature was observed in previous ECHAM6 simulations as well (Stevens et al., 2013). The MSLP biases are not negligible and amount to up to 7 hPa. In the regions where they occur, these biases are comparable to the climate change signal, indicating that the confidence in projections of circulation changes is low.
The MSLP bias is dependent on the season as shown in Figures 3a-3d. However, in the following, we will also consider the annual mean sea level pressure biases (

ENSO Statistics and Phase Locking
Sea surface temperature (SST) anomalies in the tropical Pacific associated with the El Niño-Southern Oscillation (ENSO) are of global concern. Since ENSO is the largest signal of interannual variability on Earth (e.g., Timmermann et al., 2018), the realistic simulation of these SST anomalies, both with respect to their absolute magnitude and temporal behavior, is crucial for any global climate model.
When comparing area-weighted SST anomalies in the Niño 3.4 box (170°W to 120°W, 5°S to 5°N) to observations, we find that the five historical ensemble members with AWI-CM show a realistic distribution (Figure 5). The clear asymmetry between El Niño and La Niña events seen in observations (positive skewness of Niño 3.4 SST anomalies) is also evident in the model. The skewness is 0.15 ± 0.16 (one standard deviation) in the five ensemble members while the observed skewness is 0.36 for 1870-2014. Note that all data have been linearly detrended and the seasonal cycle has been removed before computing the standard deviation. Moreover, the Niño 3.4 index has a significant broad spectral peak, both in the model and in observations for 1870-2014, at a typical period of about 4-7 years when compared to corresponding red-noise processes (Figure 6). While the distribution of the variance over the frequencies is well reproduced in the model, the total variance is overestimated in all AWI-CM-MR ensemble members (0.75-1.01 K² compared to the observed 0.57 K²).
To assess the temporal behavior further, we apply a diagnostic that quantifies the seasonal phase locking of Niño 3.4 SST anomalies to the seasonal cycle (Figure 7). Observed SST variability associated with ENSO, as diagnosed from monthly standard deviation, tends to peak in boreal winter, with a minimum in spring. Especially in boreal winter, the five ensemble members capture the corresponding U-shape and its magnitude relatively well; however, there is a positive bias in spring. A bias of similar magnitude had already been identified in a previous configuration of AWI-CM, using a globally relatively low resolution mesh but with tropical ocean grid refinement at 0.25° (Rackow & Juricke, 2020). The bias appears to be rather sensitive to the applied tropical ocean resolution since the secondary peak in spring is much stronger at a coarser resolution of 1°, using the same atmospheric resolution (see Figure 6 in Rackow et al., 2014).
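The phase-locking diagnostic, the standard deviation of the anomalies computed separately for each calendar month, can be sketched as follows. The function name and the synthetic anomaly series are hypothetical; the amplitudes are made up so that variability peaks in December and is lowest in April, mimicking the U-shape described in the text.

```python
import math


def monthly_std(anomalies):
    """Standard deviation of anomalies per calendar month (series starts in January)."""
    result = []
    for month in range(12):
        values = anomalies[month::12]
        mean = sum(values) / len(values)
        result.append(math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)))
    return result


# Made-up amplitude for each calendar month, January to December, in K
amplitude = [1.0, 0.9, 0.7, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.9, 1.0, 1.1]

# 40 synthetic years alternating between warm and cold anomalies
anomalies = [
    amplitude[month] * (1.0 if year % 2 == 0 else -1.0)
    for year in range(40)
    for month in range(12)
]

std_by_month = monthly_std(anomalies)
peak_month = std_by_month.index(max(std_by_month))    # 0 = January
trough_month = std_by_month.index(min(std_by_month))
```

Plotting `std_by_month` against the calendar month gives the curve against which the spring bias mentioned in the text would be judged.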

Ocean
Spatial distributions of temperature and salinity biases at the surface and in the interior of the ocean for historical simulations are shown in Figure 8. Most areas show a small cold bias of 1°C or less in sea surface temperature (SST). There is a pronounced cold bias in the North Atlantic, which is related to the too zonal pathway of the North Atlantic Current; this is a problem that is present in many CMIP climate models (e.g., Wang, Zhang, et al., 2014). If the horizontal resolution is refined further to half of the local Rossby radius, which is computationally prohibitive for the long time periods of CMIP6 simulations, this bias is largely reduced. Warm SST biases of up to 1.5°C can be found over the Kuroshio extension, west of Africa as well as very localized close to the equator west of South America, in the Irminger current, over the Labrador Sea, and in the Southern middle latitudes in the Indian and Atlantic sector. Some of these biases are typical for climate models such as the cold bias over the North Atlantic subpolar gyre or the warm bias west of Africa. However, over the Southern Ocean, no pronounced warm bias is found. This is in stark contrast to MPI-ESM-1.2, the climate model with the same atmospheric component but different ocean model (Müller et al., 2018).
Figure caption (fragment): ... Rayner et al. (2003). The five historical ensemble members with AWI-CM are given in blue. Gray shading denotes the 5-95% confidence interval of an AR(1)-process fitted to OBS, based on a Monte Carlo approach with 10,000 realizations, as detailed by Rackow et al. (2018). The total (integrated) observed Niño 3.4 variance [K²] is 0.57; for AWI-CM-MR, the range is (0.75-1.01).
At the surface, most of the ocean exhibits a fresh bias. In many subtropical and tropical areas, this bias amounts to 0.5 to 1 psu; it tends to be weaker in middle-latitude areas.
Pronounced but localized salt biases of around 2 psu can be seen close to the coasts of the Eurasian Arctic, in and around the Gulf of Mexico, and in the Bay of Bengal. Smaller salinity biases of up to 0.3 psu can be found over the Southern Ocean and the Pacific warm pool. The general feature of a surface fresh bias in many regions is present also in other climate models such as the E3SM (Golaz et al., 2019), although the regional distribution is not necessarily the same. Features such as the Gulf of Mexico and Bay of Bengal salinity biases are in common with E3SM.
Many CMIP5 models that have coarse ocean resolution suffer from a warm bias at around 1,000 m, which is especially strong in the Atlantic Ocean. Increase in the horizontal resolution leads to reduction of this bias, as pointed out by Rackow et al. (2019). Therefore, the performance of AWI-CM in Atlantic temperature is improved compared to other CMIP models. In the AWI-CM simulations discussed in this paper, the magnitude of the warm bias in the South Atlantic is similar to the one over most of the Pacific Ocean (Figure 8b). The cold and fresh bias in the North Atlantic is related to the outflow and spreading of Mediterranean waters from the Strait of Gibraltar. The reasons for this bias and possible ways to reduce it are discussed in Rackow et al. (2019). The positive temperature and salinity biases in the Indian Ocean are most probably related to excessive supply of warm and salty water from the Red Sea. Generally, the biases in temperature and salinity compensate each other in terms of density.
It turns out that below a depth of about 500 m in the ocean, the mean absolute error of the potential temperature is smaller in AWI-CM than in most of the CMIP5 models (Figure 9a), while for salinity, AWI-CM is comparable to CMIP5 models (Figure 9c). Compared to the CMIP5 version of MPI-ESM, which shares a slightly older version (6.0 instead of 6.3) of the same atmosphere component and which is run at T63 corresponding to around 200 km horizontal resolution instead of T127 corresponding to around 100 km horizontal resolution, the potential temperature error is smaller in AWI-CM but the salinity error larger. When focusing on the North Atlantic Ocean, potential temperature (Figure 9b), for which various models show a pronounced warm bias in 1,000 to 2,000 m (Rackow et al., 2019), AWI-CM performs well. However, for salinity, in the North Atlantic (Figure 9d) and also in the Pacific (not shown), the mean absolute error is large compared to most of the CMIP5 models including MPI-ESM. Note that Figure 9 shows results for DJF; for JJA, results are very similar below around 300 m.

Journal of Advances in Modeling Earth Systems, 10.1029/2019MS002009

Sea Ice
The general patterns of observed Arctic and Antarctic sea ice concentration are well represented in AWI-CM over the last 30 years of historical simulations (Figures 10 and 11). Both Arctic and Antarctic sea ice concentration are overestimated in late winter in the marginal ice zones and underestimated in late summer in most areas. This hints at a too pronounced annual cycle of sea ice cover, which can also be seen in the sea-ice extent as shown in Figure 15. Nevertheless, Arctic sea-ice extent and thickness are remarkably well represented especially over the last few years (Figures 15 and 16). While late winter Arctic sea ice concentration biases are very similar to MPI-ESM, late winter Antarctic sea ice concentration in MPI-ESM has a substantial negative bias especially northeast of the Weddell Sea and a slight negative bias in East Antarctic marginal seas (Müller et al., 2018, their Figure 4) rather than a slight positive bias. This difference is consistent with the reduced Southern Ocean warm bias in AWI-CM compared to MPI-ESM.
For AWI-CM, the equilibrium climate sensitivity (ECS) amounts to 3.2°C (Figure 12, half of the 4xCO2 equilibrium temperature difference). This is similar to the average over the CMIP5 models (IPCC, 2014) and slightly larger than for the CMIP6 version of MPI-ESM (3.0°C; Mauritsen et al., 2019; Müller et al., 2018; Tokarska et al., 2020). The transient climate response (TCR) amounts to 2.1°C, which is slightly stronger than the average over the CMIP5 models (1.8°C; IPCC, 2014) and the CMIP6 version of MPI-ESM (1.7°C; Tokarska et al., 2020). Note that by considering changes in the TOA flux and the global-mean near-surface temperature (delta approach), our estimates for the ECS and the TCR are not affected by the imbalances reported in section 3 (apart from possible non-linear effects).
It seems that AWI-CM absorbs energy in the deep ocean more slowly compared to MPI-ESM. However, this hypothesis needs to be confirmed through a thorough analysis in a joint effort with the Max Planck Institute for Meteorology. Ideally, the ECS should not be affected. However, since the Gregory method to compute ECS is only an approximation, small differences can still occur.
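The Gregory method referred to here regresses the change in net downward TOA radiative flux on the change in near-surface temperature between the abrupt-4xCO2 and piControl simulations. The sketch below is a minimal illustration with hypothetical function names; the data points are synthetic, lie exactly on a straight line, and were chosen so that the result reproduces the quoted ECS of 3.2°C. They are not model output.

```python
def least_squares(x, y):
    """Slope and intercept of an ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gregory_estimate(delta_t, delta_n):
    """Forcing, feedback parameter, and ECS from annual (delta T, delta N) pairs."""
    alpha, forcing_4x = least_squares(delta_t, delta_n)
    delta_t_eq_4x = -forcing_4x / alpha  # temperature change where delta N reaches zero
    return {
        "forcing_4x": forcing_4x,        # intercept at delta T = 0, in W/m²
        "alpha": alpha,                  # regression slope, in W/m² per °C
        "delta_t_eq_4x": delta_t_eq_4x,  # equilibrium warming for 4xCO2, in °C
        "ecs": delta_t_eq_4x / 2.0,      # half of the 4xCO2 value, as in the text
    }


# Synthetic annual means (temperature change in °C, flux change in W/m²)
delta_t = [1.0, 2.0, 3.0, 4.0, 5.0]
delta_n = [5.4, 4.4, 3.4, 2.4, 1.4]

result = gregory_estimate(delta_t, delta_n)
```

Because the equilibrium value is obtained by extrapolating a straight line fitted to a transient simulation, the estimate is only an approximation, which is the caveat raised in the text.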

Surface Response
Two Meter Temperature and Precipitation
The evolution of the global and hemispheric mean temperature at 2 m above the surface in the piControl, historical, and scenario simulations is shown in Figure 13. The piControl simulation shows no discernible trend in temperature, as expected. When considering the anthropogenic forcing, the historical simulations show a warming of 1.1 ± 0.1°C in 2005-2014 compared to 1891-1900 while for the observations the warming amounts to 0.9°C over the same period. Both in the observations and in the historical simulations, the Northern (Southern) Hemisphere warming is 0.2°C higher (lower) than the global average. The more pronounced warming over the Northern Hemisphere is partly due to its larger land fraction compared to the Southern Hemisphere.
Until the end of the 21st century, the global mean temperature rises by approximately 4°C from today under the strongest emission scenario SSP585. Over the Northern Hemisphere, this warming is more pronounced and amounts to approximately 5°C; over the Southern Hemisphere, the warming is limited to approximately 3°C. For the weakest emission scenario, SSP126, the global mean warming remains just below 2°C compared to pre-industrial conditions. The SSP126 scenario has been designed to keep global warming below 2°C, a condition that seems to be fulfilled in our simulations. Overall, the temperature increase in the AWI-CM simulations for both the strongest and the weakest emission scenario agrees with the CMIP5 multi-model ensemble mean (IPCC, 2014, their Figure SPM.6a) and appears to be slightly stronger compared to the CMIP6 version of MPI-ESM, which is expected due to the slightly higher transient climate response in AWI-CM compared to MPI-ESM. Figure 14 shows the spatial distribution of simulated temperature and precipitation changes until the end of the 21st century according to the strongest emission scenario SSP585. Temperature changes are very robust and exceed 2 standard deviations of the interannual variability of the control simulation over the whole globe (Figure 14a). Generally, precipitation changes are less robust (Figure 14b), with the Arctic and the Southern Ocean as well as the African tropics being prominent exceptions. Simulated precipitation changes can be regarded as less robust than temperature changes not only because of large internal variability of the precipitation but also because of large biases in present-day climate which amount to more than 7 mm/day in some tropical areas. Bias patterns for both 2 m temperature and precipitation as well as the magnitude of the biases for present-day climate are, not surprisingly, very similar to the ones in MPI-ESM (Müller et al., 2018, their Figure 7f).
Figure caption: For each year, the near-surface (2 m) air temperature change between the abrupt-4xCO2 and piControl simulations is plotted against the change in net downward radiative flux between the two simulations. The more the abrupt-4xCO2 simulation approaches the equilibrium, the smaller the difference in net downward radiative flux compared to the reference simulation becomes. To compute the initial radiative forcing, a regression is built from all data points and extrapolated to a change in near-surface air temperature of 0°C. α is the climate response parameter, indicating the strength of the climate system's net feedback (radiative feedback divided by temperature response). To compute the equilibrium temperature difference, the regression is extrapolated to the equilibrium (difference in net shortwave radiation = 0).
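The regression procedure described in the figure caption (flux change plotted against temperature change, then extrapolated to both axes) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `gregory_regression` and the synthetic numbers are assumptions, and the sign convention (the slope is negative and its magnitude plays the role of α) is one common choice.

```python
import numpy as np

def gregory_regression(delta_t, delta_n):
    """Fit delta_n = forcing + slope * delta_t by ordinary least squares.

    delta_t: yearly 2 m temperature change (abrupt-4xCO2 minus piControl), in K
    delta_n: yearly change in net downward radiative flux, in W/m^2
    Returns (forcing, slope, delta_t_eq):
      forcing    - fitted flux change at delta_t = 0 (initial radiative forcing)
      slope      - regression slope; its magnitude corresponds to alpha here
      delta_t_eq - temperature change at which the fitted flux change is zero
    """
    delta_t = np.asarray(delta_t, dtype=float)
    delta_n = np.asarray(delta_n, dtype=float)
    slope, forcing = np.polyfit(delta_t, delta_n, 1)
    delta_t_eq = -forcing / slope
    return forcing, slope, delta_t_eq

# Synthetic, purely illustrative data (not AWI-CM output).
rng = np.random.default_rng(0)
dt = np.linspace(1.0, 6.0, 150)
dn = 7.0 - 1.1 * dt + rng.normal(0.0, 0.3, dt.size)
forcing, slope, dt_eq = gregory_regression(dt, dn)
```

The intercept on the flux axis gives the forcing estimate and the intercept on the temperature axis gives the equilibrium temperature difference for the abrupt-4xCO2 experiment.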
The well-known feature of Arctic amplification, and to a lesser extent also Antarctic amplification, can clearly be seen from Figure 14a. According to the SSP585 scenario, the temperature increases as much as 11°C over the Northern Barents Sea and around Spitsbergen. In the northernmost parts of the European and American continents, the warming exceeds 7°C at the end of the century compared to the historical reference period. Large continental areas are affected by temperature increases of more than 5°C. Also, over the Weddell Sea and over parts of Antarctica, temperature increases of more than 5°C are simulated. Over the ocean, the warming generally amounts to 2-3°C.
Over large areas of central Africa and over the tropical Pacific, precipitation increases of more than 50% are simulated. Other areas with comparable precipitation increases include the ocean northwest of South Africa as well as northeastern parts of Greenland. Over the whole Arctic, a substantial precipitation increase of more than 40% is simulated; over the Southern Ocean adjacent to the Antarctic continent, extended areas are affected by precipitation increases of 20% to 30%. These precipitation changes are very robust since they exceed twice the interannual standard deviation of the control simulation. Except for parts of the Amazon region, simulated precipitation decreases are less robust and are mainly concentrated in subtropical areas. They do not exceed 50% of present-day precipitation.
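The robustness criterion used here (a change counts as robust where it exceeds twice the interannual standard deviation of the control simulation) can be expressed as a simple mask. The sketch below is an assumption-laden illustration: the function name `robust_change_mask`, the array layout, and the use of the sample standard deviation are choices made for this example, not details taken from the paper.

```python
import numpy as np

def robust_change_mask(change, control_annual, factor=2.0):
    """Return True where |change| exceeds factor * interannual std of the control run.

    change:         2-D array (lat, lon) with the projected change
    control_annual: 3-D array (year, lat, lon) with annual means from the control run
    factor:         multiple of the standard deviation (2 in the text)
    """
    change = np.asarray(change, dtype=float)
    control_annual = np.asarray(control_annual, dtype=float)
    # Interannual variability at every grid point (sample standard deviation).
    sigma = np.std(control_annual, axis=0, ddof=1)
    return np.abs(change) > factor * sigma
```

Grid points where the mask is False would be the ones marked as not robust in a map such as Figure 14.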
Compared to the multi-model CMIP5 ensemble (IPCC, 2014, Summary for Policymakers, their Figure SPM.8), the temperature response in AWI-CM looks very similar, both regarding magnitude (11°C over Northern Barents Sea, more than 5°C over large continental areas as well as Weddell Sea and parts of Antarctica, 2°C to 3°C over large parts of the ocean) and pattern of response. However, the warming hole, that is, a lack of warming over the North Atlantic subpolar gyre, that is present in the CMIP5 ensemble (e.g., Chemke et al., 2020; Menary & Wood, 2018), hardly exists in AWI-CM. Furthermore, the precipitation increase in AWI-CM over the Arctic is less pronounced and the precipitation increase over Africa clearly more pronounced compared to the multi-model CMIP5 ensemble (IPCC, 2014, Summary for Policymakers, their Figure SPM.8). Otherwise, the precipitation response pattern is quite consistent.
It can be concluded that especially the temperature response pattern with strong Arctic and continental as well as weak ocean warming agrees very well with the multi-model ensemble mean of CMIP5 simulations, even in terms of magnitude. Also the feature of wetting polar, subpolar, and tropical regions as well as drying subtropical regions agrees with patterns from the multi-model ensemble of CMIP5 simulations although the magnitude of the response is not as consistent as the magnitude of the temperature response.

Sea-Ice Extent
The simulated changes in sea-ice extent are shown in Figure 15 for the Arctic (a, b) and the Antarctic (c, d) during March and September according to piControl, historical, and tier 1 scenario experiments (i.e., ssp126, ssp245, ssp370, and ssp585), along with observations of the last decades.
The strongest decline in sea-ice extent can be seen in the Arctic during September (Figure 15b). Starting between 2025 and 2030, there are isolated years with virtually sea ice-free Arctic summers (sea-ice extent of 1 × 10⁶ km² or less) independent of climate change mitigation efforts (see also Notz & SIMIP Community, 2020). From around 2050, except for SSP126, there are consecutive summers with a virtually ice-free Arctic Ocean. The observed September sea-ice extent according to AWI's Sea Ice Portal (Grosfeld et al., 2016; derived from the University Bremen AMSR-ASI product; see Spreen et al., 2008) for 1979 to 2019 is shown (in purple) on top of AWI-CM outputs, confirming that AWI-CM sea-ice extent agrees well with observations both in terms of the average and in terms of the rate of sea ice decline. However, the September Arctic sea ice concentration is underestimated in AWI-CM simulations of the last 30 years as shown in section 4.5. This needs to be taken into consideration when interpreting the projections of the future Arctic sea ice cover. According to the multi-model CMIP5 ensembles of September sea-ice extent, Arctic sea ice was melting even faster than predicted, even though observations remained within the first standard deviation of the models due to high internal variability of the participating models (Stroeve & Notz, 2015). In comparison to CMIP5, AWI-CM shows stronger sensitivity to the forcings. Unlike in the multi-model CMIP5 ensembles, ice-free Septembers are expected not only for SSP585 (corresponding to RCP 8.5 in CMIP5) but also for SSP245 (corresponding to RCP 4.5 in CMIP5) and SSP370 (new pathway). IPCC AR5 reported the September sea-ice extent reduction in 2081-2100 with respect to the average of the last 20 years of the historical experiments (1986-2005) to be 43% for RCP 2.6 and 94% for RCP 8.5 (IPCC, 2013, p. 92). According to our simulations, the decline of September Arctic sea-ice extent by the end of this century (2081-2100) with respect to the last 20 years of the historical experiments (1995-2014) amounts to 64% for SSP126 and 99.99% for SSP585. The inter-ensemble variability for both historical and scenario (ssp370) experiments is small. This means that the results are robust against internal variability.
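The two diagnostics quoted in this paragraph, the percentage decline between two averaging periods and the first year at or below the 1 × 10⁶ km² "virtually ice-free" threshold, can be computed as sketched below. The function names and the toy series are hypothetical and only illustrate the arithmetic; they are not model output.

```python
import numpy as np

ICE_FREE_THRESHOLD_KM2 = 1.0e6  # "virtually ice-free" threshold quoted in the text

def percent_decline(years, extent, ref_period, future_period):
    """Percentage decline of the period-mean extent relative to the reference period."""
    years = np.asarray(years)
    extent = np.asarray(extent, dtype=float)
    ref = extent[(years >= ref_period[0]) & (years <= ref_period[1])].mean()
    fut = extent[(years >= future_period[0]) & (years <= future_period[1])].mean()
    return 100.0 * (ref - fut) / ref

def first_ice_free_year(years, extent, threshold=ICE_FREE_THRESHOLD_KM2):
    """First year with extent at or below the threshold, or None if there is none."""
    years = np.asarray(years)
    below = np.asarray(extent, dtype=float) <= threshold
    return int(years[np.argmax(below)]) if below.any() else None

# Toy September extent series in km^2 (illustrative only).
toy_years = np.arange(1995, 2101)
toy_extent = np.where(toy_years <= 2014, 6.0e6, 3.0e6)
toy_decline = percent_decline(toy_years, toy_extent, (1995, 2014), (2081, 2100))
```

Note that an isolated ice-free year found this way is different from consecutive ice-free summers, which would require checking a run of years.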
Likewise, Arctic sea-ice extent during March shows a continuously negative trend for historical and scenario experiments (Figure 15a). This negative trend seems to be independent of the scenario until the mid-21st century, which implies that the impact of mitigation efforts might not be seen before that in terms of Arctic winter sea ice. However, afterwards, sea-ice extent stabilizes at around 14 × 10⁶ km² for SSP126 and SSP245. As detailed in Figure 15a, scenarios incorporating higher radiative forcings (SSP370 and SSP585) predict an accelerating decline of sea-ice extent. According to the high-end scenario SSP585, by 2100, Arctic March sea-ice extent will be half of its value at the beginning of the century.
IPCC AR5 (Climate Change 2014; IPCC, 2014; Synthesis Report p. 48) reported low confidence in near-term projections of Antarctic sea-ice extent. This was due to the mismatch between CMIP5 models (strong simulated decline) and observations (no decline) along with very limited understanding of the origin of this mismatch. According to IPCC AR5, this mismatch is likely due to regional variability within the Antarctic (IPCC, 2013, p. 303). A study of individual CMIP5 models also suggested that although these models cannot replicate the observed Antarctic sea-ice extent trend, the observations still remain within the natural variability of better performing models (Turner et al., 2015). Furthermore, Bintanja et al. (2013) showed that the observed sea-ice expansion could indeed be due to Antarctic ice shelf melting, which is not represented in CMIP5 models.
Similar to CMIP5 models, AWI-CM simulates declining Antarctic sea-ice extent for both September and March over recent decades (Figures 15c and 15d), which is in contrast to observations, and a continued decline until the end of the century. In addition, the simulated difference between late winter and late summer Antarctic sea-ice extent is more pronounced than in observations. Overall, interannual variability for Antarctic sea-ice extent is larger than for the Arctic, which agrees with the findings regarding CMIP5 models by Turner et al. (2015). Similar to the Arctic sea ice, mitigation efforts only start to have a noticeable impact from around 2050.
As for the sea-ice extent, the decline of Arctic sea-ice thickness is also evident from the historical simulation during the freezing season, most pronounced from around the mid-20th century until recent years (Figure 16). Simulated sea ice thickness in the Antarctic shows a weaker decline than that in the Arctic. We compare the simulated ensemble mean thickness in the Arctic with recent satellite thickness data from CS2SMOS (Ricker et al., 2017), which is constructed by merging CryoSat-2 and SMOS thickness data using the optimal interpolation method. Sea ice thickness in the historical simulation falls well into the observed range from 2010 to 2013. Basin-scale observations for sea ice thickness in the Antarctic are rather limited. A more detailed evaluation against observations for Antarctic ice thickness is therefore currently not possible.

10.1029/2019MS002009
Journal of Advances in Modeling Earth Systems

Large-Scale Circulation Response
Similar to other climate models and as stated before, the large-scale circulation exhibits biases of the same order of magnitude as the simulated response to anthropogenic forcing, which affects the reliability of the projections. Nevertheless, a few features are worth mentioning: The mean sea level pressure (MSLP) response to increasing greenhouse gas concentrations (Figure 17) is generally characterized by low anomalies over the polar regions and high anomalies in the southern middle latitudes. Considering the geostrophic balance, this leads to an increase of the westerly flow in the northern and southern middle latitudes, mostly around 60° latitude. Over the Northern Hemisphere, this increase is most pronounced in boreal autumn (SON) and winter (DJF). In the North Atlantic region, the increase of the westerly flow is located further to the north compared to the CMIP5 ensemble mean (as can be seen from Zappa & Shepherd, 2017, their Figure 1), while in the North Pacific region, the location of the increase of the westerly flow is comparable. An intensified Aleutian low in boreal winter leads to a shift of the increased westerly flow over the North Pacific sector toward lower latitudes with a maximum around 45°N. Over the Southern Hemisphere, the increased westerly flow is equally present in all seasons with a shift in the African sector toward lower latitudes in austral winter (JJA) and spring (SON). There is an ongoing discussion on how the waviness of the atmospheric flow in middle latitudes will change in the future as a result of changes in the Arctic, through Arctic amplification, and in the tropics, through upper tropospheric warming.
The contrasting driving from the Arctic versus the tropics has been termed a tug of war in the middle latitudes (e.g., Barnes & Polvani, 2015; Blackport & Kushner, 2017; Chen et al., 2020). Will there be a more zonal flow with a decrease in the intensity of atmospheric waves, implying less extreme warm and cold events; will the meridionality of the flow get stronger, implying more extreme warm and cold events in the middle latitudes; or will there be no change? To answer this question, various objective indices have been defined. Cattiaux et al. (2016) defined the sinuosity index (SI) as the length of an isohypse of a specific value divided by the length of the 50°N latitude circle. If, due to features such as cut-off lows, there are separated isohypses of the specific value, the sum of the lengths of these isohypses is taken. The value of the isohypse is chosen as the area average of z500 over 30°N to 70°N to account for seasonal differences and climate change signals. If the SI equals 1, the flow is zonal since the chosen isohypse is a straight line. The higher the SI, the stronger the meridional component of the atmospheric flow. Figure 19 shows the SIs computed for the piControl, historical, and scenario simulations and for the ERA5 reanalysis. Overall, the differences between the different simulations are smaller than the differences between the model and the reanalysis data. In all simulations, the waviness of the flow is more pronounced in boreal winter and spring compared to summer and autumn. The annual cycle is shifted compared to the ERA5 reanalysis. While the simulations show the maximum of waviness around February, according to the ERA5 reanalysis, it is around May. The minimum of waviness occurs around August in the simulations and around October according to the ERA5 reanalysis. While the magnitude of the maximum waviness is well captured in the model compared to the reanalysis, the minimum is too pronounced in the simulations, indicating a too zonal flow in late summer.
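The definition of the sinuosity index given above (total isohypse length divided by the length of the 50°N latitude circle) can be sketched as follows. This is a simplified illustration under stated assumptions: the contour vertices of the chosen z500 isohypse are assumed to be already extracted (for example with a contouring routine), segment lengths are approximated by great-circle distances between consecutive vertices, and the names `great_circle_km` and `sinuosity_index` are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def sinuosity_index(contours, ref_lat=50.0):
    """Sum of isohypse segment lengths divided by the length of the ref_lat circle.

    contours: list of (lats, lons) pairs in degrees, one per separate isohypse
              (e.g. the main contour plus cut-off lows); a closed contour
              should repeat its first vertex at the end.
    """
    total = 0.0
    for lats, lons in contours:
        lats = np.asarray(lats, dtype=float)
        lons = np.asarray(lons, dtype=float)
        total += great_circle_km(lats[:-1], lons[:-1], lats[1:], lons[1:]).sum()
    ref_length = 2.0 * np.pi * EARTH_RADIUS_KM * np.cos(np.radians(ref_lat))
    return total / ref_length

# A contour that follows 50N exactly gives SI close to 1; a wavy one gives SI > 1.
lons = np.linspace(0.0, 360.0, 3601)
si_zonal = sinuosity_index([(np.full_like(lons, 50.0), lons)])
si_wavy = sinuosity_index([(50.0 + 10.0 * np.sin(5.0 * np.radians(lons)), lons)])
```

Because the index is a ratio of lengths, the assumed Earth radius cancels out and does not affect the result.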
Generally, a pronounced interannual variability can be seen both in the simulations and in ERA5. With increasing greenhouse gas concentrations, there is a tendency toward a more zonal flow in boreal summer and autumn, while in winter and spring, there is no robust change. This is consistent with the proposed tug-of-war (e.g., Barnes & Polvani, 2015; Blackport & Kushner, 2017; Chen et al., 2020): the upper tropospheric warming in the tropics leads to an increased meridional temperature gradient, stronger mean westerly flow, and decreased waviness. In contrast, in boreal winter, the effect of Arctic amplification leads to a reduced meridional temperature gradient, weaker mean westerly flow, and increased waviness, offsetting the impact of upper tropospheric warming in the tropics. However, the impact on the waviness is very much under debate and shows very little robustness. Due to the lack of Arctic amplification in boreal summer, the upper tropospheric warming in the tropics (Figure 18a) may lead to a stronger zonal and less wavy flow. However, even in boreal summer, differences are small compared to the strong interannual variability. Averaged over the year, the zonal mean zonal wind mainly increases in the stratosphere and only to some extent in the upper troposphere in the northern middle latitudes, while in the southern middle latitudes, zonal mean zonal wind increases are present throughout the troposphere, possibly due to the relative lack of Antarctic amplification (Figure 18a).

Ocean Response
The Atlantic meridional overturning circulation (AMOC) is an important element of the global ocean circulation. Transporting heat from the tropics to the northern North Atlantic, it has profound implications not only for the climate of north-western Europe but for the whole Northern Hemisphere. It is also associated with ocean heat transport from the South Atlantic to the tropics (Weijer et al., 2019). Figures 20 and 21 show the maximum AMOC strength at 26°N for piControl, historical, scenario simulations, and RAPID observations (Smeed et al., 2019) as well as for piControl, 1pctCO2, and abrupt-4xCO2 simulations, respectively. For the 15-year record of the RAPID observations, our model agrees well with the observations both in terms of the mean value and in terms of the range of interannual variability. The historical simulation is indistinguishable from the control simulation; that is, it agrees within a standard deviation with the control simulation, even though other parameters such as the Arctic sea ice and near-surface temperature show substantial changes toward the end of the historical period. Furthermore, the development of the AMOC strength according to the weakest scenario SSP126 is indistinguishable from the control simulation until the end of this century. For the three other emission scenarios, the signal starts to emerge from the noise later than 2050; that is, values are continuously lower than the piControl value minus one standard deviation.
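The emergence criterion stated here (values continuously lower than the piControl value minus one standard deviation) can be sketched as a small function. The name `emergence_year`, the use of the control mean as "the piControl value", and the toy numbers are assumptions for illustration only.

```python
import numpy as np

def emergence_year(years, series, control):
    """First year from which `series` stays below mean(control) - std(control).

    Returns None if the series is not below the threshold in its final year.
    """
    years = np.asarray(years)
    series = np.asarray(series, dtype=float)
    control = np.asarray(control, dtype=float)
    threshold = control.mean() - control.std(ddof=1)
    below = series < threshold
    if not below[-1]:
        return None
    # Index right after the last year that is NOT below the threshold.
    not_below = np.flatnonzero(~below)
    start = 0 if not_below.size == 0 else not_below[-1] + 1
    return int(years[start])

# Toy AMOC strengths in Sv (illustrative only, not model output).
toy_control = [17.0, 18.0, 19.0]             # mean 18, std 1 -> threshold 17
toy_years = [2015, 2016, 2017, 2018, 2019, 2020]
toy_amoc = [18.0, 16.0, 18.0, 16.5, 16.0, 15.0]
toy_emergence = emergence_year(toy_years, toy_amoc, toy_control)
```

An isolated dip below the threshold (2016 in the toy series) does not count, because the values must remain below it from the emergence year onward.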
In the case of a transient increase of the greenhouse gas forcing (historical, scenario, and 1pctCO2 simulations), the AMOC strength at 26°N gradually decreases by around 20% until the end of the 21st century with the high emission scenario SSP585 and by around 25% within 150 years in the idealized 1pctCO2 simulation. In the abrupt-4xCO2 simulation, the maximum AMOC strength decreases markedly by around 30% over the [...]
Table 4 shows the ocean volume transports through some key ocean straits, averaged over all five ensemble members and the time period 1985-2014 for the historical runs and over the years 2071-2100 for the SSP370 runs. The historical runs show volume transports that are comparable to observed estimates for most of the ocean straits. For some straits, however, the volume transports are underestimated, including the export from the Arctic Ocean to the North Atlantic measured at the Davis Strait, the Indonesian Throughflow, and the transport in the Mozambique Channel. The main reason for this underestimation is that the model resolution is not fine enough to resolve those narrow straits. In particular, the three main straits in the Canadian Arctic Archipelago (CAA) are only 10, 30, and 50 km wide at their narrowest locations, respectively, which cannot be well resolved with the mesh we used in the CMIP6 simulations. Improved representation of the CAA, and thus of the ocean transport through the Davis Strait, is expected for future coupled model configurations with higher ocean resolution, following promising results with high-resolution stand-alone configurations using FESOM (Wekerle et al., 2013).
For some ocean straits, the ocean volume transport shows a large response to climate change in the SSP370 scenario. The Florida Current, for example, decreases by about 15% at the end of the 21st century in the SSP370 scenario, which is consistent with the weakening trend of the AMOC described above. The ocean volume transport in the Indonesian Throughflow and the Mozambique Channel also decreases in a warming climate (by about 20%). This implies that the exchange between the Pacific, Indian, and Atlantic Oceans will become weaker. The oceanic linkage between the North Atlantic and the Arctic Ocean, however, is strengthened significantly in a warmer world, as shown by the increase in the volume transport through the Barents Sea Opening (increase by about 40%). Together with the temperature increase in the Atlantic Water, this implies that oceanic heat supply from the North Atlantic to the Arctic Ocean, and hence Atlantification of the Arctic Ocean, will increase in the future. As a consequence of ocean volume conservation, the excess ocean volume inflow through the Barents Sea Opening is balanced by an increased outflow from the Arctic through the Fram Strait.
In a warming climate, the strength of the North Atlantic subpolar gyre (SPG) decreases, as shown by the increase in the sea surface height (SSH) in the SPG region (Figure 22). The weakened SPG brings less Atlantic Water into the gyre circulation from the northeastern North Atlantic, which allows more Atlantic Water to continue to the north into the Nordic Seas. The enhanced northward flow is manifested by the increase in the SSH along the European coast. This can explain the stronger ocean volume transport through the Barents Sea Opening at the end of the 21st century in the SSP370 scenario (Table 4). The SSH on the northwestern side of the Gulf Stream increases in the warming scenario, which indicates a weakening of the Atlantic Current and is consistent with the weakening of the AMOC and the warming off the East Coast of the United States (Figure 14a).

Note to Table 4. Positive values mean northward or eastward flows.

Changes in the Energy Budget
The global-mean net total TOA radiative imbalance remains, on decadal timescales, close to zero in the historical simulation until around 1970, after which it increases to ~0.7 W/m² for present-day conditions (Figure 23a, black solid curve), reflecting the uptake of heat by the climate system. This is less than the observational estimate of 0.9 W/m² for the period 2005-2014 by Trenberth et al. (2016), but within the uncertainty bounds (±0.3 W/m²). It matches the observational estimate by Johnson et al. (2016), who report 0.71 ± 0.1 W/m². Compared to CMIP5 and to other CMIP6 models, our simulated 0.7 W/m² is below the average (see Wild, 2020, their Figure 6). After the historical period, the net total TOA radiative imbalance decreases gradually in our SSP126 scenario simulation, stabilizes at ~0.9 W/m² in the SSP245 scenario simulation, and continues to increase to up to 2.0 W/m² in the SSP370 and SSP585 scenario simulations toward the end of the 21st century (Figure 23a, colored solid curves). In contrast to the net total TOA radiation, its shortwave component exhibits a negative imbalance varying between 0.0 and −1.0 W/m² over the course of the historical simulation (Figure 23a, black dashed curve), which implies an increased planetary albedo (Figure 23b, black solid curve). The increased planetary albedo, particularly pronounced during the second half of the 20th century (+0.2%; the absolute simulated planetary albedo is 28.9%), is not due to changes in surface albedo (Figure 23b, black dashed curve) but is likely for the largest part due to anthropogenic aerosols that have compensated for a similarly strong positive longwave-radiative imbalance due to increased greenhouse-gas concentrations.
While according to our simulations an increased planetary albedo has prevented a stronger warming of the climate system until present day, the planetary albedo is projected to decrease and thus amplify the future warming in all scenarios (Figure 23a, colored solid curves, and Figure 23b, colored dashed curves). The global-mean effective surface albedo, which seems to have played no major role until around 1980, is projected to decline by more than 1% (the absolute simulated effective surface albedo is ~13%) until the end of the century in the SSP585 scenario simulation and is thus a significant part of the projected positive shortwave feedback. Interestingly, the global-mean net shortwave radiation is projected to increase faster than the total radiation (Figure 23a). This implies that, while reduced outgoing longwave radiation (OLR) has caused the warming until present day, the OLR is projected to increase toward the end of the century: Due to the strong shortwave feedback, the positive influence of increasing temperatures on OLR is projected to outweigh the direct negative influence of increased greenhouse-gas concentrations on OLR. This behavior has been found for most CMIP3 and CMIP5 models (Donohoe et al., 2014).
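The inference about OLR follows from the TOA budget: the net total downward flux equals the net (absorbed) shortwave flux minus the OLR, so a shortwave increase that exceeds the total increase implies rising OLR. The short sketch below spells out this arithmetic; the function names and all numerical values are hypothetical and chosen only for illustration.

```python
def planetary_albedo(sw_incoming, sw_reflected):
    """Planetary albedo as the reflected fraction of incoming TOA shortwave radiation."""
    return sw_reflected / sw_incoming

def olr_change(d_net_total, d_net_sw):
    """Change in OLR implied by changes in net total and net shortwave TOA flux.

    Uses net_total = net_sw - olr, hence d_olr = d_net_sw - d_net_total.
    All quantities are global means in W/m^2, positive downward for the net fluxes.
    """
    return d_net_sw - d_net_total

# Hypothetical numbers: net shortwave rises by 1.5 W/m^2, net total by only 0.5 W/m^2,
# so OLR must have increased by 1.0 W/m^2.
example_d_olr = olr_change(d_net_total=0.5, d_net_sw=1.5)
example_albedo = planetary_albedo(sw_incoming=340.0, sw_reflected=98.26)
```

A positive result from `olr_change` corresponds to the situation described for the end of the century, and a negative one to the greenhouse-gas-driven reduction of OLR.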
The surface albedo is decreasing particularly strongly in the regions with declining sea-ice extent, that is, the Southern Ocean (60°S to 70°S) and the Arctic (north of 70°N) (Figure 23d, dashed curves). These changes are clearly reflected in the planetary albedo (Figure 23d, solid curves), which however also reveals non-surface-related albedo changes in lower latitudes caused by cloud feedbacks. In particular, toward the end of the century, the planetary albedo is projected to increase in the tropics between 15°S and 15°N (negative feedback) and to decrease in the subtropics (positive feedback) (Figure 23d, solid curves).
While the surface-driven changes in the planetary albedo projected toward the end of the century are substantial in both polar regions, the positive net total TOA radiative imbalance is particularly pronounced over the Southern Ocean (Figure 23c, solid curves). This is consistent with the relatively weak Antarctic and strong Arctic warming (Figure 14a), leading to strongly enhanced upwelling longwave radiation in the Arctic but not in the Antarctic. This asymmetry in terms of polar amplification and TOA fluxes is consistent with the fact that the Southern Ocean temperature responds much more slowly to changes in atmospheric thermal forcing because of the spatial structure of the global meridional overturning circulation (MOC), with circumpolar upwelling of unperturbed water masses in the south and downwelling in the north (Armour et al., 2016;Rackow et al., 2018).

Changes in ENSO
Since ENSO is the dominant mode of interannual variability (Timmermann et al., 2018), with pronounced global impacts through far-reaching teleconnections such as the atmospheric bridge (Alexander et al., 2002), an important question is whether the character of ENSO will change under climate change. To address this question, we resort to the five SSP370 projections with AWI-CM until the end of the 21st century. According to these simulations, when compared to the probability distribution of Niño 3.4 SST anomalies for 1870-2014, strong warm and cold SST anomalies become more likely by the end of this century (Figure 24). The increase of strong cold SST anomalies dominates, so that the clear positive asymmetry between El Niño and La Niña (Figure 5), as diagnosed from the skewness of modeled Niño 3.4 SST anomalies (1870-2014: 0.15 ± 0.16, observed: 0.36), is reduced under climate change (2071-2100: 0.04 ± 0.17). A reduced positive asymmetry under global warming has been found for most CMIP5 models (for the overlapping Niño 3 region; Ham, 2017). However, despite sharing the atmospheric model with AWI-CM, in that study, MPI-ESM-LR and MPI-ESM-MR showed a strong increase of DJF Niño 3 skewness with the RCP 4.5 scenario, which might again hint at the different ocean model formulation in AWI-CM and MPI-ESM and should be evaluated in more detail in the future.
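The skewness diagnostic used for ENSO asymmetry can be sketched as follows. The preprocessing (linear detrending, then removal of the mean seasonal cycle) follows the description of Figure 24; the function names, the input layout (a monthly series starting in January with a whole number of years), and the choice of the population skewness estimator are assumptions of this example.

```python
import numpy as np

def nino34_anomalies(sst_monthly):
    """Linearly detrend a monthly SST series and remove its mean seasonal cycle.

    sst_monthly: 1-D array of monthly means, starting in January, with a length
                 that is a multiple of 12.
    """
    sst = np.asarray(sst_monthly, dtype=float)
    t = np.arange(sst.size)
    detrended = sst - np.polyval(np.polyfit(t, sst, 1), t)
    by_month = detrended.reshape(-1, 12)          # rows: years, columns: calendar months
    return (by_month - by_month.mean(axis=0)).ravel()

def skewness(x):
    """Third standardized moment; positive values mean a longer warm tail."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5
```

Applying `skewness(nino34_anomalies(series))` separately to an early and a late period would give the kind of comparison reported in the text; positive skewness indicates that warm anomalies are stronger than cold ones.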
Interestingly, the seasonal cycle that has been subtracted to compute the SST anomalies within the Niño 3.4 box consistently changes under climate change in all ensemble members (Figure 25): when subtracting the different annual means, a stronger positive (negative) peak in April/May (August-October) is evident, suggesting that seasonality within this region will likely increase until the end of the century. Concerning phase locking of Niño 3.4 SST anomalies to the seasonal cycle, the variability increases throughout the entire year (Figure 26), shifting the characteristic U-shape upwards by the end of the century.

Discussion
The climate model AWI-CM-MR-1-1 presented here has proven to perform well compared to CMIP5 as well as selected CMIP6 models and therefore can be regarded as a solid contribution to the CMIP6 ensemble. Model drift in the control simulation is negligible. While some long standing model biases in AWI-CM such as a too zonal North Atlantic current, a too strong atmospheric westerly flow in the Euro-Atlantic region, a too cold subpolar North Atlantic gyre, and a warm bias west of Africa are still present, there is a good representation of the North Atlantic Ocean temperature profile; that is, the warm bias in mid-depths is largely alleviated as discussed in Rackow et al. (2018). Furthermore, there is no pronounced Southern Ocean surface warm bias and therefore a very good representation of Antarctic sea ice and circulation compared to CMIP5 models and also compared to the Max Planck Institute for Meteorology CMIP6 model MPI-ESM (Müller [...]). Since the atmosphere model is the same as in MPI-ESM, it could be hypothesized that the high resolution of the ocean model in the Southern Ocean helps to reduce the long standing biases in this area that is very important for the global ocean circulation as well as heat and carbon uptake (e.g., Frölicher et al., 2015). This is subject to further investigation in the future and in collaboration with the Max Planck Institute for Meteorology. Contrary to observations, Antarctic sea ice decline has been simulated for the past decades, similar to CMIP5 and CMIP6 model results (Roach et al., 2020).
Figure 24. Change of the probability distribution function of sea surface temperature anomalies in the Niño 3.4 region for 2071-2100 (red line). Compared to 1870-2014 (black and blue lines), extreme anomalies become more likely while the probability of low to medium anomalies decreases. The range of the model results is shaded. ENSO asymmetry (positive skewness) decreases compared to 1870-2014 (see Figure 5). All data have been linearly detrended, and the seasonal cycle has been removed.
In terms of the response to increasing greenhouse gases, our model shows very similar outcomes compared to the multi-model ensemble of CMIP5 simulations, both in patterns and in magnitude. Features such as a strong Arctic amplification along with a weak Antarctic amplification, Arctic wetting (Bintanja & Selten, 2014), subtropical drying, increased frequency of extreme La Niña & El Niño events (Cai et al., 2015), reduced ENSO asymmetry (Ham, 2017), and weakening Atlantic meridional overturning circulation are very similar to the previous simulations.
However, there are potentially important differences that need further investigation: The weakening of the Atlantic meridional overturning circulation is less pronounced compared to the CMIP5 ensemble mean. Since weakening of the Atlantic meridional overturning circulation has been linked to the emergence of a warming hole over the North Atlantic subpolar gyre (e.g., Keil et al., 2020), it is consistent that the warming hole is only weak according to our model results.
The equilibrium climate sensitivity of our model (3.2°C) is slightly lower than the CMIP5 and CMIP6 multi-model means (3.4°C and 3.7°C, respectively, according to Meehl et al., 2020) and slightly higher than that of the CMIP6 version of the Max Planck Institute for Meteorology model MPI-ESM, which shares the same atmosphere component (tuned to be 3.0°C). At 2.1°C, our transient climate response is slightly higher than the CMIP5 multi-model mean (1.9°C according to Meehl et al., 2020), slightly lower than the CMIP6 multi-model mean (2.2°C according to Meehl et al., 2020), and around 23% higher than in MPI-ESM (1.7°C). This might imply that the deep ocean takes up less energy in our model compared to MPI-ESM.
Furthermore, in our model, the decline of Arctic sea-ice extent by the end of the 21st century is stronger than the multi-model mean over CMIP5 simulations, suggesting a higher likelihood of an ice-free Arctic in September even before 2050. According to our simulations, mitigation efforts only start to have an impact after 2050 in terms of Arctic winter sea ice. Surface albedo changes, in particular in the polar regions where sea ice declines, are projected to contribute substantially to a strong positive shortwave feedback. Note however that a recent geoengineering study based on AWI-CM indicates a small impact of the Arctic ice-albedo feedback on temperatures outside the Arctic (Zampieri & Goessling, 2019).
While the Arctic sea-ice extent trend is still slightly smaller in our model simulation compared to observations over the last few decades, the global mean temperature increase is slightly larger compared to observations. This could either hint at an underestimation of Arctic amplification in our simulations or that multi-decadal internal variability is superimposed on the observed Arctic climate change. The latter hypothesis is supported by Ding et al. (2017), England et al. (2019), and Kay et al. (2011) stating that around half of the strong negative Arctic sea ice trend over the past decades is explained by internal variability and the other half by the climate change signal, although there are strong seasonal and regional differences (England et al., 2019).
The AMOC decreases by around 25% until the end of the 21st century according to the AWI-CM SSP585 scenario simulation, which is less than the multi-model average value of around 40% calculated from CMIP5 models and Earth System Models of Intermediate Complexity (EMICs; Cheng et al., 2013; Weaver et al., 2012). CMIP6 models tend to show even stronger AMOC declines than CMIP5 models (Lyu et al., 2020). Previous studies suggest that the representation of western boundary currents and the Agulhas leakage in higher-resolution ocean models can influence the AMOC strength (e.g., Biastoch et al., 2009, 2018; Hirschi et al., 2020; Sein et al., 2018; Weijer et al., 2019). More dedicated studies in this regard will be carried out in our further work.

Conclusions
The Alfred Wegener Institute Climate Model (AWI-CM), described in this study, contributes to the diversity of climate models with the unstructured mesh approach for its sea ice-ocean component. Biases in AWI-CM tend to be less pronounced than in models contributing to the previous Coupled Model Intercomparison Project phase 5 (CMIP5), as shown by objective performance indices. Some long-standing biases, such as a too zonal pathway of the North Atlantic Current, the cold bias over the North Atlantic subpolar gyre, and the warm bias west of Africa, are still present in AWI-CM. However, several fields are well represented, in particular Southern Ocean sea surface temperature and the atmospheric temperature above it, sea ice concentration around Antarctica, and North Atlantic Ocean temperature profiles. Furthermore, there is an excellent agreement of the Arctic sea ice thickness in the past years for which observations are available. Therefore, AWI-CM results are a solid contribution to the CMIP6 project. Sea ice-ocean models on unstructured meshes have matured (now contributing to CMIP6) and can be used at high resolutions enabled through excellent scalability characteristics. Our results support the notion that some of the climate change features are robust against model formulation. However, there are some important features that deviate from other CMIP simulations:
1. Despite the smaller Arctic sea ice decline trend compared to observations, isolated years with virtually sea ice-free Arctic summers (1 × 10⁶ km² sea-ice extent or less) occur as early as 2025 to 2030, independent of climate change mitigation efforts. Only after 2050 do mitigation efforts start to play a substantial role, and Arctic sea ice can recover to some extent in the SSP126 scenario with strong mitigation efforts.
2. The AMOC decreases by around 25% until the end of the 21st century according to the AWI-CM SSP585 scenario simulation, which is less than the multi-model average value of 40% calculated from CMIP5 models and Earth System Models of Intermediate Complexity (EMICs; Weaver et al., 2012).
The AWI-CM model data are available through the Earth System Grid Federation (ESGF) and include not only the DECK and ScenarioMIP experiments (O'Neill et al., 2016) with AWI-CM-1-1-MR
The atmospheric model data were cmorized with an approach developed by the German Climate Computing Centre (DKRZ). All the FESOM ocean data were also cmorized in a CDO-compatible manner. During the cmorization of sea-ice and ocean data, the user is notified of any incompatibilities between the model output and the designated data request tables. In the conversion process, the most time-consuming step is changing the bulk variable data itself, whereas metadata can generally be applied quickly and independently of the amount of data in a file. The time and resources required for changes to the variable data, on the other hand, in theory increase linearly with the size of the mesh and the output frequency of the data. Therefore, we had to avoid all steps and utilities that cannot alter data in place, that is, without requiring an auxiliary file or additional memory allocation.
The overall requirements for a complete CMIP6 formatted file set can be somewhat overwhelming due to their complexity (data request [DR], naming conventions, controlled vocabularies, Earth System Grid Federation [ESGF] requirements, model details). Therefore, our CMIP6 CMOR setup has been realized as a compact and human-readable configuration written in a domain-specific language (DSL). This setup can be executed in shared-memory parallel mode, per conversion task, on an HPC system right where the data are stored.
The following steps are necessary. Because they change the bulk data, their processing times scale with the grid size and are limited by the I/O speed of the file system:
1. data output frequency change (e.g., due to requirements in the DR);
2. unit change (e.g., from deg C to K); technically, this is not a problem, but it requires lengthy input/output operations for all affected data;
3. data concatenation (merge);
4. data compression (e.g., where it differs from what the ESGF requires);
5. file format change (e.g., as required by ESGF).
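Two of the bulk-data steps listed above, the unit change and the output frequency change, can be illustrated with a minimal NumPy sketch. This is not the DSL-based CMOR setup described in this appendix; the function names and the toy array are hypothetical and serve only to show why an in-place unit change avoids a second copy of the data, whereas a frequency change produces a new, smaller array.

```python
import numpy as np


def celsius_to_kelvin_inplace(data: np.ndarray) -> np.ndarray:
    """Convert deg C to K without allocating a second array (step 2).

    Assumes a floating-point array; an integer array would raise a
    casting error because the result cannot be stored in place.
    """
    # out=data writes the result into the existing buffer.
    np.add(data, 273.15, out=data)
    return data


def block_mean(data: np.ndarray, steps_per_block: int) -> np.ndarray:
    """Reduce the output frequency by averaging consecutive time steps (step 1).

    Assumes time is the first axis and its length is a multiple of
    steps_per_block; real calendars with unequal months need more care.
    """
    n_time = data.shape[0]
    if n_time % steps_per_block != 0:
        raise ValueError("time axis length must be a multiple of steps_per_block")
    n_blocks = n_time // steps_per_block
    reshaped = data.reshape(n_blocks, steps_per_block, *data.shape[1:])
    return reshaped.mean(axis=1)


# Toy field: 4 time steps on 3 mesh nodes, values in deg C (illustrative only).
temperature = np.array(
    [
        [0.0, 10.0, 20.0],
        [2.0, 12.0, 22.0],
        [4.0, 14.0, 24.0],
        [6.0, 16.0, 26.0],
    ]
)

converted = celsius_to_kelvin_inplace(temperature)
same_object = converted is temperature  # True: no new array was created
two_step_means = block_mean(temperature, steps_per_block=2)
```

In this sketch the memory footprint of the unit change is independent of the mesh size, but every value still has to be read and written once, which is why the processing time of such steps is bound by I/O.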
A necessary requirement for using the model data for scientific research is the availability of tools that are able to perform basic pre- and post-processing operations on model output. For ocean models on structured grids, a wide selection of tools is available; for the new generation of global ocean models formulated on unstructured meshes, however, no off-the-shelf packages exist. Therefore, we have created pyfesom (https://github.com/FESOM/pyfesom), a set of Python-based analysis and visualization tools for FESOM. The pyfesom repository contains a Python library and command line tools that allow users to perform basic data analysis and visualization of FESOM data and that provide ways to interpolate FESOM data onto a regular grid. There is also documentation describing the command line tools, together with a set of Jupyter Notebooks with examples of library usage. Moreover, we have developed the R package spheRlab (https://github.com/FESOM/spheRlab), which facilitates the analysis and visualization of unstructured-mesh data, including functions for the generation of grid description files that enable full compatibility of FESOM data with the more widely used Climate Data Operators (CDO; https://code.mpimet.mpg.de/projects/cdo).
The unstructured mesh files can also be processed further with simple CDO commands. To give an example, a typical file is the unstructured potential ocean temperature in thetao_Omon_AWI-CM-1-1-MR_historical_r1i1p1f1_gn_185001-185012.nc. A horizontal conservative remapping of this file to a 1° × 1° regular grid ("r360x180") from the command line is a one-liner:
    cdo remapycon,r360x180 thetao_Omon_input.nc thetao_Omon_remapped2D.nc
where we shortened the input file name to thetao_Omon_input.nc for brevity. Replacing "remapycon" with "remapbil" in this command results in a bilinear interpolation instead.
If many years need to be remapped, it is advisable to generate the interpolation weights only once:
    cdo genycon,r360x180 thetao_Omon_input.nc weights_unstr_2_r360x180.nc
and to re-use them for all subsequent remapping commands:
    cdo remap,r360x180,weights_unstr_2_r360x180.nc thetao_Omon_input.nc thetao_Omon_remapped2D.nc
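For a long list of files, the weight re-use pattern can be scripted. The following Python sketch only assembles the CDO command lines given above and does not run them; the helper name `build_remap_commands`, the example input file names, and the output naming scheme are hypothetical. It also assumes that all input files share the same unstructured mesh, since the weights are generated from the first file only.

```python
import os
import shlex
from typing import List


def build_remap_commands(
    input_files: List[str],
    target_grid: str = "r360x180",
    weights_file: str = "weights_unstr_2_r360x180.nc",
) -> List[List[str]]:
    """Return one genycon command followed by one remap command per file."""
    if not input_files:
        raise ValueError("need at least one input file")

    # The weights are generated once, from the first file only.
    commands = [["cdo", f"genycon,{target_grid}", input_files[0], weights_file]]

    for path in input_files:
        # Hypothetical naming scheme for the remapped output files.
        stem, ext = os.path.splitext(path)
        out_path = f"{stem}_remapped2D{ext or '.nc'}"
        commands.append(
            ["cdo", f"remap,{target_grid},{weights_file}", path, out_path]
        )
    return commands


# Hypothetical yearly input files on the same unstructured mesh.
commands = build_remap_commands(["thetao_1850.nc", "thetao_1851.nc"])
for command in commands:
    print(shlex.join(command))
    # To execute (requires an installed CDO), one could use
    # subprocess.run(command, check=True) here instead of printing.
```

Keeping each command as a list of arguments rather than a single string avoids shell quoting problems if the list is later passed to `subprocess.run`.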

A2. Reference Observation Data and Computation of Objective Performance Indices
As reference data for the computation of the objective performance indices, various observation and reanalysis data are selected: For the following atmospheric variables, the ERA-40 reanalysis data are used: 2 m temperature (t2m), 10 m u wind component (u10m), 10 m v wind component (v10m), 500 hPa geopotential height (z500), and 300 hPa u component (u300). This is augmented by the following data: CERES for top of atmosphere outgoing longwave radiation (TOA; Loeb et al., 2012), GPCP for precipitation (pr; Huffman et al., 2009), MODIS for total cloud cover (tcc; Platnick et al., 2003), and OSISAF for sea ice concentration (sic; Tonboe et al., 2016). For the ocean, Polar Science Center Hydrographic Climatology (PHC, updated from Steele et al., 2001) is used as a reference for both potential temperature and salinity.
The absolute error is computed for each grid cell and averaged over different regions. For the atmosphere, the different regions are Arctic (60-90°N), northern middle latitudes (30-60°N), tropics (30°S to 30°N), southern middle latitudes (30-60°S), Antarctic (60-90°S), and global. For the ocean, the domain is split into the major ocean basins: Arctic Ocean, North Atlantic Ocean, North Pacific Ocean, Indian Ocean, South Atlantic Ocean, South Pacific Ocean, and Southern Ocean. As for the atmosphere, the global ocean is considered in addition. The mean absolute error is computed for each season: for the atmosphere for the four seasons DJF, MAM, JJA, and SON, and for the ocean for the two seasons DJF and JJA. For the ocean, model data are vertically interpolated to the z-levels of the PHC. Errors are computed for each z-level of the climatology and averaged over the levels. Then the error is normalized by the mean absolute error averaged over a set of CMIP5 models. In this way, the agreement of our new CMIP6 model with observation data can be compared objectively against that of the CMIP5 models. A performance index of 1 indicates that the model performs as well as the average of the CMIP5 models; a performance index smaller than 1 (larger than 1) indicates a better (worse) performance.
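The computation described above can be sketched in a few lines of Python. This is a simplified illustration, not the code used for this study: the function names are hypothetical, all numbers are made-up toy values, and the regional mean is unweighted because the weighting of grid cells is not specified in this appendix (weighting by cell area would be a natural refinement).

```python
import numpy as np


def mean_absolute_error(model, reference, mask=None) -> float:
    """Mean absolute error over the grid cells selected by mask."""
    abs_err = np.abs(
        np.asarray(model, dtype=float) - np.asarray(reference, dtype=float)
    )
    if mask is not None:
        abs_err = abs_err[np.asarray(mask, dtype=bool)]
    return float(abs_err.mean())


def performance_index(model_mae: float, cmip5_maes) -> float:
    """Normalize the model error by the average error of the CMIP5 models.

    A value below 1 means a smaller error than the CMIP5 average.
    """
    return model_mae / float(np.mean(cmip5_maes))


# Toy example: four grid cells with their latitudes (illustrative values).
latitude = np.array([75.0, 65.0, 45.0, 10.0])
reference = np.array([0.0, 0.0, 0.0, 0.0])
model = np.array([1.0, -3.0, 2.0, 0.0])

arctic = latitude >= 60.0  # Arctic region, 60-90°N
arctic_mae = mean_absolute_error(model, reference, mask=arctic)
global_mae = mean_absolute_error(model, reference)

# Made-up regional errors of two CMIP5 models for the same variable/season.
cmip5_arctic_maes = [3.0, 5.0]
arctic_index = performance_index(arctic_mae, cmip5_arctic_maes)
```

In this toy case the Arctic mean absolute error is 2.0 and the CMIP5 average is 4.0, giving a performance index of 0.5, that is, a better agreement with the reference than the CMIP5 average. The same two functions would be applied per variable, region, and season, and for the ocean additionally per z-level before averaging over levels.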