Volume 51, Issue 10 e2024GL109265
Research Letter
Open Access

CMIP6 Models Rarely Simulate Antarctic Winter Sea-Ice Anomalies as Large as Observed in 2023

Rachel Diamond

Corresponding Author

British Antarctic Survey, Cambridge, UK

Department of Earth Sciences, University of Cambridge, Cambridge, UK

Correspondence to:

R. Diamond,

[email protected]

Contribution: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization

Louise C. Sime

British Antarctic Survey, Cambridge, UK

Contribution: Conceptualization, Methodology, Validation, Writing - review & editing, Supervision

Caroline R. Holmes

British Antarctic Survey, Cambridge, UK

Contribution: Methodology, Validation, Writing - review & editing

David Schroeder

Department of Meteorology, Centre for Polar Observation and Modelling, University of Reading, Reading, UK

Contribution: Conceptualization, Writing - review & editing, Supervision

First published: 20 May 2024

Abstract

In 2023, Antarctic sea-ice extent (SIE) reached record lows, with winter SIE falling to 2.5 Mkm2 below the satellite era average. With this multi-model study, we investigate the occurrence of anomalies of this magnitude in latest-generation global climate models. When these anomalies occur, SIE takes decades to recover: this indicates that SIE may transition to a new, lower, state over the next few decades. Under internal variability alone, models are extremely unlikely to simulate these anomalies, with return period >1000 years for most models. The only models with return period <1000 years for these anomalies have likely unrealistically large interannual variability. Based on extreme value theory, the return period is reduced from 2650 years under internal variability to 580 years under a strong climate change forcing scenario.

Key Points

  • The latest generation of global climate models rarely simulate an Antarctic sea-ice extent anomaly as negative as observed in winter 2023

  • The return period for such an anomaly is 2650 years under internal variability, decreasing to 580 years under strong climate change forcing

  • After the anomaly occurs, sea-ice extent recovers within a decade to a new, lower state

Plain Language Summary

In 2023, the area of winter Antarctic sea ice fell to the lowest measured since satellite records began in late 1978. It is still under debate how far this low can be explained by natural variations, and how much can be explained by climate change. Global climate models are tools used to study past and predict future global change. We show that, without climate change, the latest generation of these models are extremely unlikely to simulate a sea-ice reduction from the mean as large as observed in winter 2023. Including strong climate change quadruples the chance of such a reduction, but the chance is still very low. When these rare reductions are simulated, sea ice takes around 10 years to recover to a new, lower, area: this indicates that Antarctic sea ice may transition to a new, lower, state over the next few decades.

1 Introduction

In 2023, winter Antarctic sea-ice extent (SIE) reached exceptional record lows. The difference from the 1981–2010 average reached ∼2.5 Mkm2 in July 2023 (Fetterer et al., 2017) before recovering slightly over subsequent months (Gilbert & Holmes, 2024; Ionita, 2024). In the Arctic, SIE has been steadily decreasing in all seasons since satellite records began in the 1970s, explained by warming ocean and air temperatures due to anthropogenic climate change and associated positive feedbacks (Diamond, Schroeder, et al., 2024; Serreze & Meier, 2019; Stroeve & Notz, 2018). By contrast, over the satellite era, Antarctic SIE showed a slight positive trend to a record high in 2014. However, after low winter SIE in 2017, SIE has remained below average in most months, followed by 2023's exceptional record winter low (Eayrs et al., 2021; Gilbert & Holmes, 2024; Purich & Doddridge, 2023). The reasons for the positive trend and subsequent lows are still debated; each of the recent sea-ice lows has been attributed to a combination of oceanic and atmospheric factors including (but not limited to) interannual variability and changes of atmospheric modes, and Southern Ocean subsurface warming and warm water influx, leading to the persistence of low sea ice since 2016 (Blanchard-Wrigglesworth, Roach, et al., 2021; Eayrs et al., 2021; Ionita, 2024; Meehl et al., 2019; Purich & Doddridge, 2023; Roach et al., 2023; Wang et al., 2019; Zhang et al., 2022). The debate reflects the complex combination of factors that influence Antarctic sea ice, and furthermore that some of these factors may be impacted by climate change (Maksym, 2019; Turner et al., 2015).

Global climate models (GCMs) are tools used to investigate past sea-ice change, as well as predict future change on decadal or centennial scales. Despite a wide range of simulated historic and present-day Antarctic sea-ice states between models, the majority predict a sea-ice decline in response to anthropogenic climate change (Holmes et al., 2022; Roach et al., 2020). Most assessments of CMIP5 (the previous generation) and CMIP6 (latest generation) GCMs focus on their mean states and projected trends, for example, Bracegirdle et al. (2020); Shu et al. (2020). Sea-ice variability in these models has been studied comparatively little, although it has been concluded that simulated winter interannual variability is generally higher than observed (Roach et al., 2020). However, comparisons were to observations before the last few years' lows, see for example, Roach et al. (2020), Gagné et al. (2015), and Blanchard-Wrigglesworth, Donohoe, et al. (2021).

With our study, we therefore investigate winter variability, and in particular under what conditions CMIP6 GCMs simulate anomalies as negative as the record low of 2023. We focus on the SIE anomaly from a relatively short baseline (the previous decadal mean) to capture rapid retreat. We aim to answer three questions:
  1. How likely are CMIP6 GCMs to simulate an anomaly as negative as that observed in winter 2023 under internal variability alone?

  2. Do these models tend to simulate greater anomalies or more frequent extreme lows under other forcings related to anthropogenic climate change?

  3. Could the last few years' record lows signal a new regime of decreasing Antarctic sea ice as a response to climate change, or a shift to new states of low SIE or more frequent extreme lows (as suggested by, for example, Purich and Doddridge (2023), Raphael and Handcock (2022), and Eayrs et al. (2021))?

We answer the first question by comparing the observed variability to multi-model simulated variability in pre-industrial simulations, and the second by comparing to variability in simulations with idealized forcing, realistic historical forcing and realistic future forcing pathways. We answer the final question by considering the instances in these simulations where such an anomaly occurs.

2 Data

We used outputs for six types of simulation. The simulation types are the pre-industrial control (“piControl”) experiment, the “historical” experiment, the “1pctCO2” experiment, and three experiments with different future forcing scenarios (the “ssp” experiments). “piControl” is used to investigate internal variability alone, while the other simulations investigate variability under idealized and more realistic climate forcings.

The “piControl,” “1pctCO2,” and “historical” experiments were run by all models participating in CMIP6. The “piControl” simulation uses invariant solar, greenhouse gases (GHGs), ozone, tropospheric aerosol, volcanic and land-use forcings for the year 1850. The “1pctCO2” experiment is initialized from piControl, with all forcings identical, apart from the atmospheric CO2 concentration, which is increased from 1850 levels by 1% every year, for a minimum of 150 years (Eyring et al., 2016). The “historical” experiment is also initialized from piControl, but forced with historical (1850–2014) observations of all forcings described above (Eyring et al., 2016; Meinshausen et al., 2017). We also used experiments from ScenarioMIP, future projections initialized from the end of the historical simulation and run from 2015 to 2100, forced as described in O’Neill et al. (2016). We chose three of the four ScenarioMIP “Shared Socioeconomic Pathway” Tier 1 experiments, namely those that were run by almost all models from our model selection: these are “ssp126,” “ssp245,” and “ssp585,” respectively corresponding to additional radiative forcing of 2.6, 4.5, and 8.5 W/m2 by 2100. The first two scenarios assume large reductions in carbon emissions relative to the present day; the final represents an upper bound of the range of scenarios described in the literature (O’Neill et al., 2016).

We used 18 CMIP6 models from 16 model families (Table S1 in Supporting Information S1). These were selected as follows: for a representative sample of CMIP6 models, we prioritized selecting at least one model from each model family with data available on the ESGF archive (see Data Availability Statement); each model must have monthly sea-ice concentration data available for piControl, historical, and 1pctCO2 experiments, and at least two of the three ssp experiments outlined above. We chose models with relatively realistic winter SIE: no very low-biased CMIP6 models (Casagrande et al., 2023; Roach et al., 2020) were used.

In subsequent sections, “earlyhist” denotes only years 1850–1950 of the historical simulations. “latehist + sspxxx” denotes years 1950–2015 of the historical runs, combined with the 2015 to 2100 sspxxx run (to provide a set of simulations encompassing the full observational period, and sufficiently long for robust statistics).

Our analysis focuses on SIE. This is calculated from sea ice concentrations (SIC) from satellite retrievals, and coupled model output SIC (variable “siconc”), regridded to a regular 1° latitude/longitude grid before SIE calculation; see Data Availability Statement for detail. We represent winter with results for calendar month August across simulations and observations, since at the initial time of writing this was the latest month of satellite data available for 2023. See Figure S1 in Supporting Information S1 for August SIE timeseries for all model simulations.
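
As an illustration of the SIE calculation on a regular 1° grid, the following is a minimal sketch rather than the processing actually used. It assumes the conventional definition of extent as the total area of grid cells with SIC of at least 15%; this threshold is not stated in the text above. The function names, the spherical-Earth radius, and the array layout (latitude by longitude, SIC in percent) are likewise assumptions made for the example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth radius

def cell_areas_km2(lat_edges_deg, dlon_deg):
    """Area of one grid cell in each latitude band of a regular lat/lon grid."""
    lat = np.deg2rad(lat_edges_deg)
    return EARTH_RADIUS_KM**2 * np.deg2rad(dlon_deg) * (np.sin(lat[1:]) - np.sin(lat[:-1]))

def sea_ice_extent_mkm2(siconc, lat_edges_deg, dlon_deg=1.0, threshold=15.0):
    """Extent in Mkm2 from a 2-D (lat, lon) SIC field given in percent.

    The 15% threshold is an assumption (the conventional definition of extent).
    """
    areas = cell_areas_km2(lat_edges_deg, dlon_deg)[:, None]   # shape (lat, 1)
    ice = np.nan_to_num(siconc, nan=0.0) >= threshold          # land/missing counts as no ice
    return float((ice * areas).sum() / 1e6)

# Sanity check on a 180 x 360 grid: a fully ice-covered globe returns Earth's surface area
lat_edges = np.arange(-90.0, 91.0, 1.0)
full_cover = np.full((180, 360), 100.0)
global_extent = sea_ice_extent_mkm2(full_cover, lat_edges)
```

Weighting by cell area matters on a latitude/longitude grid because cells shrink toward the pole; simply counting cells would overweight the highest latitudes.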

3 Methods

For all simulations and observations, we construct a timeseries of the SIE anomaly from the previous 10-year mean SIE (for year X, this is the mean over years X-10:X-1). Goosse et al. (2009) showed that in a transient climate simulation, summer variability increases with increasing mean. We confirm this for winter sea ice (Figure S2 in Supporting Information S1) and identify a threshold of 10 Mkm2 beyond which this no longer holds. Therefore, to reduce impacts of this correlation on later results (given that the forced simulations have lower mean SIE than piControl), we only use simulation years with moving-average SIE >10 Mkm2. See Table S1 in Supporting Information S1 for the number of years retained for all simulations. Hereafter, we use “SIE variability data set” to refer to this type of data set (a timeseries of anomalies from the previous decadal mean, with moving-average SIE >10 Mkm2).
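
The construction of an “SIE variability data set” can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the function name is hypothetical, and the “moving-average SIE” used for the 10 Mkm2 filter is assumed here to be the same previous-10-year mean used as the baseline.

```python
import numpy as np

def sie_variability_dataset(sie, window=10, min_mean=10.0):
    """Anomaly of each year's SIE from the mean of the previous `window` years.

    Years whose previous-window mean is not above `min_mean` (Mkm2) are dropped.
    """
    sie = np.asarray(sie, dtype=float)
    anomalies = []
    for i in range(window, len(sie)):
        prev_mean = sie[i - window:i].mean()   # years X-10 .. X-1 for year X
        if prev_mean > min_mean:
            anomalies.append(sie[i] - prev_mean)
    return np.array(anomalies)

# Synthetic August SIE values in Mkm2, purely illustrative
sie_example = np.array([18.0] * 10 + [17.0, 18.5, 15.5])
anoms = sie_variability_dataset(sie_example)
```

The first `window` years of each series produce no anomaly because they lack a full preceding decade, so each data set is 10 years shorter than the simulation it comes from.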

We first tested whether variability in SIE is normally distributed, as often assumed implicitly in sea-ice studies. We tested both absolute August SIE, and the SIE variability data set for all models and simulations. From a Shapiro-Wilk test of normality, the null hypothesis (of a normal distribution) was ruled out for most timeseries at the p = 0.05 level. Therefore in Section 4, since it cannot be assumed that all data are normally distributed, and the associated probability distribution is unknown, we use the 5%–95% range (calculated using the scipy percentile function) instead of standard deviation as a measure of spread, and use non-parametric tests based on empirical distribution functions to compare distributions. To determine whether each SIE variability data set was significantly different from observed variability, we applied the Kolmogorov-Smirnov two-sample test using Python's scipy.stats package (see Figure S4 and Text S1 in Supporting Information S1 for details).
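
The three statistics named above can be computed along these lines. This is a hedged sketch on synthetic data: the arrays stand in for a model and an observational SIE variability data set, the helper name is hypothetical, and `numpy.percentile` is used for the 5%–95% range as an assumption about the percentile routine.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
model_anoms = rng.normal(0.0, 0.8, size=500)   # synthetic stand-in for one simulation
obs_anoms = rng.normal(0.0, 0.5, size=35)      # synthetic stand-in for observations

def range_5_95(x):
    """Width of the 5%-95% percentile range, used as the measure of spread."""
    p5, p95 = np.percentile(x, [5, 95])
    return p95 - p5

# Shapiro-Wilk: a small p-value rejects the null hypothesis of normality
shapiro_stat, shapiro_p = stats.shapiro(model_anoms)

# Two-sample Kolmogorov-Smirnov: a small p-value means the two samples
# are unlikely to come from the same distribution
ks_stat, ks_p = stats.ks_2samp(model_anoms, obs_anoms)

model_spread = range_5_95(model_anoms)
```

Both tests make no assumption about the shape of the underlying distribution in the comparison step, which is why they remain usable once normality has been rejected.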

For brevity we will use ΔSIEaug23 to denote the observational August 2023 anomaly from the previous 10-year (2013–2022) mean. Hereafter we always use “anomaly” to refer to anomalous reductions (ignoring anomalous highs, as all distributions may be asymmetric). We apply two methods to estimate the probability that a given model simulation, for any given year, returns an SIE anomaly of at least ΔSIEaug23. The first method uses empirical cumulative distribution functions (ECDFs): using Python's statsmodels package (https://www.statsmodels.org/stable/generated/statsmodels.distributions.empirical_distribution.ECDF.html), we calculate the ECDF Femp(x) of the SIE variability data set for each model simulation. The probability that this model simulation returns an SIE anomaly of at least ΔSIEaug23 is then
$P\left(x\le \Delta SIE_{\mathit{aug23}}\right)=F_{\mathit{emp}}\left(x=\Delta SIE_{\mathit{aug23}}\right)$ (1)
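
Equation 1 amounts to counting the fraction of simulated anomalies at or below the threshold. The sketch below uses plain numpy instead of the statsmodels ECDF class so that it stays self-contained; the function name, the example values, and the threshold of −2.5 Mkm2 are illustrative assumptions and not values from the study.

```python
import numpy as np

def ecdf_probability(anomalies, threshold):
    """F_emp(threshold): fraction of anomalies less than or equal to `threshold`."""
    anomalies = np.asarray(anomalies, dtype=float)
    return np.count_nonzero(anomalies <= threshold) / anomalies.size

# Ten illustrative anomalies in Mkm2; exactly one is at or below the threshold
example = np.array([-2.6, -1.0, -0.5, 0.0, 0.3, 0.4, 0.8, 1.1, -0.2, 0.6])
p_emp = ecdf_probability(example, -2.5)
```

A limitation of this estimate is visible immediately: if no simulated year reaches the threshold, the empirical probability is exactly zero, whatever the true probability is.
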
The second method uses extreme value theory: the generalized extreme value (GEV) distribution is a family of distributions used to model the extremes of sequences of independent and identically distributed random variables (Ailliot et al., 2011; Alves & Neves, 2011). We applied Python's scipy.stats package with inbuilt generalized extreme value distribution (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.genextreme.html) to fit a distribution fgev to each SIE variability data set for each simulation as follows
\begin{align*}
f_{\mathit{gev}}(x) &= \exp\left(-(1-cy(x))^{1/c}\right)(1-cy(x))^{1/c-1}/s, \\
\text{where}\quad y(x) &= (x-l)/s, \\
-\infty < x < 1/c \quad &\text{if}\quad c > 0, \\
1/c < x < \infty \quad &\text{if}\quad c < 0,
\end{align*} (2)
and s, l, and c are fitted parameters. Then, the probability P(x ≤ ΔSIEaug23) is given by the cumulative distribution function Fgev(x) of fgev(x):
$P\left(x\le \Delta SIE_{\mathit{aug23}}\right)=F_{\mathit{gev}}\left(x=\Delta SIE_{\mathit{aug23}}\right).$ (3)
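
A minimal sketch of Equations 2 and 3 with `scipy.stats.genextreme` follows. The data are synthetic, the threshold of −2.5 Mkm2 is illustrative, and `gev_pdf_eq2` is a hypothetical helper written only to show that Equation 2 matches the density scipy evaluates (scipy's shape parameter `c` has the same sign convention as Equation 2).

```python
import numpy as np
from scipy.stats import genextreme

def gev_pdf_eq2(x, c, l, s):
    """Density of Equation 2; valid inside the support and for c != 0."""
    y = (x - l) / s
    return np.exp(-(1 - c * y) ** (1 / c)) * (1 - c * y) ** (1 / c - 1) / s

rng = np.random.default_rng(1)
anoms = rng.normal(0.0, 0.8, size=500)      # synthetic stand-in for one data set

# Fit shape c, location l and scale s by maximum likelihood
c, l, s = genextreme.fit(anoms)

# Equation 3: probability of an anomaly at or below the threshold
delta = -2.5                                 # illustrative threshold in Mkm2
p_gev = genextreme.cdf(delta, c, loc=l, scale=s)
```

Unlike the empirical estimate, the fitted distribution can assign a non-zero probability to a threshold that was never reached in the sample, which is what makes it usable for very rare events.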

The return period T(x ≤ X) is the inverse of the probability P(x ≤ X), that is, T(x ≤ X) = 1/P(x ≤ X). For each model and simulation, we determined the 5%–95% confidence interval of the GEV-estimated P(x ≤ ΔSIEaug23) using the scipy.stats bootstrapping function (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.bootstrap.html): we resampled each SIE variability data set 1000 times, returning a GEV-estimated probability from each sample, to obtain the 5%–95% confidence interval.
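
The bootstrap step might look like the sketch below. It is an assumption-laden illustration: the helper names are hypothetical, the data are synthetic, a percentile bootstrap is assumed (the text does not name the bootstrap method), and the example call uses only 50 resamples to keep it quick, whereas the study used 1000.

```python
import numpy as np
from scipy import stats

def gev_probability(sample, threshold):
    """Fit a GEV to `sample` and return the fitted CDF at `threshold`."""
    c, loc, scale = stats.genextreme.fit(sample)
    return stats.genextreme.cdf(threshold, c, loc=loc, scale=scale)

def bootstrap_probability_interval(anoms, threshold, n_resamples=1000):
    """5%-95% percentile bootstrap interval of the GEV-estimated probability."""
    result = stats.bootstrap(
        (np.asarray(anoms, dtype=float),),
        lambda sample: gev_probability(sample, threshold),
        n_resamples=n_resamples,
        confidence_level=0.90,     # central 90%, i.e. the 5%-95% interval
        method="percentile",       # assumed; avoids the extra fits BCa would need
        vectorized=False,
    )
    ci = result.confidence_interval
    return ci.low, ci.high

def to_return_period(prob):
    """Return period in years; infinite if the probability is zero."""
    return np.inf if prob <= 0 else 1.0 / prob

rng = np.random.default_rng(2)
anoms = rng.normal(0.0, 0.8, size=200)       # synthetic stand-in
p_low, p_high = bootstrap_probability_interval(anoms, -2.5, n_resamples=50)

# Inverting a probability swaps the bounds: the highest probability
# gives the shortest return period
t_low, t_high = to_return_period(p_high), to_return_period(p_low)
```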

4 Results

Figure S3 in Supporting Information S1 shows histograms of the SIE variability data sets with observed August 2022 and 2023 differences highlighted (henceforth referred to as ΔSIEaug22 and ΔSIEaug23). ΔSIEaug22 falls within the distribution simulated by most models, although toward the low-probability (i.e., high return period) end of the distribution. The return period T(x ≤ ΔSIEaug22) for a model to simulate an annual anomaly of this magnitude or greater varies between models and simulations, but for the majority (72/107) of simulations, T(x ≤ ΔSIEaug22) < 10 years, and for all simulations T(x ≤ ΔSIEaug22) < 100 years (from ECDF estimates, see Figure S4b in Supporting Information S1). By contrast, ΔSIEaug23 is toward the extreme low-probability end for all models, as quantified later.

In total across the 14,568 years analyzed here, anomalies of magnitude ΔSIEaug23 are simulated 104 times. The SIE within −20 to +20 years of these instances is composited (Figure 1b; compare to observational SIE in Figure 1a). The year of the anomaly is preceded by ∼5 years of sea ice reduction from the longer-term mean, and SIE then takes ∼10 years to recover to a new, lower, extent. The mean over years 10–20 is reduced relative to the mean over years −20 to −10 by 0.7 Mkm2. This suggests that (at least in models), after an event of magnitude ΔSIEaug23, sea ice may take decades to recover. These results include MPI-ESM1.2-LR, a model with particularly high variability (accounting for 57/104 instances). We therefore repeated this analysis for the remaining 47 instances, and obtained a reduction of 0.9 Mkm2. We also repeated the analysis for the 27 occurrences within piControl runs and obtained a reduction of 0.4 Mkm2. This indicates that some of the 0.7 Mkm2 reduction may be accounted for by forced multi-decadal decreasing SIE trends, but even under internal variability only, an anomaly of magnitude ΔSIEaug23 is still followed by lowered SIE over the subsequent decades. See these two additional cases in Figure S5 in Supporting Information S1.
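
The compositing described above can be sketched as follows. The function names and the synthetic series are assumptions for illustration; in particular, how events near the start or end of a simulation were handled is not stated, and this sketch simply skips them.

```python
import numpy as np

def composite_windows(sie, event_indices, half_width=20):
    """Stack SIE windows of +/- half_width years around each event.

    Each window is expressed relative to its own mean over years -20 to -10.
    Events without a complete window are skipped (an assumption).
    """
    sie = np.asarray(sie, dtype=float)
    windows = []
    for i in event_indices:
        if i - half_width < 0 or i + half_width >= len(sie):
            continue
        w = sie[i - half_width:i + half_width + 1]   # 41 values, year 0 at index 20
        baseline = w[:11].mean()                      # years -20 .. -10
        windows.append(w - baseline)
    return np.array(windows)

def post_event_shift(windows):
    """Composite mean over years +10 .. +20, relative to the pre-event baseline."""
    mean_curve = windows.mean(axis=0)
    return mean_curve[-11:].mean()

# Synthetic series in Mkm2: steady state, one extreme low, then a lower state
sie = np.array([18.0] * 30 + [15.5] + [17.3] * 30)
windows = composite_windows(sie, event_indices=[30])
shift = post_event_shift(windows)
```

Referencing each window to its own pre-event mean lets instances from models with very different mean states be averaged together.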

Figure 1. (a) Timeseries of August Antarctic SIE, from 1979 to 2023. Thin black line: mean over 2003–2013. Red line: August 2023 value. (b) In gray, all instances in all simulations of an anomaly of magnitude at least ΔSIEaug23, from year −20 to year +20 around the anomaly, centered on year “0” (the year of the anomaly) and relative to the mean over years −20 to −10. Thick black line: mean over all runs. Thick dashed lines: lower and upper quartiles, and median, over all runs. Thin black line: mean over years −20 to −10, and green line: mean over years 10–20. Red line: anomaly of August 2023 mean from 2003 to 2013 mean, for comparison with (a).

We now consider two quantities across simulations and observations: variability, and return period of a ΔSIEaug23 event. First we compare simulated and observed variability. The variability (defined as the 5%–95% SIE range) shown in Figure 2a is more dependent on the model than the simulation. However, most (13/18) models have variability at least 0.2 Mkm2 higher than observed in at least 5/6 simulations; the multi-model ensemble has range 2.0–3.0 Mkm2 across the six simulations (compare to an observed range of 1.5 Mkm2). This is in line with previous studies that considered sea ice before 2023, for example, Gagné et al. (2015); Roach et al. (2020); Zunz et al. (2013), which found that annual variability simulated by the majority of GCMs is greater than observed. We next compare each simulation's SIE variability data set to the observed variability over 1979–2023 (results were similar when instead using observations over 1979–2022). From Figure S4d in Supporting Information S1, the models with all simulations most consistent with observations appear to be FIO-ESM-2-0 and CESM2; the only model inconsistent with observations at the p < 0.05 level for all simulations is MPI-ESM1.2-LR.

Figure 2. (a) 5%–95% SIE range for each model, for each simulation, and bottom row: observations. (b) and (c) The return period T(x ≤ ΔSIEaug23), for each model and simulation, of returning an August SIE anomaly as large as ΔSIEaug23, calculated by two methods: (b) T(x ≤ ΔSIEaug23) from ECDF. Black: an anomalous low equal to or larger than ΔSIEaug23 did not occur in any simulated year. (c) As for (b), but T(x ≤ ΔSIEaug23) found by fitting GEV distribution to tails. Black: T(x ≤ ΔSIEaug23) > 10⁴ years.

We now move on to estimating the return period TΔSIEaug23 for each model to return an anomaly of magnitude at least ΔSIEaug23, for all simulations, using two methods (see Section 3). Figure 2b shows ECDF results. For 10/18 models, the value never occurred in any simulation, so the empirically determined TΔSIEaug23 > (number of years in the SIE variability data set). The majority (14/18) of models simulate TΔSIEaug23 > 90 years (i.e., occurring at most once or twice) for all simulations. Given that anomalies as large as ΔSIEaug23 are rarely simulated over the time period considered, it is challenging to estimate TΔSIEaug23 for all simulations from the ECDF. Therefore, we apply extreme value theory to the SIE variability data sets for a more reliable return period for these events. Figure 2c shows TΔSIEaug23 calculated with this method (see Figures S4c and S6 in Supporting Information S1 for equivalent probabilities and errors). As might be expected given the variation between models shown in Figure S3 in Supporting Information S1 and Figure 2b, GEV estimates are highly dependent on the model and forcing, but TΔSIEaug23 > 10² years for almost all (94/107) simulations, and >10³ years for most (73/107) simulations.

To better understand these results, we consider both the multi-model ensemble (MME), and the dependence of TΔSIEaug23 on the variability for each simulation. Figure 3a shows the MME SIE variability data set histogram. The MME 5%–95% range for all simulations is significantly broader than the observational 5%–95% range, and the MME data sets are inconsistent at the p < 0.01 level with the observational data set (see also Figure 2a and Figure S4 in Supporting Information S1). ΔSIEaug23 is at the low-probability extreme lower end of the range of the MME. Figure 3b shows TΔSIEaug23 from GEV fits to each MME simulation data set. For piControl, TΔSIEaug23 is 2650 years (5%–95% confidence interval: 1530–6260 years). Climate forcing reduces TΔSIEaug23 for the earlyhist and latehist + ssp245 simulations to respectively 1290 (530–3150) years and 1330 (590–3180) years, and for latehist + ssp585 to 580 (300–1120) years. We emphasize the large reduction from piControl to latehist + ssp585: their 5%–95% confidence intervals (1530–6260 years and 300–1120 years) do not overlap, indicating a significant difference. We note that the MME estimates include models with variability highly inconsistent with observations: from Figure S4 in Supporting Information S1, CAMS-CSM1.0 and MPI-ESM1.2-LR, and from Figure S1 in Supporting Information S1, GFDL-ESM4, for which unrealistic deep convection in the piControl and historical runs results in rapid sea ice loss (Dunne et al., 2020; Heuzé, 2021). Repeating the analysis without these three models yields higher return periods of 10³–10⁶ years across simulations (Figure S7 in Supporting Information S1). However, climate forcing still robustly decreases TΔSIEaug23 relative to piControl with the largest reductions for latehist + ssp585.

Figure 3. (a) Histogram of the SIE variability data set for the multi-model ensemble, with observations in black, and ΔSIEaug22 and ΔSIEaug23 highlighted in gray and red respectively. (b) GEV-estimated return periods of ΔSIEaug23 using the multi-model ensemble; errors show 5%–95% confidence interval returned by bootstrapping. (c) For each model simulation, GEV-estimated return period against the simulation's respective SIE 5%–95% range. Black vertical line: observational 5%–95% range for comparison. Inset: results for SIE range: 1.1–1.9 Mkm2. Blue line: fit to datapoints in this region, of form log (GEV-estimated period) = m log (SIE range) + c, with m = −42 ± 8, and c = 15 ± 1. Thin blue lines: upper and lower bounds on this fit. Dashed line: TΔSIEaug23 returned from fit, for SIE range equal to observations (with thin dashed lines: upper and lower bounds).

We now consider all model simulations individually: Figure 3c shows TΔSIEaug23 against the variability. We identify a strong negative correlation: models with greater variability have lower TΔSIEaug23. The simulations with variability within 0.5 Mkm2 of observations are very unlikely to simulate an August 2023 event (Figure 3c, inset), all yielding TΔSIEaug23 > 10³ years. Most simulations within 0.2 Mkm2 of observations yield TΔSIEaug23 > 10⁵ years. We perform a least-squares fit to this range (we do not expect this fit to hold exactly, but use it to provide a rough estimate of expected return periods for this range). Using this fit, for observational variability 1.5 Mkm2, the corresponding TΔSIEaug23 = 5 × 10⁷ years (from errors: 1 × 10⁵ to 2 × 10¹⁰ years). This means that for a model simulation with variability similar to observations, the associated return period for ΔSIEaug23 would be on the order of 100 thousand to 10 billion years.
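
As a rough check on the order of magnitude, the fit quoted in the Figure 3c caption, log (period) = m log (SIE range) + c with m = −42 and c = 15, can be evaluated at the observed 5%–95% range of 1.5 Mkm2. The sketch assumes base-10 logarithms (the caption does not state the base) and uses only the rounded central values of m and c, so it is expected to land near, not exactly on, the quoted 5 × 10⁷ years.

```python
import math

def return_period_from_fit(sie_range, m=-42.0, c=15.0):
    """Return period in years from the log-log fit, assuming base-10 logs."""
    return 10.0 ** (m * math.log10(sie_range) + c)

t_obs = return_period_from_fit(1.5)        # observed 5%-95% range in Mkm2
t_narrow = return_period_from_fit(1.1)     # lower edge of the fitted region
t_wide = return_period_from_fit(1.9)       # upper edge of the fitted region
```

With these rounded parameters the estimate at 1.5 Mkm2 is a few times 10⁷ years. The steep slope is the main point: across the fitted region of 1.1–1.9 Mkm2, a change of a few tenths of a Mkm2 in the range shifts the estimated return period by many orders of magnitude, which is why the uncertainty bounds on this estimate are so wide.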

5 Discussion

We separate SIE anomalies into “variability” (1979–2022, 5%–95% probability range) and “extreme events” (such as the winter 2023 anomaly). Under this definition, most CMIP6 models considered here simulate “variability” similar to (but slightly higher than) observed variability, consistent with past findings (Gagné et al., 2015; Roach et al., 2020). From the relationship we identify between TΔSIEaug23 and variability, if a simulation has TΔSIEaug23 in the 10–1000-year range, this is enabled by variability that is much greater than observations. Simulations with variability within 0.5 Mkm2 of observations have associated TΔSIEaug23 of at least 1000 years; a rough estimate shows that for a simulation with variability equal to observations, the associated return period for ΔSIEaug23 is of the order 10⁷ (50 million) years.

This return period is very large (so seems unrealistic). However, given the short satellite record, and that 2023's anomaly was such an unprecedented event, it is very challenging to estimate the true value of TΔSIEaug23 from the satellite record. Three possible methods are a Gaussian approximation, or the GEV or ECDF estimates applied in this paper. These respectively return TΔSIEaug23 ∼ 10⁵, ∼10³, and ∼0.03. These estimates differ by several orders of magnitude, and are likely flawed, due to a small and probably non-representative sample (e.g., Gilbert and Holmes (2024)). Neither the ECDF nor the GEV can reliably be applied to observations given the very small sample size (Cai & Hames, 2010; Philip et al., 2021); indeed, we find that 5%–95% confidence intervals on these two estimates (using bootstrapping) span 10 orders of magnitude. However, we emphasize that all three estimates are several orders of magnitude smaller than 10⁷ years.

To estimate the true value of TΔSIEaug23 and better quantify variability, more research on observed long-term variability is critical. It is possible that true interannual variability could be greater than the range measured over the satellite era, and may change naturally over centennial or millennial scales, so quantifying variability on these timescales would more accurately indicate TΔSIEaug23, enabling a more robust comparison with models. There is some evidence of changing variability even in the short satellite record (Purich & Doddridge, 2023). Longer records such as marine sediment cores or ice cores could help quantify the variability (Chadwick et al., 2023; Crosta et al., 2022). These “proxy” records have low temporal resolution and high uncertainty, so their interpretation has focused on mean sea-ice state or decadal-scale trends rather than interannual variability (Abram et al., 2013; Thomas et al., 2019). Reconstructions of sea ice since 1905 based on relationships with atmospheric variables do support 2023 as an exceptionally low year for sea ice (Fogt et al., 2022; Yang et al., 2021).

A mechanistic understanding of 2023's record low is critical to understanding the likely future evolution of Antarctic sea ice, as well as whether models capture the processes that caused it. The first papers on the atmospheric and oceanic precursors have been recently published (Ionita, 2024; Purich & Doddridge, 2023), suggesting 2023's record low sea ice was related to a build-up of subsurface ocean heat, possibly enhanced by large-scale circulation changes.

We suggest the likely unrealistic model-returned TΔSIEaug23 of 10⁷ years may be due to inconsistencies between the shape of the SIE distribution in models versus observations: models tend to have a wider-than-observed envelope of 5%–95% variability, but may simulate very extreme events too infrequently, with too narrow tails as compared to observations. Our results suggest that the process responsible for the low sea ice in 2023 is not properly accounted for in models, necessitating better understanding and model improvements. Low persistence is one possible explanation of the very rare occurrence in most models of anomalies as large as observed in winter 2023. Almost all models have persistence in the piControl simulations of at most 2 years (not shown), whereas the recent sequence of sea-ice lows may have been enabled by higher persistence than this (Massonnet et al., 2023; Purich & Doddridge, 2023). This could explain why models tend not to simulate as large departures from the mean as have been recently observed.

However, we do note from Figure 1b that in the rare instances that an anomaly of magnitude ΔSIEaug23 is simulated in models, it is preceded by ∼5 years of decrease from the mean, and SIE after this anomaly takes around a decade to recover to a new, lower, state, suggesting that there is some persistence in the system after such an event. Given that the mean in Figure 1b is taken over models with very different initial sea ice conditions, we do not expect the 0.7 Mkm2 reduction 10–20 years after such an event to be a prediction of the future state of sea ice over the 2030s–2040s. However, it does indicate that, at least in models, a reduction of ΔSIEaug23 is followed by a transition to a lower sea-ice state, so we may expect to see this over the coming decades.

In models, unrealistically deep and frequent Southern Ocean convection can contribute to high sea ice variability; most models considered here have this artifact to varying degrees (13 do, two do not, three unknown, from Heuzé (2021) and Mohrmann et al. (2021)). The slight majority (8/13) of models with deep convection also simulate ΔSIEaug23 at least once, and no models simulate ΔSIEaug23 without some deep convection. Therefore, models with more deep convection may also be more likely to simulate ΔSIEaug23. This would support our suggestion that models that simulate ΔSIEaug23 more frequently than 1/1000 years tend to do so due to unrealistically high SIE variability, which, from these results, could be linked to unrealistically high deep convection. However, given that almost all models (with data available) include some degree of deep convection, and five models have some deep convection but never simulate ΔSIEaug23, it is difficult to draw robust conclusions. A more thorough investigation could provide the basis for a follow-up study. Finally, we note that, over the last decades, sea-ice trends have been highly regional, with statistically significant increases in the Ross Sea but decreases elsewhere (Eayrs et al., 2021; Yuan et al., 2017). Here we investigate pan-Antarctic SIE alone, but further useful research could investigate regional changes.

6 Conclusions

We have shown that CMIP6 models tend to simulate greater interannual variability in SIE than that observed over 1979–2022. However, they are still extremely unlikely to simulate an SIE anomalous low of the magnitude of that observed in August 2023 (ΔSIEaug23). An approximate fit to model simulations with near-observational variability shows that for a model simulation with variability equal to observations, the associated return period for ΔSIEaug23 would be 50 million years. Previous studies have shown differences between sea-ice change in models and observations, with models simulating stronger linear trends (Gagné et al., 2015; Roach et al., 2020) implying they may not be capturing all processes and projections of future reductions may be over-stated (although this trend discrepancy may not remain in light of recent change). We add an interesting counter-argument in that the models may, in fact, under-simulate very rapid ice decline.

However, despite these limitations, we note that climate forcing in the models does robustly reduce the return period for such anomalies, relative to under internal variability alone, by up to an order of magnitude in the multi-model ensemble. The reduced return period in forced scenarios suggests that winter 2023's extreme low was made more likely by climate change, in agreement with recent literature, for example, Purich and Doddridge (2023) and Ionita (2024) (although our study does not constitute a formal attribution analysis). Furthermore, when these rare anomalies do occur in models, sea ice takes approximately 10 years to recover to a new state, in which SIE is lowered by 0.5–1 Mkm2 relative to the mean preceding the anomaly. Therefore, as suggested by Purich and Doddridge (2023) and Ionita (2024), 2023's low may indeed act as a bellwether of future change, indicating a transition to a new regime of lowered winter sea ice, at least for the next few decades.

Acknowledgments

RD acknowledges support from NERC training Grant NE/S007164/1. RD thanks John Slattery for useful conversations and guidance. LCS has received support from the NERC National Capability International Grant SURFEIT: NE/X009319/1, and acknowledges additional support from ANTSIE: EU-H2020G.N.864637 and TiPES: EU-H2020G.N.820970. CRH acknowledges support from NERC Grant DEFIANT: NE/W004747/1. This work was supported by NERC through National Capability funding, undertaken by a partnership between the Centre for Polar Observation and Modelling and the British Antarctic Survey. This work used the NCAS CMS PUMA Service (https://cms.ncas.ac.uk/puma/), the ARCHER2 UK National Supercomputing Service (http://www.archer2.ac.uk) and the JASMIN data analysis platform (http://jasmin.ac.uk/). The authors thank F. Massonnet, J. Wang, H. Goosse, and an anonymous reviewer for their positive and constructive reviews.

    Conflict of Interest

    The authors declare no conflicts of interest relevant to this study.

    Data Availability Statement

    Observational and model sea ice extent is calculated from sea ice concentrations (SIC) from satellite retrievals (NOAA-NSIDC monthly Antarctic sea ice concentration from years 1978–2023, see documentation at Fetterer et al. (2017)) and coupled model output (variable “siconc”). For both satellite data and model output, SIC was aggregated to a 180 × 360 grid using CDO (Schulzweida, 2023) before calculating SIE for a more robust comparison. The CMIP6 DECK and ScenarioMIP model outputs are in the Earth System Grid Federation (ESGF) archive (Cinquini et al., 2014). For all simulations with multiple ensemble members, we use the first ensemble member (designated “r1i1p1f1”), apart from the HadGEM3 models for both 1pctCO2 and abrupt-4xCO2 simulations, where we used “r1i1p1f3,” the first member available. Individual model data are available on ESGF; see Table S1 in Supporting Information S1. Processed model outputs used in this study are available at Diamond, Sime, et al. (2024).