Wind Stress‐Induced Multiyear Predictability of Annual Extratropical North Atlantic Sea Surface Temperature Anomalies

Long‐term predictability of the North Atlantic sea surface temperature (SST) is commonly attributed to buoyancy‐forced changes of the Atlantic Meridional Overturning Circulation. Here we investigate the role of surface wind stress forcing in decadal hindcasts as another source of extratropical North Atlantic SST predictability. For this purpose, a global climate model is forced by reanalysis (ERA‐interim) wind stress anomalies over the period 1979–2017. The simulated climate states serve as initial conditions for decadal hindcasts. Significant skill in predicting detrended observed annual SST anomalies is observed over the extratropical central North Atlantic with anomaly correlation coefficients exceeding 0.6 at lead times of 4 to 7 yrs. The skill is insensitive to the calendar month of initialization and primarily linked to upper ocean heat content anomalies that lead anomalous SSTs by several years.

To achieve skillful multiyear climate predictions, the oceanic state must be initialized realistically. However, subsurface ocean observations are limited in quantity and quality. Often, only surface (heat, freshwater, and momentum) fluxes are prescribed to ocean and climate models. The specification of only momentum flux (wind stress) anomalies in partially coupled climate models enables reproducing major aspects of observed low-frequency variability such as the global warming hiatus during the early 21st century (Delworth et al., 2015;England et al., 2014;Thoma et al., 2015), and the Arctic early twentieth century warming (Svendsen et al., 2018). Previously, climate models initialized only by observed wind stress anomalies were shown to successfully predict tropical Pacific sea surface temperature (SST) and related climate anomalies several months ahead (e.g., Latif et al., 1998). Moreover, Ding et al. (2013) and Thoma et al. (2015) report skill in predicting decadal Pacific variability several years after initialization in hindcasts utilizing reanalysis wind stress anomalies.
We forced a coupled climate model by reanalysis surface wind stress anomalies to produce past climate states that are used to initialize a large ensemble of decadal hindcasts. The main question we address is, by how much wind-driven ocean processes contribute to multiyear SST predictability over the extratropical North Atlantic. The model and the experimental setup are described in section 2. In section 3, we discuss the predictive skill of extratropical North Atlantic SST. Since some ocean processes only operate during certain times of the year, we also assess the sensitivity of the skill to the initialization calendar month, which, so far, has not been specified with respect to multiyear predictability in this region. In section 4, we address the mechanisms giving rise to the skill. A discussion of the major results and the conclusions are provided in section 5.

Model and Implementation of Reanalysis Surface Wind Stress Anomalies
All simulations are conducted with the Kiel Climate Model (KCM, supporting information Text S1; Park et al., 2009). The experimental setup is as follows (additional details in Text S2). First, an ensemble of 10 simulations is performed, in which the daily zonal and meridional wind stress anomalies from ERA-20C (Poli et al., 2016) drive the KCM over the period 1900-1978. These runs are termed spin-up runs. Their initial conditions are taken from a preindustrial KCM control run employing a constant CO 2 concentration of 286 ppm. Second, an ensemble of 10 simulations forced by daily wind stress anomalies covering 1979-2017 from ERA-interim (Dee et al., 2011) is conducted, termed initialization runs. Their initial conditions are taken from the spin-up runs. In both, spin-up and initialization runs, the observed CO 2 concentration is used.
By prescribing only anomalies, mean state biases such as the cold SST bias over the subpolar North Atlantic, addressed for the KCM in Drews et al. (2015) and Drews and Greatbatch (2017), are maintained. On the other hand, initialization shocks and model drift can be reduced by anomaly initialization (Chikamoto et al., 2019;Kröger et al., 2018). Flux correction could serve as an alternative. For example, the KCM's North Atlantic cold SST bias was significantly reduced by freshwater flux correction, as shown in Park et al. (2016). Apart from the prescribed wind stress anomalies, the model is fully coupled and heat and freshwater fluxes are exchanged between atmosphere and ocean without any constraint. A similar partial coupling technique has been used in Ding et al. (2013), Thoma et al. (2015), and Polkova et al. (2019). Annual extratropical North Atlantic SST anomalies are well reproduced in the initialization runs, as described below. The spin-up experiments, however, reproduce only the globally averaged SST but not the extratropical North Atlantic SST anomalies ( Figure S1), possibly due to the large uncertainty in reanalysis products prior to the satellite era (Bordbar et al., 2017;Wohland et al., 2019).

Initialized Hindcasts
We note that the term hindcast is used in different ways. For example, ocean model simulations forced by prescribed observed (reanalysis) surface fluxes are also termed hindcasts . We here use the term hindcast to describe retrospective forecasts with a coupled ocean-atmosphere model. We run an ensemble of 10-member decadal hindcasts initialized from the 10-member initialization-run (ERA-interim forced) ensemble and without any further observations specified apart from the observed CO 2 concentrations. Initialization dates are 31 December (as commonly used in decadal prediction studies), 31 March, 30 June, and 30 September, which allows assessing the sensitivity of the predictive skill to the initialization calendar month. Hindcast ensembles are started between 1988 and 2007, yielding 80 start dates. Each hindcast is integrated for 10 yrs. We note that our method attempts to mimic a real forecast situation. No drift correction was applied.

Evaluation Methods
The aim of this study is to show how specification of wind stress anomalies in a climate model influences multiyear predictions of extratropical North Atlantic SST. As we are not interested in the skill arising from long-term trends, irrespective of whether they are due to external forcing or internal variability, we linearly detrended the data prior to analysis. By removing the trends, the effects of trends in the wind stress and CO 2 are reduced. From the hindcasts we construct ensemble mean time series of SST for different prediction horizons. Year 1, for example, denotes the average over the 12 months following initialization. To asses skill, we calculate anomaly correlation coefficients with respect to the observed SST from HadISST (Rayner et al., 2003). The statistical significance (95% confidence level) of correlation and regression coefficients is assessed by means of a bootstrapping method with moving blocks to account for autocorrelation. Blocks of 5-yrs length are randomly selected, with replacement, to construct 1,000 pseudo time series creating a statistical distribution of correlation and regression coefficients. All regressions are normalized with the SST index standard deviation. The hindcasts are compared with the persistence forecast (hereafter persistence) that is given by the autocorrelation of the observed SSTs.

Predictive Skill in the Initialized Hindcasts
The skill in predicting SST is presented for the global ocean in Figure S2 (supporting information). Shown are the anomaly correlations (left) and the ratio between the root-mean-square errors (RMSEs) of the hindcasts divided by the standard deviation of observed SST anomalies. A ratio of unity or higher indicates no skill. When predicting SST at lead times longer than 1 yr, the extratropical North Atlantic stands out with significant skills forecasting SST averaged over Years 6-9 (y 6-9, bottom panels). When forecasting the SST of the first year (y 1, top panels), significant skill is also found over the central and eastern tropical Pacific, in parts of the extratropical North and South Pacific, and south of Australia. The high skill over the Pacific likely has some contribution from the El Niño-Southern Oscillation (ENSO) and associated teleconnections and from the Interdecadal Pacific Oscillation (IPO). In contrast to the North Atlantic, significant skill over the Pacific is not retained ( Figure S2), when predicting SST averaged over Years 2-5 (y 2-5, middle panels) and Years 6-9 (y 6-9, bottom panels). There is, however, some marginal skill left over the central equatorial Pacific and off the equator. In the following we focus on the skill in the North Atlantic that is maintained over lead times of several years.
The extratropical North Atlantic, especially the region in the gray box shown in Figure 1 (45-55°N, 35-15°W), is characterized by high anomaly correlations (up to 0.8) and relatively small normalized errors (down to about 0.7). We compute an index by averaging the SSTs over the box. The ensemble mean of the initialization runs, and the 1-and 7-yr hindcasts reproduce fairly well the observed multiyear variability (Figure 2a). The correlation, based on 12-month running means, between observations and the ensemble mean initialization run amounts to 0.61, that of the 1-yr hindcast to 0.71 and that of the 7-yr hindcast to 0.60. All three correlations are statistically significant based on a t-test at the 95% confidence level. The correlations do not fundamentally change when the area average is taken over the entire extratropical North Atlantic basin. We note that the correlation for the hindcasts is computed over a shorter period compared to that of the initialization run , as the hindcasts only start in 1988.
An important source of multiyear predictability might be the enhanced power in the North Atlantic surface climate at the subdecadal time scale, which has been reported in several studies (Álvarez-García et al., 2008;Costa & Verdiere, 2002;Czaja & Marshall, 2001;Deser & Blackmon, 1993;Eden & Greatbatch, 2003;Fye et al., 2006;Martin et al., 2019;McCarthy et al., 2018;Reintges et al., 2017;Sutton & Allen, 1997). The subdecadal variability has been suggested to originate from coupled ocean-atmosphere dynamics in which the NAO and related wind stress plays an essential role (e.g., Krahmann et al., 2001). A subdecadal SST variability is also observed in the ensemble mean initialization runs and hindcasts (Figure 2a), however, with some differences in timing and amplitude. For example, the 7-yr hindcast ensemble mean depicts relatively small fluctuations, which is expected, because stochastic fluctuations associated with atmospheric forcing are strongly damped when averaging over several realizations.
When predicting annual averages of SST (Figure 2b), the skill of the hindcasts beats persistence at all lead times and there is not much sensitivity to the calendar month of initialization. Persistence becomes negative after 3-4 yrs. The correlations of the hindcasts, however, stay above 0.6 at lead times of 4-7 yrs and only

Mechanism for Multiyear Predictive Skill
To identify the source of wind stress-induced multiyear SST predictability over the extratropical North Atlantic, lag regressions of selected variables onto the SST index defined above (gray box shown in Figures 1, 3, and 4) are computed from the ensemble mean of the initialization runs (Figure 3). When considering the ensemble mean, the ocean variability due to the prescribed reanalysis wind stress anomalies that is identical in all ensemble members is retained, whereas internally produced variability that differs among ensemble members is largely eliminated.
At lag −4 yr, that is, 4 yrs prior to the SST index attaining its maximum, the only significant SST anomalies are negative and found in the area of the box (Figure 3, first column). This is consistent with the autocorrelation of the observed SST anomalies (persistence) becoming negative around lag −4 yr (Figure 2b). The downward (atmosphere-to-ocean) net surface-heat flux regressions (Figure 3, fourth column) are positive in the region of the largest negative SST anomalies, indicating that heat fluxes are damping and not causing the SST changes. This implies an important role of ocean dynamics. Heat content anomalies exhibit a dipole structure with positive anomalies in the west (Figure 3, second column). Wind stress anomalies are insignificant at lag −4 yr (Figure 3, third column, arrows). We note, however, a positive (anticyclonic) signal in the barotropic stream function to the southwest of the SST anomalies (Figure 3, third column, color shading), with the positive heat content signal at its northern flank (Figure 3, second column). The lag −4-yr anomaly in the barotropic stream function that, with smaller amplitude already is present at lag −6 yr, is likely the delayed response to the strong wind stress anomalies, their curl and associated poleward heat and mass transports at lag −6 yr. These wind stress anomalies resemble a pattern associated with a positive NAO phase. This is confirmed by the HadSLP2 sea level pressure regressions ( Figure S3c).
At lag −2 yr, SST anomalies north of 40°N have become positive and are strongest in the subpolar region. The heat flux again provides a damping. At this time, the center of the midlatitudinal positive heat content anomaly has propagated northeastward, straddling the box. This heat advection is presumably driven by both the anomalous "subsurface water intrusion (%3C700 m)" ( Figure S3a) and the mean upper ocean circulation ( Figure S3f). At the same time, strong wind stress anomalies are observed over the entire basin and consistently, the barotropic stream function anomalies change sign.
The positive SST anomalies are, by definition, fully developed over the box area at lag 0 yr. At this time, the anomalies in all scalar variables to first order exhibit the same pattern as those simulated at lag −4 yr but with opposite signs, suggesting a cyclic behavior. Quasi-oscillatory behavior also is supported by the wind stress anomalies, which are similar but of opposite signs at lag −6 yr and at lag −2 yr and extend almost Figure 3. Lag regressions calculated from the ensemble-mean of the initialization runs onto the extratropical SST index (average over the gray box). From left to right: SST, upper ocean (700 m) heat content ("HC(%3C700 m)"), barotropic stream function ("PSI," color shading) and wind stress (arrows), downward net surface heat flux ("HF"), and the Atlantic meridional overturning stream function. Lag −6 yr, for example, means that the regressed variable is leading the SST index by 6 yrs. Only significant (95% confidence) regression coefficients are shown (wind stress arrows are shown if at least one of the components is significant). Regressions are normalized by the SST index standard deviation. Data have been linearly detrended prior to analysis. over the entire North Atlantic. At lag 0 yr, the SST anomalies cover a large part of the extratropical North Atlantic. The positive anomaly in the middepth to near-surface overturning stream function at lag −2 yr north of about 40°N (Figure 3, fifth column) seems to be another potential contributor to the positive heat content anomaly and extratropical SST warming in the initialization runs, as the overturning stream function anomaly would be associated with an enhanced poleward heat transport in the upper layers. At lag 0 yr, a positive overturning stream function anomaly is observed further south. Finally, at this time, a strong cyclonic barotropic stream function anomaly has developed and a negative heat content anomaly appears to the west of the center of the SST anomaly that are harbingers of the next phase reversal. Based on the ensemble mean of the initialization runs, we hypothesize that the key for the multiyear SST predictability over the extratropical North Atlantic is the anomalous ocean circulation developing with a time delay in response to NAO-related wind stress pattern, with the caveat that we cannot rule out a significant contribution from the mean flow advection of temperature anomalies. Furthermore, a cycle in the North Atlantic climate system with a period of 8 yrs is suggested by the ensemble mean evolution derived from the initialization runs.
A regression analysis from the ensemble mean hindcasts is performed next (Figure 4), similar to that shown in Figure 3. Annual means, regardless of the initialization month, are considered. All variables are regressed onto the SST index obtained from the 7-yr hindcast (black line in Figure 2a), and the lags are numbered

10.1029/2020GL087031
Geophysical Research Letters relative to Prediction Year 7. Lag −6 yr, for example, refers to the regression of the 1-yr hindcast onto the 7-yr hindcast. We note that the anomalies derived from the hindcasts' ensemble mean are considerably weaker in comparison to the initialization runs' ensemble mean. The lag 0-yr regressions (lag 7-yr hindcasts; Figure 4, bottom panels) of the oceanic variables, with the exception of the overturning stream function, are similar to those from the initialization runs (Figure 3, bottom panels). In particular, the heat flux provides a damping of the SST anomalies, again arguing for ocean dynamics as the main driver of the SST changes. The evolution of the SST anomalies (from lag −6 yr to lag 0 yr) also is similar to that derived from the initialization runs. The wind stress regression patterns between lag −6 yr and lag −2 yr depict similar cyclic behavior as the initialization runs. However, the regression coefficients are relatively small. Similarly, the model fails to reproduce the sea level pressure (SLP) patterns (cf. Figures S3b and S4b with Figure S3c). Despite the role of the overturning stream function for the evolution of the extratropical North Atlantic SSTs in the initialization runs (Figure 3), its contribution to the mechanism that provides predictive skill in the hindcasts appears to be limited (Figure 4). The positive overturning anomaly at lag −2 yr in the hindcasts is much weaker than that observed in the initialization runs. At the other lags, the hindcast-overturning anomalies strongly differ from those in the initialization runs. Despite these discrepancies, the general structure and amplitude of the heat content anomalies look remarkably similar, suggesting that the overturning does not play a central role in providing the predictive skill in the hindcasts.
The wind stress initialization seems to force ocean-gyre dynamics in the hindcasts similar to that suggested by the initialization runs: Slowly developing ocean (gyre) circulation and heat content anomalies lead extratropical North Atlantic SSTs. The anomalous ocean velocities integrated over the upper 700 m in the initialization runs and hindcasts are similar ( Figures S3a and S4a, arrows). They are consistent with the barotropic stream function and are hypothesized to drive anomalous heat advection and accumulation in the presence of mean state temperature gradients ( Figures S3a and S4a, color shading). Again, an important contribution from mean flow advection to the SST tendencies cannot be ruled out.

Discussion and Conclusions
This study shows that prescribing reanalysis wind stress anomalies (added to the model's climatology) in the KCM, next to the skill arising from long-term trends, is an additional source for skillful forecasts of the extratropical North Atlantic SST at lead times of several years. The wind stress-induced skill in the KCM is linked to upper ocean heat content anomalies that appear to be associated with NAO-related wind-driven ocean circulation (gyre) changes. Furthermore, there is a propagation of the heat content anomalies along the Gulf Stream extension, which may in part be due to the advection of upper ocean temperature anomalies by the mean ocean circulation. Our results suggest a quasi-cyclic mechanism with approximately 7-yr period, which is connected to basin-wide wind stress anomalies across the North Atlantic ( Figure S5). This type of subdecadal variability was previously analyzed in Martin et al. (2019) and Reintges et al. (2017). The subdecadal periodicity is also present in the observed wind stress, which drives SST variability of the same periodicity with a time lag of about 2 yrs. This subdecadal cycle in the coupled atmosphere-ocean system is at the heart of the multiyear SST predictability over the extratropical North Atlantic through an adjustment of the oceanic gyre circulation. Analyzing a similar experimental setup in other climate models would be beneficial.
Despite its potential role in the forced initialization runs in which anomalous Ekman transports associated with the NAO might modulate the meridional overturning circulation, the hindcast runs do not provide evidence supporting a major contribution of the meridional overturning circulation to the multiyear predictive skill for the extratropical North Atlantic SST. We note in this context, that the experimental setup applied in this study does not take into account the observed surface buoyancy forcing and the ocean density structure in the initialization of the hindcasts. Matei et al. (2012), for example, initialized their hindcasts with reanalysis data of not only wind stress but also surface heat and freshwater fluxes. In their analysis, the AMOC at 30°N and associated meridional heat transport is presented as the major source of multiyear subpolar North Atlantic SST predictability. Changes in the meridional heat transport as a source for multiyear skill in the North Atlantic have also been suggested by Borchert et al. (2019Borchert et al. ( , 2018. The lack of observed heat and freshwater fluxes in the initialization of the KCM might reduce the predictive skill in our hindcasts, for example, through a too weak overturning circulation and Gulf Stream response. In our study, however, the overturning circulation can be affected by changes in the gyre circulation through salinity anomalies propagating into the deep-convection region. This mechanism operates on multidecadal time scales in the KCM and in other models (Ba et al., 2013;Delworth et al., 1993).
The focus of this study is the multiyear predictability of extratropical North Atlantic SST. There also is multiyear SST predictability in other regions: in the tropical Pacific (Years 6-9 in Figure S2), where the skill pattern is horseshoe like and considerably differs from the ENSO patterns. It is reminiscent of the observed early 1990s SST anomalies investigated by Latif et al. (1997) and might be of interest in future work.