Skillful long-range prediction of European and North American winters
Abstract
Until recently, long-range forecast systems showed only modest levels of skill in predicting surface winter climate around the Atlantic Basin and associated fluctuations in the North Atlantic Oscillation at seasonal lead times. Here we use a new forecast system to assess seasonal predictability of winter North Atlantic climate. We demonstrate that key aspects of European and North American winter climate and the surface North Atlantic Oscillation are highly predictable months ahead. We demonstrate high levels of prediction skill in retrospective forecasts of the surface North Atlantic Oscillation, winter storminess, near-surface temperature, and wind speed, all of which have high value for planning and adaptation to extreme winter conditions. Analysis of forecast ensembles suggests that while useful levels of seasonal forecast skill have now been achieved, key sources of predictability are still only partially represented and there is further untapped predictability.
1 Introduction
Despite recent advances in weather and climate forecasting, skillful predictions of year to year fluctuations in winter North Atlantic Oscillation [Walker and Bliss, 1932] and associated changes in weather at lead times of months have until recently been elusive [Johansson, 2007; Kim et al., 2012; Smith et al., 2012]. This is because climate models have shown little extratropical atmospheric circulation response to slowly varying components of the climate system such as the ocean [Kushnir et al., 2006], which might otherwise provide long-range predictability. As a result, while many state-of-the-art seasonal forecast systems show significant predictability for tropical climate, only low forecast skill is generally found in the extratropics [Arribas et al., 2011; Kim et al., 2012]. This has led to the conclusion that little predictability may exist for key extratropical events such as extreme winters [Jung et al., 2011]. However, climate models are imperfect and predictability could well be underrepresented. Indeed, past forecast systems have occasionally shown signs of skill in extratropical circulation [Palmer et al., 2004; Müller et al., 2005], and encouraging levels of skill for the Arctic Oscillation were recently reported by Riddle et al. [2013]. An improvement in long-range forecasting of the extratropics would generate enormous benefit to society, as it would allow planning in highly populated regions of the Northern Hemisphere for the risk of severe winter weather, including, for example, winter wind storms [Renggli et al., 2011] and disruption to transport networks [Palin et al., 2013].
The single most important factor for year to year fluctuations in the seasonal climate around the Atlantic Basin is the state of the North Atlantic Oscillation (NAO) and its hemispheric equivalent, the Arctic Oscillation. Year to year variability in the NAO describes the state of the Atlantic jet stream and is directly related to near-surface winds and hence winter temperatures (through advection) across North America, Europe, and other regions around the Atlantic Basin. We present estimates of the predictability of the surface NAO and winter climate from the Met Office seasonal forecast system Global Seasonal forecast System 5 (GloSea5) which has high ocean resolution, a comprehensive representation of the stratosphere, and interactive sea ice physics, all of which mediate predictable teleconnections to the North Atlantic as shown below.
2 Predictability of the North Atlantic Oscillation
The forecasts used here were produced using the Met Office Global Seasonal forecast System 5 (GloSea5). The climate model at the core of this forecast system is Hadley Centre Global Environmental Model version 3 with atmospheric resolution of 0.83° longitude by 0.55° latitude, 85 quasi-horizontal atmospheric levels, and an upper boundary at 85 km near the mesopause. The ocean resolution is 0.25° globally in both latitude and longitude with 75 quasi-horizontal levels. This resolution is necessary to reduce key biases in the ocean and atmosphere and give a realistic winter blocking climatology in the model [Scaife et al., 2011]. A 24-member ensemble of forecasts was run for each winter in the period 1993 to 2012 with lagged start dates centered on 1 November (25 October, 1 November, and 9 November) and eight members initialized on each of the three start dates. Members from the same start date differ only by stochastic physics [Arribas et al., 2011]. Initial atmospheric and land surface data were taken from ECMWF Re-Analysis (ERA)-Interim observational reanalyses, and initial conditions for the global ocean and sea ice concentration were from the Forecasting Ocean Assimilation Model (FOAM) system [Blockley et al., 2013]. This configuration allows very skillful predictions of various slowly varying components of the climate system to be made for the coming winter (Table S1).
Figure 1 shows the skill of predicting the year to year fluctuations in the winter surface NAO (difference in sea level pressure between Iceland and Azores) at lead times of 1 to 4 months, well beyond weather forecast time scales. The resulting correlation coefficient between the ensemble average of 24 forecast members per winter and the observed surface NAO is 0.62 in GloSea5. This is statistically significant at the 99% level of confidence (using a t test and allowing for the small lagged autocorrelation in model and observations). It confirms potential predictability hinted at in statistical studies [Folland et al., 2012; Cohen and Jones, 2011] and atmospheric simulations with prescribed ocean conditions [Rodwell et al., 1999; Mehta et al., 2000; Bretherton and Battisti, 2000] and supports recent results for the Arctic Oscillation [Riddle et al., 2013], using a seasonal forecast system based on first physical principles. The value achieved here greatly exceeds persistence forecast skill (0.15) and suggests that useful levels of seasonal forecast skill for the surface NAO can be achieved in operational dynamical forecast systems. Our result is also insensitive to the details of the model or the hindcast. For example, a repeat hindcast using a new dynamical core [Walters et al., 2013] resulted in a similar correlation score of 0.6, as did removal of individual strong and predictable NAO winters such as 2009/2010 [Fereday et al., 2012] or the poorly predicted winter of 2004/2005 (Figure S1 in the supporting information). None of these changes reduces the significance of the correlation below the 95% level. 
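The skill measure described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' code: it correlates the ensemble mean with the observed index and computes a t statistic on a reduced number of degrees of freedom. The text does not say how the lagged autocorrelation was allowed for, so the effective sample size formula used here (a common lag-1 adjustment) is an assumption, as are the function names and the choice to cap the effective sample size at the number of winters.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence about its own mean."""
    n = len(x)
    mx = sum(x) / n
    d = [a - mx for a in x]
    return sum(d[i] * d[i + 1] for i in range(n - 1)) / sum(v * v for v in d)

def ensemble_mean_skill(forecasts, observed):
    """Correlation skill of the ensemble mean.

    forecasts: one list of member values per winter (e.g. 24 NAO values).
    observed:  one observed value per winter.
    Returns (r, n_eff, t_stat). Compare t_stat with a Student t critical
    value on n_eff - 2 degrees of freedom to judge significance.
    """
    ens_mean = [sum(members) / len(members) for members in forecasts]
    r = pearson(ens_mean, observed)
    n = len(observed)
    # ASSUMPTION: lag-1 effective sample size adjustment, capped at n.
    a = lag1_autocorr(ens_mean) * lag1_autocorr(observed)
    n_eff = min(n, n * (1 - a) / (1 + a))
    t_stat = r * math.sqrt((n_eff - 2) / (1 - r * r))
    return r, n_eff, t_stat
```

With 20 winters of 24 members each, `forecasts` would be a list of 20 lists of length 24; the function is undefined for a perfect correlation of exactly 1 or -1.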
Note also that the forecast skill in our system arises largely from interannual variability rather than trends or low-frequency variability, as differences in the surface NAO from 1 year to the next are skillfully predicted with a correlation coefficient of 0.46 which is also significant at the 95% level, particularly for years which project strongly onto the NAO (Figure S1). As a further check we also calculated probabilistic skill and reliability scores. Relative operating characteristic scores [World Meteorological Organization (WMO), 1992] for lower tercile winter temperatures are 0.70 for the Northern Hemisphere (20°–90°N), 0.77 for North America (50°–165°W, 10°–85°N), and 0.65 for Europe (30°W–40°E, 30°–80°N), with high levels of reliability for all of these regions.
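As an illustration of the probabilistic check, a relative operating characteristic (ROC) score can be computed as the area under the ROC curve, which equals the probability that a winter in which the event occurred received a higher forecast probability than a winter in which it did not. The sketch below uses that rank-based form; the function names, and the use of the fraction of ensemble members below a threshold as the forecast probability, are assumptions for illustration rather than the procedure used in the paper.

```python
def lower_tercile_probability(members, threshold):
    """Forecast probability of the event: the fraction of ensemble
    members falling below the lower-tercile threshold (ASSUMED method)."""
    return sum(1 for m in members if m < threshold) / len(members)

def roc_area(probabilities, events):
    """Area under the ROC curve from forecast probabilities and
    observed event flags, using the rank (Mann-Whitney) form.
    Ties between an event and a non-event winter count one half."""
    hits = [p for p, e in zip(probabilities, events) if e]
    misses = [p for p, e in zip(probabilities, events) if not e]
    if not hits or not misses:
        raise ValueError("need at least one event and one non-event")
    wins = sum(
        1.0 if h > m else 0.5 if h == m else 0.0
        for h in hits
        for m in misses
    )
    return wins / (len(hits) * len(misses))
```

A score of 0.5 corresponds to no discrimination between event and non-event winters and 1.0 to perfect discrimination.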

3 Sources of Predictability
While we cannot assess causality without additional experiments, which are beyond the scope of this study, digging deeper into the forecasts reveals several potential sources of predictable signals (Table S1 in the supporting information). One source of predictability originates in the tropical Pacific. Previous studies have shown that the El Niño–Southern Oscillation can drive interannual variations in the NAO [Brönnimann et al., 2007] and hence Atlantic and European winter climate via the stratosphere [Bell et al., 2009]. Figures 2b and 2c confirm that this teleconnection to the tropical Pacific is active in our experiments, with forecasts initialized in El Niño/La Niña conditions in November tending to be followed by negative/positive NAO conditions in winter. Established mechanisms [Bell et al., 2009] operate in the forecasts, with deep easterly anomalies occurring in the extratropical jet stream after descending from the stratosphere in midwinter (Figure 2a).

Previous studies also identify precursors to the NAO in North Atlantic Ocean temperatures [Rodwell et al., 1999; Frankignoul, 1985; Rodwell and Folland, 2002]. By selecting forecasts in years with a warm or cold North Atlantic subpolar gyre in November, we can examine the resulting winter signal in the atmospheric circulation. Forecasts starting from cold/warm North Atlantic states also result in winter predictions with more positive/negative NAO (Figures 2d–2f), although pattern correlation is low in this case. Note that although the ENSO signal is of reasonable strength, in many of the cases of predictability in Figure 2, forecasts show evidence of the same mechanisms and patterns operating as in the real world but with a weaker signal; we return to this later.
Our third teleconnection to the NAO arises from the initialization of Arctic sea ice, particularly in the Kara Sea to the north of Europe. Interannual variability of sea ice and hence surface temperature is large here and has previously been connected to the generation of large-scale circulation anomalies [Cohen and Jones, 2011; Yang and Christensen, 2012]. Figures 2h and 2i show the association between sea ice anomalies in this region in November and the subsequent winter circulation in forecasts and observations. As identified in other studies [Yang and Christensen, 2012], low/high sea ice concentrations in the Kara Sea in November precede negative/positive NAO anomalies, with anomalous pressure gradients over northernmost Europe and the East Atlantic.
Our final teleconnection to the NAO arises from the quasi-biennial oscillation (QBO) in the tropical lower stratosphere. Interannual variability between westerly and easterly phases of the QBO has long been known to influence the troposphere in the Atlantic sector [Ebdon, 1975] in the sense shown in Figure 2, with westerly QBO being associated with a stronger extratropical jet, particularly in early winter [Pascoe et al., 2005]. Figures 2j–2l again show a similar but weaker forecast signal.
4 Anomalous Signal-to-Noise Ratio
Despite the reproduction of known teleconnection patterns, it is clear from Figure 2 that the amplitude of signals in the forecasts is smaller than in observations. Similarly, while the ensemble mean signal in these forecasts correlates well with the observed NAO (corr = 0.62), the signal-to-noise ratio, defined as the ensemble mean standard deviation divided by the total ensemble member standard deviation [Kumar, 2009], is low (s = 0.2). Despite this, the variability of the NAO from individual forecast members agrees well with observed variability and is around 8 hPa, so it is only the ensemble mean signal and not the variability of ensemble members that is too small. This presents something of a puzzle because for a perfect forecast system the expected signal-to-noise ratio and the correlation are directly related. Indeed, given the ensemble mean forecast correlation of 0.62, we would expect a signal-to-noise ratio much higher than found here [Kumar, 2009] (Figure 2). The answer lies in the weak signals in the forecast system (Figure 2), which result in the correlation between individual forecast members and the observations being several times higher than correlations between pairs of forecasts, a result similar to that found in atmosphere-only experiments [Mehta et al., 2000]. In summary, individual forecast members contain weaker predictable signals than the observations.
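The two diagnostics in this argument can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the "total ensemble member standard deviation" is taken to be the standard deviation over all members and all winters pooled together, and members are paired across winters by their position in the ensemble, which assumes they are interchangeable. All function names are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def std(values):
    """Population standard deviation."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / n)

def signal_to_noise(forecasts):
    """Ensemble mean standard deviation divided by the standard
    deviation of all members pooled over all winters (ASSUMED reading
    of the definition). forecasts[y][m] is member m in winter y."""
    ens_mean = [sum(row) / len(row) for row in forecasts]
    pooled = [v for row in forecasts for v in row]
    return std(ens_mean) / std(pooled)

def member_correlations(forecasts, observed):
    """Average member-observation correlation and average correlation
    between pairs of members, treating column m as one time series."""
    n_members = len(forecasts[0])
    series = [[row[m] for row in forecasts] for m in range(n_members)]
    r_obs = [pearson(s, observed) for s in series]
    r_pair = [pearson(a, b) for a, b in combinations(series, 2)]
    return sum(r_obs) / len(r_obs), sum(r_pair) / len(r_pair)
```

In a system whose members are statistically indistinguishable from the observations, the two averages returned by `member_correlations` would be expected to be similar; a member-observation correlation well above the member-member correlation is the signature described in the text.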
Despite the high skill in predicting extratropical winter climate, low signal-to-noise ratios mean that large forecast ensembles are still needed to achieve a given skill. This is illustrated by systematically sampling subsets of forecasts from the full ensemble of 24 members (Figure 3). Ensemble mean prediction skill for the NAO increases with the number of forecast members and is still increasing, albeit more slowly, as the full size of our ensemble is approached. Just this scenario has been previously examined from a statistical viewpoint [Murphy, 1990], and the skill limit of an infinite-sized ensemble depends only on the average correlation between pairs of forecast members and the average correlation between forecast members and observations. This limit exceeds 0.8 for the NAO in our system. Along with improvements in the modeled signal strength, increased ensemble size could therefore lead to further increases in seasonal forecast skill for the extratropics.
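The dependence of skill on ensemble size can be written down under simple assumptions. If all members have equal variance, an average correlation r_mo with the observations, and an average correlation r_mm with each other, then the correlation of an n-member ensemble mean with the observations is r_mo * sqrt(n / (1 + (n - 1) * r_mm)), which tends to r_mo / sqrt(r_mm) as n grows. The sketch below implements this standard result; the function names are hypothetical, and the numbers in the usage note are illustrative rather than values from the hindcast.

```python
import math

def ensemble_mean_correlation(r_mo, r_mm, n_members):
    """Expected correlation between an n-member ensemble mean and the
    observations, assuming equal-variance members with average
    member-observation correlation r_mo and average member-member
    correlation r_mm (r_mm > 0)."""
    return r_mo * math.sqrt(n_members / (1 + (n_members - 1) * r_mm))

def infinite_ensemble_limit(r_mo, r_mm):
    """Limit of ensemble_mean_correlation as n_members goes to infinity."""
    return r_mo / math.sqrt(r_mm)
```

For example, with illustrative values r_mo = 0.2 and r_mm = 0.1, a single member has an expected correlation of 0.2, and the expected correlation rises with ensemble size toward a limit of about 0.63. The smaller r_mm is relative to r_mo, the more slowly the curve saturates, which is why low signal-to-noise ratios call for large ensembles.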

5 Discussion: Implications for Regional Prediction
The NAO governs many aspects of European and North American winter weather, and predictability of the NAO therefore leads to similarly skillful predictions of surface winter climate (Figure 4). For example, the risk of damaging winter wind storms is highly relevant to the insurance sector [Renggli et al., 2011], and this quantity can be predicted with high levels of skill across northern Europe and large areas of North America (Figure 4a). Similarly, winter temperatures have impacts on energy pricing and can disrupt transport networks [Palin et al., 2013] but show predictability across large areas around the Atlantic Basin in our seasonal forecasts (Figure 4c). Finally, also related to atmospheric circulation, skillful prediction of near-surface winter wind speeds is demonstrated, again across large areas of Europe and North America (Figure 4e). This quantity is increasingly important as it governs year to year variations in the supply of wind-generated renewable energy. While Figures 4a, 4c, and 4e show large areas of significant correlation skill between the forecasts and observed historical conditions, there are patches of low skill for some fields in regions known to be affected by the NAO, such as temperature in northern Europe, which may arise due to imperfect model teleconnections. It is therefore interesting to ask how well the forecast NAO alone would serve as a proxy for regional prediction. Using only the forecast NAO (Figures 4b, 4d, and 4f) suggests that much of the skill in our forecasts arises from the prediction of the NAO alone. For example, the small regions where storminess is poorly predicted in Figure 4a coincide with regions where the NAO influence is weak (Figure 4b). Furthermore, while skill in North America arises from ENSO and the NAO, for regions such as Europe where the NAO dominates, simply using the forecast NAO may actually improve regional predictions (Figure 4d).
Using either methodology, and assuming that the recent 20 year period is representative of coming years, predictions from this system could allow plans to be made months ahead for the risk of key weather-related impacts on society.

Acknowledgments
This work was supported by the Joint DECC/Defra Met Office Hadley Centre Climate Programme (GA01101), the UK Public Weather Service research program, and the European Union Framework 7 SPECS project. Leon Hermanson was funded as part of his Research Fellowship by Willis as part of Willis Research Network (WRN). We acknowledge discussions with Warwick Norton, Dan Rowlands, Ed Hawkins, Tim Palmer, and Rowan Sutton when comparing these results to other seasonal forecasts and assessing predictability.
The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.





