Which orographic scales matter most for medium-range forecast skill in the Northern Hemisphere winter?

It is generally accepted that increased horizontal resolution improves the representation of atmospheric circulation in global weather and climate models. Understanding which processes contribute towards this improvement can help to focus future model development efforts. In this study, a set of ten-day global weather forecasts, performed with different atmospheric and orographic resolutions ranging from 180km to 9km, is used to examine the impacts of resolving increasingly smaller orographic scales on the forecast skill of the Northern Hemisphere (NH) winter circulation. These experiments aim to answer two main questions: what is the relative contribution from increases in atmospheric versus orographic resolution to the overall improvement in the NH winter medium-range forecast skill obtained when increasing the horizontal resolution?; and how do different orographic scales affect different scales of the atmospheric flow? For experiments in which the subgrid-scale orography parametrizations are turned off, increases in orographic resolution are responsible for almost all of the increase in skill within the troposphere. In the stratosphere, higher atmospheric resolution also contributes to skill improvements, likely due to a better representation of gravity wave propagation and breaking. All scales of orography considered here are found to be important for the obtained changes in the circulation and appear to rapidly affect all considered scales of the flow. In experiments in which the subgrid-scale orography parametrizations are turned on, the benefits of increasing the horizontal resolution decrease, but do not entirely disappear, suggesting that these parametrizations are not perfect substitutes for the unresolved orography.


Introduction
The skill of weather forecasts has improved dramatically over the past decades, with the accuracy of global medium-range weather forecasts increasing by approximately one day per decade (Simmons & Hollingsworth, 2002). In other words, current forecasts of key measures of the large-scale atmospheric circulation, such as the geopotential height at 500hPa in the extra-tropics, for six days ahead are as accurate as forecasts five days ahead were ten years ago (see Figure 1 of Bauer et al. (2015)). These remarkable advances in Numerical Weather Prediction (NWP) represent a quiet revolution because they have resulted from a steady accumulation of advances in scientific understanding (e.g. numerical techniques, parametrizations of physical processes, data assimilation methodologies), utilisation of observations and supercomputing capacities and technologies (Bauer et al., 2015). Increases in model resolution, which have become affordable due to enhanced supercomputing facilities, are among the key factors contributing to this increase in forecast skill. Both the horizontal and vertical resolution of global NWP models have significantly increased over the past decades. Twenty years ago, the operational global ten-day weather forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF) were performed at a horizontal resolution of approximately 62km and a vertical resolution of 31 levels between the surface and 10hPa, while today they are performed at approximately 9km and with 137 levels between the surface and 0.01hPa.
Increases in horizontal resolution obviously imply concomitant increases in atmospheric and orographic resolution. The term "atmospheric resolution" is used here in a broad sense to refer to the horizontal resolution at which the atmosphere and its lower boundary condition (land-surface, sea-ice and ocean characteristics) are discretized. The term "orographic resolution" refers to the resolution of the grid-box mean (or resolved) orography. Forecast skill increases with horizontal resolution because numerical truncation errors made when solving the equations describing the atmospheric flow and the surface gradients are reduced, thus leading to better resolved dynamical and physical processes, and a lesser need to parametrize processes taking place at scales smaller than the model grid box. This applies particularly to orography and its effects on the flow, which are better resolved at high resolution and, thus, require less parametrization.
Orography affects the atmospheric circulation through a variety of processes and on a wide range of temporal and spatial scales (Smith, 1979), and is known to be one of the key factors controlling the Northern Hemisphere (NH) circulation during winter (Held et al., 2002). This is, in part, due to the direct forcing exerted by the significant orographic barriers of the NH (e.g. Rocky Mountains, Himalayas) on the stationary planetary waves, which shape the zonally asymmetric circulation (Charney & Eliassen, 1949).
The impact of orography on the large-scale circulation is also due to cumulative impacts of small-scale orographic processes, such as turbulent orographic form drag (Wood et al., 2001; Beljaars et al., 2004), low-level blocking of the flow (Lott & Miller, 1997) or breaking of orographically generated gravity waves (Palmer et al., 1986; McFarlane, 1987). For example, small-scale orographic gravity waves are known to impart a drag force on the large-scale flow, particularly when they become unstable and break. As a result of the multi-scale nature of orography and its impacts, orographic processes are only resolved in part by numerical models, and processes occurring at scales smaller than the typical resolution of global climate (O(100km)) or NWP models (O(10km)) need to be parametrized.
To date, the representation of unresolved orographic processes in numerical models remains a major challenge (Sandu et al., 2019).
In this study, we aim to quantify the contribution of increased orographic resolution versus increased atmospheric resolution to the gain in medium-range weather forecast skill obtained when increasing the horizontal resolution from O(100km) to O(10km).
Our focus is on the large-scale NH winter-time circulation, since this is when and where the orographic effects on the large-scale atmospheric circulation are maximized, due to the presence of the significant orographic barriers of the NH (Rocky Mountains, Himalayas), of a strong westerly jet and of high static stability near the surface. To this end, we perform global ten-day forecast experiments for January 2015 with the ECMWF Integrated Forecasting System (IFS) at different atmospheric and orographic resolutions. The resolutions considered here range from those typical of climate models (∼180km) to those typical of current global NWP models (∼9km). A series of experiments is performed in which only the atmospheric resolution is increased while the orographic resolution is unchanged or, conversely, the atmospheric resolution is unchanged while the orographic resolution is increased. The concept is similar to the experiments carried out by Jung et al. (2012) in the Athena project but, while Jung et al. (2012) investigated the impact of horizontal (and orographic) resolution on predictions at seasonal and climate timescales, we focus here on medium-range forecasts and on the NH winter circulation.
Other studies have demonstrated the benefits of increased resolved orography for the representation of atmospheric blocking (Berckmans et al., 2013) and precipitation (Schiemann et al., 2018) at climate timescales, as well as the benefits of increased atmospheric resolution for the representation of the mid-latitude jet position and strength in multi-year simulations (Lu et al., 2015). Questions pertaining to convergence with resolution, relevant orographic scales and impacts on medium-range forecast skill remain largely unanswered, however. By design, our experiments allow us to address the following questions: (i) what is the relative contribution from increases in orographic versus atmospheric resolution towards the improvements in medium-range forecast skill of the NH winter circulation obtained when the horizontal resolution is increased from 180 to 9km?; (ii) how are the different scales in the atmospheric circulation affected by varying the resolved orographic scales?; and, finally, (iii) are current parametrizations, designed to represent unresolved orographic processes, able to reproduce the improvement in medium-range forecast skill obtained when increasing the orographic resolution?
To explore questions (i) and (ii), the parametrizations used to account for unresolved orographic effects in the IFS are switched off in our ten-day forecast experiments with various atmospheric and orographic resolutions. Unresolved orographic effects are accounted for in the IFS through two parametrization schemes: the turbulent orographic form drag (TOFD) scheme (Beljaars et al., 2004), which represents effects of orographic features with horizontal scales smaller than ∼5km; and the subgrid-scale orography (SSO) scheme, which accounts for low-level flow blocking and orographic gravity waves (Lott & Miller, 1997) and represents orographic effects of features with horizontal scales between 5km and the model grid box. Each of these parametrizations acts to decelerate the winds at various levels of the atmosphere due to turbulent form drag, low-level flow blocking or gravity-wave breaking. Given that the TOFD scheme represents effects of orographic features with horizontal scales smaller than 5km, its impact should, in theory, be fairly resolution independent across the range of resolutions considered here (180km to 9km). On the other hand, the impact of the SSO scheme should decrease when the horizontal resolution increases, as a result of increasingly smaller orographic scales becoming resolved. However, the strength of the parametrized drag strongly depends on the mean wind speed which is, in turn, affected by these two schemes as well as by the resolved dynamics and other subgrid processes (e.g. turbulent mixing). The two orographic drag schemes, hence, interact with each other, with other schemes (Sandu et al., 2013) and with the resolved dynamics (van Niekerk et al., 2018). Given that these interactions may well be resolution dependent, it is easier to examine the impacts of atmospheric and orographic resolutions on the NH winter circulation in a set-up in which the SSO and TOFD schemes are switched off.
To answer question (iii), we also perform ten-day forecast experiments with the SSO and TOFD schemes switched on for a few selected resolutions. These experiments allow us to examine the extent to which the gap in large-scale forecast skill between low and high resolution experiments can be closed through the introduction of parametrized orographic drag. Ideally, if the orographic drag parametrizations were perfect, they would account for unresolved orographic effects at all resolutions and the skill gained by increasing the orographic resolution would be negligible. In practice, however, parametrizations of orographic effects are known to be uncertain and poorly constrained (Zadra, 2013; Sandu et al., 2016, 2019). They also behave inconsistently across resolutions, in the sense that they do not accurately account for the handover between resolved and parametrized orographic drag as the resolution is varied (Brown, 2004; Vosper, 2015; van Niekerk et al., 2016). The experiments with and without the SSO and TOFD schemes also allow us to assess how the two parametrizations behave, and interact with each other, at various horizontal resolutions. To our knowledge, this exercise has never been carried out before and can provide some insight into the orographic processes responsible for the increase in skill obtained when increasing the orographic resolution.
The paper is structured as follows. The model setup and the set of experiments with varying atmospheric and orographic resolutions are described in Section 2. We first focus on the experiments in which the SSO and TOFD schemes are switched off. The impact of atmospheric and orographic resolution on the NH winter circulation is analysed in Sections 3 and 4, respectively. The impacts of different orographic scales on the atmospheric flow are also investigated in Section 4. Section 5 then compares the increase in large-scale forecast skill in the experiments with and without the SSO and TOFD schemes, and discusses the impacts of the two schemes as horizontal resolution is varied. Conclusions are drawn in Section 6.

Forecasts with varying atmospheric and orographic resolutions
To explore the role of orography and of its various scales on the forecast skill of the NH winter circulation, a set of forecasts with varying atmospheric and orographic resolutions is performed with a recent version of the ECMWF IFS (cycle 43r1, operational between Nov 2016 and July 2017). The IFS has a spectral hydrostatic dynamical core with semi-Lagrangian advection and semi-implicit time integration schemes, and uses a comprehensive set of physical parametrizations described in detail in the IFS Documentation (2016). For the reasons detailed in the introduction, the SSO and TOFD schemes are switched off in all but a few selected experiments discussed in Section 5.
The resolutions considered are TCo63 (180km), TCo159 (72km), TCo319 (36km), TCo639 (18km), and TCo1279 (9km). TCon denotes a triangular spectral truncation with a maximum wavenumber n (n is often referred to as the spectral truncation number), paired with a cubic octahedral reduced Gaussian grid (Wedi, 2014; Malardel et al., 2016). The wavenumber n indicates how many of the characteristic horizontal wavelengths are needed to go around the globe at the Equator. With a cubic grid, the shortest wave is described by 4 grid points. Similarly, TLn denotes a triangular spectral truncation with a maximum wavenumber n, paired with a linear reduced Gaussian grid (Côté & Staniforth, 1988), which represents the shortest wave by 2 grid points. Thus both TCo1279 and TL1279 represent the same number of waves in spectral space, but the grid point distance of TCo1279 is approximately 2 times smaller than that of TL1279.
Across this range of resolutions (180 to 9km), we perform forecast experiments with various combinations of atmospheric and orographic resolutions, as depicted in Figure 1. Each experiment consists of 31 forecasts starting daily at 00UTC during January 2015, from the operational ECMWF analysis at TL1279 resolution (16km). This period was selected for consistency with the runs performed in the Global Atmospheric System Studies (GASS)/Working Group for Numerical Experimentation (WGNE) Constraining ORographic Drag Effects (COORDE) intercomparison project, which uses high resolution simulations to constrain low-level blocking and orographic gravity wave drag effects. For most combinations of atmospheric and orographic resolutions, we run ten-day forecasts which are then used to explore the impacts on the NH winter circulation (filled circles in Figure 1). For some combinations, however, we only perform 24-hour integrations (dashed circles in Figure 1) to explore the impact of atmospheric resolution on the resolved orographic torque (see Section 3).
The various experiments in Figure 1 are referred to hereafter using the notation An/On, where An and On represent the atmospheric and orographic resolutions, respectively, and n is the spectral truncation number of the respective resolution. For example, A1279/O1279 refers to an experiment with an atmospheric horizontal resolution of TCo1279 and a grid-box mean orography of TCo1279 (i.e. both the atmospheric and orographic resolutions are ∼9km, which is the resolution of operational ECMWF ten-day forecasts since May 2016). A1279/O319, therefore, refers to an experiment at TCo1279 atmospheric resolution, but with a lower resolution (TCo319) grid-box mean orography (simply referred to as orography hereafter).
The IFS requires both a spectral and a grid point representation of the orography, which need to be at the atmospheric spectral resolution and on the corresponding cubic octahedral grid, respectively. The spectral orography used in the A1279/O319 experiment is derived from the TCo319 orography, by setting the coefficients of the wavenumbers from 320 to 1279 to zero. This spectral orography is then used to derive the grid point TCo1279 orography through inverse spectral transformation. This procedure for deriving the orography fields is applied in all experiments in which On is coarser than An.
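The relation between truncation number and grid spacing described above can be sketched with a few lines of arithmetic. This is a rough equatorial estimate only: the circumference constant and the function name are assumptions made for illustration, and the result does not exactly reproduce the nominal resolutions quoted in the text (e.g. 9km for TCo1279), which are not derived this way.

```python
EQUATOR_KM = 40075.0  # approximate equatorial circumference (assumed constant)


def equatorial_spacing_km(n, points_per_wave):
    """Rough grid spacing at the Equator for spectral truncation number n.

    The shortest retained wave has wavelength EQUATOR_KM / n and is sampled
    by `points_per_wave` grid points: 4 on a cubic grid, 2 on a linear grid.
    """
    return EQUATOR_KM / (points_per_wave * n)


for n in (63, 159, 319, 639, 1279):
    cubic = equatorial_spacing_km(n, 4)
    linear = equatorial_spacing_km(n, 2)
    print(f"n={n}: cubic ~{cubic:.1f} km, linear ~{linear:.1f} km")
```

For any given n the linear spacing is exactly twice the cubic one in this estimate, which is the factor of 2 between TL1279 and TCo1279 mentioned above.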
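The truncation procedure can be illustrated with a one-dimensional Fourier analogue. This is only a sketch: the IFS works with spherical harmonics on the sphere, whereas the example below uses a periodic 1D profile, a NumPy FFT and a hypothetical function name, purely to show the "zero the high wavenumbers, then transform back" step.

```python
import numpy as np


def truncate_orography(h, n_max):
    """Zero all Fourier coefficients above wavenumber n_max, then invert.

    1D stand-in for deriving a coarse orography on a fine grid: the field
    stays on the fine grid but only contains scales up to n_max.
    """
    coeffs = np.fft.rfft(h)
    coeffs[n_max + 1:] = 0.0          # discard wavenumbers n_max+1 and above
    return np.fft.irfft(coeffs, n=h.size)


# Synthetic periodic "orography": one large-scale and one small-scale wave
x = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
h_fine = 1000.0 * np.cos(3 * x) + 200.0 * np.cos(40 * x)

# Keep wavenumbers 0..10 only; the wavenumber-40 component is removed
h_coarse = truncate_orography(h_fine, n_max=10)
```

The output has the same number of grid points as the input, mirroring how the O319 orography is represented on the TCo1279 grid in the A1279/O319 experiment.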
The experiments on the diagonal of Figure 1 are those in which the atmosphere and orography are at the same resolution (An = On). Increasing the orographic resolution means that increasingly smaller scales of the orography are represented. Indeed, as illustrated across the TCo63 to TCo1279 resolution range in Figures 2 and 3, the higher the spectral truncation wavenumber, the more of the small-scale orographic variance of the original 1-km orography dataset (i.e. TCo7999) is captured. Small-scale features of the grid-box mean orography are normally filtered out in numerical models to ensure numerical stability. A recent model intercomparison of orographic fields (Elvidge et al., 2019) has revealed that the IFS has the least filtered representation of the mean orography among major global NWP models. Indeed, only a sharp filter is applied when creating the spectral orography at each On resolution from the original 1-km orography dataset in order to mitigate the Gibbs phenomenon associated with steep orography (see IFS Documentation (2016) for more details). This is in part due to the fact that the cubic discretization can stably support an orography with more variance in the small scales than, e.g., a linear discretization (full versus dashed orange line in Figure 2), therefore providing the same spectral representation as that obtained from the original 1-km orography dataset for all wavenumbers almost up to the spectral truncation number (Wedi, 2014; Malardel et al., 2016; Figure 2).
The experiments with increasing orographic resolution thus allow us to explore how the different wavebands of the orographic spectrum (0 to 63, 63 to 159, 159 to 319, 319 to 639 and 639 to 1279), i.e. different orographic scales, impact the atmospheric flow.

Impacts of atmospheric resolution on the NH winter circulation
In this section we examine how the increase in atmospheric resolution affects the representation of the atmospheric flow by analysing the results of the experiments included in the red rectangles of Figure 1 (experiments with increasing An and constant On). Analysis of both the resolved orographic torques and selected metrics of the large-scale circulation is performed.

Resolved orographic torques
Increasing the atmospheric resolution leads both to smaller truncation errors related to the atmospheric flow and better resolved grid-box mean orography. A certain atmospheric resolution, e.g. A63, is not enough to completely resolve the orography at the same resolution (O63) and numerical models can generally only effectively resolve scales larger than several times the grid-box size (δx). The effective resolution of a model generally varies between 4 and 10δx depending on choices made for discretization, numerical diffusion (Skamarock, 2004) and advection scheme. Using surface wind observations over the ocean, Abdalla et al. (2013) estimated the effective spectral resolution of the IFS to be approximately 8δx, at a time when a linear TL1279 reduced Gaussian grid was used for operational ten-day forecasts. Similarly, Vosper et al. (2016) showed that for the UK Met Office model, contributions to the resolved orographic drag from wavelengths shorter than 8 to 10δx are poorly resolved. Previous studies using idealised models (e.g. Davies and Brown (2001)) have also estimated similar effective resolutions.
A certain $A_{n_{\mathrm{eff}}}$ is, therefore, necessary to completely resolve an On orography, where $n_{\mathrm{eff}} > n$, and we define $(n_{\mathrm{eff}}/n)\,\delta x$ to be the effective orographic resolution.
The impact on the atmospheric circulation from resolved orography can be quantified through the resolved orographic component of the vertically integrated angular momentum budget, given by van Niekerk et al. (2016) and Sandu et al. (2019). We calculate the resolved orographic term online on the native model grid using the horizontal gradient of the surface height used in the model. We then average the resolved orographic term over the set of 24hr forecasts performed during January 2015, and integrate it to obtain the total resolved torque over the NH defined as:

$$\int_{0}^{\pi/2} \int_{0}^{2\pi} p_s \, \frac{\partial h}{\partial \lambda} \, r^2 \cos\varphi \; d\lambda \, d\varphi,$$

where $p_s$ is surface pressure, $h$ is the height of the surface, $r$ is the radius of the Earth, $\lambda$ is the longitudinal coordinate and $\varphi$ is the latitudinal coordinate. Note that no additional filtering is applied in this calculation.
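As an illustration of how such a torque could be evaluated offline, the sketch below discretizes the area integral of $p_s \, \partial h / \partial \lambda \, r^2 \cos\varphi$ over the NH on a regular latitude-longitude grid. It is not the online calculation used in the paper: the regular grid, the centred finite difference, the Earth radius value and the function name are all assumptions made for this example, and the fields are synthetic.

```python
import numpy as np

R_EARTH = 6.371e6  # Earth radius in metres (assumed value)


def nh_resolved_torque(p_s, h, lat, lon):
    """Approximate the NH integral of p_s * dh/dlon * r^2 * cos(lat).

    p_s, h : arrays of shape (nlat, nlon)
    lat    : NH cell-centre latitudes in radians, uniformly spaced
    lon    : longitudes in radians, uniformly spaced and periodic
    """
    dlon = lon[1] - lon[0]
    dlat = lat[1] - lat[0]
    # Centred, periodic difference for the zonal gradient of surface height
    dh_dlon = (np.roll(h, -1, axis=1) - np.roll(h, 1, axis=1)) / (2.0 * dlon)
    integrand = p_s * dh_dlon * R_EARTH**2 * np.cos(lat)[:, None]
    return np.sum(integrand) * dlon * dlat


# Synthetic check: p_s = p0 + a*cos(lon), h = H*sin(lon) has the analytic
# NH torque r^2 * a * H * pi
nlat, nlon = 90, 360
lat = (np.arange(nlat) + 0.5) * (np.pi / 2.0) / nlat
lon = np.arange(nlon) * 2.0 * np.pi / nlon
p_s = 1.0e5 + 100.0 * np.cos(lon)[None, :] * np.ones((nlat, 1))
h = 500.0 * np.sin(lon)[None, :] * np.ones((nlat, 1))

torque = nh_resolved_torque(p_s, h, lat, lon)
expected = R_EARTH**2 * 100.0 * 500.0 * np.pi
```

With this sign convention, higher surface pressure on the upwind (western) slope gives a positive value, consistent with a positive torque indicating a deceleration of westerly flow.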
Increasing An, at a constant On (series of experiments included in the red rectangles in Figure 1), should lead to an increase in the resolved orographic torque up to the effective orographic resolution. In Figure 4, we then examine how the total resolved torque changes for each series of experiments with increasing An and constant On included in the four red rectangles in Figure 1.
Note that in our definition a positive torque indicates a deceleration of westerly flow. Given that the dominant flow direction is easterly in the subtropics and westerly in the extra-tropics, the sign of the torque is negative in the subtropics and positive in the extra-tropics.
The smaller magnitude of the total resolved torque at higher orographic resolutions is due to the fact that the positive torque over the extra-tropics increases more rapidly than the negative torque over the subtropics, as a consequence of the increasingly resolved orography over the major mountains, such as the Himalayas and the Rocky Mountains, when the orographic resolution increases.
At all orographic resolutions, the total resolved torque strongly increases in magnitude when An is increased to 2 or 2.5 × On (points on the left of Figure 4), and then changes much less as An increases further. At O63 the total resolved torque changes very little as An increases from 5 to 10 or from 10 to 20 × On. Similarly at O159, the total resolved torque remains almost unchanged as An increases from 4 to 8 × On. Our results, therefore, suggest that, at least for O63 and O159, the effective orographic resolution is around 4 to 5δx. This means, for example, that an An of at least TCo319 is necessary to fully resolve an O63 orography and its effects on the flow. The fact that the behaviour of the total resolved torque, up to an An/On ratio of 4, is comparable at O63, O159 and O319 suggests that the effective orographic resolution is similar at higher On.
However, since we cannot perform experiments with an An/On ratio larger than 4 for On higher than 159 due to computing time constraints, this hypothesis cannot be verified further.
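The An/On ratios quoted above follow directly from the truncation numbers, as the short sketch below shows. The function names and the list of combinations are chosen for illustration from the ratios mentioned in the text; they are not a transcription of Figure 1.

```python
def resolution_ratio(a_n, o_n):
    """Ratio of atmospheric to orographic spectral truncation numbers."""
    return a_n / o_n


def minimum_truncation(o_n, factor):
    """Atmospheric truncation number implied by an effective orographic
    resolution of `factor` grid lengths, i.e. n_eff = factor * n."""
    return factor * o_n


# Combinations assumed from the ratios discussed in the text
combinations = {63: (159, 319, 639, 1279), 159: (319, 639, 1279)}
ratios = {
    (a_n, o_n): resolution_ratio(a_n, o_n)
    for o_n, a_ns in combinations.items()
    for a_n in a_ns
}

for (a_n, o_n), ratio in ratios.items():
    print(f"A{a_n}/O{o_n}: An/On = {ratio:.1f}")

# An effective resolution of ~5 grid lengths at O63 implies n_eff of ~315,
# i.e. close to the TCo319 truncation
print(minimum_truncation(63, 5))
```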

Large-scale circulation
We have shown that increasing the atmospheric resolution up to the effective orographic resolution leads to a change in the resolved orographic torque, which should have an impact on the large-scale circulation (Sandu et al., 2019). However, since the resolved torque is an integral measure over the entire depth of the atmosphere, it does not tell us at which levels of the atmosphere the orography is having an impact on the circulation. To evaluate the changes in large-scale circulation and the forecast skill that result from increased atmospheric resolution, we examine the changes in anomaly correlation coefficient (ACC) of the geopotential height of the atmosphere. We focus on 500hPa, which is a good indicator of hemispheric circulation patterns because it is considered a steering level for the weather systems below. We also focus on 50hPa, which is well in the stratosphere but sufficiently low that it feels the integrated effect of the orographic gravity waves that break above in the NH mid-latitude winter. The ACC is defined as

$$\mathrm{ACC} = \frac{\sum_{i=1}^{L} w_i \,(f_i - \overline{f})(a_i - \overline{a})}{\sqrt{\sum_{i=1}^{L} w_i \,(f_i - \overline{f})^2 \; \sum_{i=1}^{L} w_i \,(a_i - \overline{a})^2}},$$

where $f$ and $a$ are the forecast and analysis anomalies relative to the model climatology, the overbar denotes the average over the NH, $L$ is the total number of sample points in the NH and $w_i = \cos\varphi_i$ is a weighting factor equal to the cosine of latitude. As standard practice, ACC is computed from geopotential height fields truncated to n = 63 and interpolated onto a 2.5 degree regular latitude-longitude grid. The model climatology is derived from the ERA-Interim reanalysis (Dee et al., 2011). Note that the ACC of geopotential height at 500hPa is one of the key metrics used to measure NWP skill.
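A cosine-latitude-weighted anomaly correlation of this kind can be sketched as follows. The function name is hypothetical, the fields are random placeholders rather than geopotential height, and the use of a weighted mean for the overbar is an assumption of this example.

```python
import numpy as np


def anomaly_correlation(f, a, lat_deg):
    """Cosine-latitude-weighted anomaly correlation of two anomaly fields.

    f, a    : forecast and analysis anomalies, shape (nlat, nlon)
    lat_deg : latitudes in degrees, shape (nlat,)
    """
    w = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(f)
    # Area-weighted means (assumption: the overbar is a weighted average)
    f_mean = np.sum(w * f) / np.sum(w)
    a_mean = np.sum(w * a) / np.sum(w)
    cov = np.sum(w * (f - f_mean) * (a - a_mean))
    var_f = np.sum(w * (f - f_mean) ** 2)
    var_a = np.sum(w * (a - a_mean) ** 2)
    return cov / np.sqrt(var_f * var_a)


# Placeholder anomalies on a 2.5 degree NH grid
rng = np.random.default_rng(0)
lat_deg = np.arange(0.0, 90.1, 2.5)
analysis = rng.standard_normal((lat_deg.size, 144))
forecast = analysis + 0.5 * rng.standard_normal(analysis.shape)

acc = anomaly_correlation(forecast, analysis, lat_deg)
```

A perfect forecast gives an ACC of 1 and a forecast with the opposite anomaly pattern gives -1.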
The differences in Fisher-Z transformed ACC (i.e. $\tanh^{-1}(\mathrm{ACC})$, see Jolliffe and Stephenson (2012)) of geopotential height at 500 and 50hPa (Z500 ACC and Z50 ACC hereafter), between the experiments depicted by filled circles in Figure 1 and the lowest resolution experiment (A63/O63), are shown in Figure 5. Note that the Fisher-Z transformed ACC is better approximated by a normal distribution than ACC (Jolliffe & Stephenson, 2012). Focusing on the pairs of lines of similar colours (black, green, blue and red tones), it can be seen that increasing the atmospheric resolution affects the Z500 ACC very little, except when going from A63/O63 to A319/O63, which will be discussed below. Indeed, at O159, O319 and O639, the difference in Z500 ACC between the A63/O63 experiment and the experiments in which the atmosphere and orography are at the same resolution, An = On (lighter tone lines, corresponding to the experiments on the diagonal of Figure 1), closely matches the difference between the A63/O63 experiment and the experiments in which the atmospheric resolution is much higher than the orographic resolution, An = x × On (darker tone lines, corresponding to the experiments in the blue rectangle of Figure 1). In other words, the atmospheric resolution increase and the associated timestep decrease add very little to the skill in the troposphere at these resolutions. An additional experiment performed at A63/O63 with a shorter timestep (1200s instead of 1800s) confirmed that the Z500 and Z50 ACC are not sensitive to the chosen timestep.
At 50hPa, however, increasing the atmospheric resolution does improve the Z50 ACC markedly (Figure 5b). The differences in Z50 ACC between the two O159 green curves (A1279/O159 and A159/O159), between the two O319 blue curves (A1279/O319 and A319/O319), and between the two O639 red curves (A1279/O639 and A639/O639) are statistically significant at a confidence level of 95% up to day 5, 4 and 2, respectively (not shown). Note that the statistical confidence level is 95% throughout this study. This improved representation of the circulation in the lower stratosphere with increasing atmospheric resolution may be due to the fact that the vertical propagation of atmospheric gravity waves is strongly affected by numerical truncation errors, which become larger in the upper atmosphere both as a result of the distance over which the wave has travelled when it reaches 50hPa and the fact that the vertical resolution is degraded with altitude. Griffin and Thuburn (2018) showed that the vertical, horizontal and timestep resolution of a model can lead to errors in the phase speed of gravity waves and, thus, their vertical propagation. In particular, gravity waves with horizontal wavelengths close to the grid-scale propagate too vertically compared with an analytic solution of gravity wave propagation. As a result, the waves generated by, for example, an O159 orography may be propagating less accurately in the vertical when the atmospheric resolution is A159 than at A1279, at which the waves generated by the smallest scales of the orography are well resolved. The stronger gravity wave activity and breaking in the A1279/O159 than in the A159/O159 simulations is confirmed by sharper potential temperature structures and stronger zonal wind deceleration in the stratosphere, which are visible in snapshots of the simulated potential temperature and zonal wind fields across the major mountain chains (i.e. Himalayas or Rocky Mountains) (not shown).
At an orographic resolution of O63, Z500 ACC does not change much when An is increased from A319 to A1279, similarly to what is found for other resolutions. In contrast, a large impact on Z500 that is statistically significant for several days of the forecast is obtained when the atmospheric resolution is increased initially from A63 to A319 (grey line compared to the zero line in Figure 5a). Given that for all other experiments in which the orography is held constant the atmospheric resolution increase, and the associated timestep decrease, have very little impact on the large-scale skill in the troposphere, this is a radical departure from the pattern. There are two possible explanations for this result. First, it is possible that increasing An at such a coarse resolution also reduces truncation errors related to the representation of other processes such as, for example, extra-tropical storms or cyclones. Jung et al. (2012) in fact showed that the atmospheric activity in the IFS is markedly different between a TL159 (126km) and a TL511 (39km) resolution, but changes relatively little as the resolution increases beyond TL511.
Second, it is possible that the 0-63 orography wavenumbers (large scales) matter more for the tropospheric circulation than other orographic scales. If this were the case, it seems plausible that better resolving these scales by increasing An up to the effective orographic resolution leads to an increase in the large-scale forecast skill. While it is not entirely possible to disentangle these two possible causes for the circulation changes found in the A319/O63 experiment, the next section investigates which orographic scales matter most for the large-scale circulation.

[...] to meso scales (zonal wavenumbers k=21-63, O(100km)). Finally, we explore how different orographic scales affect other measures of the large-scale circulation, namely, the mean sea-level pressure and the barotropic winds (see definition below).
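The Fisher-Z difference between two experiments amounts to applying the inverse hyperbolic tangent to each ACC before subtracting. The sketch below uses made-up ACC values and hypothetical function names purely to illustrate the transformation; it does not reproduce the significance testing used in the paper.

```python
import numpy as np


def fisher_z(acc):
    """Fisher-Z transform of an anomaly correlation, arctanh(ACC)."""
    return np.arctanh(acc)


def fisher_z_difference(acc_experiment, acc_reference):
    """Difference in Fisher-Z transformed ACC between two experiments."""
    return fisher_z(acc_experiment) - fisher_z(acc_reference)


# Illustrative (not real) ACC values at three lead times
acc_reference = np.array([0.990, 0.950, 0.800])
acc_experiment = np.array([0.992, 0.960, 0.830])

dz = fisher_z_difference(acc_experiment, acc_reference)
```

Because arctanh stretches values close to 1, a small ACC gain at short lead times, where ACC is already high, is not hidden by the saturation of the raw correlation.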
The increases in orographic resolution from O63 to O159, O319 and O1279 all bring remarkable improvements in large-scale circulation forecast skill throughout the atmosphere. This is illustrated in Figure 6 through the changes in Z500 ACC and Z50 ACC that result from increasing the orographic resolution step-wise, from O63 (the zero line in the figure) to O1279, while keeping the atmospheric resolution constant at A1279. Each orographic resolution increase leads to a gain in Z500 medium-range skill of approximately 0.2 to 0.3 forecast days. For example, the Z500 ACC at day 5.3 in the A1279/O639 experiment is approximately the same as that at day 5 in the A1279/O319 experiment (not shown). To put this into context, Z500 medium-range forecast skill in the NH extra-tropics has increased by approximately one day per decade, due to combined changes in all the components of NWP systems (Bauer et al., 2015). The Z50 ACC is also significantly improved as the orographic resolution is increased.
It is remarkable that the large-scale circulation skill increases almost linearly with the orographic resolution, and does not seem to saturate, at least up to a resolution of approximately 9km. Figure 6 also demonstrates that increases in skill can be obtained not only by increasing the truncation number of the orographic resolution (green vs blue vs red vs orange solid lines), but also by representing the orography as accurately as possible up to the truncation number. The experiment run at A1279 with the TL1279 instead of the TCo1279 orography (dashed versus plain orange lines in Figure 6) demonstrates that, for the same truncation number, considerable skill can be gained by maintaining more orographic variance at the smallest resolved scales (dashed versus plain orange lines in Figure 2). The differences in Z500 ACC between the two curves are statistically significant up to day 5 (not shown).
All considered orographic scales thus appear to have a significant impact on the large-scale circulation, which manifests itself very rapidly, i.e. within a few hours (Figure 6). An interesting question is whether this overall impact on circulation is due to orography effects on the large (planetary) scales of the atmospheric flow, or rather to effects on meso scales. This question was explored to some extent by Tibaldi (1986), but that study was performed at a resolution of approximately 200km, and for a single case study. Similarly to the overall effect on Z500 ACC (Figure 6), it turns out that all the orographic wavenumber bands considered (63-159, 159-319, 319-639 and 639-1279) affect, almost commensurately, Z500 ACC in the different zonal wavenumber bands (Figure 7). At planetary (Figure 7a), synoptic (Figure 7b) and intermediate scales (Figure 7c) the increase in skill obtained from increasing On is statistically significant up to day 6 or 7 for all increases in On, while at meso scales (Figure 7d) it is significant during the first 4 days of the forecasts. Even the smallest orographic scales considered (639-1279, or 18-9km) have a direct and large impact on all scales considered, including the planetary scales. The smaller differences in Z500 ACC at zonal wavenumbers 21-63 are presumably due to the fact that the errors at small scales saturate faster than those at larger scales (Dalcher & Kalnay, 1987).
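A skill gain expressed in forecast days can be read off two ACC curves by finding the lead time at which the improved experiment drops to the ACC that the reference experiment has at a chosen day. The sketch below does this by linear interpolation; the function name and the synthetic, linearly decaying ACC curves are assumptions for illustration and are not the paper's data.

```python
import numpy as np


def lead_time_gain(days, acc_ref, acc_exp, ref_day):
    """Gain in forecast days of acc_exp over acc_ref at lead time ref_day.

    Both ACC curves are assumed to decrease strictly with lead time.
    """
    target = np.interp(ref_day, days, acc_ref)
    # np.interp needs increasing x values, so reverse the decreasing curve
    day_exp = np.interp(target, acc_exp[::-1], days[::-1])
    return day_exp - ref_day


# Synthetic curves: the experiment loses skill 6% more slowly
days = np.arange(0.0, 11.0)
acc_ref = 1.0 - 0.05 * days
acc_exp = 1.0 - 0.05 * days / 1.06

gain = lead_time_gain(days, acc_ref, acc_exp, ref_day=5.0)
```

In this synthetic case the experiment reaches at day 5.3 the ACC that the reference has at day 5, i.e. a gain of 0.3 forecast days.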
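One way to isolate a zonal wavenumber band before computing a score is to filter each latitude circle in Fourier space. The sketch below is an illustration under that assumption (the text does not describe how the bands were extracted); the function name is hypothetical and only the meso-scale band k=21-63 is taken from the text.

```python
import numpy as np


def zonal_band_filter(field, k_min, k_max):
    """Keep only zonal wavenumbers k_min..k_max (inclusive) of a field.

    field : array of shape (nlat, nlon) on a regular longitude grid
    """
    coeffs = np.fft.rfft(field, axis=-1)
    k = np.arange(coeffs.shape[-1])
    coeffs[..., (k < k_min) | (k > k_max)] = 0.0
    return np.fft.irfft(coeffs, n=field.shape[-1], axis=-1)


# Synthetic field on a 2.5 degree grid: one planetary wave plus one meso wave
lon = np.arange(144) * 2.0 * np.pi / 144
field = np.tile(np.cos(2 * lon) + 0.3 * np.cos(30 * lon), (37, 1))

meso = zonal_band_filter(field, k_min=21, k_max=63)
```

The filtered field can then be passed to the same ACC calculation as the unfiltered one.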
To better understand the impact of the different orographic scales, or wavebands (63-159, 159-319, 319-639 and 639-1279), on other metrics of the large-scale circulation, we now examine the differences in barotropic wind between the pairs of A1279 experiments with the corresponding orographic resolutions (e.g. A1279/O159 - A1279/O63, A1279/O319 - A1279/O159 and so forth) in Figure 8. The barotropic wind components $(u_b, v_b)$ represent the mass-weighted, vertically integrated horizontal wind components $(u, v)$. The barotropic wind is a measure of the vertically averaged atmospheric flow, in which the tropospheric winds dominate due to the mass weighting. Changes in barotropic winds thus allow us to illustrate the effects of changes in orography on the large-scale tropospheric winds (Figure 8). The monthly mean $u_b$ in the analysis fields from which all the forecast experiments are initialised shows a strong and elongated westerly jet along 30N, with an intensity peaking over the Middle East Mountains, Himalayas and East Pacific (Figure 9a). The mean error in $u_b$ in the A1279/O63 experiment shows an excessively strong subtropical westerly flow from Europe to East Asia, and too weak winds over Greenland, indicative of a lack of tilt in the North Atlantic jet (Figure 9b). This largely resembles well known biases in models in which the orographic gravity wave drag or blocking is not parametrized (e.g. Wallace et al. (1983) and to its overall southward displacement over the Himalayas (Figure 8). This results in notably smaller biases in $u_b$ at A1279/O1279, particularly in the regions where the largest errors are found at A1279/O63 (Figure 9b,c). The barotropic wind vector in the A1279/O1279 experiment is overlaid in all panels.
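As an illustration of the barotropic wind defined above, the sketch below computes a mass-weighted vertical mean of the wind on pressure levels. It assumes the form $u_b = \frac{1}{\Delta p}\int u\,dp$, with $\Delta p$ the pressure depth of the column; the exact normalisation and vertical grid used in the experiments are not given in the text, and the function name is hypothetical.

```python
import numpy as np

def barotropic_wind(u, v, pressure):
    """Mass-weighted vertical mean of (u, v) on pressure levels.

    u, v     : arrays of shape (nlev, ...) holding the horizontal wind [m/s]
    pressure : 1-D array of shape (nlev,), monotonic, in Pa
    Assumption: normalisation by the pressure depth of the column.
    """
    p = np.asarray(pressure, dtype=float)
    dp = np.diff(p)            # layer thickness, proportional to layer mass
    depth = p[-1] - p[0]

    def column_mean(x):
        x = np.asarray(x, dtype=float)
        layer_mean = 0.5 * (x[1:] + x[:-1])   # trapezoidal rule
        return np.tensordot(dp, layer_mean, axes=(0, 0)) / depth

    return column_mean(u), column_mean(v)
```

Since each layer is weighted by its pressure thickness, a layer in the troposphere contributes far more than a stratospheric layer of the same geometric depth, which is why the tropospheric winds dominate.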
As was the case for Z500 ACC, the barotropic wind changes are roughly linear with the increase in orographic resolution and the impact does not seem to saturate up to an orographic resolution of 9km. The impacts on the barotropic wind appear to be slightly larger for the O63-O159 and O639-O1279 than for the O159-O319 and O319-O639 orographic wavebands (a,d versus b,c in Figure 8). Similar conclusions can be drawn by examining the changes in mean sea-level pressure (Figure 10). The step-wise increases in orographic resolution lead to an increase in pressure north of the major mountain chains over Eurasia, reducing the mean error with respect to the analysis from which the forecasts are initialised (Figure 11). This signature is typical of, and largely resembles, that obtained when increasing the parametrized SSO drag (Sandu et al., 2016; Elvidge et al., 2019). As with parametrized forces, the increased orography acts as a drag force that decelerates the flow. In order for the flow to stay balanced, this drag force is equilibrated by an ageostrophic northward wind which transports mass from the equator to the poles (Lott & D'Andrea, 2005). From Figure 11 it is evident that the increase in orographic resolution leads to a large reduction in the mean sea-level pressure error over the NH but, given that the remaining mean sea-level pressure error resembles the impact of the increase in resolved orography, it also indicates that a further increase in resolved orography may be beneficial.
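The balance between drag and ageostrophic wind invoked above can be sketched with the steady zonal momentum equation. This is a textbook simplification given for orientation only, not the derivation of Lott & D'Andrea (2005); the symbols $f$ (Coriolis parameter), $v_a$ (ageostrophic meridional wind) and $F_x$ (zonal drag force per unit mass) are introduced here for illustration.

```latex
\frac{\partial u}{\partial t} = f\,v_a + F_x \approx 0
\quad\Longrightarrow\quad
v_a \approx -\frac{F_x}{f}
```

For a westerly flow decelerated by orography, $F_x < 0$, and since $f > 0$ in the NH this gives $v_a > 0$, i.e. a northward ageostrophic wind.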
These changes in various metrics of the large-scale circulation in the NH during winter indicate that a step-wise increase in orographic resolution leads to a gradual decrease in the mean errors, and consequently to an increase in medium-range forecast skill throughout the atmosphere. Interestingly, all orographic wavebands considered, from thousands to tens of km, have a direct and rapid impact on the chosen metrics of the large-scale circulation. The large impact found when increasing On from O639 to O1279 suggests that these impacts do not saturate up to orographic resolutions of O(10km), at least when parametrizations of subgrid-scale orographic effects are not used, and hence that further resolution increases can lead to further increases in skill. In Section 5, we will examine whether these findings still hold when the SSO and TOFD schemes are turned on.

Behaviour of orographic drag parametrizations across resolutions
In previous sections we have shown that, in the absence of orographic drag parametrizations, the gains in tropospheric NH winter circulation forecast skill obtained when increasing the horizontal resolution beyond TCo159 are almost exclusively due to increases in orographic resolution (Figure 5). If the orographic drag parametrizations were able to accurately mimic the unresolved orographic effects on the flow at each resolution, the benefits gained by increasing the horizontal resolution should vanish almost entirely when the parametrizations are turned on. To explore whether this is the case, and as a means of assessing the quality of these orographic drag parametrization schemes, we compare the changes in Z500 and Z50 ACC obtained when increasing the horizontal resolution with the SSO and TOFD schemes turned on to those obtained when they are turned off. We do this for the experiments on the diagonal of Figure 1, since the subgrid-scale orography fields are available only for configurations in which An and On are equal. Given that the model behaviour is different at TCo63 compared with higher resolutions (see discussion in Section 3), we only focus here on the changes in skill between the TCo159, TCo319, TCo639 and TCo1279 resolutions (i.e. the A159/O159, A319/O319, A639/O639 and A1279/O1279 experiments, Figure 12).
Figures 12c,d suggest that turning on the parametrizations leads to a large increase in skill in both Z500 and Z50 at TCo159 (compare green line and zero line), and that the skill increases less markedly with increasing resolution when the SSO and TOFD schemes are turned on than when they are switched off. Indeed, the gap between the green, blue, pink and orange lines in Figure 12c,d is smaller than that in Figure 12a,b. However, this gap does not entirely disappear, and differences between each pair of curves in Figure 12c,d are statistically significant up to day 4 or 5. This suggests that the parametrizations are not perfect, in the sense that they do not manage to accurately represent the unresolved orographic effects at each resolution. Moreover, the gap in skill between the coarsest resolutions (going from A159/O159 to A319/O319) is larger than that between the highest resolutions (going from A639/O639 to A1279/O1279) when the parametrizations are turned on (Figure 12c,d vs a,b). This is likely due to the fact that the impact of SSO decreases at higher resolutions and, therefore, its impact on the forecast scores decreases. It could also be due to the fact that one or both schemes, or their interactions, do not behave appropriately across the range of resolutions considered. To explore these possibilities further, we examine the impact of each of the two schemes (SSO - green, TOFD - red), and of their combination (black) at various resolutions, as shown in Figure 13.
At all resolutions, the large-scale skill of ten-day forecasts produced without the SSO and TOFD schemes is significantly inferior to that of forecasts produced when using both schemes. Each parametrization has a large impact on Z500 ACC across the entire forecast range, corroborating previous studies (Figure 13). This confirms the significant contribution of parametrized orographic drag to improvements in the representation of the large-scale circulation in both NWP and climate models (Palmer et al., 1986; Lott & Miller, 1997; Charron et al., 2012; Sandu et al., 2019). Examining the individual impacts of the two schemes, it is clear that the impact of the SSO scheme decreases as the horizontal resolution increases from TCo159 to TCo1279. Recall that the SSO scheme is designed to account for drag processes from 5km up to the mean grid-scale orography. Consequently, as the horizontal resolution increases and more orographic scales become resolved, the subgrid-scale orographic variability that needs to be represented by the SSO scheme decreases and so its impact on the circulation at higher resolution decreases. Meanwhile, the TOFD scheme, which represents scales smaller than 5km that remain completely unresolved even at TCo1279, has a fairly constant impact on the circulation across the TCo159 to TCo1279 resolution range. When the two schemes are used together, they appear to interact strongly and non-linearly. This is evident from the fact that the combined impact of the SSO and TOFD schemes (black lines) is not equal to the sum of the impacts of the individual schemes (green and red lines in Figure 13). This is not surprising, given that both schemes exert drag on the low-level winds, and that the magnitude of this drag depends on the strength of the low-level winds. In other words, the magnitude of the low-level blocking drag exerted by the SSO scheme will be stronger when the TOFD scheme is not active, because the low-level winds would not have been slowed due to the TOFD. The opposite is also true.
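The non-additivity described above can be reproduced with a toy model in which a constant forcing is balanced by quadratic drag on the low-level wind. This is purely illustrative and is not the IFS SSO or TOFD formulation: the steady-state balance, the function name and all coefficient values are assumptions chosen for the example.

```python
import math

def steady_wind(forcing, drag_coeffs):
    """Steady low-level wind for forcing = (sum of drag coefficients) * u**2."""
    return math.sqrt(forcing / sum(drag_coeffs))

# Arbitrary illustrative values (not taken from the experiments).
FORCING = 1.0e-3                             # m s^-2
C_BASE, C_SSO, C_TOFD = 1.0e-5, 2.0e-5, 1.0e-5   # m^-1

u_none = steady_wind(FORCING, [C_BASE])
u_sso = steady_wind(FORCING, [C_BASE, C_SSO])
u_tofd = steady_wind(FORCING, [C_BASE, C_TOFD])
u_both = steady_wind(FORCING, [C_BASE, C_SSO, C_TOFD])

# Impact of each configuration, measured as the deceleration of the wind.
impact_sso = u_none - u_sso
impact_tofd = u_none - u_tofd
impact_both = u_none - u_both

# Drag exerted by the SSO-like term with and without the TOFD-like term active.
sso_drag_without_tofd = C_SSO * u_sso**2
sso_drag_with_tofd = C_SSO * u_both**2
```

In this toy setting the combined deceleration is smaller than the sum of the two individual decelerations, and the SSO-like drag is weaker once the TOFD-like term has already slowed the wind, which mirrors the argument made in the text.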
Again, the TCo63 resolution behaves as an outlier compared with the other resolutions considered (Figure 13a). Given that this is the lowest resolution considered, the impact of the SSO scheme should be the largest but, in practice, the impact of the SSO scheme is comparable to, or even smaller than, that at A159/O159. The impact of the TOFD scheme, on the other hand, is almost half of that found at higher resolutions. Examining the Z500 ACC for the different wavebands discussed in Section 4 (i.e. 1-3, 4-9, 10-20 and 21-63) when the parametrizations are turned on, it appears that at resolutions higher than TCo159, the two schemes affect all considered atmospheric scales (not shown), similar to what was found for impacts of resolved orography (Figure 7). At A63/O63, however, they only affect the largest scales (1-3 and 4-9). This is likely due to the fact that at this resolution, the synoptic variability is poorly represented.

Conclusions
Concerted community efforts have been made in recent years to understand and reduce the uncertainties related to the representation of orography and of its impacts on the large-scale circulation in numerical models used for weather prediction and climate projections (Sandu et al., 2019). This study complements a series of recent studies that are part of these efforts (Sandu et al., 2016; van Niekerk et al., 2016; Pithan et al., 2016; Sandu et al., 2017; van Niekerk et al., 2018; Elvidge et al., 2019) by investigating the impacts of the representation of orography (both resolved and parametrized) on the NH winter circulation across several resolutions ranging from those typical of global climate models (∼180km) to those typical of global NWP models (∼9km).
An extensive set of ten-day weather forecasts performed with the ECMWF IFS at various atmospheric and orographic resolutions was used to address the questions itemised in the introduction. Taking them in turn, we found that, when no orographic drag parametrizations are used, the increase in medium-range forecast skill obtained throughout the troposphere when increasing the horizontal resolution from TCo159 (72km) to TCo1279 (9km) is almost exclusively due to increases in orographic resolution. This is a significant result because it identifies an aspect of the model, i.e. the representation of orography, that could be improved to address the shortcomings at lower resolutions, and it motivates the need for a better representation of sub-grid orographic processes. At the coarsest resolution considered (i.e. TCo63 (180km)), tropospheric skill does increase when the atmospheric resolution is increased from A63 to A319. This is likely due to the fact that reduced truncation errors help to improve the synoptic variability, which is very poorly represented at such a coarse resolution. In contrast with the troposphere, the stratospheric forecast skill is improved due to both orographic and atmospheric resolution increases. The positive impacts of the increase in atmospheric resolution, in the case of the stratospheric circulation, are likely related to the better representation of orographic gravity wave propagation.
We also demonstrate that all orographic scales considered (0-63, 63-159, 159-319, 319-639, 639-1279) commensurately affect the large-scale circulation. What is more, they all directly affect scales of the flow ranging from planetary O(10000km) to meso O(100km) scales within a few hours. These findings imply that even the smallest scale orography is important for, and affects very rapidly, the planetary scales and thereby the large-scale circulation. The question of why exactly this is the case is the subject of ongoing work.
When the orographic drag schemes are used, increases in horizontal resolution from TCo159 to TCo1279 still bring significant improvements in the representation of the large-scale NH winter circulation. This suggests that, although existing orographic drag parametrizations have significantly contributed to improvements in NWP skill, they do not accurately capture the effects of unresolved orography on circulation, in agreement with the results of other recent studies (van Niekerk et al., 2018; Sandu et al., 2019). Our analysis has also demonstrated that the SSO and TOFD schemes strongly interact, and that this nonlinear interaction is resolution dependent, particularly at resolutions typical of global climate models, at which the contribution from the SSO parametrization is very large and synoptic variability is poorly represented. From this study, we argue that increases in horizontal, and, more precisely, orographic resolution, have contributed to step-wise increases in forecast skill of the circulation within the troposphere (and stratosphere) during the NH winter and that these improvements have not yet reached saturation at current global NWP resolutions.

Figure 1. Combinations of different atmospheric (An, horizontal axis) and orographic (On, vertical axis) spectral resolutions and corresponding grid point distances in the cubic grid used in this study. Filled circles denote ten-day forecasts and dashed circles denote 24-hour forecasts (see text for details). The colours of the circles match the colours of the lines used in Figures 5, 6, 7 and 12 to illustrate the impacts of the different resolutions on the NH winter circulation. The experiments included in the red rectangles use the same orographic resolution, allowing us to analyze the impact of increases in atmospheric resolution on the large-scale circulation. The experiments in the blue rectangle use the same atmospheric resolution, allowing us to analyze the impact of increases in orographic resolution on the large-scale circulation.
Figure 1, with a constant An and a varying On, allow us to explore the impact of the orographic resolution on the NH winter circulation (Section 4). Note that for each experiment the timestep has a default value which is resolution dependent. The timesteps are 1800s/1800s/1200s/720s/450s for the A63/159/319/639/1279 experiments. Experiments along the horizontal axis of Figure 1, thus, encompass both an increase in atmospheric resolution and a decrease in timestep.

Figure 2. Variance of the grid-box mean orography for TCo63 (black), 159 (green), 319 (dark blue), 639 (red) and 1279 (orange, solid line) as a function of total wavenumber n in logarithmic scale. The variance of the TCo7999 orography (purple), which is very close to that of the original 1-km orography dataset used to derive the orographies at all resolutions, is also drawn for reference. Each orography spectrum is obtained by transforming the orography into spectral space, via decomposition into spherical harmonics. The spectrum as a function of total wavenumber is obtained by summing the squared coefficients over all zonal wavenumbers, which is representative of the variance of the field. The spectra are multiplied by $n^{5/3}$ and $n^{-5/3}$ is represented as a dotted line. For further details see for example Malardel et al. (2016), and their Figure 1. The variance of the TL1279 orography is also shown (dashed orange line, see text for details).
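The spectrum calculation described in the caption of Figure 2 can be sketched as below. The storage layout `coeffs[n, m]` and the factor of two for the conjugate (negative m) coefficients are assumptions of this sketch, since they depend on the normalisation convention of the spectral transform; the function names are hypothetical.

```python
import numpy as np

def variance_spectrum(coeffs):
    """Variance per total wavenumber n from spherical-harmonic coefficients.

    coeffs : complex array of shape (N + 1, N + 1), indexed [n, m] with m <= n
             and zeros elsewhere (hypothetical layout for a real-valued field).
    """
    power = np.abs(coeffs) ** 2
    power[:, 1:] *= 2.0        # account for the m < 0 conjugate coefficients
    return power.sum(axis=1)   # sum over zonal wavenumbers m

def compensated_spectrum(spectrum):
    """Multiply by n**(5/3), so that an n**(-5/3) spectrum appears flat."""
    n = np.arange(spectrum.size)
    return spectrum * n ** (5.0 / 3.0)
```

Plotting `compensated_spectrum(variance_spectrum(coeffs))` against n on logarithmic axes would give curves comparable in form to those described in the caption, under the stated assumptions.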
effective orographic resolution. As An increases beyond the effective orographic resolution, the resolved orographic torque should remain unchanged as the orography becomes entirely resolved. To ascertain the value of the effective orographic resolution of the IFS from our experiments, we compute the resolved orographic torque online during the forecasts, similarly to what was done in van Niekerk et al. (2016) and Sandu et al. (2019).

Figure 4. The total resolved torque [N m] over the Northern Hemisphere, as a function of An/On, from the experiments with a varying An and a constant On (included in the red rectangles in Figure 1, also see text for details). Note that values are divided by $10^{18}$. Colours indicate series of experiments performed at O63 (black), O159 (green), O319 (blue) and O639 (red).
Journal of Advances in Modeling Earth Systems (JAMES)

Figure 5. Differences in the Fisher-Z transformed anomaly correlation coefficient (ACC) for (a) Z500 and (b) Z50, as a function of lead time, between the experiments with a varying An and a constant On (included in the red rectangles in Figure 1) and the lowest resolution experiment A63/O63. Colours correspond to the full circles in Figure 1, black, green, blue and red tones corresponding to the series of experiments with increasing atmospheric resolution performed at O63, O159, O319 and O639. Error bars indicate a 95% confidence interval.

Figure 6. Differences in the Fisher-Z transformed anomaly correlation coefficient (ACC) for (a) Z500 and (b) Z50, as a function of lead time, between the A1279 experiments with increasing On and the A1279/O63 experiment. Colours correspond to the full circles included in the blue rectangle (vertical axis) of Figure 1. The dashed orange line is the difference between an A1279 experiment performed with the TL1279 instead of the TCo1279 orography. Error bars indicate a 95% confidence interval.
We explore this further by decomposing the change in Z500 ACC, obtained when increasing On at constant An, into several zonal wavenumber bands (k=1-3 (O(10000km)), 4-9 (O(1000km)), 10-20 (O(1000-100km)) and 21-63 (O(100km))).We recall that in Figure 6 the change in Z500 ACC is calculated from Z500 fields spectrally truncated at n = 63.This decomposition illustrates which scales of the atmospheric flow are affected by the changes in On.Given that our experiments cover several orographic wavenumber bands, or scales, it also illustrates how different orographic scales affect different atmospheric scales.

Figure 8. Differences in the monthly mean of the zonal component of the barotropic wind [m/s], at a leadtime of 24 hours, between the A1279 experiments with the following orographic resolutions (a) O159 and O63, (b) O319 and O159, (c) O639 and O319 and (d) O1279 and O639.

Figure 9. (a) Monthly mean of the zonal component of the barotropic wind [m/s] over Jan. 2015 for the analysis fields from which the forecast experiments are initialized; monthly mean error of (b) the A1279/O63 and (c) the A1279/O1279 forecasts with respect to the analysis, at a leadtime of 24 hours.

Figure 10. Monthly mean difference in mean sea-level pressure [hPa], at a leadtime of 24 hours, between the A1279 experiments with the following orographic resolutions (a) O159 and O63, (b) O319 and O159, (c) O639 and O319 and (d) O1279 and O639.

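Figure 11. Monthly mean error of the mean sea-level pressure [hPa] of the (a) A1279/O63 and (b) A1279/O1279 forecasts with respect to the analysis fields from which the forecasts are initialized, at a leadtime of 24 hours. Note that mean sea-level pressure fields are used rather than surface pressure fields due to the difference in surface elevation between the two experiments and the analysis, which is at TL1279 resolution.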

Figure 12. Differences in the Fisher-Z transformed ACC for Z500 (left) and Z50 (right), as a function of leadtime, between the experiments at A1279/O1279 (orange), A639/O639 (pink), A319/O319 (blue) and the A159/O159 experiment. Results from the experiments without and, respectively, with the SSO and TOFD schemes, are shown in (a,b) and (c,d). The A159/O159 experiment without the SSO and TOFD schemes is taken as reference in all panels, to ease the comparison of the impacts on large-scale skill obtained when the parametrizations are turned off or on. The green lines in panels c, d show the difference in ACC between the A159/O159 experiment with and without the SSO and TOFD schemes.

Figure 13. Differences in the Fisher-Z transformed ACC for Z500, as a function of leadtime, between the experiments where the SSO (red), TOFD (green) or SSO and TOFD (black) are turned on and the experiment when both schemes are turned off, performed at the (a) A63/O63, (b) A159/O159, (c) A319/O319 and (d) A1279/O1279 resolutions.