Warming Trends in Summer Heatwaves

The frequency and severity of heatwaves is expected to increase as the global climate warms. We apply crossing theory for the first time to determine heatwave properties solely from the distribution of daily observations without time‐correlation information. We use Central England Temperature time series to quantify how the simple increased occurrence of higher temperatures makes heatwaves (consecutive summer days with temperatures exceeding a threshold) more frequent and intense. We find an overall twofold to threefold increase in heatwave activity since the late 1800's. Week‐long heatwaves that on average return every 5 years were typically below ∼28 °C and now typically exceed it. Our analysis takes as inputs average user‐specific heat wave properties. Its output pinpoints the range of temperatures for which changes in the distribution must be well resolved statistically in order to track how these heatwave properties are changing. This provides a quantitative benchmark for models used for the attribution of heat waves.


Introduction
Heatwaves are one of the most impactful aspects of climate. There is increasing evidence that global warming is leading to more intense and frequent heatwaves with a corresponding increase in their impact (Liss et al., 2017;Mora et al., 2017;Pachauri & Meyer, 2014). Definitions of heatwaves vary (Frich et al., 2002;Perkins & Alexander, 2012), we focus on runs of consecutive days with temperatures exceeding a constant threshold value. They are multiday runs of daily observations at unusually high temperatures that are found in the tail of the distribution. By definition, these are rare events. Direct statistical quantification solely from observations of how heatwaves may be changing in time is thus highly uncertain, although comparable analysis of model output can be less uncertain due to the ability to generate multiple ensembles (Perkins-Kirkpatrick & Gibson, 2017).
There are two distinct contributing factors to an increase in heatwave activity. First is a change in atmospheric circulation patterns leading to altered spatiotemporal correlations. Changes in the frequency, duration, and location of blocking patterns are a prime example of this factor (Barriopedro et al., 2006;Dong et al., 2008;Garcia-Herrera et al., 2010;Renwick, 1998;Sáez de Adana & Colucci, 2005). Changes in heatwaves have been studied in relation to this factor by, for instance, identifying links between a warming trend in the high quantiles of spatially distributed daily summer temperature observations and a modification of

10.1029/2018GL081004
Key Points: • First application of crossing theory to determine heatwave properties solely from the distribution of daily observations without time-correlation information • A twofold to threefold increase in heatwave activity since the late 1800s is found in the Central England Temperature record • Observational, model-independent, generic analysis gives a new and independent check for global climate model heatwave results regional feedbacks and global circulation patterns (Coumou et al., 2018). Such changes are indicated by an increase in spatially correlated high daily temperatures (Hansen et al., 2012). However, a second, important contributing factor to increased heatwave activity is that simply increasing the occurrence of higher temperatures will in itself imply that heatwaves will be on average longer lasting, more frequent, and more intense. This will arise, for example, when there is a change in the moments of a given temperature distribution (Cubasch et al., 2013) but requires quantitative knowledge of how the full distribution is changing. One approach is to use extreme value statistics to model the behavior of the distribution tails (Keellings & Waylen, 2014).
In this Letter we will quantify how changes in the higher values in the observed distribution of daily temperatures translate into changes in average heatwave properties. Our approach quantifies the changes in heatwave properties that arise just from changes in the full distribution, before including additional information on spatial or temporal correlation. This provides a baseline with which to assess recent heatwave occurrences direct from the data. The uncertainties are solely those intrinsic to the observations and to the sampling of the observed distribution and are not dependent on any uncertainties in estimates of spatial or temporal correlation. Our methodology thus provides a quantitative benchmark for models, essential for attribution of heatwaves. It provides a quite general framework which takes as input quantitative, user-relevant heatwave properties and outputs the range of quantiles of the cumulative density function (cdf) that need to be accurately resolved in data, and in models.

Crossing Theory
Crossing theory (Cramer & Leadbetter, 1967;Lawrance & Kottegoda, 1977;Rice, 1944;Vanmarcke, 1985; 2010) provides a set of results for stationary random time series that relate the distribution of observations to the properties of runs above (here, heatwaves) or below a threshold. It has found applications as diverse as hydrology (Bras & Rodríguez-Iturbe, 1993;Nordin & Rosbjerg, 1970;Rodríguez-Iturbe, 1968) and cosmology (Coles & Barrow, 1987). While most developments and applications of the theory following Rice (1944) have required the distribution of observations to be Gaussian, there is a key result, the run-length ratio identity (Coles & Barrow, 1987;Lawrance & Kottegoda, 1977;Vanmarcke, 2010) for random variables, which does not require any particular form for the observed distribution, as long as we can meaningfully take averages, and the other necessary continuity conditions are satisfied by the continuous physical process underlying the observed discrete time series (Vanmarcke, 1985). Following Lawrance and Kottegoda (1977), we assume that the observations may be described as a stationary discrete time series of random variables {X t , t = 1, 2...} with X t at time t, and we will consider heat waves as intervals where the time series exceeds a threshold u > E(X t ). The X t do not need to be Gaussian distributed, and we assume no special time-structure, that is, there is no particular form of temporal correlation. The beginning of a heatwave is then an upcrossing of u, it occurs at time t if X t − 1 < u and X t > u and the upcrossing occurs with probability P(X t − 1 < u, X t > u).
If the number of upcrossings in time interval of M observations (0, M) is N M (u) then its expectation value is This is the number of heatwaves that occur on average in the time interval of M observations.
The probability of an observation being found above the threshold u is P(X t > u) and so the mean number of observations found above u in time interval (0, M) is P(X t > u)M. This is the total time spent in heatwave conditions on average in the time interval of M observations. The mean duration of runs of observations above u, that is, the average duration of a heatwave, is then given by the total time spent in heatwave conditions (on average in time interval M) divided by the number of heatwaves (occurring on average in time interval M):̄( which is a statement of the run-length ratio identity. Now P(X t > u) = 1 − C(u) where C(X) is the cumulative density function (cdf) of X t . For a cdf of daily observations, the mean run duration̄(u) is also in days.
Now the mean of the number of runs (or heatwaves) that occur in time (0, M) also determines the average return period for runs that exceed u which is If we have daily observations then time interval M refers to M days. However, the return period is most usefully expressed in years. M 0 then determines the units of return period, so that for an annual average, M∕M 0 = 365 and for a summer seasonal average M∕M 0 = 92. We then havē We have verified this relationship using simulated stochastic processes with different time-correlation, examples of these tests are shown in supporting information Figures S5-S9. The return period and run length are thus related to each other through the cdf. If we can estimate the value of the cdf of daily temperature observations at the heat wave threshold temperature u then we can also estimate the average heatwave duration at a given return period, or alternatively, the average return period of a heatwave of a given duration. This expression does not contain any information about time correlation that would arise from atmospheric blocking or other regional scale interactions so it does not determine return periods and run lengths independently, or obtain their distributions. Furthermore, it cannot capture correlation between runs in daytime and nighttime temperatures both rising above thresholds, which is the UK Met Office "Heat Health watch" heatwave definition. It does however translate observed changes in the distribution of daily temperatures to changes in average heatwave properties, in the sense of runs of consecutive days above a threshold. used to relate the length of a run of hot days above a given temperature threshold (heatwave duration) to its return period. We plot (a) run length versus temperature threshold for given return periods; (b) return period versus temperature threshold for a given run length and (c) return period versus run length at given threshold temperatures.

Data and CDF Estimation
We apply this expression to a well-studied data set, the Central England temperature (CET) record (Parker et al., 1992). The CET has undergone extensive analysis to quantify trends with time and there is evidence of warming both in the mean value and in the seasonal extremes (Brabson & Palutikof, 2002). We consider the data since 1878 where daily maxima and minima are known to a precision estimated to be better than a degree (Parker & Horton, 2005). The daily observed maximum temperature observations for June, July, August (JJA) for each of nine consecutive years are used to form a cdf; all plots will relate to the central year. The CET of daily maxima from 1878 to 2018 then gives samples centered on years 1882-2014. Kernel density estimates (Silverman, 1986) of the cdfs allow them to be evaluated at a specific threshold value. In the plots the cdfs are sampled at every 0.1 • , the same resolution as the CET. We then have a running kernel density estimate of the cdf, one for each year that is at the middle of each 9-year sample from 1882-2014, we plot this in Figure 1 along with the running probability density function (pdf). The choice of the number (nine) years in each sample is to optimize resolution of the cdf both at the higher quantiles and in time. In the supporting information we reproduce Figure 1 for samples of 1, 5, and 13 years (see Figures S1-S3). Figure 2 is an illustration of how equation (4) and the cdf formed from the consecutive summer (June, July, August) seasons for the years 2006-2014 relates the average length of a run of hot days above a given temperature threshold (heatwave duration) to its average return period. These plots indicate how the cdf can be translated into useful information at user-specific temperature thresholds. For example one can read from the plots that in the summer of 2010 (the central year of the 2006-2014 sample), a 6-day run of consecutive days in which the daily maximum temperature exceeds 28 • C, a threshold for overheating of buildings (Lomas & Porritt, 2017), has on average a return period of 3-4 years. An estimate of the cdf uncertainty is provided by 95% confidence bounds using the Greenwood (1926) formula, an example of these is shown in Figure S4. This is then transformed to give a corresponding uncertainty in average return period (at fixed run length) and run length (at fixed return period). The uncertainties that we will plot in Figures 3 and 4 are those of the observations (Parker & Horton, 2005) and those of the estimated cdf. They linearly combine to ∼ ±1-2 • C.

Averaged Heat Wave Properties
We now look across time to see how these average heatwave properties have changed over the last 140 years. We select one of the curves from the center panel of Figure 2, the return period of runs of six consecutive days where the daily maximum temperature is above a given threshold, and in Figure 3 we plot it for each of the overlapping nine summer season cdfs which have central years . Values corresponding to the 0.95 and 0.99 quantiles of the cdf of daily observations are indicated in Figure 3 and we can see on Figure 3 that to quantify any changes in the average return periods of heatwaves at user-relevant temperature thresholds (maximum daily T > 25-30 • C) and duration (about a week) requires resolving the underlying cdf of maximum daily temperature observations at and above the 0.95 quantile. A specific, user dependent choice of heatwave threshold, duration, and return period identifies the region of the cdf which needs to be well resolved statistically in order to quantify heatwave properties and how they may be changing in time. This constrains the minimum sample interval used to form the cdfs as both the number of daily observations in the sample and the functional form of the cdf, directly translate to an uncertainty which we estimate empirically as discussed above. In Figure 3 we can see that the curves move progressively to the right with increasing time to an extent exceeding the estimated ±1-2 • uncertainty. If we choose the threshold temperature to be 28 • C (black vertical line) then the return period of a 6-day heatwave has changed from ∼6-8 to ∼2-4 years over the period from the early 20th century to the early 21st century; it is about two to three times more frequent on average. In the supporting information we provide companion plots to Figure 3 corresponding to Figures 2a and 2c. They all rely on the same underlying cdfs but present the same changes in different ways. If we focus on a threshold of 28 • C, heatwaves with a 5-year return period would have been on average a week long in the late 1880's, and would now last about 2-3 weeks.
Although the curves move progressively to the right as time increases in Figure 3, it is not a smooth monotonic trend and there is considerable variability. We can plot the detailed time dynamics at a specific threshold temperature, return period, and duration and this is shown in Figure 4. Figure 4a plots the cut through Figure 3a at a threshold of 28 • C and duration of 6 days and shows how the return period of these heatwaves has changed. Figure 4b plots the analogous cut (though Figure S10) at a threshold of 28 • C for a  (1) relates these cdfs to average return periods and run lengths where the daily maximum temperature is above a given threshold. Color indicates the sample central year in the time sequence as in Figure 3. Gray shading indicates uncertainties estimated as the larger of that from 95% confidence bounds in the underlying cdf estimated using the Greenwood (1926) formula and from an intrinsic ±1 • C in the temperature time series (Parker & Horton, 2005).
return period of 5 years and shows how the duration of these heatwaves has changed. This quantifies overall how heatwaves have become more frequent and longer lasting. By fixing the return period and duration as in Figure 4c we can track the change in overall heatwave intensity, that is, the change in the threshold that the maximum daily temperature exceeds in each day of the heatwave. A week-long heatwave with an average return period of 5 years had a threshold temperature in the late 1880's typically below the 28 • C value for overheating of buildings. Now, the heatwave threshold is typically above 28 • C so that the threshold for overheating of buildings is almost always exceeded in each day of the heatwave .
The variability before about 1950 is approximately consistent with the estimated underlying cdf uncertainties. However, large decadal time scale excursions also occur, most notably around 1960-1970 and in the recent past (the "hiatus"; Stocker et al., 2013). These arise from the time variability in the high quantiles of the cdf, that can be seen in Figure 1, which do not necessarily simply follow that of the mean. This underlines the difficulty in perception, based on a few decades in time, of any secular trend in heatwave properties.

Conclusions
We have estimated the overall secular changes in summer heatwave average occurrence rates, duration, and intensity which arise from the observed upward drift in the tail of the distribution of CET daily temperature observations. These changes imply significant heatwave impact for end users. They were obtained empirically directly from the data and did not require knowledge of the causes of observed climate change, of trends in atmospheric greenhouse gases, or nonlinear modeling needed to capture changes in patterns of regional convection and atmospheric blocking. As we did not use global climate models, instead dealing directly with observations, our results are interpretative rather than predictive. Both forward prediction and detailed attribution of heatwaves to anthropogenic forcing require models that can capture the full time dynamics of the nonlinearities of the system. It is well established that both the mean and the higher moments of surface temperatures are changing (Klein Tank et al., 2005) such that the length of the longest excursion above a threshold will increase (Della-Marta et al., 2007). Our method of analysis provides a quite general framework to take as input quantitative, user-relevant heatwave properties, namely, temperature threshold, average duration, and return time, and provide as output the range of quantiles of the cdf that must be accurately resolved in data, and in models.
These results were obtained for a single time series. The same method can be applied simultaneously across many single-station, or spatially gridded, daily temperature observations, for which there are data sets since the early 1950s. Indeed, maps of time change in the cdf at high quantiles and corresponding temperature changes Stainforth et al., 2013) translate directly into maps of changes in average heatwave properties. However, care is needed in using spatial data that has required spatial interpolation as this can modify the high quantiles of the cdf, upon which we have seen heatwave behavior can depend.