Volume 10, Issue 6 e2021EF002271
Commentary
Open Access

Attributing and Projecting Heatwaves Is Hard: We Can Do Better

Geert Jan Van Oldenborgh

KNMI, De Bilt, The Netherlands

Contribution: Conceptualization, Methodology, Software, Validation, Data curation, Writing - original draft, Visualization, Funding acquisition

Michael F. Wehner

Corresponding Author

Applied Mathematics and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

Correspondence to:

M. F. Wehner,

[email protected]

Contribution: Methodology, Writing - review & editing

Robert Vautard

LSCE/IPSL, laboratoire CEA/CNRS/UVSQ, Gif-sur-Yvette Cedex, France

Contribution: Conceptualization, Validation, Writing - review & editing

Friederike E. L. Otto

Grantham Institute for Climate Change and the Environment, Imperial College, London, UK

Contribution: Conceptualization, Methodology, Funding acquisition

Sonia I. Seneviratne

Institute for Atmospheric and Climate Science, ETH Zürich, Zürich, Switzerland

Contribution: Conceptualization

Peter A. Stott

University of Exeter and Met Office Hadley Centre, Exeter, UK

Contribution: Methodology

Gabriele C. Hegerl

Geosciences, University of Edinburgh, Edinburgh, UK

Contribution: Validation, Writing - review & editing

Sjoukje Y. Philip

KNMI, De Bilt, The Netherlands

Contribution: Methodology, Writing - review & editing, Project administration, Funding acquisition

Sarah F. Kew

KNMI, De Bilt, The Netherlands

Contribution: Methodology, Writing - review & editing, Project administration, Funding acquisition

First published: 19 May 2022

Abstract

It sounds straightforward. As the Earth warms due to the increased concentration of greenhouse gases in the atmosphere, global temperatures rise and so heatwaves become warmer as well. This means that a fixed temperature threshold is passed more often: the probability of extreme heat increases. However, land use changes, vegetation change, irrigation, air pollution, and other changes also drive local and regional trends in heatwaves. Sometimes they enhance heatwave intensity, but they can also counteract the effects of climate change, and in some regions, the mechanisms that affect trends in heatwaves have not yet been fully identified. Climate models simulate heatwaves and the increased intensity and probability of extreme heat reasonably well on large scales. However, changes in annual daily maximum temperatures do not follow global warming over some regions, including the Eastern United States and parts of Asia, reflecting the influence of local drivers as well as natural variability. Also, temperature variability is unrealistic in many models, which can then fail standard quality checks. Therefore, reliable attribution and projection of change in heatwaves remain a major scientific challenge in many regions, particularly where the moisture budget is not well simulated, and where land surface changes, changes in short-lived forcers, and soil moisture interactions are important.

Key Points

  • The IPCC AR6 WG1 states the “frequency and intensity of hot extremes have increased”

  • The IPCC notes that the effect of increased greenhouse gas on high temperatures is moderated or amplified at local scales by other factors

  • Confident quantitative attribution statements of the human influence on heatwaves are limited by our understanding of these local processes

Plain Language Summary

Heatwaves are arguably the most deadly weather phenomena. As the Earth warms due to higher concentrations of greenhouse gases, one would expect heatwaves to become worse as well, killing even more people unless they are better protected against the heat. However, it turns out that the world is not so simple and that many other factors also influence heatwaves. Land use changes, irrigation, air pollution, and other changes also drive trends in heatwaves. Some of these cause much larger trends while some have counteracted the climate change-driven trends up to now. In some regions, the causes of high trends have not yet been identified. Current generation climate models often do not simulate all these mechanisms correctly, so they will have to be improved before we can more confidently trust their description of past trends and projections of future trends in heatwaves.

1 Introduction

Extreme heat is one of the deadliest natural hazards (Harrington & Otto, 2020) and is also one where climate change really is a game changer. For instance, European heatwaves were diagnosed as the deadliest disaster of 2019 (Vautard et al., 2020). The recently released IPCC report concluded that “It is virtually certain that hot extremes (including heatwaves) have become more frequent and more intense across most land regions since the 1950s, with high confidence that human-induced climate change is the main driver of these changes. Some recent hot extremes observed over the past decade would have been extremely unlikely to occur without human influence on the climate system” (Seneviratne et al., 2021). However, challenges arise in the observed trends at the local to regional scales that matter for planning and adaptation. Although changes in heatwaves are widely thought to be simpler to attribute to anthropogenic climate change than precipitation events, there remain significant challenges. Recent work of the World Weather Attribution initiative has highlighted the general issues in attributing regional changes in extreme weather (Philip et al., 2020; van Oldenborgh, van der Wiel, et al., 2021). This paper focuses on the specific problems that can be encountered when attributing heatwaves to human influence.

First, the observed trends in heatwave frequencies are not always positive, or at least do not closely follow global warming. Figure 1a shows the trends in the maximum temperature of the hottest day of the year (TXx) for the last century of GHCN-D v2 stations as a regression on smoothed global mean temperature. Apart from individual station records showing breaks or spurious trends, there are coherent areas with negative or zero trends. In the Central Plains of the United States, the highest temperatures were observed during the Dust Bowl of the 1930s (Cook et al., 2011; Cowan, Hegerl, et al., 2020; Donat et al., 2016), not in recent years, while in the Central-Eastern United States, hot extremes are also not steadily increasing with global mean temperature during recent decades and daytime maxima show different trends from minima (Portmann et al., 2009). In India, extreme high temperatures have no or very small trends since the 1970s (van Oldenborgh et al., 2018) (Figure 1b). This highlights that drivers other than anthropogenic GHG emissions also play an important role in heatwaves.

Second, climate models do simulate increasing frequencies and intensities in heatwaves on large geographical scales, but their skill in simulating the observed trends on smaller scales is collectively poor across the world, notably in regions with good observational data and huge modeling efforts, for example, Europe and North America. In an attribution study of the 2019 European record heatwaves, we found that the highest temperatures of the year in that region have increased in observations much more than in the 44 global and regional climate simulations analyzed (Vautard et al., 2020). Similar problems were already reported in previous generation regional climate models (Min et al., 2013), suggesting that model development has not addressed these deficiencies. In other regions, notably eastern North America and India (Cowan, Hegerl, et al., 2020; Donat et al., 2017; van Oldenborgh et al., 2018), the problem is reversed with models considerably overestimating the observed trends. In addition, there is a lack of consistency in simulating the magnitude of trends in heat extremes in different model ensembles (regional EURO-CORDEX vs. global CMIP5) and model generations (CMIP5 vs. CMIP6; Coppola et al., 2021). While there is little difference between the CMIP5 and CMIP6 ensembles in global skill metrics of their simulation quality of average TXx, many models fail to adequately simulate long period return values of extreme heat (Wehner et al., 2020). Comprehensive analyses relating such metrics to model performance in simulating trends have yet to be conducted.

Figure 1. (a) Trend in the highest maximum temperature of the year (TXx) as a regression on 4-year smoothed global mean surface temperature (GMST). GHCN-D v2 stations with a minimum radial separation of 2°, and at least 50 years of data in 1900–2019 are shown. (b) The same for 1970–2019 and at least 30 years of data. Units: °C per °C global warming.
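The trend measure used in these maps (a regression of station TXx on smoothed GMST) can be illustrated with a minimal sketch. The function names are hypothetical, and the trailing 4-year running mean is an assumption, since the text does not state whether the smoothing window is trailing or centered.

```python
import numpy as np

def smooth_running_mean(series, window=4):
    """Trailing running mean; early values average whatever years are available.

    Assumption: the text only says "4-year smoothed", so the trailing
    window used here is illustrative.
    """
    series = np.asarray(series, dtype=float)
    out = np.empty_like(series)
    for i in range(series.size):
        out[i] = series[max(0, i - window + 1): i + 1].mean()
    return out

def trend_per_degree(txx, gmst, window=4):
    """Least-squares slope of TXx against smoothed GMST (°C per °C warming)."""
    txx = np.asarray(txx, dtype=float)
    smoothed = smooth_running_mean(gmst, window)
    valid = ~np.isnan(txx)  # skip years with missing station data
    slope, _intercept = np.polyfit(smoothed[valid], txx[valid], 1)
    return slope
```

A slope near 1 would mean the hottest day of the year warms at the same rate as the global mean; negative values correspond to the coherent cooling regions discussed above.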

While there is no doubt that at very large spatial and temporal scales, heatwaves are increasing and models do represent this, the change in daily maximum temperature, that is, extreme heat, is very different on the scales where people live and decisions on preparedness are made. With the current generation of climate models, we are unable to quantify this change reliably, which affects our ability to reliably attribute changes in probabilities of hot extremes on relevant spatial scales. Given these discrepancies in representing the past, confidence in quantitative projections of heat extremes remains low. In the remainder of the paper, we illustrate the problem, and discuss and test the reasons for these discrepancies that have been suggested in the literature, ending with a set of priorities for future research.

2 Heatwave Characteristics

We start our investigation by laying out the basic properties of heatwaves, not all of which are well-known. Any assessment of changes in heatwaves depends strongly on how these are defined. Definitions frequently employed include continent-averaged seasonal mean temperature (e.g., Stott et al., 2004), a quantity climate models are able to simulate well, and that is strongly correlated with external forcing. Such a definition also maximizes the signal-to-noise ratio as natural variability is averaged out more than the anomaly corresponding to the event itself (Angélil et al., 2018). At the other end of the spectrum is the local instantaneous highest single-day temperature in a year (often denoted by TXx). This definition corresponds to a broad understanding of heatwaves in the general public as the media usually reports daily records. It also corresponds to health impacts in places where the most vulnerable population is working outdoors, such as outdoor laborers in India (Nag et al., 2009) or in Central California (Castillo et al., 2021). In Europe, a few days' average of daily mean or maximum temperature describes the impacts of extreme heat on the population better, accounting for accumulation of the effect of heat on the most vulnerable population indoors (D’Ippoliti et al., 2010; Heaviside et al., 2017). Here, we consider annual maximum daily maximum temperature, TXx, generally the hottest summer afternoon each year, as it is very widely used (Hartmann et al., 2013) and can be well compared with station observations. We have previously found that using the maximum of the 3-day running mean of daily mean temperature (TG3x) is more appropriate for health impacts in Europe and gives very similar results (Kew et al., 2019; Vautard et al., 2020).
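The two indices named above can be computed from daily station series in a few lines. This is a sketch with hypothetical function names; the centered 3-day window and the choice to let it run across calendar-year boundaries are assumptions, not details taken from the text.

```python
import numpy as np

def annual_max(values, years):
    """Return (unique years, maximum value in each year), ignoring NaNs."""
    values = np.asarray(values, dtype=float)
    years = np.asarray(years)
    unique_years = np.unique(years)
    maxima = np.array([np.nanmax(values[years == y]) for y in unique_years])
    return unique_years, maxima

def running_mean_3day(tmean):
    """Centered 3-day running mean; the first and last day are left as NaN."""
    tmean = np.asarray(tmean, dtype=float)
    out = np.full(tmean.shape, np.nan)
    out[1:-1] = (tmean[:-2] + tmean[1:-1] + tmean[2:]) / 3.0
    return out

def txx(tmax, years):
    """TXx: annual maximum of daily maximum temperature."""
    return annual_max(tmax, years)

def tg3x(tmean, years):
    """TG3x: annual maximum of the 3-day running mean of daily mean temperature."""
    return annual_max(running_mean_3day(tmean), years)
```

Both functions take one value per day plus the calendar year of each day, so they can be applied directly to a continuous multi-year daily record.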

In general, the distribution of the temperature of the hottest afternoon of the year is described well by a generalized extreme value (GEV) distribution, in agreement with extreme value theory (Coles, 2001). The distribution is not stationary but changes with global warming and other drivers of local temperature trends. An efficient and often realistic way to describe these changes is to assume the whole distribution shifts up with an indicator of climate change, for which the smoothed global mean surface temperature (GMST) is an often used metric (Philip et al., 2020). This variable is well-estimated and updated in real time. The scale and shape parameters describing the variability and tail shape are thus assumed constant. As an example, the observations and fit are shown in Figure 2 for De Bilt in the Netherlands, which has a long homogenized record. The low-pass filtered time series resembles the well-known GMST increase, indicating that global warming is the dominant driver of the non-stationarity, thus justifying its use as a covariate. A secondary driver might be local aerosols intercepting incoming solar radiation, but this averages out in the fit as the effects of dimming from the 1960s–1980s and the subsequent brightening up to the 2000s cancel if the analysis period includes both.
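Written out, a shifting GEV of this kind takes the following form. The symbols are introduced here for illustration and are not defined in the text: $\mu_0$ is the location at zero GMST anomaly, $\alpha$ the trend in °C per °C global warming, $T'$ the smoothed GMST anomaly, $\sigma$ the scale, and $\xi$ the shape, using the common convention in which $\xi < 0$ gives a bounded upper tail.

```latex
F(x \mid T') = \exp\left\{ -\left[ 1 + \xi \, \frac{x - \mu(T')}{\sigma} \right]^{-1/\xi} \right\},
\qquad
\mu(T') = \mu_0 + \alpha T',
\qquad
\sigma, \xi \ \text{constant},
```

valid where $1 + \xi\,(x - \mu(T'))/\sigma > 0$. Only the location parameter depends on GMST, which is what "the whole distribution shifts up" means in practice.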

Figure 2. (a) Highest maximum temperature of the year (TXx) at De Bilt, the Netherlands. The green curve denotes a 10-year running mean. (b) Gumbel (return time) plot of a GEV fit of TXx shifted with the smoothed GMST. The red lines indicate the fit and the 95% confidence intervals in the current climate (2019), the blue lines in the early industrial climate (1.2°C lower GMST). The observations are shown twice: once shifted to the early industrial climate using the fitted trend (blue stars), once shifted to the climate of 2019 (red pluses). The purple line denotes the value observed in 2019, which is included in the fit.

The curves in Figure 2b denote the GEV for two values of the GMST, in a 1.2°C cooler world (blue, early industrial) and 2019 (red, the current climate during a recent extreme). For comparison, the observations are shown for the early industrial and current climates, shifted with the fitted trend from the actual smoothed GMST.

The shape parameter of the GEV distribution is almost always found to be negative in heatwave analyses, resulting in the distribution having an upper bound (Wehner et al., 2018). This shape of the tail implies that the probability of an event to occur decreases rapidly as the upper bound is approached and is zero above it. We are not aware of a rigorous derivation of the origin of the upper bound in the literature. We think it could be a consequence of the nonlinearities in the surface energy balance and its interaction with the water balance, plus convection as a moderating effect. Both the sensible and latent heat fluxes increase rapidly with temperature. The assumption of constant scale and shape parameters in the distribution implies that the upper bound shifts with the rest of the distribution, which is found in observations as well as historical model simulations (Vautard et al., 2020).
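For reference, the upper bound implied by a negative shape parameter follows directly from the GEV support condition. The notation is assumed for illustration (location $\mu$, scale $\sigma$, shape $\xi < 0$, trend $\alpha$ in °C per °C of smoothed GMST anomaly $T'$):

```latex
1 + \xi \, \frac{x - \mu}{\sigma} > 0
\quad\Longrightarrow\quad
x < x_{\max} = \mu - \frac{\sigma}{\xi}
\qquad (\xi < 0),
```

and with a location that shifts as $\mu(T') = \mu_0 + \alpha T'$ while $\sigma$ and $\xi$ stay fixed,

```latex
x_{\max}(T') = \mu_0 + \alpha T' - \frac{\sigma}{\xi},
```

so the bound moves by $\alpha$ °C for every °C of global warming, in line with the statement that the upper bound shifts with the rest of the distribution.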

This procedure of fitting a GEV shifting with GMST to the observed annual maxima allows us to answer the question of how much hotter and more likely extreme heat is now than it was a century ago. Applying this method to the TXx at De Bilt observed on July 27, 2019, 37.5°C, denoted by the purple line in Figure 2b, we find that the record observed in 2019 would have been virtually impossible in the climate of 1900 (the purple line is above the blue central curve representing the best fit). Taking the upper bound of the 95% confidence interval (obtained from bootstrapping; Philip et al., 2020) gives a return time of at least 15,000 years in the climate of 1900. In the warmer climate of today, the return period of that event is about 30 years, with a lower bound of 13 years (intersections with the red curves), while the magnitude of a temperature extreme of this rarity is about 4.0 ± 1.1°C (2σ bounds) higher than it would have been in the early industrial climate. Similar analyses have been done for areas where heatwaves have not increased at all in temperature (van Oldenborgh et al., 2018). However, these observational analyses only detect a trend or its absence; they cannot attribute its causes.
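A minimal sketch of such a fit is given below, assuming a maximum-likelihood estimate with SciPy rather than the authors' own software. The function names, the starting values, and the Nelder-Mead optimizer are illustrative choices, and the bootstrap used for the confidence intervals is omitted. Note that `scipy.stats.genextreme` uses a shape parameter `c` with the opposite sign to the usual ξ.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def fit_shifted_gev(txx, gmst):
    """Maximum-likelihood fit of a GEV whose location shifts linearly with GMST.

    Returns (mu0, alpha, sigma, xi); xi < 0 means a bounded upper tail.
    """
    txx = np.asarray(txx, dtype=float)
    gmst = np.asarray(gmst, dtype=float)

    def neg_log_lik(params):
        mu0, alpha, log_sigma, xi = params
        # scipy's shape parameter c equals -xi.
        logpdf = genextreme.logpdf(
            txx, -xi, loc=mu0 + alpha * gmst, scale=np.exp(log_sigma)
        )
        total = -np.sum(logpdf)
        # Penalize parameter sets that put observations outside the support.
        return total if np.isfinite(total) else 1e12

    # Illustrative starting point: ordinary least squares for the trend.
    slope, intercept = np.polyfit(gmst, txx, 1)
    resid_std = np.std(txx - (intercept + slope * gmst))
    start = [intercept, slope, np.log(resid_std), -0.1]
    result = minimize(neg_log_lik, start, method="Nelder-Mead",
                      options={"maxiter": 5000})
    mu0, alpha, log_sigma, xi = result.x
    return mu0, alpha, np.exp(log_sigma), xi

def exceedance_probability(x, gmst_value, mu0, alpha, sigma, xi):
    """Annual probability of TXx exceeding x at a given smoothed GMST."""
    return genextreme.sf(x, -xi, loc=mu0 + alpha * gmst_value, scale=sigma)

def return_period(x, gmst_value, mu0, alpha, sigma, xi):
    """Return period in years; infinite if x lies above the upper bound."""
    p = exceedance_probability(x, gmst_value, mu0, alpha, sigma, xi)
    return np.inf if p == 0 else 1.0 / p

def probability_ratio(x, gmst_now, gmst_past, params):
    """Probability of exceeding x now divided by that in the past climate."""
    p_now = exceedance_probability(x, gmst_now, *params)
    p_past = exceedance_probability(x, gmst_past, *params)
    return np.inf if p_past == 0 else p_now / p_past
```

Evaluating `return_period` at two GMST values (for example the current value and one 1.2°C lower) reproduces the kind of comparison shown by the red and blue curves; an infinite return period in the cooler climate corresponds to an event above the fitted upper bound.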

3 Potential Causes of Heatwave Trends

Long-term changes in heatwaves are influenced not only by globally well-mixed greenhouse gases but also by more localized influences, including aerosol trends (Péré et al., 2011), land use changes (Cowan, Hegerl, et al., 2020), vegetation and soil moisture changes (Donat et al., 2017), irrigation (Thiery et al., 2017), and urbanization effects (Heaviside et al., 2017). Furthermore, the meteorological conditions conducive to heatwaves could change regionally by potential changes in mean atmospheric circulation or in the frequency of specific weather patterns leading to extreme heat (Horton et al., 2015).

Local circumstances such as the thermometer screen and its immediate surroundings also influence TXx in observations disproportionately and must be homogenized before observations can be compared to models. The De Bilt series has been homogenized for the change in screen and displacement to a less sheltered location in 1951 (Brandsma, 2016). Urban heat effects could in theory also affect the highest temperatures, but in De Bilt they are small as record temperatures are always attained during southerly or southeasterly wind directions with no urban areas within 10 km upstream. However, for other stations, it might not be as straightforward to identify whether such local effects are small as, for instance, many inner-city stations are in city parks. The anomalously large or small trends observed in these stations (e.g., Madrid Retiro and Dublin Phoenix Park) raise the suspicion that they could be influenced more by changes in the lawn sprinkling schedules than by global warming, so we avoid using these observations in our attribution analyses and would recommend not using them for model/observation comparisons. These issues call for a detailed investigation of station temperature homogeneity and a massive effort in homogenization in many places of the world.

4 Climate Model Ensembles Can Misrepresent Local Heatwave Trends

To disentangle all these effects on heatwaves and isolate the change driven by anthropogenic climate change, we have to turn to climate models. Figure 3 shows the simulated trend in TXx in the CMIP5 ensemble of opportunity (Sillmann et al., 2013) with a median resolution of about 200 km over roughly the same periods as Figure 1. (We have excluded MIROC-ESM and MIROC-ESM-CHEM, as these have intermittent physically impossible high temperatures in the deserts.) The maps show less structure than the observed trends. This is partly due to the natural variability being averaged out and partly due to missing or misrepresented local forcings. Notably, in neither time period does the multimodel average represent the observed negative or neutral trends in TXx in central and eastern North America. The early heatwaves in the Central Great Plains have been linked to rapid devegetation in the 1930s associated with the Dust Bowl drought, which led to record heatwaves at the time that have not yet been superseded (Cowan, Hegerl, et al., 2020; Cowan, Undorf, et al., 2020). A factor in the negative trends in the eastern United States may be downstream effects of increasing irrigation further west (DeAngelis et al., 2010) from the 1950s onwards, coinciding with a positive precipitation trend (Kirtman et al., 2013; Portmann et al., 2009). Changes in agricultural practices leading to higher evaporation have also been implicated (Changnon et al., 2003). It has been speculated that revegetation after the decline of agriculture might also have been a factor (Portmann et al., 2009). The CMIP5 models do not include cooling due to irrigation, which leads to biases in trends over the United States, Iran, Pakistan, and India (Mueller et al., 2016; Thiery et al., 2017), although some specialized simulations do (Lawston et al., 2020; Lobell & Bonfils, 2008). Furthermore, they likely misrepresent the warming effect of black carbon and the cooling effect of sulfate aerosols over India (Padma Kumari et al., 2007), and they are not forced with rapid vegetation changes (Cowan, Undorf, et al., 2020).

Figure 3. Trend in the highest maximum temperature of the year (TXx) as a regression on 4-year smoothed global mean surface temperature (GMST) as in Figure 1a but for the historical/RCP4.5 CMIP5 ensemble (Sillmann et al., 2013) for (a) 1900–2019 and (b) 1950–2019 using the ensemble mean global mean temperature as covariate. Units: °C per °C global warming.

While Figure 3 does show a stronger warming trend over Europe than in other parts of the world, the multimodel average does not accurately represent the much higher observed trends in western Europe (Min et al., 2013) or southeastern Australia (van Oldenborgh, Krikken, et al., 2021). So far, there has been little progress in determining whether these discrepancies are due to missing or misrepresented local forcings (aerosols, land use, vegetation, irrigation), overly strong land surface drying in historical heatwaves, or due to natural variability, misrepresented feedbacks, or changes in the observational methods or their local surroundings. Two decades ago, systematic errors in blocking frequency and persistence were a major source of biases in weather and climate models (Palmer et al., 1990), but in modern models, these are realistic in the European summer (Iles et al., 2020; Krikken et al., 2019; Vautard et al., 2020) and thus not a reason for the persisting model deficiencies there.

With respect to natural variability, we find in many locations that the discrepancies between observed and modeled trends are much larger than can be expected on the basis of natural variability and model spread alone. We again use De Bilt as an example, but the results are similar across Western Europe. Figure 4a shows the histogram of trends in the CMIP5 models at the location of De Bilt over 1900–2019, with models with N runs each entered with weight 1/N so that all models have equal weight. Model results show the grid cell enclosing the observation station. If that is an ocean cell, then the nearest cell to the east or west is used (van Oldenborgh, van der Wiel, et al., 2021). Only 6 of the 10 CSIRO ensemble members have trends higher than the observed one; all other models have lower trends. However, the CSIRO model places the Mediterranean warming trend too far north and underestimates the observed global warming trend by 30%. Both these factors give a high trend in heatwaves relative to the global mean temperature rise, but for the wrong reasons: the climate of the Netherlands is not Mediterranean and the CSIRO global mean temperature rise is unrealistically low. The CMIP5 ensemble thus fails to reproduce the observed trends, even though it includes all relevant natural variability, including possible low-frequency effects from the subpolar gyre (Haarsma et al., 2015) and the model spread as proxy for model uncertainty.
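The 1/N weighting described above can be sketched as follows. The function names are hypothetical, and the weighted fraction of runs at or above the observed trend is added here as one simple way to summarize where the observation falls in the ensemble; it is not a statistic quoted in the text.

```python
import numpy as np

def equal_model_weights(model_names):
    """Weight 1/N for each run of a model with N runs, so every model counts once."""
    names = np.asarray(model_names)
    _unique, inverse, counts = np.unique(
        names, return_inverse=True, return_counts=True
    )
    return 1.0 / counts[inverse]

def weighted_histogram(trends, weights, bins=10):
    """Histogram of trends in which each model contributes a total weight of one."""
    return np.histogram(np.asarray(trends, dtype=float), bins=bins, weights=weights)

def fraction_at_or_above(trends, weights, observed):
    """Weighted fraction of the ensemble with a trend at least as large as observed."""
    trends = np.asarray(trends, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights[trends >= observed].sum() / weights.sum()
```

With this weighting, a model contributing ten runs, such as the CSIRO ensemble mentioned above, carries the same total weight as a model contributing a single run.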

Figure 4. (a) Histogram of the TXx trends at De Bilt in the CMIP5 (Taylor et al., 2011) ensemble compared to the observed trend 1900–2019, both expressed as a regression on the (modeled/observed) smoothed GMST. The standard deviation (s.d.) of natural variability is estimated from models with three or more ensemble members. (b) The same for 55 CORDEX RCM/GCM combinations and observations using 1950–2019 or 1970–2019 depending on data availability, and using observed GMST (1950–2019) as reference (Coppola et al., 2021). (c) The same for 30 realizations of the CESM Large Ensemble over the period 1920–2019 (Deser et al., 2020). Units: °C per °C global warming.

Figure 4b shows the same for 55 RCM/GCM combinations at 11 km resolution of the CORDEX ensemble (Coppola et al., 2021; Vautard et al., 2021) for Europe, over 1951–2019 or 1971–2019 (depending on the models' data availability). In this comparison, the observed trend is larger than all modeled trends. This again implies that natural variability is unlikely to be the driver behind the strong increase in heatwaves, which is at the moment not correctly represented in climate models. These models have much higher resolution than the CMIP5 models (median resolution 200 km), so the problem is not solved simply by going to higher resolution; other improvements are also needed.

The recent development of large single-model ensembles (Deser et al., 2020) provides an opportunity to better quantify the natural variability of extreme temperatures and place observed trends in that context (Tebaldi et al., 2021). Figure 4c shows that a histogram of the 1920–2019 De Bilt temperature trends from the Community Earth System Model Large Ensemble (CESM1 LENS) fails to include the observed trend. As for the CMIP5 and CORDEX multimodel ensembles, the modeled trends are lower than observed. We note that variability in mean temperature trends can be overestimated in large ensembles (McKinnon et al., 2017), further suggesting that important processes are missing. Nonetheless, we encourage the modeling community to expand large ensemble simulations to the DAMIP single forcing scenarios (Gillett et al., 2016) to aid in attribution studies.

For other regions where ensembles of models do not reproduce the observed trends in heatwaves, whether too high or too low, a similar conclusion holds: factors in addition to natural variability must be considered, as shown qualitatively for India, for instance (van Oldenborgh et al., 2018). For eastern North America, the agreement of models including realistic land use changes with observations (Cowan, Hegerl, et al., 2020) suggests a limited role of natural decadal variability of the atmosphere, which is in agreement with the low correlation between the natural decadal variability and longer-term trends in temperature extremes (van Oldenborgh et al., 2012). While the trends shown in Figure 4 show that the actual probability of extreme heat in De Bilt is higher than the CMIP5, CORDEX, and the CESM1 LENS ensembles can produce, often the opposite is the case. As Knutson (2017) discusses, statements of “attribution without detected changes” in observations can still be useful, albeit with lower confidence than when observed and simulated trends are mutually consistent.

5 Biases in Variability

These biases in the trend are not the only problem. In particular, to accurately attribute change in probability of extreme events to anthropogenic climate change, the variability of the extremes in the model is as important as the trend. In the case of heatwaves, the upper limit of the probability of an event in the current climate divided by its probability in a climate without global warming (commonly referred to as the probability ratio, PR) rises with increasing variability (Philip et al., 2020). Almost all climate models analyzed have unrealistically high variability, with factors of 1.5–6 higher scale parameters in GEV fits of the high tail in Europe (Leach et al., 2020), and this bias is also apparent in subtropical and tropical regions (Freychet et al., 2021). Overestimation of the variability in extreme temperatures by climate models undermines confidence in our understanding of heatwave trends (Kew et al., 2019; van Oldenborgh, Krikken, et al., 2021; Vautard et al., 2020) as none of the models pass a frequently employed model evaluation test (Philip et al., 2020), which demands that the model GEV fits are compatible (within the sampling uncertainty) with the GEV fit of the observations. In such cases, best estimate attribution statements should not be made. However, this inconsistency between models and observations does not preclude placing conservative lower bounds on the human influence on heatwaves. The overestimation in the variability of extreme temperatures remains unexplained and could result from several processes, for example, excessive land-atmosphere-cloud-precipitation feedbacks (Miralles et al., 2019).
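One simple way to express such a compatibility check in code is to require that the confidence intervals of the fitted GEV parameters from model and observations overlap. This is a sketch under that assumption: the overlap criterion, the function names, and the choice of parameters to compare are illustrative and not taken from the cited protocol.

```python
def intervals_overlap(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def passes_gev_check(model_ci, obs_ci, parameters=("sigma", "xi")):
    """Check that model and observed GEV parameter intervals overlap.

    model_ci and obs_ci map a parameter name to a (low, high) confidence
    interval, for example from bootstrapping the respective fits.
    """
    return all(intervals_overlap(model_ci[p], obs_ci[p]) for p in parameters)

# Illustrative numbers only: a model whose scale parameter is about twice
# the observed one fails the check even though its shape parameter agrees.
observed = {"sigma": (1.2, 1.8), "xi": (-0.35, -0.10)}
model_ok = {"sigma": (1.6, 2.2), "xi": (-0.30, -0.15)}
model_too_variable = {"sigma": (2.7, 3.5), "xi": (-0.30, -0.15)}
```

Under this criterion, a model with a strongly inflated scale parameter would be excluded from a best estimate, while still being usable for a conservative bound as described above.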

6 Conclusions

While large-scale changes in mean temperature are well understood, changes in local and regional heatwaves, particularly, daytime maxima, are much harder to simulate and hence attribute. This failure to understand today's observed trends and the discrepancies between the modeled and observed trends and variability also hinders confidence in projections of the future trends. The extrapolation of the observed trends shown in Figure 1a is very different from the simulated trends from climate model output over the same period in Figure 3.

Heatwaves on the scales at which people experience them are strongly influenced by the local energy budget that determines the partitioning of energy between evaporation and heating, set by the land surface, vegetation, irrigation, and urbanization. Other factors such as circulation changes or aerosols may also be important and feedbacks may well be misrepresented in climate models during these extreme circumstances. Many of these drivers and feedbacks are not well-simulated in current climate models as evidenced by striking discrepancies between observed and modeled trends and variability in certain regions of the globe. We have shown above that the discrepancies cannot always be explained by natural variability and in some cases are well outside the range of CMIP historical simulations even in well-understood regions (Cowan, Undorf, et al., 2020; van Oldenborgh et al., 2018). Diffenbaugh et al. (2017) use four performance metrics to compare observed and simulated trends in TXx, finding broad regions, mostly in Eurasia, where models are deemed adequate. But they also reject the parts of North America we discussed in Section 1. (India was not included in their analyses of TXx.) We have also shown that the failure of climate models to represent trends in heatwaves does not change with the resolution of the model, nor can a previously observed failure in climate models to represent atmospheric blocking be identified in the current generation of models.

On the other hand, process studies have indicated a strong role of surface conditions such as vegetation and moisture availability, suggesting that adapting local vegetation and moisture conditions may be able to moderate, to an extent, extreme local heat (e.g., Heaviside et al., 2017; Stone et al., 2014). We have further highlighted that recent studies indicate that the overestimation of trends in regions like North America and India could be due to a misrepresentation of local irrigation and aerosol effects. Given that there are no corresponding trends in either of these drivers in regions where models considerably underestimate trends in extreme heat, this cannot be the explanation for all deficiencies. Similarly, while changes in measurement technique and location can explain some discrepancies, they are very unlikely to explain the systematic and widespread discrepancies.

This still leaves an uncomfortably large list of potential reasons for our current lack of understanding of the drivers of extreme heat, including land use changes and soil moisture, aerosol effects and atmosphere feedbacks as well as circulation effects other than blocking. Until we simulate realistic effects of all relevant drivers and feedbacks such that these properties agree within the uncertainties of natural variability of the weather in our climate model simulations, we cannot give confident estimates for the change in frequency and intensity in heatwaves due to anthropogenic global warming up to today in those areas where these missing processes are important, but only lower or upper bounds. Nor can we confidently trust the projections of future heatwaves there. There remain three possible broader reasons for divergence between observed and simulated heat extremes at the scales that affect people. First, the possibility that the models are right, but are being given incomplete local information such as missing land surface feedbacks and land use changes. Second, the possibility that the models are truly incorrect and would not have captured observed trends even if these regionally specific matters were fully incorporated in the modeling framework. Third, the possibility that natural variability at local scales predominates over anthropogenic forcing and that the models either do not simulate internal variability correctly or our ensembles are not large enough to capture it. Careful simulation and evaluation of historic events in the context of natural variability help distinguish these contributing factors. In our view, it is thus an immensely important priority for climate model development studies to focus on extreme heat, the deadliest and most immediate effect of human-induced climate change.

7 Eulogy

Geert Jan van Oldenborgh passed away on October 12, 2021 before he could respond to the reviewers' comments. While we mourn the loss of our friend and colleague, we celebrate his life in this, his final scientific paper. His contributions to the science of extreme weather event attribution were immense and will continue to influence us and many others as we continue to understand the effects of global warming on extreme weather.

Acknowledgments

The research and pre-operational settings underpinning this work were supported by the EUPHEME and SERV_FORFIRE projects, which are part of ERA4CS, an ERA-NET initiated by JPI Climate and co-funded by the European Union (Grant #690462). The work was also supported by the French Ministry for solidarity and ecological transition through the “climate services convention” between the Ministry and the Centre National de la Recherche Scientifique. MFW acknowledges support from the Regional & Global Model Analysis (RGMA) program area of the DOE's Office of Science under contract number DE-AC02-05CH11231 and GH acknowledges funding from the EMERGENCE project, NERC NE/S004661/1. This paper builds on results from the Copernicus COP 039 and C3S 62 contracts led by KNMI, which focused on the development of a prototype for extreme events and attribution service within the context of the Copernicus Climate Change Service (C3S) implemented by the European Centre for Medium-Range Weather Forecasts (ECMWF) and funded by the European Union.

Data Availability Statement

All data used in this work are publicly available: GHCN-D v2 is available at https://climexp.knmi.nl/selectdailyseries.cgi. De Bilt TXx was constructed from the daily time series of De Bilt at https://climexp.knmi.nl/getdutchstations.cgi?TYPE=tx. The CMIP5 TXx values can be found at https://climexp.knmi.nl/selectfield_cmip5_annual.cgi. The 55 RCM/GCM combinations at 11 km resolution of the CORDEX ensemble are available via the ESGF as linked from https://cordex.org. The Community Earth System Model Large Ensemble (CESM1 LENS) is available at https://www.cesm.ucar.edu/projects/community-projects/LENS/data-sets.html.