Volume 12, Issue 5 e2023EF004035
Research Article
Open Access

Spatial Sensitivity of River Flooding to Changes in Climate and Land Cover Through Explainable AI

Louise Slater

Corresponding Author

School of Geography and the Environment, University of Oxford, Oxford, UK

Correspondence to:

L. Slater,

[email protected]

Contribution: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - original draft, Writing - review & editing, Visualization, Funding acquisition

Gemma Coxon

School of Geographical Sciences, University of Bristol, Bristol, UK

Contribution: Methodology, Writing - review & editing

Manuela Brunner

WSL Institute for Snow and Avalanche Research SLF, Davos Dorf, Switzerland

Institute for Atmospheric and Climate Science, ETH Zurich, Zurich, Switzerland

Climate Change, Extremes and Natural Hazards in Alpine Regions Research Center CERC, Davos Dorf, Switzerland

Contribution: Methodology, Writing - review & editing

Hilary McMillan

Department of Geography, San Diego State University, San Diego, CA, USA

Contribution: Methodology, Writing - review & editing

Le Yu

Department of Earth System Science, Ministry of Education Key Laboratory for Earth System Modeling, Institute for Global Change Studies, Tsinghua University, Beijing, China

Ministry of Education Ecological Field Station for East Asian Migratory Birds, Beijing, China

Department of Earth System Science, Xi'an Institute of Surveying and Mapping Joint Research Center for Next-Generation Smart Mapping, Tsinghua University, Beijing, China

Contribution: Resources, Writing - review & editing

Yanchen Zheng

School of Geographical Sciences, University of Bristol, Bristol, UK

Contribution: Resources, Writing - review & editing

Abdou Khouakhi

School of Water, Energy and Environment, Centre for Environmental and Agricultural Informatics, Cranfield University, Cranfield, UK

Contribution: Resources, Writing - review & editing

Simon Moulds

School of Geography and the Environment, University of Oxford, Oxford, UK

School of Geo Sciences, University of Edinburgh, Edinburgh, UK

Contribution: Writing - review & editing

Wouter Berghuijs

Department of Earth Sciences, Free University Amsterdam, Amsterdam, The Netherlands

Contribution: Writing - review & editing

First published: 30 April 2024

Abstract

Explaining the spatially variable impacts of flood-generating mechanisms is a longstanding challenge in hydrology, with increasing and decreasing temporal flood trends often found in close regional proximity. Here, we develop a machine learning-informed approach to unravel the drivers of seasonal flood magnitude and explain the spatial variability of their effects in a temperate climate. We employ 11 observed meteorological and land cover (LC) time series variables alongside 8 static catchment attributes to model flood magnitude in 1,268 catchments across Great Britain over four decades. We then perform a sensitivity analysis to assess how a 10% increase in precipitation, a 1°C rise in air temperature, or a 10 percentage point increase in urban or forest LC may affect flood magnitude in catchments with varying characteristics. Our simulations show that increasing precipitation and urbanization both tend to amplify flood magnitude significantly more in catchments with high baseflow contribution and low runoff ratio, which tend to have lower values of specific discharge on average. In contrast, rising air temperature (in the absence of changing precipitation) decreases flood magnitudes, with the largest effects in dry catchments with low baseflow index. Afforestation also tends to decrease floods more in catchments with low groundwater contribution, and in dry catchments in the summer. Our approach may be used to further disentangle the joint effects of multiple flood drivers in individual catchments.

Key Points

  • We employ partial dependence analysis and sensitivity testing to assess where changes in climate or land cover might affect flooding the most

  • Rising precipitation and urbanization tend to lead to larger floods in catchments with high baseflow contribution and low runoff ratio

  • Rising air temperature with unchanged precipitation lowers flood magnitudes, especially in dry catchments with low baseflow index

Plain Language Summary

We developed a machine learning-based approach to investigate why the effects of changes in climate and land cover (LC) on floods vary spatially. To inform the model, we used climate and LC data for 1,268 catchments in Great Britain over four decades. We found that increasing rainfall and urban development tend to lead to larger floods, especially in rivers fed largely by groundwater. In contrast, rising air temperature and afforestation, in the absence of any changes in rainfall, lead to smaller floods, particularly in areas where rivers are less fed by groundwater. We believe that our findings can be used to develop more targeted flood management strategies.

1 Introduction

Understanding and predicting changes in river flooding is key to preparing for future environmental change, but this task is still hampered by a suite of “cascading uncertainties” arising from the hydrological and climate models used (Devitt et al., 2021), the different emissions scenarios, and a suite of local factors such as changing land cover (LC), human water use, and varying catchment characteristics (Seneviratne et al., 2021). At regional scales, the complex interplay of different flood drivers translates as a mix of increasing and decreasing trends in the historical record that are often spatially co-located (Archfield et al., 2016; Slater et al., 2021). In Great Britain (GB) for instance, there have been some increases in floods in the northern and western regions, but spatially mixed patterns of increases and decreases in the rest of the country which are difficult to explain (Hannaford et al., 2021). These spatial patchworks of flood trends suggest that in any given location, the evolution of flooding is not tied solely to shifts in precipitation extremes (Sharma et al., 2018; Wasko & Nathan, 2019), but often to multiple, sometimes conflicting, flood-generating mechanisms.

The field of large-sample hydrology has made great strides in disentangling these different flood-generating mechanisms. The dominant process may include short/long rain, excess rainfall, snowmelt, or a combination of rain and snow (Stein et al., 2020). In many cases, these drivers co-occur and have compounding effects (Jiang et al., 2023). In Europe, for instance, most floods are driven by the combined influence of heavy precipitation and high antecedent soil moisture (Berghuijs et al., 2019). Antecedent soil moisture plays a critical role in flood generation (Slater & Villarini, 2016), to the extent that a prolonged shift from rain-on-dry-soil to rain-on-wet-soil conditions can lead from a flood-poor to a flood-rich period (Tarasova et al., 2023). Baseflow, defined as the portion of streamflow contributed by slow subsurface flow components, often has an even more significant impact on annual flooding than soil moisture (Berghuijs & Slater, 2023). Severe floods may also include snow-related processes, which tend to affect floods over a larger contiguous spatial area than rainfall-driven events (Brunner & Fischer, 2022). Alongside these climatic drivers, changes in forest or urban LC also play a vital role in modulating flood risk, but the effects of LC change can be challenging to measure (e.g., Anderson et al., 2022; Buechel et al., 2022, 2023; Han et al., 2022).

Determining how local catchment characteristics condition the impacts of climate and LC on flooding remains an unresolved research question. Here, we develop a large-sample approach based on a quantile regression forest (QRF) machine learning (ML) model to simulate the role of different flood drivers across GB and assess their spatial variability. Although ML-based methods such as random forests have been used to reveal the influence of certain climate and catchment characteristics on flood-generating processes (Stein et al., 2021), they have not yet been employed to understand how the effects of changing climate and LC on flood magnitudes might vary spatially. ML is helpful in this instance because it can assess the overall effect of predictors for all basins through partial dependence analysis and their local effects through a suite of sensitivity tests based on specific changes in each predictor of interest (precipitation, temperature, urban and forest LC). We address the three following research questions: (a) What is the relative contribution of climate versus LC to seasonal flood magnitude in GB? (b) To what extent can the historical trends in flooding be explained by the trends in these same climate and LC variables? (c) What explains the spatial variability in the effects of climate and LC on floods in GB?

2 Data

We compile a new data set which includes time-series variables of flood magnitude, climate, and LC over four decades (1982–2020) alongside static variables describing catchment characteristics and hydrological signatures for 1,268 river catchments across GB. An overview of the different data sources employed is provided in Table S1 of the Supporting Information S1, with mean values of key predictors shown for each river catchment in Figure S1 of the Supporting Information S1. Daily river flow data (in m³ s⁻¹) were obtained from the National River Flow Archive (NRFA, 2023) using the rnrfa package (Vitolo et al., 2016) and converted to specific discharge (mm d⁻¹) by dividing the daily river flow by the drainage area. Daily values of total precipitation (rainfall, mm d⁻¹) and daily maximum air temperature at 1.5 m (tasmax, °C) were obtained by extracting the weighted catchment-averaged value for each day from the 1 km-resolution HadUK-Grid data set (a collection of gridded climate variables derived from UK land surface observations, created by the UK Met Office Hadley Centre), downloaded from the UK Centre for Environmental Data Analysis (CEDA) archive (Met Office, 2023). The daily potential evapotranspiration (PET, mm d⁻¹) data were computed from HadUK data by Brown et al. (2022) using the Penman-Monteith equation derived in terms of specific humidity (Robinson et al., 2022).
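The conversion from river flow to specific discharge can be illustrated with a short sketch. This is not the authors' code (their workflow used R); the function name and the example values are hypothetical, and the only assumption is the standard unit conversion from a volumetric flow in m³ s⁻¹ over an area in km² to a depth in mm d⁻¹.

```python
SECONDS_PER_DAY = 86_400
SQUARE_METRES_PER_KM2 = 1_000_000
MILLIMETRES_PER_METRE = 1_000


def specific_discharge_mm_per_day(flow_m3_s: float, area_km2: float) -> float:
    """Convert a daily mean river flow (m^3/s) to specific discharge (mm/d).

    Hypothetical helper for illustration only.
    """
    if area_km2 <= 0:
        raise ValueError("drainage area must be positive")
    volume_m3_per_day = flow_m3_s * SECONDS_PER_DAY      # m^3 per day
    area_m2 = area_km2 * SQUARE_METRES_PER_KM2           # m^2
    depth_m_per_day = volume_m3_per_day / area_m2        # m per day
    return depth_m_per_day * MILLIMETRES_PER_METRE       # mm per day


# Made-up example: 10 m^3/s draining a 100 km^2 catchment.
print(specific_discharge_mm_per_day(10.0, 100.0))
```

Equivalently, the specific discharge in mm d⁻¹ is 86.4 times the flow in m³ s⁻¹ divided by the area in km², which makes flood magnitudes comparable across catchments of different sizes.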

Our analyses are performed at the seasonal scale, where winter is DJF, spring is MAM, summer is JJA and autumn is SON. The seasonal scale is employed rather than the event scale, for the purpose of generating seasonal simulations to evaluate how the effects of changes in LC and climate on flood magnitude differ across a range of catchment characteristics. Seasonal streamflow, precipitation, PET, and temperature metrics were computed as follows for seasons with a minimum of 90 daily values. We take the maximum of the mean daily streamflow in each season (i.e., Qmax) to represent seasonal floods. For precipitation, we take the total of all daily values in each season. For PET and maximum air temperature, we take the median daily value in each season. Values of antecedent precipitation and temperature are taken from the previous season. All other meteorological variables—relative humidity (%), ground frost (days), vapor pressure (hPa), and lying snow (days)—were obtained by extracting the catchment-averaged mean seasonal value (or total count of days in each season) from the seasonal 5 km HadUK data (see Table S1 in Supporting Information S1 for details).
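The seasonal aggregation described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the record layout and function names are hypothetical, and assigning December to the winter of the following calendar year is an assumption that the text does not state.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median

SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}
MIN_DAILY_VALUES = 90  # seasons with fewer daily values are skipped


def season_key(day):
    """Return (season_year, season). December is grouped with the following
    January and February (an assumption made for this sketch)."""
    year = day.year + 1 if day.month == 12 else day.year
    return year, SEASON_OF_MONTH[day.month]


def seasonal_metrics(records):
    """records: iterable of (date, flow_mm_d, precip_mm_d, pet_mm_d, tasmax_c)."""
    groups = defaultdict(list)
    for day, flow, precip, pet, tasmax in records:
        groups[season_key(day)].append((flow, precip, pet, tasmax))

    metrics = {}
    for key, rows in groups.items():
        if len(rows) < MIN_DAILY_VALUES:
            continue  # not enough daily values in this season
        flows, precips, pets, temps = zip(*rows)
        metrics[key] = {
            "qmax": max(flows),              # maximum of mean daily streamflow
            "precip_total": sum(precips),    # seasonal precipitation total
            "pet_median": median(pets),      # median daily PET
            "tasmax_median": median(temps),  # median daily maximum temperature
        }
    return metrics


# Synthetic example: one spring (92 days) of made-up daily values.
start = date(2001, 3, 1)
demo = [(start + timedelta(days=i), float(i), 2.0, 1.5, 10.0) for i in range(92)]
print(seasonal_metrics(demo)[(2001, "MAM")])
```

Antecedent predictors would then be read from the entry for the preceding season of the same catchment.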

We include two time series of LC variables: forest (percent of the catchment covered by trees) and impervious surface (percent of the catchment surface that is impervious). Both variables were extracted from the Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC Plus) data set, a timeseries data set of annual LC data at 1 km resolution for the period 1982–2020 (Yu et al., 2022). We selected FROM-GLC Plus because the data cover a longer period than other products, such as the European Space Agency's Climate Change Initiative LC data set (1992–2020). FROM-GLC Plus was developed using ML-based reconstruction of multi-source data with multi-season sample space-time migration. To our knowledge, it represents the best available high-resolution LC product over a four-decade period. The percentage of impervious surface and tree cover in each river catchment was extracted from the 1 km version of FROM-GLC Plus for every year, resulting in a time series of annual impervious and forest data for each catchment. We then tailored our final data set to the period 1982–2020 for which all time series predictors are available. In the sensitivity simulations, we refer to increases in impervious LC and forest LC as “urbanization” and “afforestation.”

Several static catchment descriptors were obtained from the CAMELS-GB data set (Table S1 in Supporting Information S1; Coxon et al., 2020), a compilation of catchment attributes for a large sample of catchments in GB. These are: the catchment drainage area (km²), stream gauge elevation (mASL), static attributes describing the catchment hydrogeology (spatial coverage of fractured aquifers) and soils (percent of organic content, silt, clay, and sand content), the mean river network slope (m km⁻¹), the aridity index (dimensionless), and mean daily precipitation (mm d⁻¹). These variables were selected for their influence on catchment hydrology (Coxon et al., 2020) and their utility for modeling streamflow using large-sample machine-learning models over GB (Lees et al., 2021, 2022). We also consider four dimensionless streamflow signatures from CAMELS-GB that describe hydrologic behavior: the baseflow index (BFI; ratio of mean daily base flow to daily discharge), the runoff ratio (ratio of mean daily discharge to mean daily precipitation), the streamflow-precipitation elasticity (sensitivity of streamflow to changes in precipitation at the annual timescale), and the slope of the flow duration curve (see Table S1 in Supporting Information S1 for details). We use relevant static variables within the model (described in Section 3.1 and shown in Figure 1; Table S1 in Supporting Information S1). The streamflow signatures and other static catchment descriptors such as mean precipitation and aridity index are used to evaluate the sensitivity simulations (Section 3.2) to assess how the influence of increasing precipitation, temperature, urbanization, and afforestation differs across catchments with different characteristics (e.g., catchments with low/high groundwater or surface runoff). The BFI describes the slowly varying component of streamflow contributed by slow subsurface flow. However, it is important to note that hydrograph separation techniques, such as those used to derive the BFI, are purely statistical, and as such ignore the physical processes and contribution of different water sources to streamflow (Duncan, 2019; Penna & van Meerveld, 2019). We also use the runoff ratio to describe the proportion of precipitation that ultimately runs off into rivers; a high runoff ratio indicates that a large proportion of precipitation contributes to streamflow.

Figure 1. Contribution of each driver to flood magnitude across Great Britain. Partial dependence plots (PDPs) are shown for each of the observed variables driving the multi-site quantile regression forest (QRF) model. (a) PDPs for time-series variables in the QRF model. PDPs show the marginal effect of each feature on the predicted flood magnitude, while holding all other predictors constant. Floods are the maximum of the mean daily discharge in each season (Qmax). The three colored lines indicate three completely independent QRF models trained on three randomly sampled equal-sized subsets of catchment IDs. (b) Map of randomly selected catchments. (c) PDPs for static variables in the QRF model. (d) Relative importance of each predictor in the three QRF models constructed for each of the three independent data sets. “Ant.” denotes antecedent values of precipitation or temperature from the previous season, and “Riv. netwk slope” denotes the mean river network slope (see Table S1 in Supporting Information S1).

3 Methodology

3.1 Modeling Approach

We employ a QRF ML model to extract insights about the role of different predictor variables and to simulate the effect of increasing four key predictor variables (precipitation, temperature, urban and forest LC) by fixed amounts. The QRF model is chosen for its ability to detect nonlinear relationships between variables, to provide insight into variable importance, and to generate probabilistic predictions of seasonal floods (Qmax), as well as for its robustness to outliers (as a nonparametric approach). QRFs are a generalization of random forests (Meinshausen & Ridgeway, 2006). Random forests have previously been used in large-sample hydrology to develop multi-catchment models (i.e., pooling data from many catchments at once) and explore the influence of different predictors on streamflow signatures describing hydrologic behavior (e.g., Addor et al., 2018; Bloomfield et al., 2021), “functional flow metrics” used to assess the ecological functionality of streams (Grantham et al., 2022), or “no flow metrics” such as the number of days with no flow during the summer months (Hammond et al., 2021). Unlike a traditional random forest, which predicts only the conditional mean, QRF predicts the entire distribution of the target variable by estimating the conditional quantiles at different probabilities. This makes QRF particularly useful for data with asymmetric or heavy-tailed distributions, as in the case of flood magnitudes. QRFs also require relatively little tuning of their parameters and are stable and robust with noisy predictor variables (see e.g., Papacharalampous et al., 2019). We implement QRF using the ranger R package (Wright & Ziegler, 2015), and we employ the median prediction in the simulation results.

We develop a multi-catchment QRF model that is trained on data from all the selected catchments and seasons at once over the four-decade period (1982–2020; Figure 1). The QRF model includes 11 time series variables and 8 static catchment descriptors (Table S1 in Supporting Information S1), alongside two categorical variables (catchment ID and season). Including the catchment ID in the QRF model leads to improved performance, likely because it enables the model to differentiate the unique rainfall-runoff relationship of each catchment. We also find that developing a single model for all seasons, rather than four individual models (one per season), does not change the results. We used a grid search to assess the effect of tuning hyperparameters (number of trees, number of variables to split at in each node, and minimum node size) on model performance, and found limited improvements compared to the default values. We therefore settled on 100 trees across the different experiments. We include all sites with at least 5 years of data, as previous work indicates that multi-location ML models benefit from a “data synergy effect” with larger and more heterogeneous data sets (Fang et al., 2022), enabling them to capture more general patterns and relationships while reducing the risk of overfitting. We do not restrict the analysis to “reference” sites with natural(ised) streamflow, as the QRF is able to handle offsets at individual sites (such as different rainfall-runoff relationships in each catchment) and we seek to include a wide variety of catchment types across GB. Our target variable is the maximum value of mean daily streamflow in each season (flood magnitude, i.e., Qmax).

To assess the association between the different predictors and Qmax we use partial dependence plots (PDPs) implemented using the pdp R package (Greenwell, 2017). PDPs show how the target variable changes when a given input variable is varied, while holding all other variables constant. We first assess the consistency of the partial dependence patterns across three independent, equally sized data sets based on randomly sampled catchment IDs. We then evaluate the model's performance both spatially and temporally, as in Mai et al. (2022). In the spatial evaluation, we employ five-fold cross-validation with five independent groups of randomly sampled catchments (no overlap of sites across folds). We train the model on four-fifths of the sites, and test on the remaining (unseen) fifth of the sites, then repeat this for each fold (Figure S2 in Supporting Information S1). In the temporal evaluation, we train the model on the historical data (pre-2015) and test on the more recent data (from 2015 to 2020 inclusive) (Figure S3 in Supporting Information S1).
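The two evaluation designs can be sketched as below. This is a hedged illustration, not the authors' code: the row layout (dictionaries with `catchment_id` and `year` keys), the function names, and the random seed are all hypothetical. The key property of the spatial design is that folds are built from catchment IDs rather than from individual rows, so no site appears in both training and test data.

```python
import random


def catchment_folds(catchment_ids, n_folds=5, seed=0):
    """Randomly assign each unique catchment ID to exactly one fold."""
    unique_ids = sorted(set(catchment_ids))
    random.Random(seed).shuffle(unique_ids)
    return [set(unique_ids[i::n_folds]) for i in range(n_folds)]


def spatial_splits(rows, n_folds=5, seed=0):
    """Yield (train_rows, test_rows) with no catchment shared between them."""
    folds = catchment_folds([row["catchment_id"] for row in rows], n_folds, seed)
    for held_out in folds:
        train = [row for row in rows if row["catchment_id"] not in held_out]
        test = [row for row in rows if row["catchment_id"] in held_out]
        yield train, test


def temporal_split(rows, first_test_year=2015):
    """Train on years before first_test_year; test on that year onwards."""
    train = [row for row in rows if row["year"] < first_test_year]
    test = [row for row in rows if row["year"] >= first_test_year]
    return train, test


# Made-up table: 10 catchments with one row per year.
rows = [{"catchment_id": c, "year": y} for c in range(10) for y in range(2010, 2021)]
for train_rows, test_rows in spatial_splits(rows):
    print(len(train_rows), len(test_rows))
```

Splitting by catchment tests how well the model transfers to unseen sites, whereas the temporal split tests how well it transfers to unseen years at sites it has already seen.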

3.2 Sensitivity Tests

To explain the spatial variability of flood drivers, we develop a suite of sensitivity tests for four key variables, namely increases in precipitation and temperature, alongside urbanization and afforestation. We focus on large increases in each variable (as might occur over several decades) to be able to detect a signal in seasonal flood magnitudes. For seasonal maximum daily temperature, our baseline simulation is an increase of 1°C in each season. For reference, the historical increase in air temperature across GB from 1961 to 2015 was estimated at approximately 0.20 ± 0.13°C decade⁻¹ (Blyth et al., 2019), which represents an average increase of 0.8°C over 40 years. For precipitation, we select an increase of 10% (we do not consider potential decreases). For LC, we include an increase in both forest and impervious LC by 10 percentage points (pp; e.g., an increase from 20% to 30% of the catchment afforested is a 10 pp increase), and refer to these as afforestation and urbanization scenarios. These scenarios are on the high end compared to the most likely LC scenarios for the UK (Buechel et al., 2023). High-end scenarios were chosen for LC because small increases in LC have limited effect on streamflow (Buechel et al., 2023) or can be difficult to detect (Anderson et al., 2022), and we seek to assess the extent to which LC may offset or amplify the effects of changes in climate. For comparison, we also run more “extreme” scenarios (+20% precipitation, +2°C, +20 pp afforestation and urbanization), which are shown alongside the baseline scenarios.

The sensitivity tests are based on repeating the four decades of observed data from the historical period like-for-like and perturbing only one variable at a time by the amounts mentioned above. They are not designed to represent real conditions, as the other predictor variables are held constant. Finally, we consider a combined scenario which includes +10% precipitation, +1°C temperature, and +10 pp urbanization. We restrict the simulations to sites with at least 30 years of complete data (555 sites). We then compute the change between the mean of the original streamflow predictions and the mean of the perturbed predictions for each sensitivity test. If a perturbed predictor variable from a given season/catchment exceeds the range seen in the training data across all catchments, the streamflow prediction is removed from both sets of predictions (original and perturbed), because the QRF cannot extrapolate. This approach ensures a robust comparison, as the data points are identical in both sets of predictions. The number of simulated predictors exceeding the observed range is nonetheless very small. Across all 555 sites and four seasons there were only two sites for which precipitation exceeded the training values (+10%), four sites for impervious surface (+10 pp), and nine sites for forest (+10 pp). For temperature (+1°C), there were no outlier predictor values in autumn, spring and winter, but an average of 0.09 values (years) per site in the summer.
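The logic of a single sensitivity test can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the variable names, the toy linear `predict` function standing in for the trained QRF, and the choice to report an absolute difference in means (rather than a relative change) are all hypothetical.

```python
# Baseline perturbations taken from the text; dictionary keys are hypothetical names.
PERTURBATIONS = {
    "precip": lambda value: value * 1.10,      # +10% precipitation
    "tasmax": lambda value: value + 1.0,       # +1 degree C
    "impervious": lambda value: value + 10.0,  # +10 percentage points
    "forest": lambda value: value + 10.0,      # +10 percentage points
}


def sensitivity_change(rows, predict, variable, training_range):
    """Difference between mean perturbed and mean original predictions.

    rows: list of dicts of predictor values (one per catchment-season).
    predict: callable mapping a row to a predicted Qmax (stand-in for the QRF).
    training_range: (min, max) of `variable` across all training data.
    """
    perturb = PERTURBATIONS[variable]
    low, high = training_range
    original, perturbed = [], []
    for row in rows:
        new_value = perturb(row[variable])
        if not low <= new_value <= high:
            continue  # outside the training envelope: drop from BOTH sets
        original.append(predict(row))
        perturbed.append(predict({**row, variable: new_value}))
    if not original:
        raise ValueError("all perturbed values fall outside the training range")
    return sum(perturbed) / len(perturbed) - sum(original) / len(original)


def toy_predict(row):
    """Made-up linear response used only to exercise the function."""
    return 0.01 * row["precip"] - 0.1 * row["tasmax"]


rows = [{"precip": 100.0, "tasmax": 10.0},
        {"precip": 200.0, "tasmax": 10.0},
        {"precip": 1000.0, "tasmax": 10.0}]  # becomes 1100 > 1050, so it is dropped
print(sensitivity_change(rows, toy_predict, "precip", (0.0, 1050.0)))
```

Dropping the out-of-range case from both the original and the perturbed set keeps the two means comparable, which is the point of the filtering step described in the text.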

4 Results and Discussion

4.1 Evaluation of ML Model Performance

The QRF model performs well at predicting the largest daily streamflow value (Qmax) in every season, with mean Pearson's R for the median flood magnitude of 0.83 (R² = 0.69; Nash-Sutcliffe efficiency [NSE] = 0.67) for the spatial five-fold cross-validation and 0.86 (R² = 0.73; NSE = 0.71) for the temporal evaluation (Tables S2 and S3 in Supporting Information S1). Seasonal Qmax is a challenging predictand (Figure S4 in Supporting Information S1), but our QRF model's performance is similar to that of other ML models (e.g., Frame et al. (2022) also report an NSE of 0.71 and an R-value of 0.86 in their temporal evaluation of the annual peak flow event). Here, the temporal split performs slightly better than the spatial split when all catchments are combined (Figures S2b and S3b in Supporting Information S1), but the spatial split performs best at the catchment level (Figures S2c and S3c in Supporting Information S1).
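For readers less familiar with these skill scores, the sketch below computes Pearson's R and the NSE from paired observed and simulated values using their standard definitions. The function names and the toy data are hypothetical, and the code is not taken from the paper.

```python
from math import sqrt


def pearson_r(observed, simulated):
    """Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_o == 0 or var_s == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_o * var_s)


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum of squared errors / sum of squared deviations of the
    observations from their mean. 1 is a perfect fit; 0 is no better than
    predicting the observed mean."""
    mean_o = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_o) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - sse / sst


# Made-up example: simulations that are perfectly correlated but biased.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(obs, sim), nash_sutcliffe(obs, sim))
```

The example shows why the two scores are reported together: correlation is insensitive to bias in magnitude, whereas the NSE penalizes it.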

A primary concern regarding the use of RF/QRF models for prediction is their inability to extrapolate to large, “unseen” events where the value of the predictors exceeds the range of values seen in the training data set, such as an extremely large precipitation event (e.g., Papacharalampous et al., 2019). However, it has been shown that the inclusion of multiple sites within these large-sample (multi-catchment) models—as done here—produces better results than models trained for specific locations (Kratzert et al., 2024). This is because the larger training envelope reduces the likelihood of extrapolation. Previous work has shown that ML models trained on multiple river catchments consistently outperform conceptual and process-based hydrological models (Frame et al., 2022), including basin-wise, subbasin-based, and gridded models (Mai et al., 2022), which makes them suitable for this type of explainability analysis.

4.2 Relative Contribution of Climate and Land Cover to Flooding in GB

We use PDPs to assess the average relative contribution of climate and LC to seasonal flood magnitudes at the national scale. Figure 1 shows all 11 time-series variables and 8 static variables included in the model. PDPs describe the average relationship between the target variable (flood magnitude, on the y-axes) and the predictor variables (on the x-axes), while holding all other predictors constant. We assess the reliability of the multi-catchment approach by splitting the data into three independent, equally sized data sets based on randomly sampled catchment IDs (Figure 1b) and training a separate multi-catchment, multi-season QRF model for each data set. The consistency of the PDP shapes for different predictor variables across the three models provides confidence in the results (Figures 1a–1c). The slight offset among the different PDPs (i.e., slightly higher values of Qmax for sample 3) is small when all PDPs are depicted using the same y-axis.
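The computation behind a partial dependence curve can be sketched in a few lines. This is an illustrative sketch, not the pdp package used in the paper: the function name, the row layout, and the toy linear `predict` function standing in for the trained QRF are hypothetical. For each grid value, the predictor of interest is overwritten in every row, all other predictors keep their observed values, and the predictions are averaged.

```python
def partial_dependence(rows, predict, feature, grid):
    """Return [(grid_value, mean prediction)] for one predictor.

    rows: list of dicts of observed predictor values.
    predict: callable mapping a row to a predicted flood magnitude.
    """
    if not rows:
        raise ValueError("at least one row is required")
    curve = []
    for value in grid:
        predictions = [predict({**row, feature: value}) for row in rows]
        curve.append((value, sum(predictions) / len(predictions)))
    return curve


def toy_predict(row):
    """Made-up response: floods rise with precipitation, fall with temperature."""
    return 0.02 * row["precip"] - 0.1 * row["tasmax"]


rows = [{"precip": 150.0, "tasmax": 10.0}, {"precip": 300.0, "tasmax": 20.0}]
for value, mean_qmax in partial_dependence(rows, toy_predict, "precip", [100.0, 200.0]):
    print(value, mean_qmax)
```

Because the curve averages over all rows, it describes the average national-scale relationship; the spatial differences between catchments are explored separately through the sensitivity tests.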

We find that precipitation and antecedent precipitation from the previous season are the most important drivers of flood magnitude (confirming previous work, such as Berghuijs et al. (2019) and Zheng et al. (2023)), as can be seen from the variable importance plot and the y-axis of the PDPs (Figure 1a). The large y-axis range of the PDP for precipitation emphasizes the stronger influence of precipitation on the seasonal maximum flood relative to other predictors (Figure 1a). The mean river network slope and percentage of organic content in soils are the third and fourth most important predictors, showing the role of catchment characteristics in modulating the catchment response to climate. Larger floods are found in catchments with high soil organic content (greater water holding capacity) (Figure 1c). Air temperature and PET both exert a negative effect on flooding, such that high air temperature and high PET are associated with smaller flood magnitudes, when all other variables are held constant.

The impacts of LC are secondary but non-negligible (Figures 1a–1d), consistent with previous work (e.g., Blum et al., 2020; Buechel et al., 2022). On average, urbanization tends to increase the seasonal maximum flood, also in line with previous findings (Anderson et al., 2022; Blum et al., 2020; Han et al., 2022), except in the catchments that have limited urban extent to begin with. Perhaps more surprisingly, we find that forest cover exhibits a nonlinear association with flooding, where flood magnitude (specific discharge) decreases as forest extent increases up to about 20% of the catchment. As the forest extent then increases from approximately 20% up to 80% of the entire catchment, flood magnitude starts to increase again. One possible explanation is that the areas with low and high forest cover both tend to have higher soil organic content (Figure S11 in Supporting Information S1), which is associated with higher Qmax, on average (Figure 1). This relationship is consistent across the three entirely independent models, suggesting it is similar across different catchments in GB with varying precipitation gradients (we also find the relationship is consistent when splitting the data temporally to train independent models). Further work to assess the consistency of this relationship across different LC data sets or climatic zones would be valuable. The finding may explain why previous work has uncovered both positive and negative effects of afforestation on flooding in different catchments (e.g., Anderson et al., 2022).

4.3 To What Extent Can Historical Flood Trends Be Explained by Changing Climate and Land Cover?

By combining the analysis of flood drivers above with an assessment of historical trends in each of the variables over the period 1982–2020 (Figure 2), it is possible to describe how each variable has contributed to historical trends in flooding at the national scale. Theil-Sen regression analysis reveals increases in flood magnitudes mainly in the summer and to a lesser extent in the autumn in northern/western GB, with decreases in southeastern GB (Figure 2). Spring exhibits mostly decreases in Qmax, while other seasons show mixed patterns of increasing and decreasing floods. Overall, only a minority of the catchments exhibit statistically significant trends in seasonal Qmax (p < 0.05). Previous work also found mixed patterns at the seasonal scale (Hannaford & Buys, 2012), with decreases in the spring mean flow in recent decades (Harrigan et al., 2018). These findings somewhat contradict the narrative that GB should expect “wetter winters and drier summers” with climate warming (e.g., Lowe et al., 2018), from the perspective of flood hydrology. Trends in seasonal Qmax broadly reflect the trends in total seasonal precipitation, with increasing rain in the summer and decreases in the spring, but they show substantial spatial discrepancy (Figure 2), suggesting that other drivers, such as catchment attributes, modulate or amplify the seasonal response to rainfall events (Figure 1).

Figure 2. Historical trends in floods, climate and land cover (LC) variables for each season (1982–2020). Color gradient represents the total change over the entire period, with decreases in red and increases in blue. The changes in forest and impervious LC (annual data) are shown in percentage points of LC (pp; left). The seven hydro-climate variables in the center are shown as a percent change (HC, %) and the air temperature change in degrees Celsius (T, °C; right), all computed using the Theil-Sen trend. Black catchment outlines indicate that trends are statistically significant with a confidence level of 95% (p < 0.05) as determined by the Mann-Kendall test. Antecedent precipitation and temperature are not shown, as they are simply values from the previous season. Trends are shown only when there are at least 6 years with non-zero values.
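The two trend statistics named in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the Mann-Kendall variance omits the correction for tied values (an assumption made to keep the sketch short), and expressing the total change as the slope multiplied by the length of the record is also an assumption about how the mapped values were derived.

```python
from math import erf, sqrt
from statistics import median


def theil_sen_slope(years, values):
    """Median of all pairwise slopes (change in value per year)."""
    slopes = [(values[j] - values[i]) / (years[j] - years[i])
              for i in range(len(values))
              for j in range(i + 1, len(values))
              if years[j] != years[i]]
    if not slopes:
        raise ValueError("need at least two distinct years")
    return median(slopes)


def mann_kendall(values):
    """Return (S, two-sided p-value) using the normal approximation.

    No tie correction is applied to the variance (simplifying assumption).
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    variance = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(variance)  # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(variance)
    else:
        z = 0.0
    normal_cdf = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    return s, 2.0 * (1.0 - normal_cdf)


# Made-up series: a steady rise of 2 units per year over 1982-2020.
years = list(range(1982, 2021))
values = [2.0 * i + 1.0 for i in range(len(years))]
slope = theil_sen_slope(years, values)
total_change = slope * (years[-1] - years[0])
print(slope, total_change, mann_kendall(values))
```

Both statistics are rank- or median-based, so they are less sensitive to individual extreme seasons than an ordinary least-squares trend would be.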

LC exhibits spatially consistent historical trends, with monotonic decreases in forest cover and increases in impervious LC across the country (Figure 2). Although the urbanization trend is self-evident, decreasing forest cover trends are less expected. The UK has a relatively low forest cover (13% for the UK, vs. 46% for Europe, or 31% for the world; Forest Research, 2022). National statistics for the period indicate that total UK woodland area was relatively stable, with a slight increase over 2000–2012 (from 2,954 to 3,110 kha; Forest Research, 2022), but the European Union's CORINE LC maps (an inventory of European LC) also found a decrease in forest cover over the period 2000–2012 from cutting of coniferous forest (Cole et al., 2018), similar to the trend in the FROM-GLC Plus LC data used here. Given the positive (i.e., increasing) effect of impervious LC on flood magnitudes and the nonlinear impacts of forest cover according to the PDPs (Figure 1), GB's urbanization over the past four decades (Figure 2) has likely contributed to slightly increasing flood magnitude, while the loss of forest cover has likely had more variable effects depending on the catchment.

Maximum daily air temperature, vapor pressure, and relative humidity have mostly increased across GB in the last four decades (Figure 2). Specifically, temperatures have increased largely in the spring, autumn, and winter, with few significant changes in the summer. The lack of a significant warming trend in summer can be explained by the hot summers in the early 1980s, leading to a relatively flat time series over the period 1982–2020 (see e.g., Kendon et al., 2022, Figure 9).

Hydrometeorological trend analysis is limited by its reliance on start and end date selection, which influences the sign and magnitude of trends, ultimately reflecting the influence of large-scale modes of climate variability during a sometimes-arbitrary time-window. However, trend analysis is useful for uncovering discrepancies in the drivers of flooding in cases where precipitation and streamflow trends do not match. Here, for instance, we note that in the summer there is a discrepancy between increasing precipitation trends and decreasing flood magnitudes in central and eastern GB (Figure 2). The decrease in summer floods seems to be associated with the decreasing spring precipitation, along with increasing spring PET and air temperature (see PDP analysis in Figure 1). This inter-seasonal memory effect occurs in the drier catchments of the southeast, which tend to have lower flood magnitudes and a high BFI (Figure S1 in Supporting Information S1). The effect then persists across the autumn and winter, supported by the changing climate in those seasons.

4.4 What Explains the Spatial Variability in the Effects of Climate and Land Cover on Floods?

We run a series of sensitivity tests using a single QRF model to disentangle the effects of each driver across GB (Figure 3; Figure S5 in Supporting Information S1). We select relatively high-end changes (+10% precipitation, +1°C temperature, or +10 percentage points of catchment LC) to facilitate the detection of a signal in flood magnitudes and evaluate which catchment characteristics or streamflow signatures explain the spatial variability in the changes in flooding. While the PDP analysis provides information on the average catchment behavior, the spatial sensitivity analysis (Figures 3 and 4) provides insight into the local drivers of individual catchment behavior, revealing how changes in flood magnitude depend on streamflow-based signatures such as base flow index and runoff ratio (Figure 4) or other geographic catchment characteristics and signatures (Figure S6 in Supporting Information S1). It is important to recall that the sensitivity tests are theoretical simulations evaluating what might happen if one variable is altered in the absence of changes in any of the other variables; these simulations are not designed to represent real-world scenarios.
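The one-at-a-time design described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: `toy_qmax_model` is a hypothetical linear stand-in for the trained QRF, the feature names are invented, and capping land cover at 100% is an assumption not stated in the text.

```python
def toy_qmax_model(features):
    """Hypothetical stand-in for a trained model; NOT the paper's QRF."""
    return max(
        0.0,
        0.05 * features["precip_mm"]
        - 0.4 * features["tmax_c"]
        + 0.03 * features["urban_pct"]
        - 0.02 * features["forest_pct"],
    )


# Perturbation sizes follow the text: relative for precipitation, absolute for
# temperature, and percentage points for land cover (the 100% cap is assumed).
PERTURBATIONS = {
    "precip_mm": lambda x: x * 1.10,
    "tmax_c": lambda x: x + 1.0,
    "urban_pct": lambda x: min(x + 10.0, 100.0),
    "forest_pct": lambda x: min(x + 10.0, 100.0),
}


def sensitivity(model, features):
    """Percent change in predicted Qmax when each driver is perturbed alone."""
    baseline = model(features)  # assumed non-zero for this sketch
    changes = {}
    for name, perturb in PERTURBATIONS.items():
        perturbed = dict(features)  # copy, so all other drivers stay fixed
        perturbed[name] = perturb(features[name])
        changes[name] = 100.0 * (model(perturbed) - baseline) / baseline
    return changes


# Made-up catchment values for illustration.
catchment = {"precip_mm": 300.0, "tmax_c": 15.0, "urban_pct": 10.0, "forest_pct": 10.0}
result = sensitivity(toy_qmax_model, catchment)
```

Note the difference in units: a 10% increase in precipitation is relative (300 mm becomes 330 mm), whereas a 10 percentage point increase in land cover is absolute (10% urban becomes 20% urban).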


Sensitivity tests showing the effects of perturbing different predictor variables individually. Precipitation was increased by 10%, temperature by 1°C, and forest and urban land cover (LC) by 10pp in each catchment (e.g., from 10% to 20% afforested). Gray shading indicates locations with insufficient data (a minimum of 30 complete years of seasonal data are required). Further maps showing the effects of +20% precipitation, +2°C temperature and +20pp forest and urban LC are provided in Figure S5 of the Supporting Information S1. We implement a graduated transparency filter, where the transparency of catchments increases with the absolute bias based on the temporal model evaluation; catchments reaching a bias of 5 mm/d are rendered fully transparent.
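The graduated transparency filter can be pictured with a short sketch. The caption states only that transparency increases with absolute bias and is complete at 5 mm/d; the linear ramp and the function name below are assumptions.

```python
def catchment_alpha(bias_mm_per_day, full_transparency_at=5.0):
    """Opacity in [0, 1]: 1 is fully opaque, 0 is fully transparent.

    Assumption: opacity falls linearly with absolute bias.
    """
    fraction = min(abs(bias_mm_per_day) / full_transparency_at, 1.0)
    return 1.0 - fraction
```

Under this assumed ramp, a catchment with a bias of -2.5 mm/d would be drawn at half opacity, and any catchment at or beyond 5 mm/d in either direction would not be visible.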


Key factors explaining the spatially variable impact of flood drivers. (a) Baseflow index (BFI), (b) Runoff ratio. Y-axis indicates the percentage change in seasonal maximum flood magnitude (Qmax) produced by a change in each of the drivers (columns). Colored (gray) circles represent increases by 10%, 1°C or 10pp (20%, 2°C or 20pp). Precipitation is indicated in blue, temperature in orange, urbanization in pink and forest cover in green. Rows are the four seasons. Each circle is one catchment. A higher BFI indicates a greater proportion of groundwater in the streamflow. A low runoff ratio indicates that a small fraction of precipitation contributes to runoff. Further factors are shown in Figure S6 of the Supporting Information S1.

One of the key challenges of any sensitivity analysis is that the results depend on the accuracy of the model predictions. For example, flood underestimation in certain catchments could lead to underestimating the effects of different predictors in those catchments. We tested for this by stratifying the relationships in Figure 4 according to the magnitude of the prediction bias at each site (Figure S7 in Supporting Information S1). We noted that the patterns were generally consistent, but, as expected, could vary in groups of sites with strong positive or negative prediction bias. For instance, catchments with least prediction bias (between −1 and 1 mm/day) exhibit a largely positive effect of urbanization on flood magnitude in the summer, while catchments with strong negative prediction bias show a largely negative effect. This lends credibility to the effect of urbanization increasing flood magnitude. Conversely, some associations are not significant in the catchments with least prediction bias and would benefit from further investigation.
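A stratification of this kind could look like the sketch below, which groups catchments by prediction bias and reports the share showing a positive effect in each group. The ±1 mm/day limit for the least-bias group follows the text; treating everything beyond it as a single positive or negative group, the function names, and the data values are assumptions for illustration.

```python
from collections import defaultdict


def bias_group(bias_mm_per_day, low_bias_limit=1.0):
    """Assign a catchment to a bias stratum (limit of 1 mm/day as in the text)."""
    if bias_mm_per_day < -low_bias_limit:
        return "negative bias"
    if bias_mm_per_day > low_bias_limit:
        return "positive bias"
    return "least bias"


def share_positive_by_group(records):
    """records: iterable of (bias_mm_per_day, percent_change_in_qmax) pairs."""
    effects = defaultdict(list)
    for bias, change in records:
        effects[bias_group(bias)].append(change)
    # Fraction of catchments in each stratum whose simulated effect is positive.
    return {
        group: sum(c > 0 for c in changes) / len(changes)
        for group, changes in effects.items()
    }


# Made-up values for illustration only.
records = [(-3.2, -1.5), (-2.1, -0.4), (-0.5, 2.0), (0.3, 1.1), (0.8, -0.2), (2.4, 0.9)]
shares = share_positive_by_group(records)
```

If the sign of an effect flips between the least-bias group and the strongly biased groups, as reported for summer urbanization, the least-bias group is the more trustworthy guide.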

The largest increases in flood magnitude arise from the combined effects of +10% precipitation and +10pp urbanization in southeastern GB in summer (Figure 3). Seasonally, the summer sees greater relative increases in flood magnitude from rising precipitation (Figures 3 and 4). This is consistent with the analysis of historical trends, where summer has seen the largest relative increase in precipitation and flooding (Figure 2).

The increase in precipitation has a greater effect in locations with a high BFI (Figure 4a, p < 0.001), where the rivers have a large extent of fractured aquifers and receive a large proportion of their water from groundwater (Bloomfield et al., 2021). This supports previous work on the importance of baseflow in determining multi-decadal flood trends (Berghuijs & Slater, 2023) and the flood frequency curve (Spellman & Webster, 2020). Sites with a low runoff ratio, which are typically considered to have high infiltration and evapotranspiration, are also significantly more sensitive to increases in precipitation, especially in autumn (Figure 4b, p < 0.001). These sites tend to have a high aridity index and a lower specific discharge (Figure S8 in Supporting Information S1). Overall, our analysis shows that it is the combination of low runoff ratio and high BFI which best identifies sites where increases in precipitation may have the greatest impact on future floods.

An increase in maximum daily air temperatures by 1°C (in the absence of any related changes in precipitation) almost always causes a reduction in river flooding (Figures 1 and 3). The effect of rising temperature (and thus evaporation) offsets the effect of rising precipitation on flooding, as has been suggested previously (e.g., Arnell & Gosling, 2016; Buechel et al., 2023). Across all seasons, the negative effect of high temperature on flooding is significantly more pronounced in areas with limited groundwater contribution (Figure 4a, p < 0.001) and at drier sites with a high aridity index and low mean annual rain in the spring, summer and autumn (Figures S6b and S6c in Supporting Information S1, p < 0.001). This may be because dry catchments with low BFI have less moisture in the soils and subsurface regulating evapotranspiration (Chen & Hu, 2004; Wu et al., 2017; Yeh & Famiglietti, 2009), so more of the precipitation that falls on the surface evaporates. Catchments with limited terrestrial water storage may be more sensitive to high temperatures driving evaporation. The effect of warmer temperatures in reducing flood magnitudes is also significantly greater in catchments with high streamflow-precipitation elasticity in the spring, summer and autumn (Figure S6f in Supporting Information S1, p < 0.001). Overall, therefore, dry catchments with limited groundwater and high elasticity are most likely to see reduced floods from increasing air temperatures in the warmer months of the year.

An increase in impervious LC by 10 percentage points (e.g., from 10% of a given catchment to 20% of the same catchment) in the absence of any other changes in climate or LC has mixed effects, both increasing and decreasing flooding depending on the catchment (Figure 3). In catchments of northern GB with limited urban extent, we tend to find a negative effect of urbanization on Qmax, while in more densely urbanized catchments of the midlands and southeastern GB, urbanization is associated with slightly increasing flood magnitudes in the autumn and substantial increases in the summer. This reflects the shape of the PDP (Figure 1). The effect of urbanization on floods (in the absence of any related change in precipitation) tends to be positive in catchments with a larger BFI and smaller runoff ratio (Figure 4, p < 0.001). The increase in flood magnitude is particularly visible in the summer in catchments with high aridity index (Figure S8 in Supporting Information S1), where the effect of a 10pp increase in urbanization can be equivalent to the effect of a 10% increase in precipitation (Figure 4). In contrast, more surprisingly, we note that urbanization is also associated with decreasing flood magnitudes in catchments with a low groundwater contribution or high runoff ratio (Figure 4, p < 0.001). The decreasing effect is found in catchments that have low urban extent to begin with (Figures 1 and 3; Figure S1 in Supporting Information S1). This suggests that as a rural catchment starts to urbanize, human impacts can attenuate peak floods, for example, through the construction of stormwater management systems. However, the data-driven approach merely describes patterns observed in the data. Catchments that have a greater urban extent to begin with (e.g., >5%) tend to see increases in flooding with urbanization (Figures 1–3).

Afforestation also has variable effects on flooding: catchments with less than about 20% forest extent tend to see lower flood magnitudes as forest extent increases (Figure 1). This decreasing effect is stronger in catchments with a low BFI (Figure 4a; p < 0.001), which on average tend to have higher values of specific Qmax and lower aridity index (Figures S8 and S9 in Supporting Information S1). However, we also see that in the summer the decrease in flood magnitudes with afforestation is slightly, but significantly (p < 0.005), greater in drier catchments with low mean annual precipitation and a higher aridity index (Figures S6b and S6c in Supporting Information S1). The larger decrease in streamflow with afforestation in dry areas aligns with previous work (Buechel et al., 2022). It is possible that this greater decrease in drier or groundwater-limited catchments is due to their greater sensitivity to evapotranspiration (Chen & Hu, 2004; Yeh & Famiglietti, 2009) from afforestation. However, the data are complex. Catchments with very low or high forest extent both tend to have higher organic content in soils and higher values of Qmax (Figure S11 in Supporting Information S1). These results would be worth exploring further alongside additional information on the age and type of trees in the different catchments.

The relationships between flood sensitivity and different streamflow signatures or catchment characteristics are also likely to vary for different groups of sites. For instance, the relationships between flood sensitivity and BFI are strongest in sites with high Qmax but can be insignificant in sites with low Qmax (see e.g., the effect of rising precipitation or changing LC in certain seasons; Figure S12a in Supporting Information S1). Conversely, the negative association between flood sensitivity to urbanization/afforestation and the runoff ratio is strongest in sites with low Qmax and is not significant in the group with high Qmax (Figure S12b in Supporting Information S1).

As with any study, our approach has limitations worth acknowledging. The sensitivity analysis perturbs just one variable at a time while holding all other variables constant, and as such is not intended to reflect real-world situations. For robust sensitivity testing, there should be limited variation in model prediction bias. Therefore, strengthening the model's performance may further enhance the simulation results. It is still unknown whether training large-sample ML models on hydrologically similar groups could increase their performance (Kratzert et al., 2024). The simulation design could also be improved with greater emphasis on uncertainty quantification. High flow observations can contain substantial uncertainty. Different LC products are also known to produce slightly different estimates (Yu et al., 2022) depending on a range of factors including their resolution, the LC definitions, variations in data sources, classification methods, or the training samples employed. We suggest it would be valuable to perform the simulations with a wider range of observational data sets and across a broader gradient of climates to assess the consistency of the results across space and time.

5 Conclusions

The attribution and prediction of floods is challenging at local to regional scales due to the interplay of climate, LC, anthropogenic water use, and catchment characteristics (e.g., Seneviratne et al., 2021). Understanding what drives flooding is an essential first step before attempting to project future floods. Here we develop an empirical, ML-driven multi-catchment model to shed light on the drivers of the seasonal maximum flood and the spatial variability of their effects in GB. The high similarity of the PDPs across three random, independent subsets of catchments suggests that the effect of each predictor is consistent, based on the chosen data sets, across GB.

At local to regional scales, we find strong spatial differences between historical trends in flood magnitude and precipitation in the groundwater-dominated catchments of the southeast. These differences suggest there may be an inter-seasonal catchment memory effect driving the changes in river flood magnitudes. Decreasing summer floods appear to be associated with decreasing spring precipitation, and potentially to a lesser extent increasing spring PET and air temperature.

The sensitivity analysis indicates that increases in precipitation tend to increase flood magnitudes more in rivers with high BFI and low runoff ratio, which often have lower values of specific discharge and higher aridity index. Increases in air temperature (in the absence of changing precipitation) lead to decreases in seasonal flood magnitudes through evaporation. These results consider each variable independently, and thus do not imply that climate change decreases flooding. The negative effect of high air temperature on flood magnitudes is significantly more pronounced in groundwater-limited areas and in the spring, summer and autumn in dry catchments with low rainfall and high streamflow elasticity. Our simulations of LC change effects suggest that in GB, urbanization may increase flood magnitudes more in catchments with a high BFI and low runoff ratio (which often have a higher aridity index), while afforestation is found to decrease river floods more in locations with low BFI.

Overall, the combined effect of rising precipitation and urbanization amplifies flood magnitude in many locations, and especially in groundwater-dominated catchments. The approach proposed herein may be used to further disentangle the empirical joint effects of, for instance, urbanization/afforestation and rising precipitation in individual catchments, or in different regions of the globe. Further work would be valuable to improve the explainable AI design and quantify the uncertainty arising from different data sources.

Acknowledgments

The authors would like to thank the Reviewers and Associate Editors for their insightful and helpful comments on the manuscript. Zhenrong Du is thanked for assistance with the FROM-GLC Plus data set. Bailey Anderson and Marcus Buechel are thanked for insightful discussions. LS and SM were supported by a UKRI Future Leaders Fellowship award to LS (MR/V022008/1). LS was additionally supported by NERC (NE/S015728/1), the John Fell Fund, and the Returning Carers' Fund at the University of Oxford. GC and YZ were supported by a UKRI Future Leaders Fellowship award to GC (MR/V022857/1). HM was supported by the NSF Hydrologic Sciences Program, NSF Division of Earth Sciences (Award 2124923). The authors wish to thank the Centre for Environmental Data Analysis and the National River Flow Archive for providing the data used for the analyses. One of the authors is Associate Editor at Earth's Future.

Conflict of Interest

The authors declare no conflicts of interest relevant to this study.

Data Availability Statement

Meteorological variables for Great Britain are from HadUK-Grid Gridded Climate Observations (Met Office, 2023). Potential evapotranspiration data are from Brown et al. (2022). Streamflow data are from the UK National River Flow Archive (NRFA, 2023). Static catchment descriptors are from CAMELS-GB (Coxon et al., 2020). The land cover data are from FROM-GLC Plus (Yu et al., 2022).