Areal Models for Spatially Coherent Trend Detection: The Case of British Peak River Flows

With increasing concerns on the impacts of climate change, there is wide interest in understanding whether hydrometric and environmental series display any sort of trend. Many studies however, focus on the analysis of highly variable individual series at each measuring location. We propose a novel and straightforward approach to trend detection, modelling the test statistic for trend at each location via an areal model in which the information across measuring locations is pooled together. We exemplify the method with a detailed study of change in high flows in Great Britain. Using areal models, we detect a statistically relevant signal for a positive trend across Great Britain in the recent decades. This evidence is also found when different temporal subsets of the records are analysed. Further, the model identifies areas where the increase has been higher or lower than average, thus providing a way to prioritise intervention.


Introduction
River flooding is a major natural hazard that threatens the well-being of communities and can have extremely high costs: The global annual average loss from river flooding is estimated to be USD 104 billion (United Nations Office for Disaster Risk Reduction (UNISDR), 2015), and in the United Kingdom alone, the expected annual flood damages is GBP 560 million (Sayers et al., 2015). There is a widespread interest in understanding how climate change impacts fluvial flood risk (IPCC, 2012) so that appropriate management strategies can be put in place. This interest has resulted in a number of studies investigating projected and observed changes in peak flow magnitude (and/or frequency) at the global (Hirabayashi et al., 2013;Do et al., 2017), continental (Alfieri et al., 2015;Mediero et al., 2015), and national or regional scale (Giuntoli et al., 2015;Slater & Villarini, 2016;Prosdocimi et al., 2014). The overall picture gives mixed results, with high flows projected to increase and decrease in different areas of the world under representative concentration pathway RCP8.5 (Dankers et al., 2014), while for the U.K. national scale investigations based on the UKCP09 projections (Murphy et al., 2009) under a range of emission scenarios  indicate an overall increase in high flows in the last decades of the 21st century. In contrast, studies based on gauged historical data give a more faceted picture, in the United Kingdom as well as in other parts Geophysical Research Letters 10.1029/2019GL085142 of the world (Archfield et al., 2016;Hall et al., 2014;Hannaford, 2015), with no clear detectable changes in the behavior of high flows.
Failure to detect a clear time trend signal in gauged peak flows (or other environmental variables) does not necessarily mean that an overall trend does not exist: The absence of evidence for change does not give evidence for the absence of change. Most statistical approaches used for trend detection would need very long records to perform optimally (Svensson et al., 2006), and such long records are sparse in Britain (see Figures S1 and S2 in the supporting information) and generally across the world. In particular, tests applied to short time series have low statistical power; that is, they are not able to detect signals of change even when these are present in the data (Prosdocimi et al., 2014;Vogel et al., 2013). To overcome this lack of power, we develop an areal model that pools information across stations in the same geographical region to enhance the shared trend signal. Areal models can be viewed as multilevel or hierarchical models (see Gelman et al., 2013, Verbeke & Molenberghs, 2009, which are routinely used in life sciences and social sciences to obtain a clearer estimation of the phenomena under study by pooling together the information across several observations (see, e.g., . By pooling together the information of nearby stations, the signal for the evidence of change, and in particular of an increase in flow magnitudes, is enhanced and becomes very clear.

Data
We use the annual maxima of the instantaneous (15-min) gauged peak flow recorded at 640 stations in Great Britain (GB) made available by the National River Flow Archive (2018). This is a subset of the national Peak Flow Dataset, which is maintained by the National RiverFlow Archive (NRFA) and is the successor of HiFlows-UK, the reference data set used in the United Kingdom to carry out flood estimation studies (Environment Agency, 2012;Lamb et al., 2009). Annual maxima are selected as the highest flow value registered in any given water year, which in the United Kingdom runs from 1 October to 31 September. In this study we used flow values for all the years of station records deemed to have reliable rating curves up to, at least, bank full flow. This ensures that the data series that the measuring authorities deem to be of the highest quality and reliable throughout the recording period are included in the study. To ensure that the results can be indicative of the impacts of (anthropogenic) climate change, only records that end in a year subsequent to the water year 2000 and that refer to catchments with low levels of urban land-cover are included. Finally, only stations with more than 20 years of data are retained in the study. This results in the inclusion of a total of 640 stations with a median length of 47 years: See Text S1 for additional information on the spatiotemporal coverage of the records used in this study.
For practical reasons, river flow measurement and hydrometric data collection in the United Kingdom are organized on a catchment or basin basis, rather than according to the administrative boundaries. Therefore, the country has been divided into 107 hydrometric areas (HA; National River Flow Archive, 2014), which consist in integral river catchments having one or more outlets to the sea or tidal estuary. Of the 107 British HAs, 97 are located in mainland Britain and stations with high-quality annual maxima records are available in 90 of those. Each station is located in a specific HA, and these are defined based on river systems, which typically experience similar climate and weather (see Text S3 for an exploration of the climatology of the HAs), with some of the catchments within each HA possibly nested within each other (and therefore not independent from each other). HAs are based on geophysical properties of river basins and were designed to facilitate an integrated approach to the collection of hydro-meteorological data: Their definition is independent of the study of trends in river flow and as such is an objective way to separate stations into groups that can be expected to behave similarly. We will therefore use the hydrometric areas in the spatial model outlined in the next section. Figure S2 shows how the different hydrometric areas span across the countries in GB.

Methods
For each station in the study a simple regression is performed on the log-transformed river flow with time as a covariate, as in Vogel et al. (2011) and Prosdocimi et al. (2014). For each station i, the value of the test statistic for the significance of time T i is derived. Time here is used as a proxy for anthropogenic climate 10.1029/2019GL085142 change, and the test statistic T i is a standardized summary of the evidence in favor of a time trend, so of a change, at each station i (see Text S2 for more discussion on the derivation of the test statistic). Stations are located in one HAs only, with each HA typically experiencing similar climate and weather (see Text S3). It is therefore conceivable that similar changes occur at different locations within each HA, so that the test statistic value of stations within each HA should be similar in sign and magnitude and can be pooled together to give a clearer indication for the potential of change in the specific HA and across GB.
An areal model for the test statistic is therefore constructed so that the value of the test statistic at each station is modeled as the random variation around the sum of the average value and an areal component h j , which can take different values for each HA j. This is written as (see, among others, Lawson, 2013) where is the mean signal for trend across HAs, h j(i) is a parameter taking specific value for the hydrometric area j to which the station i belongs, and i ∼ N(0, 2 T ) is the station-specific random error. This model implies that the test statistic at each station i in a region j is the realization of a random variation around the regional value + h j . It is assumed that the effects h j for each hydrometric area are independent and identically distributed (iid) with h ∼ N(0, 2 H ). The h j 's are unknown random quantities that reflect our belief that variability of the test statistic within region j is likely to be smaller than the overall variability of the test statistic. The parameters that need to be estimated from the data are , H , and T : This is done in a Bayesian fashion using R-INLA (Rue et al., 2009), which allows for fast approximate estimation of complex models. This means that the posterior distributions of the model parameters given the observed data (i.e., the observed test statistic values) are estimated. Stations within each HA would then have the same estimated posterior distribution for the test statistic in the areal model, an indication of the strength of evidence for a trend in an HA averaged across all stations within the area. From this posterior probability, the evidence for either a positive, negative, or null trend can be derived.
The parameters are estimated by pooling the information from all stations in the network, thereby using the available information in an optimal way. The overall level gives an indication of the strength of evidence in favor of a trend across the parts of GB included in this study. More specifically, the posterior estimate of is approximately the average of all HA sample averages (where by "HA sample average" we mean the average of the observed test statistics within a given HA). In particular, the pooling in the area-level model means that the posterior estimate of is robust to differences in the number of stations per HA. For a given HA, the posterior estimate of the test statistic in this HA is approximately the weighted sum of its HA sample average and the estimated overall trend . The weight on the HA sample average increases as the number of stations in the HA increases, meaning the posterior evidence of trend in an HA with many stations is less influenced by pooling than in HAs with sparser data. Details of the estimation theory for partial pooling models such as the areal model presented in equation (1) can be found in , chapter 12.
A number of approaches to pool information in space have been proposed for the detection of trends in environmental variables (see, e.g., Fischer & Knutti, 2014;Renard et al., 2008), and some of these make use of Bayesian hierarchical models (e.g., in Brady et al., 2019;Renard et al., 2006). The areal model proposed has the advantage of using as the response variable the test statistic, a simple concept that is typically easy to compute, is normalized, and has a well-defined theoretical distribution under the null hypothesis of no change. After choosing a spatial aggregation unit (in this manuscript, the externally predetermined HA), it is straightforward to derive information about the posterior distribution of the average test statistic at each aggregation unit and to identify the areas with high probabilities for the test statistic to be different from 0, that is, an indication of change in the original variable of interest. In this study we propose to use HAs as the spatial aggregation unit, as these have been defined independently for hydrometry purposes and are commonly used in practice to identify river basins and coherent areas for water management purposes. Other aggregations might be used, possibly not based on geographical proximity, but based on, for example, flood-generating mechanism or other similarity measure. Nevertheless, results for different aggregations would be more difficult to visualize on a map, and the interpretation of the results would be less direct since it would not be related to a specific area and river basin.  For a vast majority of stations (71%), the test statistic is not significant at the 10% significance level, indicating that the null hypothesis of no change (i.e., no trend) in time cannot be rejected. As discussed in Prosdocimi et al. (2014), this might be connected to the low statistical power of the test applied to short time series. For 4% of stations a significant negative trend is found, while positive significant trends are found in 25% of stations. There is therefore an indication that positive trends are more frequent than negative trends, and there appears to be some spatial clustering of positive trends in northwestern England and parts of Scotland. The tendency of the test statistic of all stations to be positive rather than negative is also evident in the general distribution of the test statistics, which is shown in Figure S3.

Results
The central and right panel of Figure 1 summarize key results of the areal model fit, highlighting a clear positive trend signal when regional information is pooled together (estimates for the variance components are presented in Table S1 and Text S5). The map in the middle panel shows the mean value of the estimated posterior distribution of the test statistic for each HA: These tend to be positive, with only few areas exhibiting slightly negative values. The 90% credible interval for the overall trend is (0.64, 0.91). Thus, there is a tendency for increasing trends across the river flow measuring network in the country. For 54 out of 90 areas, the entire 90% credible interval for the mean test statistic is positive, that is, more than 95% of the posterior distribution of the area-specific test statistic value is larger than 0 (purple HAs in the right panel of Figure 1). For no HA in the country does the 90% credible interval of the marginal posterior distribution of the area-specific test statistic contain negative numbers; this shows that across the river flow measuring network in GB, there is an either null or positive trend. The strongest signal in favor of trend is found in northern England, parts of Scotland, and Wales, and the weakest signal is found in Southern and Central England. This indicates that these areas might need to be given higher, respectively lower, priority for a new flood risk assessment. Some spatially structured variation in the estimated strength of the trend in the different HAs can be noted, even though the model does not specifically enforce this. This might indicate that large-scale climate variability, which operates on a large spatial scale, is a large driver of the changes in high flows. These findings are not dissimilar when robust regression approaches are used in the derivation of the test statistic (see Text S6).  (Hannaford, 2015), but the areal model allows to separate out an island wide effect and the areas that have experienced coherent changes in high flows. Nevertheless, a more HA specific analysis would be needed to identify the possible causes behind the evidence for change (or lack thereof) in any area: Local factors and the response of single catchments to external forcings can have strong impact in the final estimated value of the test statistic for each station in the HA. These local factors are not directly included in the areal model but would need to be taken into account in any assessment of the evidence for a trend within a HA.
The period of record covered by the data can have an influence on the estimated magnitude and sign of the tests, which aim to identify monotonic trends (Hannaford et al., 2013;Svensson et al., 2006), and tests applied to data covering different periods might give contrasting results. As seen in Figures S1 and S2, the flow series available in GB cover different periods of time, with a few very long records and most stations having valid records starting in the 1970s. The overall trend and the HA specific signals found in the analysis might therefore be representative of different types of changes, and the strong evidence for trend cannot be directly related to a change in peak flow behavior over a specific period of time. Therefore, we carry out a second analysis that focuses on a subset of stations over a fixed period of time. The analysis uses the 298 stations with complete records between 1976 and 2016 (included), that is, with a total of 41 consecutive years of data. The location of the gauging stations included in the study and the value for the time trend test statistic at each station are shown in Figure 2, together with results of the areal model fitted to the data subset (estimates for the variance components are presented in Table S2 and Text S5). The 90% credible interval for the overall trend signal across the river flow measuring network in GB is now found to be (0.31, 0.72): The evidence for trend is not as large as when all records are used, but it is still strong and positive. The posterior mean of the test statistic is found to be negative in 15 out of 65 areas, with the entire 90% credible interval below 0 in 4 of them (the green HAs in the right panel in Figure 2). Changing the time window of the investigation gives a less striking result but still indicates that overall peak flow magnitude is increasing throughout the country.
To further assess the evidence in favor of a changing behavior of peak flows, the subset of stations with exactly 41 years of data was further analyzed taking 10 subsets of 31 consecutive years of data with changing initial year (from 1976 to 1985). The estimated posterior distribution for the overall trend parameter in the different subperiods is shown in Figure 3: Across all subperiods the overall trend is generally positive, and for no subperiod does the 90% credible interval contain 0. The lowest posterior mean value (0.23) is found when analyzing the 1979-2010 subperiod, and the highest value (0.88) is found when analyzing the 1984-2015 subperiod. The water year 2010 was characterized by a drought condition (Kendon et al., 2013), while several record-breaking flood events were recorded in 2015 (Barker et al., 2016). Notice also that 1984 was characterized by strong drought conditions (Marsh & Lees, 1985): This might further enhance the strength of the signal for the 1984-2015 period. The difference in the overall effect in the two periods is likely to be a reflection of the general behavior of peak flows in the final and start year of the analysis. In general, the analysis ending in water year 2007 to 2010 indicates an increase in high flows with a smooth decline in time for the overall trend describing the increase. In contrast, the analysis based on records ending in the most recent 6 years have stronger signals in favor of a change with more variability across each subanalysis. This indicates that the overall signal increases in each subanalysis, culminating in a very large estimated value found when the record-breaking water year 2015 is included in the analysis. This very strong indication for an increase in flood risk is then followed by a much milder signal when the records including the more modest water year 2016 are also included in the analysis. The estimated area-specific posterior mean found for each data subset is shown in Figure S5, with the summary of the credible interval in Figure S6. Regardless of the observation period used in the analysis, there is an indication that peak flow magnitudes are increasing across GB, with a stronger and more persistent signal in the northern part of England and parts of Scotland, while there appear to be less of a concern for changes in high flows in the southeast of England. This finding still holds true when the test statistics included in the areal model are derived from a robust regression model (see Text S6). Even when ensuring that the large records in some series in the latter years are less influential in the estimation of the regression model at each station, a strong evidence for an increase in peak flow is found.
The length of the period for which it is possible to run subanalyses in which a considerable number of stations has a complete record is unfortunately fairly limited and does not allow for more in-depth analyses of the possible large-scale climatic drivers linked with unusually high or low peak flows at a country-wide scale. Climate modes typically evolve slowly in time with persistent periods of positive or negative anomalies, which can impact the behaviors of high flows. For example, modes of the Atlantic Multidecadal Oscillation (AMO) and of the North Atlantic Oscillation (NAO) have been linked to period of elevated high flows in Europe and North America (Hodgkins et al., 2017) and in GB (Hannaford, 2015), thus linking the occurrence of flood-rich periods to multidecadal variability rather than to long-term time trends. Given that in the short time scales for which most flow records are available climate indices have been slowly varying, the detected changes might be a consequence of the dominance of a climatic state rather than a time-related trend.

Discussion and conclusions
The natural high variability typical of short environmental records such as peak flow data and the lack of long records has previously hindered the ability of at-site tests to identify clear signals of change in high river flow across large regions Prosdocimi et al., 2014). In this study, we use areal models to pool together the information that directly measure the strength of the evidence a change in peak flows over time across all stations. Using this approach, we find strong evidence for a positive trend in the magnitude of gauged annual maxima of peak river flow in GB. This holds true when different subsets of the available records are analyzed and when using robust regression approaches in the derivation of the test statistic. The signal is clearly detected when all test statistic values across the island are modeled simultaneously in an areal model. These results are in line with those in Brady et al. (2019), in which a similar strength in change in time in near natural catchments was identified using more complex and computationally demanding spatial models. Exploiting the spatial structure of the flow data enhances the trend signal and allows for a clearer inference, thus bridging the previously reported discrepancy between the projected increases in flood risk in GB and the lack of clear signal in the observational peak flow records. Further, the model identifies areas for which the area-specific evidence for a (positive) trend is strong, allowing for a spatial characterization of the potential changes in floods. These areas would be the natural candidates for more in-depth analysis of changes in flood frequencies.
In this study we do not attempt to explain the driving causes that lead to the observed change but rather focus on presenting strong evidence that a change has indeed occurred. The fact that the high flows in the most recent years appear to have on average higher values than those in the past does pose a challenge in terms of whether the full record available at each station should be used when estimating flood frequencies and whether some adjustments should be put in place to account for the fact that estimates obtained using the whole record might underestimate the current flood frequencies (see, e.g., Luke et al., 2017, for a suggestion of such a correction). The approach presented in this study could easily be applied to other parts of the world and other types of environmental data: Pooling the information on the strength of trend at different stations will likely enhance the ability of detecting clearer signals of change across large measuring networks.