In Situ Observations and Lumped Parameter Model Reconstructions Reveal Intra‐Annual to Multidecadal Variability in Groundwater Levels in Sub‐Saharan Africa

Understanding temporal variability in groundwater levels is essential for water resources management. In sub‐Saharan Africa, groundwater level dynamics are poorly constrained due to limited long‐term observations. Here, we present the first published analysis of temporal variability in groundwater levels at the national scale in sub‐Saharan Africa, using 12 multidecadal (ca. 1980s to present) groundwater level hydrographs in Burkina Faso. For each hydrograph, we developed lumped parameter models which achieved acceptable calibrations (NSE = 0.5–0.99). For eight sites not showing significant (p < 0.001) long‐term groundwater level declines, we reconstructed groundwater levels to 1902, over 50 years before the earliest observations in the tropics. We standardized and clustered the eight reconstructed hydrographs to compare responses across the sites. Overall, the 12 hydrographs were categorized into three groups, which are dominated by (1) long‐term declines (four sites), (2) short‐term intra‐annual variability (three sites), and (3) long‐term multidecadal variability (five sites). We postulate that group 1 is controlled by anthropogenic influences (land use change and abstraction). Correlation of modeled water table depth and groundwater response times with hydrograph autocorrelation suggests that hydrogeological properties and structure control differences between groups 2 and 3. Group 3 shows a small recovery in groundwater levels following the 1970/1980s drought. Differences in intra‐annual to multidecadal variability in groundwater levels have implications for water management and highlight the value of long‐term monitoring. Reconstructions contextualize current groundwater status, forecasts, and projections. The approach developed is generic and applicable where long‐term groundwater level data exist.


Introduction
Sub-Saharan Africa has a population of over one billion people (World Bank, 2019), and this is predicted to double by 2050 (United Nations, 2019). Such increases in population will result in significant increases in demand for water for both drinking and productive uses, and improving access to reliable water supplies to meet this demand will require further development of groundwater (MacDonald & Calow, 2009). In parallel, depending on the nature of groundwater recharge processes, climate change may significantly impact future availability of groundwater resources (Cuthbert, Taylor, et al., 2019). In the face of climate change, increasing abstraction to meet growing demand for water while avoiding environmental degradation and resource depletion will require judicious management of groundwater resources (Foster & Chilton, 2003;Gaye & Tindimugaya, 2019).
Effective management of groundwater resources can benefit from an understanding of historic temporal variability in groundwater levels. Primarily, this understanding can help water resource managers assess current groundwater resource status in relation to previous extreme events (Jackson et al., 2016;Taylor et al., 2013) or trends. It can also provide a baseline against which to assess the significance of both short-term forecasts (<1 year) in groundwater levels associated with weather events under current climate (see, e.g., Daliakopoulos et al., 2005;Mackay et al., 2015;Prudhomme et al., 2017;Thiery, 1988) and longer term projections (>10 years) of changes in groundwater levels associated with impacts of future climate change Prudhomme et al., 2013;Treidel et al., 2011).
Evaluating historic temporal changes in groundwater levels remains a considerable challenge, particularly in sub-Saharan Africa. There is a dearth of published long-term groundwater level data in the region (Taylor et al., 2009). Some countries (e.g., South Africa, Foundation for Water Research (2005)) have monitoring networks with long groundwater level time series; however, temporal dynamics in these data sets have not been analyzed at the national scale and published in the peer-reviewed literature to date. Recent efforts to collate and publish data in the peer-reviewed literature across sub-Saharan Africa reported just 14 multidecadal time series for the entire region, with only five sites with data before 1990 and the longest record going back to 1955 (Cuthbert, Taylor, et al., 2019). In order to address the challenges posed by a paucity of long-term observations, globally numerous workers have developed methods to reconstruct groundwater levels beyond the historic record. These methods use statistical (Conrads & Roehl, 2007) or lumped parameter groundwater models to estimate groundwater levels from long-term meteorological (Filippi et al., 1990;Jackson et al., 2016) or climate proxy data (Perez-Valdivia & Sauchyn, 2011). However, despite the lack of long-term groundwater observation data in sub-Saharan Africa (Cuthbert, Taylor, et al., 2019), these methods have not been applied at the national scale in the region to date. In light of the potential future stresses of groundwater resources associated with population growth and climate change, there is an urgent need to develop an improved understanding of the temporal variability in groundwater levels in sub-Saharan Africa based on both long-term groundwater level observations and reconstructions.
In this paper, we address this need by presenting a series of 12 multidecadal groundwater level hydrographs for Burkina Faso. We develop simple lumped conceptual models for each site and use the developed models to reconstruct groundwater levels back as far as 1902, over 50 years before the earliest published groundwater level observations in the tropics (Taylor et al., 2013). We then standardize the groundwater level reconstructions and undertake cluster analysis to evaluate the controls on the temporal variability in groundwater levels. This analysis of temporal dynamics is unique at the national scale in sub-Saharan Africa. The methodology is novel, generic, and flexible and can be applied effectively anywhere where long-term groundwater level data are available.

Study Area and Groundwater Level Observations
The study area used was the country of Burkina Faso, West Africa ( Figure 1). Burkina Faso and the wider Soudano-Sahelian region have experienced significant multidecadal variability in rainfall, as exemplified by the well-known drought period during the 1970s-1990s that caused significant impacts on food production and vegetation in the Sahel (Epule et al., 2014). Recent evidence suggests there has been a subsequent recovery in rainfall in the region, although somewhat limited in magnitude (De Longueville et al., 2016;Nicholson et al., 2018). In Burkina Faso, there is a national groundwater level monitoring network, operated by the Direction Générale des Ressources en Eau (DGRE), which consists of boreholes at 52 sites across the country. Manual measurements of groundwater level are carried out at least two times per week by local trained observers at these sites. These measurements are taken under the supervision of regional government officers who collect the data from the observers, validate by identifying and rectifying outliers in consultation with the local observers, and transmit to the DGRE for processing. Of the boreholes, 12 have been monitored since the 1980s or earlier ( Figure 1 and Table 1). Limitations in human, financial, and material resources, in particular over the period 1993 to 2006, have resulted in some gaps in the records. The boreholes are predominantly located in shallow weathered basement rocks, with one site (Dingasso) located on fractured metasediments. Weathered basement rocks underlie~46% of sub-Saharan Africa (MacDonald et al., 2008). The sites cover a range of aridity and current land use settings, as summarized in Table 1.
Groundwater level data from this monitoring network has been the subject of limited previous research focusing on numerical modeling. Martin and Thiéry (1987) evaluated changes in groundwater levels at Ouagadougou from 1978 to 1985 and used a 1-D reservoir model to reconstruct monthly groundwater levels back to 1929. Using the same approach as Martin and Thiéry (1987), Filippi et al. (1990) developed 1-D reservoir models for six of the sites (Bassinko, Silmissin, Niangoloko, Tibou, Arbinda, and Katchari), but only using <2 years of monitoring data. Mouhouyouddine et al. (2017) subsequently also used the same approach to model observed groundwater levels at Ouagadougou from 1978. Recently, Cuthbert, Taylor, et al. (2019 used data from this site in a continental scale evaluation of controls on groundwater recharge processes. To the authors' knowledge, no published research has been undertaken on the other sites in the monitoring network. While this previous research has led to the development of bespoke numerical models for some of the sites in the monitoring network, it should be acknowledged from the outset of this research that the hydrogeological conceptualization of each of the 12 sites remains limited. Particularly outside of Ouagadougou, limited information is available on the nature of groundwater recharge (diffuse vs. focused) at each site; aquifer properties and stratigraphy; location and form of groundwater discharge (lateral groundwater flow, direct evapotranspiration of groundwater by deep rooted plants); and how anthropogenic influences may have impacted groundwater levels over time. The impact of this limited information on the development and application of the models is discussed in section 2.2.1.2.  Mackay et al., 2014) to simulate the observed groundwater level time series. The structure of AquiMod is shown in Figure 2. AquiMod was chosen as it has been specifically designed for modeling groundwater level time series at observation boreholes, is quick to run, and includes Monte-Carlo parameter sampling. The model consists of three modules containing simple algorithms for soil drainage, unsaturated zone water transfer, and groundwater flow in the saturated zone. The soil module uses the UN FAO method

10.1029/2020WR028056
Water Resources Research (Allen et al., 1998) to partition rainfall into evapotranspiration, runoff, and soil drainage. Soil drainage is then transferred through the unsaturated zone using a Weibull distribution function, which distributes drainage reaching the water table across a number of time steps. The discharge flux from the saturated zone is calculated based on Darcy's equation using the difference between the water table and outlet elevation and estimates of saturated hydraulic conductivity. Up to three layers in the saturated zone can be specified where detailed information on changes in hydraulic conductivity with depth are known. Further information detailing the equations used by AquiMod is provided in Supporting Information S1, and for a full description of the model, the reader is referred to Mackay et al. (2014).
AquiMod requires time series of rainfall and potential evapotranspiration (PET) as driving data. For each site, we used local daily rainfall observations collected by the National Agency of Meteorology of Burkina Faso as shown in Figure 1 and Table 1. We used the CRU PET data which derives PET at 0.5 0 resolution using the Penman-Monteith equation (Harris et al., 2014).

Model Structure and Calibration
As discussed in section 2.1, there is limited conceptual information for each site pertaining to groundwater recharge and discharge processes, as well as aquifer stratigraphy and hydraulic properties and any anthropogenic influences. This conceptual uncertainty means that it is challenging to define both the most appropriate model structure and associated parameter ranges for calibration of the model. Some have favored statistical approaches to groundwater level reconstruction in these data-scarce settings (Conrads & Roehl, 2007). However, the stronger physical basis of lumped conceptual modeling approaches has the advantage that when additional field data are collected to refine the conceptual model of each site, this can be integrated into a refined AquiMod model structure and parameter set (Jackson et al., 2016). This would not be possible in an entirely statistical approach.
Given this conceptual uncertainty, we used the most parsimonious published AquiMod model structure which consists of a simple one-layer saturated zone and the UN FAO soil and Weibull unsaturated zone modules discussed in section 2.2.1.1. The equations used in this AquiMod model structure are shown in Supporting Information S1. Soil water balance methods similar to the UN FAO method have been used as a part of the 1-D reservoir modeling previously undertaken at some of the sites, which resulted in acceptable model calibration of groundwater levels (Filippi et al., 1990;Martin & Thiéry, 1987;Mouhouyouddine et al., 2017). These methods have been widely applied in semiarid African settings for recharge estimation (see Wang et al., 2010, for a summary of studies). However, it is acknowledged that soil water balance methods may underestimate recharge in semiarid settings, particularly when applied using a long time step (Eilers et al., 2007). In these settings, focused, indirect recharge events associated with high intensity Note. Current land use derived from satellite imagery (Maxar Technologies, 2020) and Aridity Index derived from Trabucco and Zomer (2018).

Water Resources Research
ASCOTT ET AL.
rainfall events may dominate. Moreover, recharge estimates derived using soil water balance methods can be subject to significant uncertainty due to challenges in measurement of key components such as runoff and evapotranspiration (Wang et al., 2010). In this context, we highlight that the purpose of this research is to understand temporal variability in groundwater levels, not accurate quantification of groundwater recharge. We discuss the impact of the limitations of using a soil water balance method on the results (modeled groundwater level fluctuations) in section 4.4.
For each site, the model was calibrated using the full groundwater level time series. This approach has been shown to be more robust than undertaking split-sample model calibration and validation (Arsenault et al., 2018). The length and period of observed groundwater level time series are different for each site (see Table 1). We used the first observed groundwater level used to initialize the model, and we resampled the groundwater level observations to monthly mean values to avoid any bias associated with variability in the frequency of groundwater level monitoring within each time series. On the basis of guidance on application of soil water balance methods provided by Eilers et al. (2007) and the available meteorological data, the model was run on a daily time step. We used Monte-Carlo sampling using 4 × 10 6 simulations to calibrate the model, using the Nash-Sutcliffe efficiency (NSE, Nash & Sutcliffe, 1970; see Supporting Information S1, Equation 12) as the objective function. Jackson et al. (2016) and Moriasi et al. (2007) use a threshold of NSE > 0.5 to determine whether the model effectively simulates the groundwater level behavior, which we adopt here. Table 2 details the parameter ranges used in the calibration. In the calibration of lumped conceptual models for groundwater level reconstruction in the United Kingdom, Jackson et al. (2016) used existing catchment information to fix field capacity (FC), wilting point (WP), baseflow index (BFI), Weibull number of time steps (n), catchment length (x), and discharge elevation parameters (z). They calibrate the rooting depth, depletion factor for vegetation, Weibull shape (κ) and scale (λ) parameters, saturated hydraulic conductivity (K), and storage coefficient (S). We adopt a similar approach here but also calibrate the BFI as streams near the observation boreholes are generally reported to be dry or ephemeral and also have limited discharge data (BRGM, 1986).

Water Resources Research
Calibration ranges for maximum rooting depth, depletion factor, Weibull κ and λ, and fixed values for Weibull n were derived from Jackson et al. (2016) and Allen et al. (1998). As no information on BFI was available, this parameter was set to the full possible range. Values for FC and WP for each site were calculated using the van Genuchten (VG) equation. The VG parameters (α, n, θ r , and θ s ; see van Genuchten, 1980) for calculation of the water retention curve for each borehole location were derived from pedotransfer functions (PTFs) developed by Hodnett and Tomasella (2002) for tropical soil; these equations were also used and tested by Wösten et al. (2013) for the semiarid Limpopo river basin in Africa. These PTFs use percentage sand, silt, and clay as key input variables, as well as percentage of organic carbon, cation exchange capacity, bulk density, and pH. This information was obtained from the SoilGrids database (Hengl et al., 2017), for the six soil layers for each site. The calculated VG parameters were inserted in the VG equation used to describe the soil water retention curve (van Genuchten, 1980, see also Equations 1 and 2 in Wösten et al., 2013), with matric potential set to FC (10 kPa) and permanent WP (1,500 kPa), to yield the equivalent soil moisture contents for each layer. The mean of all six layers was used as input to AquiMod.
Calibration ranges of K and S were derived in part using site-specific pumping test data held by DGRE and summarized by BRGM (1986). The extent of information provided on pumping tests at each site is highly variable (BRGM, 1986). More detailed data are available for five sites (Arbinda, Bassinko, Nafona, Niangoloko, and Tibou). At these sites, 72-hr pumping tests were undertaken with drawdowns measured in the pumping borehole and observation boreholes. While detailed time-variant drawdown-pumping rate data are not available, quasi steady state drawdowns at the end of the tests have been provided, as well as estimates of transmissivity derived using the Theis (1935) and Gringarten and Witherspoon (1972) methods (3.2-11.2 m 2 day −1 ). For three sites (Katchari, Silmissin, and Tougou), transmissivity values (3.9-13.0 m 2 day −1 ) have been provided but with no information on the pumping test or analysis method. For the eight sites above, no information on the saturated thickness of the aquifer has been provided. We therefore made an initial estimate of hydraulic conductivity values based on the reported transmissivity values and an estimate of the saturated thickness under pumping conditions based on borehole logs, resulting in a range of 0.3-1.7 m day −1 . While Yameogo (2008) reports transmissivity, unconfined storage coefficient, and hydraulic conductivity for a number of sites across the city of Ouagadougou, Mouhouyouddine et al. (2017) notes that no successful pumping tests have been undertaken at the observation borehole in Ouagadougou used for long-term monitoring in this research. No pumping test data are available for the remaining three sites (Dingasso, Binde, and Ouda), and no data for unconfined storage coefficients are available for any of the sites. The resultant calibration ranges for K and S (Table 2) therefore represent a "best estimate," encompassing the K estimates derived above and considering the pumping test data quality and ranges reported for weathered basement aquifers in Africa (MacDonald et al., 2008;Ouedraogo et al., 2016;Wright, 1992) and West Africa specifically (Bianchi et al., 2020;Dickson et al., 2019;Koïta et al., 2017;SNC-Lavalin/INRS, 2011). Mackay et al. (2014) and Jackson et al. (2016) report significant interaction in AquiMod between hydraulic conductivity and the discharge elevation and aquifer length parameters. The location and form of groundwater discharge is poorly constrained at each site, and thus, no physically based information is currently available to constrain the discharge elevation and aquifer length. In order to keep the models as parsimonious as possible, and to reduce the number of dimensions sampled in the Monte-Carlo calibration, we undertook a series of preliminary calibration runs to determine appropriate fixed values for the discharge elevation and aquifer length that produce model results that meet the NSE calibration threshold used here.
The fixed values shown in Table 2 and used in the calibration are the mean of the values of the z (m) and x (m) parameters for the top 10 parameter sets derived from preliminary runs using ranges of z = 0 to minimum observed groundwater level and x = 0-10,000.

Reconstruction of Groundwater Levels
For eight sites, we reconstructed groundwater levels using the parameter sets for the top 10 groundwater level "behavioral" models by NSE derived during the model calibration. For each of the sites, we used the full length of the rainfall time series for the gauge locations shown in Figure 1 and Table 1. To ensure the model was in a dynamic balance (Rushton & Wedderburn, 1973) before the reconstruction, we ran a spin-up period of 50 years repeating the first year of observed rainfall and PET data as driving data, with the mean groundwater level observation as the starting head for the spin-up. For the remaining four sites, historical groundwater level reconstructions could not be derived using lumped conceptual 10.1029/2020WR028056 Water Resources Research models as observed, and calibrated groundwater levels show statistically significant long-term declines (p < 0.001, for a modified Mann-Kendall trend test) (Hamed & Ramachandra Rao, 1998), accounting for significant non-normality and autocorrelation (p < 0.001, for an Anderson-Darling normality test for each site).

Analysis of Groundwater Level Hydrographs
To compare different sites, for each site, we converted the reconstructed hydrograph for the best model parameter set to the standardized groundwater level index (SGI, Bloomfield & Marchant, 2013) for the period where the reconstructions for all the sites overlap . Bloomfield and Marchant (2013) detail the methodology for the calculation of the SGI. In brief, monthly mean groundwater levels are calculated from daily data; a normal-scores transform is then applied to data for each month separately; and the monthly time series is reconstructed so that it has a mean of zero and standard deviation of one. We then compared the standardized hydrographs by visual inspection in conjunction with the standardized precipitation index calculated using a 12-month accumulation period (SPI-12, McKee et al., 1993). We then used hierarchical cluster analysis using Ward's method and Euclidean distances (Haaf & Barthel, 2018) to determine whether groups of sites behaved in similar ways.
In the analysis of standardized hydrographs in the United Kingdom, Bloomfield and Marchant (2013) and Ascott et al. (2017) showed that differences in temporal variability between observed hydrographs can be characterized by quantifying m max , the timescale over which there is significant autocorrelation in each SGI time series. They showed that variability in m max could be related to attenuation of the recharge signal through the unsaturated zone (as indicated by water table depth) and observed hydrogeological properties (hydraulic diffusivity; the ratio of hydraulic conductivity to storage coefficient) of the aquifer. In our research, a dearth of real-world hydrogeological property data and significant gaps in the observed groundwater level time series makes using this method highly challenging. We therefore developed a novel approach that explores the sensitivity of m max to similar modeled data and parameters, and we use this to make inferences as to the potential real-world controls on temporal variability in groundwater levels. For each site, we correlated m max with (1) the mean modeled water table depth for 1962-2010 and (2) the groundwater response time (GRT, Currell et al., 2016;Cuthbert, Gleeson, et al., 2019) derived from the best model parameter set. GRT (days) can be estimated as where T is transmissivity (m 2 day −1 ), S is storage coefficient (-), and x is the aquifer length (m). Relating m max to GRT rather than hydraulic diffusivity as undertaken by Bloomfield and Marchant (2013) is advantageous because GRT integrates aquifer length and in doing so also reduces the dimensions of the model parameter uncertainty space by one. Figure 3 shows observed groundwater levels for the 12 long time series in Burkina Faso and daily rainfall time series used as inputs for the model calibration. There are a number of gaps in the time series, which is common for hydrological time series in West Africa (Gyau-Boakye & Schultz, 1994). Across the time series, a range of different modes of groundwater level variability can be observed. Groundwater level changes at Nafona and Dingasso are dominated by short-term intra-annual variability. The remaining sites are characterized by both intra-annual variability and long-term interannual to multidecadal variability. Binde, Ouda, Silmissin, and Katchari show long-term declines in groundwater level, with very limited intra-annual groundwater level fluctuation at Binde. Bassinko, Tibou, Tougou, Niangoloko, and Arbinda all show intra-annual variations in groundwater levels imposed on long-term trends. The longest time series, Ouagadougou, shows both substantial long-term trend and seasonality.

Water Resources Research
the behavioral threshold (NSE > 0.5). Values of these parameter sets, as well as the parameter sets for the top 10 models are shown in Supporting Information S1. There are limited differences in modeled groundwater levels between the top 10 parameter sets. Long-term increasing and decreasing trends at Arbinda, Bassinko, and Ouagadougou are well captured by the models, as well as long-term declines at Binde, Silmissin, Ouda, and Katchari. The skill of the models in capturing the extent of intra-annual variability in groundwater levels across the different sites varies considerably. For example, the model for Dingasso captures the full extent of seasonal variability. In contrast, at a number of other sites (e.g., Nafona, Niangoloko, Ouagadougou, and Tibou), the models do not match the extreme high and low groundwater levels present in the observed time series.

Groundwater Level Reconstructions and SGI Analysis
Figure 5 shows observed and reconstructed groundwater levels using the best and top 10 model parameter sets as described in section 3.2, for the eight sites which do not show significant long-term declines. There . Observed groundwater levels in meters below ground level (black, mBGL) and mean daily rainfall (gray, mm) for the 12 sites distributed across Burkina Faso (see Figure 1 and Table 1).

Water Resources Research
ASCOTT ET AL.

Water Resources Research
ASCOTT ET AL.

Water Resources Research
is generally a limited difference between the reconstructions in the top 10 model runs. Figure 6 shows reconstructed groundwater levels converted to SGI with SPI-12 for each site. The SGI time series are notably less smoothed than the reconstructed groundwater levels ( Figure 5) which is due to the SGI being derived from monthly values, whereas the raw reconstructed data are daily. Similar to the groundwater level observations (Figure 3) and calibrated models (Figure 4), there are significant variations in the groundwater level responses between the different sites. Arbinda, Bassinko, Niangoloko, Ouagadougou, and Tougou are all dominated by long-term variability in groundwater levels. Across these sites, some coherence of high and low groundwater level events present (e.g., high groundwater levels in the early 1960s and long-term declines through the 1970s and 1980s). In contrast, Dingasso and Nafona show more significant intra-annual variability, with limited long-term trends. Tibou shows a combination of both short-term variability and long-term trends, with both long-term recessions during the 1970s/1980s and superimposed intra-annual changes. Figure 7 shows a heatmap and dendrogram of the SGI derived from the reconstructions presented in Figure 6. SGI values for each site are shown by orange-red (dry, SGI < 0) to blue (wet, SGI > 0) colors, and the sites are

10.1029/2020WR028056
Water Resources Research ordered based on the hierarchical cluster analysis method reported in section 2.2.3. The coherence of multidecadal high and low groundwater level periods across multiple sites is evident, particularly at Tougou, Niangoloko, Arbinda, Ouagadougou, and Bassinko. The hierarchical clustering shown in the dendrogram reflects differences between the sites previously presented in Figures 3-6. Overall, three groups can be identified: (1) sites showing long-term groundwater level declines (Ouda, Silmissin, Binde, Katchari, see Figure 4), (2) three sites that show short-term variability in groundwater levels (Nafona, Dingasso, and Tibou), and (3) five sites that show long-term multidecadal groundwater level variability (Tougou, Niangoloko, Arbinda, Ouagadougou, and Bassinko). While some of the boreholes in these groups are closely located (e.g., Ouda and Binde), generally, there is no spatial coherence in these groupings. Figure 8 shows autocorrelation timescale m max for each site as a function of mean modeled water table depth for 1962-2010 and the GRT as derived from the best model parameter set. Sites that show short-term variability in groundwater levels (Dingasso, Nafona, and Tibou) have shallow water tables, small GRTs, and hence short m max , whereas sites with long-term changes (Arbinda, Bassinko, Ouagadougou, Tougou, and Niangoloko) have deeper water tables, longer GRTs, and hence longer m max . There is a weak positive correlation (Pearson r = 0.47, p = 0.24) between water table depth and m max , and a strong positive correlation (Pearson r = 0.93, p < 0.001) between GRT and m max .

Potential Controls on Differences in Temporal Variability in Observed and Reconstructed Groundwater Levels Between the Groups
The observed and modeled hydrographs presented in section 3 reveal three groups of sites with substantial differences in the modes of temporal variability between the clusters. Group 1 shows long-term declines in groundwater levels, group 2 shows short-term variability, and group 3 shows long-term multidecadal changes. Using both model results and the limited available conceptual information available, here, we evaluate potential causes for differences between the sites. There are three potential controls on the differences between the sites: (1) processes controlling generation of groundwater recharge; (2) the hydrogeological properties, geometry, and structure of the saturated and unsaturated zone; and (3) anthropogenic influences.
The cause of the long-term declines in groundwater levels observed in group 1 (Silmissin, Katchari, Ouda, and Binde) is likely to be principally due to anthropogenic influences. At Silmissin, land use has changed substantially between the initial borehole drilling (BRGM, 1986) and the present (Maxar Technologies, 2020).

Water Resources Research
The site is now located in a peri-urban setting with substantial human development, and there are reports that historic development of barrages have altered the surface water balance (Taylor, pers. Comm.), a common feature in West Africa (Fowe et al., 2015). It is therefore likely that anthropogenic influences such as barrage development, land use change, and potentially increased groundwater abstraction may have contributed to the long-term declines observed at this site. At Katchari, a large wellfield has been developed (UNESCO, 1999) nearby which may be causing long-term drawdown. The presence of seasonal changes in groundwater levels (superimposed on long-term declines) at Ouda suggests that the observed decline is not the result of recharge only occurring during episodic, high intensity focused recharge events. While current land use at both Ouda and Binde appear to be rural, it seems plausible that land use change or increases in local abstraction over previous decades are causing the long-term declines in groundwater levels.
The remaining sites can be broadly characterized by those that show short-term variability (Dingasso, Nafona, and Tibou) and those that show longer term trends (Arbinda, Bassinko, Niangoloko, Ouagadougou, and Tougou). These different modes of variability are reflected by differences in the autocorrelation timescale m max . The relationships between modeled m max , water table depth, and GRT suggest that while attenuation of the recharge signal in the unsaturated zone is likely to have some impact on the autocorrelation of the time series, the hydraulic properties and geometry of the saturated zone are likely to be the most significant control on the differences in temporal variability between groups 2 and 3. The limited conceptual information available makes it beyond the scope of this research to provide a detailed physically based explanation for these differences for all the sites. However, some plausible explanations can be made for two of the sites which have notably short and long m max values: Dingasso and Arbinda, respectively. Dingasso is located in weathered sandstone of relatively high permeability, and is in a valley bottom with a very shallow water table (BRGM, 1986). It is therefore likely that recharge to and discharge from the groundwater system (by evapotranspiration of shallow groundwater and lateral groundwater flow) occur rapidly, resulting in significant intra-annual variability, but that there are limited changes in groundwater levels over decadal timescales, and therefore low values of m max . In contrast, Arbinda is located in an area with a lack of surface drainage and where the water table is deep (BRGM, 1986). In this case, it is likely that little direct evapotranspiration of groundwater occurs, and groundwater discharge occurs via lateral flow over long distances, resulting in a notably long aquifer length and hence high m max . Areas of further work to refine the conceptual understanding at each site to better understand the reasons behind the differences between the groups is discussed in section 4.4.

Multidecadal Groundwater Level Variability Present in the Reconstructions
The analysis in section 4.1 has explored some potential causes of the differences in modes of temporal variability between the three groups. The driver of the long-term multidecadal variability within group 3 is primarily the driving rainfall data. This is exemplified by model results at Ouagadougou and Bassinko,

Water Resources Research
locations that are in relatively close vicinity to each other (only 13 km apart) and driven by the same rainfall data; as a result, they show very similar SGI time series (Figure 6). The multidecadal high and low periods evident in the groundwater level reconstructions ( Figure 5) can be related back to known meteorological periods reported in the standardized rainfall departures (SRD) presented by Nicholson et al. (2012) and Nicholson et al. (2018) for the Soudano-Sahelian region. At Bassinko and Ouagadougou, a period of low rainfall in the 1910s (SRD < −1) was followed by increases in rainfall in the 1920s-1930s (SRD > 0.5), resulting in decreases in reconstructed groundwater levels and a subsequent recovery. A brief drier period in the late 1940s (−0.5 < SRD < 0) occurred, although not as intense as the 1910s, resulting in short-term decreases in reconstructed groundwater levels at these sites. For all sites in group 3, there has been significantly higher rainfall in the 1950s and 1960s (SRD > 0.5), followed by decreases through the 1970s and 1980s (SRD < −1). This is also evident in the SPI-12 data presented in Figure 6. These rainfall trends have resulted in increases and then significant declines in reconstructed groundwater levels. These declines are also evident in the observed groundwater level data at Ouagadougou. From 1990 onwards, a partial recovery in rainfall has occurred (−0.5 < SRD < 1; see also SPI-12 in Figure 6), albeit more variable, resulting in limited increases in both modeled and observed groundwater levels in boreholes in group 3.
Long-term declines in rainfall (and hence reconstructed groundwater levels) from the 1950s to 1980s are driven in part by changing patterns of sea surface temperature, amplified by local soil and vegetation processes (Biasutti, 2019). The partial recovery in rainfall from 1990 to present may be explained by increasing sea surface temperature (Salack et al., 2016). Some recent work has, however, attributed the partial recovery primarily to the direct effect of greenhouse gases, with the general ocean warming playing a secondary role (Dong & Sutton, 2015). The variability in recent sea surface temperature is associated with both increases and decreases in groundwater levels evident in both the modeled and observed groundwater levels.

Implications for Water Resources Management
The different modes of temporal variability in the groundwater level observations and calibrated models have significant implications for the future management of groundwater resources. In section 4.1, we postulated that the long-term declines in groundwater levels present at sites in group 1 are principally controlled by changes in land use and abstraction. If this is the case, and further land use and abstraction changes occur in the future, it seems unlikely that these sites could be used to monitor changes in groundwater levels associated with changes in recharge due to climate variability and change. For the remaining sites, as illustrated in Figure 8, the principal controls on the SGI time series autocorrelation timescale (and hence response to rainfall events) are the hydrogeological structure and properties of each site. In response to the same driving rainfall event, it would be anticipated that sites in group 2 (short-term intra-annual variability) would react more rapidly than sites in group 3 (long-term multidecadal changes). Sites in group 2 could potentially be used as an early warning for later responses in group 3. Conversely, recovery from a longer, multi-annual extreme event would be anticipated to be slower in group 3 than group 2. These findings are consistent with the analysis of the autocorrelation timescales of standardized groundwater level hydrographs developed by Bloomfield and Marchant (2013) and Ascott et al. (2017).
Long duration groundwater level hydrographs are rare in sub-Saharan Africa (Cuthbert, Taylor, et al., 2019) but are essential for characterization of temporal variability of groundwater resources across timescales from intra-annual to multidecadal frequencies. In this research, the differences between the sites and responses to climate drivers (e.g., the Sahel drought of the 1970s/1980s and subsequent recovery) can only be ascertained through continued and extended monitoring of groundwater resources. This supports the assertion of Cuthbert, Taylor, et al. (2019) that long-term monitoring of groundwater levels across sub-Saharan Africa should be developed more widely. The groundwater level reconstructions derived in this research are beneficial data sets for water resources planning as they can be used as a baseline to contextualize current, short-term (seasonal) forecasts, and long-term (decadal or longer) future projections of groundwater resource status. The range of variability present in the reconstructed groundwater levels highlight that multidecadal variability in current and recent historic (past 100 years) climate needs to be considered in planning as well as future climate change. This historic variability, as well as the range of uncertainty in future climate projections in the Sahel (Monerie et al., 2017), supports the need for development of no-regrets adaptation measures for groundwater resources such as those suggested by World Bank (2010).

Limitations and Outlook for Further Research
The research reported here provides a number of unique and novel contributions to hydrological science. The presentation, modeling, and analysis of a national-scale collection of multidecadal groundwater level hydrographs is original and has not been published in a sub-Saharan African setting. Clustering of reconstructed SGI time series and analysis of SGI autocorrelation timescales with GRTs are globally novel, flexible, and generic methods and can be applied in other locations with groundwater level time series (e.g., other national groundwater monitoring networks with long time series in sub-Saharan Africa and elsewhere). There are, however, a number of limitations and areas for further work in this research, which we discuss here.
The conceptual understanding of processes controlling groundwater recharge and discharge are poorly constrained at each site. As a result, the simplest model structure has been used to calibrate the model, which has resulted in acceptable model calibrations. However, it is plausible that better model calibrations could be achieved using different model structures (e.g., two-or three-layer saturated zone models, or different soil moisture balance methods currently not included in AquiMod; e.g., Mansour et al., 2019;Rushton et al., 2006). Moreover, for some sites the model may be "getting the right answer for the wrong reasons." For example, the long-term declines in groundwater levels observed at Binde are well replicated by the model. In this case, the calibrated model parameter set results in no recharge being generated, resulting in a continuous groundwater level decline. However, in reality, the long-term declines may be a result of changes in land use and groundwater abstraction.
For a number of sites the full extent of groundwater level maxima and minima are not replicated by the calibrated models. In part, this is likely to reflect a bias in the groundwater level observations to less extreme values but may also reflect the nature of groundwater recharge processes at each site. As previously discussed in section 2.2.1.2, diffuse algorithms such as the FAO method (Allen et al., 1998) implemented in AquiMod are known to underestimate recharge when this occurs through focused, indirect events associated with high intensity rainfall. If this process is occurring at a site, not representing it in AquiMod may be a reason why the most significant rises in observed groundwater levels cannot be matched.
Reconstructed groundwater levels at Niangoloko and Tougou are notably higher than the historic observations, reflecting the long-term decreases in rainfall through the 1970s and 1980s. Given these reconstructions are substantially outside of the observed range, they should be considered to be more uncertain than the reconstructions at the other sites. The reconstructions have been developed using a one-layer model. It is plausible that actual groundwater levels during the period of reconstruction may have been different to those derived from the model. This is likely to be the case if the hydraulic properties of the aquifer at higher levels are different to the hydraulic properties derived in the calibration to observations or if there is any interaction with potential hydraulic boundaries to the aquifer system at higher levels (e.g., the ground surface, any low permeability material overlying the weathered basement). It should be noted, however, that despite this uncertainty, the overall trend in the reconstructions for these sites associated with the rainfall data is likely to be valid. The reconstruction trends in the 1970s and 1980s are consistent with actual groundwater level observations at other sites (Ouagadougou) during this period. While out of scope of the present study, there is also likely to be uncertainty in the rainfall and PET data itself used to derive the groundwater level reconstructions, which should be evaluated further.
The limitations discussed above can be addressed by further field investigations (e.g., soil physical measurements; aquifer tests; groundwater dating/isotopic and geochemical measurements; and gauging and baseflow separation of local surface water flows) and long-term groundwater level monitoring to develop the conceptual understanding of groundwater flow at each site. However, we recognize that undertaking such investigations can be challenging with the security situation in Burkina Faso (UK Foreign and Commonwealth Office, 2020), and even if they are completed, there is still likely to be considerable uncertainty in the conceptual understanding. We therefore advocate integrating any new conceptual information in a multimodel approach, testing of different model structures and parameter sets for the soil, unsaturated and saturated zone. This conceptual refinement and integration into the lumped parameter model would make AquiMod more physically based in this environment, as discussed in section 2.2.1.
In this research, we compared reconstructed groundwater levels across the sites by calculating SGI and undertaking hierarchical cluster analysis. Applying the SGI algorithm to sites where multidecadal variability is greater than shorter term intra-annual or interannual variability results in SGI time series with limited 10.1029/2020WR028056

Water Resources Research
intra-annual variability (e.g., Ouagadougou, Figure 6) in comparison to the raw time series (note the ranking of each month in the context of the full time series in the SGI method reported in section 2.2.3). This smoothing is likely to affect the subsequent cluster partition. Such smoothing was not reported in the initial development and application of the SGI by Bloomfield and Marchant (2013), who applied the algorithm to relatively stationary time series in UK aquifers with higher permeability and storage, and a more humid climate (Trabucco & Zomer, 2018) than in the basement aquifers of Burkina Faso. Recently, workers have compared different approaches to groundwater level hydrograph comparison and grouping in humid settings (Haaf & Barthel, 2018;Heudorfer et al., 2019), including the clustering approach we report in this research. Further work undertaking a similar comparison of approaches using the data reported here would be beneficial globally for improved analysis and interpretation of groundwater level hydrographs in more arid settings.
Having further refined the models developed in this research, we envisage a wide range of future applications to support water resources management in sub-Saharan African settings. Short-term (seasonal) forecasts and long-term (>decadal) projections of groundwater levels could be derived, as has been undertaken using lumped conceptual models elsewhere Mackay et al., 2015). With improved conceptual understanding gained by field investigations, it may be possible to quantify changes in the absolute magnitude of groundwater storage, rather than just levels, using the models. This would be beneficial as it could then be possible to quantify the buffering capacity of groundwater to drought events, which has been shown to be potentially significant in sub-Saharan Africa (Foster et al., 2012;MacDonald & Calow, 2009;Vouillamoz et al., 2015). Numerous approaches (recently summarized by Ascott et al., 2019) have linked lumped parameter models such as AquiMod to changes in groundwater levels in abstraction boreholes to estimate impacts of climate variability on individual borehole yields. Given suitable abstraction borehole data, this could be undertaken in Burkina Faso. The models, historical reconstructions, forecasts, and projections subsequently developed would then need to be post-processed and evaluated in the context of the key decision-relevant metrics (Bornemann et al., 2019) for water resources management and then integrated into existing planning frameworks for Burkina Faso.

Conclusions
This research has, for the first time in sub-Saharan Africa, presented, modeled, reconstructed, and analyzed a national-scale collection of multidecadal groundwater level hydrographs. We conclude that: 1. Groundwater level responses are classified into three groups: which are dominated by (1) long-term declines, (2) short-term intra-annual variability, and (3) long-term multidecadal variability. 2. Group 1 is likely to be controlled by anthropogenic influences such as land use change and groundwater abstraction. The hydrogeological properties and geometry of the aquifers control the differences between groups 2 and 3. Since 1990, sites in group 3 have shown a recovery, albeit limited, in groundwater levels following the drought of the 1970s and 1980s. 3. The differences in intra-annual to multidecadal variability in groundwater level changes across the sites have implications for water resources management and monitoring. 4. The reconstructed groundwater level hydrographs can contextualize current groundwater level status, future short-term forecasts, and longer term multidecadal scale projections of the impacts of climate change on groundwater resources. 5. The approach in this research is generic and flexible and can be adopted in other areas where groundwater level monitoring data exists.

Data Availability Statement
Potential evapotranspiration data used in this research are described in Harris et al. (2014)