Volume 58, Issue 6 e2021WR030216
Research Article
Open Access

Probabilistic Water Demand Forecasting Using Quantile Regression Algorithms

Georgia Papacharalampous

School of Engineering, University of Patras, University Campus, Patras, Greece

Correspondence to: G. Papacharalampous and A. Langousis

Andreas Langousis

School of Engineering, University of Patras, University Campus, Patras, Greece

First published: 05 May 2022

Abstract

Machine and statistical learning algorithms can be reliably automated and applied at scale. Therefore, they can constitute a considerable asset for designing practical forecasting systems, such as those related to urban water demand. Quantile regression algorithms are statistical and machine learning algorithms that can provide probabilistic forecasts in a straightforward way, and have not been applied so far for urban water demand forecasting. In this work, we fill this gap, thereby proposing a new family of probabilistic urban water demand forecasting algorithms. We further extensively compare seven algorithms from this family in practical one-day ahead urban water demand forecasting settings. More precisely, we compare five individual quantile regression algorithms (i.e., the quantile regression, linear boosting, generalized random forest, gradient boosting machine and quantile regression neural network algorithms), their mean combiner and their median combiner. The comparison is conducted by exploiting a large urban water flow data set, as well as several types of hydrometeorological time series (which are considered as exogenous predictor variables in the forecasting setting). The results mostly favor the linear boosting algorithm, probably due to the presence of shifts (and perhaps trends) in the urban water flow time series. The forecasts of the mean and median combiners are also found to be skillful.

Key Points

  • A new family of probabilistic urban water demand forecasting algorithms is proposed based on concepts from the statistical learning field

  • Seven algorithms from this family are extensively compared for probabilistic one-day ahead urban water demand forecasting

  • One of the largest urban water flow datasets in the field is exploited and one of the largest sets of predictor variables is investigated

Plain Language Summary

There is a well-known requirement for automation in urban water demand forecasting. There is also an ongoing (and relatively recent) effort to replace point forecasting systems (i.e., forecasting systems offering expected-value forecasts only) with probabilistic forecasting systems (i.e., forecasting systems offering expected-value forecasts and their uncertainty bounds) in technical and operational urban water demand settings. With this work, we aspire to contribute to these important efforts by proposing and extensively testing a new family of algorithms for probabilistic urban water demand forecasting. This family exploits several machine and statistical learning concepts that, in our view, could also offer much to the urban water demand forecasting field. In addition to introducing these concepts to the latter field, we conduct a large-scale comparison of seven algorithms and investigate the contribution of a large variety of input variables to solving one-day ahead urban water demand forecasting problems. Overall, this work could be used as a guide for (a) performing probabilistic urban water demand forecasting, and (b) conducting detailed assessments and comparisons of probabilistic forecasting methods in the field.

1 Introduction

Urban water demand or consumption variables (e.g., urban water flow at the inlet points of district metered areas or household water inflow) are variables with large practical interest (see, e.g., Donkor et al., 2014). Their statistical and distributional features have been extensively studied, for example, by Kossieris and Makropoulos (2018), and Kossieris et al. (2019). These features and their relationships with hydrometeorological variables (examined similarly to those of energy demand variables; see Tyralis et al., 2017) play a key role in the design of urban water demand forecasting methodologies (thoroughly reviewed and classified by Donkor et al., 2014). In fact, most of these methodologies reasonably (attempt to) consider all the periodicities present in the urban water demand data (i.e., the daily, weekly and annual periodicities) along with several exogenous predictor variables. The latter variables mostly include hydrometeorological variables that are found or assumed to be informative.

Apart from the above-outlined important considerations and, as automatic time series forecasting (Chatfield, 1988; Hyndman & Khandakar, 2008; Taylor & Letham, 2018) is an essential requirement for urban water supply management frameworks, emphasis should also be placed on the forecasting methods' wide applicability in operational conditions (in which only limited human intervention is possible). Machine and statistical learning algorithms (see, e.g., Alpaydin, 2014; Hastie et al., 2009; James et al., 2013; Witten et al., 2017) can be reliably automated and applied at scale (Papacharalampous et al., 2019). Therefore, they are befitting and increasingly adopted for solving urban water demand forecasting problems (see, e.g., Duerr et al., 2018; Herrera et al., 2010; Herrera et al., 2011; Lee & Derrible, 2020; Nunes Carvalho et al., 2021; Quilty & Adamowski, 2018; Quilty et al., 2016; Smolak et al., 2020; Xenochristou & Kapelan, 2020; Xenochristou et al., 2020; Xenochristou et al., 2021), and several other water informatics problems (see, e.g., Althoff, Dias, et al., 2020; Althoff, Filgueiras, & Rodrigues, 2020; Althoff, Bazame, & Garcia, 2021; Markonis & Strnad, 2020; Rahman, Hosono, Kisi, et al., 2020; Rahman, Hosono, Quilty, et al., 2020; Sahoo et al., 2019; Scheuer et al., 2021; Tyralis, Papacharalampous, & Langousis, 2021; Tyralis & Papacharalampous, 2017; Xu, Chen, Zhang, & Chen, 2020; Xu, Chen, Moradkhani, et al., 2020).

While most of the urban water demand forecasting methodologies are designed to produce mean-value forecasts (see again the review by Donkor et al., 2014), there are also a few recent ones issuing probabilistic forecasts (e.g., that by Gagliardi et al., 2017). The latter methodologies include some that are based on machine and statistical learning algorithms (e.g., those by Kley-Holsteg & Ziel, 2020; Quilty & Adamowski, 2020; Quilty et al., 2019; Rezaali et al., 2021). However, currently they do not include quantile-regression-based methodologies (see Section 2), although such methodologies and their related concepts have already been exploited in other water informatics contexts, such as the probabilistic prediction of hydrological signatures in ungauged regions (see Tyralis, Papacharalampous, Langousis, et al., 2021) or probabilistic hydrological post-processing contexts (see, e.g., Dogulu et al., 2015; Papacharalampous et al., 2020; Tyralis, Papacharalampous, Burnetas, & Langousis, 2019), as an alternative to other approaches (see, e.g., Althoff, Rodrigues, & Bazame, 2021; Montanari & Koutsoyiannis, 2012; Sikorska-Senoner & Quilty, 2021).

More precisely, Tyralis, Papacharalampous, Langousis, et al. (2021) have compared boosting with linear models as base learners and boosting with a combination of linear models and stumps as base learners (Bühlmann & Hothorn, 2007; Hofner et al., 2014; Hothorn et al., 2020) in probabilistically predicting −in quantile regression settings− mean daily discharge, 5% flow quantile, 95% flow quantile, baseflow index, average duration of high-flow events, frequency of high-flow days, average duration of low flow events, frequency of low-flow days, runoff ratio, streamflow precipitation elasticity, slope of the flow duration curve and mean half-flow date in 667 catchments in the contiguous United States. They have found that the former boosting algorithm performs better than the latter in predicting signature quantiles at levels 2.5% and 97.5%, while the opposite holds for the prediction of the median. Moreover, Papacharalampous et al. (2019) and Tyralis, Papacharalampous, Burnetas, and Langousis (2019) have converted conceptual hydrological model outputs into probabilistic hydrological predictions (via post-processing) at the daily time scale for 511 catchments in the contiguous United States by applying and combining a variety of quantile regression algorithms. Specifically, the algorithms proposed and/or applied in the latter two works include the linear-in-parameters quantile regression (Koenker & Bassett, 1978), two versions of generalized random forests (Athey et al., 2019; Meinshausen, 2006), gradient boosting machine with trees as base learners (Friedman, 2001), boosting with linear models as base learners, quantile regression neural networks (Taylor, 2000; Cannon, 2011), the mean combiner of quantile predictions (Lichtendahl et al., 2013) and a more advanced stacked generalization algorithm (which is stacked on the top of the individual algorithms for optimally combining their outputs). 
The latter two algorithms have been shown to perform mostly better than the remaining ones for this specific technical problem. Multiple quantile regression algorithms have also been compared beyond water science and water informatics. Vasseur and Aznarte (2021) have compared 10 quantile regression algorithms (three of which post-process the forecasts obtained by linear regression; for the latter algorithm, see, e.g., James et al., 2013, Section 3; Hastie et al., 2009, Section 3.2) in forecasting air quality in the city of Madrid. They have found that quantile gradient boosted trees (a modified version of LightGBM; Ke et al., 2017) perform the best for their case study. They have also found that forecasts of similar quality can be obtained by post-processing the forecasts of linear regression using quantile k-nearest neighbors (Mangalova & Shesterneva, 2016).

In this work, we aim to fill the current gap of not benefitting from quantile regression algorithms and their underlying concepts in the urban water demand forecasting field and, therefore, to propose a new family of urban water demand forecasting algorithms. We also aim to provide the first extensive comparison of such algorithms by using one of the largest datasets used so far in the field (see also the datasets in Duerr et al., 2018; Xenochristou & Kapelan, 2020; Xenochristou et al., 2020, 2021). We additionally consider one of the largest sets of predictor variables and provide large-scale results on their relative importance. We conduct our investigations at the daily time scale (as previously done, e.g., by Zhou et al., 2000; Adamowski et al., 2012; Quilty et al., 2019). This specific time scale is relevant to operational control, planning and management, together with the hourly and weekly time scales (Donkor et al., 2014, Table 1). In fact, water utilities need to know how large the demand will be today (and tomorrow) to optimally operate their treatment plants and wells.

Table 1. Quantile Regression Algorithms Applied in This Work

| S/n | Quantile regression algorithm | Indicative application details | R package | Main reference |
| --- | --- | --- | --- | --- |
| 1 | Quantile regression | | quantreg | Koenker (2021) |
| 2 | Linear boosting | {Initial boosting iterations = 2,000} | mboost | Hothorn et al. (2020) |
| 3 | Generalized random forests | {Number of trees = 2,000} | grf | Tibshirani and Athey (2020) |
| 4 | Gradient boosting machine | {Number of trees = 2,000} | gbm | Greenwell et al. (2020) |
| 5 | Quantile regression neural networks | {Number of hidden nodes = 1, Hidden layer transfer function = sigmoid} | qrnn | Cannon (2019) |
| 6 | Mean combiner of forecasts | | All the above | Lichtendahl et al. (2013) |
| 7 | Median combiner of forecasts | | | |

Note. The application is made in R Programming Language (R Core Team, 2021; see also Appendix A) by adopting the default hyperparameter values in the utilized R packages. Algorithms #1–5 are individual algorithms, while algorithms #6 and #7 are combiners of the forecasts of the five individual algorithms.

2 Methods and Concepts

This work builds on several concepts that are new to the field. The main one is the focus on modeling the relationships between the predictors and conditional quantiles of the predictand (i.e., the core concept of quantile regression algorithms) instead of modeling the relationship between the predictors and the conditional mean of the predictand (as done by standard regression algorithms). We exploit this specific concept by applying seven quantile regression algorithms (see Table 1) that are characterized by diverse algorithmic features and exhibit varying performances in real-world problems (with their relative performance being dependent on the problem; see, e.g., the relevant literature information in Section 1).

The application is made in R Programming Language (R Core Team, 2021; see also Appendix A). For algorithms #1–5, an additional degree of automation (beyond that already incorporated into the five exploited R packages; see Table 1) is required for building the regression matrix (as is also the case for standard regression algorithms) and for forecasting at several quantile levels while eliminating quantile crossing (if present). In fact, only the functions of the quantreg and grf packages can be used for simultaneously issuing quantiles at more than one quantile level. For most of the individual algorithms, each quantile level therefore requires a separate training and a separate run in forecast mode; as a result, the predictive quantiles can sometimes cross. The further automated versions of the algorithms check whether this phenomenon occurs and treat it in an ad hoc manner. For algorithms #6 and #7, the automation involves: (a) the independent training and the independent run in forecast mode of each individual algorithm (by using the further automated versions of the algorithms with direct practical utility); and (b) the computation of the mean or the median of the individual quantile forecasts. Step (b) is performed separately for each set {time point, quantile level}.
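The ad hoc treatment of quantile crossing is not detailed here. As an illustration only, the Python sketch below (the study itself uses R) detects crossing among the quantile forecasts issued for a single time point and, as an assumed treatment, rearranges them in increasing order. The function names are hypothetical, and sorting is only one of several possible remedies.

```python
def has_crossing(quantile_forecasts):
    """True if forecasts, ordered by increasing quantile level, ever decrease."""
    pairs = zip(quantile_forecasts, quantile_forecasts[1:])
    return any(later < earlier for earlier, later in pairs)


def fix_crossing(quantile_forecasts):
    """Assumed treatment: rearrange the forecasts in increasing order."""
    return sorted(quantile_forecasts)


# Forecasts for one day at levels 2.5%, 10%, 50%, 90% and 97.5% (made-up values)
forecasts = [2.1, 2.6, 3.4, 3.3, 4.0]
if has_crossing(forecasts):
    forecasts = fix_crossing(forecasts)
```

Sorting keeps the set of forecast values unchanged and only reassigns them to quantile levels, which is why it is a convenient default when nothing else is known about the cause of the crossing.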

Among the selected quantile regression algorithms, the simplest one (that also serves as a benchmark for the others) is the linear-in-parameters quantile regression algorithm (simply referred to as “quantile regression” in the literature and in what follows) by Koenker and Bassett (1978; see also Koenker, 2005). An explanation of this algorithm from a practitioner's point of view can be found in the tutorial by Waldmann (2018). In summary, this algorithm performs minimization of the quantile score (see, e.g., Gneiting & Raftery, 2007) averaged over all observations. At level a ∈ (0, 1), the quantile score imposes a penalty equal to L(r; x) to a predictive quantile r, when x materializes according to Equation (1). In this equation, I{⋅} denotes the indicator function.
$$L(r; x) := (r - x)\,(I\{x \leq r\} - a) \tag{1}$$
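The quantile score can be made concrete with a short numerical sketch. The Python functions below are an illustration only (the study was carried out in R), and their names are hypothetical. They assume the common form of the score, L(r; x) = (r − x)(I{x ≤ r} − a), as given by Gneiting and Raftery (2007).

```python
def quantile_score(r, x, a):
    """Penalty for predictive quantile r at level a when x materializes."""
    indicator = 1.0 if x <= r else 0.0
    return (r - x) * (indicator - a)


def average_quantile_score(forecasts, observations, a):
    """Quantile score averaged over all forecast-observation pairs."""
    scores = [quantile_score(r, x, a) for r, x in zip(forecasts, observations)]
    return sum(scores) / len(scores)
```

At a high level such as a = 0.9, a forecast that falls below the observation is penalized nine times more heavily than one that exceeds it by the same amount, which is what pushes the minimizer toward the upper part of the conditional distribution.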

This same minimization is also the objective of three of the remaining selected individual algorithms, specifically of linear boosting (proposed by Bühlmann & Hothorn, 2007), gradient boosting machine (proposed by Friedman, 2001), and quantile regression neural networks (proposed by Taylor, 2000 and improved by Cannon, 2011), while generalized random forests do not rely on the quantile loss function.

The boosting algorithms belong to the broader family of ensemble learning methods (Sagi & Rokach, 2018). In summary, the concept behind boosting is the composition of a strong learner (i.e., an algorithm with high predictive ability) by iteratively improving (boosting) weak base learners (i.e., algorithms with low predictive ability). These base learners are linear learners for the linear boosting algorithm and decision trees by Breiman et al. (1984) for the gradient boosting machine algorithm, and are added to the ensemble sequentially. At each iteration, the new base learner is trained to minimize the error of the ensemble (composed of all the previously added base learners). The number of iterations should be large enough to ensure proper fitting and small enough to avoid overfitting. More detailed popularizations of the boosting algorithms can be found in the works by Mayr et al. (2014), and Tyralis and Papacharalampous (2021).
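The iterative idea can be sketched in a few lines of Python. The code below is a simplified illustration of componentwise gradient boosting with linear base learners and the quantile score as the loss; it is not the mboost implementation used in the study. The function names, the starting value (a sample median), the step length `nu` and the number of iterations are all assumptions made for the example.

```python
def average_quantile_score(forecasts, observations, a):
    """Quantile score at level a, averaged over all pairs."""
    total = 0.0
    for r, x in zip(forecasts, observations):
        total += (r - x) * ((1.0 if x <= r else 0.0) - a)
    return total / len(observations)


def boost_linear_quantile(X, y, a, n_iter=500, nu=0.1):
    """Componentwise boosting with linear base learners for the level-a quantile.

    X is a list of predictor rows and y the list of observations.
    Returns the coefficients, with the intercept in position 0.
    """
    rows = [[1.0] + list(row) for row in X]   # constant column acts as intercept learner
    n, p = len(rows), len(rows[0])
    coef = [0.0] * p
    coef[0] = sorted(y)[n // 2]               # start from a sample median (assumption)
    fitted = [coef[0]] * n
    for _ in range(n_iter):
        # Negative gradient of the quantile score at the current fit
        u = [a if yi > fi else a - 1.0 for yi, fi in zip(y, fitted)]
        best = None
        for j in range(p):
            column = [row[j] for row in rows]
            sxx = sum(v * v for v in column)
            if sxx == 0.0:
                continue
            # Least-squares fit of the gradient on a single predictor
            beta = sum(v * ui for v, ui in zip(column, u)) / sxx
            rss = sum((ui - beta * v) ** 2 for v, ui in zip(column, u))
            if best is None or rss < best[0]:
                best = (rss, j, beta)
        if best is None:
            break
        _, j, beta = best
        coef[j] += nu * beta                  # shrunken update of the selected learner only
        fitted = [fi + nu * beta * row[j] for fi, row in zip(fitted, rows)]
    return coef


def predict(coef, X):
    """Linear prediction from the boosted coefficients."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in X]
```

Because the gradient of the quantile score is bounded, each update is small, which is consistent with the large number of boosting iterations typically needed in quantile settings. Updating only the best-fitting predictor at each iteration is also what gives this kind of boosting its built-in variable selection.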

Generalized random forests (Athey et al., 2019) is another ensemble learning algorithm for quantile regression, which is formulated as a variant of the original random forest algorithm by Breiman (2001). Random forests are widely applied in water science and water informatics (Tyralis, Papacharalampous, & Langousis, 2019). They work by averaging an ensemble of trees and use an additional randomization procedure with respect to bagging by Breiman (1996). With respect to quantile regression forests (another variant of the original random forest algorithm for quantile regression by Meinshausen, 2006), generalized random forests are theoretically expected to be more suitable for modeling heterogeneities because of their partitioning mechanism in the nodes of the decision trees.

Quantile regression neural networks (Cannon, 2011; Taylor, 2000) are artificial neural networks (see, e.g., Hastie et al., 2009, Chapter 11) especially formulated for quantile regression. The family of artificial neural network algorithms is perhaps the most popular family of machine learning algorithms in water informatics (see, e.g., the review by Maier et al., 2010; see also Table 1 in Tyralis, Papacharalampous, & Langousis, 2019). The main concept behind this specific family is the extraction of linear combinations of the predictor variables followed by the modeling of the predictand as a non-linear function of these linear combinations (Hastie et al., 2009, Chapter 11.1).

Apart from the various individual algorithms, there are also forecast combination methodologies. This category of methodologies is herein represented by (a) the mean combiner of the forecasts obtained by the five selected individual algorithms and (b) the median combiner of the same forecasts. Simple combination methodologies such as these are known to be hard-to-beat in practice in the forecasting field (Lichtendahl et al., 2013; Winkler, 2015), sometimes even by more advanced stacked generalization approaches. Still, their advantages are not yet widely applied in water science and water informatics (Papacharalampous & Tyralis, 2020).
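A minimal Python sketch of the two combiners follows, as an illustration rather than the study's R code. It assumes that each individual algorithm has already produced a table of quantile forecasts indexed by time point and quantile level; the data layout and the function name `combine` are hypothetical.

```python
from statistics import mean, median


def combine(forecasts_by_algorithm, combiner):
    """Combine quantile forecasts separately for each {time point, quantile level}.

    forecasts_by_algorithm maps an algorithm name to a list (one entry per
    time point) of lists (one forecast per quantile level).
    """
    tables = list(forecasts_by_algorithm.values())
    n_times = len(tables[0])
    n_levels = len(tables[0][0])
    return [
        [combiner([table[t][q] for table in tables]) for q in range(n_levels)]
        for t in range(n_times)
    ]


# Usage with made-up forecasts for one time point and two quantile levels
example = {
    "algorithm A": [[1.0, 4.0]],
    "algorithm B": [[2.0, 5.0]],
    "algorithm C": [[6.0, 9.0]],
}
mean_forecasts = combine(example, mean)
median_forecasts = combine(example, median)
```

The median combiner is less sensitive than the mean combiner to a single algorithm issuing an extreme forecast, which is one reason both are worth comparing.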

3 Experimental Design

We perform probabilistic one-day ahead urban water demand forecasting at the daily temporal scale. Our solutions to this technical problem are predominantly based on the algorithms and concepts outlined in the previous section, while their extension to other horizons and/or temporal scales is possible, and would be a straightforward process from an algorithmic point of view. Information on statistical software is provided in Appendix A.

To extensively compare the selected quantile regression algorithms (see Table 1), we use a large urban water flow data set comprising recent measurements (taken during 2015–2020) from 54 local automated stations (hereafter referred to as “gauges”) located at the inlet points of individual district metered areas of the water distribution network of the city of Patras in Western Greece (see Figure 1). The district metered areas exhibit various sizes, topographic and network specific characteristics, as well as data availability (see Serafeim et al., 2022). These gauges are part of the “Integrated System for Pressure Management, Remote Operation and Leakage Control of the Water Distribution Network of the City of Patras”, which is the largest smart water network in Greece, with the Municipal Enterprise of Water Supply and Sewerage of the City of Patras (DEYAP) acting as the competent Authority for its operation and management (Karathanasi & Papageorgakopoulos, 2016).

Figure 1. Locations of the 54 urban water flow gauges and the hydrometeorological station exploited in this work, and daily urban water demand data availability.

Starting from the originally measured time series of 1-min resolution, we form time series of daily urban water flow means for each gauge. In our final data set, we only retain daily values computed from 1-day-long 1-min time series with up to 20% of their values missing (over 24 hr). We remove the outliers of the daily urban water flow time series, identified with Friedman's super smoother (Friedman, 1984; Hyndman & Khandakar, 2008), as well as unrealistic segments (that might be due to unsuccessful measurements). The means of the final daily urban water flow time series (i.e., those used to form the training and testing samples) range from 0.06 L/s to 95.65 L/s. Their mean is 8.43 L/s and their median is 3.12 L/s.
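The daily aggregation rule can be sketched as follows. This Python snippet is an illustration under stated assumptions, not the study's R code: it assumes that each day is represented by 1,440 one-minute slots with `None` marking a missing measurement, and the function name and the `max_missing_fraction` parameter are hypothetical. Outlier removal with the super smoother is not covered.

```python
def daily_mean(minute_values, max_missing_fraction=0.20):
    """Mean of one day's 1-min flows, or None if too many values are missing."""
    n_missing = sum(1 for v in minute_values if v is None)
    if n_missing / len(minute_values) > max_missing_fraction:
        return None                      # day discarded from the final data set
    valid = [v for v in minute_values if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Usage: a day with 200 missing minutes is kept, one with 400 is dropped
kept = daily_mean([3.0] * 1240 + [None] * 200)
dropped = daily_mean([3.0] * 1040 + [None] * 400)
```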

In addition to the urban water flow time series, we use: high, mean and low temperature time series; high, mean and low dew point time series; high, mean and low humidity time series; high, mean and low wind speed time series; high and low atmospheric pressure time series; and total precipitation time series. These time series are daily and originate from a single hydrometeorological station (see Figure 1). They extend from January 2015 to December 2020 and have no missing values, thereby successfully covering the entire time periods in which urban water demand data are available.

Based on the 54 daily mean urban water flow time series and the 15 daily hydrometeorological time series, we define and use the following 52 predictors for probabilistically predicting mean urban water flow at day t:
  1. Mean urban water flow at days {t − k, k = 1, …, 7}

  2. High, mean and low temperature at days {t − k, k = 1, …, 3}

  3. High, mean and low dew point at days {t − k, k = 1, …, 3}

  4. High, mean and low humidity at days {t − k, k = 1, …, 3}

  5. High, mean and low wind speed at days {t − k, k = 1, …, 3}

  6. High and low pressure at days {t − k, k = 1, …, 3}

  7. Total precipitation at days {t − k, k = 1, …, 3}
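The predictor set above amounts to 7 lagged flow values plus 15 hydrometeorological variables at each of 3 lags, that is, 7 + 45 = 52 predictors. The Python sketch below builds such a regression matrix for one gauge. It is an illustration only: the function name, the data layout and the choice to skip samples containing missing values (`None`) are assumptions, not details taken from the study.

```python
def build_samples(flow, weather, flow_lags=7, weather_lags=3):
    """Build predictor rows and targets for one-day ahead forecasting.

    flow: list of daily mean flows (None where missing).
    weather: list with one entry per day, each a list of the daily
    hydrometeorological values (15 in the setting described above).
    """
    first_day = max(flow_lags, weather_lags)
    X, y = [], []
    for t in range(first_day, len(flow)):
        row = [flow[t - k] for k in range(1, flow_lags + 1)]   # endogenous lags
        for k in range(1, weather_lags + 1):
            row.extend(weather[t - k])                          # exogenous lags
        if flow[t] is None or any(v is None for v in row):
            continue                                            # assumed handling of gaps
        X.append(row)
        y.append(flow[t])
    return X, y
```

For a gauge with 10 consecutive days of data, this yields 3 samples of 52 predictors each, the first one targeting the eighth day.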

Considering urban water demand in the few previous days and weekly periodicity is expected to increase performance in daily urban water flow time series forecasting and, therefore, endogenous predictors up to seven days before day t are selected. On the contrary, hydrometeorological exogenous information up to three days before day t is expected to be sufficient (as weather conditions in the last three days are expected to be much more informative than previously observed weather conditions). Moreover, the motivation here has been to consider all the 15 daily hydrometeorological time series and rely on the good properties of most of the selected models (see Section 2) to handle possible redundant information. For example, boosting algorithms perform intrinsic variable selection (see their property description, e.g., in Tyralis & Papacharalampous, 2021), and the random forest variants (see their property description, e.g., in Tyralis, Papacharalampous, & Langousis, 2019) are known to not be affected by such redundancy (if any). In light of these latter facts, the boosting and generalized random forests algorithms are (roughly) expected to provide their best forecasts when provided with as much information as possible (thereby also potentially affecting in a positive way the performance of the simple combination methodologies).

By considering the selected predictors, we form the available samples for algorithm training and testing for each gauge. The sizes of these samples are depicted in Figure 1. These sizes differ from gauge to gauge due to differences in terms of data availability (i.e., different periods of data availability and differences in terms of missing values). Apart from using the predictors to train the algorithms and apply them in forecast mode, we also assess their relative importance in solving probabilistic one-day ahead urban water demand forecasting problems for all the gauges. This assessment is made by exploiting the generalized random forest algorithm, which allows the computation of a simple weighted sum of how many times a feature was split on at each depth in the forest (Tibshirani & Athey, 2020). In support of this assessment, we examine Spearman's correlations between the predictand and predictor variables, as well as between the predictor variables, separately for each gauge. We compute these correlations to assess the monotonic relationships between the variables (while Pearson's correlations assume linear relationships).
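The difference between the two correlation measures can be illustrated with a short Python sketch (an illustration only, with hypothetical function names). Spearman's correlation is computed here as Pearson's correlation applied to ranks, with tied values receiving their average rank.

```python
def average_ranks(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(u, v):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def spearman(u, v):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(u), average_ranks(v))
```

For a monotonic but non-linear relationship, such as a predictand that grows with the cube of a predictor, Spearman's correlation reaches its maximum value while Pearson's correlation stays below it.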

To compare the algorithms, we train each of them (again separately for each gauge) by using the first 50% of the available samples (e.g., those corresponding to the first two years of a four-year time series), and use the trained versions to probabilistically predict mean urban water flow (by predicting the quantiles at probability levels 2.5%, 10%, 50%, 90% and 97.5%) in the second 50% of the available samples, using as inputs the values of the 52 predictors in this same latter half. Finally, we assess the output forecasts by computing the average quantile score according to Equation (1), and summarize the results in terms of (a) relative differences with respect to the benchmark algorithm and (b) rankings. Here, we should note that the computation of the average quantile score is made, separately for each set {algorithm, quantile level}, by using (a) the quantile forecasts of the algorithm assessed each time and (b) the true values corresponding to these quantile forecasts (and not the quantile forecasts of the benchmark), while the relative differences with respect to the benchmark are computed subsequently (based on the previously computed scores). This latter computation is made according to Equation (2); see also Hyndman and Koehler (2006, Section 2.3) for a thorough discussion on relative performance measures. In this equation, $\mathrm{RD}_{F_1,F_2}$ denotes the relative difference in terms of average quantile score, while $\bar{L}_{F_1}$ and $\bar{L}_{F_2}$ denote the average quantile scores of the quantile forecasts $F_1$ and $F_2$, respectively, with $F_1$ being a quantile forecast provided by an arbitrary algorithm and $F_2$ being a quantile forecast provided by the benchmark (for the same quantile level).
$$\mathrm{RD}_{F_1,F_2} := \frac{\bar{L}_{F_1} - \bar{L}_{F_2}}{\bar{L}_{F_2}} \tag{2}$$
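The evaluation steps can be sketched in Python as follows (an illustration only; the study uses R and the function names are hypothetical). The sketch assumes that the relative difference is the score of an algorithm minus the score of the benchmark, divided by the score of the benchmark, so that negative values indicate an improvement over the benchmark; multiplying by 100 would express it as a percentage.

```python
def split_in_half(samples):
    """First half for training, second half for testing (chronological order)."""
    half = len(samples) // 2
    return samples[:half], samples[half:]


def relative_differences(average_scores, benchmark):
    """Relative difference of each algorithm's average score to the benchmark's."""
    reference = average_scores[benchmark]
    return {
        name: (score - reference) / reference
        for name, score in average_scores.items()
    }


def rankings(average_scores):
    """Rank 1 goes to the algorithm with the lowest average quantile score."""
    ordered = sorted(average_scores, key=average_scores.get)
    return {name: position + 1 for position, name in enumerate(ordered)}
```

Both summaries would be computed separately for each gauge and quantile level, starting from average quantile scores that each algorithm obtains against the observed values of the testing half.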

4 Experimental Results

4.1 Predictor Variable Importance

As we have used a large number of predictor variables, it is meaningful to start by presenting large-scale results on the relative importance of these variables (i.e., their rankings from the most to the least informative ones) in solving probabilistic one-day ahead urban water demand forecasting problems. We should mention, at this point, that this relative importance cannot be directly inferred by computing correlations (at least not in a straightforward and trustworthy way). Such correlations between the predictand and the predictor variables, as well as between the predictor variables, are presented in Figure 2 for the regression setting using daily urban water flow data from an arbitrary gauge (see also some characteristic relationship explorations in Figures 3 and 4) and in Figures S1−S53 for the regression settings using daily urban water flow data from the remaining gauges (see Supporting Information S1). While the correlations between the exogenous predictors are (almost) identical for all the regression settings (as data from a single hydrometeorological station are used for the investigations), the remaining correlations can vary significantly from setting to setting (probably because of the different land uses and the different water losses that lead to differences in the patterns characterizing the 54 urban water demand time series), thereby also posing some additional difficulties to our interpretations related to the scale of our investigations (i.e., the large number of the investigated regression settings).

Figure 2. Spearman's correlations between the predictand and the predictor variables, as well as between the predictor variables, for the “Med Frigo” gauge. Rectangles classify the correlations into groups. For instance, the correlations between the exogenous predictors are presented in the upper square with dimensions 45 × 45, while the correlations between the endogenous predictors are presented in the lower square with dimensions 7 × 7. The mean rankings of the predictor variables according to their importance in solving probabilistic one-day ahead urban water demand forecasting problems (see Figure 5) have been used for ordering the variables in both axes.

Figure 3. Examples of relationships between the predictand (i.e., mean urban water flow at day t) and endogenous predictor variables (i.e., mean urban water flow at days (a) t − 1, (b) t − 2 and (c) t − 7) for the “Med Frigo” gauge.

Figure 4. Examples of relationships between the predictand (i.e., mean urban water flow at day t) and exogenous predictor variables (i.e., (a) mean temperature, (b) mean dew point, (c) mean humidity, (d) mean wind speed, (e) high atmospheric pressure and (f) total precipitation at day t − 1) for the “Med Frigo” gauge.

The formal way for inferring this relative importance has, therefore, been adopted and the rankings of the predictor variables have been computed separately for each urban water flow gauge. These rankings are presented in Figure 5, along with their mean values computed over all the urban water flow gauges. The most important predictor variables on average are the lagged urban water flow variables (endogenous predictors) from the most to the least recent ones. The only exception to this latter rule concerns the urban water flow variables observed seven days before the forecast time. In fact, these variables are found to be more important than the urban water flow variables observed five and six days before the forecast time. This outcome could be attributed to the weekly periodicity.

Figure 5. Rankings of the predictor variables according to their importance in solving probabilistic one-day ahead urban water demand forecasting problems using data from the 54 urban water flow gauges and hydrometeorological data. Rectangles separate the endogenous (rectangle on the left) from the exogenous (rectangle on the right) predictor variables. The predictor variables are displayed in the horizontal axis from the most to the least important on average across the 54 urban water flow gauges (from the left to the right). The mean rankings are shown at the bottom.

The second most important category of predictors includes those related to temperature, which together with the endogenous predictors have been found to be the dominant predictor variables for almost all urban water flow gauges. A mixed-variable group of exogenous predictors follows. This group includes mean, high and low humidity variables, mean and high wind speed variables, high and low atmospheric pressure variables, and mean, high and low dew point variables. The rankings of these predictors vary considerably from gauge to gauge. On the contrary, the least important predictors are, for almost all urban water flow gauges, the total precipitation and low wind speed ones.

4.2 Forecasting Performance Assessment

It is meaningful to compare the outputs of the seven quantile regression algorithms with different properties. A visual inspection of such outputs can be made through Figure 6. This figure presents probabilistic one-day ahead urban water demand forecasts provided by three algorithms for an arbitrary urban water flow time series. We observe that the linear boosting algorithm (Figure 6a) and the mean combiner of the five individual algorithms (Figure 6c) produce better prediction intervals than the gradient boosting machine algorithm (Figure 6b). These latter prediction intervals are, in fact, wider than necessary for the largest observed values. Moreover, their width varies less than the widths of the prediction intervals obtained by the other two algorithms.

Figure 6. Probabilistic one-day ahead urban water demand forecasts (specifically, the median-value forecasts, 80% prediction intervals and 95% prediction intervals) provided by (a) the linear boosting algorithm, (b) the gradient boosting machine algorithm and (c) the mean combiner of the five individual algorithms (depicted using shades of red) for an arbitrary urban water flow time series (depicted as purple points).
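A visual judgment of interval quality of this kind can be complemented by two simple diagnostics, sketched below in Python under stated assumptions: the 80% prediction interval is taken as the central interval bounded by the 10% and 90% quantile forecasts, and all numbers are made up. Empirical coverage and mean width are auxiliary checks only; they are not the scores used for the formal comparison in this work.

```python
def interval_coverage(lower, upper, observed):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for lo, up, y in zip(lower, upper, observed) if lo <= y <= up)
    return inside / len(observed)


def mean_interval_width(lower, upper):
    """Average width of the prediction intervals."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


# Made-up quantile forecasts at levels 10% and 90%, and made-up observations.
q10 = [90.0, 95.0, 100.0, 110.0, 105.0]
q90 = [110.0, 115.0, 125.0, 140.0, 125.0]
observed = [100.0, 116.0, 120.0, 130.0, 104.0]

coverage_80 = interval_coverage(q10, q90, observed)
width_80 = mean_interval_width(q10, q90)
```

Between two sets of intervals with similar coverage, the narrower one is generally preferable, which is the sense in which intervals can be "wider than necessary".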

A detailed comparison of the seven algorithms has been conducted regarding their performance when utilized with the same set of predictors for probabilistic one-day ahead urban water demand forecasting. In Figure 7, we present the mean and median values of the relative differences computed for the tested algorithms with respect to the quantile regression benchmark in forecasting quantiles of five different levels. A uniformly best algorithm for all the examined quantile levels has not been identified; however, linear boosting has been identified as the best-performing algorithm overall in terms of both the mean (Figure 7a) and the median (Figure 7b) relative differences, with the mean and median combiners being very close to it (or even better) for specific quantile levels. Generalized random forests, gradient boosting machines and quantile regression neural networks perform worse than the linear boosting algorithm and the quantile regression benchmark, probably because of some shifts (or trends) characterizing the urban water demand time series.

Figure 7. (a) Means and (b) medians across the 54 urban water flow gauges of the relative differences (%) in terms of average quantile score for the tested algorithms with respect to quantile regression (benchmark) in forecasting quantiles of levels 2.5%, 10%, 50%, 90% and 97.5%.
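For readers who wish to reproduce this type of comparison, the following Python sketch computes the quantile score (also known as the pinball loss) and a relative difference with respect to a benchmark. The function names and the data are illustrative, and the sign convention (negative values meaning a lower, that is better, score than the benchmark) is an assumption that may differ from the one used in Figure 7.

```python
def quantile_score(forecast, observed, level):
    """Pinball loss of a single forecast of the quantile at `level` (0 < level < 1)."""
    error = observed - forecast
    return level * error if error >= 0 else (level - 1.0) * error


def average_quantile_score(forecasts, observations, level):
    """Mean pinball loss over a forecast period."""
    scores = [quantile_score(f, y, level) for f, y in zip(forecasts, observations)]
    return sum(scores) / len(scores)


def relative_difference(score, benchmark_score):
    """Relative difference (%) of a score with respect to the benchmark score.

    Assumed convention: negative values mean the score is lower (better).
    """
    return 100.0 * (score - benchmark_score) / benchmark_score


# Made-up median (level 0.5) forecasts for four days.
observations = [10.0, 12.0, 11.0, 13.0]
benchmark_forecasts = [12.0, 12.0, 12.0, 12.0]
candidate_forecasts = [11.0, 13.0, 12.0, 13.0]

benchmark_score = average_quantile_score(benchmark_forecasts, observations, 0.5)
candidate_score = average_quantile_score(candidate_forecasts, observations, 0.5)
difference = relative_difference(candidate_score, benchmark_score)
```

The same functions apply unchanged to the other quantile levels; only the `level` argument and the corresponding forecasts differ.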

Lastly, in Figure 8 we present the rankings of the seven algorithms in solving probabilistic one-day ahead forecasting problems. These rankings depend to some extent on the quantile level (thereby highlighting further that there is not a uniformly best algorithm), and mostly favor the linear boosting algorithm and the two combiners of forecasts.

Figure 8. Rankings of the tested algorithms in forecasting quantiles of levels (a) 2.5%, (b) 10%, (c) 50%, (d) 90% and (e) 97.5% for the 54 urban water flow gauges.
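Rankings of this kind can be derived from the average quantile scores obtained at each gauge for a given quantile level. The Python sketch below is illustrative only: the gauge names, the subset of algorithms and the scores are made up, and ties are broken by the order in which the algorithms are listed, which is an assumption.

```python
def rank_algorithms(scores):
    """Rank algorithms by average quantile score (1 = lowest score, i.e., best)."""
    ordered = sorted(scores, key=scores.get)  # stable sort: ties keep listing order
    return {name: position for position, name in enumerate(ordered, start=1)}


# Made-up average quantile scores at one quantile level for two gauges.
scores_by_gauge = {
    "gauge_A": {"quantile_regression": 0.30, "linear_boosting": 0.25, "mean_combiner": 0.27},
    "gauge_B": {"quantile_regression": 0.40, "linear_boosting": 0.42, "mean_combiner": 0.38},
}

ranks_by_gauge = {gauge: rank_algorithms(s) for gauge, s in scores_by_gauge.items()}

# Number of gauges at which each algorithm is ranked first.
first_places = {}
for ranks in ranks_by_gauge.values():
    for name, rank in ranks.items():
        if rank == 1:
            first_places[name] = first_places.get(name, 0) + 1
```

Repeating the computation for each quantile level shows how the rankings depend on the level.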

5 Discussion

Our large-scale results demonstrate the wide applicability of the proposed family of probabilistic urban water demand forecasting methods. This wide applicability stems from a well-known advantage of machine and statistical learning algorithms, that is, their appropriateness and efficiency as parts of technical and operational frameworks (Papacharalampous et al., 2019). It also allows their proper assessment and comparison, which relies on big datasets comprising many representative cases (Boulesteix et al., 2018). Herein, we have presented such an assessment. We have also adopted consistent scoring rules (see, e.g., Gneiting & Raftery, 2007; Gneiting, 2011, Section 2.2), thereby providing a guide for future studies implementing the general-purpose concept of quantile regression algorithms (see, e.g., Waldmann, 2018) for urban water demand forecasting purposes.

Among the most notable findings of this work are the findings on predictor variable importance. Extensive related investigations are missing from the urban water demand forecasting literature, although similar examples have been thoroughly discussed in energy research (e.g., by Eseye et al., 2019). We believe that these specific findings of the present work could be a good starting point for selecting predictors in similar probabilistic forecasting settings, given also the large scale of our investigations. We have found that the endogenous predictors (i.e., the lagged urban water demand variables) are the most important ones on average, followed by the temperature predictors. Additionally, we have found that precipitation and low wind speed variables are less important than temperature, humidity, wind speed (high and mean), atmospheric pressure and dew point variables. Of course, these findings are somewhat dependent on the study area; therefore, they should be confirmed for every case study of interest.

Other limitations (and possible extensions) of this work should also be discussed. First, the algorithms have been herein applied with their default hyperparameters. This specific strategy is characterized in the literature as a “reasonable and justified choice” for most practical problems (see, e.g., Arcuri & Fraser, 2013; Tyralis, Papacharalampous, & Langousis, 2019); still, the comparison of the algorithms could lead to somewhat different outcomes if hyperparameter optimization had been performed. This, of course, would require additional computational resources and could, therefore, be restrictive within operational frameworks (while, for instance, our large-scale experiments herein required less than an hour to run on a regular PC). Furthermore, we have only compared the one-day ahead forecasting performance of the algorithms. Therefore, extensions of this work are required to validate that the best-performing algorithms stay the same for longer forecast horizons. Of course, it is expected that the performance of the algorithms will be somewhat reduced in such a case.

As the boosting and generalized random forest algorithms are not affected by less important predictors by construction (see their property description, e.g., in Tyralis, Papacharalampous, & Langousis, 2019, and Tyralis & Papacharalampous, 2021), we have considered the entire set of endogenous and exogenous predictors herein. Nonetheless, to reduce the computational requirements (which increase with the number of predictors), one could prefer to use fewer predictors even when algorithms with such good properties (in terms of predictors) are selected. Further, removing some potentially redundant variables (which, based on Figure 5, are expected to be somewhat different for each problem case) could lead to performance improvements for quantile regression neural networks and probably also for quantile regression, although for the latter algorithm some regularization could also help. Another limitation of this work worth discussing concerns the removal of outliers from the daily urban water demand time series. Such outliers, both exceptionally low and exceptionally high (e.g., those exhibited during exceptional events such as the Easter or Christmas holidays), are of high practical interest for water utilities, so their skillful prediction is also important. However, a meaningful examination of this aspect could not be supported by the daily time series of this work, which comprise only a few such events (as each time series covers at most a few years; see Figure 1). In this view, a direct expansion of this work would require longer time series and would include the outliers in the forecasting experiments.

Probably due to the presence of shifts (and perhaps trends) in the urban water demand time series, the linear boosting algorithm has been found to be the best-performing quantile regression algorithm in solving probabilistic one-day ahead urban water demand forecasting problems. Boosting algorithms are known for their ability of “garnering wisdom from a council of fools” (Tyralis & Papacharalampous, 2021), and hold their own special place among the machine and statistical learning algorithms. They have also been shown to perform well in probabilistic energy demand forecasting (see the results by Taieb et al., 2016). This might be due to possible similarities characterizing energy and water demand variables (e.g., in terms of shifts and trends).

Some final remarks should be made on the good properties of simple combination methodologies. These properties have already been discussed in water science and informatics by Papacharalampous et al. (2020), and include their ability to “harness the wisdom of the crowd” (see, e.g., the in-depth discussions in the forecasting field by Lichtendahl et al., 2013; Winkler, 2015). In the present work, these same properties have resulted in forecasts that are, in most cases, comparably skillful to the forecasts provided by the best-performing algorithm. This also implies that the simple combination methodologies might be able to retain their good properties even in cases in which some of the combined forecasts are much less skillful than the rest. This practically means that, by using simple combination methodologies instead of using individual algorithms, we can considerably increase robustness in forecasting, thereby reducing the risk of obtaining a bad forecast at every single forecast attempt. Reasonable solutions to practical forecasting problems could also be more advanced stacked generalization approaches (see, e.g., Tyralis, Papacharalampous, Burnetas, & Langousis, 2019), whose investigation in probabilistic urban water demand forecasting could be the subject of future research.
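The behavior of the two simple combiners can be illustrated with a minimal Python sketch. It assumes that the forecasts are combined separately for each day and each quantile level, and it uses made-up forecasts of a single quantile in which one hypothetical algorithm is clearly off; the short algorithm labels are illustrative.

```python
from statistics import mean, median


def combine_quantile_forecasts(forecasts_by_algorithm, combiner):
    """Combine, day by day, the forecasts of one quantile level."""
    per_day = zip(*forecasts_by_algorithm.values())
    return [combiner(day) for day in per_day]


# Made-up forecasts of one quantile for two days; "qrnn" is deliberately poor.
forecasts = {
    "qr": [100.0, 102.0],
    "lb": [101.0, 103.0],
    "grf": [99.0, 104.0],
    "gbm": [102.0, 101.0],
    "qrnn": [150.0, 60.0],
}

mean_combined = combine_quantile_forecasts(forecasts, mean)
median_combined = combine_quantile_forecasts(forecasts, median)
```

In this example the median combiner stays close to the four forecasts that agree, whereas the mean combiner is pulled toward the outlying forecast, although much less than an unlucky choice of a single algorithm would be.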

6 Concluding Remarks

We have proposed a new family of probabilistic urban water demand forecasting methods. This family relies on the general concept of quantile regression, which has not been investigated before in the field. Inspired by this concept, we have applied and extensively compared seven algorithms with direct practical utility in probabilistic one-day ahead urban water demand forecasting settings. For the comparison, we have used data from 54 urban water flow stations along with hydrometeorological time series of several types. The key findings and take-home messages of this work are the following:
  1. Quantile regression algorithms are straightforward-to-use and, therefore, appropriate for designing practical systems for urban water demand forecasting.

  2. The linear boosting algorithm has been identified as the best-performing one in our forecasting experiments.

  3. The above outcome could be attributed to the presence of shifts or trends in the urban water demand time series.

  4. For our study area, the temperature predictor variables have been found to be more important in solving probabilistic one-day ahead urban water demand forecasting problems than the remaining exogenous predictor variables.

  5. Also, humidity, wind speed (high and mean), atmospheric pressure and dew point variables have been found to be more important than precipitation and low wind speed variables.

  6. Simple combination algorithms can reduce the risk of providing bad-quality forecasts by increasing robustness.

Acknowledgments

This research work has been conducted within the project PerManeNt, which has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE (project code: T2EDK-04177). We are grateful to the Associate Editor and the anonymous Referees for their very constructive and fruitful suggestions, which helped us to substantially improve the paper. We also sincerely thank the Municipal Enterprise of Water Supply and Sewerage of Patras (DEYAP) for providing the urban water flow data in their recorded form. The first author gratefully acknowledges the Department of Water Resources and Environmental Modeling of the Faculty of Environmental Sciences of the Czech University of Life Sciences in Prague, Czech Republic, the Department of Engineering of the Roma Tre University in Rome, Italy, and the Department of Water Resources and Environmental Engineering of the School of Civil Engineering of the National Technical University of Athens in Athens, Greece, as portions of this work have been conducted during her time as a researcher there. Lastly, she is sincerely grateful and appreciative to the Data Skeptic data science podcast and its host Kyle Polich for inviting her to discuss this work (https://dataskeptic.com/blog/episodes/2022/water-demand-forecasting).

    Appendix A: Statistical Software

    The urban water demand forecasts have been issued and assessed in the R programming language (R Core Team, 2021), which has also supported the various explorations and visualizations of this work. Specifically, the following contributed R packages have been used: data.table (Dowle & Srinivasan, 2021), devtools (Wickham, Hester, & Chang, 2020), dplyr (Wickham et al., 2021), forecast (Hyndman & Khandakar, 2008; Hyndman et al., 2021), gbm (Greenwell et al., 2020), gdata (Warnes et al., 2017), ggExtra (Attali & Baker, 2019), ggplot2 (Wickham, 2016; Wickham, Chang et al., 2020), ggrepel (Slowikowski, 2021), grf (Tibshirani & Athey, 2020), knitr (Xie, 2014, 2015, 2021), lubridate (Grolemund & Wickham, 2011; Spinu et al., 2020), maps (Brownrigg et al., 2018), maptools (Bivand & Lewin-Koh, 2021), MASS (Ripley, 2021; Venables & Ripley, 2002), mboost (Hofner et al., 2014; Hothorn et al., 2020), qrnn (Cannon, 2011, 2019), quantreg (Koenker, 2021), readr (Wickham & Hester, 2020), reshape2 (Wickham, 2007, 2020a), rmarkdown (Allaire et al., 2021; Xie et al., 2018) and tidyr (Wickham, 2020b).

    Data Availability Statement

    The hydrometeorological data were retrieved on 9 January 2021 from the following address: https://www.wunderground.com/dashboard/pws/IU0394U06. The urban water demand data are protected under a nondisclosure agreement. Interested parties can request access to them directly from the Municipal Enterprise of Water Supply and Sewerage of Patras (DEYAP; https://www.deyap.gr).