Real‐Options Water Supply Planning: Multistage Scenario Trees for Adaptive and Flexible Capacity Expansion Under Probabilistic Climate Change Uncertainty

Planning water supply infrastructure includes identifying interventions that cost‐effectively secure an acceptably reliable water supply. Climate change is a source of uncertainty for water supply developments as its impact on source yields is uncertain. Adaptability to changing future conditions is increasingly viewed as a valuable design principle of strategic water planning. Because present decisions impact a system's ability to adapt to future needs, flexibility in activating, delaying, and replacing engineering projects should be considered in least‐cost water supply intervention scheduling. This is a principle of Real Options Analysis, which this paper applies to least‐cost capacity expansion scheduling via multistage stochastic mathematical programming. We apply the proposed model to a real‐world utility with many investment decision stages using a generalized scenario tree construction algorithm to efficiently approximate the probabilistic uncertainty. To evaluate the implementation of Real Options Analysis, the use of two metrics is proposed: the value of the stochastic solution and the expected value of perfect information, which quantify the value of adopting adaptive and flexible plans, respectively. An application to London's water system demonstrates the generalized approach. The resulting investment decisions are a mixture of long‐term and contingency schemes that are optimally chosen considering different futures. The value of the stochastic solution shows that by considering uncertainty, adaptive investment decisions avoid £100 million net present value (NPV) cost, 15% of the total NPV. The expected value of perfect information demonstrates that optimal delay and early decisions have £50 million NPV, 6% of total NPV. Sensitivity of results to the characteristics of the scenario tree and uncertainty set is assessed.


Introduction and Background
Water utilities aim to maintain an efficient and reliable water supply service by optimally combining the scheduling of supply augmentation projects and demand reduction policies (Mortazavi-Naeini et al., 2014). Water planners investigate a range of feasible interventions, including both supply-side interventions (e.g., wastewater reuse, desalination, and reservoirs) and demand-side interventions (e.g., demand reduction and leakage reduction). In its simplest form, the capacity expansion problem refers to finding the optimum timing and scale of predefined projects. Deterministic supply-demand optimization aims to meet service-level commitments under historically dire conditions and identifies a fixed least-cost schedule of system upgrades (Padula et al., 2013). However, fixed investment plans are brittle; that is, if future conditions turn out to be different from those assumed, the plan is likely to fail (Chung et al., 2009). The antidotes to brittleness are robustness (defined as a decision that performs acceptably well over a range of conditions) and flexibility (defined as the ability to switch a decision depending on outcomes that materialize; Maier et al., 2016). Methods that use an ensemble of plausible scenarios to seek robustness and flexibility are discussed below.
Robust decision making is an attempt to identify plans that perform well under a wide range of plausible future conditions (Lempert et al., 2006). That is, investment plans should aim to be insensitive to the most significant uncertainties (J. W. Hall, Lempert et al., 2012; Huskova et al., 2016; Lempert et al., 2006; P. A. Ray & Brown, 2015). Robust plans trade optimality for the ability to perform acceptably well in a wide range of future scenarios. Robust decision making has been applied in a range of water resource planning contexts, such as in England (Matrosov et al., 2013), in Australia (Mortazavi-Naeini et al., 2015), and in Southern California (Tingstad et al., 2013). Robust approaches accommodate a wide range of possible future conditions (i.e., mild to dire). Depending on the statistics used to quantify the performance of the system over the set of possible scenarios, they may lead to excess capacity (overinvestment; Herman et al., 2015) if an excessively conservative set of actions is chosen (Shapiro, 2012). If optimization is used, different metrics to define robustness will lead to different results (McPhail et al., 2018; Mortazavi-Naeini et al., 2015). A robustness metric determines how a definition of robustness is operationalized (Kwakkel, Eker, & Pruyt, 2016). Misdefined robustness metrics generally lead to solutions that underestimate the system performance with respect to the one achievable with a better metric (Giuliani & Castelletti, 2016).

RESEARCH ARTICLE
Water Resources Research 10.1029/2017WR021803
Adaptive approaches are based on considering the uncertain future and responding to future conditions by adjusting intervention schedules as the future manifests (Maestu & Gómez, 2012). Adaptability enables a system to change proactively in response to environments, markets, regulations, and technology (De Neufville & Scholtes, 2011). Dynamic Adaptive Policy Pathways (DAPP) and Real Options Analysis (ROA) are among the decision-making processes that identify adaptive strategies under an uncertain future, each in a different way (P. A. Ray & Brown, 2015). While DAPP appears in the literature to be implemented in situations where information on the likelihood of the multiple plausible futures is absent (Haasnoot et al., 2013; Kwakkel et al., 2015; Kwakkel, Haasnoot, & Walker, 2016), ROA typically makes use of probability information (Dixit & Pindyck, 1994; P. A. Ray & Brown, 2015) to treat future uncertainty.
DAPP is an amalgamation of two approaches, Adaptive Policy Making and Adaptation Pathways. The former is a structured approach for designing dynamic robust plans (Dessai & Sluijs, 2007; Kwakkel et al., 2010; Walker et al., 2001), and the latter approach uses adaptation tipping points to specify the conditions under which a given plan will fail as it no longer meets the specified objectives. DAPP includes transient scenarios representing multiple uncertainties used to analyze the vulnerabilities and opportunities of policy actions and how they develop gradually over time. Alternative types of actions are then identified to address these potential vulnerabilities and opportunities, specifying a dynamic adaptive plan (Hamarat et al., 2014; Herman et al., 2015; Kwakkel et al., 2015). In a water resource management context, adaptation tipping points could be a certain climate change trigger indicating that the current plan must change as new actions are needed to ensure water supply security. The challenge of this approach for water resource management applications is to identify good triggers despite high natural variability, and to set up a monitoring framework when only short periods of measurements are available (Diermanse et al., 2010).
ROA is a probabilistic decision process with the ability to value the flexibility and adaptability in future decision making when irreversibility and uncertainty are key characteristics of the decision problem (Dixit & Pindyck, 1994). While it can be used as part of the evaluation and design of DAPP (Buurman & Babovic, 2016), it is mainly used to enable planners to examine the implications of future uncertainties. Within ROA, flexibility is valued since it allows delaying commitment to large, costly, and irreversible decisions while either exercising different interventions or incrementally implementing interventions with high regret cost and long construction times until more information is available. Adaptation is enabled because ROA provides an optimal sequence of future investment decisions that respond to changes in uncertainty over time. Traditional ROA methods are based on financial theory, such as the Black-Scholes equation (Black & Scholes, 1973) or expected value decision tree analysis (Dixit & Pindyck, 1994).
ROA is implemented through different techniques. These include decision trees, lattices, and Monte Carlo analysis (Chow & Regan, 2011; De Neufville & Scholtes, 2011; Lander & Pinches, 1998; Trigeorgis, 1996) as well as multistage stochastic optimization programs (De Weck et al., 2004; Wang & De Neufville, 2004; Zhao et al., 2004). Combinations of staged decision making (Beh et al., 2014; Cai et al., 2015; Hobbs, 1997; Kang & Lansey, 2012; Kracman et al., 2006; P. Ray et al., 2011; Vieira & Cunha, 2016) and ROA (Jeuland & Whittington, 2014; Steinschneider & Brown, 2012; Woodward et al., 2014) can be found in the water and flood management literature. The number of decision stages in these multistage problems defines how frequently intervention strategies can be modified within the planning horizon. For example, the long-term water supply planning model of P. Ray et al. (2011) under climate change uncertainties extends 75 years into the future, with decision stages in years 2035, 2060, and 2085. In another work, the model of Woodward et al. (2014) stages flood risk interventions at 50-year time steps over a 100-year time horizon. There has been significant effort, using different decomposition methods (Escudero, 2009; Mulvey & Ruszczyński, 1995; Rockafellar & Wets, 1991) and/or uncertainty reduction and clustering techniques (Dupačová et al., 2003; Gröwe-Kuska et al., 2003; Gülpınar et al., 2004; Heitsch & Römisch, 2005; Housh et al., 2013; Latorre et al., 2007; Šutienė et al., 2010), to represent long-term future uncertainty in stages using a scenario tree. Nevertheless, applying ROA in water resource planning is still challenging for three reasons. First, ROA is sensitive to the structure of the scenario tree, so the parameterization of its design must be defensible. This includes deciding the number of nodes over the planning horizon and choosing the branching between states. Second, the probability assignment to scenario branches and nodes affects the optimized investment decisions. This can become intractable for a relatively complex problem. Lastly, as the number of scenarios used grows, the problem becomes more complex, often without increasing the quality of the solution (Lander & Pinches, 1998; Wang & De Neufville, 2004).
The decision-making process presented in this paper aims to explicitly seek adaptability and flexibility in least-cost supply-demand infrastructure investment planning. We estimate the value of adaptability and flexibility under conditions of probabilistic uncertainty where probabilities are assigned to future states of supply. This is different from decision making under deep uncertainty approaches (Lempert et al., 2006) where key criteria for evaluating alternative decisions such as robustness, adaptability, and trading-off conflicting objectives are addressed without requiring probabilities (Kasprzyk et al., 2012; Lempert et al., 2006).
To account for the above, this paper proposes a generalized uncertainty sampling and optimized scenario tree construction approach for multistage investment planning. We optimally build a scenario tree with multiple decision stages to allow for frequent and regular modifications to the investment strategies. The decision tree presented in this paper uses a range of supply scenarios to represent uncertainties of future climate change effects from mild to dire. The range of possible climate change futures was defined by the UKCP09 weather generator, which provides probabilistic projections of precipitation, temperature, and other variables for the United Kingdom using perturbed physics ensemble simulations (J. Murphy, Sexton, Jenkins, Boorman, et al., 2009). The analysis has used UKCP09 data assuming that the impacts are for a medium emissions scenario, as reported in Thames Water (2014). The scenario tree is incorporated into a multistage stochastic optimization formulation that applies ROA for enabling flexible and adaptive water resource investment decisions. Frequent corrective decisions allow the model to compensate for insufficient or excessive investment made in initial decision stages. The recommendations of the proposed method depend on the probabilities assigned to the supply scenarios; errors in those probabilities will lead to errors in the model's recommendations. To measure the adaptability and flexibility enabled by the ROA implementation, two metrics are used and discussed. We apply the model to a water supply infrastructure planning problem in England over 50 years with a 5-year decision-making time step.
The proposed approach is described in section 2, and the results of its application to Thames Water's London supply zone are presented and discussed in sections 3 and 4. Two metrics to evaluate the implementation of ROA are proposed in section 4.2. Sensitivity of results to the use of different scenario trees and to the characteristics of the uncertainty set used to create the trees is assessed in section 4.3. Section 4.4 discusses the limitations of the proposed method, and section 5 concludes the paper.

Adaptive and Flexible Formulation for ROA Implementation
We take two steps in formulating a multistage stochastic program for ROA implementation. In the first step, a scenario tree (see definition in section 2.1) is generated to approximate the stochastic supply representing an ensemble of plausible futures. In the second step, a multistage mathematical programming formulation is solved on the scenario tree to obtain the future plan under plausible future scenarios. The section concludes with an illustration of a utility that practices real-options investment decision making provided by the proposed formulation (section 2.3).

Scenario Tree Approximation
We consider a discrete time horizon T in which decisions are made at each stage t ∈ T. To facilitate adaptive decision making under changing future conditions, and to represent the multistage planning for flexible decision making, a set of paths is built to represent the evolution of an uncertain future. The paths, or trajectories, correspond to a particular state of the uncertain parameter in time. These paths are approximated using a tree structure, which we refer to as a scenario tree. The scenario tree, schematized in Figure 1a, is built by creating the root node at time stage 1 associated with the first stage deterministic decision. The successor nodes to the root depict the possible outcomes of the next decision point at time index 2. This process is repeated until the end of the planning horizon, resulting in a tree structure. A single scenario is then defined as a unique path from the root node to a terminal (leaf) node, showing one realization of the future. The probability of scenario occurrence is defined by multiplying all state transition probabilities along the scenario path from the root to the leaf. The scenario tree is an approximation of the stochastic process and is suitable for multiperiod decision making because, up to a given point on the tree, the past is shared among a set of scenarios while future events are yet to manifest. In Figure 1a, an example scenario tree structure is presented. We see that tree nodes F and G share a common point C and all decisions that come before it. Nonanticipativity enforces that investment decisions at time t only utilize information that is available up to this stage. Hence, all decisions made for scenarios 2 and 3 should be the same on nodes A and C. The possible outcomes from C in the next stage are a transition to either F with probability $p_5$ or G with probability $p_6$, subject to $p_5 + p_6 = 1$. The number of leaf nodes corresponds to the number of distinct scenarios, and their probabilities are calculated as the product of the associated transition probabilities from the root to the leaf node. For instance, the probability that supply scenario $s_3$ occurs, from the root to the end of the planning horizon, is $p_2 \times p_6 \times p_{11}$.

Figure 1. (a) The parameters $s_i$ and $p_i$ are the scenarios and the transition probabilities for each outcome branch, respectively; for each pair of branches the sum of the probabilities adds to 1. A path is defined from root node to leaf node at the end of the planning horizon. (b) An illustration of a simple water resource problem solved with the proposed real-options formulation. The supply-demand gap and the activated intervention are provided above and below each tree node, respectively.

Manually generating the above scenario tree and deciding on the number of nodes, leaves, and probability information on each node for practical purposes requires complex calculation and sufficient judgment (Lander & Pinches, 1998). This is a major deterrent to ROA implementation in complex decision problems, as scenario trees can quickly grow large. To address this, we automatically construct the scenario tree by implementing the fast forward iterative greedy algorithm, which aims to minimize a so-called probability distance between the uncertainty sets. The algorithm optimally creates a most informative scenario tree based on the original stochastic process by successively bundling the tree nodes into separate sets, each later represented by a new node, while keeping the probability information of the constructed uncertain process as close as possible to that of the original stochastic process. Bundling similar scenarios and reducing the number of nodes not only produces a smaller, computationally tractable multistage decision model but also reduces the burden of manually representing the uncertainty through scenario tree generation for multistage stochastic ROA implementation. Appendix A gives details of the construction algorithm, where the quality of the constructed tree is controlled by a metric that calculates the percentage of information lost, known as the relative probability distance (Heitsch & Römisch, 2011). The lower the metric value, the less information is lost and hence the more accurate the constructed tree becomes. The tolerance on this metric is set to 5% in this study, as we assume that this is an acceptable loss of information. The tolerance bounds the relative probability distance between the constructed tree and the original stochastic process and consequently determines the number of scenarios preserved in the scenario tree.
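The bundling idea described above can be sketched with a simplified, single-stage forward selection. This is not the multistage construction of Appendix A: it is a minimal illustration that greedily keeps the scenarios minimizing a probability-weighted distance and stops once the remaining distance, relative to the best one-scenario approximation, falls below a tolerance. The function name `forward_selection`, the absolute-difference distance, the normalization used for the relative distance, and the supply values are all assumptions made for illustration.

```python
from typing import List, Tuple


def forward_selection(values: List[float], probs: List[float],
                      rel_tol: float = 0.05) -> Tuple[List[int], List[float]]:
    """Greedy forward selection of representative scenarios (one-dimensional).

    Returns the indices of the kept scenarios and their redistributed
    probabilities. The relative distance is measured against the distance
    achieved by the best single scenario (an assumed normalization).
    """
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]

    def lost(kept: List[int]) -> float:
        # Probability-weighted distance from every scenario to its nearest kept one.
        return sum(p * min(dist[k][j] for j in kept) for k, p in enumerate(probs))

    kept: List[int] = []
    base = None  # distance of the best one-scenario approximation
    while len(kept) < n:
        candidates = [u for u in range(n) if u not in kept]
        best = min(candidates, key=lambda u: lost(kept + [u]))
        kept.append(best)
        d = lost(kept)
        if base is None:
            base = d
        if base == 0 or d / base <= rel_tol:
            break

    # Move the probability of each dropped scenario to its nearest kept scenario.
    new_probs = {j: 0.0 for j in kept}
    for k, p in enumerate(probs):
        nearest = min(kept, key=lambda j: dist[k][j])
        new_probs[nearest] += p
    kept.sort()
    return kept, [new_probs[j] for j in kept]


# Hypothetical supply values in Ml/d with equal probabilities.
supply = [1900.0, 1910.0, 1915.0, 2000.0, 2005.0, 2100.0]
kept, kept_probs = forward_selection(supply, [1 / 6] * 6, rel_tol=0.05)
print(kept, kept_probs)
```

A looser tolerance keeps fewer scenarios and a tighter one keeps more, which mirrors the role the 5% setting plays in determining the size of the tree.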

Staged Mathematical Model
With a scenario tree constructed, we formulate a mathematical program to represent the staged decision process for obtaining an optimal decision for each node of the scenario tree. This provides adaptive optimal solutions which propose actions to be implemented at each decision-making time interval and for each estimate of the uncertain future. We introduce a binary decision variable dS representing the activation of an intervention at each node of the tree, where the decision at each stage only depends on the information available up to that point. The following formulation defines the staged mathematical program for sequential capacity investment decision making over time. Let N be the set of nodes on a scenario tree and $N_t$ be the set of nodes belonging to stage t. For a node n ∈ N we denote with n − 1 and n + 1, respectively, the predecessor and successor nodes on the scenario and with $p_n$ the probability that node n is realized. For a node n ∈ N and scenario s ∈ Ω, $\Omega_n$ is the set of nodes that belong to scenario s.
In the formulation, n is a node, t denotes time (stages), i is an intervention, $p_n$ is the probability that node n is realized, r is the discount rate, $eS_{n,i}$ denotes levels of existing supply from intervention i, $cC_i$ is the undiscounted capital cost of intervention i, $cF_i$ is the undiscounted fixed operational cost of intervention i, $cV_i$ is the undiscounted variable operational cost of intervention i, $D_t$ is demand in time t, $cS_{n,i}$ is the maximum capacity of intervention i in node n, each intervention i has a construction time period, $dS_{n,i}$ is the activation of intervention i for node n, $S_{n,i}$ is the supply from intervention i for node n, and $aS_{n,t,i}$ is the supply associated with intervention i at node n in time t.
The optimization model minimizes the expected cost of investments discounted back to the present. Constraint (2) makes sure that the supply balances the demand in each node of the tree. Constraints (3)-(5) allow an intervention to be utilized up to its capacity, taking into account the construction period that precedes its activation: constraint (3) sets an earliest year for the yield, constraint (4) ties the available supply to the construction period, and constraint (5) prevents yield from being used during the construction period. Constraint (6) forces an intervention, once activated, to remain active at later nodes of the tree. Activation of two mutually exclusive interventions is avoided by constraint (7), defined over the set of mutually exclusive interventions, $I_m$. Constraint (8) ensures that modular interventions can be further expanded only once the previous phase has been completed. $I_d$ denotes the set of dependent interventions and $I_p$ denotes the set of prerequisite interventions. The proposed problem structure follows a node-based formulation of the multistage stochastic program. Because of path dependency, intervention activation constraints are nonanticipative. For instance, although scenarios $s_i$ and $s_j$ end up in different terminal nodes, they can pass through the same node in time t. In that case, the intervention activation decision variables at time stage t in scenario $s_i$ equal those of scenario $s_j$. This means that the multistage stochastic program will determine an optimal decision for each node of the scenario tree, given the information up to time stage t. Given that there are multiple succeeding nodes, the optimal decisions will not exploit hindsight, but they should anticipate future events. The mathematical model above allows nonanticipativity to be incorporated implicitly through its scenario tree formulation.
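For orientation, the verbal description of the model can be written as a compact node-based program. The block below is only a sketch consistent with the stated notation, not the paper's numbered formulation. Several elements are assumptions introduced here: treating $dS_{n,i}$ as a cumulative on/off status (with $dS_{n-1,i} = 0$ at the root), the elapsed time $y(n)$ in years from the present to node n, the stage index $t(n)$, and the prerequisite map $\pi(i)$. Construction-time constraints are omitted.

```latex
\begin{align*}
\min \quad & \sum_{n \in N} \frac{p_n}{(1+r)^{y(n)}} \sum_{i}
  \Big[ cC_i \,\big(dS_{n,i} - dS_{n-1,i}\big) + cF_i \, dS_{n,i} + cV_i \, S_{n,i} \Big] \\
\text{s.t.} \quad
  & \sum_{i} \big( eS_{n,i} + S_{n,i} \big) \ge D_{t(n)}
      && \forall n \in N \\
  & 0 \le S_{n,i} \le cS_{n,i} \, dS_{n,i}
      && \forall n \in N,\ \forall i \\
  & dS_{n,i} \ge dS_{n-1,i}
      && \forall n \in N,\ \forall i \\
  & \sum_{i \in I_m} dS_{n,i} \le 1
      && \forall n \in N \\
  & dS_{n,i} \le dS_{n,\pi(i)}
      && \forall n \in N,\ \forall i \in I_d,\ \pi(i) \in I_p \\
  & dS_{n,i} \in \{0, 1\}
      && \forall n \in N,\ \forall i
\end{align*}
```

Because every variable is indexed by node rather than by scenario, two scenarios that pass through the same node necessarily share its decisions, which is how a node-based formulation carries nonanticipativity without explicit linking constraints.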
Constraint (10) makes sure that an intervention can be activated at most once in any scenario.

Figure 1b illustrates a simplified scenario tree for the purpose of demonstrating the ROA implementation. We consider a utility that wants to cost-effectively balance future supply and demand by investing in a new reservoir with three possible capacities (50, 100, or 150 Ml/d). The 50 Ml/d reservoir can be built with a fixed or modular capacity. As shown in Table 1, if the utility builds a 50 Ml/d fixed capacity reservoir at a cost of 1,000 £m, it cannot expand it later. Alternatively, if it pays a higher initial capex cost (1,100 £m) for a modular 50 Ml/d reservoir design, it is able to expand later to 100 Ml/d or further to 150 Ml/d by paying the relevant expansion cost (Table 1). The 100 £m premium is an upfront cost that the utility pays to reserve the right to expand in later stages if required. This premium allows the utility to delay investment for the sake of acquiring information. The mathematical formulation in section 2.2 finds the minimum discounted expected investment cost of capacity expansion over a four-stage planning horizon. The supply-demand gap is shown in each node of the tree. At $t_2$ in node B, a fixed reservoir of 50 Ml/d capacity is activated (50 Ml/d fx) since its capacity is sufficient to balance the supply-demand gap until the end of the planning horizon. At $t_2$ in node C, however, a 50 Ml/d modular capacity is the most cost-effective intervention, as it gives the ability to respond to uncertain supply-demand levels in the future. If $s_2$ happens, no further investment is needed until the end of the planning horizon, while under $s_3$ the planner must expand capacity by an extra 50 Ml/d at $t_4$ to balance the larger supply-demand gap. At $t_2$ in node D, the 50 Ml/d modular reservoir is again picked by the mathematical model, with capacity incrementally increased by an extra 50 Ml/d and 100 Ml/d under $s_4$ and $s_5$, respectively, until the end of the planning horizon. This example shows how the ROA implementation is used to assess, under different future scenarios, the suitability of paying a premium to postpone capacity expansion.
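To make the expected-cost logic of this illustration concrete, the sketch below evaluates the adaptive plan described above on a four-stage tree. It is a hedged illustration rather than a reproduction of Figure 1b or Table 1: the transition probabilities, the intermediate node names E, H, and I, the expansion costs, the timing of the expansions under $s_4$ and $s_5$, the 5-year stage length, and the use of a 4.5% discount rate here are all assumptions. Only the 1,000 £m and 1,100 £m initial costs come from the text.

```python
from typing import Dict, Optional, Tuple

# node -> (parent, stage, transition probability from parent).
# Probabilities are hypothetical; branches leaving each node sum to 1.
TREE: Dict[str, Tuple[Optional[str], int, float]] = {
    "A": (None, 1, 1.0),
    "B": ("A", 2, 0.3), "C": ("A", 2, 0.4), "D": ("A", 2, 0.3),
    "E": ("B", 3, 1.0), "F": ("C", 3, 0.5), "G": ("C", 3, 0.5),
    "H": ("D", 3, 0.6), "I": ("D", 3, 0.4),
    "s1": ("E", 4, 1.0), "s2": ("F", 4, 1.0), "s3": ("G", 4, 1.0),
    "s4": ("H", 4, 1.0), "s5": ("I", 4, 1.0),
}


def node_probability(node: str) -> float:
    """Probability of reaching a node: product of transition probabilities from the root."""
    parent, _, p = TREE[node]
    return p if parent is None else p * node_probability(parent)


def expected_discounted_cost(capex: Dict[str, float], rate: float = 0.045,
                             years_per_stage: int = 5) -> float:
    """Probability-weighted sum of node-level costs, discounted to stage 1."""
    total = 0.0
    for node, cost in capex.items():
        _, stage, _ = TREE[node]
        years = (stage - 1) * years_per_stage
        total += node_probability(node) * cost / (1 + rate) ** years
    return total


# Capex in £m charged at the node where the decision is taken.
# B: fixed 50 Ml/d; C and D: modular 50 Ml/d; leaf entries are hypothetical expansion costs.
ADAPTIVE_PLAN = {"B": 1000.0, "C": 1100.0, "D": 1100.0,
                 "s3": 500.0, "s4": 500.0, "s5": 900.0}

print(node_probability("s3"))
print(expected_discounted_cost(ADAPTIVE_PLAN))
```

Changing the hypothetical expansion costs or branch probabilities shows how the value of the modular premium depends on how likely the larger supply-demand gaps are.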

Application to Infrastructure Investment Planning
England offers an interesting context to apply adaptive and flexible multistage investment planning, because every 5 years, the economic regulator requires the water utilities to produce a plan demonstrating that the supply-demand balance is satisfied throughout their operating area over a long-term planning period. A plan is an optimal combination of new supply and demand management interventions, scheduled to meet estimated water supply zone demand plus an uncertainty allowance at least cost and is periodically updated. That is, company asset planners must select short-term (5 years) interventions for the next planning decision period and be able to demonstrate how they fit within a strategic long-term plan (25 years or more). Current water capacity expansion scheduling approaches used by water companies in England are based on deterministic annual supply-demand balance (Padula et al., 2013). However, present investment decisions need to account for significant uncertainty.
The 2009 climate change projections for the United Kingdom (UKCP09) are usually used to define the climate states in relevant studies of water asset planning in England. Borgomeo et al. (2014, 2016) use daily time series of precipitation and temperature derived from the UKCP09 projections coupled with a transient stochastic weather generator produced by Glenis et al. (2015). They use a rainfall-runoff model to generate daily flow time series to simulate the Thames water resource system. The output from each simulation is a record of the annual frequency of water shortages of different levels of severity (Borgomeo et al., 2016). The baseline supply uncertainty presented in this paper has several sources, including vulnerable surface and groundwater licenses, the impact of climate change on source yields, the gradual pollution of sources causing a reduction in abstraction, and the accuracy of supply-side data, which depends on the nature of the intervention (pumping, aquifer, etc.; Thames Water, 2014). Supply uncertainty is calculated using UKCP09 for the current annual supply-demand planning framework, termed Economics of Balancing Supply and Demand (EBSD; Padula et al., 2013), where annual central estimates of supply are compared to central estimates of demand (see Thames Water, 2014 for details). Multimodel ensembles of general circulation models (GCMs) can be used by water planners to derive probability distributions of climate change impacts (Dessai & Hulme, 2007; Fowler et al., 2007). The resulting scenarios define the domain of plausible outcomes under climate change.
We use deployable output which is the volume of water that can be supplied from a water company's sources (surface water, groundwater, etc.) or bulk supply, constrained by environment, licensing, hydrological or hydrogeological factors, water quality, and works capacity. In England, deployable output is estimated using prescribed methodologies as outlined in Water Resources Planning Tools (United Kingdom Water Industry Research, 2012), commonly through system simulation of long historical or plausible future hydrological time series.
We apply the proposed multistage modeling to the London urban water supply area, which is located in the Thames basin, southeast England. This basin has been classified as water stressed and is facing high population growth (Environment Agency, 2013), making it a suitable case study to investigate the use of the proposed flexible approach, as security of supply cannot be achieved without investment. Water supply is managed by Thames Water, a privately owned water utility serving 15 million customers across London and the Thames Valley. Financial costs include the net present value (NPV) of capital expenditures incurred when selecting an intervention and operational expenditures, using a discount rate of 4.5% (Thames Water, 2014).
In this case study, a scenario tree is constructed to approximate the continuous distribution of the underlying London water supply (the annual yield or deployable output) provided by London's water utility (Thames Water). We used the supply's cumulative distribution function (CDF) and evenly partitioned the CDF into 100 regions. Each region's highest percentile value is taken as the sample point. The probability of a scenario occurring is equal to the probability that supply falls within that region (the supply range of each scenario interval is defined by the upper and lower percentile values). For instance, the scenario interval for scenario 2 is defined by $(X_1, X_2)$ and its probability $P(S_2)$ is calculated by
$$P(S_2) = P(X_1 < X \le X_2) = F(X_2) - F(X_1),$$
where X denotes supply and F its CDF. Given the evenly partitioned CDF using the percentile values, the probability of occurrence of each scenario is 1%. This is shown in Figure 2. This set is used to efficiently construct the scenario tree, where the probability of each node and the threshold value for branching from one node to the other are calculated optimally. The constructed optimal scenario tree is used in the multistage stochastic programming model for ROA implementation.
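The percentile-based sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `percentile_scenarios` is hypothetical, the CDF is taken to be the empirical CDF of a sample, and the synthetic normally distributed supply values stand in for the utility's data, which are not reproduced here.

```python
import random
from typing import List, Tuple


def percentile_scenarios(samples: List[float],
                         n_regions: int = 100) -> List[Tuple[float, float]]:
    """Split the empirical CDF into equal-probability regions.

    Returns (sample_point, probability) pairs, where the sample point is the
    value at the upper percentile of each region.
    """
    ordered = sorted(samples)
    n = len(ordered)
    scenarios = []
    for k in range(1, n_regions + 1):
        # Index of the empirical quantile at level k / n_regions (integer ceiling).
        idx = (k * n + n_regions - 1) // n_regions - 1
        scenarios.append((ordered[idx], 1.0 / n_regions))
    return scenarios


rng = random.Random(0)
# Hypothetical deployable-output sample in Ml/d; not Thames Water data.
supply_samples = [rng.gauss(2000.0, 60.0) for _ in range(5000)]
scenarios = percentile_scenarios(supply_samples)
print(len(scenarios), scenarios[0], scenarios[-1])
```

With 100 regions each scenario carries a probability of 1%, matching the equal-probability partition described in the text; the resulting list is the kind of input a scenario tree construction step would then reduce.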
We do not consider uncertainty around demand growth rate and assume that the demand for water is expected to increase at a known rate. Figure 3 shows the supply uncertainty range for London as well as the deterministic demand values. The problem is structured so as to allow asset managers to review the plan at the distinct decision points (every 5 years) and respond by selecting additional interventions or expanding existing ones, taking advantage of the observed changes to the main uncertainty drivers (e.g., water supply, demand, and the capital and operational costs of interventions). We assume deployable outputs remain constant during the 5-year planning decision periods. Large water resource schemes can be built in phases. The flexibility to build resources in incremental stages allows for improved supply estimates before committing to larger schemes. Final plans are submitted in the year before the first planning decision period covered, and in practice, the proposed approach would allow planners to decide on their investment plans depending on the supply-demand gap a year ahead of the 5-year period end. Although the plans should demonstrate security of supply over the entire 50-year planning period, the main focus of asset managers is to decide which interventions should be implemented in the short term, that is, the optimal investment portfolios for planning decision period 2020-2024.
The scenario tree to approximate the stochastic London water supply-demand balance (due to supply uncertainty) is optimally produced as described in section 2.1 using the uncertainty over the deployable outputs. Each of the 100 unique paths denotes a plausible supply scenario (a set of deployable output values for each source). Each path starts from the unique root node in the first period and is linked to a supply scenario at each distinct time period (see Figure 4). The 50-year planning period was divided into 5-year time steps, forming 10 discrete time periods t. Asset managers can rebalance their infrastructure portfolios at the beginning of each planning decision period. Submission of final Water Resource Management Plans occurs 1 year before the plan is due to come into action, following a consultation period. At each time step, the scenario tree branches into nodes that belong to the next period.
As seen in the simplified scenario tree in Figure 1a, in t 2 node C has a decision which leads to nodes F and G in the next period representing different levels of supply-demand balance. The branching continues up to the nodes of the final period whose number corresponds to the number of supply scenarios. See Table 2 for the number of nodes used at each time step. We note that the scenario tree approximation method is independent from the staged mathematical model presented earlier and allows consideration of other sources of uncertainty through the use of joint probability distributions of random variables. This can be achieved if the uncertainty set is more than one dimensional, for instance, if it has both supply and demand distributions. The joint probability density function of supply-demand gap, which represents the stochastic component, is used to derive the scenario tree. Appendix A gives details of deriving the scenario tree when the uncertainty has more than one dimension.  We consider 47 alternative supply interventions in the appraisal process. Some interventions have been developed as long-term water resource interventions and are expected to be operated at high utilization given their capacity (e.g., intervention i28), while others are being considered by Thames Water as contingency interventions (e.g., intervention i21), expected to be operated at low utilization to avoid excessive operational costs. The type and capacity ranges of the interventions are given in Table 3 and are provided by Thames Water. Large interventions of 50 Ml/d or greater (such as effluent reuse schemes, desalination plants, and reservoirs) can also be built with a modular capacity that allows expanding later on. This ability for future expansion comes at a price. For each type of intervention, the premium for modular capacity is expressed as a percentage. 
The percentage value expresses how much larger the initial capital investment cost of the intervention with modular capacity is compared to the fixed (unexpandable) one.
Figure 4 shows the nine supply scenarios in planning decision period 2020-2024, at t = 2, magnified from the scenario tree. The solutions in 2020 are clustered into six sets of optimal interventions by identifying the common sets of interventions across the nine nodes. Decision paths are formed using supply-demand gap threshold values. Each threshold value designates which set of interventions is optimal for the given forecasted deficit and leads to a different amount of water capacity increase for the planning decision period 2020-2024. The added water supply capacity is optimal for each scenario if it occurs. The scenario tree within the ROA incorporates uncertainty about how the evolution of different futures may trigger the selection of different interventions and hence examines the implications of future uncertainty. In this long-term water resource planning problem, sequential decisions are made at multiple stages over time. Early stage decisions are based on long-term supply-demand forecasts whose accuracy decreases over time. The multistage optimization model formulation allows earlier stage decisions to be adjusted in later stages. In this way the model compensates for the impact of earlier decisions made under supply-demand forecast inaccuracy.
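The tree structure described in this section (a unique root, branching at each time step, one root-to-leaf path per supply scenario) can be illustrated with a minimal Python sketch. The node labels loosely follow the simplified tree mentioned in the text; the `Node` class, the helper functions, and all probabilities are assumptions introduced for illustration and are not the paper's data or implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    period: int
    prob: float  # unconditional probability of reaching this node
    children: list = field(default_factory=list)

def iter_nodes(node):
    """Yield every node in the tree, root first."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)

def iter_paths(node, prefix=()):
    """Yield (path, probability) for every root-to-leaf path, i.e. every scenario."""
    prefix = prefix + (node.name,)
    if not node.children:
        yield prefix, node.prob
    for child in node.children:
        yield from iter_paths(child, prefix)

# Hypothetical three-period tree; labels and probabilities are illustrative only.
root = Node("A", 1, 1.0, [
    Node("B", 2, 0.5, [Node("D", 3, 0.2), Node("E", 3, 0.3)]),
    Node("C", 2, 0.5, [Node("F", 3, 0.25), Node("G", 3, 0.25)]),
])

# The probability mass of the nodes in each period should sum to one.
mass_by_period = {}
for n in iter_nodes(root):
    mass_by_period[n.period] = mass_by_period.get(n.period, 0.0) + n.prob

scenarios = list(iter_paths(root))
for path, prob in scenarios:
    print(" -> ".join(path), prob)
```

In this sketch a decision taken at node C is shared by the scenarios that pass through F and G, which is how a tree encodes the fact that those futures cannot yet be told apart at t = 2.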

Solving the Water Resource Planning Problem at Multiple Stages Over Time
In the London case study, the scenario tree is made based on the state of the world as known in 2015; from that vantage point the future is described via six supply scenarios for 2020. In our case study, if the planner in 2015 considers that the supply-demand balance in 2020 is most likely to be between 10.5 Ml/d and −32.5 Ml/d, then set 4 is the best intervention response. This short-term set of investment interventions is optimally obtained using a scenario tree that considers the longer-term future, and hence, the interventions in this set delineate the best response to uncertainty. The proposed approach is significant because least-cost scheduling of water supply infrastructure is required of English water utilities, and there is widespread support at the policy level for improving it to consider flexibility and adaptability.
Table 4 shows that 40% of the 100 supply scenarios were directed to the top two paths (sets 5 and 6), where no extra capacity is needed in planning decision period 2020-2024. However, in set 5, an intervention is planned to be delivered in planning decision period 2025-2029 to meet the future demand for water beyond the 5-year period. The remaining 60% is directed into paths where London water capacity is increased by selecting alternative interventions. When the supply deficit is greater than 10.5 Ml/d, intervention i28 is always selected, with increasing utilization (the amount of water supplied from an intervention) as levels of existing supply decrease. Figure 5 shows the utilization of intervention i28 of 150 Ml/d capacity and intervention i4 of 5 Ml/d capacity, indicating that small schemes are selected to postpone the activation of large schemes in case water supply in 2020 is no greater than 2,036.4 Ml/d. In set 4, intervention i28 is replaced by i21 in planning decision period 2020-2024 as an alternative intervention with 150 Ml/d capacity (see Table 4).
The two interventions have equal capacity but contrasting intended usage in terms of the amount of water produced. Intervention i21 has a relatively lower cost to build and a higher cost to operate and is considered to be a provisional contingency scheme. Contingency schemes are not expected to have a high capacity utilization: because their operational cost is higher than the average cost of taking water from alternative sources, part of their capacity remains unused. Due to their higher operational cost, these schemes can be substituted if less expensive interventions become available in the future.
Conversely, intervention i28 is an irreversible long-term intervention (once built, it is used for the rest of the modeled time horizon) with an expected high utilization rate given its relatively higher construction costs but lower operational costs. This indicates that the selection of schemes is decided on the basis of the estimated required water utilization under different future uncertainty. In doing so, overspending on capital is avoided. When, at higher utilization, the lower operational costs outweigh the savings in capital expenditure, the long-term intervention i28 is selected. The decision on long-term intervention i28 is, however, delayed on the three paths that begin with sets 4, 5, and 6 of investment interventions in 2020. Instead, the modeling suggests replacing it with activation of the contingency intervention i21 in sets 4 and 5 and with no intervention activation in set 6. Interventions i1, i2, and i3 are only selected in the path that begins with set 1 in 2020, as these contingency schemes are only required when the supply-demand balance is expected to be less than −179.7 Ml/d.
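The trade-off between a contingency scheme (cheap to build, expensive to run) and a long-term scheme (expensive to build, cheap to run) can be illustrated with a break-even calculation. The Python sketch below is a deliberate simplification: it compares annualized costs without discounting or staging, and every cost figure and function name in it is a hypothetical assumption rather than data from the paper or from Thames Water.

```python
DAYS_PER_YEAR = 365

def annual_cost(scheme, utilization_ml_per_day):
    """Annualized capital cost plus variable operating cost at a given utilization."""
    return (scheme["capex_per_year"]
            + scheme["opex_per_ml"] * utilization_ml_per_day * DAYS_PER_YEAR)

def break_even_utilization(contingency, long_term):
    """Utilization (Ml/d) above which the long-term scheme becomes cheaper."""
    capex_gap = long_term["capex_per_year"] - contingency["capex_per_year"]
    opex_gap = contingency["opex_per_ml"] - long_term["opex_per_ml"]
    return capex_gap / (opex_gap * DAYS_PER_YEAR)

# Hypothetical cost figures in GBP; illustrative only.
contingency = {"capex_per_year": 2.0e6, "opex_per_ml": 400.0}
long_term = {"capex_per_year": 10.0e6, "opex_per_ml": 100.0}

u_star = break_even_utilization(contingency, long_term)
print(f"break-even utilization: {u_star:.1f} Ml/d")
```

Below the break-even utilization the contingency scheme is the cheaper way to hold the same capacity; above it the operating savings of the long-term scheme repay its higher capital cost, which mirrors the selection logic described for i21 and i28.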
A key strength of ROA is the opportunity it provides for exploiting learning over time. For example, Figure 6 shows that if the estimated supply-demand gap is greater than 16.0 Ml/d, there is no need to make an investment in the current planning decision period. This flexibility is valuable because by not selecting an intervention now and deferring it to the next planning period, asset managers avoid the costs of building an intervention until it is needed later.
The results, shown as a colored bar chart in Figure 7, depict the frequency of activation of interventions in nodes at each time step on a scale from 0% (white) to 100% (black). A high percentage of activation denotes that the selection of this intervention is robust across a number of supply-demand scenarios. For instance, as shown in Table 4 in the S1 set of interventions, i1, i2, and i3 are all contingency interventions of small capacities, which get activated at t = 2 in the most extreme scenarios that correspond to 2% of all scenarios at t = 2. As shown in Figure 4, these extreme scenarios, where S1 is selected at t = 2, pass through one node. Since interventions i1, i2, and i3 are selected only in S1, they have an activation frequency of 11% (one out of nine nodes) in Figure 7 at t = 2. By the end of the planning period, unlike interventions i2 and i3, i1 has an increased activation frequency. This implies that contingency interventions i2 and i3 are only selected in extreme scenarios, while activation of i1 is more robust across a number of supply-demand scenarios, that is, intervention i1 will also be activated in less extreme scenarios.
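The activation frequency used in Figure 7 is a simple node count, which a short Python sketch makes explicit. Only the first entry below reflects the text (i1, i2, and i3 selected in one of the nine nodes at t = 2); the remaining node contents, the node names, and the function name are placeholders assumed for illustration, not values from Table 4.

```python
def activation_frequency(node_selections, intervention):
    """Fraction of nodes at one time step in which `intervention` is activated."""
    active = sum(1 for chosen in node_selections.values() if intervention in chosen)
    return active / len(node_selections)

# Nine nodes at t = 2. Entries n2..n9 are placeholders, not the paper's results.
nodes_t2 = {
    "n1": {"i1", "i2", "i3"},
    "n2": {"other"},
    "n3": {"other"},
    "n4": {"other"},
    "n5": {"other"},
    "n6": {"other"},
    "n7": set(),
    "n8": set(),
    "n9": set(),
}

freq_i1 = activation_frequency(nodes_t2, "i1")
print(f"i1 activation frequency at t = 2: {freq_i1:.0%}")
```

Note that this frequency counts nodes, not scenarios, which is why a node carrying 2% of the scenarios still contributes one ninth of the frequency.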

Metrics for Flexibility and Adaptability Assessment
We introduce two metrics used in stochastic programming problems (Birge & Louveaux, 1997; Escudero et al., 2007), namely, the value of the stochastic solution (VSS) and the expected value of perfect information (EVPI), to measure the adaptability and flexibility of the decisions suggested by the ROA formulation. VSS is calculated by replacing the uncertain variables with their expected values and measuring how the solution of this expected value problem performs under future uncertainty.
EVPI is estimated by comparing the solution of the ROA-based approach with the optimal solution for the wait-and-see problem with perfect information. Appendix B gives mathematical detail on the calculations of VSS and EVPI.
In the context of this paper, VSS indicates the difference made by implementing ROA via a multistage stochastic program, which explicitly allows adaptation to different future conditions through a distribution of uncertain future supply, instead of using the average supply values in each stage. VSS quantifies the cost of not recognizing the uncertainty and hence ignoring the adaptability advantage ROA provides. For the London case study, VSS is £113,206,815 discounted over the 50-year planning period. VSS thus estimates the value of adaptability: it is the cost to Thames Water of ignoring uncertainty, which plans that adapt to changing future conditions can avoid. For the London case study, the VSS result is significant when compared to the total investment NPV cost of £737,648,067; VSS corresponds to 15.4% of the total NPV cost. This relatively high VSS value indicates that supply uncertainty is an important factor in London's supply-demand problem, where adaptive solutions to a changing future can mitigate its consequences.
EVPI measures the value of information in planning under uncertainty. It estimates how important the evolution of information over time is and therefore indicates the value of a wait-and-see decision, that is, how valuable it is to know the future before making a decision. In the context of ROA implementation, EVPI values the flexibility of delaying irreversible investment commitments and taking early provisional actions until new information is available. For the London case study, EVPI is £44,092,250 discounted over the 50-year planning period; the value of waiting to gain more information therefore corresponds to 6% of the total NPV cost. Even this small percentage reflects a significant value for the implementation of large irreversible long-term interventions given their large socioeconomic and environmental impacts.

Sensitivity to Scenario Tree
It is relevant to explore the sensitivity of results to the use of different scenario trees as well as the characteristics of the uncertainty set used to create the trees. We have performed two types of sensitivity analysis. First we investigated the consequences of generating and using alternative scenario trees in the analysis. The London case study was run using 30 different and randomly generated scenario trees from the stochastic London supply distribution making sure that each tree has the same uncertainty source data but has a different structure, that is, different number of nodes at each time step as well as different branching structure. Then, we performed a second type of sensitivity analysis, to investigate the consequences of using random subsets of the full set of scenarios. Each tree was generated using a different subset of supply scenarios randomly sampled from the full set of 100 scenarios.
The results of both types of sensitivity analysis, shown as a bar chart in Figures 8 and 9, respectively, depict the activation frequency of the interventions in planning decision period 2020-2024. It can be appreciated from both types of sensitivity analysis that most interventions suggested by the multistage optimization planning have a high frequency of selection (more than 75%) indicating the quality of the interventions' activation recommendation regardless of whether a different scenario tree (first type of sensitivity analysis) or different subsets of the full set of scenarios were used (second type of sensitivity analysis).
Other types of sensitivity analysis could include understanding the impact of using different relative tolerances by varying the relative distance between the constructed tree and the original stochastic process.

Limitations of the Approach
The proposed approach is an extension of least-cost supply-demand planning (Padula et al., 2013) aiming to optimize for flexibility and adaptability in addition to cost when investing in infrastructure under supply uncertainty. Planning resources via the yield or deployable output concept implies simplifying the problem by comparing a single value of annual regional supply with an annual demand. Although the use of regional annual supply and demand balancing is conceptually simple, these aggregate quantities are difficult to validate (J. Hall, Watts, et al., 2012). Unlike simulation-based optimization approaches that have become routine for analyzing water policies, the proposed optimization model does not rely on simulating alternative observable outcomes, such as the frequency with which customers are predicted to experience water shortages.
Figure 9. Activation frequency of interventions in planning decision period 2020-2024 using 30 subsets of the full set of scenarios.
The analysis uses supply uncertainty data from the UKCP09 weather generator that addresses GCM uncertainties. Although the GCM-based climate projections are obtained from the most credible climate change information available, concerns in the assignment and use of probabilities to these future climate change scenarios have been raised (Maier et al., 2016). These climate models use numerous assumptions about how the future will unfold (Taner et al., 2017) which impact results. For instance, climate projections are contingent on greenhouse gas (GHG) emissions scenarios and future reductions in atmospheric aerosols (Stouffer et al., 2017) which are unknown. Such assumptions impact the probability distributions in climate model outputs which in turn will impact the supply probabilities and the findings of the analysis in our proposed approach (Dessai & Sluijs, 2007).
Another limitation of least-cost supply-demand planning is that plans are optimized using a single least-cost objective, requiring all aspects of system performance to be monetized, leading to potentially imbalanced decisions (Matrosov et al., 2015). Using a single objective might prevent the finding of good solutions.

Conclusion
This paper described how a least-cost scheduling approach for water infrastructure investment planning, used currently at national scale in England, can be extended to explicitly enable flexibility and adaptability given future supply uncertainty. The RO concept using scenario trees over a predefined planning horizon with distinct decision points has been applied to allow rebalancing of the supply-demand system at intermediate stages.
A compact scenario tree is generated to approximate the stochastic supply representing an ensemble of plausible futures. At each time step of the planning horizon, an optimal set of interventions is identified in each node of the scenario tree according to plausible source yield scenarios. Supply-demand gap threshold values are used to determine which path to follow in order to minimize the NPV cost of investments. The staged decision process provides the planner with adaptive solutions whose implementation can be delayed and replaced as information on future supply-demand balance is gradually revealed.
The proposed flexible and adaptive approach is applied to London's water supply planning problem. In the appraisal process, 47 interventions of different capacities (ranging from 1.5 Ml/d to 150 Ml/d) and alternative types (e.g., wastewater reuse, desalination, and reservoirs) are considered. The optimization over the 50-year planning period, using 100 equally probable supply scenarios, identified six optimal sets of investment interventions for the planning decision period 2020-2024. Depending on the forecasted short-term supply-demand balance, the planned capacity expansion ranges from 0 Ml/d (no intervention) to 330 Ml/d (as a result of activation of seven interventions). The results show that the large forecasted gap between supply and demand in London is being bridged through long-term (maintained after selection) interventions and through contingency schemes when the gap is smaller.
The results demonstrate the benefits of ROA in enabling adaptive and flexible decision making in water resource planning. These are quantified using the VSS and EVPI metrics, showing that, respectively, ignoring adaptive planning costs 15.4% of the total NPV and flexible decision making has a value of 6% of the total NPV of London's water supply system. Sensitivity of results to the use of different scenario trees, as well as to the characteristics of the uncertainty set used to create the trees, is assessed. The sensitivity results point toward high-quality intervention activation selections by the proposed model.

Appendix A: Scenario Tree Construction Algorithm
The scenario tree construction uses the original supply scenarios to build a tree with probabilistic weights assigned to each node used in the optimization model. The tree construction is an optimization method based on the Kantorovich transport functional (developed by Gröwe-Kuska et al., 2003) as follows, where $\xi$ and $\tilde{\xi}$ are n-dimensional stochastic processes; $\xi^i$ and $\tilde{\xi}^j$ are scenarios (sample paths of $\xi$ and $\tilde{\xi}$); $p_i$ and $q_j$ are scenario probabilities (the probability distributions of the processes $\xi$ and $\tilde{\xi}$, respectively); $S$ is the number of scenarios in the initial scenario set; $J$ is the index set of deleted scenarios; $cJ$ is the cardinality of the index set $J$, i.e., the number of deleted scenarios; $s = S - cJ$ is the number of preserved scenarios; $\varepsilon$ is the tolerance for the relative probability distance; and $c_t(\xi^i, \tilde{\xi}^j)$ is the distance between scenarios $\xi^i$ and $\tilde{\xi}^j$.

10.1029/2017WR021803
Let P be the set of original scenarios. The scenario set Q, based on the scenarios having minimal Kantorovich distance $D_K$ to P, is computed in equation (A1),
$$D_K(P, Q) = \sum_{i \in J} p_i \min_{j \notin J} c_T(\xi^i, \xi^j). \tag{A1}$$
The probability $q_j$ of the preserved scenarios is given by the rule
$$q_j = p_j + \sum_{i \in J(j)} p_i,$$
where $J(j) := \{i \in J : j = j(i)\}$, $j(i) \in \arg\min_{j \notin J} c_T(\xi^i, \xi^j)$, $\forall i \in J$.
That is, the Kantorovich transport functional ensures that the scenario sample is the best possible approximation of the stochastic process. By bundling similar scenarios and reducing the number of nodes, this produces a smaller, computationally tractable multistage scenario tree that is the solution of the following optimization problem,
$$\min \left\{ \sum_{i \in J} p_i \min_{j \notin J} c_T(\xi^i, \xi^j) : J \subset \{1, \dots, S\},\ cJ = S - s \right\},$$
where $s = S - cJ$ is the number of preserved scenarios. The maximal reduction strategy is used to determine a reduced probability distribution Q of $\tilde{\xi}$ such that the maximum number of scenarios is deleted subject to
$$\sum_{i \in J} p_i \min_{j \notin J} c_T(\xi^i, \xi^j) \le \varepsilon.$$
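The reduction idea in this appendix (delete scenarios while the Kantorovich distance stays within a tolerance, then move the probability of each deleted scenario to its nearest preserved scenario) can be sketched in Python. This is a simplified greedy backward reduction on whole scenario paths, offered as an illustration under stated assumptions rather than as the authors' implementation: it does not build the stagewise tree, it uses an absolute rather than relative tolerance, it assumes a Euclidean path distance, and the function names and sample data are hypothetical.

```python
import math

def path_distance(a, b):
    """Euclidean distance between two scenario paths (an assumed choice of c_T)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kantorovich_distance(paths, probs, deleted):
    """Sum over deleted i of p_i times the distance to the nearest preserved path."""
    kept = [j for j in range(len(paths)) if j not in deleted]
    return sum(
        probs[i] * min(path_distance(paths[i], paths[j]) for j in kept)
        for i in deleted
    )

def reduce_scenarios(paths, probs, eps):
    """Greedily delete scenarios while the Kantorovich distance stays <= eps."""
    deleted = set()
    while len(deleted) < len(paths) - 1:
        candidates = [l for l in range(len(paths)) if l not in deleted]
        best = min(
            candidates,
            key=lambda l: kantorovich_distance(paths, probs, deleted | {l}),
        )
        if kantorovich_distance(paths, probs, deleted | {best}) > eps:
            break
        deleted.add(best)
    kept = [j for j in range(len(paths)) if j not in deleted]
    # Redistribution rule: each deleted probability goes to the nearest kept path.
    new_probs = {j: probs[j] for j in kept}
    for i in deleted:
        nearest = min(kept, key=lambda j: path_distance(paths[i], paths[j]))
        new_probs[nearest] += probs[i]
    return kept, new_probs

# Hypothetical supply paths (Ml/d over three periods), equally probable.
paths = [(2000, 1950, 1900), (2000, 1952, 1905),
         (2000, 1800, 1700), (2000, 1805, 1690)]
probs = [0.25, 0.25, 0.25, 0.25]

kept, new_probs = reduce_scenarios(paths, probs, eps=5.0)
print(kept, new_probs)
```

With these sample paths the two near-identical pairs are each bundled into one representative, so half of the scenarios are removed while total probability is conserved.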

Appendix B: Computational Insight on the Metrics Used to Evaluate the Implementation of ROA
The calculations of the two metrics, namely, the EVPI and the VSS, in multistage problems are explained below. These metrics were developed for the case of two-stage problems (Birge & Louveaux, 1997) and have been extended to multistage problems (Escudero et al., 2007).
For the minimization model the following inequalities are satisfied:
$$\mathrm{WS} \le \mathrm{AP} \le \mathrm{EV}, \tag{B1}$$
where WS denotes the expected value of the objective function obtained when each scenario is solved separately with perfect information; WS is known in the literature as the wait-and-see resolution value. AP denotes the optimal solution value to the adaptive multistage stochastic problem presented in this paper. EV denotes the expected result of the expected value problem (in which all random variables are replaced by their expected values) and measures how the optimal solution of the expected value problem performs, allowing the other stages' decisions to be chosen optimally as functions of different scenarios.
From equation (B1), EVPI and VSS are calculated as follows,
$$\mathrm{EVPI} = \mathrm{AP} - \mathrm{WS}, \tag{B2}$$
$$\mathrm{VSS} = \mathrm{EV} - \mathrm{AP}. \tag{B3}$$
To calculate the EVPI, nonanticipativity constraints are relaxed at each time step so that decisions are made with perfect information about the future. From equation (B2), the difference AP − WS gives the value of perfect information. From equation (B3), the difference EV − AP, known as the VSS, indicates the benefit of finding different solutions for each scenario by solving the stochastic program rather than assuming no uncertainty.
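The three quantities WS, AP, and EV can be made concrete with a small two-stage example solved by enumeration. The Python sketch below is not the paper's multistage model: the cost function, capacity options, scenario deficits, and probabilities are all hypothetical values chosen so that both metrics are nonzero.

```python
# Hypothetical two-stage toy problem; all numbers are illustrative.
BUILD_COST = 1.0       # cost per Ml/d of capacity built now
SHORTFALL_COST = 5.0   # cost per Ml/d of deficit covered later by emergency supply
CAPACITY_OPTIONS = [0, 50, 100, 150]
SCENARIOS = [(0.4, 0), (0.4, 60), (0.2, 150)]  # (probability, deficit in Ml/d)

def cost(capacity, deficit):
    """First-stage build cost plus second-stage cost of any uncovered deficit."""
    return BUILD_COST * capacity + SHORTFALL_COST * max(deficit - capacity, 0)

def expected_cost(capacity):
    return sum(p * cost(capacity, d) for p, d in SCENARIOS)

# AP: one here-and-now decision hedged against all scenarios.
AP = min(expected_cost(x) for x in CAPACITY_OPTIONS)

# WS: the best decision is chosen separately for each scenario (perfect information).
WS = sum(p * min(cost(x, d) for x in CAPACITY_OPTIONS) for p, d in SCENARIOS)

# EV: decide using the mean deficit, then evaluate that decision under uncertainty.
mean_deficit = sum(p * d for p, d in SCENARIOS)
x_mean = min(CAPACITY_OPTIONS, key=lambda x: cost(x, mean_deficit))
EV = expected_cost(x_mean)

EVPI = AP - WS
VSS = EV - AP
print(f"WS={WS:.1f}  AP={AP:.1f}  EV={EV:.1f}  EVPI={EVPI:.1f}  VSS={VSS:.1f}")
```

Planning for the mean deficit selects too little capacity in this example, so its expected cost exceeds that of the hedged decision; the gap is the VSS, while the gap between the hedged decision and the perfect-information benchmark is the EVPI.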
In the work of Escudero et al. (2007) those parameters are generalized to the multistage case, as explained below. Let the expected result in t of using the expected value solution, denoted by $\mathrm{EV}_t$ for $t = 2, \dots, T$, be the optimal value of the AP model, where the decision variables until stage $t - 1$, $(x_1, \dots, x_{t-1})$, are fixed at the optimal values obtained in the solution of the average scenario model.
The value of the stochastic solution is defined in t, denoted by $\mathrm{VSS}_t$, as
$$\mathrm{VSS}_t = \mathrm{EV}_t - \mathrm{AP}, \quad t = 2, \dots, T.$$
This sequence of nonnegative values represents the cost of ignoring uncertainty and not providing adaptive solutions to future conditions until stage t in the decision making of multistage models. VSS and EVPI in multistage problems are then calculated from these stagewise quantities.

Notation
n: Nodes
t: Time (stage)
i: Intervention
p_n: Probability that node n is realized
r: Discount rate
eS_{n,i}: Levels of existing supply from intervention i for node n
cC_i: Undiscounted capital cost of intervention i
fC_i: Undiscounted fixed operational cost of intervention i
vC_i: Undiscounted variable operational cost of intervention i
D_t: Demand in time t
cS_{n,i}: Maximum capacity of intervention i in node n
[…]_i: Construction time period for intervention i
dS_{n,i}: Activation of intervention i for node n
S_{n,i}: Supply from intervention i for node n
aS_{n,t,i}: Associated supply of intervention i to supply at node n in time t