A Linear Dynamical Systems Approach to Streamflow Reconstruction Reveals History of Regime Shifts in Northern Thailand
Abstract
Catchment dynamics is not often modeled in streamflow reconstruction studies; yet, the streamflow generation process depends on both catchment state and climatic inputs. To explicitly account for this interaction, we contribute a linear dynamic model, in which streamflow is a function of both catchment state (i.e., wet/dry) and paleoclimatic proxies. The model is learned using a novel variant of the Expectation‐Maximization algorithm, and it is used with a paleo drought record—the Monsoon Asia Drought Atlas—to reconstruct 406 years of streamflow for the Ping River (northern Thailand). Results for the instrumental period show that the dynamic model has higher accuracy than conventional linear regression; all performance scores improve by 45–497%. Furthermore, the reconstructed trajectory of the state variable provides valuable insights about the catchment history—e.g., regime‐like behavior—thereby complementing the information contained in the reconstructed streamflow time series. The proposed technique can replace linear regression, since it only requires information on streamflow and climatic proxies (e.g., tree‐rings, drought indices); furthermore, it is capable of readily generating stochastic streamflow replicates. With a marginal increase in computational requirements, the dynamic model brings more desirable features and value to streamflow reconstructions.
Plain Language Summary
Streamflow reconstruction involves estimating streamflow volumes in the distant past using statistical techniques and long‐range climate proxies. Conventionally, this is done using linear regression techniques, which model streamflow as a linear function of climatic inputs. These models do not consider catchment dynamics, although in reality, streamflow volume depends on both climatic inputs and the state of the catchment (whether it is wet or dry). Here, we contribute a novel reconstruction technique, based on linear dynamical systems, that explicitly models the catchment state and its interaction with climatic inputs. We apply it to a case study of the Ping River in northern Thailand. Our model improves reconstruction skill significantly over linear regression. Furthermore, the model's state variable reveals a history in which the catchment shifted between wet and dry regimes. The model can also be used readily to generate streamflow scenarios, which are useful for reservoir operation studies. With only a marginal increase in computation but many desirable features, our model can replace linear regression. Our work strengthens the value and widens the applications of streamflow reconstruction.
1 Introduction
Since the seminal works of Stockton (1975) and Stockton and Jacoby (1976), streamflow reconstruction has brought forth insights that were unattainable with short instrumental records. Most notably, streamflow reconstructions have revealed extreme events (droughts and pluvials) in the distant past, and put recent extreme events into perspective (Meko & Woodhouse, 2011). In some cases, the paleo period was found to have more extreme droughts (e.g., Güner et al., 2017), and both more extreme droughts and pluvials (e.g., DeRose et al., 2015; Schook et al., 2016), than the instrumental period. In other cases, the opposite was observed (Littell et al., 2016; Woodhouse et al., 2006). Although varying in details, these studies—and many others (e.g., Bekker et al., 2014; Maxwell et al., 2011; Razavi et al., 2015)—came to the consensus that reconstructed streamflow data provide more understanding about streamflow variability than do instrumental data alone. Such added understanding is being transformed into water management practice in the U.S. (Meko & Woodhouse, 2011) and Canada (Sauchyn et al., 2015). Similar progress may be expected in other countries.
Conventional reconstructions rely on linear regression, which models streamflow $y_t$ as a linear function of the climatic proxies $u_t$:

$$y_t = \alpha + \beta u_t + \varepsilon_t \tag{1}$$

where $\alpha$ is the constant intercept, $\beta$ contains the regression coefficients, and $\varepsilon_t$ is the noise.

The most natural way to incorporate catchment dynamics into streamflow reconstruction is to adopt a mechanistic modeling approach. This idea was explored by several studies. Saito et al. (2015) used the Thornthwaite water balance model and reconstructed seasonal temperature and precipitation records to reconstruct streamflow. Gangopadhyay et al. (2015) introduced a hybrid paleo‐water balance approach consisting of two main steps: first, precipitation and temperature data are resampled to create their nonparametric reconstructions (Lall & Sharma, 1996); then, the reconstructions are fed into a water balance model to reconstruct streamflow. Tozer et al. (2018) reconstructed streamflow using a Budyko model with reconstructed data and simulated potential evaporation (PET); the lack of reconstructed PET here was compensated by using its annual average. Naturally, the main limitation of a mechanistic approach lies in its reliance on a large amount of hydrological data, either instrumental, simulated, or reconstructed. Such data may not always be available with the required spatial and temporal resolution.
Recently, Bracken et al. (2016) developed a statistical modeling approach based on a hidden Markov model for streamflow reconstruction. The hidden state is derived from climate proxies and interpreted as the “state of the climate”; streamflow is then reconstructed from the climate state via a log‐linear function. In this hierarchical model, streamflow generation depends on climate dynamics rather than catchment dynamics.
The main motivation for this work is to develop a streamflow reconstruction technique that accounts explicitly for the catchment dynamics without requiring a substantial amount of data. We address this challenge by appealing to linear dynamical systems—a class of models that has been used widely in hydrology, as it provides a good approximation for many natural phenomena, including the rainfall‐runoff process (e.g., Cooper & Wood, 1982a; Ramos et al., 1995). Specifically, we model the relationship between climatic proxies and streamflow using the state‐space representation of a discrete, linear dynamical system, which allows us to account for the dynamics of the catchment state as well as the effect of both climate proxies and catchment state on the streamflow generation process. Traditionally, linear dynamical systems are learned using the Expectation‐Maximization (EM) algorithm (see Cheng & Sabes, 2006, and references therein). However, EM cannot be used directly for streamflow reconstructions because the length of the climate proxies differs from that of the streamflow time series. To overcome this, we propose a novel variant for the EM algorithm.
The technique is tested in the Ping River Basin (northern Thailand), where we reconstruct 406 years of annual streamflow based on the time series of the Palmer's Drought Severity Index—retrieved from the Monsoon Asia Drought Atlas (Cook et al., 2010a). We show that the proposed model yields two important advantages. First, the reconstructed streamflow time series is complemented by a corresponding time series of a catchment state variable that provides information on the catchment's dynamics (e.g., regime‐like behavior), thereby assisting with the analysis of historical droughts and pluvials. Second, we show that the linear dynamic model has higher accuracy than a conventional principal component linear regression (on the instrumental period), especially during droughts and pluvials. We also show that the model can readily generate stochastic streamflow replicates. These advantages are obtained with a marginal increase in computational requirements compared to linear regression.
2 Study Site and Data
2.1 The Ping River Basin
The Ping River drains a catchment of 33,900 km² (Komori et al., 2012) located in northern Thailand. Along with the Nan River, the Ping is one of the main tributaries of the Chao Phraya, whose basin covers 30% of the country's surface (Figure 1a). The water flowing in the Chao Phraya Basin serves multiple users (agricultural, industrial, and domestic supply, hydropower generation, navigation, and prevention of seawater intrusion in the Gulf of Thailand), supporting a population of approximately 25 million people, including 8 million in Bangkok (Divakar et al., 2011; Takeda et al., 2016). A key component of the water system is the Bhumibol Reservoir, located on the Ping River. The reservoir has a large active storage capacity (about 9,700 Mm³) that helps control floods and meet the demand of the different water users.

Figure 1. (a) Map of the Chao Phraya River Basin and main tributaries, including the Ping River. The stream gauge station P1 is indicated with a red dot. (b) Box plots showing the distribution of the monthly streamflow measured at P1 for the period April 1921 to March 2005.
In this study, we aim to reconstruct annual streamflow (on a water year basis, April–March) at the P1 stream gauge station, located in Chiang Mai, upstream of Bhumibol Reservoir (Figure 1a). In this area, monthly streamflow exhibits a strong seasonal pattern, with higher flow observed during the South‐West Monsoon season (early May to October–November) (Figure 1b). Peak flows and, therefore, floods are generally observed during the second part of the Monsoon season, when heavy rainfall events occur over a catchment that is already wet. The flood generation mechanism can vary from year to year, as it depends on the interactions between Monsoon rainfall, global circulation, and tropical storms (Lim & Boochabun, 2012). For instance, the 2006 flood appears to have been caused solely by Monsoon rainfall, whose intensity is amplified in La Niña years (Kripalani & Kulkarni, 1997), while larger events, such as those that occurred in 1973 and 2005, were caused by the combination of Monsoon rainfall and tropical storm activity (Lim & Boochabun, 2012). Naturally, such a complex streamflow generation process makes the reconstruction exercise difficult, especially when using data derived from moisture‐limited trees, because saturated overland flow is not reflected in the tree‐ring widths.
Monthly streamflow data at P1 station were retrieved from the Thai Royal Irrigation Department's database (http://www.hydro-1.net). To match the last year of our paleo data source (described in the next section), we used 85 water years—from April 1921 to March 2005. We found that aggregating streamflow on a water year basis and on a calendar year basis provided similar correlations with the paleo proxy (supporting information Text S1 and Table S2). Since water management in Thailand is carried out on a water year basis, we believe that a reconstruction based on a water year is more useful. We filled in the single missing data point (June 1933) by averaging the monthly streamflow of the previous and subsequent month. The summary statistics of streamflow at P1 station for the instrumental period (1921–2005) are provided in supporting information (Table S1).
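The two preprocessing steps described above (filling one isolated missing month with the mean of its neighbours, then aggregating on an April–March water year) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the series is assumed to start in April, and aggregation by summation is an assumption.

```python
def fill_single_gap(monthly):
    """Fill isolated missing values (None) with the mean of the two neighbours.

    Assumes every gap is a single interior month with observed neighbours.
    """
    filled = list(monthly)
    for i in range(1, len(filled) - 1):
        if filled[i] is None:
            filled[i] = (monthly[i - 1] + monthly[i + 1]) / 2.0
    return filled


def water_year_totals(monthly_from_april):
    """Sum consecutive 12-month blocks of a series whose first value is April."""
    n_years = len(monthly_from_april) // 12
    return [sum(monthly_from_april[12 * k:12 * (k + 1)]) for k in range(n_years)]


# Hypothetical two-year series with one missing month
series = fill_single_gap([10.0, None, 30.0] + [5.0] * 21)
annual = water_year_totals(series)
```

Any trailing months that do not complete a water year are dropped by the aggregation.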
2.2 The Monsoon Asia Drought Atlas
In Southeast Asia, streamflow reconstruction studies are rare because the necessary instrumental data for calibration are often limited and, most importantly, tree‐ring records are scarce—an issue partially attributable to the lack of suitable tree species (Sano et al., 2009). In fact, to the authors' knowledge, there has been only one streamflow reconstruction attempt in Southeast Asia (D'Arrigo et al., 2011). To address the problem of data scarcity, we propose to use the Palmer's Drought Severity Index (PDSI). While there are only a few tree‐ring sites in Southeast Asia, the PDSI is available in a gridded data set called the Monsoon Asia Drought Atlas (MADA) (Cook et al., 2010a)—a spatial‐temporal drought map over the Asian Monsoon region, with resolution 2.5° × 2.5°. The map comprises 534 grid points, each containing an annual time series of the PDSI, from 1300 to 2005, reconstructed from tree‐ring chronologies. The theoretical ground for using the PDSI as a climate proxy is that both PDSI and streamflow can be expressed as regression functions of tree‐rings; hence, one can build a regression function between them. Based on this idea, Ho et al. (2016) utilized the Living Blended Drought Atlas (LBDA) (Cook et al., 2010b)—a grid of PDSI time series reconstructed from tree‐rings over North America—to reconstruct streamflow in the Missouri River Basin, yielding good reconstruction skill (adjusted R2 ranged between 0.56 and 0.90).
Our preliminary analyses showed that annual streamflow at P1 station has higher correlation with the nearby MADA grid points than with nearby tree‐ring chronologies (Table 1 and Figure 2). The analyses also showed that 1,200 km is the optimal search radius to include the MADA grid points (supporting information Figure S1). There are three possible explanations for this result. First, this radius incorporates valuable information from the Bidoup‐Nui Ba tree‐ring site in Vietnam, about 1,200 km southeast of P1. The chronology from this site was a major “anchor” for PDSI reconstruction in this region (Cook et al., 2010a). Second, the chronologies at the southern‐most end of the Tibetan Plateau may have contributed to the reconstruction (Figure 2) (E. Cook, personal communication, 2017). Finally, going beyond 1,200 km means leaving the climate zone characterizing the region (Peel et al., 2007). Based on these analyses, we used 51 MADA grid points (within the optimal radius) for the period 1600–2005 as the paleoclimatic data for our streamflow reconstruction. The use of a shorter time series is justified by the fact that most tree‐ring chronologies in Southeast Asia started from the seventeenth century onward (Buckley et al., 2007; Sano et al., 2009)—so, data for the period before 1600 may be less reliable.
Table 1. Correlation Between Proxies (MADA Grid Points and Tree‐Ring Chronologies^a) and Streamflow, Arranged by Increasing Distance From Station P1, Grouped by Similar Distances
| ID | Starting year | Ending year | Distance to P1 (km) | Correlation | p‐value |
|---|---|---|---|---|---|
| 252 | 1300 | 2005 | 27 | 0.52 | <1E‐4 |
| TH001 | 1558 | 2005 | 55 | 0.2 | 0.0632 |
| TH006 | 1648 | 2004 | 85 | −0.04 | 0.7464 |
| 251 | 1300 | 2005 | 282 | 0.53 | <1E‐4 |
| TH002 | 1786 | 1993 | 354 | 0.13 | 0.2757 |
| 275 | 1300 | 2005 | 368 | 0.48 | <1E‐4 |
| TH003 | 1616 | 1993 | 370 | −0.04 | 0.7589 |
| LS001 | 1743 | 2005 | 407 | −0.22 | 0.0407 |
| TH004 | 1693 | 2006 | 423 | 0.18 | 0.0919 |
| LS002 | 1785 | 2005 | 439 | −0.14 | 0.1863 |
| 301 | 1300 | 2005 | 500 | 0.42 | 0.0001 |
- Note. The numerical IDs (e.g., 252) are those of the MADA grid points. The other IDs belong to the tree‐ring chronologies.
- a Standardized chronology indices are obtained from the dendrobox project (dendrobox.org) (Zang, 2015).

Figure 2. Map showing MADA grid points (color‐scaled circles) within 1,200 km of stream gauge station P1 (red square) and nearby tree‐ring sites (green triangles). The MADA grid points show a radially decreasing correlation pattern. Beyond 1,200 km, correlation is insignificant. (Annual correlation between streamflow and PDSI was calculated for the instrumental period 1921–2005.) MADA: Monsoon Asia Drought Atlas (Cook et al., 2010a).
3 Linear Dynamical System Learning‐Reconstruction
In this section, we provide a brief overview of linear dynamical systems, and then describe in greater detail our proposed variant of the Expectation‐Maximization algorithm used for the reconstruction exercise. Finally, we show how the linear dynamic model can be used to generate stochastic streamflow replicates, and report the experimental setup of our study.
3.1 Linear Dynamical Systems
A discrete, linear dynamical system can be written in state‐space form as

$$x_{t+1} = A x_t + B u_t + w_t \tag{2}$$

$$y_t = C x_t + D u_t + v_t \tag{3}$$

where $x_t$ is the system state; $u_t$ and $y_t$ are the input and output; $w_t \sim \mathcal{N}(0, Q)$ and $v_t \sim \mathcal{N}(0, R)$ are both white noise, independent of each other. Henceforth, we refer to the system governed by equations 2 and 3 as a linear dynamical system (LDS), and its parameters $A$, $B$, $C$, $D$, $Q$, and $R$ are collectively referred to as θ. Furthermore, we assume that, at time t = 1, the system starts from an initial state $x_1 \sim \mathcal{N}(\mu_1, V_1)$. Note that the LDS model is a state‐space representation of the Auto‐Regressive‐Moving Average with eXogenous input model (ARMAX) (Ramos et al., 1995; Shumway & Stoffer, 2011), but it has an advantage over ARMAX: the system state is modeled explicitly. In rainfall‐runoff modeling applications (e.g., Ramos et al., 1995), x, u, and y represent the catchment state (an indicator of its wetness/dryness), rainfall, and runoff (or streamflow), respectively. In the context of this study, x and y maintain the same meaning, whereas the input u is represented by the climatic proxy, namely the principal components of the MADA grid points.
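To illustrate how the state carries information from one time step to the next, the sketch below iterates a scalar version of the state transition and output equations. It is a toy example under stated assumptions: scalar state, input, and output; hypothetical parameter values; and noise sequences that default to zero so the effect of the state is easy to see.

```python
def simulate_lds(u, A, B, C, D, x1=0.0, w=None, v=None):
    """Iterate a scalar linear dynamical system over the input series u.

    w and v are optional state and output noise sequences (zero if omitted).
    Returns the state trajectory and the output trajectory.
    """
    T = len(u)
    w = w if w is not None else [0.0] * T
    v = v if v is not None else [0.0] * T
    x, xs, ys = x1, [], []
    for t in range(T):
        xs.append(x)
        ys.append(C * x + D * u[t] + v[t])   # output equation
        x = A * x + B * u[t] + w[t]          # state transition
    return xs, ys


# A single input pulse keeps influencing the output through the state
states, outputs = simulate_lds([1.0, 0.0, 0.0, 0.0], A=0.5, B=1.0, C=1.0, D=0.0)

# With C = 0 the state plays no role and the output depends on the input only
_, regression_like = simulate_lds([1.0, 2.0], A=0.5, B=1.0, C=0.0, D=2.0)
```

In the first call the output decays gradually after the pulse, which is the kind of memory a static regression on the input cannot represent.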
The model can be used for both single‐site and multisite applications (Cooper & Wood, 1982a, 1982b). In the latter case, $x_t^{(j)}$ and $y_t^{(j)}$ represent the state and output at the jth site, and the matrices A, C, Q, and R capture the spatial dependence between the sites. Moreover, equation 2 implies that state transition is a first‐order Markov process. Higher order Markov processes can be transformed into first‐order by expanding the state space. For example, in this case study, a one‐dimensional system state $x_t$ represents the catchment wetness at time t. We may use a two‐dimensional system state to represent the catchment wetness at the current and previous time step. The dimensions of matrices $A$, $B$, $C$, $Q$, and R must then be increased accordingly.
One observes that linear regression is a subclass of the LDS model: the state‐dependent term Cx in equation 3 is replaced by the constant intercept term α in equation 1, and the state‐transition equation 2 is unused in linear regression. As a result, linear regression may not fully capture phenomena related to the catchment dynamics, such as flood generation mechanisms or long‐range dependence (Koutsoyiannis, 2011). In this respect, LDS is better suited, since it uses the information regarding both catchment state and climate proxies to estimate streamflow. Another key advantage of LDS over linear regression concerns the autocorrelation structure of the output variables. When the linear model is learned using least square estimators, as is often the case, serial independence is implicitly assumed (DeGroot & Schervish, 2012, p. 701), which is often not valid for climatic and hydrological processes (see Pelletier & Turcotte, 1997). This, on the other hand, is not a problem for the LDS model, which is learned using a maximum likelihood method, as we shall see in section 3.2.
3.2 Learning the System States and Parameters With the Expectation Maximization Algorithm
EM is an iterative procedure that starts from an initial parameter set $\theta_0$. At the kth iteration, given the current parameter set $\theta_k$, the Expectation step (E‐step) estimates the hidden states

$$\hat{X} = \mathbb{E}\left[X \mid Y, \theta_k\right],$$

where $t = 1, \dots, T$ is the time index, $X = (x_1, \dots, x_T)$ is the state trajectory, $Y = (y_1, \dots, y_T)$ is the output trajectory, and the hat notation denotes the estimator for the respective unknown quantity. In other words, this step solves the state estimation problem. Then, given the newly estimated state, the Maximization step (M‐step) finds a new parameter set $\theta_{k+1}$ that maximizes the likelihood of the output; that is, the M‐step solves the system identification problem. Mathematically, the goal of the M‐step is to find

$$\theta_{k+1} = \arg\max_{\theta} \mathcal{L}\left(Y \mid \hat{X}, U, \theta\right),$$

where $\mathcal{L}$ denotes the likelihood function and $U = (u_1, \dots, u_T)$ denotes the input trajectory. The critical property of the EM formulation is that the likelihood is nondecreasing after each iteration step (Dempster et al., 1977), so equation 4 always holds:

$$\mathcal{L}\left(Y \mid \hat{X}, U, \theta_{k+1}\right) - \mathcal{L}\left(Y \mid \hat{X}, U, \theta_k\right) \ge 0 \tag{4}$$

EM iterates between the E‐step and the M‐step until convergence, i.e., when the left‐hand side of equation 4 is less than a predetermined threshold τ. In the case of Gaussian likelihood, convergence is always guaranteed (Wu, 1983). In the remainder of this section, we describe the mathematical details of the EM algorithm.

The E‐step is solved with the Kalman smoother. Let

$$\hat{x}_{t|s} = \mathbb{E}\left[x_t \mid y_1, \dots, y_s\right], \qquad V_{t|s} = \mathrm{Var}\left[x_t \mid y_1, \dots, y_s\right].$$

Thus, $\hat{x}_{t|s}$ is the estimated state at time t given observations up to time s, and $V_{t|s}$ is the estimated variance of that state estimator. When s > t, the estimation task is called smoothing, when s < t, it is called prediction, and when s = t, it is called filtering. The overall goal of the Kalman smoother is to compute $\hat{x}_{t|T}$ (hence the name smoother, as $T \ge t$). This task is done in two passes: forward and backward.
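In the forward pass, the Kalman filter computes $\hat{x}_{t|t}$ (hence the name filter). First, we assume an initial state $x_1 \sim \mathcal{N}(\mu_1, V_1)$, so $\hat{x}_{1|0} = \mu_1$ and $V_{1|0} = V_1$. For $t = 2, \dots, T$, given the latest available estimate $\hat{x}_{t-1|t-1}$ based on observations up to time t − 1, we predict the current state using equation 2:

$$\hat{x}_{t|t-1} = A \hat{x}_{t-1|t-1} + B u_{t-1} \tag{5}$$

$$V_{t|t-1} = A V_{t-1|t-1} A^\top + Q \tag{6}$$

For every $t = 1, \dots, T$, the output is then predicted using equation 3, and the state estimate is updated once $y_t$ is observed:

$$\hat{y}_{t|t-1} = C \hat{x}_{t|t-1} + D u_t \tag{7}$$

$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left(y_t - \hat{y}_{t|t-1}\right) \tag{8}$$

$$K_t = V_{t|t-1} C^\top \left(C V_{t|t-1} C^\top + R\right)^{-1} \tag{9}$$

Here, the Kalman gain $K_t$ is used to update $\hat{x}_{t|t-1}$ to obtain $\hat{x}_{t|t}$. Equation 8 also shows that the updating term is proportional to the prediction error $y_t - \hat{y}_{t|t-1}$. The variance and the output estimate are updated accordingly:

$$V_{t|t} = \left(I - K_t C\right) V_{t|t-1} \tag{10}$$

$$\hat{y}_{t|t} = C \hat{x}_{t|t} + D u_t \tag{11}$$

One can interpret the distribution of $\hat{x}_{t|t-1}$ as the prior distribution of $x_t$, and the distribution of $\hat{x}_{t|t}$ as the posterior distribution, once new data $y_t$ is obtained. Furthermore, the Kalman filter can be shown to be the optimal estimator, in that it minimizes the mean squared error. The detailed proofs can be found in Shumway and Stoffer (2011, Chapter 6).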
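Having obtained $\hat{x}_{t|t}$ for all t, one can improve the state estimation further using the Rauch‐Tung‐Striebel (RTS) recursion (Rauch et al., 1965) in the backward pass. This pass is initialized with $\hat{x}_{T|T}$ and $V_{T|T}$ from the forward pass. For $t = T-1, \dots, 1$, the following quantities are computed:

$$J_t = V_{t|t} A^\top V_{t+1|t}^{-1} \tag{12}$$

$$\hat{x}_{t|T} = \hat{x}_{t|t} + J_t \left(\hat{x}_{t+1|T} - \hat{x}_{t+1|t}\right) \tag{13}$$

$$V_{t|T} = V_{t|t} + J_t \left(V_{t+1|T} - V_{t+1|t}\right) J_t^\top \tag{14}$$

$$\hat{y}_{t|T} = C \hat{x}_{t|T} + D u_t \tag{15}$$

In the forward pass, one updates the state estimation based on $y_t - \hat{y}_{t|t-1}$. In the backward pass, one does so based on $\hat{x}_{t+1|T} - \hat{x}_{t+1|t}$. The multiplier $J_t$ acts as a gain, similarly to the Kalman gain in equation 9.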
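The forward and backward passes can be sketched for the one‐dimensional state used in this study. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the state and output are scalars, the input at each time step is a list (for example, several principal components), and the function and variable names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def kalman_smoother(y, u, A, B, C, D, Q, R, mu1, V1):
    """Scalar-state Kalman filter (forward) followed by RTS recursion (backward)."""
    T = len(y)
    xp, Vp = [0.0] * T, [0.0] * T   # predicted: state and variance given data to t-1
    xf, Vf = [0.0] * T, [0.0] * T   # filtered: state and variance given data to t
    for t in range(T):
        if t == 0:
            xp[t], Vp[t] = mu1, V1                     # initial state
        else:
            xp[t] = A * xf[t - 1] + dot(B, u[t - 1])   # state prediction
            Vp[t] = A * Vf[t - 1] * A + Q              # variance prediction
        y_pred = C * xp[t] + dot(D, u[t])              # output prediction
        K = Vp[t] * C / (C * Vp[t] * C + R)            # Kalman gain
        xf[t] = xp[t] + K * (y[t] - y_pred)            # measurement update
        Vf[t] = (1.0 - K * C) * Vp[t]
    xs, Vs = xf[:], Vf[:]           # smoothed: state and variance given all data
    for t in range(T - 2, -1, -1):
        J = Vf[t] * A / Vp[t + 1]                      # smoother gain
        xs[t] = xf[t] + J * (xs[t + 1] - xp[t + 1])
        Vs[t] = Vf[t] + J * (Vs[t + 1] - Vp[t + 1]) * J
    y_hat = [C * xs[t] + dot(D, u[t]) for t in range(T)]
    return xs, Vs, y_hat


# Toy run with hypothetical parameter values and a single (zero) input
xs, Vs, y_hat = kalman_smoother(
    y=[1.0, 1.0], u=[[0.0], [0.0]],
    A=1.0, B=[0.0], C=1.0, D=[0.0], Q=1.0, R=1.0, mu1=0.0, V1=1.0)
```

Because the state is scalar, matrix inverses reduce to divisions; a multidimensional state would need a linear algebra library.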
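Given the smoothed states, the M‐step maximizes the log‐likelihood with respect to $\theta$. The expression for the log‐likelihood is

$$\begin{aligned} \log \mathcal{L} = {} & -\frac{1}{2} \sum_{t=1}^{T} \left(y_t - C x_t - D u_t\right)^\top R^{-1} \left(y_t - C x_t - D u_t\right) - \frac{T}{2} \log |R| \\ & -\frac{1}{2} \sum_{t=1}^{T-1} \left(x_{t+1} - A x_t - B u_t\right)^\top Q^{-1} \left(x_{t+1} - A x_t - B u_t\right) - \frac{T-1}{2} \log |Q| \\ & -\frac{1}{2} \left(x_1 - \mu_1\right)^\top V_1^{-1} \left(x_1 - \mu_1\right) - \frac{1}{2} \log |V_1| + \text{const.} \end{aligned} \tag{16}$$

Since the expressions for $A$, $B$, $C$, $D$, $Q$, and $R$ are quite cumbersome, some further shorthand notations are necessary. Let

$$P_t = V_{t|T} + \hat{x}_{t|T} \hat{x}_{t|T}^\top, \qquad P_{t+1,t} = V_{t+1|T} J_t^\top + \hat{x}_{t+1|T} \hat{x}_{t|T}^\top,$$

$$\hat{z}_t = \begin{bmatrix} \hat{x}_{t|T} \\ u_t \end{bmatrix}, \qquad S_t = \begin{bmatrix} P_t & \hat{x}_{t|T} u_t^\top \\ u_t \hat{x}_{t|T}^\top & u_t u_t^\top \end{bmatrix}.$$

Setting the derivatives of the expected log‐likelihood to zero yields

$$\begin{bmatrix} C & D \end{bmatrix} = \left(\sum_{t=1}^{T} y_t \hat{z}_t^\top\right) \left(\sum_{t=1}^{T} S_t\right)^{-1} \tag{17}$$

$$R = \frac{1}{T} \sum_{t=1}^{T} \left(y_t y_t^\top - \begin{bmatrix} C & D \end{bmatrix} \hat{z}_t y_t^\top\right) \tag{18}$$

$$\begin{bmatrix} A & B \end{bmatrix} = \left(\sum_{t=1}^{T-1} \begin{bmatrix} P_{t+1,t} & \hat{x}_{t+1|T} u_t^\top \end{bmatrix}\right) \left(\sum_{t=1}^{T-1} S_t\right)^{-1} \tag{19}$$

$$Q = \frac{1}{T-1} \sum_{t=1}^{T-1} \left(P_{t+1} - \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} P_{t+1,t}^\top \\ u_t \hat{x}_{t+1|T}^\top \end{bmatrix}\right) \tag{20}$$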
The EM algorithm is summarized in Algorithm 1. It requires the system input and output trajectory (Y and U), and returns the parameter set $\hat{\theta}$ and the estimated state and output trajectory ($\hat{X}$ and $\hat{Y}$). Note that the solution returned by the EM algorithm is a local optimum since the global optimum found at each M‐step is optimal for that M‐step only, and may not correspond to the global optimum of the overall problem.
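Algorithm 1. The EM algorithm
Require: Y, U
k = 0
Initialize $\theta_0$
Initialize $x_1$ and $V_1$.
repeat
for $t = 1, \dots, T$ do
Compute $\hat{x}_{t|t}$ and $V_{t|t}$ ▹ Kalman filter (equations 5-11)
end for
for $t = T-1, \dots, 1$ do
Compute $\hat{x}_{t|T}$, $V_{t|T}$, and $\hat{y}_{t|T}$ ▹ RTS recursion (equations 12-15)
end for
Compute $\theta_{k+1}$ ▹ M‐step (equations 17-20)
k = k + 1
until the left‐hand side of equation 4 is less than $\tau$ ▹ Convergence
Return: $\hat{\theta}$, $\hat{X}$, $\hat{Y}$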
3.3 Simultaneous Learning‐Reconstruction
Typically, a paleoreconstruction problem is solved in two phases: learning and reconstruction. Accordingly, the study horizon should be divided into two parts: the paleo period (with $t \le 0$) and the instrumental period ($t = 1, \dots, T$), as illustrated in Figure 3a. Learning involves building a regression model for the instrumental period. Reconstruction is then carried out by feeding the paleo period's input into the regression model to obtain the paleo period's output. Although this conventional approach works well for linear regression, it is not suitable for LDS models because of two issues. First, the EM algorithm not only learns the system parameters, but it also derives an estimate for the initial state $x_1$, which is necessary to commence the state transition. When the LDS is learned with only the instrumental period's data, the modeler faces a question in the reconstruction phase: which initial state to use at the first time step of the paleo period? As it turns out, this is not a major problem. Equation 2 implies that the state transition is Markovian. Thus, regardless of the initial state at the start of the paleo period, the effect of the initial conditions diminishes as the system evolves through time, and, eventually, the state trajectory converges. One therefore just needs to discard the initial transition period. The second, and most critical, problem arises when the paleo period meets the instrumental one. At this point in time, the system state may be different from the estimated $x_1$ (see Figure 3b). While the estimated θ is optimal for the original $x_1$ (derived in the learning phase), it may not be optimal for the new $x_1$ (derived in the reconstruction phase). Worse still, if the system is propagated further into the instrumental period, the state trajectory may also be different from its original estimation in the learning phase, effectively invalidating the learned model. It is not possible to force the system to the desired $x_1$ because once the system parameters are given, the system is only driven by the input.

Figure 3. (a) Conventional learning‐reconstruction delineation: the model is first learned using the instrumental period's data ($t = 1, \dots, T$), and then used to reconstruct streamflow in the paleo period ($t \le 0$). (b) When this delineation is used for the linear dynamical systems model, two problems arise: (i) the initial transition period must be discarded and (ii) when the state trajectory is propagated from the start of the paleo period, its value at t = 1 may not match its estimated value from the learning phase. (c) Our novel technique enables simultaneous learning‐reconstruction and eliminates these two problems.
To solve this issue, we can drop the paleo/instrumental period delineation and provide the EM algorithm with the entire input time series (Figure 3c). Since the time spans of the input (climatic proxy) and output (instrumental data) no longer match each other, we propose a simple, but essential, modification to the EM algorithm: when $y_t$ is missing, its best available estimate is used instead. Specifically, in the forward pass (i.e., the Kalman filter step), we fill in the missing $y_t$ with the output prediction $\hat{y}_{t|t-1}$, calculated from equation 7, which is the best estimate available in the forward pass. Referring to equation 8, one sees that, with this gap filling, the prediction error is zero, so the measurement update is effectively skipped. Next, in the M‐step, which is done after the backward pass, the Kalman‐smoothed state estimation becomes available, hence the missing $y_t$ is filled with the smoothed output estimate $\hat{y}_{t|T}$ calculated from equation 15. Equation 16 suggests that this replacement does not affect the likelihood function (more details are discussed in Appendix A). With this modification, a new estimation for y in the entire study horizon is created at each iteration. As a result, when the EM algorithm converges, the system state and parameters are learned, and the reconstruction is completed at the same time. Simultaneous learning‐paleoreconstruction is achieved. The modified algorithm is summarized in Algorithm 2.
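Algorithm 2. The modified EM algorithm for simultaneous learning‐reconstruction
Require: Y, U
k = 0
Initialize $\theta_0$
Initialize $x_1$
repeat
for each time step t of the study horizon, forward in time, do ▹ Kalman filter (equations 5-11)
if $y_t \neq$ NA then
Use the observed $y_t$ in the measurement update
else
$y_t \leftarrow \hat{y}_{t|t-1}$ ▹ (equation 7)
end if
end for
for each time step t of the study horizon, backward in time, do ▹ RTS recursion (equations 12-15)
Compute $\hat{x}_{t|T}$, $V_{t|T}$, and $\hat{y}_{t|T}$
end for
Replace missing $y_t$ with $\hat{y}_{t|T}$ ▹ (equation 15)
Compute $\theta_{k+1}$ ▹ M‐step (equations 17-20)
k = k + 1
until the left‐hand side of equation 4 is less than $\tau$ ▹ Convergence
Return: $\hat{\theta}$, $\hat{X}$, $\hat{Y}$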
This modification brings two additional benefits. First, it enables cross‐validation. Without this modification, cross‐validation could not be carried out because the original EM algorithm does not handle missing data. The only way to validate the model results, as seen in Shumway and Stoffer (2011) and Cheng and Sabes (2006), would be by way of bootstrapping and hypothesis testing—a validation procedure that yields an empirical distribution of each model parameter and determines the importance of the input variables, but that does not provide any information on the model's predictive skills. Second, the gap filling modification enables the learning algorithm to handle missing data in the instrumental record itself—these missing data points can be replaced by their best available estimates during the learning‐reconstruction process.
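The gap‐filling idea in the forward pass can be sketched as follows: a missing observation is replaced by the output prediction, so the prediction error is zero and the state estimate is left unchanged at that step. This is a minimal sketch under stated assumptions (scalar state, input, and output; hypothetical names; `None` marks a missing value). Whether the variance update should also be skipped at missing steps is not stated in the text; the sketch keeps the standard update, which is an assumption.

```python
def forward_pass_with_gaps(y, u, A, B, C, D, Q, R, mu1, V1):
    """Scalar Kalman filter in which missing outputs (None) are gap-filled."""
    x_filtered, y_filled = [], []
    x_prev, V_prev = None, None
    for t, y_t in enumerate(y):
        if t == 0:
            x_pred, V_pred = mu1, V1
        else:
            x_pred = A * x_prev + B * u[t - 1]
            V_pred = A * V_prev * A + Q
        y_pred = C * x_pred + D * u[t]
        if y_t is None:
            y_t = y_pred                       # gap fill: prediction error becomes zero
        K = V_pred * C / (C * V_pred * C + R)  # Kalman gain
        x_prev = x_pred + K * (y_t - y_pred)   # no change when y_t was missing
        V_prev = (1.0 - K * C) * V_pred        # assumption: standard variance update kept
        x_filtered.append(x_prev)
        y_filled.append(y_t)
    return x_filtered, y_filled


# Hypothetical series: one observed year followed by a missing one
x_f, y_f = forward_pass_with_gaps(
    [3.0, None], [1.0, 1.0], A=0.5, B=1.0, C=1.0, D=0.0, Q=1.0, R=1.0, mu1=2.0, V1=1.0)
```

In the full algorithm these forward‐pass fills are provisional: after the backward pass they are replaced by the smoothed output estimates before the M‐step.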
In principle, the gap filling modification can be applied to models with higher dimensions (e.g., multisite models, higher order Markov processes, or higher dimensional state space with different interpretations). There are two potential issues when LDS and its extensions are implemented. First, modelers may face numerical stability issues in higher dimensions with a lot of missing data, as the computation involves matrix inversions (section 3.2). Second, equifinality may arise when different parameter sets yield the same goodness of fit. In this case, care must be taken to choose a model that can be interpreted physically.
3.4 Stochastic Streamflow Generation
The LDS model formulated in section 3.1 is a stochastic process model. Once the model's parameters are known, it can be used readily as a stochastic streamflow generator. To do so, one first generates an initial state $x_1 \sim \mathcal{N}(\mu_1, V_1)$. Then, sequentially for each time step $t = 1, \dots, T$, the state noise $w_t$ and the output noise $v_t$ are generated; and $x_{t+1}$ and $y_t$ are computed according to equations 2 and 3. This yields one stochastic replicate of the streamflow process and catchment state. The procedure is repeated for the desired number of replicates.
Note that the stochastic replicates generated this way are only associated with one realization of u. As with other stochastic models with exogenous inputs (e.g., linear regression and ARMAX), a hierarchical procedure can be used: one first creates stochastic replicates of u, and, then, for each realization of u, generates replicates of y. When u is the PDSI, generating its stochastic replicates using a time series model can be difficult (Alley, 1984). To alleviate this, one may adopt a nonparametric resampling method, such as the stationary bootstrap (Politis & Romano, 1994).
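The replicate generation procedure for a single realization of u can be sketched as follows. This is an illustrative sketch rather than the authors' generator: the state, input, and output are scalars, the function name and the seeding scheme are hypothetical, and the parameter values in the usage line are placeholders, not fitted values.

```python
import random


def generate_replicates(u, A, B, C, D, Q, R, mu1, V1, n_rep=100, seed=0):
    """Generate stochastic replicates of state and output for one input series u."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_rep):
        x = rng.gauss(mu1, V1 ** 0.5)          # draw the initial state
        xs, ys = [], []
        for u_t in u:
            w = rng.gauss(0.0, Q ** 0.5)       # state noise
            v = rng.gauss(0.0, R ** 0.5)       # output noise
            xs.append(x)
            ys.append(C * x + D * u_t + v)     # output equation
            x = A * x + B * u_t + w            # state transition
        replicates.append((xs, ys))
    return replicates


# Placeholder parameters, 100 replicates of a 406-step series with zero input
reps = generate_replicates([0.0] * 406, A=0.5, B=0.1, C=1.0, D=0.2,
                           Q=0.1, R=0.1, mu1=0.0, V1=1.0)
```

For the hierarchical procedure, the same function would be called once per resampled realization of u.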
3.5 Experimental Setup
As a basis to gauge the performance of our dynamic model, we created a benchmark reconstruction using principal component linear regression, a well‐known method in paleohydrology (cf. Hidalgo et al., 2000; Woodhouse et al., 2006). Specifically, we used a procedure very similar to Woodhouse et al. (2006). First, we performed principal component analysis on the 51 MADA grid points falling within 1,200 km from P1 station, and retained the highest principal components that cumulatively account for at least 95% of the input variance. We then carried out a backward stepwise linear regression using the retained principal components as predictors, and log‐transformed annual streamflow as predictand.
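The retention rule for the principal components can be sketched as below, assuming the variance explained by each component (the eigenvalues of the input covariance matrix) has already been computed by a principal component analysis routine. The function name and the eigenvalues are hypothetical.

```python
def n_components_to_retain(eigenvalues, threshold=0.95):
    """Number of leading components whose cumulative variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues: the cumulative shares are 0.60, 0.90, 0.98, 1.00
n_retained = n_components_to_retain([6.0, 3.0, 0.8, 0.2])
```

The retained components would then enter the backward stepwise regression as candidate predictors.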
So as to have a fair comparison with the linear regression model, the same input and output variables were used for the LDS model, that is, the principal components selected for the benchmark and log‐transformed annual streamflow. For this first experiment, we started with a one‐dimensional system state for two main reasons: this parsimonious model works well without heavy computational load, and it simplifies the physical interpretation. To further facilitate the physical interpretation, the log‐transformed streamflow time series was centered by subtracting the mean, so that the state x is centered around zero. We adopted the MATLAB code published by Cheng and Sabes (2006), available at http://keck.ucsf.edu/sabes/documents/lds-1.0.tgz.gz, and revised it to accommodate the variant described in section 3.3. Since EM is a local optimization algorithm, it may converge to a different maximum likelihood for different initial values of the parameter set $\theta_0$. Therefore, we implemented an exhaustive search for the initial values of $A$, $B$, $C$, $D$, $Q$, and R (in the range from 0 to 1, with an increment of 0.1) and selected the initial values that yielded the highest likelihood. We fixed the value of the algorithm convergence threshold τ equal to $10^{-5}$ (Shumway & Stoffer, 2011, p. 342). All experiments were carried out in MATLAB on a dual core Intel i7–6700 CPU @ 3.40 GHz with 32 GB RAM running Microsoft Windows 10. The average runtime is 1.4 s for one setup of the initial parameter values.
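The exhaustive search over initial values can be sketched as a grid search that runs EM once per candidate and keeps the initialization with the highest likelihood. This is a sketch under stated assumptions: `run_em` is a hypothetical callable that takes a dictionary of initial values and returns the final log‐likelihood, and the toy objective in the usage lines only stands in for a real EM run.

```python
from itertools import product


def grid_search_initial_values(run_em, names, step=0.1):
    """Try every combination of initial values on a grid from 0 to 1."""
    n_points = int(round(1.0 / step)) + 1
    grid = [round(i * step, 10) for i in range(n_points)]   # 0.0, 0.1, ..., 1.0
    best = None
    for values in product(grid, repeat=len(names)):
        theta0 = dict(zip(names, values))
        loglik = run_em(theta0)                # hypothetical: one full EM run
        if best is None or loglik > best[0]:
            best = (loglik, theta0)
    return best


# Toy stand-in for EM: the "likelihood" peaks at A = 0.3, C = 0.7
toy = lambda th: -((th["A"] - 0.3) ** 2 + (th["C"] - 0.7) ** 2)
best_loglik, best_theta0 = grid_search_initial_values(toy, ["A", "C"])
```

The number of EM runs grows as 11 to the power of the number of searched parameters, so the per‐run cost matters when more parameters are included.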
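Two of the performance scores, the reduction of error (RE) and the coefficient of efficiency (CE), are defined as follows. Let $\mathcal{V}$ be the validation set, then

$$\mathrm{RE} = 1 - \frac{\sum_{t \in \mathcal{V}} \left(y_t - \hat{y}_t\right)^2}{\sum_{t \in \mathcal{V}} \left(y_t - \bar{y}_c\right)^2} \tag{21}$$

$$\mathrm{CE} = 1 - \frac{\sum_{t \in \mathcal{V}} \left(y_t - \hat{y}_t\right)^2}{\sum_{t \in \mathcal{V}} \left(y_t - \bar{y}_v\right)^2} \tag{22}$$

where $\bar{y}_c$ is the mean streamflow in the calibration set and $\bar{y}_v$ is the mean streamflow in the validation set. Thus, while the Nash‐Sutcliffe efficiency is a single metric that measures the model performance on the whole training set, RE and CE separate the model performance into two separate measures: fitness, in the case of RE, and predictive skill, in the case of CE.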
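The two scores can be computed as sketched below. The sketch assumes the form commonly used in dendroclimatology, in which both scores compare the squared validation errors against a reference sum of squares, taken about the calibration mean for RE and about the validation mean for CE. The function names and numbers are hypothetical.

```python
def reduction_of_error(observed, predicted, calibration_mean):
    """RE: skill on the validation set relative to the calibration-period mean."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    reference = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / reference


def coefficient_of_efficiency(observed, predicted):
    """CE: skill on the validation set relative to the validation-period mean."""
    validation_mean = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    reference = sum((o - validation_mean) ** 2 for o in observed)
    return 1.0 - sse / reference


# Hypothetical validation data: predicting the validation mean gives CE = 0
obs = [1.0, 2.0, 3.0]
ce_baseline = coefficient_of_efficiency(obs, [2.0, 2.0, 2.0])
re_example = reduction_of_error(obs, [2.0, 2.0, 2.0], calibration_mean=0.0)
```

A positive score means the model beats the corresponding mean as a predictor.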
Finally, we generated 100 stochastic replicates of the annual streamflow and catchment state following the procedure in section 3.4. Each replicate has the same length as the original reconstruction (406 years). Since our purpose here is only to demonstrate that the LDS model can be used directly as a stochastic streamflow generator, we did not consider the case requiring a stochastic model for the PDSI. Also, 100 replicates should be sufficient to capture the bulk of the white noise variability for our demonstration purpose (more replicates may be needed for applications that are sensitive to extreme values). In addition, to compare the two stochastic models, we generated 100 stochastic replicates for the linear regression model by simulating the noise term in equation 1.
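A replicate generator of this kind can be sketched as below. The sketch assumes the standard LDS form x[t+1] = A x[t] + B u[t] + w[t] and y[t] = C x[t] + D u[t] + v[t], with Gaussian noise of variance Q and R and an initial state drawn from N(μ1, V1); the exact procedure of section 3.4 may differ, and every numerical value below except C = 0.22 is hypothetical.

```python
import numpy as np

def simulate_lds_replicates(u, A, B, C, D, Q, R, mu1, V1, n_rep=100, seed=0):
    """Simulate state and output replicates of a one-dimensional LDS
    driven by the exogenous inputs u (shape: n_years x n_inputs)."""
    rng = np.random.default_rng(seed)
    n = u.shape[0]
    x = np.empty((n_rep, n))
    y = np.empty((n_rep, n))
    x[:, 0] = rng.normal(mu1, np.sqrt(V1), n_rep)
    for t in range(n):
        # Output equation: state term + input term + observation noise.
        y[:, t] = C * x[:, t] + u[t] @ D + rng.normal(0.0, np.sqrt(R), n_rep)
        if t + 1 < n:
            # State equation: persistence + input forcing + state noise.
            x[:, t + 1] = A * x[:, t] + u[t] @ B + rng.normal(0.0, np.sqrt(Q), n_rep)
    return x, y

# Hypothetical inputs: 406 years of two principal components.
u = np.random.default_rng(42).normal(size=(406, 2))
B = np.array([0.3, 0.1])
D = np.array([0.4, 0.2])
x_rep, y_rep = simulate_lds_replicates(u, A=0.7, B=B, C=0.22, D=D,
                                       Q=0.05, R=0.05, mu1=0.0, V1=1.0)
```

Each row of `x_rep` and `y_rep` is one replicate; all replicates share the same inputs and differ only in the sampled noise, which matches the case described above where no stochastic model for the PDSI is used.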
4 Results and Discussion
We first report the results obtained with the LDS model on the instrumental period (1921–2005), and compare them against those provided by a conventional principal component linear regression. Then, we illustrate the reconstructed catchment state and streamflow time series for the entire study period (1600–2005), and discuss their relation with El Niño Southern Oscillation, as well as other climate drivers. Finally, we analyze the stochastic replicates from the LDS model.
4.1 Model Performance
The LDS model performed markedly better than linear regression on the instrumental period (1921–2005): R2 increased by 51%, RE by 189%, and CE by 497%, while nRMSE decreased by 45% (Figure 4). Streamflow estimates improved mainly where linear regression overestimated or underestimated streamflow for several consecutive years; see, for example, the periods 1921–1930 and 1948–1954 (Figure 5a). We attribute this improvement to the use of a system state variable, together with a state‐transition equation, in the LDS model. Mathematically, the system state x is a filtered and smoothed version of streamflow; we interpret it as a flow regime state. Thus, the flow regime state x characterizes the annual flow volume relative to the long‐term mean: x > 0 indicates a wet regime, and x < 0 a dry regime. The state trajectory revealed regime‐like behavior (cf. Turner & Galelli, 2016): the catchment stayed for years (sometimes decades) in one regime, and then shifted to another regime (Figure 5b). By matching the timing of the state trajectory in Figure 5b and the streamflow trajectory in Figure 5a, one observes that linear regression tended to overestimate streamflow when the catchment was in a dry regime (e.g., 1921–1930) and to underestimate it when the catchment was wet (e.g., 1948–1954), while the LDS model matched the observations more closely. This suggests that information about the catchment state can improve streamflow reconstruction.

Figure 4. Distribution of performance scores in cross‐validation runs obtained by linear regression and linear dynamical systems (LDS) models. Gray dots represent the value of each score obtained during the validation runs; red dots represent the average value of the scores across all runs. R2, RE, CE, and nRMSE denote the coefficient of determination, reduction of error, coefficient of efficiency, and normalized root mean squared error, respectively.

Figure 5. Results of the linear dynamical systems (LDS) model in the instrumental period: (a) reconstructed streamflow, plotted with 95% confidence interval, compared with the instrumental time series and the results from a benchmark linear regression model (section 3.5); (b) trajectory of the system state (flow regime) with 95% confidence interval. LDS generally provided higher streamflow estimates during periods of high flow regime (1935–1955, 1968–1978), and lower streamflow estimates during periods of low flow regime (1921–1935, 1980–1995).
The catchment state contributes to the streamflow prediction in the LDS model by means of equation 3, which states that the system output is the sum of two terms: the state term $Cx_t$ and the input term $Du_t$. In other words, streamflow is the result of two components related to the catchment state and the exogenous inputs. Given this relationship, the modified EM algorithm derived the best combination of the state coefficient C and the input‐output coefficients D. As C was found to be positive (0.22), a quantity of $C|x_t|$ was added to (subtracted from) the input term $Du_t$ when $x_t > 0$ ($x_t < 0$). But this increase (decrease) did not lead to overestimation (underestimation), because the algorithm derived input coefficients D that have the same signs as, but smaller magnitudes than, the linear regression coefficients β (Table 2). Consequently, the LDS model was able to account for the situations in which the catchment is still wet (dry) following a previous wet (dry) year, although the PDSI for that particular year may not be high (low).
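As a small worked illustration of this decomposition, consider the noise‐free form of the output equation with the reported C = 0.22. The state values of +1 and −1 below are hypothetical, chosen only to show the direction and size of the adjustment in log space:

```latex
\begin{align*}
\hat{y}_t &= C x_t + D u_t, \qquad C = 0.22 \\
x_t = +1 &: \quad \hat{y}_t = D u_t + 0.22 \\
x_t = -1 &: \quad \hat{y}_t = D u_t - 0.22
\end{align*}
```

For the same input term, the estimate thus moves up in a wet regime and down in a dry regime, by an amount proportional to the magnitude of the state.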
Residual analysis (Figure 6) showed that the assumption of independent Gaussian noise was not violated in either model. However, large deviations from the Gaussian distribution were observed in both the positive and negative tails of the linear regression residuals. For positive residuals (overestimation), the two points of large deviation corresponded to the years 1931 and 1992, during both of which the catchment was in a very dry flow regime (Figure 5b). For negative residuals (underestimation), the two points of large deviation corresponded to the years 1973 and 2005, during both of which the catchment was in a very wet flow regime (Figure 5b). These large deviations were much less apparent in the LDS results, where the flow regime was taken into account, although one may observe that when residuals are transformed from the log space back to the original streamflow space, the deviation is still large for year 2005. Thus, residual analysis further corroborates that catchment dynamics should not be neglected in streamflow reconstruction.

Figure 6. Residual analysis results for the (left column) linear regression and (right column) linear dynamical system models. The analysis was based on the log‐transformed streamflow. Plots (a) and (b) show the quantile‐quantile plots of the residuals compared to Gaussian distributions; both models' residuals are close to Gaussian, but larger deviations are observed in the tails for linear regression. Plots (c) and (d) show the autocorrelation function (ACF) of the residuals; no significant autocorrelations are observed.
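The autocorrelation check can be reproduced with a short routine. The sketch below is an assumption about how such a check is commonly done, not the authors' code: it computes the sample ACF and compares it with the approximate 95% white‐noise bounds of ±1.96/√n. The residual series here is synthetic; 85 is the number of years in the instrumental period (1921–2005).

```python
import numpy as np

def acf(residuals, max_lag=20):
    """Sample autocorrelation function for lags 0..max_lag."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    denom = np.sum(r ** 2)
    return np.array([np.sum(r[k:] * r[:len(r) - k]) / denom
                     for k in range(max_lag + 1)])

# Synthetic residuals standing in for the model residuals.
rng = np.random.default_rng(1)
residuals = rng.normal(size=85)

rho = acf(residuals, max_lag=10)
bound = 1.96 / np.sqrt(len(residuals))          # approximate 95% bounds
n_significant = int(np.sum(np.abs(rho[1:]) > bound))
```

With independent residuals, roughly one lag in twenty is expected to fall outside the bounds by chance, so an isolated exceedance is not by itself evidence of serial dependence.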
4.2 A Reconstructed Hydrological History of the Ping River
Results revealed a history of droughts, floods, and regime shifts in the Ping River over the last four centuries (1600–2005). The LDS model and linear regression yielded similar results in normal years (i.e., when the flow regime is about zero), but the LDS model provided lower streamflow estimates in dry years and higher streamflow estimates in wet years (Figure 7a). Most importantly, the LDS model provided a drier picture than linear regression, especially during the low flow periods. This result may have important implications for water management, for instance, in the form of more conservative operating policies for the Bhumibol Reservoir.

Figure 7. Full reconstruction results: (a) reconstructed streamflow, compared with linear regression; (b) flow regime state trajectory. The gray bands are the 95% confidence intervals. The red lines indicate the mean values of the full reconstruction. In plot b, the orange bands are the megadroughts discussed in Cook et al. (2010a), namely the Ming Dynasty Drought (1638–1641), the Strange Parallels Drought (1756–1768), the East India Drought (1790, 1792–1796), and the Victorian Great Drought (1876–1878). The yellow bands are the dry epochs revealed by the flow regime state variable in the paleo period (a dry epoch is a period of consecutive negative flow regime).
The reconstructed flow regime shows different patterns of regime shift over time (Figure 7b). In the seventeenth century, the flow regime shifted infrequently; there were four main wet and dry epochs that lasted more than a decade (an epoch is a period where streamflow persists in the same regime). The flow regime then shifted more rapidly in the eighteenth and nineteenth centuries, when each wet or dry epoch lasted only a few years. The pattern of regime shift is most varied in the twentieth century: there were prolonged wet and dry epochs of decadal to bidecadal scales (similar to the seventeenth century), but the flow regime also fluctuated more vigorously than in the previous three centuries. As a result, the last century contains both the wettest period (including the wettest year) and the driest year on record. During the wettest period (1966–1979), two consecutive strong La Niña events occurred, and the driest year (1998) corresponded to a very strong El Niño event. We discuss this correspondence further in section 4.3.
The LDS model results are in agreement with the MADA, in that all four Asian megadroughts in the last millennium impacted northern Thailand (Cook et al., 2010a). These droughts are the Ming Dynasty Drought (1638–1641), the Strange Parallels Drought (1756–1768), the East India Drought (1790, 1792–1796), and the Victorian Great Drought (1876–1878). More interestingly, while the MADA provided a geographical footprint of these droughts, our reconstruction provided more insights pertinent to the Ping River (Figure 7). The Ming Dynasty Drought seems to have triggered, or at least contributed to, a prolonged dry epoch in the Ping. By 1638, the Ping River was coming out of a short dry epoch. The occurrence of the Ming Dynasty Drought then coincided with 3 years of declining streamflow, which set the Ping back to a dry epoch that took two decades to vanish. Throughout this drought, streamflow stayed below the mean level. The Strange Parallels Drought was in the middle of several decades where streamflow was mostly at or below normal, and the flow regime was mostly around zero. Anchukaitis et al. (2011) suggested that the Strange Parallels Drought is closely related to the Indian Ocean Dipole (IOD), and we found that the same relationship holds for the Ping River. Comparing the state trajectory in Figure 7b with page 40 in Anchukaitis et al. (2011), we observe that the first half of the drought, where the flow regime was around zero, is consistent with a negative phase in the IOD, and the spike at the end is consistent with a brief positive phase in the IOD. The flow regime history suggests that the Strange Parallels Drought was hydrologically mild, yet it coincided with a tumultuous part of Southeast Asia's history (Cook et al., 2010a; Lieberman, 2003), indicating that the socioeconomic damage of this drought may have been more serious than its hydrologic impact.
The East India Drought coincided with a dry epoch in the Ping River, but this drought seemed to have a lesser impact in Thailand than the other megadroughts. The Victorian Great Drought was similar to the Ming Dynasty Drought, in that it also set into motion a dry epoch. The MADA indicated a meteorological drought across the whole of Southeast Asia between 1876 and 1878, while the flow regime indicated a hydrological drought in Thailand between 1878 and 1886. This suggests that the catchment may have seen a transition from a meteorological drought to a hydrological one. The flow regime also indicated a major drought between 1687 and 1696 that was not identified as a megadrought in Cook et al. (2010a), suggesting that this drought was more localized to Thailand. It should be noted that the PDSI is a meteorological drought index (Alley, 1984; Palmer, 1965) that does not always reflect hydrological droughts (Mishra & Singh, 2010). Hence, droughts identified by the PDSI and those identified by the flow regime may have both similarities and differences. This implies that a regional drought footprint and a local streamflow reconstruction can complement each other to provide better understanding, as we demonstrated here.
The LDS results also identified multiple wet epochs in the three centuries preceding the instrumental records, with several pluvial years having flow comparable to the highest ones in the instrumental period (Figure 7). Notably, a prolonged wet epoch occurred between 1659 and 1672, consistent with a period of seven floods circa 1658 ± 7 years, identified in a paleo flood study using analyses of river sediments (R. Wasson, personal communication, 2017). The same study also identified a major flood in 1831, consistent with the wet epoch between 1830 and 1838 as shown in the state trajectory. This result somewhat reflects the flood generation mechanism of the Ping River, where floods are due to heavy rainfall events occurring over a wet catchment. It should also be noted that the sediment study identified peak discharge events, while we reconstructed annual streamflow. Maximum annual flow volume and peak discharge may not necessarily occur in the same year, but our results showed that the catchment stayed in the wet regime several years after a major flood.
4.3 Modes of Streamflow Variability
To characterize the most important temporal modes of variability contained in the reconstructed streamflow time series, we carried out a wavelet analysis using the Morlet wavelet (Roesch & Schmidbauer, 2014). We also applied the same technique to the reconstructed Sea Surface Temperature (SST) anomalies in the Eastern Pacific (Tierney et al., 2015). As shown in Figure 8, reconstructed streamflow shows a mode of variability that coincides with the frequency of the El Niño Southern Oscillation (ENSO), about 2 to 7 years, during the seventeenth century, the early eighteenth century and, intermittently, the twentieth century. This result is consistent with previous studies for Thailand, Vietnam, and Southeast Asia (e.g., Buckley et al., 2007; Räsänen et al., 2016; Sano et al., 2009), which suggest that positive ENSO anomalies result in reduced PDSI, and, hence, reduced precipitation. Yet, our results do not indicate a perfect match between interannual variability in SST anomalies and reconstructed streamflow. This may be explained by the fact that SST anomalies in the Eastern Pacific do not always lead to ENSO events (Buckley et al., 2007; Dunbar et al., 1994). Furthermore, Singhrattna et al. (2005) reported that the effect of ENSO on the Thailand summer monsoon exhibits time dependence. In particular, the same authors showed that the relationship between ENSO and Thailand rainfall became stronger after the 1980s; this might explain the steady ENSO‐like temporal mode of variability we observe for the reconstructed streamflow during that period.

Figure 8. Wavelet analysis of (a) reconstructed streamflow and (b) reconstructed Eastern Pacific SST anomalies (Tierney et al., 2015). The color bars indicate the wavelet power. (Values in the faded region are outside the cone of influence and should not be interpreted.)
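For readers who wish to experiment with this kind of analysis, a Morlet wavelet power spectrum can be sketched with NumPy alone. The code below is not the software cited above; it is a minimal stand‐in that follows the common FFT‐based formulation with a nondimensional frequency of 6, applied to a synthetic annual series with a 5‐year cycle. It does not compute the cone of influence or any significance levels.

```python
import numpy as np

def morlet_power(x, periods, dt=1.0, omega0=6.0):
    """Wavelet power |W(s, t)|^2 of series x at the given Fourier periods."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    x_hat = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)   # angular frequencies
    # Conversion between wavelet scale and equivalent Fourier period.
    fourier_factor = 4.0 * np.pi / (omega0 + np.sqrt(2.0 + omega0 ** 2))
    scales = np.asarray(periods, dtype=float) / fourier_factor
    power = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Fourier transform of the scaled Morlet wavelet (analytic: omega > 0).
        psi_hat = (np.sqrt(2.0 * np.pi * s / dt) * np.pi ** -0.25
                   * np.exp(-0.5 * (s * omega - omega0) ** 2) * (omega > 0))
        power[i] = np.abs(np.fft.ifft(x_hat * psi_hat)) ** 2
    return power

# Synthetic series: 406 "years" with a pure 5-year cycle.
t = np.arange(406)
series = np.sin(2.0 * np.pi * t / 5.0)
periods = np.arange(2, 33, dtype=float)
power = morlet_power(series, periods)
peak_period = periods[np.argmax(power.mean(axis=1))]
```

Averaging the power over time, as done for `peak_period`, gives a global spectrum; the time‐resolved array `power` is what a plot such as Figure 8 displays.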
Frequency analysis results for the reconstructed streamflow also show features of interdecadal variance in the seventeenth, nineteenth, and twentieth centuries, which are consistent with the prolonged wet and dry epochs described in section 4.2. As noted in previous studies (Buckley et al., 2007; Sano et al., 2009), these results indicate that other climate drivers may cause decadal streamflow variability in the region. For instance, Sano et al. (2009) found a significant positive correlation between tree‐ring reconstructions in Vietnam and SST in the northern Pacific Ocean, suggesting a possible link with the Pacific Decadal Oscillation (Mantua & Hare, 2002). The Indian Ocean Dipole (Saji et al., 1999) has also been found to be related to floods and droughts in Southeast Asia (Delgado et al., 2012; Räsänen & Kummu, 2013). This phenomenon may also influence streamflow variability in the Ping River, as indicated in our discussion on the Strange Parallels Drought (section 4.2).
4.4 Stochastic Replicates
There are stark differences between the stochastic replicates generated by LDS and those generated by linear regression (Figure 9). The replicates from linear regression have greater variability, with extremely high annual flow, more than twice the highest flow in the linear regression reconstruction (Figure 9a). On the other hand, the replicates from the LDS model follow its reconstruction more closely, for both the regime state and annual streamflow (Figure 9b). The differences are due to the characteristics of the two models. Linear regression only explains about 54% of the streamflow variance (Figure 4); the remaining variance is due to noise, which includes unmodeled phenomena. Thus, from the perspective of linear regression, the noise process can generate large deviations from the original reconstruction. On the other hand, from the perspective of LDS, the catchment is largely input‐driven (the exogenous input drives the state, and the state‐input interaction accounts for 82% of streamflow variation). As a result, stochastic replicates of LDS show the same input‐dependent pattern as the original reconstruction, unlike linear regression.

Figure 9. Stochastic replicates generated from (a) linear regression and (b, c) LDS models. The black lines are the original reconstructions (section 4.2), and the gray lines are the 100 stochastic replicates, which were generated by adding noise to the original reconstruction according to equations (1), (2), and (3).
The extreme high and low flows in the 40,600 simulated years of each model (100 replicates × 406 years) contrast the two models further. In the stochastic replicates of the LDS model, the pluvial in the 1970s was extreme: only 19 simulated years exceeded the highest reconstructed streamflow in 1971, and none exceeded the highest regime state in 1973. In contrast, the highest streamflow reconstructed by linear regression was exceeded 574 times in its stochastic replicates. Similar results were observed for droughts. Flows more extreme than the lowest reconstructed flow in 1998, which corresponded to a very strong El Niño event, occurred 139 times in the stochastic replicates of the linear regression model, compared with 72 times for LDS. Overall, the distribution of the stochastic replicates indicates that the LDS model may be better suited for stochastic streamflow generation and its application to downstream studies.
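Counts of this kind follow directly from the replicate array. The sketch below assumes that an extreme is counted whenever a simulated value lies above the maximum, or below the minimum, of the original reconstruction; the function name and the toy numbers are illustrative only.

```python
import numpy as np

def count_extremes(replicates, reconstruction):
    """Count simulated values above the reconstruction's maximum and
    below its minimum, pooled over all replicates and years."""
    n_high = int(np.sum(replicates > np.max(reconstruction)))
    n_low = int(np.sum(replicates < np.min(reconstruction)))
    return n_high, n_low

# Toy example: a 3-year reconstruction and 2 replicates.
reconstruction = np.array([1.0, 2.0, 3.0])
replicates = np.array([[0.5, 2.0, 3.5],
                       [1.0, 4.0, 3.0]])
n_high, n_low = count_extremes(replicates, reconstruction)
```

In the study's setting, `replicates` would have shape (100, 406), giving the 40,600 simulated years referred to above.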
5 Conclusions
In this work, we contributed a technique for streamflow reconstruction based on the state‐space representation of a discrete, linear dynamical system, which was learned using a novel variant of the Expectation‐Maximization algorithm. The use of a state‐space representation yields two key advantages: it estimates the trajectory of the catchment state during the paleo and instrumental period, and it accounts for the effect of both catchment state and climate proxies on the streamflow generation process. The technique was tested by reconstructing 406 years of annual streamflow for the Ping River, northern Thailand, using the Monsoon Asia Drought Atlas gridded PDSI data set (Cook et al., 2010a) as the paleo‐climate proxy. Our reconstruction identified several prolonged pluvials and droughts in the paleo period. Somewhat differently from most previous works, we found that the instrumental record contains both the wettest period (1965–1979) and the driest year (1998). Our results are aligned with earlier works (e.g., Ho et al., 2015; Tozer et al., 2015) in that flood and drought analyses based on paleoreconstructed data may yield a different picture from similar analyses using instrumental data. Therefore, as seen with other regions (Cook et al., 2010b; Johnson et al., 2016; Kiem et al., 2016), there is a need for more reconstruction studies in Southeast Asia to better understand these natural hazards.
The model's reconstruction in the instrumental period is reliable, supporting the findings by Ho et al. (2016) that a paleo drought record can be used to reconstruct streamflow, and by Watson and Luckman (2005) that gridded PDSI data sets are a rich source of information to investigate hydroclimatic variability. The model scores are notably better than those of the conventional principal component linear regression (45–497% improvement), suggesting that it is important to account for catchment dynamics, especially in systems characterized by complex streamflow generation processes. Additionally, our linear dynamical system model has several desirable features. (i) The reconstructed trajectory of the state variable provides more insights about the catchment's history than the reconstructed streamflow alone. For instance, we have shown that the model's state variable reveals regime‐like behavior of streamflow. (ii) The Expectation‐Maximization algorithm used to learn the model is computationally efficient, and does not require any assumption about serial dependence. (iii) The model can be readily used as a stochastic streamflow generator, and it is extendable to multisite applications.
A natural extension of our technique is the identification of a nonlinear dynamical system model, in which the state and output equations are nonlinear. In this case, as suggested by Roweis and Ghahramani (2001), the Kalman smoother in the E‐step needs to be replaced by an extended Kalman smoother, and the global optimizer in the M‐step can no longer be determined analytically. Such a model is thus more computationally expensive, but it may yield better results, particularly in catchments that present strong nonlinearities associated with the streamflow generation process. The benefit and cost of such nonlinear models should be investigated. Another possible extension is to use a multidimensional state vector, as only one state variable was used here. It is conceivable that multiple state variables may contain more information or improve model performance; how to interpret them remains an open question. A third extension could be to explore models in which the parameters
, and R change over time. Such time‐varying models may be used to account for changes in catchment characteristics, e.g., due to changes in land use/land cover. When applying LDS and its extensions, modelers should be aware of potential issues such as numerical stability and model equifinality.
Perhaps the most relevant application of this work that should be the topic of immediate research is to transfer the added understanding of catchment dynamics to water management practices, such as reservoir operation models. Recently, Turner and Galelli (2016) and Ng et al. (2017) have shown that regime‐like behavior in streamflow time series contributes to the suboptimality of reservoir operating policies derived with conventional optimization methods; the flipside is that better operating policies can be obtained by incorporating a regime state variable into reservoir operations. In addition, robust operating policies require longer streamflow records, since more training data are likely to provide more robust operating policies. Reconstruction studies that model regime state, such as this work, address both needs.
The encouraging results and the desirable features of the LDS model suggest that it can replace linear regression in future streamflow reconstruction studies. Most importantly, the model's regime state, not available in conventional methods, may add value to downstream applications such as reservoir operations studies. Through the findings in this work, not only has the value of streamflow reconstruction been strengthened, but its potential applications have also been widened.
Acknowledgments
We would like to thank Joost Buurman for his initial suggestions on the Ping River case study; Robert Wasson for his insightful suggestions and for providing additional information and data on paleo floods; Brendan Buckley and Edward Cook for their valuable insights on the Monsoon Asia Drought Atlas (MADA). We also thank the three anonymous reviewers for the positive and constructive comments that improved the manuscript. Hung Nguyen is supported by the President's Graduate Fellowship from the Singapore University of Technology and Design. Monthly streamflow data were obtained from the Thai Royal Irrigation Department database (http://www.hydro-1.net). The MADA data set was obtained from the National Oceanic and Atmospheric Administration (NOAA) database (ftp://ftp.ncdc.noaa.gov/pub/data/paleo/treering/reconstructions/asia/cook2010pdsi/). Standardized tree‐ring indices used for preliminary analysis were obtained from the Dendrobox project (dendrobox.org). SST data were obtained from NOAA database (https://www.ncdc.noaa.gov/paleo-search/study/17955). To create Figure 1, catchment boundary was obtained from the HydroSHEDS project (http://hydrosheds.org/), river and administration boundary data from DIVA‐GIS (http://www.diva-gis.org/Data), and digital elevation data from SavGIS (http://www.savgis.org/thailand.htm).
Appendix A: Rationale for the Gap Filling Modification
(A1)
, the best available estimate for the system state at time t after the previous E‐step. If the missing yt is replaced by
, substituting (15) into (A1), we see that the summand for time step t is zero. Consequently, when the log‐likelihood is differentiated term by term, the term corresponding to yt is already zero. The missing data point is effectively skipped in the M‐step, similarly to what happens in the E‐step.
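The E‐step behavior referred to here, where a missing observation is skipped, can be illustrated with a one‐dimensional Kalman filter. This is a sketch under stated assumptions rather than the modified code used in the study: it assumes the standard LDS form x[t+1] = A x[t] + B u[t] + w[t] and y[t] = C x[t] + D u[t] + v[t], marks missing observations with NaN, and uses hypothetical parameter values.

```python
import numpy as np

def kalman_filter_1d(y, u, A, B, C, D, Q, R, mu1, V1):
    """Scalar-state Kalman filter that skips the measurement update
    whenever y[t] is missing (NaN)."""
    n = len(y)
    x_filt = np.empty(n)
    P_filt = np.empty(n)
    x_pred, P_pred = mu1, V1                 # prior for the first time step
    for t in range(n):
        if np.isnan(y[t]):
            # No observation: the prediction is the best available estimate.
            x_filt[t], P_filt[t] = x_pred, P_pred
        else:
            innovation = y[t] - C * x_pred - np.dot(D, u[t])
            S = C * P_pred * C + R           # innovation variance
            K = P_pred * C / S               # Kalman gain
            x_filt[t] = x_pred + K * innovation
            P_filt[t] = (1.0 - K * C) * P_pred
        # Time update for the next step.
        x_pred = A * x_filt[t] + np.dot(B, u[t])
        P_pred = A * P_filt[t] * A + Q
    return x_filt, P_filt

# Hypothetical example with one missing observation at index 2.
y = np.array([0.5, 0.2, np.nan, -0.1, 0.3])
u = np.zeros((5, 1))
x_filt, P_filt = kalman_filter_1d(y, u, A=0.8, B=np.array([0.1]), C=0.22,
                                  D=np.array([0.5]), Q=0.1, R=0.05,
                                  mu1=0.0, V1=1.0)
```

At the missing time step, the filtered state equals the one‐step prediction and its variance is not reduced, so the gap contributes no information of its own.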
Notation
- x: the hidden system state
- u: exogenous input
- y: observed system output
- w: state noise
- v: observation noise
- A: state‐transition matrix
- B: input‐state matrix
- C: observation matrix
- D: input‐observation matrix
- Q: covariance matrix of the state noise
- R: covariance matrix of the observation noise
- θ: model parameters
- μ1: mean of initial state x1
- V1: variance of initial state x1
- τ: convergence threshold
References