Volume 55, Issue 11 p. 8305-8329
Research Article

Using Information Flow for Whole System Understanding From Component Dynamics

Peishi Jiang

Ven Te Chow Hydrosystem Laboratory, Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Praveen Kumar

Corresponding Author

Ven Te Chow Hydrosystem Laboratory, Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Correspondence to: P. Kumar,

[email protected]

First published: 09 September 2019
Citations: 10

Abstract

Complex systems that exhibit emergent behaviors arise as a result of nonlinear interdependencies among multiple components. Characterizing how such whole system dynamics are sustained through multivariate interaction remains an open question. In this study, we propose an information flow-based framework to investigate how the present state of any component arises as a result of the past interactions among interdependent variables, which is termed causal history. Using a partitioning time lag, we divide this into immediate and distant causal history components and then characterize the information flow-based interactions within these as self- and cross-feedbacks. Such a partition allows us to characterize the information flow from the two feedbacks in both histories by using partial information decomposition as unique, synergistic, or redundant interactions. We employ this causal history analysis approach to investigate the information flows in a short-memory coupled logistic model and in observed long-memory stream chemistry dynamics. While the dynamics of the short-memory system are mainly maintained by its recent historical states, the current state of each stream solute is sustained by self-feedback-dominated recent dynamics and cross-dependency-dominated earlier dynamics. The analysis suggests that the observed 1/f signature of each solute is a result of the interactions with other variables in the stream. Based on high-density data streams, the approach developed here for investigating multivariate evolutionary dynamics provides an effective way to understand how components of a dynamical system interact to create emergent whole system behavioral patterns such as long-memory dependency.

Key Points

  • We develop a framework to characterize how the evolutionary dynamics in a multivariate system influence the present state of each variable
  • In codependent long-memory processes, self-dependencies dominate recent dynamics, while cross-dependencies dominate earlier dynamics
  • Partitioning information from the historical states in stream chemistry into different components shows signatures of stream solute origins

Plain Language Summary

The observed dynamics of a variable in the natural environment is shaped by its interactions with several other variables. Our study provides a framework to determine characteristics of these interactions. It shows whether the present state of a variable is strongly influenced by its immediate past or interactions that happened in distant memory. Application of our approach to observed stream chemistry data shows that fractal signatures observed in the data are shaped significantly by interactions in distant memory. Further, the interaction structure reflects source origins of solutes where solutes with oceanic origins through atmospheric pathways and deposition have different behaviors than those originating within the watershed. This research opens up new ways to understand how component interactions give rise to whole system behavior.

1 Introduction

The self-organized behavior and associated patterns of form and function of watersheds, such as solute dynamics in a stream or ecohydrologic dynamics driving subsurface to atmosphere continuum, are a collective behavior resulting from the interactions among a multitude of components. Present-day advances in sensor and communication technologies and declining costs are allowing us to observe the dynamics of our environment at ever-increasing temporal frequency and spatial density. Simultaneous multivariate observations at high frequencies are opening up an unprecedented opportunity to understand and characterize deeply embedded interdependencies that govern process dynamics in our environment. How can we best use such high-dimensional data, arising from a number of simultaneously measured variables, to ask questions that take us beyond component level relationships to expose whole system behavior and enable us to identify system level attributes from component dynamics? On the flip side, can we also understand how system level constraints govern component level dynamics? In this paper we present a framework to address such questions by quantifying information flow among variables to characterize causal dependencies in complex systems (Balasis et al., 2013; Bollt et al., 2018). This draws upon the well-known idea that the whole is greater than the union of the parts. Coherent understanding of whole system dynamics from time series observations of several components in such systems can provide a unique perspective on system evolution and its dynamical characteristics.

Information is encoded in patterns of fluctuations in a signal, such as those recorded as time series by in situ instruments in a watershed. These fluctuations may be externally instigated, such as through variations in rainfall, or internally generated through nonlinearities in the system. As fluctuations propagate through different components in a watershed system, that is, one variable responds through its own fluctuation to that of another variable, we quantify it as flow of information from the latter to the former since the pattern of variability in one variable shapes that of another. Thus, information flow serves as the currency of exchange between interacting variables that fluctuate within the constraints of the conservation of mass, momentum, and energy. In a watershed, or dynamical systems in general, information flow captures the attenuation or amplification of fluctuations among variables, thereby revealing the dynamic connectivity between them (Goodwell et al., 2018). Analysis of such information flow can provide a unique vantage point for understanding watershed functions: that of quantifying multivariate causal interactions (Balasis et al., 2013; Bollt et al., 2018) that link across space and time scales of the watershed dynamics (Jiang & Kumar, 2019).

Consider a multivariate complex system with N variables, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0001, varying in time t. The current state of any variable urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0002 is an outcome of interactions in the entirety of all the earlier dynamics in the system. We call this prior dynamics urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0003 causal history (Jiang & Kumar, 2019). The interactions from the causal history can be parsed in a number of ways. The entire causal history can be divided into immediate and the complementary distant causal histories, partitioned by a time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0004 (Jiang & Kumar, 2019). The quantification of the influences from immediate and distant causal histories, as a function of the partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0005, provides insights on the interplay between the influence of recent and prior dynamics on Zt, since the dynamics of Zt is sustained by the dependencies on its own past as well as the interactions with other variables. Here, we propose significant new advances that describe how such interactions can be computed from observed multivariate time series data. In essence we characterize how the outcome or state of a variable at any time can be quantified in terms of the interaction between its own history and those of other variables, both in the immediate and distant causal histories. These interactions are captured as unique, synergistic, and redundant components through the framework of partial information decomposition (PID; Goodwell & Kumar, 2017; Williams & Beer, 2010). This novel framework allows us to characterize the joint influence of self- and cross-dependencies in determining the current state of each variable. This approach enables us to explore how component interactions shape the whole system behavior and how whole system dynamics determines component dynamics.

The rest of the paper is organized as follows. In section 2, we briefly review and synthesize recent developments pertaining to information flow. These developments serve as a foundation and background of the proposed approach. Specifically, we summarize the approach of using directed acyclic graph (DAG) representation for time series (Lauritzen, 1996; Runge et al., 2012a), and its use for quantifying the influence of immediate and distant causal histories on the outcome of a variable (Jiang & Kumar, 2019). Based on these prior developments, in section 3 we present new results to capture the interaction between self- and cross-feedbacks between the target and other variables in the system. We show that this results in a challenge that is often referred to as the curse of dimensionality (Bellman, 1957), requiring us to estimate high-dimensional probability distribution functions. To address this problem, we develop an approximation to reduce the dimensionality based on weighted transitive reduction (WTR; Bosnacki et al., 2010) using momentary information transfer (MIT; Runge et al., 2012b) as weights. In section 4 we illustrate the approach through two applications: (1) an observed long-memory stream solute dynamics using published stream chemistry data (Kirchner & Neal, 2013; Neal et al., 2013) and (2) a short-memory coupled logistic model. Using the two applications, we examine how the differences between short- and long-memory processes are dependent on the self- and cross-feedback in the immediate and distant causal histories. Section 5 provides a discussion and conclusion.

2 Review of Information Flow in Observed Dynamical Systems

In this section, we review the framework for understanding causal dependencies in multivariate time series using information flow, that is, how different variables interact in a number of ways to determine the present state of a target variable. We first summarize its underpinning in information measures. Then, to represent the temporal dependencies of a system, a DAG representation for time series (Runge et al., 2012a; Spirtes et al., 2000) is described. Last, based on this representation of the dynamics, we summarize the information flow to the present state of a target variable through different pathways such as a directed edge, a causal path, a pair of separable causal paths, and causal history. This synthesis is then used for further developments presented in subsequent sections along with illustrative examples.

2.1 Background on Information Measures

To characterize the nonlinear dependencies among multiple variables, we employ information measures based on Shannon's entropy (Shannon & Weaver, 1949). Shannon's entropy quantifies the uncertainty of a variable Xt and is given by
$$H(X_t) = -\sum_{x_t} p(x_t)\,\log p(x_t) \tag{1}$$
where p(xt) is the probability of Xt. In a bivariate case of Xt and Yt, the uncertainty of Xt that remains given the knowledge of Yt can be quantified as a corresponding conditional entropy:
$$H(X_t \mid Y_t) = -\sum_{x_t,y_t} p(x_t,y_t)\,\log\frac{p(x_t,y_t)}{p(y_t)} \tag{2}$$
where p(xt,yt) is the joint probability of Xt and Yt. Moreover, the shared dependency between Xt and Yt can be measured by using mutual information of the two variables, which is given by
$$I(X_t;Y_t) = \sum_{x_t,y_t} p(x_t,y_t)\,\log\frac{p(x_t,y_t)}{p(x_t)\,p(y_t)} = H(X_t) - H(X_t \mid Y_t) = H(Y_t) - H(Y_t \mid X_t) \tag{3}$$

The last two equalities of equation (3) illustrate that I(Xt;Yt) symmetrically measures the shared dependency between Xt and Yt or the reduced uncertainty of one variable given the knowledge of the other.
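As a minimal sketch of equations (1)–(3), the Python snippet below evaluates the three quantities for a small discrete joint distribution and checks that the equalities in equation (3) agree numerically. The joint table `pxy`, the function names, and the choice of base-2 logarithms (bits) are assumptions made for illustration; they are not taken from the study.

```python
import math
from collections import defaultdict


def entropy(p):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def marginal(pxy, axis):
    """Marginalize a joint distribution {(x, y): prob} onto one coordinate."""
    out = defaultdict(float)
    for pair, v in pxy.items():
        out[pair[axis]] += v
    return dict(out)


def conditional_entropy(pxy):
    """H(X | Y) = -sum p(x, y) log2 [p(x, y) / p(y)] for a joint {(x, y): prob}."""
    py = marginal(pxy, 1)
    return -sum(v * math.log2(v / py[y]) for (x, y), v in pxy.items() if v > 0)


def mutual_information(pxy):
    """I(X; Y) computed directly from the joint and marginal probabilities."""
    px, py = marginal(pxy, 0), marginal(pxy, 1)
    return sum(
        v * math.log2(v / (px[x] * py[y])) for (x, y), v in pxy.items() if v > 0
    )


# Hypothetical joint distribution of two binary variables (illustrative only).
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
pyx = {(y, x): v for (x, y), v in pxy.items()}  # same table with roles swapped

i_direct = mutual_information(pxy)
i_from_x = entropy(marginal(pxy, 0)) - conditional_entropy(pxy)  # H(X) - H(X|Y)
i_from_y = entropy(marginal(pxy, 1)) - conditional_entropy(pyx)  # H(Y) - H(Y|X)
print(i_direct, i_from_x, i_from_y)
```

The three printed values should coincide up to floating-point error, which is the symmetry that the last two equalities of equation (3) express.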

When a third variable Zt is considered as a target influenced by two sources Xt and Yt, the total uncertainty reduction of Zt due to both Xt and Yt is measured as the mutual information between Zt and the union of Xt and Yt, that is, I(Zt;Xt,Yt). To further characterize different information contents in I(Zt;Xt,Yt), PID (Williams & Beer, 2010) has been developed to decompose the total information into (1) synergistic information—information jointly provided by both Xt and Yt and not available from each variable alone, denoted as S; (2) redundant information—the overlapping information from the two sources, denoted as R; and (3) unique information—information provided by each source individually, denoted as UX and UY, respectively. PID is then given by
$$I(Z_t;X_t,Y_t) = S + R + U_X + U_Y \tag{4}$$

An approach for computing these quantities that is suitable for environmental time series data is discussed in Goodwell and Kumar (2017).
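The bookkeeping in equation (4) can be sketched in Python once a redundancy value is chosen. The snippet below uses our reading of the rescaled redundancy of Goodwell and Kumar (2017), in which a lower bound on redundancy is interpolated toward the minimum source-target mutual information according to the normalized dependency between the two sources; readers should check that paper for the authoritative definition. The function names, the discrete plug-in entropies, and the two toy distributions (an XOR gate and a fully redundant copy) are assumptions for illustration only.

```python
import math
from collections import defaultdict


def H(p):
    """Shannon entropy in bits of {outcome: probability}."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def marg(p, idx):
    """Marginal of a joint {(x, y, z): prob} over the coordinate indices in idx."""
    out = defaultdict(float)
    for key, v in p.items():
        out[tuple(key[i] for i in idx)] += v
    return dict(out)


def mi(p, a, b):
    """I(A; B) = H(A) + H(B) - H(A, B) for coordinate index tuples a and b."""
    return H(marg(p, a)) + H(marg(p, b)) - H(marg(p, a + b))


def pid_rescaled(p):
    """Decompose I(Z; X, Y) for a joint {(x, y, z): prob} into S, R, UX, UY."""
    X, Y, Z = (0,), (1,), (2,)
    i_x, i_y, i_xy = mi(p, X, Z), mi(p, Y, Z), mi(p, X + Y, Z)
    r_min = max(0.0, i_x + i_y - i_xy)   # smallest redundancy keeping S >= 0
    r_mmi = min(i_x, i_y)                # largest redundancy keeping U >= 0
    h_min = min(H(marg(p, X)), H(marg(p, Y)))
    i_s = mi(p, X, Y) / h_min if h_min > 0 else 0.0  # source dependency in [0, 1]
    r = r_min + i_s * (r_mmi - r_min)    # assumed rescaled redundancy
    ux, uy = i_x - r, i_y - r
    s = i_xy - ux - uy - r               # remainder, so the parts sum to I(Z; X, Y)
    return {"S": s, "R": r, "UX": ux, "UY": uy, "total": i_xy}


# Toy systems: Z = X xor Y (purely synergistic) and Z = X = Y (purely redundant).
xor = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
copy = {(b, b, b): 0.5 for b in (0, 1)}
pid_xor = pid_rescaled(xor)
pid_copy = pid_rescaled(copy)
print(pid_xor)
print(pid_copy)
```

In the XOR case neither source alone informs the target, so the full bit is synergistic; in the copy case the two sources are identical, so the full bit is redundant.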

2.2 Information Flow in DAG Representation for Time Series

To analyze how the historical dynamics shape the current state of a variable in a multivariate complex system, we represent the temporal dependencies of the system as a DAG for time series, as shown in Figure 1. The DAG represented as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0010 is defined as follows. Each node refers to the state at time t of a variable in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0011. A directed edge in E linking two nodes Yt−τ and Zt, written as Yt−τ → Zt, stands for a direct causal influence from Yt−τ to Zt, where τ is a positive time lag. All the nodes directly influencing the target node Zt form the parent set of Zt, denoted as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0012. In addition, a source node Yt−τ can be linked to a target node Zt indirectly through a causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0013, which is a set of nodes connected by a sequence of edges linking from Yt−τ to Zt, that is, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0014. Based on the DAG representation for time series, we consider the influence of different ways in which historical dynamics can affect a target node. Specifically, by assuming that only past influences the future, information flow to a target can be classified in the following categories:

Illustration of information flow to a target node Zt in a quadvariate complex system from (a) Yt−1 through a directed edge; (b) Yt−3 through the corresponding causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0015; (c) Xt−4 and Yt−2 through the union of the corresponding two causal paths urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0016; and (d) the entire causal history, which can be partitioned into immediate and distant causal histories based on a partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0017.
Information flow through a directed edge: Consider a direct causal influence, in a Granger sense (Granger, 1969), between two nodes Yt−τ and Zt through an edge, as shown in Figure 1a. That is, Zt and Yt−τ are connected with a directed edge if and only if a disturbance in Yt−τ will result in a corresponding disturbance in Zt when conditioned on the remaining past states of the variables in the system. Mathematically, such influence can be measured as a conditional mutual information (CMI) and is given by (Runge et al., 2012a)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0018(5)
where \ is the exclusion symbol. The CMI in the above inequality quantifies the information flow from Yt−τ to Zt conditioned on the knowledge of the rest of the dynamics of all interacting variables in the causal history urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0019 excluding Yt−τ.
The computation of CMI in equation (5) is infeasible due to the potentially infinite number of nodes arising from the past history in the condition set urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0020. To avoid this “curse of dimensionality,” the Markov property for DAG is assumed (Lauritzen et al., 1990). Loosely speaking, this property states that a node Zt is statistically independent of its nondescendants if its parents urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0021 are given, where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0022. This property implies that Zt is independent of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0023 given the knowledge of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0024. Correspondingly, the CMI in equation (5) can be revised as
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0025(6)
which is called the MIT (Runge et al., 2012a). Equation (6) holds because under the Markov property of the DAG the dynamics between Zt and Yt−τ are independent of the rest of the historical states if conditioned on the parents of the two nodes. An example of the condition set is illustrated as the gray nodes in Figure 1a for the influence from Yt−1 to Zt. MIT quantifies the direct interaction between two nodes acting as source and target, by excluding any information from other nodes that may be flowing through the source or directly to the target.
Information flow through a causal path: In addition to a direct influence through an edge, a lagged source node Yt−τ can also indirectly affect a target node Zt through the corresponding causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0026. An example is illustrated as the influence from Yt−3 to Zt in the quadvariate system in Figure 1b. The quantification of the information flow through urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0027 is the same as equation (5). However, the corresponding simplification of equation (5) based on the Markov property is different from equation (6) and is given by
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0028(7)
where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0029 is the parent set of the causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0030. Equation (7) is called momentary information transfer along causal path (MITP; Runge, 2015). Note that the condition set in equation (7) is now defined by separating the union of Zt and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0031 from the rest of the prior dynamics as illustrated by gray nodes in Figure 1b for the influence from Yt−3 to Zt. Therefore, equation (7) gives the information flow from a single lagged source Yt−τ to a target Zt traversing only through the causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0032.
Information flow through two separable causal paths: A natural extension of equation (7) is quantifying the information flow from two lagged sources urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0033 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0034 to a target Zt through the corresponding causal paths urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0035 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0036, respectively. For instance, see Figure 1c, which illustrates the influence from Xt−3 and Yt−2 to the target Zt through their corresponding causal paths. The total information given by the two sources can be quantified as the CMI between Zt and the union of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0037 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0038 given the knowledge of the prior dynamics and is given by (Jiang & Kumar, 2018):
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0039(8)
where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0040 is a subgraph consisting of the union of the two causal paths and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0041 indicates the associated parent set. Again, the condition set, simplified based on Markov property, separates the dynamics going through the two causal paths from the remaining historical states, such as the gray nodes shown in Figure 1c separating the causal subgraph urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0042 from the remaining nodes.
Further, the interaction between the two causal paths can be characterized through PID as (Jiang & Kumar, 2018):
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0043(9)
where the subscript c in the synergy Sc, redundancy Rc, and unique contributions UX,c and UY,c represents that the decomposition is associated with causal paths. Whereas equation (4) characterizes the information contents given by two sources, equation (9) characterizes momentary partial information decomposition (MPID; Jiang & Kumar, 2018) and focuses on characterizing the information only going through the pathways linking the sources and the target, with the influence from earlier dynamics excluded through conditioning. An example that illustrates the difference between equations (4) and (9) is provided in section 4.1 using observed stream chemistry data.

Information flow from immediate and distant causal histories: A further look at the current state of a target variable, Zt, in Figure 1d reveals that Zt is in fact a result of the prior states of all interdependent variables in the system, that is, the causal history urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0044, with information flowing through a multitude of different pathways in the DAG. The causal history can be partitioned into two complementary components: (1) the recent dynamics arising from all the previous states up to the time step urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0045, termed immediate causal history, represented by a subgraph urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0046 consisting of all the causal paths from the contemporaneous nodes urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0047 to Zt, and (2) the remaining earlier dynamics, termed distant causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0048. An example of immediate and distant causal histories is illustrated in Figure 1d. As the partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0049 grows, the information from immediate causal history and the complementary distant causal history can be expected to increase and decrease, respectively. In particular, the decay or asymptotic convergence of the information from the distant history illustrates the memory dependency of the system (Jiang & Kumar, 2019). For example, long-memory systems have persistent nonzero information from distant causal history even with a large partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0050.

The quantification of the total information flow ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0051) from the causal history along with its partition into immediate ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0052) and distant ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0053) causal histories is given by Jiang and Kumar (2019) as
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0054(10)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0055(11a)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0056(11b)
Note that, unlike urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0057, which is theoretically unchanged with urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0058, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0059 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0060 are functions of the partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0061. Further, the Markov property of the DAG makes equations (10)–(11b) computable by simplifying urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0062 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0063 as (Jiang & Kumar, 2019):
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0064(12a)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0065(12b)
where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0066 refers to the parent set of Zt that belongs to the immediate causal history (shown as the blue nodes in Figure 1d) and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0067 refers to the parent set of both Zt and the immediate causal history, which belongs to the distant causal history (shown as the orange nodes in Figure 1d). The simplifications of immediate and distant causal histories into urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0068 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0069, respectively, based on the Markov property, illustrate the aggregation of information in the DAG for time series. That is, the information from the dynamics in the system is aggregated at the nodes directly affecting the node(s) of interest (Jiang & Kumar, 2019). Therefore, equation (12b) indicates that urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0070 has all the information from the distant history such that the mutual information with Zt characterizes the dependency with the target node Zt. Similarly, equation (12a) indicates that when conditioned on urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0071, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0072 includes all the information from the immediate causal history flowing to Zt.
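One way to read equations (10)–(12) is as an application of the chain rule for mutual information: the information that two groups of source nodes jointly provide about a target equals the information from one group plus the information from the other group conditioned on the first. The sketch below checks this on a hypothetical three-node discrete system in which `W` stands in for the distant parent set, `P` for the immediate parent set, and `Z` for the target. The joint distribution and all names are illustrative assumptions, and an exact plug-in computation on a known distribution is used here in place of the kNN estimator employed in the study.

```python
import math
from collections import defaultdict
from itertools import product


def H(p):
    """Shannon entropy in bits of {outcome: probability}."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def marg(p, idx):
    """Marginal of a joint distribution over the coordinate indices in idx."""
    out = defaultdict(float)
    for key, v in p.items():
        out[tuple(key[i] for i in idx)] += v
    return dict(out)


def cmi(p, a, b, c=()):
    """I(A; B | C) = H(A, C) + H(B, C) - H(A, B, C) - H(C); plain MI if c is empty."""
    return H(marg(p, a + c)) + H(marg(p, b + c)) - H(marg(p, a + b + c)) - H(marg(p, c))


# Hypothetical joint over (w, p, z): W is a fair coin, P copies W with
# probability 0.8, and Z depends on both P and W.
joint = {}
for w, p, z in product((0, 1), repeat=3):
    prob_p = 0.8 if p == w else 0.2
    prob_z1 = 0.1 + 0.6 * p + 0.2 * w
    prob_z = prob_z1 if z == 1 else 1.0 - prob_z1
    joint[(w, p, z)] = 0.5 * prob_p * prob_z

W, P, Z = (0,), (1,), (2,)
info_distant = cmi(joint, Z, W)        # analogue of equation (12b)
info_immediate = cmi(joint, Z, P, W)   # analogue of equation (12a)
info_total = cmi(joint, Z, P + W)      # information from both parent sets
print(info_immediate, info_distant, info_total)
```

Because the conditioning in the immediate term removes whatever the distant nodes already explain, the two terms add up to the total without double counting.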

As summarized above, the DAG representation for time series provides effective formulations for capturing a range of different types of information flow to the present state of a target variable, which originates from one or two lagged sources or all of the historical dynamics. In particular, the causal history analysis approach provides a new way to investigate the influence of the entire evolutionary dynamics from the perspective of multivariate causal interactions. While the previous study partitions the causal history based on a time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0073, a more detailed examination is desired to explore the interplay between different variables in the causal history. Such exploration is crucial in that it can potentially provide more insights through a finer multivariate analysis of the whole system's dynamics. This detailed characterization is anchored on a physically reasonable segmentation of the causal history as well as the corresponding PID, which is addressed in the next section and serves as the primary contribution of this work.
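To make the node sets discussed above concrete, the following sketch encodes a time series DAG as a lagged link template and extracts, for a given partitioning lag, the parents of the target inside the immediate causal history and the nodes of the distant causal history that feed directly into the immediate history or the target. The three-variable link structure, the function names, and the convention that nodes with lag no greater than `tau` count as immediate are assumptions for illustration; the study's exact graph and boundary convention may differ. The last two lines split the distant set by variable, which is the self versus cross distinction developed in section 3.

```python
# Hypothetical lagged link template: each variable maps to its direct causes,
# written as (source variable, lag in time steps).
LINKS = {
    "Z": [("Z", 1), ("Y", 1)],
    "Y": [("Y", 1), ("X", 1)],
    "X": [("X", 1)],
}


def parents(node):
    """Parents of a node (variable, lag behind t) under the repeating template."""
    var, lag = node
    return {(src, lag + d) for src, d in LINKS[var]}


def split_history(target, tau):
    """Return (immediate nodes, immediate parents of target, distant boundary set)."""
    immediate, frontier = set(), [(target, 0)]
    # Walk backward in time from the target, keeping ancestors with lag <= tau.
    while frontier:
        node = frontier.pop()
        for par in parents(node):
            if par[1] <= tau and par not in immediate:
                immediate.add(par)
                frontier.append(par)
    # Parents of the target that lie inside the immediate causal history.
    p_tau = {n for n in parents((target, 0)) if n[1] <= tau}
    # Parents of the target or of any immediate node that lie beyond lag tau.
    w_tau = set()
    for node in immediate | {(target, 0)}:
        w_tau |= {par for par in parents(node) if par[1] > tau}
    return immediate, p_tau, w_tau


immediate, p_tau, w_tau = split_history("Z", tau=2)
w_self = {n for n in w_tau if n[0] == "Z"}   # target variable's own past
w_cross = w_tau - w_self                     # past of the other variables
print(sorted(p_tau), sorted(w_tau))
```

Only these two boundary sets enter the simplified information measures, which is why the Markov property keeps the dimensionality manageable even though the causal history itself is unbounded.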

3 Quantifying Multivariate Interactions

In addition to the temporal separation of the causal history through a variable partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0074, the analysis of multivariate time series using DAG representation also allows the immediate and distant causal histories to be partitioned into self- and cross-dependencies, as shown in Figure 2a. A study of how self-feedback and historical states of other related variables, from both recent and distant dynamics, jointly affect the current state of a target variable would potentially help reveal how multivariate causal interactions lead to evolutionary behavior of a system. Our goal now is to make this notion precise and demonstrate the application of the developed framework.

Illustration of the partitioning of information in the causal history urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0075 of a target Zt. (a) The partition of the causal history urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0076 into four contributions: the self- and cross-dependencies in the immediate causal history, represented by urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0077 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0078, respectively, as well as the self- and cross-dependencies in the distant causal history, represented by urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0079 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0080, respectively. (b) The illustration of the partition in (a) through the directed acyclic graph representation for time series for representing complex system dynamics (blue nodes: the parents of the target node Zt in the immediate causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0081; orange nodes: the parents of the immediate causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0082). (c) The partial information decomposition due to the interplay between the self- and cross-dependencies in the immediate and distant causal histories giving rise to that of the entire causal history in equation (15).

3.1 Interaction Through Information Flow

Specifically, for any target node urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0083 of a system, we partition the distant causal history into two components: self-dependence and cross-dependence. The former considers how a variable's own history influences its present state, while the latter captures the influence of all other variables arising through interactions as depicted in a DAG. Practically, instead of the original distant causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0084, we partition urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0085, which contains the information from earlier dynamics that directly affects the immediate causal history and Zt, into (1) self-dependence, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0086 (the orange box in Figure 2b), and (2) cross-dependence, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0087 (the dashed orange box in Figure 2b). The total information from the distant causal history, represented now by urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0088, is quantified as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0089 in equation (12b). The partitioning of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0090 into urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0091 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0092 further allows the decomposition of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0093 into synergistic, redundant, and unique contributions by using the PID framework, which is given by
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0094(13a)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0095(13b)
where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0096 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0097 are the synergistic and redundant information from distant causal history, respectively, and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0098 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0099 are the unique information from the self- and cross-dependencies, respectively. Equation (13) represents how interactions in the distant causal history between the target variable and other variables can be quantified as synergistic, redundant, and unique contributions to the target node as a function of the partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0100.
Likewise, for immediate causal history, we partition urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0101, which contains information from the immediate past and directly influences Zt, into the self- and cross-dependencies, represented by urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0102 (the blue box in Figure 2b) and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0103 (the dashed blue box in Figure 2b), respectively. The corresponding PID from the two parts of the immediate causal history is given by
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0104(14a)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0105(14b)
where the subscript urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0106 in equation (14b) refers to the information partitioning corresponding to the immediate causal history. Note that, unlike the PID for urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0107 in equation (13b), the partitioning of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0108 requires conditioning on urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0109 to exclude the influence of distant history from the computation of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0110, which follows a procedure similar to that of computing MPID for characterizing information flow to a target from two causal paths in equation (9). In a complementary manner to equation (13), equation (14) represents how interactions in the immediate causal history between the target variable and other variables can be quantified as synergistic, redundant, and unique contributions to the target node.
A closer look at equations (12)–(14) reveals that the sums of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0111 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0112 as well as their PID elements in fact give rise to the PID of the information from the entire causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0113, in terms of self- and cross-dependencies. The PID for urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0114 is therefore given by
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0115(15a)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0116(15b)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0117(15c)
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0118(15d)
where the subscript urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0119 in equation (15d) refers to the PID in the context of the entire causal history. Equation (15) illustrates that each information content from the entire causal history (i.e., synergy, redundancy, and unique information) due to the interplay between self- and cross-dependencies is additively contributed by the corresponding information content from both its distant and immediate histories. We note that this PID is not a function of the partitioning lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0120.
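
The additive structure described by equation (15) can be illustrated with a short sketch. It is not the authors' implementation: the component names (`synergy`, `redundancy`, `unique_self`, `unique_cross`) and all numerical values are hypothetical placeholders, in nats, chosen only to show how the PID of the entire causal history follows from the PIDs of its immediate and distant parts.

```python
# Hypothetical component names; all values are illustrative only (nats).
PID_KEYS = ("synergy", "redundancy", "unique_self", "unique_cross")


def combine_pid(immediate, distant):
    """Add the PID components of the immediate and distant causal histories."""
    return {key: immediate[key] + distant[key] for key in PID_KEYS}


def total_information(pid):
    """Total information is the sum of the four PID components."""
    return sum(pid[key] for key in PID_KEYS)


# Made-up PIDs for one target variable at one partitioning time lag.
immediate = {"synergy": 0.05, "redundancy": 0.20,
             "unique_self": 0.30, "unique_cross": 0.02}
distant = {"synergy": 0.03, "redundancy": 0.10,
           "unique_self": 0.01, "unique_cross": 0.15}

entire = combine_pid(immediate, distant)
print(entire, total_information(entire))
```

Because each component is summed separately, the total for the entire causal history equals the sum of the immediate and distant totals.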

In this study, we employ the rescaled approach of Goodwell and Kumar (2017) for computing the PID of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0121, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0122, and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0123 in equations (13)–(15). The rescaled approach estimates the redundant information by considering the mutual dependency between two sources and ensures a nonnegative information partitioning. Further, the empirical estimation of all the information-theoretic measures is obtained based on the k-nearest-neighbor (kNN) method (Frenzel & Pompe, 2007; Kraskov et al., 2004). The number of nearest neighbors, k, is set to five to facilitate a low bias of (conditional) mutual information (Kraskov et al., 2004; Frenzel & Pompe, 2007). In the analysis presented later in section 4, we first compute urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0124 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0125, along with their PIDs, in equations (13) and (14), respectively. Then, the PID of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0126 is obtained based on the sum of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0127 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0128 according to equation (15).
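
The arithmetic of a rescaled redundancy can be sketched as follows. This is a minimal illustration based on our reading of Goodwell and Kumar (2017), in which the redundancy is interpolated between a lower bound and the minimum source-target mutual information according to the normalized dependency between the two sources. The function name, argument names, and input values are hypothetical, and all information quantities are assumed to have been estimated beforehand.

```python
def rescaled_pid(i_s1_t, i_s2_t, i_joint_t, i_s1_s2, h_s1, h_s2):
    """Partition I(S1,S2;T) into unique, redundant, and synergistic parts.

    Assumed inputs (all in nats):
      i_s1_t, i_s2_t : mutual information of each source with the target
      i_joint_t      : mutual information of both sources with the target
      i_s1_s2        : mutual information between the two sources
      h_s1, h_s2     : entropies of the two sources (assumed positive)
    """
    interaction = i_joint_t - i_s1_t - i_s2_t   # interaction information
    r_min = max(0.0, -interaction)              # lower bound on redundancy
    r_mmi = min(i_s1_t, i_s2_t)                 # upper bound on redundancy
    i_scaled = i_s1_s2 / min(h_s1, h_s2)        # source dependency in [0, 1]
    redundancy = r_min + i_scaled * (r_mmi - r_min)
    unique_1 = i_s1_t - redundancy
    unique_2 = i_s2_t - redundancy
    synergy = i_joint_t - unique_1 - unique_2 - redundancy
    return {"unique_1": unique_1, "unique_2": unique_2,
            "redundancy": redundancy, "synergy": synergy}


# Illustrative (made-up) inputs.
pid = rescaled_pid(i_s1_t=0.30, i_s2_t=0.20, i_joint_t=0.60,
                   i_s1_s2=0.25, h_s1=1.0, h_s2=0.5)
print(pid)
```

Since the redundancy stays between the two bounds, the unique and synergistic terms remain nonnegative whenever the joint information is at least as large as each individual information.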

3.2 Dimensionality Reduction Using Momentary Information Weighted Transitive Reduction

In addition to the availability of time series data, the validity of empirically estimating the information metrics in equations (12)–(15) also depends on the number of nodes involved in the DAG. The dimensionality required in the computation can still be high even after the reduction achieved by assuming the Markov property for the DAG (Runge et al., 2012a). For example, consider the node Zt in Figure 2. The nodes of immediate and distant causal histories required in computing urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0129 (equation (12a)) and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0130 (equation (12b)) are now reduced to those associated with urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0131 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0132 (blue and orange nodes, respectively). However, the dimensions of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0133 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0134, as well as those associated with computing the PID, can still be too high for reliable estimation of the information-theoretic metrics in equations (12)–(15). In particular, the condition set, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0135, involved in computing both urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0136 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0137, contains the parents of the entire immediate causal history and accounts for most of the dimensionality in the computation of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0138 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0139, as shown by the orange nodes in Figure 2b. This dimensionality can grow quickly as the number of variables and/or the number of lags that influence a target increases. To address this problem, we introduce a new method called the momentary information weighted transitive reduction (MIWTR) to further reduce the dimensionality involved in computing the information measures. 
It builds on the weighted transitive reduction (WTR; Bosnacki et al., 2010) for reducing the complexity of the DAG. Briefly, WTR enables the removal of a directed edge between two nodes if there exists a stronger pathway linking the nodes indirectly, with the strength of a path or edge assessed by its associated weight. Here, we use the WTR to simplify the time series graph, which results in reduced dimensionality for computing the information-theoretic measures. The general idea is to first remove edges linking urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0140 with the immediate causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0141, and then exclude any node urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0142 not directly linked to urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0143, thus obtaining a reduced urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0144. We use the MIT (Runge et al., 2012b), which reflects the degree of direct dependency between two nodes, as the weight for the transitive reduction. The details of this MIWTR approach are described in the appendix.
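
The edge-removal idea behind WTR can be sketched on a small weighted DAG. The sketch below is a generic illustration, not the MIWTR procedure of the appendix. It assumes that the strength of a path is its weakest edge weight (a bottleneck rule, which is our assumption rather than a rule stated in the text), and the node names and weights are hypothetical stand-ins for MIT values.

```python
def widest_indirect_path(graph, src, dst):
    """Largest bottleneck weight over paths src -> ... -> dst with >= 2 edges.

    Returns 0.0 if no indirect path exists. `graph` maps node -> {child: weight}
    and is assumed to be acyclic with positive weights.
    """
    memo = {}

    def best_from(node):
        if node == dst:
            return float("inf")  # reaching dst adds no further constraint
        if node not in memo:
            memo[node] = max(
                (min(w, best_from(nxt)) for nxt, w in graph.get(node, {}).items()),
                default=0.0,
            )
        return memo[node]

    return max(
        (min(w, best_from(nxt))
         for nxt, w in graph.get(src, {}).items() if nxt != dst),
        default=0.0,
    )


def weighted_transitive_reduction(graph):
    """Drop each edge for which a strictly stronger indirect path exists."""
    return {
        src: {dst: w for dst, w in children.items()
              if widest_indirect_path(graph, src, dst) <= w}
        for src, children in graph.items()
    }


# Hypothetical DAG; weights stand in for coupling strengths.
example_graph = {
    "A": {"B": 0.5, "C": 0.1, "D": 0.3},
    "B": {"C": 0.4},
    "C": {"D": 0.2},
}
reduced = weighted_transitive_reduction(example_graph)
print(reduced)
```

Here the weak edge from A to C is removed because the path through B is stronger, whereas the edge from A to D is kept because every indirect path to D is weaker than the direct edge.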

4 Characterizing Multivariate Interaction in Causal History

The framework presented in sections 2 and 3 provides a number of ways by which one may ask how different variables interact to determine the outcome of a specific variable at a specific time. As indicated earlier, a variable may affect another variable's outcome directly, as computed through MIT (Figure 1a and equation (6)), or indirectly through a causal path (Figure 1b and equation (7)). Two variables may interact through their corresponding causal paths (Figure 1c), and their interaction can be partitioned using PID into synergistic, unique, or redundant contributions (equation (9)). At the next level of complexity, we can consider the interaction of all variables together influencing the outcome of any variable at time t through the framework of causal history (Figure 1d and equations (10)–(12)). Causal history can be decomposed into complementary components of immediate and distant causal histories as a function of a varying partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0145. We have shown that both immediate and distant causal histories can be further partitioned into self- and cross-dependencies (Figure 2). Each of these interactions can then be explored through PID (equations (13)–(15)). We can, therefore, ask the following:
  • How does information flow, jointly provided through the entire causal history, sustain the multivariate interaction associated with the whole system dynamics?
  • How does the characterization of such information at the system level reveal the unique contribution of each individual component in the system?

To address the above two questions, we apply the proposed causal history analysis approach to study the information flow in two different systems, using time series from either observational or synthetic data: a hydrobiogeochemical system, represented by observed stream chemistry known to exhibit long-term memory, and a short-memory logistic model. We then summarize the insights obtained from these applications.

4.1 Stream Chemistry Dynamics

We first analyze a set of published stream solute data (Kirchner & Neal, 2013) recorded in the Upper Hafren catchment in the United Kingdom. The Hafren River is a tributary of the Severn River, located in mid-Wales around 20 km from the western coast. The Upper Hafren catchment is covered by grassland and underlain by acidic soils (Neal et al., 1997). The coastal location and the vegetation in the catchment result in two origins of the stream solutes: a marine origin through atmospheric deposition and a terrestrial origin through both transport and biogeochemical processes in the subsurface. Stream solutes were sampled every 7 hr from March 2007 to January 2009 (Neal et al., 2013). The stream solute data exhibit a 1/f fractal signature after correcting for the influence of flow rate on the observations (Kirchner & Neal, 2013). This fractal signature provides evidence of long-memory dependency (Kirchner & Neal, 2013), which is sustained by multivariate interactions within the catchment system (Jiang & Kumar, 2019), and makes the data set an ideal test bed for analyzing component dynamics and emergent system behavior with the causal history analysis approach developed in section 3.

We use the logarithm of flow rate, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0146, and six stream chemistry variables, Na+, Cl−, Al3+, Ca2+, SO42−, and pH, for analysis. The time series for the raw data and the flow rate-corrected data are plotted in Figures 3a and 3b, respectively. The causal history analysis is conducted on both data sets to investigate how the evolutionary dynamics influence the current state of each variable as well as how the flow rate affects these states. The DAGs for time series of the two data sets are estimated by using the Tigramite algorithm (Runge, 2015; Runge et al., 2012a, 2015, 2017). Tigramite is a modified PC algorithm (Spirtes et al., 2000) for constructing the DAG representation of time series by using an MIT-based conditional independence test to remove spurious relationships between two nodes. In this analysis, each conditional independence test is performed based on 100 samples with a significance level α = 0.05, and the kNN approach is used to compute the corresponding CMI with k = 100 (a high k facilitates a low variance of the estimated CMI; Frenzel & Pompe, 2007). The maximum time lag for establishing the directed edges between lagged variables in the algorithm is set to 5 in constructing each graph. The resulting time series graphs are sketched in Figures 3c and 3d. The thickness of each edge is based on the coupling strength between the two connected nodes, computed by MIT in equation (6). It can be observed that while there are more cross-dependencies in the graph constructed from the raw data, self-feedback interactions are more dominant in the flow rate-corrected graph, which has fewer cross edges. The comparison between the two DAGs illustrates that flow rate is an important factor for creating cross-dependencies among the stream solutes, as expected. 
However, it is also notable that cross-dependencies exist between the variables in the absence of flow dependencies, which suggests that they may play a role in sustaining the long-memory property.
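
To make the kNN estimation concrete, the following is a minimal standard-library sketch of a Kraskov-type nearest-neighbor estimator for the unconditional mutual information between two scalar series. It is a simplified stand-in for the conditional estimator used in the analysis (Frenzel & Pompe, 2007), not the Tigramite implementation. The function names and the synthetic Gaussian data are assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n (harmonic-sum formula)."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg_mutual_information(x, y, k=5):
    """kNN estimate of I(X;Y) in nats for two equally long scalar samples.

    For each point, the max-norm distance to its k-th nearest neighbor in the
    joint space sets a radius; neighbors strictly inside that radius are then
    counted in each marginal space.
    """
    n = len(x)
    acc = 0.0
    for i in range(n):
        joint = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = joint[k - 1]
        n_x = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        acc += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - acc / n


# Synthetic check: correlated versus independent Gaussian pairs.
rng = random.Random(0)
rho = 0.9
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y_dep = [rho * v + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for v in x]
y_ind = [rng.gauss(0.0, 1.0) for _ in range(300)]

mi_dep = ksg_mutual_information(x, y_dep, k=5)
mi_ind = ksg_mutual_information(x, y_ind, k=5)
print(mi_dep, mi_ind)
```

For the correlated pair, the analytical value is −0.5 ln(1 − 0.9²) ≈ 0.83 nats, so the first estimate should be close to that value and the second close to zero. The brute-force neighbor search is quadratic in the sample size and is meant only for small illustrations.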

Time series data and estimated directed acyclic graph representation for time series using the stream chemistry data discussed in section 4.1. (a and b) The time series plots based on the raw data and the flow rate-corrected data, respectively, obtained from Kirchner and Neal (2013). (c and d) The corresponding estimated directed acyclic graphs for time series using the Tigramite algorithm (Runge, 2015; Runge et al., 2012a, 2015, 2017).

Based on the DAG estimated with raw data in Figure 3c, we first illustrate the information interaction through causal paths by characterizing the joint influence from lagged Al3+ at time t−2 ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0148) and SO42− at time t−2 ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0149) on the current state of pH (pHt). Figure 4a shows the two causal paths (blue nodes) from the two sources to the target pH (black node). We first compute the PID in equation (4) by using the rescaled approach (Goodwell & Kumar, 2017) for estimating the different information components. The resulting information characterization is shown in Figure 4a. We observe dominant unique information from aluminum, UAl,c. This is because urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0150 directly influences pHt, whereas urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0151 affects pHt only indirectly and thus contributes less unique information, USO4,c. The remaining synergistic and redundant information illustrates the joint effect of the sources as well as the overlapping effect of the two, which is explained by the dominance of the stream dynamics (e.g., flow rate) in governing the trivariate interactions (Jiang & Kumar, 2019). Nevertheless, when we block the information flow from the remaining graph by conditioning on the parents of the two causal paths (orange nodes in Figure 4b) and compute the corresponding MPID in equation (9), the resulting information decomposition differs. First, the total information drops significantly from 0.902 [nats] to 0.088 [nats] because the conditioning blocks information from earlier dynamics. Second, the synergy now dominates, while the redundancy effect diminishes. The strong synergistic information implies a strong overall effect of aluminum and sulfate on stream pH. The decrease in redundancy occurs because the conditioning blocks the dominant influence from other variables, such as flow rate. 
The above examples serve to illustrate the formulations presented in the review section and provide the backdrop for the framework developed in this paper.

Illustration of the directed acyclic graph for time series and information partitioning from Ca2+ at time t−3 and Al3+ at time t−2 to the present state of pH by using (a) momentary partial information decomposition (MPID) in equation (9) (Jiang & Kumar, 2018) and (b) partial information decomposition (PID) in equation (4). Note that the rescaled approach (Goodwell & Kumar, 2017) is adopted to compute the different information contents in MPID and PID.

We now compute the information flow from the causal history with the partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0152 ranging from 5 to 400, using equation (12) for both DAGs in Figures 3c and 3d. We compute the corresponding PIDs in immediate and distant causal histories using equations (13)–(15). Note that urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0153 is obtained as the sum of the estimated urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0154 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0155. The computations of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0156 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0157 are based on the implementation of MIWTR. For instance, to compute the information transferred to Na+ from immediate and distant causal histories separated by time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0158 using the raw data, we first identify urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0159 in equation (12) based on the DAG in Figure 3c, shown as the orange nodes at the top of Figure 5. Then, MIWTR is implemented to remove the nodes in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0160 whose edges connected to the corresponding immediate causal history are excluded by WTR. The resulting simplified urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0161 is illustrated at the bottom of Figure 5, showing the reduction of the cardinality of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0162 from 31 to 17. MIWTR is implemented for each variable at each partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0163 in both DAGs in Figures 3c and 3d. The corresponding cardinalities with and without MIWTR are shown in Figure A6a. It can be observed that, for each variable, a significant dimension reduction, between 5 and 15, is achieved due to the simplification of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0164. 
The resulting information flows from causal history and their corresponding information partitioning for the systems with and without the influence of flow rate over time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0165 are plotted in Figures 6 and 7, respectively.

Illustration of the implementation of MIWTR to reduce the dimensionality of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0166 for the present state of Na+ with partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0167 = 6 for separating distant and immediate causal histories. (a) The originally identified urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0168 (orange nodes) and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0169 (blue nodes) based on equation (12). (b) The reduced urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0170 obtained by using MIWTR (orange nodes). Note that the number of edges in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0171 is much smaller than in the original directed acyclic graph in (a) because MIWTR removes edges (in magenta) linking urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0172 and the immediate causal history.
Plots of the information flow from distant, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0173 (left), and the entire, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0174 (right), causal histories over the partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0175 for the raw data (blue region) and the flow rate-corrected data (red region) based on the directed acyclic graph for time series in Figures 3c and 3d, estimated by using both k-nearest-neighbor estimator and the momentary information weighted transitive reduction.
Plots of the partial information decomposition (PID) based on the two sets of stream solute time series data and the estimated directed acyclic graph representation shown in Figure 3. (a and b) The PID for the information from the causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0176 in equation (15), based on raw data and the flow rate-corrected data, respectively. (c and d) The PIDs of the immediate and distant causal histories, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0177 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0178 in equations (14) and (13), based on raw data and the flow rate-corrected data, respectively. For each variable, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0179 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0180 lie above and below the dotted black line, respectively. The colors used in this figure are consistent with those used in Figure 2.

Figure 6 shows that for each variable in both DAGs, the information from the entire causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0181, is almost invariant with the time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0182. This is because the influence from the entire evolutionary dynamics of the system is independent of the time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0183 used to partition the distant and immediate histories. Furthermore, the nonzero convergence of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0184 for almost every variable implies the presence of long-memory dependence in the stream chemistry dynamics. In addition, the comparison of the results between raw data (blue region) and flow rate-corrected data (red region) shows that both urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0185 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0186 drop significantly when the influence of the flow rate is corrected. This is due to the dominance of the flow rate in providing a significant amount of information for the dynamics of each selected variable. All the above conclusions are consistent with the results estimated without MIWTR in Jiang and Kumar (2019), illustrating the reliability of MIWTR in reducing the dimensionality of computing information flow.

Moreover, the characterizations of the information from the entire causal history ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0187), its immediate ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0188) and distant ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0189) partitioning in Figure 7 provide a richer picture about the influence from the system's whole evolutionary dynamics on the present state of each target. The partitioning of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0190 in Figures 7a and 7b shows that self- and cross-dependencies contribute to different information contents in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0191 of both graphs. While the unique information, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0192, contributes most of the information from cross-dependencies for all variables, the main contributor for the influence from self-feedback interactions differs. When the influence of flow rate is included, the redundant information, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0193 in Figure 7a, is stronger in the self-feedback influence. On the other hand, when the flow rate influence is excluded, the unique information of self-dependency, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0194 in Figure 7b, dominates.

In addition to urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0195, the information partitioning of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0196 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0197 in Figures 7c and 7d delineates the contributions of different information contents from the recent and the remaining earlier dynamics of the system. It can be observed that the dominance of self-dependency in immediate history is independent of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0198, through either unique urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0199 or redundant urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0200 information. However, the influence from distant history is attributable to both self- and cross-dependencies for small urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0201 but is dominated only by cross-dependency, through its unique information urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0202, as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0203 increases. This is because the influence from self-dependency is limited to recent dynamics, such that when the separation time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0204 is too small, some of the self-dependency influence is reflected in the distant causal history and is then squeezed back into the immediate causal history as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0205 grows. It also implies that the influence from self-feedback interaction dominates the recent dynamics of each target variable, while the interaction with other variables dominates the dynamics of the target in the long term. This is especially insightful in understanding the 1/f fractal dynamics of these stream solutes. The important role of self-feedback interaction in determining a self-similar process is well accepted; however, the role of interaction with other related variables of a system is usually not considered. 
The PID of causal history now allows us to explicitly quantify this role through information flow. It shows that the influence due to the cross interactions is crucial in sustaining the long-memory behavior of the stream chemistry. This analysis leads us to postulate that in order to sustain complex long-memory behavior with several interacting components, a complex system requires (1) the influence from the self-feedback interactions in recent dynamics for guiding the short-term trend of the system and (2) the cross-dependency in earlier dynamics for supporting the long-term trend.

Also, the PID results of information from the immediate and distant causal histories reveal different dynamics for the different solutes studied. In the flow rate-dominated system, most solutes show similar PID patterns, as plotted in Figure 7c. That is, for each solute, the information from distant causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0206, mainly consists of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0207 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0208, and the information from immediate causal history, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0209, mainly consists of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0210 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0211. Again, the larger redundant information in both distant and immediate causal histories ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0212 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0213) is due to the influence of flow rate. Meanwhile, unlike the other solutes, SO42− shows stronger unique information due to its self-feedback interactions in recent dynamics, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0215. This implies that SO42− is less subject to the influence of flow rate than the other solutes, which is evident from the negligible changes in the PID of SO42− when comparing results with and without flow rate in Figure 7d. However, when the influence of flow rate is removed, the PID patterns differ for each variable, as shown in Figure 7d. For instance, Na+ and Cl− show stronger unique information due to their self-feedback dynamics in both distant and immediate causal histories, represented by urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0218 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0219, respectively. 
This is consistent with the fact that the majority of the sodium and chloride in the studied catchment, which is close to the coast, originates from the ocean and is brought inland through atmospheric deposition (Neal et al., 1997). Therefore, compared with other solutes, the states of Na+ and Cl− are more influenced by their own dynamics and less by other variables. Meanwhile, for solutes with evenly mixed origins from both ocean and catchment, such as Ca2+ and SO42− (Neal et al., 1997), there is higher redundant information in their distant causal histories, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0221. This illustrates the shared influences from the catchment processes and from oceanic origins and atmospheric pathways for both Ca2+ and SO42−, represented by urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0223. Lastly, Al3+ shows dominant redundant information from both distant and immediate histories, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0224 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0225, respectively. This coincides with its solely terrestrial origin (see Table S3 in Kirchner & Neal, 2013), such that the catchment processes are a strong determinant of the state of Al3+.

The PID approach, in conjunction with the causal history framework for the solute dynamics, has enabled us to characterize the whole system behavior through the DAG representation and associated information flow (Figures 7a and 7b) as well as the effect of interactions on the dynamics of each solute. We find that the whole system dynamics are mainly maintained by a self-dependency-dominated immediate causal history and a cross-dependency-dominated distant causal history. Also, the characterization of information from immediate and distant causal histories reveals differences due to the origins of each solute.

4.2 Short-Memory Dynamics: A Trivariate Chaotic Model

The previous example serves to illustrate how the PID of causal history exposes embedded dependencies arising through interactions in long-memory systems. To investigate the dynamics in a different type of system, we apply the causal history analysis to a short-memory synthetic model. Consider a trivariate coupled logistic system, given by
urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0226(16)
where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0227 is a uniform noise term and ϵ is its coupling strength. The coupled logistic model shows different degrees of synchronization as a function of ϵ (Atay et al., 2004; Paredes et al., 2013; Rosenblum et al., 1997). With its symmetric structure and lag-one interactions, the model is a short-memory process, as shown in a recent study (Jiang & Kumar, 2019). That is, the current state of each variable is controlled only by a finite set of historical states. Here, we analyze the influence from causal history on a target variable, X3,t, based on a mild noise effect (i.e., ϵ = 0.3). After partitioning the immediate and distant causal histories of X3,t based on an earlier time step urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0228 shown in Figure 8a, we identify urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0229 as blue nodes, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0230 as orange nodes, and the self- and cross-dependencies of the two histories in solid and dashed boxes, respectively. urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0231, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0232, and their corresponding information characterizations are calculated for urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0233 ranging from 1 to 50 based on equations (13) and (14), with 10,000 synthetic data points generated to conduct the empirical estimations for each urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0234.
Illustration of the trivariate logistic model in equation (16). (a) The directed acyclic graph representation for time series of the system for X3,t as the target, with immediate and distant causal histories partitioned based on time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0235 as well as the self- and cross-dependencies in the two histories in solid and dashed boxes for computing information-theoretic measures, respectively. (b) The corresponding plots of the partial information decomposition for the two histories in equations (13) and (14) with urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0236 ranging from 1 to 50.
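
For readers who wish to generate comparable synthetic data, the sketch below simulates a symmetric, lag-one, noise-driven trivariate logistic system. The update rule is an assumed illustrative form and is not claimed to reproduce equation (16): each variable is a convex combination of the logistic map applied to the other two variables at the previous step and a uniform noise term weighted by ϵ. The function names, the seed, and the update rule are hypothetical; only ϵ = 0.3 and the 10,000 data points are taken from the text.

```python
import random


def logistic_map(x):
    """Fully chaotic logistic map on [0, 1]."""
    return 4.0 * x * (1.0 - x)


def simulate_coupled_logistic(n_steps, eps=0.3, seed=0):
    """Simulate an ASSUMED symmetric trivariate coupled logistic system.

    Each variable at time t depends on the other two variables at time t-1
    (lag-one cross coupling) plus uniform noise with weight eps. This form is
    an illustrative assumption, not the published equation.
    """
    rng = random.Random(seed)
    state = [rng.random() for _ in range(3)]
    series = [state]
    for _ in range(n_steps - 1):
        mapped = [logistic_map(v) for v in state]
        state = [
            (1.0 - eps) * 0.5 * (sum(mapped) - mapped[i]) + eps * rng.random()
            for i in range(3)
        ]
        series.append(state)
    return series


data = simulate_coupled_logistic(10_000, eps=0.3, seed=0)
print(len(data), data[-1])
```

Because every update is a convex combination of quantities in the unit interval, the simulated states remain in the unit interval without any additional wrapping.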

The characterization of information flow from distant and immediate histories is plotted in Figure 8b (similar to that in Figure 7). Unlike the long-memory stream chemistry dynamics, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0237 of the logistic model (the area above the black dotted line) converges to zero with increasing urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0238, indicating the short-term dependence of the process. Furthermore, we observe overall very strong redundant information, contributed by both an increasing urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0239 from immediate causal history and a decreasing urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0240 from distant causal history as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0241 grows. The opposing changes of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0242 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0243 illustrate the exchange of redundant information from distant causal history to immediate causal history when more states are entrained into immediate causal history. The strong overall redundancy is due to the symmetrical structure of the model in equation (16), such that the dynamics of the three variables are similar to each other and, therefore, provide significant overlapping information to the others. In addition, the influence from cross-dependence is now dominated by immediate causal history through urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0244 rather than distant causal history as observed in stream chemistry dynamics. This, again, is because the short-term dependence of the logistic system causes the contributions of both self- and cross-dependence interactions to originate from recent dynamics.

4.3 Insights From the Applications

Characterizing the information flow from the causal history in the two systems reveals the whole system behavior as well as the dynamics of each of its components. Therefore, it helps address the two questions raised at the beginning of this section.

First, the dynamics that sustain the whole system behavior vary between the systems. For a short-memory system, such as the trivariate logistic model, the present state of each variable is maintained by the recent dynamics, including both self-feedback interactions and the influence of the other variables. Meanwhile, for a long-memory system, the influences sustaining the whole system behavior are mainly contributed by a self-dependency-dominated immediate causal history and a cross-dependency-dominated distant causal history. This implies that while the self-feedback interaction from recent dynamics is critical for the short-term dynamics of each variable, the cross-dependency interaction from distant causal history is responsible for its long-term memory.

Second, the dynamics of each component in a complex system can be indicated by characterizing the information from the interactions between self- and cross-dependencies in immediate and distant causal histories. This is complementary to the previous finding on the dynamics sustaining the whole system behavior. The previous conclusion depicts the dynamics maintaining the complexity at a system level. This conclusion, on the other hand, details the unique dynamics of each variable through the information characterization of the system's dynamics. In the analysis of stream solute dynamics, while the solutes have been widely and consistently found to have fractal behavior (Kirchner & Neal, 2013), the origin of each solute and how the solutes interact with each other differ. The different origins of solutes are reflected by the proportions of mixed redundant and unique information from self-dependency when the dependency of the flow rate is excluded.

5 Discussion and Conclusion

This paper presents an information flow framework to understand the whole system behavior arising from the multivariate interactions occurring between component dynamics. A fundamental need driving this framework is to develop approaches for understanding how interactions between the parts create emergent whole system behavior. Our study shows that the complexity or emergent dynamics, such as long-memory behavior, result from the multivariate interactions in the entire evolutionary dynamics of the system, or causal history.

Our approach blends the PID technique with the causal history analysis for characterizing the information flow to a target variable, from its self-feedback interactions and the cross-dependencies in both immediate and distant causal histories (see the top of Figure 2). While there are many ways to partition the causal history, we find that the proposed partitioning in terms of the self- and cross-dependencies in recent and earlier dynamics is a reasonable way to reveal the key aspects of interactive dependencies. First, the difference between the influences from immediate and distant causal histories illustrates the memory dependency of the system (Jiang & Kumar, 2019). Second, the strong self-feedback interaction observed in many systems suggests that its interplay with the dynamics of other variables might be one of the keys for determining the current state of each target variable.

Based on the analyses of the observed stream chemistry dynamics and the synthetic model, we find that information characterization differs from system to system, thus illustrating their different behavior. While the future trajectory of a short-memory system is dictated by its recent dynamics as shown for the logistic model, the dynamics of a long-memory system are mainly sustained by the influences from the self-dependency-dominated immediate causal history as well as the cross-dependency-dominated distant causal history. In other words, in a long-memory system, the self-feedback interaction in recent dynamics determines the recent trend of a target variable, and the influence from the distant causal history, on the other hand, guides the long-term evolution. In the analyses of the stream chemistry system, the consistent influence on long-term dynamics is evident from the strong unique information of the cross-dependency in distant causal history.

Both the structure of interaction between the variables (i.e., DAG) and the expression of dependencies between them through immediate and distant causal history interactions can be influenced by the presence of deterministic or regular patterns in the data. Examples include diurnal or seasonal cycles. The analyses with and without flow rate corrections allow us to examine the relative importance of such regular patterns. For the stream chemistry study, we see that seasonal flow variability impacts the measures as is evident through comparison of results with streamflow influence removed, although the analysis with the flow influence removed captures the long-memory persistence. Our approach, therefore, demonstrates that some care is needed in the interpretation of results when periodic or regular patterns are present.

Two key issues associated with estimating reliable information measures are the choice of k in the kNN estimator and the use of MIWTR for dimensionality reduction. In this study, we set k=5 throughout the paper to reduce the estimation bias (Frenzel & Pompe, 2007). The sensitivity of the estimates to the choice of k has been studied by using synthetic models (see Figures 4.2–4.6 in Runge, 2014). These studies show that setting k between 5 and 10 yields reliable causal strengths for CMI with dimensions around 10 and time series longer than 1,000. This analysis serves as a basis for using kNN with k=5 in this paper, where CMI with dimensions around 10 to 17 are estimated by using around 2,000 observational data points. In addition, MIWTR is developed to achieve an efficient estimation of different information measures (in equations (12)–(14)) by reducing the cardinalities of the condition set used in the estimation. WTR is better suited for simplifying a DAG than the traditional transitive reduction in that the weights or the strengths of edges are taken into account (Bosnacki et al., 2010). That is, the stronger an edge is, the less likely it is to be excluded. An example of computing the information flow in a quadvariate logistic model in the appendix with and without this approach establishes the feasibility of its usage. Estimation of information-theoretic measures using limited data size is a challenge. Through Figure A3, we show that the length of available data is adequate for such estimation and produces estimates consistent with data lengths an order of magnitude larger. However, much more research is required in this field, as the dimensionality grows quickly with increasing number of variables. We thus anticipate that the effectiveness of this approach will improve for more complex DAGs, such as those arising when more variables are observed and included in the graph construction to explore interdependencies.
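To make the role of k concrete, the following is a minimal pure-Python sketch of a k-nearest-neighbor conditional mutual information estimator of the Frenzel and Pompe type with k=5. It is not the code used in this study and not the Tigramite implementation. The function names (`digamma_int`, `knn_cmi`), the brute-force neighbor search, and the synthetic Gaussian test data are all assumptions made for illustration.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{i=1}^{n-1} 1/i."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def max_norm(a, b):
    """Chebyshev (maximum-norm) distance between two equal-length tuples."""
    return max(abs(p - q) for p, q in zip(a, b))


def knn_cmi(x, y, z, k=5):
    """Estimate I(X; Y | Z) in nats from samples given as lists of tuples.

    For each sample, eps is the max-norm distance to its k-th nearest
    neighbor in the joint (X, Y, Z) space; neighbors strictly closer than
    eps are then counted in the (X, Z), (Y, Z), and Z subspaces.
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        joint = sorted(
            max(max_norm(x[i], x[j]), max_norm(y[i], y[j]), max_norm(z[i], z[j]))
            for j in range(n) if j != i
        )
        eps = joint[k - 1]
        n_xz = n_yz = n_z = 0
        for j in range(n):
            if j == i:
                continue
            if max_norm(z[i], z[j]) < eps:
                n_z += 1
                if max_norm(x[i], x[j]) < eps:
                    n_xz += 1
                if max_norm(y[i], y[j]) < eps:
                    n_yz += 1
        total += digamma_int(n_z + 1) - digamma_int(n_xz + 1) - digamma_int(n_yz + 1)
    return digamma_int(k) + total / n


# Hypothetical test data: X and Y_ind share only the common driver Z,
# whereas Y_dep is a noisy copy of X and stays dependent on X given Z.
random.seed(0)
size = 200
z = [(random.gauss(0, 1),) for _ in range(size)]
x = [(zi[0] + random.gauss(0, 1),) for zi in z]
y_ind = [(zi[0] + random.gauss(0, 1),) for zi in z]
y_dep = [(xi[0] + random.gauss(0, 0.1),) for xi in x]

cmi_ind = knn_cmi(x, y_ind, z, k=5)
cmi_dep = knn_cmi(x, y_dep, z, k=5)
print(cmi_ind, cmi_dep)
```

In this sketch a smaller k gives a more local estimate, while a larger k averages over more neighbors. The brute-force search scales quadratically with the sample size, so a tree-based neighbor search would be preferable for series of the lengths discussed above.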

The approach presented here is fundamentally different from most existing information-based approaches, which focus either on pairwise interactions only or on interactions in a specific part of the system. This distinction sheds light on how the complex system dynamics are sustained over time, thus improving our understanding of the whole system dynamics and the role of individual components. This is especially helpful in the current age of big data. With the increasing availability of observations, these data-driven tools will provide more insights in different scientific domains. Such data-driven approaches will open up new avenues for investigating complex system dynamics.

Acknowledgments

Funding support from the following NSF grants is acknowledged: EAR 1331906, ACI 1261582, and EAR 1417444. We also thank Allison Goodwell for her comments that helped improve the manuscript. The directed acyclic graph for time series of the stream chemistry example is estimated by using the Tigramite package (Runge et al., 2012a; Runge, 2015; Runge et al., 2015; Runge et al., 2017). The codes for conducting momentary information weighted transitive reduction and calculating the information flows in the stream chemistry and logistic examples are available at GitHub (https://github.com/HydroComplexity/CausalHistory).

    Appendix A: Dimensionality Reduction Using MIWTR

    In the appendix, MIWTR is developed to reduce the number of nodes in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0245 for computation of equations (12)–(14). MIWTR builds on WTR by using MIT defined in equation (6) as the edge weight. Since MIT reflects the strength of direct coupling between a source and target, it serves as an excellent choice. We first provide the procedures of implementing MIWTR and then verify its feasibility through a quadvariate logistic model.

    A1. Method

    WTR builds on the transitive reduction (TR; Aho et al., 1972). For a DAG, TR is aimed at removing “redundant” edges while keeping the connectivity structure of the graph. It is anchored on the idea that a transitive reduced graph can be obtained by removing any directed edge from urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0246 in the original graph if there exists an indirect path connecting the two nodes. However, in a weighted graph, TR potentially removes some “important” edges that have large weights. To avoid that, WTR takes a step further by considering the weights of the edges in the reduction. That is, an edge linking nodes urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0247 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0248 is removed from the original graph if and only if there exists a stronger indirect path from urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0249 to urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0250. Otherwise, the edge is kept in the graph.

    Technically, WTR is defined as follows. Consider a DAG urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0251 with a set of nodes urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0252 and a set of edges E1. We define a path from node urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0253 to urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0254 as urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0255, where all the nodes and edges in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0256 are in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0257 and E1, respectively. Note that the corresponding causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0258 is the union of all the paths, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0259, from urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0260 to urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0261. The representative weight of the causal path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0262 is defined as the maximal transitive influence (Bosnacki et al., 2010) and is given by
    urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0263(A1)
    where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0264 denotes all the edges in the path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0265 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0266 is the weight of the edge urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0267. urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0268 is the maximum weight of all the minimum weights in each path urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0269. In the context of information interactions, it can be considered as the maximum allowable information flow in all the pathways linking two lagged variables in a complex system. In WTR, if and only if urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0270, then the directed edge urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0271 is removed. Consider an example graph in Figure A1. It consists of three nodes (i.e., A, B, and C) and three corresponding weighted edges (i.e., AC, AB, and BC). TR removes the edge AC due to the existence of the path ABC indirectly connecting A and C. Meanwhile, WTR keeps AC because the corresponding maximal transitive influence hAC=wAC=2. However, if the weight wAC is changed to 0.9, AC will be removed in WTR since now hAC=1>wAC=0.9.
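As a reading aid, the verbal definition of equation (A1) and the removal rule can be written out for the three-node example. The symbols follow the hAC and wAC of the example; the path-set symbol \mathcal{P}_{A \to C} (all directed paths from A to C, including the direct edge) is assumed notation and may differ from the original typesetting. The example values imply that the weaker of the two edges on the path ABC has weight 1.

```latex
h_{AC} = \max_{p \in \mathcal{P}_{A \to C}} \; \min_{e \in p} w_e
       = \max\{\, w_{AC},\ \min(w_{AB}, w_{BC}) \,\},
\qquad
\text{edge } AC \text{ is removed} \iff h_{AC} > w_{AC}.

% With \min(w_{AB}, w_{BC}) = 1:
%   w_{AC} = 2   : h_{AC} = \max\{2, 1\}   = 2 = w_{AC}  (AC is kept)
%   w_{AC} = 0.9 : h_{AC} = \max\{0.9, 1\} = 1 > w_{AC}  (AC is removed)
```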
    Illustration of transitive reduction and weighted transitive reduction (WTR) in a graph consisting of three nodes. (a) Transitive reduction removes the edge AC due to the existing path ABC. (b) WTR keeps the edge AC because the corresponding maximal transitive influence hAC in equation (A1) equals the weight wAC = 2. (c) When wAC is reduced to 0.9, WTR excludes the edge AC because now hAC=1>wAC=0.9.
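The three-node example can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the code in the authors' repository. The function names are hypothetical, and the weights wAB = wBC = 1 are assumed, since the text only implies that the weaker edge on the path ABC has weight 1. All removal decisions are evaluated against the original graph.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]
Weights = Dict[Edge, float]


def maximal_transitive_influence(weights: Weights, source: str, target: str) -> float:
    """Largest bottleneck (minimum edge weight) over all directed paths.

    The direct edge, if present, counts as a one-edge path. The graph is
    assumed to be acyclic, so the recursion terminates.
    """
    best = float("-inf")  # remains -inf when no path exists

    def walk(node: str, bottleneck: float) -> None:
        nonlocal best
        if node == target:
            best = max(best, bottleneck)
            return
        for (u, v), w in weights.items():
            if u == node:
                walk(v, min(bottleneck, w))

    walk(source, float("inf"))
    return best


def weighted_transitive_reduction(weights: Weights) -> Weights:
    """Drop an edge only if a strictly stronger path links its endpoints."""
    return {
        (u, v): w
        for (u, v), w in weights.items()
        if maximal_transitive_influence(weights, u, v) <= w
    }


# Assumed weights: w_AB = w_BC = 1 (only their minimum is implied by the text).
strong = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 2.0}
weak = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 0.9}

kept_strong = weighted_transitive_reduction(strong)  # AC survives (h_AC = 2)
kept_weak = weighted_transitive_reduction(weak)      # AC is dropped (h_AC = 1 > 0.9)
print(sorted(kept_strong))
print(sorted(kept_weak))
```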
    The key idea of MIWTR is to first remove the edges linking urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0272 and the immediate causal history urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0273 by using WTR and then remove the nodes in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0274 that are no longer connected to urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0275. Consider Zt in the DAG for time series urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0276 as the target and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0277 as the time lag for separating urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0278 into an immediate and a distant causal history. We now define a subgraph of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0279, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0280. The node set urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0281 includes the union of the immediate causal history and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0282, that is, urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0283. The edge set Es contains all the edges in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0284. The procedure for reducing the dimensionality of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0285 by using MIWTR is as follows.
    • Implement WTR to exclude edges in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0286, generating a new graph urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0287, where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0288 includes the edges remaining after the implementation of WTR on urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0289.
    • For each node urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0290, check whether there is an edge linking urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0291 to any node in the immediate causal history urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0292 based on the new graph urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0293. If there is no edge urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0294, remove urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0295 from urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0296.
    • Repeat removal of nodes in the previous step for every node in urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0297.
    • Return the reduced set urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0298.
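The steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation. The names `immediate` (nodes of the immediate causal history) and `boundary` (the distant-history nodes with edges into the immediate causal history, i.e., the set being reduced) are hypothetical, the node labels are made up, and the edge weights are placeholder numbers standing in for MIT values, which this sketch does not estimate.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]
Weights = Dict[Edge, float]


def maximal_transitive_influence(weights: Weights, source: str, target: str) -> float:
    """Max over directed paths of the minimum edge weight (DAG assumed)."""
    best = float("-inf")

    def walk(node: str, bottleneck: float) -> None:
        nonlocal best
        if node == target:
            best = max(best, bottleneck)
            return
        for (u, v), w in weights.items():
            if u == node:
                walk(v, min(bottleneck, w))

    walk(source, float("inf"))
    return best


def weighted_transitive_reduction(weights: Weights) -> Weights:
    """Drop an edge only if a strictly stronger path links its endpoints."""
    return {
        (u, v): w
        for (u, v), w in weights.items()
        if maximal_transitive_influence(weights, u, v) <= w
    }


def miwtr_reduce(weights: Weights, immediate: Set[str], boundary: Set[str]) -> Set[str]:
    """Return the boundary nodes that still point into the immediate history."""
    nodes = immediate | boundary
    # Subgraph restricted to the immediate history and the boundary set.
    sub = {(u, v): w for (u, v), w in weights.items() if u in nodes and v in nodes}
    # Step 1: apply WTR to the subgraph.
    reduced = weighted_transitive_reduction(sub)
    # Steps 2-4: keep a boundary node only if some edge into the immediate
    # causal history survives the reduction.
    return {b for b in boundary if any((b, m) in reduced for m in immediate)}


# Hypothetical graph with placeholder weights in place of MIT values.
weights = {
    ("Y[t-4]", "X[t-3]"): 0.5,
    ("X[t-3]", "X[t-2]"): 0.6,
    ("Y[t-4]", "X[t-2]"): 0.1,  # weaker than the path through X[t-3]
    ("X[t-2]", "X[t-1]"): 0.6,
}
immediate = {"X[t-2]", "X[t-1]"}
boundary = {"X[t-3]", "Y[t-4]"}

reduced_boundary = miwtr_reduce(weights, immediate, boundary)
print(reduced_boundary)
```

Here the direct edge from Y[t-4] to X[t-2] is weaker than the bottleneck of the path through X[t-3], so WTR removes it and Y[t-4] drops out of the reduced set.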

    Consider the DAG for time series in Figure 2 as an example. urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0299 in the orange nodes can be further reduced by excluding urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0300 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0301 if the edges urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0302 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0303 are removed by using MIWTR. A validity test for verifying the MIWTR-based reduction of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0304 in computing equations (12)–(15) is illustrated through a quadvariate logistic model in the appendix. We note that the MIWTR algorithm needs to be implemented for each distant/immediate causal history segmentation associated with every urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0305 for each target variable.

    A2. Verification of MIWTR: A Test on a Quadvariate Logistic Model

    This subsection aims to verify the feasibility of MIWTR in reducing the cardinality of the condition set urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0306 in equation (12). We computed and compared the information flow from the immediate and distant causal histories urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0307 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0308 as well as the corresponding PID in equations (12a)–(14), with and without MIWTR in a quadvariate logistic model. The model is given by
    urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0309(A2)
    where urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0310 is a uniform noise term and its coupling strength ϵ is set as 0.2.

    The procedure for computing the information flow is as follows. We first use the Tigramite package to construct the directed acyclic time series graph based on the synthetic data generated from equation (A2). Given the graph describing the causal history, MIWTR is employed to simplify the condition set urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0311 according to the procedure in section A1. urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0312 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0313 and their PIDs are then computed, with the partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0314 ranging from 5 to 50, using the k-nearest-neighbor method with k=5. The parameter settings of Tigramite are the same as in the stream chemistry analysis in section 4.1. Further, to analyze how the data length affects the performance of MIWTR, we compute the information flow with time series lengths 200, 400, 600, 1,000, 5,000, and 10,000.

    The cardinalities of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0315 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0316 with and without MIWTR are plotted in Figure A2a. It can be observed that the dimensions decrease with increasing length of synthetic data. This is because more training data allow a more reliable estimation of the directed acyclic time series graph. The estimated graph becomes stable when the data length is larger than 1,000, as indicated by the convergence of the decreasing cardinalities. Furthermore, for a given data length, we also observe significant drops of dimensions for both urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0317 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0318 due to the reduction of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0319 by using MIWTR. The reduced dimension of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0320 is around 10 for data lengths greater than 1,000.

    Plots of the cardinalities and the estimated information flow from distant ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0336) and immediate ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0337) causal histories of each variable in equation (A2) over different time lags urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0338, based on data length of 200, 400, 600, 1,000, 5,000, and 10,000. (a) The cardinalities of the urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0339 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0340 in equations (12a) and (12b), respectively, with (the solid lines) or without (the dashed lines) momentary information weighted transitive reduction (MIWTR) of each variable with the time series graphs constructed by using Tigramite. (b) The corresponding values of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0341 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0342 with and without MIWTR.

    The plots of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0321 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0322 with and without MIWTR are shown in Figure A2b. We can observe that the results with MIWTR (solid lines) and without MIWTR (dashed lines) converge and are close to each other, especially for data lengths greater than 1,000. Another visualization of using MIWTR with different data lengths is plotted in Figure A3. It can be observed that for each urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0323, both urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0324 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0325 estimated with 1,000 data points are consistent with estimates obtained by using longer data series, with only slight improvements when the data length increases from 1,000 to 10,000. Further, the comparison between each information measure with and without MIWTR shows consistency of estimation when MIWTR is used. This implies that in this quadvariate logistic model, the implementation of MIWTR in reducing the dimensions can ensure a reliable estimation of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0326 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0327 given enough time series data (>1,000).

    Plots of the estimated information flow from distant ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0343) and immediate ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0344) causal histories of X4 in equation (A2) over different time lags urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0345, based on data length of 200, 400, 600, 1,000, 5,000, and 10,000 and different partitioning time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0346. (a and b) The plots of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0347 with and without momentary information weighted transitive reduction (MIWTR), respectively. (c and d) The plots of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0348 with and without MIWTR, respectively. (Note that the black lines delineate the estimated information measures by using 1,000 data points.)

    The PIDs for urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0328 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0329 with MIWTR are plotted in Figures A4a and A5a, respectively. Both urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0330 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0331 contain dominant redundant information, which are urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0332 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0333. This illustrates the symmetric structure of the model in equation (A2). Also, the comparison between the information partitioning with and without MIWTR, in Figures A4b and A5b, shows that the differences are close to zero when more than 1,000 data points are used. This is consistent with the conclusion that the cardinality reduction based on MIWTR does not affect the estimation of information-theoretic measures significantly when the time series data are sufficient.

    Plots of the partial information decomposition (PID) of the information flow from distant causal history ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0349) of each variable in equation (A2) over different time lags urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0350, based on data length of 200, 400, 600, 1,000, 5,000, and 10,000. (a) The PID of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0351 based on momentary information weighted transitive reduction (MIWTR). (b) The difference between urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0352 without and with MIWTR.
    Plots of the partial information decomposition (PID) of the information flow from immediate causal history ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0353) of each variable in equation (A2) over different time lags urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0354, based on data length of 200, 400, 600, 1,000, 5,000, and 10,000. (a) The PID of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0355 based on momentary information weighted transitive reduction (MIWTR). (b) The difference between urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0356 without and with MIWTR.

    In the analysis of stream chemistry data and weather station data in section 4, the cardinalities of urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0334 and urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0335 of all the variables are reduced to be less than or around 20 by using MIWTR. Based on the quadvariate logistic model example, the associated estimations of information flows in Figure 7 are reasonable, because the corresponding time series lengths of the data (around 1,000–4,000) are sufficient to achieve reliable estimation.

    Plots of the cardinalities computing information flow from distant ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0357) and immediate ( urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0358) causal histories of each variable in equation A2 over the partition time lag urn:x-wiley:wrcr:media:wrcr24184:wrcr24184-math-0359, with (the solid lines) and without (the dashed lines) momentary information weighted transitive reduction (MIWTR), in the stream chemistry systems based on the time series graphs constructed in Figure 3.