Volume 55, Issue 7 p. 5830-5851
Research Article
Free Access

Predicting CO2 Plume Migration in Heterogeneous Formations Using Conditional Deep Convolutional Generative Adversarial Network

Zhi Zhong

Corresponding Author

Zhi Zhong

Bureau of Economic Geology, Jackson School of Geosciences, University of Texas at Austin, Austin, TX, USA

Correspondence to: Z. Zhong and A. Y. Sun,

[email protected];

[email protected]

Search for more papers by this author
Alexander Y. Sun

Alexander Y. Sun

Bureau of Economic Geology, Jackson School of Geosciences, University of Texas at Austin, Austin, TX, USA

Search for more papers by this author
Hoonyoung Jeong

Hoonyoung Jeong

Department of Energy Resources Engineering, College of Engineering, Seoul National University, Seoul, South Korea

Search for more papers by this author
First published: 11 June 2019
Citations: 109

Abstract

Numerical simulation of flow and transport in heterogeneous formations has long been studied, especially for uncertainty quantification and risk assessment. The high computational cost associated with running large-scale numerical simulations in a Monte Carlo sense has motivated the development of surrogate models, which aim to capture the important input-output relations of physics-based models but require only a fraction of the cost of full model runs. In this work, we formulate a conditional deep convolutional generative adversarial network (cDC-GAN) surrogate model to learn the dynamic functional mappings in multiphase models. The cDC-GAN belongs to a class of semisupervised learning methods that can be used to learn the data generation processes. Like the original GAN, a main strength of the cDC-GAN is that it includes a self-training scheme for improving the quality of generative modeling in a game theoretic framework, without requiring extensive statistical knowledge and assumptions on input data distributions. In particular, our cDC-GAN model is designed to learn cross-domain mappings between high-dimensional input (e.g., permeability) and output (e.g., phase saturations) pairs, with the ability to incorporate conditioning information (e.g., prediction time). As a use case, we demonstrate the performance of cDC-GAN for predicting the migration of carbon dioxide (CO2) plume in heterogeneous carbon storage reservoirs, which has both numerical and practical significance because of the safe storage requirements now mandated in many countries. Results show that cDC-GAN achieves high accuracy in predicting the spatial and temporal evolution patterns of the injected CO2 plume, as compared to the original results obtained using a compositional reservoir simulator. The performance of cDC-GAN models, trained using the same number of training samples, stays relatively robust when the level of spatial heterogeneity is increased. Our cDC-GAN is pattern based and is not limited by the underlying physics. Thus, it provides a general framework for developing surrogate models, and for conducting uncertainty analyses for a wide range of physics-based models used in both groundwater and subsurface energy exploration applications.

Key Points

  • A machine learning-based surrogate model is proposed for predicting CO2 plume migration
  • The method is based on conditional deep convolutional generative adversarial network (cDC-GAN)
  • cDC-GAN can facilitate the high-dimensional cross-domain learning and predict the CO2 saturation with high accuracy at any time instance

1 Introduction

Subsurface resource exploration and management represent one of the main application areas in which understanding of the formation heterogeneity is critically important for resource production optimization and risk management (De Silva et al., 2016; Luo et al., 2013). Historically, flow and transport in heterogeneous formations have been extensively investigated in the subsurface modeling community, under topics such as stochastic hydrogeology (Dagan & Neuman, 2005; Gelhar, 1993; Rubin, 2003), data assimilation and inversion (Oliver & Chen, 2011; Schöniger et al., 2012; Sun & Sun, 2015; Zhou et al., 2011), uncertainty quantification (UQ; Tartakovsky, 2013; Zhang, 2001), and multiobjective optimization (Costa & Nannicini, 2018; Müller et al., 2013; Queipo et al., 2005). A main driving force behind many of these existing efforts is the increasing need to develop distributed high-resolution simulation models, on the one hand, while properly accounting for uncertainties in model structures and parameters, on the other (Wood et al., 2011). Risk assessment and UQ conducted in the Monte Carlo sense typically require sampling a high-dimensional input space and running a large number of forward simulations, which is computationally expensive, especially for large-scale dynamic simulation models. As a result, a large number of surrogate modeling techniques have been developed to reduce the computational burden.

The basic idea behind surrogate modeling is to find an alternative and yet computationally efficient and accurate approximation of the input-output relations simulated in a large-scale dynamic model (Agarwal et al., 2014; Sun & Sun, 2015). In a slightly different definition, Lucia et al. (2004) defined the purpose of surrogate modeling as to “provide quantitatively accurate descriptions of the dynamics of systems at a computational cost much lower than the original numerical model and to provide a means by which system dynamics can be readily interpreted.” So far, a large number of surrogate modeling techniques have been developed in the literature. Examples include (a) kriging (Gaussian process regression) and its variants that use a covariance-based method to interpolate a model's response surface (Kleijnen, 2009; Marrel et al., 2008); (b) the classical supervised machine learning (ML) methods (e.g., artificial neural networks and support vector regression) that often use a combination of nonlinear basis functions to approximate the relations between the input and output; (c) the proper orthogonal decomposition (POD) methods that represent a system's dynamics using a set of orthogonal basis functions obtained through the eigenanalysis of system snapshots (Lucia et al., 2004); and (d) stochastic polynomial chaos expansion (PCE) methods that approximate input-output relations using an expansion of orthonormal polynomials (Xiu & Karniadakis, 2002). Methods such as the kriging and classic machine learning methods generally do not scale well for large-scale dynamic models, unless model inputs and outputs are reparameterized through dimension reduction (Jeong et al., 2018; Sun & Durlofsky, 2017; Swischuk et al., 2018). By design, POD methods find a reduced-order representation of a deterministic model through subspace projection and, thus, can be applied to large-scale dynamic systems. Application of POD to uncertain inputs, however, is nontrivial. In contrast, the PCE methods are designed to approximate stochastic partial differential equations (PDE) and treat the uncertain inputs as random variables; these methods exploit regularity in the dependence of model outputs on the uncertain model inputs by solving a forward problem at a finite number of realizations of the random inputs (Huan & Marzouk, 2013; Ma & Zabaras, 2009; Sun et al., 2018). The number of polynomial terms required for accurate PCE approximation, however, increases rapidly with the number of stochastic dimensions and the order of expansion. Thus, reparameterization of model inputs using dimension reduction techniques is still necessary to constrain the number of stochastic dimensions (Li & Zhang, 2007; Ma & Zabaras, 2009; Sun et al., 2013; Zhang et al., 2015; Zeng et al., 2016). For example, Karhunen-Loève expansion (KLE) or principal component analysis (PCA) is commonly used as a parameterization technique to reduce the dimensionality of random fields (Zhang & Lu, 2004).

In the last several years, the generative adversarial network (GAN) models have attracted wide attention in the artificial intelligence community (Arjovsky et al., 2017; Goodfellow et al., 2014; Isola et al., 2017; Mirza & Osindero, 2014). The original GAN (often called the vanilla GAN) introduced by Goodfellow et al. (2014) is a type of generative model that can generate samples following the distribution of input data (i.e., training samples). Specifically, the design of the vanilla GAN is set in a game theoretic framework that involves two competing players, a generator and a discriminator. The job of the generator is to transform samples from a low-dimensional latent space to samples of the variable of interest, which may potentially exist in a high-dimensional space. The discriminator is a classifier that tries to tell whether a sample is from the generator (fake sample) or from the training data (real sample). By training the generator and discriminator adversarially, GAN builds a generator that can create high-quality fake samples that cannot be distinguished from the real data by the discriminator. In essence, the GAN models include a self-training generative modeling mechanism, providing an attractive alternative for learning the input data distribution without requiring extensive statistical knowledge (e.g., needed for specifying parametric distributions) and code modifications from the end users. With the rapid advance in deep learning training techniques and computing hardware in recent years, the capacity of deep learning networks has also increased significantly, showing superior performance in learning hierarchical feature representations in various image analysis problems. Recognizing the strong linkage between image analyses and the input-output fields generated by distributed models, significant interests now exist in replicating the success of GAN achieved in learning the cross-domain image patterns to learning complex input-output dynamic mappings embedded in large-scale numerical models.

In subsurface modeling, GAN has recently been used to generate stochastic realizations of facies or permeability field from multimodal, high-dimensional distributions (Chan & Elsheikh, 2017; Dupont et al., 2018; Laloy et al., 2017), which is challenging for conventional generative modeling approaches used in geosciences. For example, the Markov chain Monte Carlo (MCMC) method has long been used as a generative model to approximate probability distributions of random variables, but the standard MCMC algorithms are mainly suitable for relatively low-dimensional parameters (Cotter et al., 2013). In Laloy et al. (2017), a spatial GAN is trained to generate 2-D and 3-D unconditional realizations of random parameter fields and then MCMC is applied on a low-dimensional latent space for Bayesian inversion. Alternatively, using the end-to-end (or image-to-image) generative models, GANs can be trained directly to learn dynamic mappings between a pair of high-dimensional model input and output domains. In Sun (2018), a state-parameter identification GAN (SPID-GAN) is trained to learn the forward and reverse mappings between the high-dimensional model parameters and model states and is demonstrated for a single-phase, groundwater flow problem; the forward mapping learned by the GAN is essentially a surrogate model of the forward simulator. In Zhu and Zabaras (2018), a fully convolutional encoder-decoder network is designed to capture the complex forward mapping between the high-dimensional input (permeability) and output fields (pressure) through end-to-end learning. On the basis of Zhu and Zabaras (2018), a deep convolutional encoder-decoder neural network is used in Mo et al. (2018) to develop surrogate models of dynamic multiphase flow models. In general, these recent studies suggest that the new image-to-image deep learning methods are promising, yielding impressive results in terms of predictive performance and uncertainty modeling, even with limited training data. The existing works either focused on learning the bidirectional input-output mappings at a single simulation time step (Sun, 2018) or did not leverage the strengths of GAN for cross-domain learning (Laloy et al., 2017; Mo et al., 2018; Zhu & Zabaras, 2018). In this work, we formulate a conditional deep convolutional GAN (cDC-GAN) to learn the dynamic mappings between high-dimensional model inputs and outputs in a multiphase model and then apply the cDC-GAN to predicting CO2 plume migration in heterogeneous carbon storage reservoirs.

Carbon capture and storage (CCS) is being investigated globally as a geoengineering technology for helping transition from the current fossil fuel dominant economy to a low-carbon economy. Leakage from geological carbon storage reservoirs is a priori nonzero because of the existence of natural faults and/or abandoned wells (Lewicki et al., 2007; Sun et al., 2013, 2018). Thus, high-fidelity simulation models, in conjunction with comprehensive site characterization and monitoring, are required to predict the long-term fate of the injected CO2 and to demonstrate the secure containment of the CO2 plume, with reasonable consideration of site uncertainty. Simulating CO2 flow and transport behavior in porous media is difficult because of the interplay among phase change, composition, and reservoir heterogeneity (Doughty & Pruess, 2004; Jiang, 2011; Zhong & Carr, 2019). The computational costs associated with simulating these aspects of CCS can be prohibitive, necessitating the use of surrogate models. Although a large number of surrogate modeling studies have been conducted for CCS in the context of risk assessment, sensitivity analysis, UQ, and monitoring network design (Dai et al., 2018; Jeong & Srinivasan, 2016, 2017; Keating et al., 2016; Oladyshkin et al., 2011; Pawar et al., 2015; Sun et al., 2013, 2018), development of high-fidelity surrogate models remains a challenging subject in the high-dimensional decision space.

In the following, we first present the design of cDC-GAN and then demonstrate its performance in training a surrogate model for predicting the CO2 plume migration in heterogeneous formations. Although our application area in this study focuses on CCS, the deep-learning-based approach proposed here has practical implications for many other surface and subsurface modeling problems that call for the use of high-fidelity surrogate models. This paper is organized as follows. In section 2, the general GAN framework is briefly reviewed and the formulation of cDC-GAN for dynamic surrogate modeling is described. In section 3, CO2 injection into a hypothetical brine aquifer is considered and the cDC-GAN is used to predict the shape of CO2 plumes at different times. Finally, conclusions are provided in the last section.

2 Material and Methods

2.1 GAN

The vanilla GAN consists of a generator and a discriminator that are in competition with each other. Let urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0001 denote a generator for a random variable (vector) x, with z and θg representing its input and model parameters, respectively. Let urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0002 denote the distribution of the outputs produced by the generator. Similarly, let urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0003 denote a discriminator that takes x as input and is specified by a set of parameters θd. The goal of the generator urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0004 is to learn the distribution of the training data pdata(x) and to generate samples that are as genuine as possible, namely, making urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0005 as close as possible to pdata(x). The goal of the discriminator urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0006 is to determine whether a sample x is generated from pdata(x) or from urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0007, and to assign probabilities accordingly, namely,
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0008(1)

The training of GANs typically involves solving a minimax optimization problem in the game theoretic framework (see subsections below). At convergence, the discriminator is maximally confused, and cannot distinguish fake samples produced by the generator from the real data from pdata(x), meaning urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0009 predicts with a probability of 0.5 for all inputs (Goodfellow et al., 2014).

Sampling directly from urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0010 can be difficult when the dimensionality of input space is high. In the vanilla GAN proposed by Goodfellow et al. (2014), the generator learns a mapping from a latent space z to the sample space of x in order to generate samples from urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0011:
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0012(2)
where I is an indicator function that assumes a value of 1 if urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0013 and is 0 otherwise. The approach exploits the idea that the data-generating process concentrates near a low-dimensional region (or manifold), and the representation of which can be efficiently learned by the generator. Note that the notion of transformation mapping from a latent space to a high-dimensional sample space is not new and is actually the basis of many random field generators and dimension reduction methods, such as the PCA and kernel PCA (Ma & Zabaras, 2011; Sarma et al., 2007).

Existing GANs approach the low-dimensional representation problem in two ways: (a) training a GAN to learn a latent space to data space mapping (Goodfellow et al., 2014; Laloy et al., 2017) or (b) training a GAN to learn a high-dimensional image-to-image (or end-to-end) mapping, but implicitly assuming that a low-dimensional representation exists and can be efficiently learned by the GAN (Isola et al., 2017; Sun, 2018; Zhu et al., 2017). We follow the latter approach in this work (see also section 2.2).

A critical issue in the design of GAN algorithms is how to measure the distance between the distribution urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0014 and pdata(x) accurately and reasonably. In terms of distance between two probability distributions, an inappropriate measure may lead to mode collapse (Arjovsky et al., 2017). In the subsections below, we briefly introduce the Kullback-Leibler (KL) divergence and Jensen-Shannon (JS) divergence measures, which are used in training the vanilla GAN, followed by a description of the optimization problem used to train urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0015 and urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0016. For clarity, we shall omit the dependence on GAN parameters θg and θd in the following discussion where no confusion should occur.

2.1.1 KL and JS Divergence

The KL divergence DKL(pq) is a measure of how a probability distribution p(x) differs from a reference distribution q(x). One of the most important properties of KL divergence is that DKL(pq) ≥ 0, where the equal sign holds if and only if the two distributions in question are identical. If xX is a discrete random variable, DKL(pq) is defined by
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0017(3)
When x is a continuous random variable, DKL(pq) is defined by the following integral:
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0018(4)
By definition, the KL divergence is asymmetric. An alternative to the KL divergence is the JS divergence, which may be considered a symmetric and smoothed version of the KL divergence and is defined by
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0019(5)
where urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0020. JS divergence varies in [0,1], with a value of 0 indicating that the two distributions in question are identical.
Assume we have a set of M training samples x1,x2,…,xM that follow the data distribution pdata(x), and let urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0021 denote the probability of a sample xi actually coming from the generator output distribution urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0022. The likelihood function of all training samples is given by
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0023(6)
which gives the probability of all M samples being from the generator distribution urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0024. The greater the value of L is, the higher the chance that urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0025 approaches pdata(x). To maximize the likelihood, the following optimization problem is solved:
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0026(7)
It can be shown that the maximum likelihood estimation problem in equation 7 is equivalent to the following minimization problem on the KL divergence (Goodfellow et al., 2014):
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0027(8)
which would recover pdata(x) exactly if it lies within the family of distributions covered by the generator distribution, urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0028 (Goodfellow et al., 2014).

2.1.2 Loss Function of GAN

The vanilla GAN solves a minimax problem shown below (Goodfellow et al., 2014)
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0029(9)
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0030(10)
where urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0031 is the loss function and urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0032 denotes the expectation operator. Solution of equation 9 involves maximization of discriminator parameters θd in the inner loop and minimization of generator parameters θg in the outer loop. In a game theoretic framework, it can be shown that learning in the sense of equation 9 resembles minimizing the JS divergence between the data and the model distribution, and the minimum of the loss function in equation 9 is achieved if and only if urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0033, at which point the theoretical value of the optimum is urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0034 (Goodfellow et al., 2014). In actual implementation, a two-step process is often taken to train the generator and discriminator networks iteratively, in which the parameters of the generator are fixed when parameters of the discriminator are being optimized, and vice versa (Goodfellow et al., 2016).

2.2 The cDC-GAN

The vanilla GAN has demonstrated exceptional performance in solving certain unsupervised and semisupervised learning problems (see a recent review by Goodfellow, 2016). However, training of the vanilla GAN is challenging due to the lack of constraints, mode collapse (i.e., the generator produces very similar samples for different inputs), and the discriminator converging too quickly to zero (Goodfellow, 2016; Isola et al., 2017; Mirza & Osindero, 2014). Those drawbacks motivate the development of a number of variants of the vanilla GAN, which are proposed to overcome the shortcomings of the vanilla GAN for different application domains, such as image-to-image translation (Isola et al., 2017), image segmentation (Long et al., 2015), video generation (Baddar et al., 2017; Vondrick et al., 2016), and cross-domain learning (Zhu et al., 2017).

Instead of just using a vector z (which may denote either a latent space vector or a high-dimensional image in the input domain), the conditional generative adversarial network (cGAN) proposed by Mirza and Osindero (2014) also facilitates the inclusion of additional conditions as inputs to train the generator, which is shown by the authors to improve the convergence of GAN significantly. Most of the state-of-the-art GAN models use convolutional layers as building blocks, which represent the inputs as a hierarchy of feature maps. For example, in Radford et al. (2015) a deep convolutional GAN (DC-GAN) is proposed to generate images from latent vectors by applying autoencoding and autodecoding techniques. The encoder-decoder design, together with other deep learning constructs (e.g., batch normalization layers), helps the generator and discriminator to learn downsampling and upsampling operations and improves the training stability of GAN models (Radford et al., 2015).

In this work, we adopt a conditional deep convolutional GAN (cDC-GAN) architecture that combines the strengths of cGAN and DC-GAN for image-to-image mapping. The loss function of the vanilla GAN in equation 10 may be converted to a conditional form to accommodate any additional information that is helpful for training a GAN (Mirza & Osindero, 2014),
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0035(11)
where the conditioning data y may represent auxiliary information in either scalar or vector forms.

In the cDC-GAN design, the generator urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0036 and discriminator urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0037 share a similar structure as used in the pix2pix (i.e., image-to-image mapping) work (Isola et al., 2017), which includes a series of convolutional and deconvolutional layers to help discover high-level features at multiple scales (see Figure 1). For our demonstration case study, the sizes of input and output images are both set to 128×128. We assume that the permeability field represents the main source of uncertain model input. The input data (z) to cDC-GAN is thus a permeability map (illustrated as the filled contour map under Input in the upper-left corner of Figure 1), and the target data (x) includes simulated CO2 saturation maps at different times (note the terms plume and saturation map are used interchangeably in the following discussion). To enable the GAN-based surrogate model to predict model outputs at different steps, we use the time step as conditioning data (y) during training, which is represented as a constant valued image, as shown by the solid color map under Input in Figure 1. In this case, the time step provides additional information for cDC-GAN to learn the input-output dynamic mappings.

Details are in the caption following the image
Design of cDC-GAN. The number of filters of each convolutional layer is shown in the green boxes, while the number of filters of the deconvolutional layer is twice the corresponding convolutional layer connected by the dashed line. The size of feature map corresponding to each level is shown on the left side of this figure. Input data is permeability map, output is simulated CO2 saturation map, and the time associated with each output map is provided as conditioning data. The generator performs image-to-image translation, while the discriminator is a classifier that gives the probability that the generator output is from the data distribution.
The design of the generator follows the U-net design originally proposed by Ronneberger et al. (2015). The kernels for both convolutional layers (used in the downsampling path) and deconvolutional layers (used in the upsampling path) are 4×4 with a stride size of 2. The number of filters for different layers increases from 256 to 2,048 in the downsampling path, allowing the encoder to learn low-level, fine-grained feature maps. The flow is then reversed in the upsampling path, where the coarse-grained feature maps from the decoder are combined with fine-grained feature maps from the encoder through skip connections (indicated by the dashed line in Figure 1). The leaky rectified linear unit function (leakyReLU) with a leaky value of 0.02 is used as the activation function for all hidden layers, namely,
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0038(12)

The output layer uses the hyperbolic tangent function (tanh) as the activation function to recover continuous values, and generates an output image having the same size as the input image.

During training, the generative model and discriminative model are optimized iteratively in alternating steps. Batch normalization is used on both generative and discriminative models to stabilize training (Salimans & Kingma, 2016). In this work, we apply alternating optimization of the discriminator network urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0039 and generator network urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0040 by five steps and one step, respectively. In other words, in each epoch, five iterations are used for optimization of the generator parameters, while a single iteration is used for optimization of the discriminator parameters. The stochastic gradient descent (SGD) solver is adopted with the reduced learning rate option to avoid numerical oscillation problems during training. We implemented our models using the open-source deep learning package, PyTorch (https://pytorch.org/).

2.3 Performance Metrics

Two metrics are used to quantify the performance of the cDC-GAN. The structural similarity index (SSIM) commonly used in image analysis (Wang et al., 2004) provides a measure of “perceptual difference” between two images. For two sliding windows u and v, the SSIM is defined by
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0041(13)
where μu and μv are the mean values of windows u and v, σu and σv are the standard deviation of windows u and v, and σuv is the covariance of windows u and v. C1=0.01 and C2=0.03 are constants. In this study, the sizes of sliding windows u and v are both set to 11×11 pixels (grid cells).
Root-square-mean error (RMSE) is another commonly used metric and is defined by
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0042(14)
where N is the number of samples and xi and urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0043 are the true and GAN-generated images, respectively.

2.4 Multiphase Flow Governing Equations

In this work, cDC-GAN is used to develop surrogate models for CO2 flow and transport in heterogeneous formations. For the geological carbon sequestration setting considered in this study, we assume that CO2 is injected into a brine aquifer. We also assume, without loss of generality, that the fluid flow system consists of two components (water and CO2 gas) and two phases (liquid w and gas g). Our starting point is Darcy's law for multiphase flow, which prescribes the relationship among fluid flux, reservoir properties, fluid properties, and phase pressure in a multiphase system:
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0044(15)
where the subscript α denotes phase, qα is the phase flux, k is absolute permeability, kr,α is relative permeability, ρα and μα are the fluid density and viscosity, Pα is fluid pressure in phase α, g is the gravitational constant, and z is the reservoir depth. The corresponding flow equations for the two-phase system are given by the following PDEs:
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0045(16)
where ϕ is porosity, Sα is saturation, Sg+Sw=1, qf,α denotes the source/sink terms in each phase, and the fluid pressures for the two-phase system are related through the capillary pressure Pc,w
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0046(17)
We assume that CO2 from the gas phase can dissolve in the water phase, but dissolution of water in the gas phase is neglected (i.e., the gas phase contains only one component). Mass transport for component κ is governed by the following advection-dispersion equation:
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0047(18)
where urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0048 is the mass fraction of component κ in phase α, Dα is the diffusion coefficient, τα is tortuosity, and fκ denotes the sink/source term for component κ. We used the commercial compositional reservoir simulator CMG-GEM (https://www.cmgl.ca/gem) to solve the miscible flow problem described herein.

3 Results and Discussion

3.1 Experiment Setup

We consider a 2-D hypothetical carbon storage aquifer with spatially heterogeneous reservoir properties. The model dimensions are 1,280 m × 1,280 m, with a uniform lateral grid block size of 10 m × 10 m and a layer thickness of 20 m. The aquifer is confined by overlying and underlying seals (i.e., no-flow boundary in the direction perpendicular to the aquifer). Infinite acting boundary conditions (Dirichlet) are imposed on all lateral sides of the aquifer. The initial reservoir pressure is 11 MPa, and the reservoir is at a constant temperature of 45 °C. A CO2 injection well is located in the center of the aquifer at grid block location (64, 64), with a constant injection rate of 5 × 105 m3/day (at standard surface condition) and is constrained by the maximum bottom-hole pressure of 3 × 104 kPa. The total simulation time is 380 days. To generate CO2 saturation maps for surrogate model training and testing, outputs from 22 time steps are saved, first from the 15- to 180-day period in 15-day intervals and then from the 180- to 380-day period in 20-day intervals. Detailed geological parameters are also listed in Table 1.

Table 1. Parameters Used in the 2-D Reservoir Model
Parameter Value Parameter Value
X×Y×Z 1,280 m × 1,280 m × 20 m Reference pressure 11 MPa
δx×δy×δz 10 m × 10 m × 20 m Reservoir temperature 45 °C
Nx×Ny×Nz 128×128×1 CO2 injection rate 5×105 m3/day
For the base case, the permeability field is assumed to follow a log-normal distribution, with a mean value of urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0049 (i.e., a geometric mean of 100 mD) and standard deviation of 1.0. The porosity map is related to permeability k via the following relationship
urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0050(19)

A total of 1,000 realizations of the permeability field is generated using the sequential Gaussian simulator (sgsim) from the open-source package SGeMS (Remy et al., 2009). For the base case, a Gaussian variogram model is used, where the azimuth angle is zero (i.e., the major axis of anisotropy is parallel to the positive y direction) and the correlation lengths are 50 and 25 grid blocks in the major and minor variogram directions, respectively.

To test the learning capacity of cDC-GAN under different amount of training data, we train separate cDC-GAN models using an ensemble of 200, 400, 600, and 800 permeability realizations, respectively, as input data. For each permeability realization, simulated CO2 saturation maps from 22 time steps are used as training targets, and the corresponding time step information is used as conditioning data (see also discussions under section 2.2). The log-transformed, input permeability field ( urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0051) is normalized to the interval [0,1] before training. Figure 2 shows a single realization of the permeability field and the corresponding urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0052 histogram. Note the histogram is not exactly Gaussian in this case because of the large correlation lengths (relative to the domain sizes) used. All trained models are tested by using a separate set of 200 permeability realizations not included in training to evaluate their performance.

Details are in the caption following the image
(a) An example permeability field realization (before log-transformation) used for training the conditional deep convolutional generative adversarial network (cDC-GAN) and (b) the corresponding histogram of the log-transformed permeability field.

3.2 Performance Evaluation

3.2.1 Training and Testing Performance

Figure 3 shows the value of cDC-GAN training loss as a function of epochs and for training ensemble sizes of 200, 400, 600, and 800, realizations. All of the cDC-GAN models are trained on a cluster node equipped with NVIDIA GeForce GTX 1080 Ti GPU for a total of 4,000 epochs. Here an epoch is defined as a single pass of the entire training set to the solver. As Figure 3a shows, the RMSE values of the generative models and discriminative models start to stabilize after 400 epochs in all cases. The value of the generative model loss converges to 0.9, and the value of the discriminative model loss converges to 0.7. Note also the convergence of the discriminative model is much faster than that of the generative model because of its simpler model structure. The inset of Figure 3a shows an amplified view of the loss function from epochs 200 to 600.

Details are in the caption following the image
(a) Training loss (RMSE) decreases with the number of epoch but generally becomes stable after 400 epochs. (b) Comparison of training time taken to train cDC-GAN with a training ensemble size of 200, 400, 600, and 800 realizations, respectively. (c) Violin plot of SSIM for the testing performance of cDC-GANs trained using different training sample sizes; the short bars (from top to bottom) represent the maximum, mean, median, and minimum SSIM, respectively. RMSE = root-mean-square error; cDC-GAN = conditional deep convolutional generative adversarial network; SSIM = structural similarity index.

Figure 3b shows training time for different cDC-GAN models. In general, a nonlinear relationship exists between the size of training data and training time (computing cost). The training time increases from 1.7 to 4.5 hr, as the size of training set increases from 200 to 800 realizations. The cDC-GAN model trained on 200 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0054 realizations achieved a mean SSIM value of 0.957 on the testing set (Figure 3c). For the same testing set, the mean SSIM value increases slightly when the training sample size is increased from 200 to 800 realizations. The highest mean SSIM value of 0.988 is achieved by the cDC-GAN model trained on 800 realizations (Figure 3c). With the increase of the training samples size, the prediction accuracy also increases, but at the cost of an almost exponential increase in training time. For the rest of this section, we choose the cDC-GAN model trained on 600 realizations as the base model, which strikes a reasonable balance between the computational time and prediction accuracy for our study. Unless otherwise specified, results reported below pertain to this base model.

We now use one realization from the testing set to exemplify how cDC-GAN works. Figure 4 illustrates the temporal evolution of CO2 saturation maps simulated by CMG-GEM at all 22 output times for the reference realization being used as the example. In Figure 4, the second to last subplot shows the normalized reference log-permeability field, and the last subplot on the last row shows the histogram of the normalized urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0056 field. Results show that the CO2 plume grows with time, but the plume shape is not symmetric because of the heterogeneity of the reservoir. The highest CO2 saturation is found at the center of the field because of the constant-rate CO2 injection there. The CO2 plume tends to migrate along the higher permeability direction, leading to elongated plume shapes along the south-north direction in this case.

Details are in the caption following the image
CO2 saturation maps at 22 different time steps as simulated by CMG-GEM for a reference permeability field (normalized urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0053), which is shown in the second to last subplot in the last row. The last subplot shows the corresponding histogram of the reference permeability field.

Figure 5 shows the CO2 plumes generated by the trained cDC-GAN model (i.e., the surrogate model) for the same reference permeability field as shown in Figure 2. In general, the surrogate model captures the main features of the simulated CO2 plume over time, also showing an elongated pattern in the south-north direction. The last subplot in the last row of Figure 5 shows the SSIM value calculated between the simulated and cDC-GAN-generated CO2 saturation maps, which decreases from a value close to 1.0 to 0.96 with time. At larger transport times, the CO2 plume “experiences” more heterogeneity, and its shape becomes more difficult to predict because of the larger degree of freedom. Another reason is related to the design of SSIM metric, which is related to the mean and variance of the sliding windows (equation 13). As Figures 4 and 5 show, most of the pixel values located outside the plume are zero at early times (e.g., at 15 days), the resulting SSIM value is high because of the “dilution” effect of zero-valued cells. At later times, the number of zero-valued pixels decreases as the plume size increases. Overall, the cDC-GAN does a relatively good job in approximating the shape and value of CO2 plumes. The example also suggests that the use of time step as the conditioning information is instrumental for helping the cDC-GAN to learn the cross mappings between the static permeability field and dynamic model outputs.

Details are in the caption following the image
Snapshots of CO2 saturation maps at different time steps obtained by using the conditional deep convolutional generative adversarial network (cDC-GAN) surrogate model trained using 600 training realizations. The second to the last subplot on the last row shows the permeability field (normalized urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0055) used to train the cDC-GAN, and the last subplot on the last row shows the structural similarity index (SSIM) between the CO2 saturation map generated by cDC-GAN and that simulated by CMG-GEM.

To further investigate the difference between saturation maps simulated by CMG-GEM and that generated by the cDC-GAN, Figure 6 plots the residual maps between the two sets of results at selected time steps. As Figure 6 suggests, the cDC-GAN model achieves a high accuracy for predicting the CO2 plume, with most of the error residuals close to 0.

Details are in the caption following the image
Residual error maps between CMG-GEM and cDC-GAN CO2 saturation maps at selected time steps. Sg denotes the CMG-GEM results, urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0057 denotes the cDC-GAN results, and urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0058 corresponds to the difference between each pair of maps. Note the same color bar is applied to all panels.

Figure 7 plots the SSIM value for the entire testing ensemble consisting of 200 realizations. In general, we observe similar temporal patterns as we have seen in the example case—as time progresses, the SSIM values tend to decrease. The ensemble of SSIM curves tends to have a larger spread with time, as shown by the different SSIM histograms in Figures 7b1–7b6 corresponding to time steps 30, 75, 135, 200, 280, and 380 days.

Details are in the caption following the image
Plots of structural similarity index (SSIM) values obtained on the testing ensemble of 200 realizations. Insets show histograms of SSIM at 30, 75, 135, 200, 280, and 380 days.

In addition to SSIM, we also performed the statistical moment analysis on the testing ensemble. Statistical moment analysis is typically used to measure the quality of surrogate modeling against Monte Carlo simulations conducted using the actual models. In Figure 8, the ensemble mean and standard variation of CO2 saturation maps generated by cDC-GAN are compared to those simulated using CMG-GEM for the testing ensemble. The first row of Figure 8 shows the mean (μ) and standard variation (σ) of CO2 saturation simulated by CMG-GEM at time steps of 105 and 340 days, respectively. The second row of Figure 8 shows the results of statistical moment analysis obtained for a horizontal cross section a–a at time steps of 105 and 340 days, and the third row shows the results obtained for a south-north cross section b–b for the same output times. The ensemble mean CO2 saturation obtained by the surrogate model (dashed line) overlays almost exactly on top of that obtained by the CMG-GEM simulations (solid line with asterisks). The ensemble standard deviation shows a close match at 105 days, although slight deviations near the σ peaks can be observed in the near field of the injector at 340 days. An interesting observation from Figure 8 is that the matching quality in the far field stays relatively unaffected. This indicates that the surrogate model achieves good performance in tracking the plume front, which is especially encouraging for CCS applications because of the need to demonstrate closure (i.e., safe containment) of the injected CO2 in CCS projects. To highlight the dynamic change of μ and σ of CO2 saturation along injection time, μ and σ at grid block of (60, 40) and (100, 50) are plotted in Figure S1 in the supporting information. A close match between CMG-GEM simulation result and cDC-GAN prediction result at grid blocks (60, 40) and (100, 50) indicates that the cDC-GAN can predict the CO2 plume at any injection time. From a deep learning design perspective, the use of GAN in this case plays an important role in helping to reduce blurriness in the generated images (i.e., the saturation maps in this case; Isola et al., 2017). Our results also suggest that cDC-GAN may be used as a reasonable alternative to full numerical model Monte Carlo simulations for UQ tasks.

Details are in the caption following the image
Statistical moment analysis of CO2 saturation maps (Sg) obtained by surrogate model and numerical model on the testing ensemble of 200 realizations. Top row: (a) mean CO2 saturation simulated by CMG-GEM at 105 days, (b) standard deviation of CO2 saturation at 105 days, (c) mean CO2 saturation at 340 days, and (d) standard deviation of CO2 at 340 days. Middle row: from left to right, ensemble mean and standard deviation obtained at 105 and 340 days using surrogate model (dashed line) and CMG-GEM simulations (solid line with asterisks) at the cross section a–a in a, b, c, and d. Bottom row: from left to right, ensemble mean and standard deviation obtained at 105 and 340 days at the cross section b–b in a, b, c, and d.

3.2.2 Interpolation Capability

In the previous example, the cDC-GAN model was trained by using model outputs from all 22 time steps for each input permeability field in the training set (i.e., 600×22 target training samples). In practice, it is important for a surrogate model to be able to predict model outputs at arbitrary intermediate simulation time steps. To assess such interpolation ability of the cDC-GAN, here we retrain the cDC-GAN model by using only a subset of all 22 time steps. Specifically, we leave time steps (45, 105, 150, 200, 260, 320, and 360 days) out during training. After the cDC-GAN is trained, we test the trained model on predicting the CO2 plume at 45, 105, 150, 200, 260, 320, and 360 days (the simulation results are already archived) to evaluate its interpolation ability. Figure 9 shows the comparison between the predicted CO2 saturation field (Sg) and simulated CO2 saturation field ( urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0059) for time steps used for this interpolation test. Results show that the cDC-GAN does a satisfactory job in approximating CO2 saturation fields at different “new” time steps not used during training, suggesting its strong interpolation skill and the consistency of the cDC-GAN performance. Thus, the cDC-GAN model can be used as an efficient surrogate model for dynamic systems.

Details are in the caption following the image
Test of conditional deep convolutional generative adversarial network (cDC-GAN) interpolation ability at time steps not included in training (45, 105, 150, 200, 260, 320, and 360 days). (Top row) CO2 saturation field simulated by CMG-GEM (Sg), (middle row) CO2 saturation field urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0060 generated by cDC-GAN, and (bottom row) differences between the top and middle rows. The cDC-GAN model is trained using 600 permeability realizations.

3.2.3 Model Comparison

Here the performance of our cDC-GAN surrogate model (for the base case) is evaluated against a naïve surrogate model, the mean-plume predictor, that is obtained by simply taking the ensemble average of CMG-GEM CO2 saturation maps obtained for the training set at all output times. We then evaluate the performance of this mean-plume predictor on all test realizations by calculating the difference (residual) between CO2 saturation maps of the mean-plume predictor and that obtained by the CMG-GEM runs (note that no extra model runs are needed for this comparison because all forward simulations have already been completed for the training and testing sets).

The top panel in Figure S1 shows the CO2 saturation maps of the mean-plume predictor for time steps 40, 105, 150, 200, 260, and 340 days. The bottom panel in Figure S1 shows the residual maps (ΔSg) obtained by applying the mean-plume predictor on all test realizations. As the color bar in Figure S1 (bottom panel) suggests, the residual is the greatest (∼0.2) near the injector and then gradually decreases toward the edges. For comparison, Figure S1 shows the same predictions obtained by using the trained cDC-GAN on the testing set. The bottom panel of Figure S1 suggests that the residual ΔSg obtained by cDC-GAN is almost zero near the injector and only becomes visible around the plume edges but is generally less than 0.03. This comparison highlights the need for developing an accurate surrogate model, especially when model parameters are spatially heterogeneous.

3.2.4 Computational Cost

In Figure S1, we compare the total computational time of using CMG-GEM to that of using the cDC-GAN surrogate model for various number of realizations. In this case, each numerical simulation takes about 10 min to finish by running CMG-GEM. As Figure S1 shows, CMG-GEM and cDC-GAN take the same amount of time for the first 600 simulations, because the training data of cDC-GAN model come from running the CMG-GEM simulations. Then, training of the cDC-GAN takes about 2.5 hr, which is equivalent to another 15 CMG-GEM runs (assuming 10 min for each run). After training, the cDC-GAN proxy model can get CO2 saturation at any time instance and for any permeability distribution with a few seconds, but CMG-GEM still takes 10 min to run a single simulation. Thus, if the number of required Monte Carlo simulations is more than 615, which is often the case for subsurface UQ (Li, 2014), using cDC-GAN as a surrogate model will be more efficient computationally.

3.3 Sensitivity Studies

3.3.1 Effect of urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0061 Correlation Range

In the base case, the correlation ranges used for generating the urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0062 realizations are relatively large as compared to the domain dimensions. As a result, the generated fields exhibit smooth large-scale spatial patterns which, in turn, may have helped the pattern extraction and learning by cDC-GAN. A remaining question is whether the performance of cDC-GAN will be affected when the spatial heterogeneity is increased. To address this concern, we ran additional sensitivity studies using smaller correlation ranges.

In the first test, the correlation ranges were changed to 25 (in major direction) and 15 (in minor direction) grid blocks, while all other configurations were kept the same (Table 2). Figure S1 shows an example of the generated permeability field and its histogram, which shows smaller-scale spatial heterogeneity and thus higher spatial variations. We trained a cDC-GAN model for this case using 600 realizations. Figure S1 shows the comparison between CO2 plumes simulated by CMG-GEM and those generated by cDC-GAN at selected simulation time steps, while the results of statistical moment analyses are presented in Figure 10 for the same two cross sections as we have considered previously in the base case. Results show that the performance of cDC-GAN is largely unaffected when the correlation range is reduced.

Table 2. Experiment Design of Correlation Range, Standard Deviation, and Injection Period for Sensitivity Analyses
Case no. Major Minor Mean (μ) Std (σ) Injection period (days)
Base case 50 25 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0063 1 380
Shorter correlation range (1) 25 15 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0064 1 380
Larger standard deviation 50 25 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0065 2 380
Shorter correlation range (2) 10 10 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0066 1 380
Longer injection period 50 25 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0067 1 540
Details are in the caption following the image
Comparison of statistical moments of CO2 saturation fields obtained by conditional deep convolutional generative adversarial network (cDC-GAN) and CMG-GEM for the testing ensemble of 200 realizations at 105 and 340 days, for the correlation range sensitivity analysis. (top row) (a) Mean CO2 saturation simulated by CMG-GEM at 105 days, (b) standard deviation of CO2 saturation at 105 days, (c) mean CO2 saturation at 340 days, and (d) standard deviation of CO2 saturation at 340 days. (middle row) From left to right, ensemble mean and standard deviation obtained at 105 and 340 days using surrogate model (dashed line) and CMG-GEM simulations (solid line with asterisk) at the cross section a–a in a, b, c, and d. (bottom row) From left to right, ensemble mean and standard deviation obtained at 105 and 340 days at the cross section b–b in a, b, c, and d.

In the second test, we reduced the correlation ranges of urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0068 further to 10 grid blocks in both major and minor directions. Figure S1 shows an example of the generated permeability field and its histogram, which exhibits even more spatial heterogeneity than the previous test. Again, a cDC-GAN was trained using 600 urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0069 realizations. Figure S1 shows the comparison between CO2 plumes simulated by CMG-GEM and those generated by cDC-GAN at selected simulation time steps. Figure S1 displays the results of statistical moment analysis. The mean value between cDC-GAN and CMG-GEM at both 105 and 340 days is matched very well in both the base case (Figure 8) and the smaller correlation range studies (Figures 10 and S1). However, the match in standard deviation between cDC-GAN and CMG at 105 days is better than that at 340 days in both the base case (Figure 8) and sensitivity cases (Figures 10 and S1). By comparing the standard deviation between cDC-GAN and CMG-GEM at 340 days for all three cases (Figures 8, 10, and S1), it can be observed that the match in saturation map standard deviation only decreases slightly when the correlation range is decreased.

Previously, it was shown, through eigenvalue analysis, that the number of eigenvalues required to represent a log-normal permeability field at the same energy level (i.e., in KLE sense) increases with the decrease in correlation length (Li, 2014; Zhang & Lu, 2004). Thus, smaller correlation lengths tend to be more challenging for stochastic surrogate modeling techniques that are based on KLE. In the case of cDC-GAN, small correlation lengths may imply that the dimensionality of the underlying latent space (or intrinsic dimensionality) is higher. The series of sensitivity studies presented herein suggest that the performance of cDC-GAN is little affected when the correlation length is reduced, indicating that the end-to-end learning behind cDC-GAN can be trained to accommodate the increased latent space dimension while maintaining the same training effort (i.e., 600 realizations in our study).

3.3.2 Effect of Spatial Variability

This sensitivity analysis investigates the effect of higher spatial variability on the performance of cDC-GAN. Specifically, we increase the standard deviation of urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0072 from 1.0 to 2.0 while keeping all other parameters the same. Figure S1a illustrates an example of permeability realization generated based on Gaussian semi-variogram with the larger standard deviation (Table 2), and Figure S1b displays the corresponding histogram of permeability map. A total of 800 simulations is simulated by CMG-GEM, of which 600 simulations are used for training the cDC-GAN, and the remaining 200 simulations are used for testing. Figure S1 shows the resulting CO2 saturation distributions that are simulated by CMG-GEM and by cDC-GAN at different times for the higher standard deviation case. To quantify the performance of cDC-GAN, we calculate the mean and standard deviation on the 200 testing realizations at the time instance of 105 and 340 days (Figure 11). The result indicates that the magnitude of standard deviation has little impact on the performance of cDC-GAN. The magnitude of mean CO2 saturation at 105 and 340 days is the same in both the base case (Figure 8) and the larger standard deviation case (Figure 11); however, because of the increased spatial heterogeneity, the magnitude of standard deviation of CO2 saturation at 105 and 340 days increases compared with the base case. Again, the robust cDC-GAN performance seen here may be attributed to the strong capacity of cDC-GAN to learn the underlying functional mapping through pattern extraction, not by approximation through polynomial expansions that invariably lose information after truncation (Xiu & Karniadakis, 2002).

Details are in the caption following the image
Comparison of statistical moments of CO2 saturation fields obtained by cDC-GAN and CMG-GEM for the testing ensemble with larger urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0070 standard deviation ( urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0071) at time steps 105 and 340 days. (top row) From left to right, mean and standard deviation of CO2 saturation simulated by CMG-GEM at 105 and 340 days, respectively. (middle row) Results for a horizontal cross section a–a. (bottom row) Results for south-north cross section b–b.

3.3.3 Effect of Injection Duration

For the current problem, the duration of injection time affects the size of CO2 plumes and thus the SSIM metric (see also the discussion under section 3.2.1). To investigate the impact of different injection time lengths on the performance of cDC-GAN, we increase the total injection period from 380 to 540 days while keeping all other parameters the same as in the base case. The same sensitivity analysis is repeated. Figure S1 shows the SSIM values, which are obtained on 200 testing realizations. The max SSIM value is around 0.99, which is obtained on the first time instance. The minimum SSIM value is 0.825, which is obtained for the end of time instance. It can be seen that the SSIM value does not decrease too much as compared to the base case (Figure 7). The result suggests that although the deviation (as measured by SSIM) between the CO2 saturation simulated by CMG-GEM and CO2 saturation predicted by cDC-GAN increases, it is still within an acceptable range.

3.4 What Has the cDC-GAN Learned?

As mentioned before, the cDC-GAN is a pattern-driven deep learning method. To help elucidate this point, we compare feature maps obtained after the first layer of convolutional operation (i.e., the 64×64 feature maps in Figure 1). Figure 12a shows an example of input permeability map, and Figure 12b shows the corresponding feature maps extracted from the input after the first convolutional layer and for the time step of 15 days. Recall that the time step is used as conditioning data. Comparing Figures 12a and 12b, it can be seen that some subplots in Figure 12b exhibit the same spatial patterns as shown in the original permeability map (Figure 12a). Figure 12c shows the feature maps calculated after the first convolutional layer at 380 days. In Figure 12d, the difference between Figure 12b and Figure 12c is shown. Each subplot in Figure 12d turns out to show a constant value image, which means that the patterns extracted by the first layer do not change between Figures 12b (15 days) and 12c (380 days), and the only change between those two figures are magnitudes, as reflected in the color intensity of the subplots in Figure 12d. These results imply that the conditioning data (time step) mainly affects the color intensity of the features extracted from the first layer, while the image patterns are determined by the spatial patterns in the permeability map. The results shown here may be compared to the KLE, which also uses a combination of eigenmaps to represent a stochastic process (Zhang & Lu, 2004). The difference is that the end user does not need to control the order of expansion, nor worry about the distribution of the input. In the supporting information (Figures S13–S1), feature maps extracted after the second convolutional operation are also plotted, which show that more detailed information is extracted from the permeability (Figure 12a) for different time steps.

Details are in the caption following the image
Illustration of conditional deep convolutional generative adversarial network feature extraction. (a) A sample permeability realization used as input, (b) feature maps outputted from the first layer of convolution calculation with t=15 days as conditioning data, (c) feature maps after the first layer convolution calculation with t=380 days as conditioning data, and (d) residual map between feature maps calculated in (b) and (c). The last three subplots suggest that permeability determines the pattern while time controls the magnitude (or intensity) of the feature maps.

4 Conclusions

The GANs, a type of deep learning models, have shown promising performance in learning cross-domain mappings. In this work, we adopted a GAN framework to develop surrogate models of high-dimensional dynamical numerical models, which has been extensively studied but remains a challenging task under the conventional surrogate modeling frameworks. In particular, we have developed a cDC-GAN for stochastic surrogate modeling. The developed cDC-GAN model is demonstrated for a carbon capture and storage (CCS) use case, for which we seek to predict the spatial and temporal evolution of injected CO2 plume in heterogeneous carbon storage reservoirs. The underlying multiphase flow and transport problem is highly nonlinear and normally solved via compositional reservoir simulation, which is computationally expensive even on high-performance clusters. The cDC-GAN is trained to learn the functional mappings between two domains: the model input domain is permeability field and the output domain is the CO2 plume. The conditioning data is the model output time. Our results indicate that cDC-GAN has strong skills in learning the cross mappings between the permeability fields and simulated CO2 plumes. By feeding it with different stochastic realizations of the permeability, the cDC-GAN is trained to learn a generalized mapping that can handle different cases from the same data distribution class. The performance of the cDC-GAN stays relatively robust when the heterogeneity structure changes (e.g., smaller urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0073 correlation ranges or larger urn:x-wiley:wrcr:media:wrcr24044:wrcr24044-math-0074 standard deviation), without requiring the increase of training samples at the same time. This suggests the strong capacity of cDC-GAN to adjust to the change in latent space dimensionality. It also interpolates well for time steps not used in training, which is an important attribute to have for high-quality surrogate models. As part of the study, we also investigate the feature maps extracted by cDC-GAN and show that the feature maps can be interpreted meaningfully. It is worth pointing out that unlike many surrogate modeling techniques that either assume a linear system or require parametric probability distribution functions, cDC-GAN imposes few assumptions on the input data and is entirely data (or pattern) driven. Thus, it can be potentially applied to a large class of physical simulation problems for developing surrogate models for risk assessment and UQ purposes.

Acknowledgments

We are grateful to the AE and three anonymous reviewers for their constructive comments. Z. Zhong and A. Y. Sun were supported by the U.S. Department of Energy, National Energy Technology Laboratory (NETL) under Grant DE-FE0026515. H. Jeong was partially supported by the National Research Foundation of Korea (NRF) under Grant 2018R1C1B5045260. Computing resources are provided by the Texas Advanced Computing Center at UT Austin. We are grateful to the Computer Modeling Group (Calgary, Canada) for free access to their CMG-GEM software. Python codes of the proposed cDC-GAN are available at the GitHub (https://github.com/danilecug/CO2-GAN/).