Spatio-Temporal Machine Learning for Regional to Continental Scale Terrestrial Hydrology
Abstract
Integrated hydrologic models can simulate coupled surface and subsurface processes but are computationally expensive to run at high resolutions over large domains. Here we develop a novel deep learning model to emulate subsurface flows simulated by the integrated ParFlow-CLM model across the contiguous US. We compare convolutional neural networks like ResNet and UNet run autoregressively against our novel architecture called the Forced SpatioTemporal RNN (FSTR). The FSTR model incorporates separate encoding of initial conditions, static parameters, and meteorological forcings, which are fused in a recurrent loop to produce spatiotemporal predictions of groundwater. We evaluate the model architectures on their ability to reproduce 4D pressure heads, water table depths, and surface soil moisture over the contiguous US at 1 km resolution and daily time steps over the course of a full water year. The FSTR model shows superior performance to the baseline models, producing stable simulations that capture both seasonal and event-scale dynamics across a wide array of hydroclimatic regimes. The emulators provide over 1,000× speedup compared to the original physical model, which will enable new capabilities like uncertainty quantification and data assimilation for integrated hydrologic modeling that were not previously possible. Our results demonstrate the promise of using specialized deep learning architectures like FSTR for emulating complex process-based models without sacrificing fidelity.
Key Points
-
Deep learning models can emulate a complex integrated hydrology model that simulates groundwater over the United States
-
We developed a new model architecture that is more robust over longer simulation periods than off-the-shelf neural networks
-
Deep-learning based emulators of complex models enable new applications such as real-time forecasting and estimating uncertainties
Plain Language Summary
Computational models are important for understanding and predicting terrestrial hydrology, but our most physically detailed models can be time-consuming and expensive to run over large regions. In this study, we trained deep learning models to emulate a complex hydrology model that simulates groundwater flow over the contiguous US. We developed a new model architecture called FSTR that captures spatiotemporal patterns better than standard deep learning models. FSTR is over 1,000 times faster than ParFlow, the original hydrologic model. This enables new possibilities like forecasting groundwater changes and estimating uncertainties. Our results show that specialized deep learning architectures can accurately emulate complex hydrologic models while drastically reducing computation time.
1 Introduction
Computational models have been hugely successful in predicting and understanding Earth and environmental systems. Since their early use in the mid twentieth century these models have increased in complexity to account for higher spatiotemporal resolution and a wider range of physical processes. To achieve the highest possible degree of physical realism scientists and engineers often must run these models on high-performance computing clusters and supercomputers, which require large institutional investment to build and maintain. In addition, the computational complexity of using these models on such platforms requires considerable expertise to use effectively. As a result, these large-scale and highly complex models are typically only used by a small subset of researchers.
Emulation (also referred to as reduced order models or surrogate models) has long been a popular method to reduce the computational complexity of running simulations in a number of domains (Astrid et al., 2008; Razavi et al., 2012; C. Wang et al., 2014). Reducing the computational complexity of process based models makes it possible to build more complex workflows such as building model chains, performing parameter calibration or sensitivity experiments (Cheng et al., 2023), and running more scenarios to better understand uncertainties (Kasim et al., 2021). One avenue that is becoming increasingly popular is the use of deep learning (DL) based methods for building emulators of process based models (Doury et al., 2023; Leonarduzzi et al., 2022; Reichstein et al., 2019; Tran et al., 2021).
Deep learning has recently and quickly become a standard piece of the computational modeling toolkit and offers almost universal applicability to modeling tasks (Jordan & Mitchell, 2015). It has also proven very powerful in allowing researchers to take models from disparate applications and apply them to new problems (Khan et al., 2022). This is true in the Earth system sciences and related fields, where deep learning models have been used to simulate atmospheric phenomena (Brenowitz et al., 2020), predict flow in rivers (Kratzert et al., 2018), and monitor land use (Xu et al., 2017) among many other applications. Specifically in hydrology the majority of applications are based around streamflow modeling or other forms of “point-scale” applications where the models are trained on individual sites (Bennett & Nijssen, 2021; De la Fuente et al., 2023; Gauch et al., 2021). However, we know that in subsurface hydrology lateral flow occurs, and can have large impacts on both human and natural systems (Condon & Maxwell, 2019; Fan, 2015).
Hydrologic models that treat both the coupled surface and groundwater systems as well as account for lateral flow in the subsurface are commonly referred to as integrated hydrologic models. These models are among the most complex and comprehensive representations of the terrestrial hydrologic cycle that have been developed. However, they are often difficult to run because they are data-hungry and computationally heavy. This is particularly true in the subsurface, where observations are sparse and parameters are hard to measure (Blöschl et al., 2019). Even when data is available, it is often difficult to calibrate these models because of the computational complexity and large-dimensional search space over parameter configurations (O'Neill et al., 2021). It should also be noted that the lack of spatiotemporally complete (i.e., gridded data sets) observations/reanalysis data for groundwater is a large reason that purely data-driven approaches have not emerged as they have in the weather forecasting domain (Chen et al., 2023; Keisler, 2022; Lam et al., 2022). Because of these challenges the use of emulator or surrogate models is an appealing approach to improving the usability of integrated hydrologic models.
In this study we demonstrate the use of modern deep learning based emulators of a continental scale integrated hydrologic model, without sacrificing spatiotemporal or process fidelity. Specifically, we emulate subsurface flow of the ParFlow-CLM model, developed over a large portion of the contiguous US at a high spatiotemporal resolution (Maxwell & Condon, 2016; Maxwell et al., 2015; O'Neill et al., 2021). Previous work has shown that deep learning is an effective approach to this problem, showing good agreement in predictions as well as computational efficiency on synthetic benchmarks (Maxwell et al., 2021) and on smaller domains (Leonarduzzi et al., 2022; Tran et al., 2021).
In this work, we compare the ability of three different deep learning model architectures to emulate 3d subsurface pressure fields simulated by ParFlow. We compare ResNet, UNet, and Forced-SpatioTemporal-RNN (FSTR) architectures; the first two are off-the-shelf convolutional neural networks that we apply in an autoregressive fashion to build up spatiotemporal predictions. That is, we feed the previous output of the model as an input to the next step of the prediction process, in addition to other features.
The FSTR model is a novel adaptation of the PredRNN model with action-conditioning, which is a video-prediction model that has proven capable for atmospheric & hydrologic modeling as well as robotics (Tran et al., 2021; Y. Wang et al., 2017). We make use of the robotics terminology of “action-conditioning” to explicitly account for the meteorological forcings acting on the hydrology in a way that is separate from the subsurface parameters/geology and the initial state of the domain that we are simulating.
We compare the performance of the three model architectures for emulating the evolution of both pressure heads at multiple depths as well as the derived soil moisture states and water table depths. We find that all three model architectures show good overall capabilities at matching spatial and temporal patterns. Our FSTR architecture consistently shows the best performance results, and is capable of simulating a year of the entire domain in less than an hour on a single 40 GB Nvidia A100 GPU, showing a >1,000 times speedup as compared to the original simulations run on >3,000 CPU cores. Based on these findings we believe that our FSTR architecture could form the basis for a new set of modeling capabilities to fine tune model parameters and enable real-time ensemble-based forecasting.
2 Methods
2.1 Modeling Domain and Data
Our modeling domain is based on the ParFlow CONUS1.0 model (O'Neill et al., 2021). The domain covers the majority of the contiguous United States, plus some small portions of Canada and Mexico (Figure 1). We are primarily interested in representing the subsurface hydrology in this domain at a 1 km gridded spatial resolution, with a daily timestep. The gridded domain amounts to a regular grid of 3,342 by 1,888 km. The depth layers of our simulations increase in the downward direction, starting with shallow surface layers and a large groundwater layer. The depths of each layer are 0.1, 0.3, 0.6, 1, and 100 m from the surface to the bottom for a total of 5 layers.
The simulations that we use for training/testing have previously been validated against many observational data sets across multiple variables (e.g., streamflow, evapotranspiration, snow) with favorable results given the default parameter sets (O'Neill et al., 2021). These simulations cover water years (spanning from October 1 to September 30) 2003–2006 at an hourly timescale. We aggregate all the data to a daily timescale by taking the daily means, totals, minimums, and maximums where appropriate.
The ParFlow simulations that we use to train/evaluate our emulators is forced by a modified version of the NLDAS-2 product. The modifications include a blending of higher-resolution data sets such as Stage IV Radar and GOES Surface and Insolation Products as well as remapping from the native 12 km grid to our 1 km grid using bilinear interpolation. See O'Neill et al. (2021) for more information about the development of the forcing data. The spatially-distributed parameter fields were developed from a variety of data sets including the Soil Survey Geographic Database (SSURGO) and subsurface conductivity maps from Gleeson et al. (2011). More details on the development of the parameter fields can be found in Maxwell et al. (2015).
2.2 Model Architectures
We explore three deep learning model architectures to emulate the ParFlow simulations described previously. Here we will describe the overall structure of each of these models, but will leave the detailed input/output setup of them for the following section. Throughout the rest of the paper we will interchangeably use the terms model and emulator. All references to the original ParFlow simulations will be directly specified as ParFlow. We employ three model architectures in this study: a Residual Network (ResNet; He et al., 2015), a UNet (Ronneberger et al., 2015), and a newly developed architecture that we refer to as the ForcedSpatioTemporalRNN (FSTR) model. The first two of these are relatively “standard” models in the deep learning literature at this point and have been used for a large array of tasks (Alzubaidi et al., 2021; Azad et al., 2022). The architectures of the ResNet and UNet are shown in schematic form in Figure 2, while the FSTR architecture is shown in Figure 3.
The development of the ResNet architecture is considered a milestone in the development of machine learning to the task of image classification and paved the way for the modern deep learning revolution (He et al., 2015). We consider the ResNet a strong baseline architecture to compare against as has been adopted by other studies with similar approaches (Haber & Ruthotto, 2018; Kochkov et al., 2021; Ott et al., 2020). This model consists of stacks of “residual blocks,” usually convolutional layers followed by a nonlinear activation function. Following these stacks is a residual connection, which adds the input back to the output of the residual block. Architectures with residual connections have been found to train more effectively, especially in very deep networks (Zhang et al., 2023). We stack two residual blocks, each consisting of 1 depthwise-separable convolutional layer with a layer norm and activation function, and a final convolutional layer. We use two residual blocks with each having two convolutional layers with hidden dimension of 256 channels for this study.
The second architecture that we consider is the UNet, which is named as such because of its use of successive downsampling and upsampling layers (along with skip connections) which allow the model to capture spatial relationships at varying resolutions. This type of architecture is considered state of the art in image segmentation which has applications in both remote sensing (Yuan et al., 2021) and medical imaging (Ronneberger et al., 2015). Like the ResNet, the UNet is mainly composed of stacks of convolutional layers. However, the UNet makes use of downsampling and upsampling to get a “multi-resolution” view of the data. In our model, we use stacks of these downsampling and upsampling layers which are composed of convolutional layers that are either preceded by MaxPool layers or followed by bilinear interpolation layers for down and upsampling, respectively. Each downsampling and upsampling layer consists of two convolutions followed by an activation. At parallel levels in the downsampling/upsampling process skip connections are used to transfer information at multiple resolutions to the upsampling, which helps to preserve spatial structure in the data. Here we set the base dimension for the convolutional layer to be eight at the input and output and double/halve the hidden dimension for each down/up sampling layer respectively.
We also developed a novel architecture based on the PredRNN model, which is itself based on the Convolutional LSTM model (ConvLSTM; Shi et al., 2015). It attempts to take best practices from image modeling via CNNs and sequence modeling from recurrent neural networks, particularly with the design of the Long Short Term Memory network (LSTM; Hochreiter & Schmidhuber, 1997). The key insights that the developers of PredRNN and associated models had over the ConvLSTM architectures was that an additional memory/hidden state would better reflect the spatiotemporal state of the system, and could be shared amongst model layers. This, along with several other procedural training techniques led the PredRNN model to be one of the most performant video prediction models available (Y. Wang et al., 2017). In the original PredRNN paper, the authors also introduced an “Action-Conditioned” variant, which allows the input sequence of a robotic arm to be used as a part of the prediction algorithm (Y. Wang et al., 2017). We make use of this modification because the input to the robotic system “acts” on the video stream in a similar way that meteorological forcings “act” on the evolution of hydrologic states.
The main modification that we make to the PredRNN structure is the use of encoders to initialize the memory and cell states for the model. These states are updated throughout the recurrent loop, and by default, are initialized as zeros in the PredRNN structure. However, our insight is that the initial conditions and subsurface parameters can be considered as an approximate byproduct of the true memory of the natural environment, and thus can be used to initialize these hidden states. We use the initial conditions (i.e., the 3-dimensional pressure heads for the domain being simulated) to initialize the memory state and the parameter values (e.g., porosity and permeability for each gridcell) to initialize the cell states. Both encoders consist of convolutional layers that project the inputs into a higher dimensional space that matches the hidden states of the Action-Conditioned ST-LSTM which forms the backbone of the FSTR model. The initial conditions are used as input to the memory encoder as well as used as the starting input pressure field to the model. We use two layers of the AC-ST-LSTM layer in our model, each having a hidden dimension of 16.
Our model structure considers the meteorological forcings to be “action” tensors rather than model inputs directly. This delineation of information partitioning between the initial conditions, parameter values, and meteorological forcings in FSTR is very similar to how ParFlow and many other hydrologic models operate.
All models take in daily minimum temperature, daily maximum temperature, total precipitation, snowmelt, and bulk evapotranspiration as the boundary condition forcing. Static parameters that the models take as input are porosity, permeability, van Genuchten n, van Genuchten alpha, topographic index, elevation, and the distance to the nearest stream based on the digital elevation map. The static parameters were chosen to include the major parameters required to solve Richard's equation as well as represent major topographic features at both point and aggregated scales. The models output 4-dimensional spatiotemporal predictions of the subsurface pressure heads for all grid cells in the input domain. All models are run in an auto-regressive fashion, meaning they start with the initial conditions as a starting point and then evolve their output in response to the forcings and parameter value inputs, at which point they use their own prediction as the initial state for the next time step. All of the models have roughly the same number of parameters: the ResNet has 307 thousand, the UNet has 272 thousand, and FSTR has 194 thousand. We examined both larger and smaller model sizes and found that it was not the dominant control on the model performance. That said, we believe with a larger training data set were compiled a larger model may still provide better performance.
2.3 Experimental Setup
In this study we explore the ability of the three neural network architectures to reproduce the 3d ParFlow pressure heads across the CONUS domain. To quantify this, we train each model on water years 2003–2005. The testing data set that we evaluate against is from the water year 2006, and all results shown here are from this set. As mentioned previously, we aggregate variables to daily timesteps to reduce the overall complexity of the data. While we are interested in building emulators that maintain high spatiotemporal resolution we are primarily focused on seasonal to annual modeling, reflecting timescales that groundwater interactions tend to take place at.
We train all models in three phases. For each phase we sample from the training data, using a different “patch” sizes and rollout time horizons. A patch is simply an N by N spatial subsample from the overall domain, and a rollout horizon is the number of timesteps that are taken. For the first phase we train for three epochs using a patch size of 256 and a rollout horizon of 3 days. In the second phase we train for another epoch using 96 pixel chips and a rollout horizon of 35 days. The final phase uses a patch size of 64 and rollout horizon of 120 days. This helps stabilize the autoregressive loop of the model, as well as learn seasonal trends. We found that training across all three phases improved the model prediction in terms of both stability of the predictions and the overall accuracy for all model types. We include a representative loss curve for the three models in the supplementary materials, which shows the effect of this multi-stage training process. All models were trained on a single Nvidia A100 40 GB GPU. Training times varied by model, but generally took roughly 24 hr each, with the ResNet being cheapest and FSTR being the most expensive, though all were quite similar. It is difficult to benchmark the computational performance of the model training because IO and data preprocessing steps often introduced overhead that was affected by use of shared resources on the training server. Nevertheless, training a single instance of an emulator was much less costly than running the ParFlow simulations. We will briefly touch on this in the discussion (Section 4) as well.
3 Results
To analyze the performance of each of the emulator architectures we calculated the overall root mean square error (RMSE) and Pearson correlation (hereafter, just correlation) for each grid cell in the domain with respect to the original simulations. The resulting spatial maps for these measures are shown in Figure 4. The UNet performs worst in terms of the RMSE, followed by the ResNet, with FSTR showing the best predictions. Overall, all three models have higher error in the western portion of the domain, particularly in the southwestern region showing negative correlations. This is the most arid portion of the domain, and also has very large spatial gradients in the water table depth. We found that these areas of very high spatial variability in water table depth were difficult for the models to perfectly track over the full year rollout.
Similarly, in Figure 5 we examine the models' ability to emulate the surface soil moisture levels and find all three models perform similarly. The areas of highest error for surface soil moisture tend to be the same for all the models, unlike the results for the water table depth. Of note are three small regions where RMSE is higher: one in north-eastern Arizona (southwestern portion of the domain), and two small clusters in the central portion of the domain. These are quite different regions from a hydrologic perspective, with the section in the southwest being an arid desert and the central regions being grasslands. The fact that they are quite different suggests deficiencies in the input data or model training procedure are the underlying cause, perhaps due to unique configurations in the static parameter fields, though we were not able to diagnose the exact reasons. We also include plots of the distributions of RMSE and correlation for both water table depth and soil moisture in the supplementary materials (Figures S2 and S3 in Supporting Information S1), which make it easy to compare general performance characteristics of each of the model architectures.
To better understand the temporal nature of error growth of the emulators, we also show the forecast error in water table depth at increasing prediction times in Figure 6. We find that all three models show increasing error growth, with the UNet showing the highest median and interquartile range at the end of the test period. FSTR maintains the lowest median and interquartile ranges, though all three models have roughly equivalent 90th percentile error ranges. All models show the majority of errors are below 1 m RMSE, which is promising for use in subseasonal-to-seasonal modeling contexts. The continual error growth, particularly the 90th percentile, does highlight that using these models for longer term (annual to decadal) applications may require further research.
So far, we have looked at the overall error characteristics in the forecasts across the full domain, but it is worth zooming in on some particular regions to see local performance spatially and temporally. First, in Figure 7 we show a 256 by 256 km sub-domain in the midwestern US. Comparing the time series for a particular location in this region we once again see that the FSTR and ResNet models are able to much more accurately capture the dynamics of the system, as compared to the UNet. Most notably, the FSTR and ResNet models are able to capture the seasonal dynamics in a much more robust way than the UNet, which shows good overall correlation, but drifts in magnitude from the ParFlow results. Similarly, the spatial plots show that FSTR is much more able to represent the spatial heterogeneity across seasons, particularly in areas with shallow water tables. This connects back to the timeseries picture, where we see the UNet consistently predicts a deeper water table and the ResNet consistently predicts a shallower water table depth. None of the models can perfectly capture the structure of the river network, although some general features that correlate well with the river network from ParFlow are seen in all results.
Figure 8 shows another example in the southwestern portion of the domain. This region is much more arid and has deeper water table depths on average than the region shown in Figure 7. Here we see that all models can capture the long-term dynamics of the region and maintain spatial coherence over the simulation period as well. However, we do see that the FSTR model is better able to capture the fine-scale structure, particularly in the sub-basins that form shallow water table depths and rivers. The UNet cannot reproduce very shallow water tables and lacks the river network structure while the ResNet produces a river network that is too large. We also include two more sub-regions in the supporting information to show details in the central and southeastern portions of the domain, which have different hydrologic conditions but showcase similar conclusions for differences between the model architectures (Figures S4 and S5 in Supporting Information S1, respectively).
In Figure 7 we saw that one of the main deficiencies in the UNet output is drift in the longer-term dynamics, which we found in other regions as well (Figure S5 in Supporting Information S1). To better understand what drives the accumulation of errors over time we looked at how the models respond to precipitation events. We accomplished this by selecting grid cells at varying levels of precipitation and then comparing the response of the surface level pressure heads before and after the storm events (Figure 9). Overall, we found that the model architectures show similar sensitivities to precipitation events, even in the extreme events, though none matched perfectly. Overall, the ResNet was more sensitive to extreme precipitation than the original ParFlow simulations, while the UNet and FSTR showed consistent under-response. The over-sensitivity of the ResNet may be the reason for the over-development of river networks in Figures 7 and 8. The ability of the emulators to respond accurately across a wide range of events is a promising result, although there is clearly room for further improvements with more training data and/or model architecture tweaks.
4 Discussion and Future Work
In this study we chose to focus entirely on the subsurface and treat the snow and evapotranspiration as inputs to the model. We chose this because ParFlow is substantially more computationally expensive to run than the land surface model CLM, which provides the simulation of the snow, evaporation, and vegetation processes. We plan to incorporate sub-modules that simulate these processes and can be connected to the subsurface component in another study. These processes could also be added to the FSTR model, but this would require the ability to estimate their gridded initial conditions of snowpack and ET to be used in this framework. There are also many additional experiments that could be performed using the framework that we have developed here to further improve the capabilities of the emulators.
One opportunity for further exploration is to modify the training routines that we use to produce more suitable emulators for specific temporal or spatial scales. One of the main drivers in performance of our emulators was the ability to flexibly train them across a variety of spatio-temporal scales. The ResNet was especially sensitive to the choice of training phases as discussed in Section 2.3. Generally, we found that the initial time horizon for the first training phase was the largest factor for controlling model stability, with small horizons being favored over long ones. However, this is at odds with representing the slower timescale processes like water table depth. Training at a single timescale was always met with subpar performance; exclusively using small temporal rollouts led to bad long term predictions and exclusively using large temporal rollouts led to bad predictions of the temporal dynamics. We believe that finding optimal training strategies for models for different spatio-temporal scales will be an important question to answer.
Additionally, our aim of enabling subseasonal to seasonal groundwater prediction was the reason for our initial choice of using a daily timestep, but we showed they can be rolled out to annual scales with little degradation in performance with our FSTR model. It may even be possible to combine training methodologies into a multi-stage model (where sub-models are used for different timescale predictions), similar to the approach taken by the FuXi weather forecasting architecture (Chen et al., 2023). Another possible concern is whether our trained emulators depend highly on the forcing data used. For example, if we were to use our trained emulator in a true forecast context it is unclear how we should integrate bias-correction/post-processing techniques to ensure that the emulator responses are robust. We are currently working on follow-up studies that make use of perturbed parameter and forcing variables to explore this.
Another aspect that we have not yet explored is in the translation from the pressure heads to water table depth and soil moisture during training time. The equations we used are implemented in the same way as they are in ParFlow, but are written as PyTorch layers which, in theory, could be appended to our model structures and trained on directly. It is possible that this could yield better predictions for these quantities but may sacrifice the fidelity of the more fundamental pressure heads. In future work we may explore this along with multi-objective training to capture a wider range of conditions. There are some technical challenges that remain before such experiments can be performed, though we believe this will be a necessary step in order to successfully emulate the overland flow component of ParFlow.
One other potential limitation of our current emulation method is that the emulators are dependent on the soil and vegetation classifications that were used in the original simulations. Work is ongoing to develop strategies for building ParFlow ensembles with varied parameters which would provide the basis for model training to be varied in more robust ways. We hope that this will provide an even stronger emulator which can be used to optimize the subsurface parameter values via simulation-based inference, in turn giving better ParFlow simulation results as compared to observations.
Going beyond improving the emulation of ParFlow simulations, this work enables new applications for large-scale simulations. For example, having a fast and stable emulator at the seasonal timescale allows for ensemble predictions, uncertainty analysis, and coupling to land surface and atmospheric models. If future work shows that such emulation approaches can be stable over the annual-to-decadal periods this will also enable more robust studies on the effects of climate change on groundwater. Additionally, our FSTR architecture should be suitable for modeling in other domains where external forcings act on the state of the system, particularly other areas of hydrologic and land surface modeling. Additionally, we believe that the action-conditioning portion of the architecture provides a natural coupling point to other model types.
5 Conclusions
In this study we developed a deep-learning model architecture that is a viable approach to full-system emulation of a complex spatiotemporal groundwater model at high resolution. Our results show that off-the-shelf neural network architectures like the ResNet and UNet perform reasonably well in emulating ParFlow, but our FSTR architecture is more accurate at reconstructing groundwater dynamics over long rollout times. Our emulators exhibit much lower computational cost compared to the original Parflow simulations. This opens up many new opportunities to use emulated results for ensemble forecasting and model calibration through simulation-based inference.
The FSTR model architecture developed in this study not only shows good overall performance at simulating pressure heads, water table depth, and soil moisture but is also fast and scalable to run. As a side-benefit the model architecture is easy to conceptually understand because the model inputs directly correspond to the input data types used in most distributed hydrologic models. Initial conditions are used to set the model up for simulation, static physical parameters are used to modulate the time evolution of the system, and meteorologic forcings act on the internal state of the system. Currently each of these subsystems are processed via convolutional layers before being fed into the core AC-ST-LSTM model, but fundamentally could be any neural network component.
The use of deep learning for geophysical modeling is still a rapidly developing field where most applications for land and subsurface modeling are either point-scale timeseries models or spatially distributed models which are not dynamic. In this work we have demonstrated a modeling approach that is not only fully spatiotemporal, but also can be used to represent multiple processes simultaneously. We consider this work to be a step in moving toward more advanced machine learning representations of the subsurface, which is currently lacking compared to capabilities for weather and atmospheric predictions.
Acknowledgments
This research was funded by the US National Science Foundation Convergence Accelerator Program, Grant CA-2040542. The authors would also like to thank Bill Hasling, Amy Defnet, Amy Johnson, and Will Lytle for their assistance in developing software to deploy our models to https://hydrogen.princeton.edu/.