Machine Learning Developments and Applications in Solid-Earth Geosciences: Fad or Future?
Abstract
After decades of low but continuing activity, applications of machine learning (ML) in solid Earth geoscience have exploded in popularity. This special collection provides a snapshot of those applications, which range from data processing to inversion and interpretation, for which ML appears particularly well suited. Inevitably, there are variations in the degree to which these methods have been developed. We hope that the progress seen in some areas will inspire efforts in others. Challenges remain, including the formidable task of how geoscience can keep pace with developments in ML while ensuring the scientific rigor that our field depends on, but with improvements in sensor technology and accelerating rates of data accumulation, the methods of ML seem poised to play an important role for the foreseeable future.
Key Points
- Applications of machine learning (ML) in solid Earth geoscience have exploded in the past few years
- This special collection provides a snapshot of ML applications from data processing to inversion and interpretation
- Better integration of ML algorithms and scientific rigor is expected to further improve our understanding of the Earth
Plain Language Summary
Machine learning has attracted massive academic attention in solid Earth geosciences over the past decade. Applications of machine learning (ML), including the more conventional signal processing-based methods and the trending deep neural network-based methods, have dominated many scientific conversations. We introduce the special collection of ML applications in solid Earth geosciences, which range from earthquake signal processing and automatic image interpretation to the joint understanding of multiple geoscience datasets. With the extraordinary efforts in ML studies, we now have a better outline of the areas where ML has contributed most significantly through efficiency and automation, and where ML has the potential to revolutionize the workflow and advance the integrated scientific understanding of solid Earth processes.
1 Introduction
Machine learning (ML) has a long history in statistical signal processing and traces some of its earliest applications to time series filtering (Kolmogorov, 1939) and geostatistics (Krige, 1951). Despite significant discoveries of important methods such as Bayesian inference in the 1960s, some re-discoveries of the methods of backpropagation in the 1980s, and the invention of data-driven techniques such as support vector machines and recurrent neural networks in the 1990s, ML research and application have experienced multiple cycles of optimism and pessimism. The most recent wave of optimism started in the early 2000s when both the amount of openly available data and computing power grew exponentially. Driven by the “big-data” movement, ML methods such as support vector clustering and random forests have become widely accepted. Since 2010, deep learning, based on large artificial neural networks, specifically convolutional neural networks (CNNs), has become the most advanced yet practical ML approach, enabling spectacular success in supervised tasks such as image classification and speech recognition. In 2016 and 2017, Google's AlphaGo and AlphaGo Zero snatched consistent wins from professional human players in the notoriously difficult game of Go, marking one of the highest achievements of reinforcement learning.
Following each cycle of ML research, the number of ML applications in geosciences has gone through similar peaks and troughs, albeit with a time lag of a few years (Dramsch, 2020). While the new era of ML equips geoscientists with high-performance computing and open-source software libraries, we face unique challenges specific to the domain of solid Earth geoscience. The first challenge arises from the lack of sufficient data and corresponding labels. Unlike many commercial applications, geoscience data can rarely be crowd-sourced and labeled. The uncertain nature of geoscience exacerbates this challenge with a lower opportunity to properly benchmark the machines with ground truth. The second challenge is rooted in the general “black-box” characteristic of the machines. This decoupling between prediction and understanding undermines the confidence of the geoscientists in the learned models, calling for integration between physics-based methods and data-driven methods.
Co-led by four journals (Journal of Geophysical Research: Solid Earth; Geochemistry, Geophysics, Geosystems; Earth and Space Science; and Tectonics), the special collection on “Machine Learning for Solid Earth Observation, Modeling, and Understanding” gathers papers that demonstrate the new developments, unprecedented capabilities, and novel applications of ML in solid Earth geosciences. In Section 2, we categorize the papers according to their geoscience applications and summarize the highlights of these papers in addressing the general ML challenges as well as the particular geoscience challenges. In Section 3, we outline the unsolved challenges of ML in geosciences and speculate about the future directions the solid Earth community should pursue, building on the cornerstones laid down by this special collection.
2 Highlights
2.1 Earthquake Data Applications
The earthquake community is one of the earliest movers in geoscience to capitalize on the recent developments in ML, partially due to the exponential increase in digital data volumes, partially due to the increasing number of automatic earthquake identification algorithms (Yano & Kano, 2022), and more importantly due to the better availability of data labels accumulated through the community earthquake catalogs. Studies presented in the special collection bring more details of the seismic waveforms to the attention of the neural network and form the basis for the next-generation earthquake detection algorithms that are able to fully mimic an experienced earthquake data analyst (Beroza et al., 2021).
Earthquake phase detection has witnessed the most rapid advance as a successful application of ML. Cianetti et al. (2021) and Münchmeyer et al. (2022) compare existing deep learning algorithms under different earthquake scenarios, paying special attention to their generalizability to data beyond the training set, and give end-users practical suggestions for applying these models. The sheer improvement in detection efficiency and consistency has resulted in much more detailed maps of seismicity and a corresponding improved understanding of earthquakes. In this special collection, multiple studies further improve the robustness and generalizability of ML-based earthquake detection. The improvements are achieved by utilizing a vision transformer architecture (Saad et al., 2022), by designing cascaded neural networks (Majstorovic et al., 2021), by data augmentation (T. Wang et al., 2021) and transfer learning (Lapins et al., 2021), by transforming seismic data into the time-frequency domain before detection (Saad et al., 2021), and by incorporating higher abstraction features and latent space information over the seismic array (Mosher & Audet, 2020; Z. Xiao et al., 2021; Feng et al., 2022). Baseline neural networks are trained using massive labeled datasets with several tens of thousands of data entries, while transfer learning reduces this requirement to a few thousand.
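As a concrete illustration of what data augmentation can mean for labeled waveforms, the following minimal sketch shifts a trace in time and adds Gaussian noise at a chosen signal-to-noise ratio, moving the phase pick label along with the signal. It is not the procedure of any cited study; the function name `augment_waveform`, the circular shift, and the impulse test trace are assumptions made purely for illustration.

```python
import math
import random


def augment_waveform(trace, pick_index, max_shift, snr_db, rng):
    """Return a time-shifted, noise-contaminated copy of `trace` and the moved pick.

    Hypothetical helper: a circular shift is used for simplicity, and the noise
    level is set from the average power of the whole trace.
    """
    n = len(trace)
    shift = rng.randint(-max_shift, max_shift)
    # Sample i of the output comes from sample (i - shift) of the input,
    # so a feature at index p moves to index p + shift.
    shifted = [trace[(i - shift) % n] for i in range(n)]
    new_pick = (pick_index + shift) % n

    # Scale the Gaussian noise so that signal power / noise power matches snr_db.
    signal_power = sum(x * x for x in shifted) / n
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    noisy = [x + rng.gauss(0.0, sigma) for x in shifted]
    return noisy, new_pick


# Toy example: a unit impulse stands in for a phase arrival at sample 50.
rng = random.Random(0)
trace = [0.0] * 200
trace[50] = 1.0
noisy, new_pick = augment_waveform(trace, 50, max_shift=20, snr_db=40.0, rng=rng)
```

Repeating the call with different random draws yields many labeled variants of one recorded example, which is the sense in which augmentation stretches a small training set.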
Seismicity classification is a more challenging step in the earthquake monitoring workflow as fewer labeled data are available, and more uncertainty exists in current manually labeled datasets. Therefore, earthquake seismologists appeal to semi-supervised methods (Linville, 2022) and unsupervised ML methods, which do not require manual labels and often generalize better. Zhu, McBrearty, et al. (2022) propose a new earthquake phase association algorithm based on a Bayesian Gaussian Mixture model to aggregate picked seismic phases into individual seismic events. This process is essentially an unsupervised clustering process, based on the maximum likelihood criterion to determine the earthquake source parameters. In a subsequent paper (Zhu, Tai, et al., 2022), the authors further develop an end-to-end architecture for joint phase picking and association. Jenkins II et al. (2021) adapt a random forest classifier to separate the background seismicity and the aftershocks in existing earthquake catalogs. Steinmann et al. (2022) propose a hierarchical clustering strategy to classify noise and seismicity based on the deep scattering spectrum of the seismic data. Aden-Antoniow et al. (2022) compress the seismic information into a low-dimensional latent space using an autoencoder before these latent vectors are clustered to identify seismicity from different sources.
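To make the idea of association as unsupervised clustering more tangible, the sketch below fits a plain one-dimensional Gaussian mixture by expectation-maximization to a handful of pick times. It is a deliberately simplified stand-in and not the algorithm of Zhu, McBrearty, et al. (2022): it is not Bayesian, it ignores station geometry and travel times, and it assumes each pick has already been reduced to a rough origin-time estimate. The function names and the toy values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, k, n_iter=50, min_var=1e-3):
    """Fit a k-component (k >= 2) 1-D Gaussian mixture by maximum-likelihood EM."""
    n = len(values)
    lo, hi = min(values), max(values)
    # Initialise means evenly across the data range and use the overall variance,
    # so every point has non-zero likelihood under some component at the start.
    means = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    mean_all = sum(values) / n
    var_all = max(sum((x - mean_all) ** 2 for x in values) / n, min_var)
    variances = [var_all] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each pick.
        resp = []
        for x in values:
            likes = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(likes)
            resp.append([like / total for like in likes])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against an empty component
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var_j, min_var)
            weights[j] = nj / n

    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, variances, weights, labels


# Toy origin-time estimates (seconds) from picks belonging to two events.
origin_times = [10.1, 9.8, 10.3, 9.9, 10.0, 60.2, 59.7, 60.1, 60.0]
means, variances, weights, labels = fit_gmm_1d(origin_times, k=2)
```

Each mixture component plays the role of one candidate event, and the label assigned to a pick is the event to which it is associated. A realistic associator would cluster in space and time jointly and would need to decide the number of events from the data.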
While natural earthquake data carry a tremendous amount of information about the physical Earth, inverting for such information is extremely challenging due to the multitude of complexities in the natural environment. In the controlled laboratory setting, on the other hand, accurate labels of the physical system are generated at the same time as the laboratory earthquakes. Fieseler et al. (2022) apply an unsupervised sparse regression model to classify acoustic emission signals related to different cracking mechanisms and suggest using the differences in the reconstruction accuracy as an indicator for classification. By focusing the attention of the neural networks on specific features, fracture loading mode (Z. Song et al., 2022) and fracture saturation (Nolte & Pyrak-Nolte, 2022) are successfully inferred from the laboratory earthquakes. Similarly, although the deterministic prediction of time-to-failure in natural environments remains elusive, a couple of studies (Jasperson et al., 2021; Shreedharan et al., 2021) show that ML has the ability to predict time-to-failure and the stress state from the laboratory earthquake data. J.-T. Lin et al. (2021) demonstrate that deep neural networks trained on simulated ground deformation data have the potential to provide accurate early warnings for large earthquakes, particularly when the regional tectonic setting is well understood and data are abundant. Nonetheless, the application of such methods to tectonic faults in natural environments requires rigorous tests of the generalizability of the proposed ML approaches.
2.2 Geophysical Data Processing and Image Interpretation
Geophysical data contain various types of images where ML algorithms have demonstrated remarkable success. H. Xiao et al. (2021) perform a straightforward application of image classification to identify weather phenomena from natural images and establish a database for future weather studies. Zhou et al. (2022) follow the conventional feature engineering and prediction workflow to extract outlying motions from satellite images and achieve accurate identification of landslide location almost 1 year in advance. Granat et al. (2021) utilize clustering methods on global navigation satellite system data to identify major faults in California. Graw et al. (2021) utilize the random forest regressor to interpolate for a global marine sediment density map from measurements at a few sparsely distributed locations. You et al. (2021) train a generative adversarial network to compress complex two-dimensional digital rock images into one-dimensional latent space vectors and exploit the linearity of these vectors to interpolate between the two-dimensional images for a complete three-dimensional rock structure. These self-supervised ML methods bring unprecedented high-resolution images that are impossible or expensive to obtain to the geophysical disciplines.
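The latent-space interpolation mentioned above can be sketched in a few lines. The example assumes that two neighboring two-dimensional slices have already been encoded into latent vectors and that a trained generator (not shown here, and not reproduced from You et al., 2021) would decode each intermediate vector into an image. The function name `interpolate_latents` and the two-element vectors are hypothetical.

```python
def interpolate_latents(z_start, z_end, n_between):
    """Return n_between latent vectors evenly spaced on the line from z_start to z_end."""
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same length")
    vectors = []
    for step in range(1, n_between + 1):
        t = step / (n_between + 1)  # fraction of the way from z_start to z_end
        vectors.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return vectors


# Toy latent codes of two adjacent slices; three intermediate slices are requested.
intermediate = interpolate_latents([0.0, 2.0], [4.0, 0.0], n_between=3)
```

Decoding each intermediate vector and stacking the resulting images between the two measured slices is what would fill in the third dimension, provided the latent space really is close to linear between neighboring slices.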
ML has also drastically improved the efficiency of geophysical signal processing and interpretation. Alyaev and Elsheikh (2022) use a mixture density deep neural network (NN) to perform fast geophysical log interpretation for real-time geosteering. Automatic 2D image fault interpretations are performed on optical and topographic images with different resolutions (Mattéo et al., 2021), and bathymetry images (Vega-Ramírez et al., 2021). Gan et al. (2022) propose a generative NN to interpolate earthquake waveforms recorded on irregularly spaced stations. B. Li and Li (2021) train a neural network to perform end-to-end interpretation from time-lapse seismic images to the presence of CO2 after its geological sequestration. To train the ML algorithms, sufficient manual interpretations are used. Compared to the size of the natural image training data sets, the size of the geophysical image training data sets is often orders of magnitude smaller. Consequently, the structure and complexity of the machines, particularly of the neural networks, must be kept at a moderate level to avoid overfitting.
When manually labeled ground truth for geophysical images is not available, synthetic images and corresponding labeled solutions are used for large-scale three-dimensional seismic image interpretation (Bi et al., 2021; X. Wu et al., 2020), for microseismicity location (Q. Zhang et al., 2022), for interferometric synthetic aperture radar image processing and denoising (Sun et al., 2020), and for dispersion curve picking (W. Song et al., 2021, 2022). These studies demonstrate the encouraging generalizability of supervised neural networks from synthetic data to field data, especially when the synthetic data are crafted based on the preliminary knowledge of the field. Constructing and augmenting training data using the known physics is also one of the most straightforward methods to incorporate physics information into the “black box” of the neural network, making the machines not entirely dependent on insufficient, noisy data.
2.3 Geophysical Inversion
This special collection has received the most submissions in the category of geophysical inversion, conventionally a highly challenging, ill-posed, and computationally expensive task. Most studies in this category design deep neural networks that are capable of capturing the complex transformation from the measured data space to the desired model parameter space, train these machines using paired models and their corresponding synthetic data, and apply the trained machines to field datasets. Applications of such a framework range across the whole spectrum of geophysical inverse problems, including surface wave dispersion inversion and tomography (Cai et al., 2022; X. Zhang & Curtis, 2021), seismic-to-petrophysics inversion (Xiong et al., 2021; C. Zou et al., 2021), crustal thickness and Vp/Vs estimation from receiver functions (F. Wang et al., 2022), earthquake and microseismicity moment tensor inversion (Chen et al., 2022; Steinberg et al., 2021), magnetic, gravity, and ground-penetrating radar (GPR) data inversion (R. Huang et al., 2021; Leong & Zhu, 2021; Nurindrawati & Sun, 2020), and thermal evolution estimation for Mars (Agarwal et al., 2021). Y. Wu et al. (2022) design a dual loop framework for full waveform inversion using reflection data, where the inner loop trains a CNN to update the velocity model from the image, and the outer loop updates the training data set based on the results from the inner loop.
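The general workflow described above (simulate data for many candidate models, learn the mapping from data back to model, then apply it to observations) can be illustrated with a toy problem. In the sketch below, a nearest-neighbor lookup stands in for the deep neural network, and a single reflector depth under a constant-velocity layer stands in for the Earth model. All names, the velocity, the offsets, and the noise level are assumptions chosen for illustration, and none of this reproduces a method from the collection.

```python
import math
import random

VELOCITY = 2000.0                        # m/s, assumed constant and known
OFFSETS = [0.0, 500.0, 1000.0, 1500.0]   # source-receiver offsets in m (hypothetical)


def forward_traveltimes(depth):
    """Two-way reflection travel times (s) for a flat reflector at `depth` metres."""
    return [2.0 * math.sqrt(depth ** 2 + (x / 2.0) ** 2) / VELOCITY for x in OFFSETS]


def build_training_set(depths):
    """Pair each candidate model with its simulated data: (travel times, depth)."""
    return [(forward_traveltimes(z), z) for z in depths]


def predict_depth(training_set, observed):
    """Return the depth whose synthetic data are closest to the observations.

    A nearest-neighbour lookup is the simplest possible learned inverse map;
    the studies above use deep networks in this role.
    """
    def misfit(pair):
        synthetic, _ = pair
        return sum((s - o) ** 2 for s, o in zip(synthetic, observed))

    return min(training_set, key=misfit)[1]


# "Training": simulate data for depths from 100 m to 2000 m in 10 m steps.
training_set = build_training_set([100.0 + 10.0 * i for i in range(191)])

# "Field" data: a reflector at 1234 m with 0.1 ms of Gaussian timing noise.
rng = random.Random(1)
observed = [t + rng.gauss(0.0, 1e-4) for t in forward_traveltimes(1234.0)]
predicted_depth = predict_depth(training_set, observed)
```

The cost is concentrated in building the training set, while the prediction step is a cheap lookup, which mirrors the trade-off discussed for end-to-end learned inversion. The sketch also makes the generalizability concern visible: a reflector outside the simulated depth range, or a velocity different from the assumed one, would still be mapped to some depth in the training set.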
It is easy to see the appeal of ML in this challenge. First, it takes advantage of the massive expressivity of deep neural networks to represent complex domain transforms. Second, it incorporates the best-known physics through the training data sets constructed by accurate simulations and approximated noise. Moreover, once the neural networks are properly trained, applications on field data are almost instantaneous. Last but not least, the efficiency and the crafted stochastic characteristics of the ML framework help quantify the uncertainties in the inverted models. Nevertheless, such end-to-end ML solutions to geophysical inversion problems still suffer from strong skepticism due to the lack of interpretability of the inversion process, the severely limited generalizability of the trained machines to different field datasets, and the high computational cost in constructing the training datasets.
A few studies take more cautious steps compared to these end-to-end solutions by integrating ML as part of the inversion workflow. Kaur et al. (2021), X. Huang and Alkhalifah (2021), and Rasht-Behesht et al. (2022) improve the efficiency of seismic inversion by accelerating the wave-equation simulation process. Chen and Saygin (2021) propose to utilize latent-space representation of the seismic data compressed by the convolutional autoencoder. Lopez-Alvis et al. (2022) assemble the geological priors using a variational autoencoder to facilitate GPR traveltime inversion. While these studies show promising results, significant technological and scientific improvements are still needed before ML solutions make groundbreaking contributions to the extremely challenging geophysical inverse problems.
2.4 Multiphysical and Multi-Disciplinary Information Integration
One of the most exciting aspects of ML methods stems from their capability to capture implicit, complex relations among data from different physical, chemical, and geological measurements and information from different disciplines. Due to the complexity and diversity of geochemistry data, ML-based classification methods have proven to be much more efficient and accurate than conventional methods (Qin et al., 2022; S. Zou et al., 2022), particularly for large-scale geological processes. However, joint inversion, interpretation, and/or assimilation of data and knowledge from multiple disciplines rank among the most challenging tasks in geoscience. These are the problems that do not have existing accepted solutions, rely heavily on judgments of highly experienced experts, yet could lead to the most profound scientific insights if investigated properly. A few studies explore the promise of ML in multi-disciplinary data integration for predicting drought behavior in the Colorado River Basin based on various Earth System Models (Talsma et al., 2022), for predicting sea surface variabilities in the South China Sea (Shao et al., 2021), for geothermal heat flow prediction from multiple geophysical and geological datasets (Lösing & Ebbing, 2021), for identifying a volcano's transition from non-eruptive to eruptive states (Manley et al., 2021), for understanding the geodynamic history using geochemical data (Jorgenson et al., 2022; X. Lin et al., 2022; X. Li & Zhang, 2022; Saha et al., 2021; Thomson et al., 2021; Y. Wang et al., 2021), and for characterizing geodetic signals by their sources (Hu et al., 2021). Albert (2022) uses an unsupervised deep NN structure to predict future atmospheric structures from past measurements to enable infrasound propagation modeling. These studies, although limited by the availability of data, provide encouraging results that demonstrate the feasibility of formalizing the interdisciplinary knowledge integration process through ML.
3 Remaining Challenges and Roads Ahead
This special collection showcases a wide variety of ML applications in solid Earth geosciences that have dramatically improved the efficiency of data processing and interpretation, and in some cases have resulted in a better understanding of the underlying physics and geoscience processes. The fundamental challenges of ML in geosciences, namely the lack of labeled data, the low interpretability of deep learning networks, and the limited integration between physical understanding and data correlations, remain to be overcome by the concerted efforts of generations of geoscientists.
Some of these challenges, particularly the lack of labeled data, may be addressed by recent trends in ML toward foundation models (Bommasani et al., 2021), which are trained on broad data and can be adapted to solve many problems using small problem-specific datasets. While foundation models have had great success in applications such as natural language processing, how or if they could be applied in geosciences remains to be seen. Such gaps between computer science, data science, and geoscience call for greater and deeper interdisciplinary collaborations so that rapid innovations in ML can be capitalized on by the geoscience community to continue pushing the boundaries of scientific understanding of our natural world. We anticipate opportunities for new and interesting applications of ML in geosciences to continue progressing. With increasing exploration, comparison, and competition, we as a community will identify the most suitable algorithms for different geoscience goals, and achieve better clarity of areas where ML has truly impacted solid Earth geoscience.
Acknowledgments
We are grateful to the editors and editorial staff of the American Geophysical Union, who carried out a significant amount of editorial work. We also thank all the reviewers and authors who contributed their expertise and work to build this collection.
Open Research
Data Availability Statement
Data were not used, nor created for this research.