Volume 50, Issue 8 e2022GL102649
Research Letter
Open Access

A Hybrid Atmospheric Model Incorporating Machine Learning Can Capture Dynamical Processes Not Captured by Its Physics-Based Component

Troy Arcomano

Corresponding Author

Department of Atmospheric Sciences, Texas A&M University, College Station, TX, USA

Now at Argonne National Laboratory, Lemont, IL, USA

Correspondence to:

T. Arcomano,

[email protected]

Istvan Szunyogh

Department of Atmospheric Sciences, Texas A&M University, College Station, TX, USA

Alexander Wikner

Department of Physics, University of Maryland, College Park, MD, USA

Brian R. Hunt

Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA

Edward Ott

Department of Physics, University of Maryland, College Park, MD, USA

Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA

First published: 26 April 2023


It is shown that a recently developed hybrid modeling approach that combines machine learning (ML) with an atmospheric global circulation model (AGCM) can serve as a basis for capturing atmospheric processes not captured by the AGCM. This power of the approach is illustrated by three examples from a decades-long climate simulation experiment. The first example demonstrates that the hybrid model can produce sudden stratospheric warming, a dynamical process of nature not resolved by the low-resolution AGCM component of the hybrid model. The second and third examples show that introducing 6-hr cumulative precipitation and sea surface temperature (SST) as ML-based prognostic variables improves the precipitation climatology and leads to a realistic ENSO signal in the SST and atmospheric surface pressure.

Key Points

  • A hybrid system combining an atmospheric global circulation model (AGCM) with a machine-learning component can capture processes not captured by the AGCM

  • Machine learning provides a flexible framework to introduce additional prognostic variables into the hybrid model

  • The prototype hybrid model tested in the paper is stable and has a realistic climate in decades-long simulation experiments

Plain Language Summary

This paper introduces and tests schemes that significantly expand the utility and scope of a recently introduced hybrid modeling technique that combines machine learning with an atmospheric global circulation model (AGCM). Simulation experiments are carried out with an implementation of the approach on a low-resolution, simplified AGCM. An examination of the simulated atmospheric circulation suggests that the hybrid model can capture dynamical processes not captured by the AGCM. Moreover, adding precipitation and sea surface temperature (SST) to the model as physical quantities predicted by machine learning improves the precipitation climatology and leads to a realistic El Niño-La Niña signal in the SST and atmospheric surface pressure.

1 Introduction

Arcomano et al. (2022) (AEA22 hereafter) described a hybrid atmospheric modeling approach that combines machine learning (ML) with an atmospheric general circulation model (AGCM). They showed that, when the hybrid model was used for weather prediction, it provided more accurate short- and medium-range (1–7 days) forecasts than either the AGCM or the ML-only component of the model (Arcomano et al., 2020) acting alone. They also showed that when the model was used for climate simulations, it greatly reduced the systematic errors (biases) of the model climate compared to that of the AGCM. In the present study, we further explore the potential of the approach of AEA22 for climate modeling, and describe methods that significantly extend its utility and scope. The results we report are in accord with the idea that the inaccuracies of an AGCM could potentially be mitigated by utilization of information in time series of past observational data via the ML component of the hybrid.

The approach of AEA22 is an implementation of the combined hybrid/parallel prediction (CHyPP) scheme of Wikner et al. (2020) on an AGCM. CHyPP itself is an adaptation of the hybrid modeling approach of Pathak, Wikner, et al. (2018) to large dynamical systems, using the parallel reservoir computing (RC) algorithm of Pathak, Hunt, et al. (2018) for ML. Other hybrid approaches recently proposed for earth system modeling (Brenowitz & Bretherton, 2018, 2019; Chattopadhyay et al., 2020; Clark et al., 2022; Farchi et al., 2021; Gentine et al., 2018; Rasp et al., 2018; Watt-Meyer et al., 2021) use either random forests or deep learning for ML.

Section 2 summarizes the approach of AEA22 and explains how additional prognostic variables can be introduced into the hybrid model without changing the AGCM. Section 3 demonstrates the potential of the approach by three examples from a climate simulation experiment. The first example, the presence of sudden stratospheric warming (SSW) events, illustrates that the hybrid model can capture some dynamical processes of nature not resolved by the AGCM. The second and third examples, realistic precipitation climatology and SST variability, demonstrate that some other dynamical processes can be reproduced by modifying the AEA22 hybrid via addition of new ML-based prognostic variables (precipitation and sea surface temperature). As in AEA22, the AGCM of the simulation experiments is the Simplified Parameterization, primitive-Equation Dynamics (SPEEDY) model (Kucharski et al., 2006; Molteni, 2003).

2 The Hybrid Modeling Approach

2.1 The Hybrid Modeling Approach of AEA22

The hybrid model uses the same computational grid as the AGCM. The elements of the hybrid global state vector $\mathbf{v}^H(t)$ and physics-based global state vector $\mathbf{v}^P(t)$ are the grid-point values of the prognostic model variables. The input of the “one-time-step” hybrid global model solution $\mathbf{v}^H(t+\Delta t)$ is $\mathbf{v}^H(t)$, where the “time step” Δt is significantly longer than the time step of the AGCM. No changes are made to the AGCM, which is started from $\mathbf{v}^H(t)$ to provide the physics-based contribution $\mathbf{v}^P(t+\Delta t)$ to $\mathbf{v}^H(t+\Delta t)$. The hybridization is done by subdividing the global atmosphere horizontally into L local regions and obtaining a hybrid local model solution for each region. The computations for the different local regions (ℓ = 1, 2, …, L) are carried out in parallel and $\mathbf{v}^H(t+\Delta t)$ is obtained by assembling the hybrid local solutions. The next paragraph outlines the calculations that provide the hybrid local model solution for local region ℓ.

The elements of the physics-based local state vector $\tilde{\mathbf{v}}^P_\ell(t+\Delta t)$ are the standardized elements of $\mathbf{v}^P(t+\Delta t)$ that fall into local region ℓ. (Hereafter, the symbol $\tilde{\mathbf{x}}$ indicates a standardized vector obtained by subtracting a related mean value and dividing by a related standard deviation for each element of $\mathbf{x}$.) The “one-time-step” hybrid local model solution is
$$\tilde{\mathbf{v}}^H_\ell(t+\Delta t) = \mathbf{W}\begin{pmatrix}\tilde{\mathbf{v}}^P_\ell(t+\Delta t)\\ \hat{\mathbf{r}}(t+\Delta t)\end{pmatrix}, \tag{1}$$
where $\mathbf{W}$ is a weight matrix whose entries are to be determined by ML training, which will be discussed in Section 2.2. The Dr-dimensional vector $\hat{\mathbf{r}}(t+\Delta t)$ is a quadratic function of the reservoir state vector r(t + Δt), where the reservoir is a dynamical system with evolution equation (Jaeger, 2001; Lukoševičius, 2012; Lukoševičius & Jaeger, 2009)
$$\mathbf{r}(t+\Delta t) = \tanh\left[\mathbf{A}\,\mathbf{r}(t) + \mathbf{B}\,\mathbf{u}(t)\right]. \tag{2}$$
Each entry of the Dr × Dr weighted adjacency matrix A is randomly chosen with a probability κ/Dr of being nonzero and assigned a random value chosen uniformly in (0,1]. The nonzero entries are scaled such that the magnitude of the largest eigenvalue of A has a prescribed value ρ (0 < ρ < 1), called the spectral radius. The Du-dimensional vector $\mathbf{u}(t)$ is the input vector of the reservoir, whose elements are standardized elements of the global hybrid state vector $\mathbf{v}^H(t)$ from an extended local region that has overlaps with its four neighbors. B is a matrix with entries chosen randomly on the interval (−α, α), where α is an adjustable parameter. The hybrid local model solution is obtained by transforming the elements of the standardized vector $\tilde{\mathbf{v}}^H_\ell(t+\Delta t)$ to non-standardized values.
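
The construction of the reservoir described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes the standard tanh reservoir update driven by A and B, and the function names `make_reservoir` and `step` and the small sizes in the usage note are hypothetical.

```python
import numpy as np

def make_reservoir(d_r, d_u, kappa, rho, alpha, seed=0):
    """Build a sparse random adjacency matrix A and an input matrix B."""
    rng = np.random.default_rng(seed)
    # Each entry of A is nonzero with probability kappa / d_r.
    mask = rng.random((d_r, d_r)) < kappa / d_r
    # Nonzero values are uniform in (0, 1]; 1 - U maps [0, 1) onto (0, 1].
    A = np.where(mask, 1.0 - rng.random((d_r, d_r)), 0.0)
    # Rescale so that the largest eigenvalue magnitude equals rho.
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    A *= rho / radius
    # Input matrix with entries drawn uniformly between -alpha and alpha.
    B = rng.uniform(-alpha, alpha, size=(d_r, d_u))
    return A, B

def step(A, B, r, u):
    """Advance the reservoir state by one step (assumed tanh update)."""
    return np.tanh(A @ r + B @ u)
```

For example, `A, B = make_reservoir(d_r=200, d_u=12, kappa=6, rho=0.5, alpha=0.5)` followed by `r = step(A, B, np.zeros(200), np.ones(12))` advances a small toy reservoir by one step. The dense eigenvalue computation is adequate for a toy size; for reservoirs with thousands of nodes a sparse eigenvalue solver would be the more practical choice.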

The initial value of $\mathbf{v}^H(t)$ at the beginning of a forecast or simulation is a conventional observational analysis $\mathbf{v}^a(0)$ for the AGCM. Starting the hybrid model also requires an initial value r(0) for each of the L reservoir state vectors. These initial values are obtained using Equation 2 to synchronize the evolution of the reservoirs with the atmospheric states for a short period prior (t < 0) to the start time of the forecast or simulation. This synchronization is achieved by feeding the reservoirs input vectors based on observational analyses for the synchronization period of 1 month.

2.2 Training the Hybrid Model

The machine-learning component of the model learns to predict $\tilde{\mathbf{v}}^H_\ell(t+\Delta t)$ from $\mathbf{v}^H(t)$ for each local region by training. A flowchart of the hybrid model and a schematic of training can be found in AEA22. The training data are based on global observational analyses $\mathbf{v}^a(k\Delta t)$. These analyses provide the initial conditions for the Δt-long AGCM forecasts and are standardized and restricted to the extended local region to form the input u(t) for each of the L reservoirs. To promote stability, a small-magnitude random noise ɛ(kΔt) is introduced into the analyses before forming the input vectors.

The training data after standardization also provide the elements of the desired outcome $\tilde{\mathbf{v}}^a_\ell(k\Delta t)$ to which $\tilde{\mathbf{v}}^H_\ell(k\Delta t)$ can be compared during training. Formally, the training seeks the weight matrix $\mathbf{W}$ for which the “one-time-step” predictions $\tilde{\mathbf{v}}^H_\ell(k\Delta t)$ (k = −K + 1, −K + 2, …, 0) best fit $\tilde{\mathbf{v}}^a_\ell(k\Delta t)$ in a least-square sense. That is, $\mathbf{W}$ is the minimizer of the quadratic cost-function
$$J(\mathbf{W}) = \sum_{k=-K+1}^{0}\left\|\mathbf{W}\begin{pmatrix}\tilde{\mathbf{v}}^P_\ell(k\Delta t)\\ \hat{\mathbf{r}}(k\Delta t)\end{pmatrix} - \tilde{\mathbf{v}}^a_\ell(k\Delta t)\right\|^2 + \beta_P\left\|\mathbf{W}^P\right\|^2 + \beta_R\left\|\mathbf{W}^R\right\|^2,$$
where $\mathbf{W} = (\mathbf{W}^P\ \mathbf{W}^R)$ is partitioned into the blocks that act on the physics-based and the reservoir components of the input. The adjustable parameters βP and βR are chosen regularization parameters (Tikhonov & Arsenin, 1977). It can be shown that the direct solution of the minimization problem is the matrix
$$\mathbf{W} = \tilde{\mathbf{V}}^a\begin{pmatrix}\tilde{\mathbf{V}}^P\\ \hat{\mathbf{R}}\end{pmatrix}^{T}\left[\begin{pmatrix}\tilde{\mathbf{V}}^P\\ \hat{\mathbf{R}}\end{pmatrix}\begin{pmatrix}\tilde{\mathbf{V}}^P\\ \hat{\mathbf{R}}\end{pmatrix}^{T} + \begin{pmatrix}\beta_P\mathbf{I} & \mathbf{0}\\ \mathbf{0} & \beta_R\mathbf{I}\end{pmatrix}\right]^{-1}.$$
In this equation, column k of the matrix $\tilde{\mathbf{V}}^P$ is the local state vector $\tilde{\mathbf{v}}^P_\ell(k\Delta t)$ that corresponds to the physics-based model solution $\mathbf{v}^P(k\Delta t)$ started from $\mathbf{v}^a((k-1)\Delta t)$, column k of the matrix $\hat{\mathbf{R}}$ is $\hat{\mathbf{r}}(k\Delta t)$, and column k of the matrix $\tilde{\mathbf{V}}^a$ is $\tilde{\mathbf{v}}^a_\ell(k\Delta t)$.
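
The regularized least-squares fit described above has a closed-form solution that is straightforward to compute. The following is a minimal sketch, assuming a Tikhonov (ridge) penalty with separate weights βP and βR on the physics-based and reservoir blocks of the weight matrix; the function name `train_readout` and the array names `P`, `R`, and `V` are hypothetical.

```python
import numpy as np

def train_readout(P, R, V, beta_p, beta_r):
    """Ridge-regression readout with block-wise regularization.

    P: (d_p, K) physics-based local state vectors, one column per sample.
    R: (d_r, K) reservoir feature vectors, one column per sample.
    V: (d_v, K) standardized training targets, one column per sample.
    Returns W of shape (d_v, d_p + d_r) minimizing
    ||W [P; R] - V||^2 + beta_p ||W_P||^2 + beta_r ||W_R||^2.
    """
    X = np.vstack([P, R])
    # Diagonal regularization: beta_p for physics rows, beta_r for reservoir rows.
    reg = np.diag(np.concatenate([np.full(P.shape[0], beta_p),
                                  np.full(R.shape[0], beta_r)]))
    G = X @ X.T + reg  # symmetric positive definite when both betas are > 0
    # W G = V X^T; G is symmetric, so solve G W^T = X V^T instead of inverting G.
    return np.linalg.solve(G, X @ V.T).T
```

Solving the linear system rather than forming the inverse explicitly is a common numerical choice. Because each local region has its own weight matrix, the fits for different regions are independent and can run in parallel.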

2.3 Introducing New ML-Based Prognostic Variables

In atmospheric modeling, the term “prognostic variable” refers to a state variable whose temporal evolution is predicted directly by a model equation. The hybrid approach provides a framework for introducing new prognostic variables without making any changes to the AGCM, provided that training data are available for the new variables. Two specific methods that take advantage of this flexibility are described here: Method I is designed for atmospheric variables that are not required to evolve the AGCM, while Method II is designed for external variables, that is, variables represented by prescribed boundary fields in a standalone AGCM, which might vary on a different time scale than the atmospheric prognostic variables.

2.3.1 Method I

The purpose of Method I is to introduce a prognostic variable that is either not predicted by the AGCM, or predicted only indirectly as a “byproduct” of the parameterization schemes. This approach will be demonstrated by introducing precipitation as a prognostic variable (Section 3).

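Let
$$\mathbf{v}^H(t) = \begin{pmatrix}\mathbf{v}^H_A(t)\\ \mathbf{v}^H_N(t)\end{pmatrix}$$
be the global state vector of the hybrid model, where $\mathbf{v}^H_N(t)$ represents the global field of a new prognostic variable. The corresponding local state vector for local region ℓ is
$$\tilde{\mathbf{v}}^H_\ell(t) = \begin{pmatrix}\tilde{\mathbf{v}}^H_{A,\ell}(t)\\ \tilde{\mathbf{v}}^H_{N,\ell}(t)\end{pmatrix}.$$
The equation of the reservoir dynamics is modified as
$$\mathbf{r}(t+\Delta t) = \tanh\left[\mathbf{A}\,\mathbf{r}(t) + \mathbf{B}\,\bar{\mathbf{u}}(t)\right],$$
where $\bar{\mathbf{u}}(t)$ is an extended local state vector, which also includes the grid-point values of the new prognostic variable from the extended local region. In addition, Equation 1 is modified as
$$\begin{pmatrix}\tilde{\mathbf{v}}^H_{A,\ell}(t+\Delta t)\\ \tilde{\mathbf{v}}^H_{N,\ell}(t+\Delta t)\end{pmatrix} = \mathbf{W}\begin{pmatrix}\tilde{\mathbf{v}}^P_\ell(t+\Delta t)\\ \hat{\mathbf{r}}(t+\Delta t)\end{pmatrix}.$$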

2.3.2 Method II

In an AGCM, the effects of the other earth system components, such as the ocean, cryosphere, land, and biosphere on the atmosphere are taken into account by parameterization schemes that include fields of some state variables of the other components at the earth's surface as input. In a standalone AGCM these fields must be prescribed. For instance, the thermal effects of the ocean on the atmosphere are taken into account by schemes that include prescribed SST fields, which are based on past SST observational analyses in the case of a climate simulation, or the latest SST analysis in the case of a weather forecast. A limitation of this approach is that it does not take into account feedbacks from the state variables of the AGCM to the prescribed state variables. Method II addresses this issue by replacing a prescribed field with an ML-based prognostic variable. It also takes into account the fact that the climate-relevant effects of these feedbacks on the atmosphere typically occur on time scales that are different than the time scale of the changes of the atmosphere on which the AGCM evolves. Method II will be demonstrated by introducing SST as a prognostic variable (Section 3).

In contrast with Method I, the reservoirs for the new prognostic variable are separate from the original reservoirs of the hybrid model. Let $\mathbf{v}^{*}(t)$ be the state vector that represents the global state of the new variable in the hybrid model, and $\tilde{\mathbf{v}}^{*}_\ell(t)$ (ℓ = 1, 2, …, L) the related local state vectors. The ML-based “prognostic equation” for local vectors is
$$\tilde{\mathbf{v}}^{*}_\ell(t+\Delta t^{*}) = \mathbf{W}^{*}\,\hat{\mathbf{r}}^{*}(t+\Delta t^{*}),$$
$$\mathbf{r}^{*}(t+\Delta t^{*}) = \tanh\left[\mathbf{A}^{*}\,\mathbf{r}^{*}(t) + \mathbf{B}^{*}\,\mathbf{u}^{*}(t)\right].$$
The input vector $\mathbf{u}^{*}(t)$ includes standardized grid-point values of both the new variable and the original variables, while the “time step” Δt* is not necessarily equal to Δt (another difference with Method I). For instance, when the newly added prognostic variable (e.g., the SST) evolves on a slower time scale than the atmospheric prognostic variables, Δt* > Δt, and the interactions between the new variable and the atmospheric variables are treated as follows: (a) the time t + nΔt (n = 1, 2, …, Δt*/Δt − 1) inputs from the new prognostic variable to the AGCM and the atmospheric reservoirs are the values at time t; and (b) the time t inputs from the atmospheric prognostic variables to the reservoirs of the new prognostic variable are the average values for the “time steps” t + nΔt (n = 0, 1, …, Δt*/Δt).
The weight matrix $\mathbf{W}^{*}$ is computed separately from $\mathbf{W}$ by
$$\mathbf{W}^{*} = \tilde{\mathbf{V}}^{*}\,\hat{\mathbf{R}}^{*T}\left(\hat{\mathbf{R}}^{*}\hat{\mathbf{R}}^{*T} + \beta_R^{*}\mathbf{I}\right)^{-1},$$
where βR* is a regularization parameter, column k of the matrix $\hat{\mathbf{R}}^{*}$ is $\hat{\mathbf{r}}^{*}(k\Delta t)$, and column k of the matrix $\tilde{\mathbf{V}}^{*}$ is $\tilde{\mathbf{v}}^{*a}_\ell(k\Delta t)$ (the local vector of training data for time kΔt).
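
The two-time-scale coupling of Method II can be summarized as a nested loop: the slow variable is held fixed while the atmosphere takes Δt*/Δt steps, and the slow variable is then advanced using the time-averaged atmospheric input. The sketch below illustrates only this scheduling, under the assumption that the atmospheric and slow-variable updates are available as black-box callables; `run_coupled`, `atmos_step`, and `slow_step` are hypothetical names, not part of the authors' code.

```python
import numpy as np

def run_coupled(atmos_step, slow_step, atmos0, slow0, n_slow_steps, ratio):
    """Alternate `ratio` fast steps with one slow step.

    atmos_step(atmos, slow) -> atmos at the next fast step (slow input held fixed)
    slow_step(slow, atmos_mean) -> slow variable one slow step later
    ratio: number of fast steps per slow step (dt_star / dt)
    """
    atmos, slow = atmos0, slow0
    slow_history = [slow]
    for _ in range(n_slow_steps):
        window = [atmos]                      # fast state at time t (n = 0)
        for _ in range(ratio):
            atmos = atmos_step(atmos, slow)   # slow variable frozen at its time-t value
            window.append(atmos)
        atmos_mean = np.mean(window, axis=0)  # average over n = 0, 1, ..., ratio
        slow = slow_step(slow, atmos_mean)
        slow_history.append(slow)
    return atmos, np.array(slow_history)
```

With toy updates such as `atmos_step = lambda a, s: a + s` and `slow_step = lambda s, m: m`, the frozen slow variable and the window average are easy to follow by hand.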

3 Climate Simulation Experiment

3.1 Experiment Design

The hybrid model is the same as in AEA22, except that precipitation and SST are added as prognostic variables to the two horizontal coordinates of the wind vector, temperature, specific humidity, and the logarithm of the surface pressure. Following Pathak et al. (2022) and Rasp and Thuerey (2021), the precipitation variable is defined by ln(P/0.001 + 1), where P is the cumulative precipitation in meters for the prior 6 hr. The fields of the SST, precipitation, and logarithm of the surface pressure are two-dimensional, while the fields of the other variables are three-dimensional.
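
The precipitation transform ln(P/0.001 + 1) and its inverse are easy to state in code. This is a small illustrative sketch; the function names are hypothetical, and the use of `log1p`/`expm1` is a numerical-accuracy choice for small P rather than something specified in the text.

```python
import numpy as np

SCALE_M = 0.001  # 1 mm expressed in meters, as in ln(P/0.001 + 1)

def precip_to_log(p_m):
    """Map 6-hr cumulative precipitation P (meters) to ln(P/0.001 + 1)."""
    return np.log1p(np.asarray(p_m, dtype=float) / SCALE_M)

def log_to_precip(y):
    """Invert the transform: recover P in meters from the log variable."""
    return SCALE_M * np.expm1(np.asarray(y, dtype=float))
```

The transform maps zero precipitation to zero and compresses the long tail of heavy events, so the transformed variable is less skewed than the raw precipitation.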

The hybrid model is implemented using the standard configuration of Version 41 of SPEEDY with a spectral horizontal resolution of T30 and a nominal horizontal spatial resolution of 3.75° × 3.75° (Molteni, 2003). The three-dimensional state variables of the model are the two coordinates of the horizontal wind vector, temperature, and specific humidity defined at eight vertical σ-levels (0.025, 0.095, 0.20, 0.34, 0.51, 0.685, 0.835, and 0.95), where σ is the ratio of pressure to the surface pressure. The single two-dimensional state variable is the natural logarithm of surface pressure. In the standard configuration of SPEEDY the boundary conditions are prescribed ERA Interim climatological fields for 1979–2008.

The global computational grid and the state variables of the hybrid model are the same as those of SPEEDY. The L = 1,152 local regions for the atmospheric state variables have a 7.5° × 7.5° horizontal footprint and contain all vertical levels. The extended local regions have a horizontal footprint of 15.0° × 15.0° with an overlap of 3.75° (1 grid point) on each side. The climatological mean and standard deviation for the standardization of the components of the local state vectors and input vectors of the reservoirs are computed for the specific variable at the specific vertical level for the extended local region.
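
The region counts above follow from the grid geometry: a 3.75° grid has 96 × 48 points, a 7.5° × 7.5° footprint covers 2 × 2 points, and 48 × 24 = 1,152 regions result. The sketch below builds the index sets under two assumptions not stated in the text: longitude is treated as periodic, and extended regions are simply clipped at the poles. The function name `local_regions` is hypothetical.

```python
import numpy as np

N_LON, N_LAT = 96, 48   # 360 / 3.75 and 180 / 3.75 grid points
CORE = 2                # 7.5 deg footprint = 2 grid points per side
HALO = 1                # 3.75 deg (1 grid point) of overlap on each side

def local_regions():
    """Return (core_lat, core_lon, ext_lat, ext_lon) index arrays per region."""
    regions = []
    for j0 in range(0, N_LAT, CORE):
        for i0 in range(0, N_LON, CORE):
            core_lat = np.arange(j0, j0 + CORE)
            core_lon = np.arange(i0, i0 + CORE)
            # Assumption: clip the halo at the poles.
            ext_lat = np.arange(max(j0 - HALO, 0), min(j0 + CORE + HALO, N_LAT))
            # Assumption: wrap the halo around in longitude.
            ext_lon = np.arange(i0 - HALO, i0 + CORE + HALO) % N_LON
            regions.append((core_lat, core_lon, ext_lat, ext_lon))
    return regions
```

Away from the poles each extended region spans 4 × 4 grid points, that is, 15° × 15°, which matches the footprint quoted above.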

The input vectors of the reservoirs for the atmospheric prognostic variables include the standardized values of the atmospheric prognostic variables from the extended local region, plus the incoming solar radiation at the top of the atmosphere. The “time step” for the atmospheric state variables is Δt = 6 hr. The other hyperparameters of the reservoirs for the atmospheric prognostic variables are the same as in AEA22: Dr = 6,000, α = 0.5, βR = 10⁻⁴, βP = 1, κ = 6, ɛ = 0.2, and ρ increases from 0.3 at the equator to 0.7 at latitude 45° and beyond. These values were found by hand tuning based on numerical experiments with the goal of producing stable predictions with the best medium-range (3–5 days) weather forecast skill.

A local state vector and reservoir are created for the SST only if the local region includes at least one oceanic grid point. The coordinates of the local state vectors are the standardized SST values at the oceanic grid points. (A similar approach is employed in the standalone parallel RC-based global SST model of Walleshauser and Bollt (2022).) The “time step” for the SST is Δt* = 168 hr (7 days), which is 28 times longer than Δt. The elements of the input vectors of the reservoirs are the averages of the atmospheric state variables at the lowest model level for the period [t, t + Δt*] and the SST at time t from the extended local region. At grid points over land, the SST elements of the input vectors are set to a predefined constant (land mask) value. A non-standardized SST value ≤−1°C is assumed to indicate ice. In a local region where the ocean is permanently covered by ice in the training data, the ocean is assumed to remain covered by ice. In a local region where both water and ice are present in the training data, the phase of sea water is allowed to change, but non-standardized values of the SST that are <−1°C at the end of a time step are reset to −1°C to promote stability. The other hyperparameters for the SST are urn:x-wiley:00948276:media:grl65785:grl65785-math-0057, α* = 0.6, βR* = 10⁻⁴, κ* = 6, ρ* = 0.9, ɛ* = 0.1. These values of the hyperparameters were also found by hand tuning, using numerical experiments to determine the values that produced the best ocean climatology. Tuning was not difficult and required only about 10 experiments. The feedback from the SST to the atmosphere is introduced by replacing the prescribed SST field of SPEEDY with the last predicted values of the SST, which stay constant for 7 days.

The training and verification data are ERA5 reanalyses (Hersbach et al., 2020). The training period is from 0000 UTC 1 January 1981 to 0000 UTC 1 December 2006. The ERA5 reanalyses from December 2006 are used to keep the reservoirs synchronized with the atmosphere, and a 70-year simulation experiment (free run without observational input) is started from the ERA5 reanalysis for 0000 UTC 1 January 2007. The hybrid model remains stable and produces a realistic climate for the entirety of the experiment. The hybrid model climatology for the first 40 years of this experiment is compared to the ERA5 climatology for 1981–2020, and the 40-year climatology for a free run with SPEEDY, which is started on 0000 UTC 1 January 1981.

3.2 Sudden Stratospheric Warming

The dominant features of stratospheric variability are wintertime events of sudden stratospheric warming (SSW) in the Northern Hemisphere (NH). The term SSW refers to a dynamical process in which the normally strong westerly zonal mean flow at the edge of the NH stratospheric polar vortex suddenly turns easterly, which leads to a sudden rise of the polar stratospheric temperature. This rapid change is caused by an unusually strong coupling between the dynamics of the stratospheric and tropospheric flow (Andrews et al., 1987). While SPEEDY would need additional vertical levels above 25 hPa (its current top level) to produce realistic stratospheric dynamics, the hybrid model can produce realistic SSW events (Figure 1). The blue curves and gray shades show, respectively, the calendar-day mean and year-to-year variability of the strength of the zonal flow at the edge of the stratospheric polar vortex (top three panels) and the polar temperature (bottom three panels). From July to December, the stratospheric flow (left panels) first turns from easterly (negative values) to westerly (positive values), and then it gradually strengthens until midwinter, when it starts to weaken and eventually turns easterly again in April. The mean polar temperature gradually decreases from midsummer to midwinter, when it starts to increase to complete the cycle. The variability of the strength of the zonal flow and the polar temperature is low from May to September and high from October to April, with a maximum in midwinter.

Figure 1. The performance of the hybrid model in capturing SSW. Results are shown for the (left) ERA5 reanalyses, (center) hybrid model, and (right) SPEEDY. Results are shown at the 25 hPa pressure level for (top panels) the mean of the zonal wind component in the 55°N–65°N latitude band, and (bottom panels) the mean temperature north of 60°N. Blue curves show the climatological daily mean, while the gray shading characterizes the annual variability by displaying the range between plus and minus two standard deviations. Positive values of the wind indicate westerly flow, while negative values indicate easterly flow. The red curves show the same diagnostics as the blue curves, except for a particular SSW event rather than the 40-year mean. (No SSW event is detected for SPEEDY.) The event from ERA5 took place in 2013.

While both the hybrid model (middle two panels) and SPEEDY (right two panels) can capture the mean trends, the hybrid model somewhat overestimates, while SPEEDY substantially underestimates, the variability of the flow. The relationship between the variability of the flow and SSW can be further investigated by using the criterion of Charlton and Polvani (2007) to detect SSW: an event occurs when the stratospheric zonal mean of the zonal wind at 60°N turns easterly and then turns back to westerly for at least 10 consecutive days. Here, the criterion is applied to the zonal wind at vertical pressure level 25 hPa. For ERA5, the hybrid model, and SPEEDY, there are 0.6, 0.87, and zero SSW events per year, respectively. The examples for an event shown by the red curves in Figure 1 illustrate that both the speed of the onset and the duration of the SSW are captured realistically by the hybrid model.
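
The detection rule as worded above can be sketched for a single winter of daily zonal-mean zonal winds. This is a simplified illustration that implements only the stated rule (a reversal to easterly followed later by at least 10 consecutive westerly days); it is not a full implementation of the published criterion, and the function name `detect_ssw_onsets` is hypothetical.

```python
import numpy as np

def detect_ssw_onsets(u, min_westerly_days=10):
    """Return zero-based day indices of SSW onsets in one winter.

    u: daily zonal-mean zonal wind in m/s (positive = westerly).
    """
    u = np.asarray(u, dtype=float)
    onsets = []
    i = 1
    while i < len(u):
        if u[i - 1] >= 0.0 and u[i] < 0.0:   # westerly -> easterly reversal
            run, j = 0, i + 1
            # Look for a later run of consecutive westerly days.
            while j < len(u) and run < min_westerly_days:
                run = run + 1 if u[j] >= 0.0 else 0
                j += 1
            if run < min_westerly_days:
                break                         # no recovery in the record: not counted
            onsets.append(i)
            i = j                             # resume the search after the recovery
        else:
            i += 1
    return onsets
```

An events-per-year figure of the kind quoted above would follow from applying such a function to each winter and dividing the total number of onsets by the number of years.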

3.3 Precipitation Climatology

The prognostic precipitation variable of the hybrid model provides cumulative precipitation values with 6-hourly resolution, while the diagnostic precipitation variable of SPEEDY provides this variable with a daily resolution. The precipitation model climatologies of Figure 2 are based on these variables. This figure shows that the hybrid model produces lower-magnitude precipitation biases than SPEEDY at most locations (top two rows of panels): the 1.20 mm per day global root-mean-square of the bias for SPEEDY is reduced to 0.63 mm per day for the hybrid model, and the absolute value of the largest-magnitude local bias is reduced from 9.87 mm per day to 5.17 mm per day. SPEEDY has a dry bias in the extension regions of the Kuroshio Current and Gulf Stream (two right panels), which is greatly reduced by the hybrid approach (top two middle panels). The bias is also lower for the hybrid model than for SPEEDY in mountainous regions (e.g., Rockies, Himalayas) and in equatorial South America and Africa. One region where the hybrid model has a larger bias than SPEEDY is the Tropical Pacific, where it has a wet bias. Interestingly, the hybrid model produces a “double ITCZ,” which has also been a persistent problem for physics-based models (Zhang et al., 2019).

Figure 2. The performance of the hybrid model in capturing the precipitation climatology. Shown is the climatological daily mean precipitation rate for (top left) ERA5, (top center) the hybrid model, and (top right) SPEEDY. Also shown are (middle left) the difference between the biases of the daily precipitation rates for the hybrid model and SPEEDY, and (middle center) the biases of the daily precipitation rates for the hybrid model and (middle right) SPEEDY. Also shown (bottom center) are the rates of occurrence of different precipitation intensities in percentile for (blue) ERA5, (orange) the hybrid model, and (green) SPEEDY.

In addition to providing improved mean precipitation, the hybrid model captures the daily variability of the precipitation more accurately than SPEEDY.

3.4 SST Climatology and ENSO

The SST prognostic variable (Figure 3, left panels) has low biases: the global root-mean-square value of the SST bias is 0.43°C, while the largest local values of the bias are in the 1°–2°C range. While the model correctly captures the main regions of largest temporal variability of the SST (Figure 3, right panels), it tends to somewhat underestimate the variability associated with the western boundary currents and their extension regions, and overestimate the variability associated with ENSO in the Equatorial Pacific near the coast of South America.

Figure 3. The SST climatology of the hybrid model. Shown are the climatological mean SST for (top left) ERA5, (middle left) the hybrid model, and (bottom left) the difference between the two fields; and the standard deviation of the monthly mean SST for (top right) ERA5 and (middle right) the hybrid model, and (bottom right) the difference between the two fields.

The skills and the limitations of the model in capturing climate variability related to ENSO are further illustrated by Figure 4. Two of the most common metrics used for diagnosing ENSO phases are the Oceanic Niño Index (ONI) and the Southern Oscillation Index (SOI) for the Niño 3.4 region (5°S–5°N, 120°W–170°W). The model correctly captures the inverse relationship between the smoothed time series of the two indexes (top panel). In addition, the autocorrelation function of the Niño 3.4 SST anomalies for the model is in good agreement with that for the ERA5 reanalyses for the first 6 months of lag (bottom left panel). The model, however, does not capture the timing of the crossover into negative autocorrelation at about 10 months: the model transitions from one phase of ENSO to another with a delay. The power spectrum of ENSO is also similar to that of ERA5 (bottom right panel). While some climate models produce too much variability associated with ENSO in the western Tropical Pacific (Menary et al., 2018), the hybrid model does not exhibit such behavior (Figure 3, bottom right panel).
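
The smoothed index time series and the lagged autocorrelation discussed above can be computed from a monthly anomaly series with a few lines of NumPy. This is a generic sketch, assuming a monthly Niño 3.4 SST anomaly series is already available as a one-dimensional array; the function names are hypothetical and the estimator shown is the common biased sample autocorrelation, which may differ from the one used for Figure 4.

```python
import numpy as np

def running_mean(x, window=3):
    """Running mean over `window` consecutive samples (valid part only)."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(window) / window, mode="valid")

def autocorrelation(x, max_lag):
    """Sample autocorrelation for lags 0, 1, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var
                     for k in range(max_lag + 1)])
```

For a series that oscillates with a 48-month period, for example, the autocorrelation computed this way crosses zero near a lag of 12 months and reaches its minimum near 24 months, which is the kind of crossover timing compared in the text.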

Figure 4. Illustration of the performance of the hybrid model in capturing ENSO. Shown are (top) time series of (solid black) the 3-month running mean of the ONI and (green dashes) the 5-month running mean of the SOI. Red and blue shadings indicate El Niño and La Niña, respectively. Also shown are (bottom left) the autocorrelation functions and (bottom right) power spectra of the Niño 3.4 SST anomalies for (orange) the ERA5 reanalyses and (blue) the coupled model. The error bars are computed by splitting the time series of Niño 3.4 SST anomalies into overlapping segments and calculating the standard deviation across each segment.

4 Conclusions

The goal of this paper was to demonstrate that hybridizing an AGCM by incorporating ML can help the model to capture dynamical processes of nature that are missing from climate simulations with the AGCM. For some dynamical processes, this potential can be realized without introducing new prognostic variables in the ML component of the model. This point was illustrated with the process of SSW. Some other processes can be introduced into the model dynamics by adding new ML-based prognostic variables. This point was illustrated by two examples. First, the 6-hr cumulative precipitation was introduced as a prognostic variable, and it was shown that the model produced a highly realistic climatology for the newly added prognostic variable. Second, SST, which is a prescribed boundary parameter of the AGCM, was turned into a prognostic variable. The SST prognostic variable had highly realistic climatology, and it also had a realistic ENSO signal. Moreover, the hybrid model also correctly captured the related atmospheric surface pressure signal, the indication of a realistic two-way coupling between an oceanic state variable and the model atmosphere.

State-of-the-art Earth System models require thousands of processors and can typically simulate one decade per day (e.g., Golaz et al., 2019). In contrast, our hybrid model can be run on a small cluster and the 70-year simulation presented in this paper took 18 hr with 32 processors. Because the ML component of the hybrid model is based on RC, training the model is also computationally highly efficient. The training described in this paper required 40 min wall-clock time using 1,152 processors on a medium-sized cluster.
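
The cost figures quoted above can be converted into throughput numbers with simple arithmetic. The short calculation below uses only the values stated in the text; the variable names are illustrative, and processor-hours are computed as wall-clock time multiplied by processor count.

```python
# Simulation: 70 simulated years in 18 wall-clock hours on 32 processors.
sim_years, sim_hours, sim_procs = 70, 18, 32
years_per_day = sim_years / sim_hours * 24                   # simulated years per wall-clock day
proc_hours_per_year = sim_hours * sim_procs / sim_years      # processor-hours per simulated year

# Training: 40 wall-clock minutes on 1,152 processors.
train_proc_hours = (40 / 60) * 1152

print(f"{years_per_day:.1f} simulated years per day")
print(f"{proc_hours_per_year:.1f} processor-hours per simulated year")
print(f"{train_proc_hours:.0f} processor-hours for training")
```

On these numbers the simulation runs at roughly 93 simulated years per wall-clock day, compared with the one decade per day cited for state-of-the-art Earth System models.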

One important caveat concerning our conclusions is that they are based on the application of the hybrid approach to an AGCM that has much lower resolution and simpler parameterization schemes than a state-of-the-art AGCM. A state-of-the-art model may leave less room for the improvement of the model representation of some dynamical processes. It is also yet to be seen whether the hybrid model state satisfies an interpretable physics-based water and energy budget equation (e.g., Bosilovich et al., 2011). This would provide further support to the proposed approach to turn boundary parameters into ML-based interactive prognostic variables. Finally, while we show that the hybrid model is able to replicate the current climate, the dynamics of the climate is inherently nonstationary. Patel et al. (2021) and Patel and Ott (2023) outline a method to incorporate nonstationarity into a hybrid model similar to the one presented in this study. Using this method, their hybrid model was able to anticipate tipping points and simulate post-tipping point climates in toy models. Our plan is to apply their method to investigate the nonstationarity of the climate using the hybrid model of the present study.


This work was supported by DARPA contract HR00112290035, and it was conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing. Dhruvit Patel provided helpful comments on the manuscript.

    Data Availability Statement

    The code to run and analyze the results of the experiments in this study is available online (https://doi.org/10.5281/zenodo.7508156). The trained weights for the hybrid model used in this study are available online (https://doi.org/10.5281/zenodo.7222830).