It is shown that a recently developed hybrid modeling approach that combines machine learning (ML) with an atmospheric global circulation model (AGCM) can serve as a basis for capturing atmospheric processes not captured by the AGCM. This power of the approach is illustrated by three examples from a decades-long climate simulation experiment. The first example demonstrates that the hybrid model can produce sudden stratospheric warming, a dynamical process of nature not resolved by the low-resolution AGCM component of the hybrid model. The second and third examples show that introducing 6-hr cumulative precipitation and sea surface temperature (SST) as ML-based prognostic variables improves the precipitation climatology and leads to a realistic ENSO signal in the SST and atmospheric surface pressure.
A hybrid system combining an atmospheric global circulation model (AGCM) with a machine-learning component can capture processes not captured by the AGCM
Machine learning provides a flexible framework to introduce additional prognostic variables into the hybrid model
The prototype hybrid model tested in the paper is stable and has a realistic climate in decades-long simulation experiments
Plain Language Summary
This paper introduces and tests schemes that significantly expand the utility and scope of a recently introduced hybrid modeling technique that combines machine learning with an atmospheric global circulation model (AGCM). Simulation experiments are carried out with an implementation of the approach on a low-resolution, simplified AGCM. An examination of the simulated atmospheric circulation suggests that the hybrid model can capture dynamical processes not captured by the AGCM. Moreover, adding precipitation and sea surface temperature (SST) to the model as quantities predicted by machine learning improves the precipitation climatology and leads to a realistic El Niño-La Niña signal in the SST and atmospheric surface pressure.
Arcomano et al. (2022) (AEA22 hereafter) described a hybrid atmospheric modeling approach that combines machine learning (ML) with an atmospheric general circulation model (AGCM). They showed that, when the hybrid model was used for weather prediction, it provided more accurate short- and medium-range (1–7 days) forecasts than either the AGCM or the ML-only component of the model (Arcomano et al., 2020) acting alone. They also showed that when the model was used for climate simulations, it greatly reduced the systematic errors (biases) of the model climate compared to that of the AGCM. In the present study, we further explore the potential of the approach of AEA22 for climate modeling, and describe methods that significantly extend its utility and scope. The results we report are in accord with the idea that the inaccuracies of an AGCM could potentially be mitigated by utilization of information in time series of past observational data via the ML component of the hybrid.
The approach of AEA22 is an implementation of the combined hybrid/parallel prediction (CHyPP) scheme of Wikner et al. (2020) on an AGCM. CHyPP itself is an adaptation of the hybrid modeling approach of Pathak, Wikner, et al. (2018) to large dynamical systems, using the parallel reservoir computing (RC) algorithm of Pathak, Hunt, et al. (2018) for ML. Other hybrid approaches recently proposed for earth system modeling (Brenowitz & Bretherton, 2018, 2019; Chattopadhyay et al., 2020; Clark et al., 2022; Farchi et al., 2021; Gentine et al., 2018; Rasp et al., 2018; Watt-Meyer et al., 2021) use either random forests or deep learning for ML.
Section 2 summarizes the approach of AEA22 and explains how additional prognostic variables can be introduced into the hybrid model without changing the AGCM. Section 3 demonstrates the potential of the approach by three examples from a climate simulation experiment. The first example, the presence of sudden stratospheric warming (SSW) events, illustrates that the hybrid model can capture some dynamical processes of nature not resolved by the AGCM. The second and third examples, realistic precipitation climatology and SST variability, demonstrate that some other dynamical processes can be reproduced by modifying the AEA22 hybrid via addition of new ML-based prognostic variables (precipitation and sea surface temperature). As in AEA22, the AGCM of the simulation experiments is the Simplified Parameterization, primitive-Equation Dynamics (SPEEDY) model (Kucharski et al., 2006; Molteni, 2003).
2 The Hybrid Modeling Approach
2.1 The Hybrid Modeling Approach of AEA22
The hybrid model uses the same computational grid as the AGCM. The elements of the hybrid global state vector and the physics-based global state vector are the grid-point values of the prognostic model variables. The input of the “one-time-step” hybrid global model solution for time t + Δt is the hybrid global state at time t, where the “time step” Δt is significantly longer than the time step of the AGCM. No changes are made to the AGCM, which is started from the hybrid global state at time t to provide the physics-based contribution to the solution. The hybridization is done by subdividing the global atmosphere horizontally into L local regions and obtaining a hybrid local model solution for each region. The computations for the different local regions (ℓ = 1, 2, …, L) are carried out in parallel, and the hybrid global solution is obtained by assembling the hybrid local solutions. The next paragraph outlines the calculations that provide the hybrid local model solution for local region ℓ.
The initial value of the hybrid global state at the beginning of a forecast or simulation is a conventional observational analysis for the AGCM. Starting the hybrid model also requires an initial value rℓ(0) for each of the L reservoir state vectors. These initial values are obtained using Equation 2 to synchronize the evolution of the reservoirs with the atmospheric states for a short period (t < 0) prior to the start time of the forecast or simulation. This synchronization is achieved by feeding the reservoirs input vectors based on observational analyses for the synchronization period of 1 month.
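The synchronization step can be illustrated with a generic leaky echo-state reservoir. The sketch below is not the update rule of the hybrid model; the form of the update, the way the matrices are built, and all names (`make_reservoir`, `synchronize`, `leak`, `input_scale`) are assumptions made for illustration, and the sizes are far smaller than those used in the experiments. No mapping between these arguments and the hyperparameters of the paper is implied.

```python
import numpy as np

def make_reservoir(n_res, n_in, spectral_radius=0.5, degree=6,
                   input_scale=0.5, seed=0):
    """Build a sparse random reservoir matrix A and an input matrix B."""
    rng = np.random.default_rng(seed)
    # Each entry of A is nonzero with probability degree / n_res.
    mask = rng.random((n_res, n_res)) < degree / n_res
    A = np.where(mask, rng.uniform(-1.0, 1.0, (n_res, n_res)), 0.0)
    # Rescale so that the largest eigenvalue magnitude equals spectral_radius.
    A *= spectral_radius / np.max(np.abs(np.linalg.eigvals(A)))
    B = input_scale * rng.uniform(-1.0, 1.0, (n_res, n_in))
    return A, B

def synchronize(A, B, inputs, leak=0.5):
    """Drive the reservoir with a sequence of input vectors, starting from r = 0.

    The state after the last input serves as the initial reservoir state
    for the subsequent free run.
    """
    r = np.zeros(A.shape[0])
    for u in inputs:
        r = (1.0 - leak) * r + leak * np.tanh(A @ r + B @ u)
    return r

# Stand-in for about one month of 6-hourly standardized input vectors.
A, B = make_reservoir(n_res=200, n_in=8)
analyses = np.random.default_rng(1).standard_normal((120, 8))
r0 = synchronize(A, B, analyses)
```

In this sketch the reservoir is never reset between inputs, so the final state carries a memory of the recent input history, which is the purpose of the synchronization period.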
2.2 Training the Hybrid Model
The machine-learning component of the model learns to predict the hybrid local model solution for each local region by training. A flowchart of the hybrid model and a schematic of training can be found in AEA22. The training data are based on global observational analyses. These analyses provide the initial conditions for the Δt-long AGCM forecasts and are standardized and restricted to the extended local region to form the input uℓ(t) for each of the L reservoirs. To promote stability, a small-magnitude random noise ɛ(kΔt) is introduced into the analyses before forming the input vectors.
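Training a reservoir-computing readout is commonly posed as a regularized linear least-squares problem. The following is a minimal sketch under that assumption, with a feature matrix that would hold the reservoir states stacked with the physics-based forecasts; it is not the cost function of AEA22. The function names, the single regularization parameter `beta`, and the additive Gaussian form of the noise are all hypothetical choices made for illustration.

```python
import numpy as np

def train_readout(features, targets, beta=1e-4):
    """Closed-form ridge regression: minimize ||W X - Y||^2 + beta ||W||^2.

    features: X with shape (n_features, n_samples)
    targets:  Y with shape (n_outputs, n_samples)
    returns:  W with shape (n_outputs, n_features)
    """
    X, Y = features, targets
    gram = X @ X.T + beta * np.eye(X.shape[0])
    # W = Y X^T (X X^T + beta I)^(-1), computed with a linear solve.
    return np.linalg.solve(gram, X @ Y.T).T

def add_input_noise(analyses, eps=0.2, seed=0):
    """Perturb standardized analyses with small noise (additive form assumed)."""
    rng = np.random.default_rng(seed)
    return analyses + eps * rng.standard_normal(analyses.shape)
```

Because the readout is linear in its weights, training reduces to one linear solve per local region, which is why the regions can be trained independently and in parallel.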
2.3 Introducing New ML-Based Prognostic Variables
In atmospheric modeling, the term “prognostic variable” refers to a state variable whose temporal evolution is predicted directly by a model equation. The hybrid approach provides a framework for introducing new prognostic variables without making any changes to the AGCM, provided that training data are available for the new variables. Two specific methods that take advantage of this flexibility are described here: Method I is designed for atmospheric variables that are not required to evolve the AGCM, while Method II is designed for external variables, that is, variables represented by prescribed boundary fields in a standalone AGCM, which might vary on a different time scale than the atmospheric prognostic variables.
2.3.1 Method I
The purpose of Method I is to introduce a prognostic variable that is either not predicted by the AGCM, or predicted only indirectly as a “byproduct” of the parameterization schemes. This approach will be demonstrated by introducing precipitation as a prognostic variable (Section 3).
2.3.2 Method II
In an AGCM, the effects of the other earth system components, such as the ocean, cryosphere, land, and biosphere on the atmosphere are taken into account by parameterization schemes that include fields of some state variables of the other components at the earth's surface as input. In a standalone AGCM these fields must be prescribed. For instance, the thermal effects of the ocean on the atmosphere are taken into account by schemes that include prescribed SST fields, which are based on past SST observational analyses in the case of a climate simulation, or the latest SST analysis in the case of a weather forecast. A limitation of this approach is that it does not take into account feedbacks from the state variables of the AGCM to the prescribed state variables. Method II addresses this issue by replacing a prescribed field with an ML-based prognostic variable. It also takes into account the fact that the climate-relevant effects of these feedbacks on the atmosphere typically occur on time scales that are different than the time scale of the changes of the atmosphere on which the AGCM evolves. Method II will be demonstrated by introducing SST as a prognostic variable (Section 3).
3 Climate Simulation Experiment
3.1 Experiment Design
The hybrid model is the same as in AEA22, except that precipitation and SST are added as prognostic variables to the two horizontal coordinates of the wind vector, temperature, specific humidity, and the logarithm of the surface pressure. Following Pathak et al. (2022) and Rasp and Thuerey (2021), the precipitation variable is defined by ln(P/0.001 + 1), where P is the cumulative precipitation in meters for the prior 6 hr. The fields of the SST, precipitation, and logarithm of the surface pressure are two-dimensional, while the fields of the other variables are three-dimensional.
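The precipitation transform and its inverse can be written in a few lines. This is a small sketch of the formula ln(P/0.001 + 1) given above; the function names are hypothetical.

```python
import numpy as np

def precip_to_model_variable(p_meters):
    """Map 6-hr cumulative precipitation P (in meters) to ln(P / 0.001 + 1)."""
    return np.log1p(np.asarray(p_meters, dtype=float) / 0.001)

def model_variable_to_precip(x):
    """Invert the transform: P = 0.001 * (exp(x) - 1), in meters."""
    return 0.001 * np.expm1(np.asarray(x, dtype=float))
```

The transform maps zero precipitation to zero and compresses the long tail of heavy precipitation values, so the variable the reservoirs see has a less skewed distribution than P itself.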
The hybrid model is implemented using the standard configuration of Version 41 of SPEEDY with a spectral horizontal resolution of T30 and a nominal horizontal spatial resolution of 3.75° × 3.75° (Molteni, 2003). The three-dimensional state variables of the model are the two coordinates of the horizontal wind vector, temperature, and specific humidity defined at eight vertical σ-levels (0.025, 0.095, 0.20, 0.34, 0.51, 0.685, 0.835, and 0.95), where σ is the ratio of pressure to the surface pressure. The single two-dimensional state variable is the natural logarithm of surface pressure. In the standard configuration of SPEEDY the boundary conditions are prescribed ERA Interim climatological fields for 1979–2008.
The global computational grid and the state variables of the hybrid model are the same as those of SPEEDY. The L = 1,152 local regions for the atmospheric state variables have a 7.5° × 7.5° horizontal footprint and contain all vertical levels. The extended local regions have a horizontal footprint of 15.0° × 15.0° with an overlap of 3.75° (1 grid point) on each side. The climatological mean and standard deviation for the standardization of the components of the local state vectors and input vectors of the reservoirs are computed for the specific variable at the specific vertical level for the extended local region.
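The region count follows from the grid dimensions. The sketch below assumes a regular 96 × 48 longitude-latitude grid implied by the nominal 3.75° resolution; the treatment of the overlap at the poles (simple truncation) is an assumption, and the function name `local_regions` is hypothetical.

```python
import numpy as np

N_LON, N_LAT = 96, 48   # nominal 3.75-degree grid: 360 / 3.75 and 180 / 3.75
CORE = 2                # 7.5 degrees = 2 grid points per side
HALO = 1                # 3.75 degrees = 1 grid point of overlap per side

def local_regions():
    """Yield (core_lon, core_lat, ext_lon, ext_lat) index arrays per region."""
    for j0 in range(0, N_LAT, CORE):
        for i0 in range(0, N_LON, CORE):
            core_lon = np.arange(i0, i0 + CORE)
            core_lat = np.arange(j0, j0 + CORE)
            # Longitude is periodic, so the overlap wraps around the globe.
            ext_lon = np.arange(i0 - HALO, i0 + CORE + HALO) % N_LON
            # Assumption: the overlap is truncated at the poles.
            ext_lat = np.arange(max(j0 - HALO, 0),
                                min(j0 + CORE + HALO, N_LAT))
            yield core_lon, core_lat, ext_lon, ext_lat

regions = list(local_regions())
print(len(regions))  # 1152
```

Each core region holds 2 × 2 grid points and each interior extended region holds 4 × 4, so 48 × 24 = 1,152 regions cover the globe.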
The input vectors of the reservoirs for the atmospheric prognostic variables include the standardized values of the atmospheric prognostic variables from the extended local region, plus the incoming solar radiation at the top of the atmosphere. The “time step” for the atmospheric state variables is Δt = 6 hr. The other hyperparameters of the reservoirs for the atmospheric prognostic variables are the same as in AEA22: Dr = 6,000, α = 0.5, βR = 10⁻⁴, βP = 1, κ = 6, ɛ = 0.2, and ρ increases from 0.3 at the equator to 0.7 at latitude 45° and beyond. These values were found by hand tuning based on numerical experiments with the goal of producing stable predictions with the best medium-range (3–5 days) weather forecast skill.
A local state vector and reservoir are created for the SST only if the local region includes at least one oceanic grid point. The coordinates of the local state vectors are the standardized SST values at the oceanic grid points. (A similar approach is employed in the standalone parallel RC-based global SST model of Walleshauser and Bollt (2022).) The “time step” for the SST is Δt* = 168 hr (7 days), which is 28 times longer than Δt. The elements of the input vectors of the reservoirs are the averages of the atmospheric state variables at the lowest model level for the period [t, t + Δt*] and the SST at time t from the extended local region. At grid points over land, the SST elements of the input vectors are set to a predefined constant (land mask) value. A non-standardized SST value ≤ −1°C is assumed to indicate ice. In a local region where the ocean is permanently covered by ice in the training data, the ocean is assumed to remain covered by ice. In a local region where both water and ice are present in the training data, the phase of sea water is allowed to change, but non-standardized values of the SST that are < −1°C at the end of a time step are reset to −1°C to promote stability.
The other hyperparameters for the SST include α* = 0.6, βR* = 10⁻⁴, κ* = 6, ρ* = 0.9, and ɛ* = 0.1. These values were also found by hand tuning, using numerical experiments to determine the values that produced the best ocean climatology. Tuning was not difficult, requiring only about 10 experiments. The feedback from the SST to the atmosphere is introduced by replacing the prescribed SST field of SPEEDY with the last predicted values of the SST, which stay constant for 7 days.
The training and verification data are ERA5 reanalyses (Hersbach et al., 2020). The training period is from 0000 UTC 1 January 1981 to 0000 UTC 1 December 2006. The ERA5 reanalyses from December 2006 are used to keep the reservoirs synchronized with the atmosphere, and a 70-year simulation experiment (free run without observational input) is started from the ERA5 reanalysis for 0000 UTC 1 January 2007. The hybrid model remains stable and produces a realistic climate for the entirety of the experiment. The hybrid model climatology for the first 40 years of this experiment is compared to the ERA5 climatology for 1981–2020, and the 40-year climatology for a free run with SPEEDY, which is started on 0000 UTC 1 January 1981.
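Two of the SST-specific steps, the time averaging of the atmospheric inputs and the reset of SST values below the ice threshold, can be sketched as follows. The array layout and function names are hypothetical, and the sketch assumes that the average over [t, t + Δt*] uses the 28 six-hourly states of one SST step; the exact sampling of the averaging window is not specified in the text.

```python
import numpy as np

DT_ATM_HR = 6
DT_SST_HR = 168
STEPS_PER_SST_STEP = DT_SST_HR // DT_ATM_HR   # 28 atmospheric steps per SST step
ICE_THRESHOLD_C = -1.0

def average_lowest_level(atm_states):
    """Average lowest-level atmospheric fields over one SST time step.

    atm_states: array of shape (28, n_vars, n_points) of 6-hourly states.
    """
    if atm_states.shape[0] != STEPS_PER_SST_STEP:
        raise ValueError("expected one SST step worth of 6-hourly states")
    return atm_states.mean(axis=0)

def clamp_sst(sst_c):
    """Reset non-standardized SST values below -1 C to -1 C."""
    return np.maximum(np.asarray(sst_c, dtype=float), ICE_THRESHOLD_C)
```

The clamp leaves values at or above −1°C unchanged, so it only acts on the unphysically cold values that could otherwise accumulate during a long free run.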
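The two time scales imply a simple update schedule: the atmosphere advances 28 times with the SST held fixed, and the SST is then advanced once using the atmospheric states collected over that window. The loop below is a schematic of that schedule only; `atm_step` and `sst_step` are hypothetical placeholders for the hybrid atmospheric step and the ML-based SST step.

```python
def run_coupled(n_atm_steps, atm_step, sst_step, atm0, sst0, steps_per_sst=28):
    """Advance the atmosphere every step and the SST once per window."""
    atm, sst = atm0, sst0
    window = []
    for _ in range(n_atm_steps):
        atm = atm_step(atm, sst)      # the SST is held constant within the window
        window.append(atm)
        if len(window) == steps_per_sst:
            sst = sst_step(sst, window)   # uses the states from the last 7 days
            window = []
    return atm, sst
```

Passing the two steps in as functions keeps the schedule independent of how either component is implemented.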
3.2 Sudden Stratospheric Warming
The dominant features of stratospheric variability are wintertime events of sudden stratospheric warming (SSW) in the NH. The term SSW refers to a dynamical process in which the normally strong westerly zonal mean flow at the edge of the NH stratospheric polar vortex suddenly turns easterly, which leads to a sudden rise of the polar stratospheric temperature. This rapid change is caused by an unusually strong coupling between the dynamics of the stratospheric and tropospheric flow (Andrews et al., 1987). While SPEEDY would need additional vertical levels above 25 hPa (its current top level) to produce realistic stratospheric dynamics, the hybrid model can produce realistic SSW events (Figure 1). The blue curves and gray shades show, respectively, the calendar-day mean and year-to-year variability of the strength of the zonal flow at the edge of the stratospheric polar vortex (top three panels) and the polar temperature (bottom three panels). From July to December, the stratospheric flow (left panels) first turns from easterly (negative values) to westerly (positive values), and then it gradually strengthens until midwinter, when it starts to weaken and eventually turns easterly again in April. The mean polar temperature gradually decreases from midsummer to midwinter, when it starts to increase to complete the cycle. The variability of the strength of the zonal flow and the polar temperature is low from May to September and high from October to April, with a maximum in midwinter.
While both the hybrid model (middle two panels) and SPEEDY (right two panels) can capture the mean trends, the hybrid model somewhat overestimates, while SPEEDY substantially underestimates, the variability of the flow. The relationship between the variability of the flow and SSW can be further investigated by using the criterion of Charlton and Polvani (2007) to detect SSW: an event occurs when the stratospheric zonal mean of the zonal wind at 60°N turns easterly and then turns back to westerly for at least 10 consecutive days. Here, the criterion is applied to the zonal wind at vertical pressure level 25 hPa. For ERA5, the hybrid model, and SPEEDY, there are 0.6, 0.87, and zero SSW events per year, respectively. The examples for an event shown by the red curves in Figure 1 illustrate that both the speed of the onset and the duration of the SSW are captured realistically by the hybrid model.
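The detection rule as stated above can be sketched for a daily time series of the zonal-mean zonal wind at 60°N. This is a simplified reading of the criterion, not a full implementation of Charlton and Polvani (2007): it applies no restriction to the winter season, and the function name and the sign convention (positive values westerly) are assumptions.

```python
import numpy as np

def count_ssw_events(u_daily, min_westerly_days=10):
    """Count wind reversals followed by a sustained return to westerlies.

    u_daily: daily zonal-mean zonal wind, positive for westerly flow.
    """
    u = np.asarray(u_daily, dtype=float)
    n = len(u)
    events = 0
    i = 1
    while i < n:
        if u[i - 1] >= 0 and u[i] < 0:          # westerly-to-easterly reversal
            run = 0
            j = i + 1
            while j < n and run < min_westerly_days:
                run = run + 1 if u[j] >= 0 else 0
                j += 1
            if run < min_westerly_days:
                break                            # the winds never recover
            events += 1
            i = j                                # resume after the recovery
        else:
            i += 1
    return events
```

Dividing the returned count by the number of simulated years gives an event frequency comparable to the per-year figures quoted above.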
3.3 Precipitation Climatology
The prognostic precipitation variable of the hybrid model provides cumulative precipitation values with 6-hourly resolution, while the diagnostic precipitation variable of SPEEDY provides this variable with a daily resolution. The precipitation model climatologies of Figure 2 are based on these variables. This figure shows that the hybrid model produces lower-magnitude precipitation biases than SPEEDY at most locations (top two rows of panels): the 1.20 mm per day global root-mean-square of the bias for SPEEDY is reduced to 0.63 mm per day for the hybrid model, and the absolute value of the largest-magnitude local bias is reduced from 9.87 mm per day to 5.17 mm per day. SPEEDY has a dry bias in the extension regions of the Kuroshio Current and Gulf Stream (two right panels), which is greatly reduced by the hybrid approach (top two middle panels). The bias is also lower for the hybrid model than for SPEEDY in mountainous regions (e.g., Rockies, Himalayas) and equatorial South America and Africa. One region where the hybrid model has a larger bias than SPEEDY is the Tropical Pacific, where it has a wet bias. Interestingly, the hybrid model produces a “double ITCZ,” which has also been a persistent problem for physics-based models (Zhang et al., 2019).
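A global root-mean-square bias of this kind can be computed from two gridded climatologies. The text does not state how the global statistic was weighted; the sketch below assumes weighting by the cosine of latitude on a regular latitude-longitude grid, and the function name is hypothetical.

```python
import numpy as np

def global_rms_bias(model_clim, ref_clim, lat_deg):
    """Area-weighted RMS of (model - reference) on a regular lat-lon grid.

    model_clim, ref_clim: arrays of shape (n_lat, n_lon)
    lat_deg: latitudes in degrees, shape (n_lat,)
    """
    bias = np.asarray(model_clim, dtype=float) - np.asarray(ref_clim, dtype=float)
    # cos(latitude) approximates the relative area of each grid cell.
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(bias)
    return float(np.sqrt(np.sum(weights * bias**2) / np.sum(weights)))
```

Without the weighting, the many small grid cells near the poles would count as much as the larger cells in the tropics, where most of the precipitation falls.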
In addition to providing improved mean precipitation, the hybrid model captures the daily variability of the precipitation more accurately than SPEEDY.
3.4 SST Climatology and ENSO
The SST prognostic variable (Figure 3, left panels) has low biases: the global root-mean-square value of the SST bias is 0.43°C, while the largest local values of the bias are in the 1°–2°C range. While the model correctly captures the main regions of largest temporal variability of the SST (Figure 3, right panels), it tends to somewhat underestimate the variability associated with the western boundary currents and their extension regions, and overestimate the variability associated with ENSO in the Equatorial Pacific near the coast of South America.
The skills and the limitations of the model in capturing climate variability related to ENSO are further illustrated by Figure 4. Two of the most common metrics used for diagnosing ENSO phases are the Oceanic Niño Index (ONI) and the Southern Oscillation Index (SOI) for the Niño 3.4 region (5°S–5°N, 120°W–170°W). The model correctly captures the inverse relationship between the smoothed time series of the two indexes (top panel). In addition, the autocorrelation function of the Niño 3.4 SST anomalies for the model is in good agreement with that for the ERA5 reanalyses for the first 6 months of lag (bottom left panel). The model, however, does not capture the timing of the crossover into negative autocorrelation at about 10 months: the model transitions from one phase of ENSO to another with a delay. In addition, the power spectrum of ENSO is similar to that of ERA5 (bottom right panel). While some climate models produce too much variability associated with ENSO in the western Tropical Pacific (Menary et al., 2018), the hybrid model does not exhibit such behavior (Figure 3, bottom right panel).
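The autocorrelation diagnostic and the lag of the crossover into negative values can be sketched for a monthly anomaly series. The estimator below (normalized by the lag-zero sum of squares) and the function names are assumptions; the text does not specify which estimator was used for Figure 4.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var
                     for k in range(max_lag + 1)])

def first_negative_lag(acf):
    """Return the first lag with negative autocorrelation, or None."""
    negative = np.flatnonzero(acf < 0)
    return int(negative[0]) if negative.size else None
```

Applied to the Niño 3.4 anomalies of the model and of the reanalysis, the difference between the two crossover lags gives a single number for the delay in the phase transition described above.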
The goal of this paper was to demonstrate that hybridizing an AGCM by incorporating ML can help the model to capture dynamical processes of nature that are missing from climate simulations with the AGCM. For some dynamical processes, this potential can be realized without introducing new prognostic variables in the ML component of the model. This point was illustrated with the process of SSW. Some other processes can be introduced into the model dynamics by adding new ML-based prognostic variables. This point was illustrated by two examples. First, the 6-hr cumulative precipitation was introduced as a prognostic variable, and it was shown that the model produced a highly realistic climatology for the newly added prognostic variable. Second, SST, which is a prescribed boundary parameter of the AGCM, was turned into a prognostic variable. The SST prognostic variable had highly realistic climatology, and it also had a realistic ENSO signal. Moreover, the hybrid model also correctly captured the related atmospheric surface pressure signal, the indication of a realistic two-way coupling between an oceanic state variable and the model atmosphere.
State-of-the-art Earth System models require thousands of processors and can typically simulate one decade per day (e.g., Golaz et al., 2019). In contrast, our hybrid model can be run on a small cluster, and the 70-year simulation presented in this paper took 18 hr with 32 processors. Because the ML component of the hybrid model is based on RC, training the model is also computationally highly efficient. The training described in this paper required 40 min wall-clock time using 1,152 processors on a medium-sized cluster.
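The throughput implied by these figures can be worked out directly. The following is simple arithmetic on the numbers quoted above; "processor-hours" is used only as an illustrative unit and ignores differences in hardware.

```latex
\begin{align*}
\text{simulation rate} &= \frac{70~\text{yr}}{18~\text{hr}} \times 24~\frac{\text{hr}}{\text{day}}
  \approx 93~\text{simulated yr per day},\\
\text{simulation cost} &= 18~\text{hr} \times 32 = 576~\text{processor-hours}
  \approx 8.2~\text{processor-hours per simulated yr},\\
\text{training cost} &= \tfrac{40}{60}~\text{hr} \times 1{,}152 = 768~\text{processor-hours}.
\end{align*}
```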
One important caveat concerning our conclusions is that they are based on the application of the hybrid approach to an AGCM that has much lower resolution and simpler parameterization schemes than a state-of-the-art AGCM. A state-of-the-art model may leave less room for the improvement of the model representation of some dynamical processes. It is also yet to be seen whether the hybrid model state satisfies an interpretable physics-based water and energy budget equation (e.g., Bosilovich et al., 2011). This would provide further support to the proposed approach to turn boundary parameters into ML-based interactive prognostic variables. Finally, while we show that the hybrid model is able to replicate the current climate, the dynamics of the climate is inherently nonstationary. Patel et al. (2021) and Patel and Ott (2023) outline a method to incorporate nonstationarity into a hybrid model similar to the one presented in this study. Using this method, their hybrid model was able to anticipate tipping points and simulate post-tipping point climates in toy models. Our plan is to apply their method to investigate the nonstationarity of the climate using the hybrid model of the present study.
This work was supported by DARPA contract HR00112290035, and it was conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing. Dhruvit Patel provided helpful comments on the manuscript.
- Andrews et al. (1987). Middle atmosphere dynamics (Vol. 40). Academic Press.
- Arcomano et al. (2020). A machine learning-based global atmospheric forecast model. Geophysical Research Letters, 47(9), e2020GL087776. https://doi.org/10.1029/2020gl087776
- Arcomano et al. (2022). A hybrid approach to atmospheric modeling that combines machine learning with a physics-based numerical model. Journal of Advances in Modeling Earth Systems, 14(3), e2021MS002712. https://doi.org/10.1029/2021MS002712
- Bosilovich et al. (2011). Global energy and water budgets in MERRA. Journal of Climate, 24(22), 5721–5739. https://doi.org/10.1175/2011JCLI4175.1
- Brenowitz & Bretherton (2018). Prognostic validation of a neural network unified physics parameterization. Geophysical Research Letters, 45(12), 6289–6298. https://doi.org/10.1029/2018gl078510
- Brenowitz & Bretherton (2019). Spatially extended tests of a neural network parametrization trained by coarse-graining. Journal of Advances in Modeling Earth Systems, 11(8), 2728–2744. https://doi.org/10.1029/2019ms001711
- Charlton & Polvani (2007). A new look at stratospheric sudden warmings. Part I: Climatology and modeling benchmarks. Journal of Climate, 20(3), 449–469. https://doi.org/10.1175/jcli3996.1
- Chattopadhyay et al. (2020). Data-driven super-parameterization using deep learning: Experimentation with multiscale Lorenz 96 systems and transfer learning. Journal of Advances in Modeling Earth Systems, 12(11), e2020MS002084. https://doi.org/10.1029/2020MS002084
- Clark et al. (2022). Correcting a 200 km resolution climate model in multiple climates by machine learning from 25 km resolution simulations. Journal of Advances in Modeling Earth Systems, 14(9), e2022MS003219. https://doi.org/10.1029/2022MS003219
- Farchi et al. (2021). Using machine learning to correct model error in data assimilation and forecast applications. Quarterly Journal of the Royal Meteorological Society, 147(739), 3067–3084. https://doi.org/10.1002/qj.4116
- Gentine et al. (2018). Could machine learning break the convection parameterization deadlock? Geophysical Research Letters, 45(11), 5742–5751. https://doi.org/10.1029/2018gl078202
- Golaz et al. (2019). The DOE E3SM coupled model version 1: Overview and evaluation at standard resolution. Journal of Advances in Modeling Earth Systems, 11(7), 2089–2129. https://doi.org/10.1029/2018MS001603
- Hersbach et al. (2020). The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society, 146(730), 1999–2049. https://doi.org/10.1002/qj.3803
- Jaeger (2001). The “echo state” approach to analyzing and training recurrent neural networks-with an erratum note (p. 148). German National Research Center for Information Technology GMD Technical Report.
- Kucharski et al. (2006). Decadal interactions between the western tropical Pacific and the north Atlantic oscillation. Climate Dynamics, 26(1), 79–91. https://doi.org/10.1007/s00382-005-0085-5
- Lukoševičius (2012). A practical guide to applying echo state networks. In G. Montavon, G. B. Orr, & K.-R. Müller (Eds.), Neural networks: Tricks of the trade (2nd ed., pp. 659–686). Springer Berlin Heidelberg.
- Lukoševičius & Jaeger (2009). Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3), 127–149. https://doi.org/10.1016/j.cosrev.2009.03.005
- Menary et al. (2018). Preindustrial control simulations with HadGEM3-GC3.1 for CMIP6. Journal of Advances in Modeling Earth Systems, 10(12), 3049–3075. https://doi.org/10.1029/2018MS001495
- Molteni (2003). Atmospheric simulations using a GCM with simplified physical parametrizations. I: Model climatology and variability in multi-decadal experiments. Climate Dynamics, 20(2), 175–191. https://doi.org/10.1007/s00382-002-0268-2
- Patel et al. (2021). Using machine learning to predict statistical properties of non-stationary dynamical processes: System climate, regime transitions, and the effect of stochasticity. Chaos: An Interdisciplinary Journal of Nonlinear Science, 31(3), 033149. https://doi.org/10.1063/5.0042598
- Patel & Ott (2023). Using machine learning to anticipate tipping points and extrapolate to post-tipping dynamics of non-stationary dynamical systems. Chaos: An Interdisciplinary Journal of Nonlinear Science, 33(2), 023143. https://doi.org/10.1063/5.0131787
- Pathak, Hunt, et al. (2018). Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach. Physical Review Letters, 120(2), 024102. https://doi.org/10.1103/physrevlett.120.024102
- Pathak et al. (2022). FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier neural operators. arXiv. https://doi.org/10.48550/ARXIV.2202.11214
- Pathak, Wikner, et al. (2018). Hybrid forecasting of chaotic processes: Using machine learning in conjunction with a knowledge-based model. Chaos, 28(4), 041101. https://doi.org/10.1063/1.5028373
- Rasp et al. (2018). Deep learning to represent subgrid processes in climate models. Proceedings of the National Academy of Sciences of the United States of America, 115(39), 9684–9689. https://doi.org/10.1073/pnas.1810286115
- Rasp & Thuerey (2021). Data-driven medium-range weather prediction with a ResNet pretrained on climate simulations: A new model for WeatherBench. Journal of Advances in Modeling Earth Systems, 13(2), e2020MS002405. https://doi.org/10.1029/2020ms002405
- Tikhonov & Arsenin (1977). Solutions of ill-posed problems.
- Walleshauser & Bollt (2022). Predicting sea surface temperatures with coupled reservoir computers. Nonlinear Processes in Geophysics, 29(3), 255–264. https://doi.org/10.5194/npg-29-255-2022
- Watt-Meyer et al. (2021). Correcting weather and climate models by machine learning nudged historical simulations. Geophysical Research Letters, 48(15), e2021GL092555. https://doi.org/10.1029/2021GL092555
- Wikner et al. (2020). Combining machine learning with knowledge-based modeling for scalable forecasting and subgrid-scale closure of large, complex, spatiotemporal systems. Chaos, 30(5), 053111. https://doi.org/10.1063/5.0005541
- Zhang et al. (2019). The double ITCZ syndrome in GCMs: A coupled feedback problem among convection, clouds, atmospheric and ocean circulations. Atmospheric Research, 229, 255–268. https://doi.org/10.1016/j.atmosres.2019.06.023