Uncertainty Quantification of Ocean Parameterizations: Application to the K-Profile-Parameterization for Penetrative Convection
Abstract
Parameterizations of unresolved turbulent processes often compromise the fidelity of large-scale ocean models. In this work, we argue for a Bayesian approach to the refinement and evaluation of turbulence parameterizations. Using an ensemble of large eddy simulations of turbulent penetrative convection in the surface boundary layer, we demonstrate the method by estimating the uncertainty of parameters in the convective limit of the popular “K-Profile Parameterization.” We uncover structural deficiencies and propose an alternative scaling that overcomes them.
Key Points
- A Bayesian methodology can be used to probe turbulence parameterizations and better understand their biases and uncertainties
- Parameterization parameter distributions, learned using high-resolution simulations, can be used as prior distributions for climate studies
Plain Language Summary
Climate projections are often compromised by significant uncertainties which stem from the representation of physical processes that cannot be resolved—such as clouds in the atmosphere and turbulent swirls in the ocean—but which have to be parameterized. We propose a methodology for improving parameterizations in which they are tested against, and tuned to, high-resolution numerical simulations of subdomains that represent them more completely. A Bayesian methodology is used to calibrate the parameterizations against the highly resolved model, to assess their fidelity and identify shortcomings. Most importantly, the approach provides estimates of parameter uncertainty. While the method is illustrated for a particular parameterization of boundary layer mixing, it can be applied to any parameterization.
1 Introduction
Earth System Models (ESMs) require parameterizations for processes that are too small to resolve. Uncertainties arise both from deficiencies in the scaling laws encoded in the parameterizations and from nonlinear interactions with resolved model components, sometimes leading to unanticipated and unphysical results. The first challenge can be addressed by improving the representation of the unresolved physics (e.g., Schneider et al., 2017), while the second requires “tuning” of the parameterizations when implemented in the full ESM (e.g., Hourdin et al., 2017). In this paper, we illustrate how to leverage recent advances in computation and uncertainty quantification to make progress toward the first challenge. Our focus will be on oceanic processes, but the approach can be applied to any ESM parameterization, provided that a high-resolution submodel can be constructed.
The traditional approach to the formulation of parameterizations of subgrid-scale processes is to derive scaling laws that relate the net effect of such processes to variables resolved by the ESMs. These scaling laws are then tested with either field observations (e.g., Large et al., 1994; Price et al., 1986), laboratory experiments (e.g., Cenedese et al., 2004; Deardorff et al., 1980), or results from high-resolution simulations (e.g., Harcourt, 2015; Li & Fox-Kemper, 2017; Reichl et al., 2016; Wang et al., 1996). Rarely are parameterizations tested over a wide range of possible scenarios, due to the logistical difficulty and high cost of running many field experiments, the time necessary to change laboratory setups, and the computational demands of high-resolution simulation. The computational limitations have become much less severe over the last few years through a combination of new computer architectures such as Graphics Processing Units (GPUs; Besard, Churavy, et al., 2019), new languages that take advantage of these architectures (e.g., Julia; Bezanson et al., 2017), and improved large eddy simulation (LES) algorithms (Sullivan & Patton, 2011; Verstappen, 2018). Modern computational resources have opened up the possibility of running libraries of LES simulations to explore a vast range of possible scenarios. This paper discusses how such computational advances can be applied to assess parameterizations in ocean models.
LES simulations alone are not sufficient to formulate parameterizations. Statistical methods are needed to extract from the LES solutions the functional relationships between small-scale processes and coarse variables available in ESMs. A common approach is to rely on well-established scaling laws and use the LES solutions to constrain the nondimensional parameters that cannot be determined from first principles. In this approach, only a few LES simulations are necessary to find the optimal parameter values. However, it is rare that scaling laws and associated parameterizations perfectly capture the functional dependencies of large-scale variables—if they did, they would be referred to as solutions rather than parameterizations. In general, it is necessary to run a large ensemble of LES simulations to estimate optimal parameter values and test whether those values hold for different scenarios, thereby supporting the functional dependencies.
State estimation, which has a long tradition in geophysics (Wunsch, 2006), has been used to constrain parameter values. A loss function is chosen to quantify the mismatch between the prediction of the parameterization and observations. Uncertain parameters are then adjusted to minimize the loss function. One can also estimate the standard deviation around the optimal values by computing the Hessian of the loss function (Sraj et al., 2014; Thacker, 1989).
An alternative approach, based on the seminal work of Bayes (1763) and its modern incarnation (Jaynes, 2003), is arguably better suited to constrain the transfer properties of turbulent processes. The Bayesian method allows one to estimate the entire joint probability distribution of all parameters. The method is a crucial extension over state estimation, because the statistics of turbulent processes are generally far from Gaussian (Frisch, 1995) and thus are not fully characterized by the first and second moments alone. In the Bayesian approach, one defines a prior parameter distribution, based on physical considerations, and a “likelihood function,” which measures the mismatch between the parameterized prediction and the LES simulation. Based on this information, Bayes' formula shows how to compute the posterior distribution of the parameters consistent with the LES simulations and the parameterization. If the posterior distribution is narrow and peaked, then one can conclude that a unique set of parameters can be identified which can reproduce all LES results. In this limit, the Bayesian approach does not provide more information than state estimation. However, the power of Bayes' formula is that it can reveal distinct parameter regimes, the existence of multiple maxima, relationships between parameters, and the likelihood of parameter values relative to optimal ones.
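Schematically, writing ρ0(C) for the prior distribution of the parameters C and 𝓛(LES | C) for the likelihood of the LES data given those parameters (our notation), Bayes' formula gives the posterior distribution

$$
\rho(C \mid \mathrm{LES}) \;\propto\; \mathcal{L}(\mathrm{LES} \mid C)\, \rho_0(C),
$$

up to a normalization constant obtained by integrating the right-hand side over all admissible parameter values.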
The Bayesian approach can also be used to test the functional dependence of the parameterization on large-scale variables. One estimates the posterior distribution on subsets of the LES simulations run for different scenarios. If the posterior probabilities for the different scenarios do not overlap, the functional form of the parameterization must be rejected. We will illustrate how this strategy can be used to improve the formulation of a parameterization.
Bayesian methods are particularly suited to constrain ESM parameterizations of subgrid-scale ocean processes. Most of these processes, such as boundary layer or geostrophic turbulence, are governed by well-understood fluid dynamics and thermodynamics. Thus, LES simulations provide credible solutions for the physics. The atmospheric problem is quite different: there, leading-order subgrid-scale processes such as cloud microphysics are governed by poorly understood physics that may not be captured by LES simulations.
In this paper, we will apply Bayesian methods to constrain and improve a parameterization for the surface boundary layer turbulence that develops when air-sea fluxes cool the ocean. LES simulations that resolve all the relevant physics will be used as ground truth to train the parameterization. Our paper is organized as follows: In section 2 we describe the physical setup and the LES model. In section 3 we introduce Bayesian parameter estimation for the parameters in the K-Profile Parameterization (KPP). We then perform the parameter estimation in the regime described by section 2 and show how the Bayesian approach provides insight on how to improve the KPP parameterization. Finally, we end with a discussion in section 4.
2 Large Eddy Simulations and K-Profile-Parameterization of Penetrative Convection
During winter, high latitude cooling induces near-surface mixing by convection, generating a “mixed layer” of almost uniform temperature and salinity that can reach depths of hundreds of meters—see Marshall and Schott (1999) for a review. At the base of the mixed layer, convective plumes can penetrate further into the stratified layer below—called the “entrainment layer”—where plume-driven turbulent mixing between the mixed layer and the stratification below cools the boundary layer. This process, in which the layer is cooled both at the surface and by turbulent mixing from the entrainment layer below, is called penetrative convection. Here we evaluate the ability of the K-Profile Parameterization (Large et al., 1994) to capture penetrative convection by comparing its predictions against large eddy simulations (LES) of idealized penetrative convection into a resting stratified fluid. This comparison provides the context in which we outline the Bayesian approach to parameter estimation that we advocate.
2.1 Penetrative Convection Into a Resting Stratified Fluid















The visualization reveals the two-part boundary layer produced by penetrative convection: close to the surface, cold and dense convective plumes organized by surface cooling sink and mix ambient fluid, producing a well-mixed layer that deepens in time. Below the mixed layer, the momentum carried by sinking convective plumes leads them to overshoot their level of neutral buoyancy (nominally, the depth of the mixed layer), “penetrating” the stably stratified region below the surface mixed layer and generating the strongly stratified entrainment layer. The total depth of the boundary layer is h and includes the mixed layer and the entrainment layer of thickness Δh. Turbulent fluxes are assumed negligible below the base of the boundary layer at z = −h.















The first term on the left of Equation 7 describes boundary layer deepening due to buoyancy loss at the surface, while the second term corresponds to the further cooling caused by turbulent mixing in the entrainment layer. Other authors have also arrived at a similar expression for the boundary layer depth upon taking into account turbulent entrainment. See, for example, Appendix F in Van Roekel et al. (2018).
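For orientation, the no-entrainment limit of this budget can be written down in a few lines (our notation, with surface buoyancy flux B0 and constant initial stratification N²): assuming the boundary layer is perfectly mixed and all of the surface buoyancy loss goes into eroding the initial stratification,

$$
\tfrac{1}{2} N^2 h^2 \simeq B_0\, t
\quad \Longrightarrow \quad
h(t) \simeq \sqrt{\frac{2 B_0 t}{N^2}} .
$$

The entrainment term in Equation 7 deepens the layer further, modifying the prefactor but not the square-root-of-time growth that reappears in sections 3.2 and 3.3.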


2.2 The K-Profile Parameterization of Penetrative Convection






















Equation 15 is the implicit nonlinear constraint in Equation 10 that determines the boundary layer depth, h. In Appendix B we discuss the physical content of Equation 15 for the case of penetrative convection.
The boundary layer depth criterion in Equation 15 is often referred to as the bulk Richardson number criterion, because in mechanically forced turbulence the denominator is replaced by an estimate of the mean shear squared and CH becomes a critical bulk Richardson number (Large et al., 1994). In penetrative convection there is no mean shear, and CH is not a Richardson number. See Appendix C for more details.
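For reference, the criterion of Large et al. (1994) can be written schematically as

$$
\mathrm{Ri_b}(h) \;=\; \frac{\left[ B_r - B(-h) \right] h}{\left| \mathbf{U}_r - \mathbf{U}(-h) \right|^2 + U_t^2(-h)} \;=\; \mathrm{Ri_c},
$$

where Br and Ur are near-surface reference buoyancy and velocity and Ut is an unresolved turbulent velocity scale; the boundary layer depth h is the shallowest depth at which the critical value Ric is reached. In the purely convective case considered here the resolved shear contribution vanishes, and the remaining constants combine into CH, as detailed in Appendix C.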

These parameters are not the original set of independent parameters proposed by Large et al. (1994), but rather algebraic combinations thereof. Nevertheless, we emphasize that our formulation is mathematically identical to that proposed by Large et al. (1994). The mapping between the current set of parameters and the original ones is one-to-one; hence, no information is lost in transforming between the two sets; see Appendix C for details. With regard to the numerical implementation, we do not use enhanced diffusivity as explained in the appendices of Large et al. (1994). Our objective is to calibrate the free parameters C = (CS, CN, CD, CH) by comparing KPP temperature profiles T(z, t; C) with the horizontally averaged temperature profiles from the LES.
3 Model Calibration Against LES Solutions


We choose the squared error in space to reduce the sensitivity to vertical fluctuations in the temperature profile. We take the maximum value of the squared error in time for t ∈ [t1, t2] to guarantee that the temperature profile never deviates too far from the LES simulation at any instant in time. The parameterization is taken to be the KPP model given by Equations 9 through 15, and the data are the horizontally averaged LES output. The initial time t1 is chosen after the initial transition to turbulence of the LES simulations.
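A minimal sketch of this loss function in Julia (the language of the LES code used here); `kpp_profile` and `les_profile` are hypothetical helpers standing in for the KPP forward model T(z, t; C) and the horizontally averaged LES output on the same vertical grid:

```julia
# Spatial mean squared error between the KPP and LES temperature profiles at
# time t, for a set of parameters C. The helper functions are assumptions.
function squared_error(C, t; kpp_profile, les_profile)
    T_kpp = kpp_profile(C, t)
    T_les = les_profile(t)
    return sum(abs2, T_kpp .- T_les) / length(T_les)
end

# Loss: maximum over the output times in [t1, t2] of the spatial squared error.
loss(C, times; kpp_profile, les_profile) =
    maximum(squared_error(C, t; kpp_profile, les_profile) for t in times)
```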
A natural way to extend the concept of loss functions to account for parameter uncertainty is to introduce a likelihood function for the parameters. Similar to how the form of the loss function is critical to the estimation of optimal parameters, the form of the likelihood function is critical for estimating the parameter uncertainties. The likelihood function quantifies what we mean by “good” or “bad” parameter choices. The Bayesian method uses this information to estimate parameter uncertainties. These estimates are only as good as the choice of likelihood function, much like optimal parameters are only as good as the choice of the loss function. See, for example, Morrison et al. (2020), Nadiga et al. (2019), Schneider et al. (2017), Sraj et al. (2016), Urrego-Blanco et al. (2016), Zedler et al. (2012), and van Lier-Walqui et al. (2012) for definitions of likelihoods in various geophysical/fluid dynamical contexts. In Appendix D we discuss in detail the rationale for the choices made in this paper.










In our context, Bayes' formula updates prior guesses about KPP parameter values and yields a posterior distribution based on the LES data.







For example, if a parameter choice C1 doubles the loss function relative to its minimum value, then C1 is a factor of 1/e less likely than the optimal parameters. The probability distribution ρ(C) is then sampled with a Random Walk Markov Chain Monte Carlo (RW-MCMC) algorithm (Metropolis et al., 1953), described further in Appendix E.
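Schematically, and consistent with the example above and the discussion in Appendix D (our notation), the sampled distribution behaves like

$$
\rho(C) \;\propto\; \rho_0(C)\, \exp\!\left( -\,\frac{\mathcal{L}(C)}{\overline{\mathcal{L}}} \right),
$$

where 𝓛(C) is the loss function, ρ0(C) is the prior, and the hyperparameter (denoted here $\overline{\mathcal{L}}$) is set to an estimate of the minimum of the loss, so that doubling the loss relative to its minimum reduces the probability by a factor of e.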
To illustrate our choices, as well as the RW-MCMC algorithm, we show a typical output from an RW-MCMC algorithm for a 2-D probability distribution in Figure 3. We use the probability density function for the KPP parameterization presented in the next section, but keep two of the four parameters fixed (CD and CH) to reduce the problem from four to two parameters (CN and CS). The prior distributions for CN and CS are uniform over the ranges reported at the end of this section. The parameters CD and CH are set to the values that minimize the loss function. We show results for two arbitrary values of the likelihood hyperparameter for illustrative purposes. Starting from a poor initial guess, the RW-MCMC search proceeds towards regions of higher probability (lower loss function) by randomly choosing which direction to go. Once a region of high probability is found, in this case parameter values in the “blue” region, the parameters hover around the minimum of the loss function, as suggested by the high values of the likelihood function. The orange hexagons represent the process of randomly walking towards the minimum of the loss function and correspond to the “burn-in” period. The burn-in period is often discarded when calculating statistics, since it corresponds to an initial transient before the RW-MCMC settles around the minimum of the loss function. We see that the choice of hyperparameter does not change the overall structure of the probability distribution but does affect how far from the optimal parameters the random walk is allowed to drift.


Parameterizations such as KPP exhibit a dependence on resolution in addition to the nondimensional parameters. Here we perform all calculations with a vertical resolution Δz and time step Δt representative of those used in state-of-the-art ESMs. We do not use enhanced diffusivity as in Large et al. (1994) for this resolution. The parameterization is relatively insensitive to halving Δz and Δt, for a fixed set of parameters, but the results are sensitive to doubling either one. Thus, the optimal parameter values and their uncertainties are only appropriate for the resolution used for the calibration and would need to be updated, especially if the parameterization were run at a coarser resolution. This dependence on resolution could be handled within the Bayesian method by introducing Δz and Δt as additional parameters in the probability distribution, but we do not pursue this approach here.


The surface layer fraction CS, being a fraction, must stay between zero and one. The other parameter limits are chosen to span the whole range of physically plausible values around the reference values given in Equation 16. The choice of uniform distributions is made to avoid favoring any particular value at the outset.
3.1 Calibration of KPP Parameters From One LES Simulation
In this section we apply the Bayesian calibration method to the LES simulation of penetrative convection described in section 2.1 and quantify the uncertainties of the KPP parameters introduced in section 2.2. The horizontal averages from the LES simulations are compared with predictions from solutions of the KPP boundary layer scheme, Equations 9 and 10. The boundary and initial conditions for KPP are taken to be the same as those of the LES simulation, that is, 100 W/m² of cooling at the surface, a fixed temperature gradient at the bottom, and an initial temperature profile with constant stratification.
To estimate the full probability distribution function, we use the RW-MCMC algorithm with 10⁶ iterations to sample the probability distribution of the four KPP parameters (CS, CN, CD, CH). The large number of forward runs is possible because the forward model consists of a one-dimensional equation, namely, KPP in single column mode. The Markov chain yields roughly 10⁴ statistically independent samples, as estimated using the autocorrelation length (Sokal, 1997). The RW-MCMC algorithm generates the entire four-dimensional PDF, Equation 18.
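As an illustration of how such an effective sample size can be estimated, the sketch below computes the integrated autocorrelation time of a single-parameter chain in the spirit of Sokal (1997); this is our own minimal implementation, not the code used in the paper:

```julia
# Effective sample size of a Markov chain for one parameter, estimated from the
# integrated autocorrelation time τ = 1 + 2 Σ_k ρ(k), truncating the sum once
# the estimated autocorrelation ρ(k) drops below zero.
function effective_sample_size(chain::AbstractVector; max_lag = length(chain) ÷ 10)
    n  = length(chain)
    μ  = sum(chain) / n
    δ  = chain .- μ
    c0 = sum(abs2, δ) / n            # lag-0 autocovariance
    τ  = 1.0
    for k in 1:max_lag
        ρk = sum(δ[1:n-k] .* δ[k+1:n]) / (n * c0)
        ρk ≤ 0 && break
        τ += 2ρk
    end
    return n / τ                     # number of roughly independent samples
end
```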
The parameter probability distribution can be used to choose an optimal set of KPP parameters. Of the many possible choices, we pick the most probable value of the four-dimensional probability distribution, the mode, because it minimizes the loss function; see Appendix D for the detailed calculation. In Figure 4a we show the horizontally averaged temperature profile from the LES simulation (continuous line) and the temperature profiles obtained by running the KPP parameterization with reference and optimal parameters (squares and dots) at the end of the 8 day simulation. The optimized temperature profiles are closer to the LES simulation than the reference profiles, especially in the entrainment region. Figure 4b confirms that the square root of the instantaneous loss function, the error, grows much faster with the reference parameters. The oscillations in the error are a consequence of the coarseness of the KPP model: only one grid point is being entrained at any given moment.



The improvement in boundary layer depth through optimization of the parameters is about 10%, or 10 m over 8 days. As discussed in section 2.1, the rate of deepening can be predicted analytically to within 20% by simply integrating the buoyancy budget over time and depth and assuming that the boundary layer is well mixed everywhere, that is, ignoring the development of enhanced stratification within an entrainment layer at the base of the mixed layer. KPP improves on this prediction by including a parameterization for the entrainment layer. The reference KPP parameters improve on the no-entrainment prediction by about 10%, and the optimized parameters contribute another 10%. While these may seem like modest improvements, they can prevent large biases in the boundary layer depth when integrated over a few months of winter cooling rather than just 8 days. We will return to this point in the next section when we discuss structural deficiencies in the KPP formulation.


The marginal distribution can intuitively be thought of as the distribution of a parameter (or pair of parameters) after integrating over the uncertainty of all other parameters. Consequently, the marginal distribution accounts for potential compensating effects that different parameters may have on one another. It does not capture the effect of varying a parameter individually while keeping all the other parameters fixed at particular values (unless the other parameters have essentially delta-function 1-D marginal distributions); that effect is represented by a conditional distribution.
Constructing the marginal distributions requires only histogramming the trajectories generated by the RW-MCMC algorithm. The 2-D marginal distributions are visualized as heatmaps in Figure 5, and the 1-D marginal distributions of the corresponding parameters are shown along the outermost edges. In the 2-D marginal distributions, dark blue corresponds to regions of high probability and light blue to regions of low probability. The white space corresponds to regions that the RW-MCMC algorithm never visited. The 2-D marginal distributions show that parameters must be changed in tandem with one another to produce similar model output. Furthermore, their structure is distinctly non-Gaussian.
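For concreteness, a 2-D marginal like those in Figure 5 amounts to binning two columns of the (post burn-in) chain; the hand-rolled sketch below is illustrative, and any histogram routine would do equally well:

```julia
# Normalized 2-D histogram of two parameters sampled by RW-MCMC; `x` and `y`
# are the chain values of the two parameters after discarding burn-in.
function marginal_2d(x::AbstractVector, y::AbstractVector; nbins = 50)
    xedges = range(minimum(x), maximum(x); length = nbins + 1)
    yedges = range(minimum(y), maximum(y); length = nbins + 1)
    counts = zeros(Int, nbins, nbins)
    for (xi, yi) in zip(x, y)
        i = min(searchsortedlast(xedges, xi), nbins)   # bin index along x
        j = min(searchsortedlast(yedges, yi), nbins)   # bin index along y
        counts[i, j] += 1
    end
    return xedges, yedges, counts ./ length(x)         # frequencies sum to one
end
```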

The 1-D marginal distribution of the mixing depth parameter CH (the bottom left rectangular panel) is much more compact than those of the other three parameters, suggesting that it is the most sensitive parameter. The mixing depth parameter's importance stems from its control over both the buoyancy jump across the entrainment layer and the rate of deepening of the boundary layer. (Again, it may be useful to remember that CH is often referred to as the bulk Richardson number in the KPP literature, even though it takes a different meaning in convective simulations; see Appendix C.) The parameters CD and CN set the magnitude of the local and nonlocal fluxes. Results are not sensitive to their specific values, as long as they are large enough to maintain a well-mixed layer. The value of the surface layer fraction CS is peaked at lower values but is less sensitive to variations than CD or CH.
The uncertainties of the parameters can be used to infer the uncertainties of the temperature profile, at each depth and time, predicted by KPP. To do this, we subsample the 10⁶ parameter values down to 10⁴ and evolve KPP forward in time for each set of parameter choices. We construct histograms of the temperature field at the final time for each location in space individually. We then stack these histograms to create a visual representation of the model uncertainty. This uncertainty quantifies the sensitivity of the parameterization to parameter perturbations as defined by the parameter distributions.
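A sketch of this propagation step, reusing the hypothetical `kpp_profile` forward model from before; for brevity the per-depth spread is summarized here with 5th and 95th percentiles rather than the full stacked histograms shown in the figure:

```julia
using Statistics  # quantile

# Propagate parameter uncertainty into the predicted temperature profile.
# `chain` is a matrix whose rows are RW-MCMC samples of (C_S, C_N, C_D, C_H);
# `kpp_profile(C, t)` is the assumed forward model returning T(z, t; C).
function profile_uncertainty(chain::AbstractMatrix, t_final; kpp_profile, nsamples = 10_000)
    stride   = max(1, size(chain, 1) ÷ nsamples)                  # thin the chain
    columns  = [kpp_profile(chain[i, :], t_final) for i in 1:stride:size(chain, 1)]
    profiles = reduce(hcat, columns)                              # depth × sample
    lower = [quantile(row, 0.05) for row in eachrow(profiles)]    # 90% band, per depth
    upper = [quantile(row, 0.95) for row in eachrow(profiles)]
    return lower, upper
end
```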

- 0–10 m depth: There is some uncertainty associated with the vertical profile of temperature close to the surface.
- 20–60 m depth: The mean profile of temperature in the mixed layer is very well predicted by KPP.
- 60–70 m depth: The entrainment region contains the largest uncertainties.
- 70–100 m depth: There is virtually no uncertainty. The unstratified region below the boundary layer does not change from its initial value.

Now that we have applied the Bayesian methodology to one LES simulation and explored its implications, we are ready to apply the method to multiple LES simulations covering different regimes in the following section.
3.2 Calibration of KPP Parameters From Multiple LES Simulations
We now use our Bayesian framework to explore possible sources of bias in the KPP model. To this end we investigate what happens when we change the initial stratification in penetrative convection simulations. This is motivated by recent work on boundary layer depth biases in the Southern Ocean (DuVivier et al., 2018; Large et al., 2019). In those studies, KPP failed to simulate deep boundary layers in winter when the subsurface summer stratification was strong.
We perform 32 large eddy simulations and calculate parameter distributions for each case. In the previous section we saw that CH is the most sensitive parameter; thus, our focus now is on the optimization and uncertainty quantification of CH, although all parameters are estimated in the background. We keep the surface cooling constant at 100 W/m² for all regimes and vary only the initial stratification. The integration was stopped when the boundary layer depth filled about 70% of the domain in each simulation. We used 128³ grid points in the LES, corresponding to ≈0.8 m resolution in each direction. (Although the parameter estimates will vary with LES resolution, the qualitative trends are expected to be robust.) We use a lower LES resolution for these trend studies than in the previous section, but the results were not sensitive to this change. In the Bayesian inference, each of the probability distributions was calculated with 10⁵ iterations of RW-MCMC, leading to an effective sample size on the order of 10³. The stratifications ranged from N² ≈ 1 × 10⁻⁶ to N² ≈ 3.3 × 10⁻⁵ s⁻².
We find, as visualized in Figure 7, that CH is not constant but depends on the background stratification, N2. The blue dots are the median values of the probability distributions, and the stars are the modes (minimum of the loss function). The error bars correspond to 90% probability intervals, meaning that 90% of parameter values fall between the error bars. The large discrepancy between the median and the mode is due to the mode being the optimal value of the entire four-dimensional distribution whereas the median only corresponds to the marginal distribution. The reference KPP value is plotted as a dashed line.

The median and optimal values increase monotonically with the initial stratification, revealing a systematic bias. Furthermore, the analysis exposes where the systematic bias comes from: no single value of CH in Equation 15 can correctly reproduce the deepening of the boundary layer for all initial stratifications. This suggests that the scaling law underlying the boundary layer depth criterion is incommensurate with the LES data.




The LES simulation described in section 2.1, and many previous studies of penetrative convection, for example, Deardorff et al. (1980) and Van Roekel et al. (2018), show that the boundary layer depth grows as the square root of time, h ∝ √t. To be consistent, Nh would have to scale as h^(2/3), but this is not observed in the LES simulations nor supported by theory. This suggests that we must modify the formulation of the boundary layer depth, as we now describe.
3.3 Modification of the KPP Parameterization to Reduce Biases




Equation 30 is an implicit equation for h which guarantees that Equation 28 holds.
We now repeat the model calibration of section 3.2 using this new boundary layer depth criterion, to test whether there is an optimal value of C⋆ that is independent of the initial stratification. We estimate all KPP parameters and show the new mixing depth parameter for simulations with different initial stratifications in Figure 8. Encouragingly, there is no obvious trend in the optimal values of C⋆, and the error bars overlap for all cases. This supports the new criterion in the sense that parameters estimated in different regimes are now consistent with one another. The uncertainties in C⋆ translate into an uncertainty in the predicted boundary layer depth. In particular, values in the range 0.05 ≤ C⋆ ≤ 0.2 imply a corresponding range of prefactors for the √t growth of the boundary layer depth.

Additionally, one can check whether the constants estimated following the methodology of section 3 are consistent with an independent estimate diagnosed directly from the LES simulations. In particular, the LES simulations suggest that C⋆ ≃ 1/6 as per Equation 31. From Figure 8 we see that the optimal C⋆ is smaller than 1/6 (the dashed black line), and the value 1/6 is not within the confidence intervals for many of the simulations. There are several potential reasons for the discrepancy, for example, the neglect of curvature in the buoyancy budget (since we assumed a piecewise linear buoyancy profile) or the finite resolution of the parameterization. Perhaps the most likely explanation is the difference in how the boundary layer depth was diagnosed in the LES, which need not have the same meaning as the one in KPP. A different definition in the LES simulation, such as the depth of maximum stratification, would yield a different scaling constant, but a boundary layer depth still proportional to √t. Whatever the choice, the Bayesian parameter estimation bypasses these ambiguities by direct comparison with the entire horizontally averaged temperature profile from the LES.
We do not explore other modifications to the boundary layer depth criterion, as this would greatly expand the scope of this article. Furthermore, biases in KPP are not limited to the cases explored here; see Van Roekel et al. (2018) for discussions and remedies. The criterion described in this section assumes a constant initial stratification and a constant surface heat loss, which leads to the √t growth of the boundary layer depth. It would be interesting to extend the criterion to arbitrary initial stratifications, variable surface heat fluxes, and the interaction with wind-driven mixing. The goal here is not to derive a new parameterization, but rather to illustrate and argue for a Bayesian methodology in the refinement and assessment of parameterizations.
4 Discussion
We presented a Bayesian approach to assess the skill of the K-Profile Parameterization (KPP) for turbulent convection triggered by surface cooling in an initially stably stratified ocean. The KPP model for this physical setting consists of a one-dimensional model with an algebraic constraint for the mixing-layer depth together with four non-dimensional parameters. These parameters consisted of an algebraic reorganization of the original KPP parameters so that terms in the equations could be associated with choices of parameters. Parameters were estimated by reducing the mismatch between the vertical buoyancy profile predicted by KPP and the area-averaged buoyancy profile simulated with a three-dimensional LES code for the same initial conditions and surface forcing. Using Bayes' formula we further estimated the full joint probability distribution of the four parameters. Furthermore, the probability distribution was used to quantify interdependencies among parameters and their uncertainty around the optimal values.
Repeating the Bayesian parameter optimization and uncertainty quantification for different initial stratifications, we found that no unique set of parameters could capture the deepening of convection in all cases. This implied that the KPP formulation does not capture the dependence of convection on the initial stratification in the simple test case considered here: constant surface cooling, constant initial stratification, no wind, and no background flow. The parameter that required re-tuning for each case was the one associated with the boundary layer depth criterion, thereby suggesting that this criterion has the wrong functional dependence on stratification. We thus reformulated the boundary layer depth criterion to capture the semi-analytical result, supported by the LES simulations, that the boundary layer deepens as the square root of time when the initial stratification is constant. The new formulation was vindicated by the fact that the Bayesian approach was able to find a set of parameters which captured the evolution of the boundary layer, as compared to the LES simulations, for all initial stratifications. In this way, the Bayesian methodology allowed us to identify and remove a bias in the KPP formulation.
The methodology outlined here could just as easily be applied to other parameterizations of boundary layer turbulence, such as those reviewed in CVMix (Griffies et al., 2015), Pacanowski and Philander (1981), Mellor and Yamada (1982), Price et al. (1986), and Kantha and Clayson (1994). We expect that the inclusion of additional physics, such as wind-driven mixing and its interaction with convection, would also be amenable to the techniques described in this manuscript. Our experience is that progress is faster if one starts with simple idealized setups, like the ones considered here, and then moves to progressively more realistic ones that account for variable stratification and surface heat fluxes, wind-stress forcing, background shear, surface waves, and so forth. The Bayesian method would then provide a rigorous evaluation of parameter uncertainty, parameter dependencies, and biases in the formulation of the parameterization.
Ultimately, our hope is that parameter probability distributions estimated in local regimes will serve as useful prior information for the calibration/tuning of Earth System Models (ESMs). Local simulations of turbulence must be carefully designed and must incorporate the suite of subgrid-scale processes that have leading-order impact on global ocean dynamics: surface and bottom boundary layer turbulence, surface wave effects, deep convection, mesoscale and submesoscale turbulence, and so forth. Bayesian calibration of the parameterization for each subgrid-scale process will then result in probability distributions for all the nondimensional parameters associated with the parameterizations. These distributions can then be used as prior information on the range of plausible values each parameter can take when the parameterizations are implemented in an ESM.
With regard to the calibration of ESMs, the parameterizations of different subgrid-scale processes may interact nonlinearly with each other and with the resolved physics. Additional calibration is then required for the full ESM. Presently, this is achieved by perturbing the parameters within plausible ranges (Mauritsen et al., 2012; Schmidt et al., 2017). The Bayesian approach provides an objective way to determine such plausible ranges. The same algorithm cannot be used to calibrate the ESM itself, because the methodologies described here are not computationally feasible when applied to larger systems. Promising approaches to address this challenge through the use of surrogate models are described in Sraj et al. (2016) and Urrego-Blanco et al. (2016). Such models introduce additional sources of uncertainty, however, and it is not clear to what extent one can trust a surrogate of a full ESM. One potential way to address this challenge is the Calibrate, Emulate, and Sample (CES) approach outlined in Cleary et al. (2020), in which the surrogate model's uncertainty is estimated with Gaussian processes and included as part of a consistent Bayesian calibration procedure.
Should the global problem still exhibit significant biases, even when all available prior information about parameterizations and about global data are leveraged utilizing emulators or traditional methods of tuning, then one would have to conclude that there is a fundamental deficiency in our understanding of how the different components of the climate system interact with one another, or that perhaps the models do not include some key process. For example, Rye et al. (2020) argue that glacial melt might be one such missing process which is not currently represented in ESMs. The advantage of the systematic calibration approach outlined here is that it allows us to quantify uncertainty in ESM projections and identify the sources of such uncertainty.
Acknowledgments
The authors would like to thank Carl Wunsch, Tapio Schneider, Andrew Stuart, and William Large for numerous illuminating discussions. We would also like to thank the reviewers for their helpful suggestions with the manuscript. Our work is supported by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program, and by the National Science Foundation under grant AGS-6939393.
Appendix A: Oceananigans.jl








A1 Subfilter Stress and Temperature Flux


A2 Numerical Methods
To solve Equations A1-A3 with the subfilter model in Equation A5, we use the software package “Oceananigans.jl” written in the high-level Julia programming language to run on Graphics Processing Units, also called “GPUs” (Besard, Churavy, et al., 2019; Besard, Foket, et al., 2019; Bezanson et al., 2017). Oceananigans.jl uses a staggered C-grid finite volume spatial discretization (Arakawa & Lamb, 1977) with centered second-order differences to compute the advection and diffusion terms in Equations A1 and A2, a pressure projection method to ensure the incompressibility of u, a fast, Fourier-transform-based eigenfunction expansion of the discrete second-order Poisson operator to solve the discrete pressure Poisson equation on a regular grid (Schumann & Sweet, 1988), and second-order explicit Adams-Bashforth time stepping. For more information about the staggered C-grid discretization and second-order Adams Bashforth time-stepping, see section 3 in Marshall et al. (1997) and references therein. The code and documentation are available for perusal at https://github.com/CliMA/Oceananigans.jl.
Appendix B: Parcel Theory Derivation for the KPP Boundary Layer Depth Criterion
















The rationale behind this extended criterion can be found in Large et al. (1994). In the purely convective limit considered here there is no resolved shear, and the dependence on Ric drops out.
Appendix C: Relationship Between the Model in Section 2.2 and Large et al. (1994)'s Formulation of KPP
The formulation of KPP in Section 2.2 represents an algebraic reorganization of the formulation proposed by Large et al. (1994). The two formulations are mathematically equivalent. In this appendix, we discuss in detail how the four free parameters CH, CS, CD, and CN are algebraically related to the free parameters proposed by Large et al. (1994).
Large et al. (1994)'s formulation of KPP for the case of penetrative convection with no horizontal shear introduces six nondimensional parameters: the Von Karman constant κ, the ratio of the entrainment flux to the surface flux βT, a constant that sets the amplitude of the non-local flux C∗, a constant that ensures the continuity of the buoyancy flux profile cs, the surface layer fraction ϵ, and a parameter Cv that controls the smoothing of the buoyancy profile at the base of the boundary layer. Large et al. (1994) argue that Cv can take any value between 1 and 2. We set the reference value of Cv to the value corresponding to the strong stratification limit of the model proposed by Danabasoglu et al. (2006), given by equation (8.184) in Griffies et al. (2015).

We are able to reduce the number of parameters from six (ϵ, cs, Cv, βT, κ, C∗) to four (CH, CS, CD, CN) because, in the case of penetrative convection, the parameters Cv, βT, cs, and κ enter only through the two combinations Cv(βT)^(1/2) and csκ⁴.

We refer to these as the reference parameters.







Appendix D: A Primer on Uncertainty Quantification

- In the limit of no uncertainty, the probability distribution should collapse to a delta function centered at the optimal parameter values that minimize the loss function.
- The uncertainty of a parameter value C should increase in proportion to the value of the loss function evaluated at C.



The hyperparameter sets the shape of the likelihood function and hence the associated uncertainty quantification. In the limit where the hyperparameter tends to zero, there is no uncertainty: the likelihood function and the probability distribution collapse to a delta function peaked at the optimal parameter values that minimize the loss function. In the opposite limit of an arbitrarily large hyperparameter, the likelihood function adds no information to reduce the uncertainty, and the posterior distribution ρ(C) is equal to the prior one ρ0(C). Thus, the hyperparameter must take finite values between zero and infinity if the likelihood function is to add useful information.


Hence, for a uniform prior distribution, the most probable value of the probability distribution is achieved at the minimum of the loss function, independently of the hyperparameter.
As mentioned in section 3, it is convenient to set the hyperparameter equal to the minimum value of the loss function. This choice satisfies two key requirements. First, the uncertainties of the parameters should be independent of the units of the loss function. Second, the hyperparameter should be larger the larger the minimum of the loss function, because the latter is a measure of the parameterization bias, and the former should be larger if there is more uncertainty about acceptable parameter values. In practice it is seldom possible to find the global minimum of the loss function; instead, we adopt a “best guess” of the optimal parameters and set the hyperparameter equal to the loss function evaluated at that guess. Since this value is at least as large as the global minimum, our choice is conservative, because a larger hyperparameter corresponds to more uncertainty.
Appendix E: Random Walk Markov Chain Monte Carlo
- Choose a set of initial parameter values C0. We pick our best guess at the set of values that minimizes the negative log-likelihood function, as estimated from standard minimization techniques.
- Choose a new set of candidate parameters by adding a Gaussian random variable with mean zero and covariance matrix Σ to the initial set, C1 = C0 + ξ with ξ ∼ 𝒩(0, Σ). The algorithm is guaranteed to work independently of the choice of Σ as long as it is nonzero and does not vary throughout the random walk. However, suitable choices can speed up convergence and will be discussed below.
- Calculate Δℓ = ℓ(C1) − ℓ(C0), the difference in log-likelihood between the candidate and the current parameters. This is a measure of how much more likely C1 is relative to C0.
- Draw a random variable u uniformly from the interval [0, 1]. If u < exp(Δℓ), accept the new parameter values and keep C1; otherwise, reject the new parameter values and set C1 = C0. This is the “accept/reject” step. Note that if Δℓ > 0, that is, if the proposed parameters produce a smaller value of the negative log-likelihood function, the proposal is always accepted.
- Repeat steps 2–4, replacing C0 → Ci and C1 → Ci+1, to generate a sequence {Ci} of parameter values (a minimal code sketch follows this list).
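The steps above amount to the following minimal random walk Metropolis sampler (an illustrative Julia sketch, not the code used in the paper; the periodic wrapping of the parameter domain mentioned at the end of this appendix is omitted for brevity):

```julia
using LinearAlgebra

# Random walk Metropolis sampler following steps 1-5 above. `logℓ` is the
# log-likelihood, C0 the initial parameter vector, Σ the proposal covariance.
function rw_mcmc(logℓ, C0::AbstractVector, Σ::AbstractMatrix, niter::Integer)
    A     = cholesky(Symmetric(Σ)).L          # draws Gaussian steps with covariance Σ
    chain = Matrix{Float64}(undef, niter, length(C0))
    C, ℓC = collect(float.(C0)), logℓ(C0)
    for i in 1:niter
        candidate = C .+ A * randn(length(C)) # step 2: Gaussian proposal
        ℓcand = logℓ(candidate)
        Δℓ = ℓcand - ℓC                       # step 3: log-likelihood difference
        if log(rand()) < Δℓ                   # step 4: accept with probability min(1, exp(Δℓ))
            C, ℓC = candidate, ℓcand
        end
        chain[i, :] .= C                      # step 5: record the current state
    end
    return chain
end
```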
The sequence of parameter values generated by the algorithm can then be used to construct any statistic of the probability distribution in Equation 18, including empirical distributions, marginal distributions, and joint distributions. In the context of KPP, it can be used to generate the uncertainty of the temperature at any depth and time, as well as the uncertainty of the boundary layer depth at a given time.
To guide the choice of an appropriate Σ, one diagnoses the “number of independent samples” using approximations of the correlation length, as described by Sokal (1997). If Σ is too small, the acceptance rate is too high, since each candidate barely differs from the current parameters; too large a Σ yields too low an acceptance rate. To find an appropriate compromise, we perform a preliminary random walk, estimate the covariance matrix of the resulting distribution, and then set Σ equal to this covariance matrix.
Last, in order to sample parameters within a finite domain, we make the parameter space periodic, so that the random walk is guaranteed never to leave the desired domain.
Open Research
Data Availability Statement
Code and data may be found via figshare at Souza (2020).