Volume 48, Issue 2 e2020GL091236
Research Letter
Open Access

Observational Constraints on Warm Cloud Microphysical Processes Using Machine Learning and Optimization Techniques

J. Christine Chiu (Corresponding Author)
Department of Atmospheric Science, Colorado State University, Fort Collins, CO, USA
Correspondence to: J. C. Chiu, [email protected]

C. Kevin Yang
Department of Atmospheric Science, Colorado State University, Fort Collins, CO, USA

Peter Jan van Leeuwen
Department of Atmospheric Science, Colorado State University, Fort Collins, CO, USA
Department of Meteorology, University of Reading, Reading, UK

Graham Feingold
NOAA Earth System Research Laboratory, Boulder, CO, USA

Robert Wood
Department of Atmospheric Sciences, University of Washington, Seattle, WA, USA

Yann Blanchard
Department of Earth and Atmospheric Sciences, ESCER Centre, University of Quebec at Montreal, Montreal, QC, Canada

Fan Mei
Pacific Northwest National Laboratory, Richland, WA, USA

Jian Wang
Center for Aerosol Science and Engineering, Department of Energy, Environmental and Chemical Engineering, Washington University in Saint Louis, Saint Louis, MO, USA

First published: 11 December 2020

Abstract

We introduce new parameterizations for autoconversion and accretion rates that greatly improve representation of the growth processes of warm rain. The new parameterizations capitalize on machine-learning and optimization techniques and are constrained by in situ cloud probe measurements from the recent Atmospheric Radiation Measurement Program field campaign at Azores. The uncertainty in the new estimates of autoconversion and accretion rates is about 15% and 5%, respectively, outperforming existing parameterizations. Our results confirm that cloud and drizzle water content are the most important factors for determining accretion rates. However, for autoconversion, in addition to cloud water content and droplet number concentration, we discovered a key role of drizzle number concentration that is missing in current parameterizations. The robust relation between autoconversion rate and drizzle number concentration is surprising but real, and furthermore supported by theory. Thus, drizzle number concentration should be considered in parameterizations for improved representation of the autoconversion process.

Key Points

  • Machine-learning trained by in situ data constrains autoconversion and accretion rates with uncertainty of 15% and 5%, respectively

  • There is a surprising relation between autoconversion rate and drizzle number concentration that significantly improves parameterizations

  • The exponent of autoconversion rate dependence on cloud number concentration is −0.75, smaller in magnitude than that in existing parameterizations

Plain Language Summary

Drizzle has been a key element of research, because its formation modulates cloud properties and evolution, and affects the water cycle of the Earth. Since drizzle formation involves cloud droplets of all sizes, it requires extensive computational time. Hence, we often use simplified methods in weather and climate prediction models to obtain a bulk estimate of how fast and how many cloud droplets collide with each other or collide with bigger drops to form drizzle. However, many models continue to have inadequate representation of drizzle formation, calling for the need to improve these simplified methods. We introduce new methods to estimate the rate of those microphysical processes, capitalizing on aircraft measurements and recent advances in machine-learning techniques. Our techniques outperform the current methods significantly. Importantly, our analyses reveal that the rate of drizzle formation via collisions between cloud drops is related to drizzle drop number concentration itself, which is missing in the existing methods. This relation occurs because drizzle drop number concentration provides information on the stage of evolution of cloud size distribution during drizzle formation. Although this is not a causal relationship, it is important to incorporate this relation into models for better prediction of drizzle formation.

1 Introduction

Warm rain formation plays a crucial role in determining the properties and life cycle of marine boundary layer clouds, and has significant impacts on radiative and hydrological budgets. Yet, many global weather and climate models continue to produce rain that is too frequent over oceans (e.g., Stephens et al., 2010), too light (Ahlgrimm & Forbes, 2014; Jing et al., 2017), or too heavy (Abel & Boutle, 2012; Bodas-Salcedo et al., 2008). The intermodel spread in precipitation rate in the southeast Pacific, one of the major marine boundary layer cloud decks, can be an order of magnitude (Wyant et al., 2015). The model discrepancy and spread in precipitation are linked to diverse issues, such as rain drop size distributions, and the representation of boundary layer, autoconversion, and accretion processes. The effects of autoconversion on precipitation can change surface temperature prediction significantly (Golaz et al., 2013).

Many warm rain parameterizations for autoconversion and accretion processes have been developed in the past (e.g., Beheng, 1994; Berry, 1968; Kessler, 1969; Khairoutdinov & Kogan, 2000; Liu & Daum, 2004; Seifert & Beheng, 2001; and many others). These parameterizations have been reviewed critically (Lee & Baik, 2017; Liu & Daum, 2004; Wood, 2005), and confronted with in situ observations (Hsieh et al., 2009; Wood, 2005). By applying in situ size-resolved cloud measurements to the continuous collection equation, the two aforementioned observational studies showed that parameterized accretion rates generally agree with in situ data, but parameterized autoconversion rates can be significantly different from observational estimates. While these results are encouraging and informative, there has been little follow-up observational work. It remains unclear how to maximize the use of observations for improving understanding and model representations of these microphysical processes, and how to extend constraints from in situ to remote sensing platforms that can provide continuous observations in various cloud regimes on a global scale.

The objectives of this study are manifold. Instead of evaluating existing parameterizations, here we use in situ observations to build machine-learning (ML) models to “predict” autoconversion and accretion rates. Since translating ML results to physical formulations is not trivial and remains an active research area, we also perform nonlinear optimizations to fill the gap and to quantify the relationships of autoconversion and accretion with cloud/drizzle properties. These results are compared and contrasted with widely used parameterizations, and the implications are discussed.

2 In Situ Cloud Measurements

In situ cloud measurements were taken from the Aerosol and Cloud Experiments in the Eastern North Atlantic (ACE-ENA) field campaign, deployed by the Atmospheric Radiation Measurement (ARM) user facility. The aircraft flew near the ARM site on Graciosa Island during two intensive operational periods in June-July 2017 (IOP1) and January-February 2018 (IOP2). The cloud types sampled in ACE-ENA are mainly marine stratocumulus, with some scattered or precipitating cumulus. Measurements from three cloud probes were merged to form combined drop size distributions (DSD). Cloud droplets are defined as those with radii smaller than 25 µm, and drizzle drops are defined as those with radii larger than 25 µm and up to 400 µm. The choice of the cloud/drizzle separation threshold is appropriate for marine stratocumulus (Khairoutdinov & Kogan, 2000; Kogan, 2013). We also define a cloudy sample as one with cloud water content qc > 0.01 g m–3. Based on this definition, a total of ∼93,000 in situ cloudy DSDs comprise 11% drizzle-free DSDs and 89% drizzling DSDs after data screening (see Table S1). The smallest drizzle water content (qr) observed in the drizzling DSDs is on the order of 10^−5 g m–3, and thus we set it as the threshold for drizzle delineation, denoted as qr,crit. Property distributions from individual days for cloudy samples are shown in Figures 1a–1d.
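The cloud/drizzle partition described above can be sketched in a few lines. This is a minimal illustration, not the campaign's processing code: the five-bin DSD, the function names, and the treatment of a drop at exactly 25 µm (counted as drizzle) are assumptions, as is the strict inequality in the cloudy-sample test.

```python
import math

RHO_WATER = 1000.0   # kg m^-3, density of liquid water
R_SPLIT = 25e-6      # m, cloud/drizzle separation radius given in the text
R_MAX = 400e-6       # m, upper drizzle radius given in the text
QC_CLOUDY = 1e-5     # kg m^-3, equal to 0.01 g m^-3


def bulk_moments(radii_m, conc_m3):
    """Return (qc, Nc, qr, Nr) from bin radii (m) and bin number concentrations (m^-3)."""
    qc = nc = qr = nr = 0.0
    for r, n in zip(radii_m, conc_m3):
        mass = 4.0 / 3.0 * math.pi * RHO_WATER * r ** 3  # kg per drop
        if r < R_SPLIT:
            nc += n
            qc += n * mass
        elif r <= R_MAX:
            # Assumption: a drop at exactly 25 um is counted as drizzle.
            nr += n
            qr += n * mass
    return qc, nc, qr, nr


def is_cloudy(qc):
    """Cloudy-sample test; the strict inequality is an assumption."""
    return qc > QC_CLOUDY


# Hypothetical five-bin DSD, for illustration only.
radii = [5e-6, 10e-6, 20e-6, 50e-6, 100e-6]
conc = [4e7, 5e7, 1e7, 2e4, 1e3]
qc, nc, qr, nr = bulk_moments(radii, conc)
print(f"qc={qc:.3e} kg m^-3, Nc={nc:.3e} m^-3, qr={qr:.3e} kg m^-3, Nr={nr:.3e} m^-3")
print("cloudy:", is_cloudy(qc))
```

With real probe data the same sums would run over the merged instrument bins rather than a handful of representative radii.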

Figure 1. Box plots of in situ (a) cloud water content (qc), (b) cloud droplet number concentration (Nc), (c) drizzle water content (qr), and (d) drizzle drop number concentration (Nr) observed in IOP1 (left panels) and IOP2 (right panels) from the ARM campaign at the Azores. The bottom and top of each box represent the 25% and 75% quartiles, and the line inside the box represents the median. The whiskers mark the 5th and 95th percentiles. Panels (e and f) are calculated autoconversion rate (Pau) and accretion rate (Pac) from observed drop size distributions using the stochastic collection equation formulated as a two-moment bin model (see Section 3). Based on 1-s measurements, the sample size for each day is listed in (b) for all conditions, and (f) for drizzling conditions. Flights on June 29 and July 6, 2017 were excluded due to data availability. ARM, atmospheric radiation measurement; IOP, intensive operational period.

3 Parameterization Derived From Machine Learning Techniques

The in situ DSDs from ACE-ENA are used to develop two ML models. The first ML model, dubbed “initiation model,” uses two inputs (qc, Nc) for predicting autoconversion rate (Pau) in drizzle-absent conditions. The second, dubbed “standard model,” uses four inputs (qc, Nc, qr, Nr) for predicting Pau and accretion rate (Pac) in drizzling conditions. As discussed further below, we use the initiation model to generate nonzero qr and Nr values. Once qr and Nr exist, the standard model is superior and used to better predict Pau and Pac.

Both models use an Artificial Neural Network. It is a deep feedforward neural network (Schmidhuber, 2015) comprising eight hidden, fully connected layers with 1,024 nodes in each layer. All input and output variables are transformed to their logarithmic forms. Since these input variables have rather different magnitudes, we normalized them using their mean and standard deviation. We used LeakyReLU (Maas et al., 2013) as our activation function. Additionally, the training was performed by the Adam optimizer (Kingma & Ba, 2015), based on a loss function defined as the mean squared error between the true value and the prediction.
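The preprocessing and the building blocks named above (log transform, standardization, LeakyReLU, mean squared error) can be illustrated without a deep-learning framework. The sketch below is not the authors' network: the function names are hypothetical, the base-10 logarithm and the negative slope of 0.01 are assumptions (neither is stated in the text), and a real model would stack eight such layers of 1,024 nodes and train them with Adam.

```python
import math


def log_standardize(values):
    """Log-transform positive values, then scale to zero mean and unit variance."""
    logs = [math.log10(v) for v in values]  # assumption: base-10 logarithm
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((x - mean) ** 2 for x in logs) / len(logs))
    return [(x - mean) / std for x in logs], mean, std


def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU activation; the slope value is an assumption."""
    return x if x > 0 else negative_slope * x


def dense_layer(inputs, weights, biases, negative_slope=0.01):
    """One fully connected layer followed by LeakyReLU."""
    return [
        leaky_relu(sum(w * v for w, v in zip(row, inputs)) + b, negative_slope)
        for row, b in zip(weights, biases)
    ]


def mse(pred, true):
    """Mean squared error, the loss named in the text."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


# Hypothetical droplet number concentrations (m^-3) spanning two decades.
z, mu, sigma = log_standardize([1e6, 1e7, 1e8])
hidden = dense_layer([1.0, -1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(z, mu, sigma, hidden)
```

Standardizing in log space keeps inputs such as Nc (around 10^8 m–3) and qr (as small as 10^−18 g m–3) on comparable scales before they reach the first layer.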

The training data sets for the two models are different but originate from the same pool of data points. The pool was generated by using the in situ DSDs as the initial conditions and propagating the DSDs forward in time with the stochastic collection equation (SCE). We used the two-moment bin model of Tzivion et al. (1987) to compute Pau and Pac directly from the explicit drop-drop interaction terms at 1-s time steps for 10 min. The bin model uses the Hall (1980) kernel. The 10-min time period is based on the typical in-cloud residence time (Feingold et al., 1996). Since our focus is on clouds, we exclude noncloudy data points from the pool.

For the initiation ML model, the training data set is based on data points generated from the initially drizzle-free DSDs in the pool. To ensure that DSDs used for the initiation model are absolutely drizzle free, we exclude DSDs that contain cloud droplets in the instrument size bin (17.5–22.5 µm radius) proximate to the cloud/drizzle boundary (i.e., 25 µm radius), based on the uncertainty of 1.5–5 µm in in situ size measurements (Glienke & Mei, 2019, 2020). We do retain all DSDs that do not have droplets in that bin initially but produce nonzero qr in 5 s. These are practical choices that are as inclusive as possible of DSDs and also facilitate the ML. As shown in Figures 2a and 2b, the initiation model has 25th and 75th percentile errors of –60% and 80%, respectively.

Figure 2. Plots of the predicted versus the true autoconversion rates from the testing data set, using (a) the initiation and (c) standard machine-learning model, (e) the KK parameterization, and (g) Equation 6. (a) is a scatter plot, while the number of data points in all others is indicated by color. The corresponding histograms of errors (%) from these individual methods are shown in (b, d, f, and h), respectively. The blue, black, and red dashed lines, respectively, represent the 25th, 50th, and 75th percentiles of the data. The corresponding errors for these lines are denoted in each subplot in their own color. For the KK parameterization, the 75th percentile is not plotted because it is out of the axis range. For the standard ML model, the mean error (%) and the mean absolute deviation (%, using the mean as the center point) are denoted in (d).

Note that these initially drizzle-free DSDs generate qr ranging between 10^−18 and 10^−9 g m–3 in 5 s. The lower bound, 10^−18 g m–3, is equivalent to one single drizzle drop with a radius of 25 µm in a 10 km × 10 km × 500 m cloudy volume. The upper bound, 10^−9 g m–3, is at least four orders of magnitude smaller than any in situ measured qr. Therefore, although the initiation model generates nonzero qr, these numbers are very small and should still be considered as nondrizzling in any practical sense. We emphasize that the initiation ML model is mainly used as a gateway to the standard ML model that requires nonzero qr as input.
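The equivalence quoted for the lower bound is easy to check by hand. The short calculation below assumes a liquid water density of 1 g cm–3 and a spherical drop; the variable names are illustrative.

```python
import math

RHO_WATER_G_M3 = 1.0e6          # g m^-3 (1 g cm^-3), assumed density of liquid water
radius_m = 25e-6                # drizzle drop radius from the text
volume_m3 = 10e3 * 10e3 * 500   # 10 km x 10 km x 500 m cloudy volume

drop_mass_g = 4.0 / 3.0 * math.pi * RHO_WATER_G_M3 * radius_m ** 3
qr_single_drop = drop_mass_g / volume_m3  # g m^-3

print(f"mass of one drop: {drop_mass_g:.3e} g")
print(f"equivalent qr:    {qr_single_drop:.3e} g m^-3")
```

The result is a little above 10^−18 g m–3, consistent with the order of magnitude stated in the text.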

For the standard model, we sample all the cloudy points every 5 s from the pool to form the training data set, as long as their qr ≥ 10^−18 g m–3. This threshold is based on the qr magnitude that can be initiated by the aforementioned initiation model, ensuring that the qr ranges of the two models overlap as much as possible. This leads to a total of ∼10.7 M data points. From the rest of the pool that are not sampled for training, we randomly selected 2.5 M points for testing. The ratio between the training and the testing is about 4, similar to the standard practice in ML. Additionally, both the training and testing data sets contain ∼25% of data that have a ratio of Pau to Pac ≥ 1 (i.e., in the early stages of drizzle formation), and 75% of data that have a ratio < 1.

Figures 2c, 2d, 3a, and 3b show the performance of the standard ML model. For both Pau and Pac, the majority of the data points fall on the 1:1 line in the scatter plots, confirming the appropriateness of the neural network. The uncertainty is 15% for Pau, and 5% for Pac. The good performance on the testing data set indicates that the ML model does not suffer from overfitting. Predicting Pau and Pac for these 2.5 M points takes about 100 s using a single Intel Xeon E5-2697V4 processor. Note that training Pau and Pac separately using the same input yielded similar results (see Table S2).
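The error statistics reported in the figures and in Table 1 (25th, 50th, and 75th percentile errors, mean error, and mean absolute deviation about the mean) can be computed as sketched below. The definition of percent error as 100 × (predicted − true) / true is an assumption, as are the function names and the toy numbers.

```python
import statistics


def percent_errors(pred, true):
    """Relative error of each prediction in percent (assumed definition)."""
    return [100.0 * (p - t) / t for p, t in zip(pred, true)]


def error_quartiles(errors):
    """25th, 50th, and 75th percentiles of the error distribution."""
    q25, q50, q75 = statistics.quantiles(errors, n=4, method="inclusive")
    return q25, q50, q75


def mean_abs_deviation(errors):
    """Mean absolute deviation using the mean as the center point."""
    center = statistics.fmean(errors)
    return statistics.fmean(abs(e - center) for e in errors)


# Toy process rates, for illustration only.
true_rates = [1.0, 2.0, 4.0, 8.0, 10.0]
pred_rates = [1.1, 1.8, 4.0, 8.8, 9.0]
errs = percent_errors(pred_rates, true_rates)
q25, q50, q75 = error_quartiles(errs)
mad = mean_abs_deviation(errs)
print(q25, q50, q75, mad)
```

Reporting quartiles alongside the mean absolute deviation helps when the error distribution is skewed, as the autoconversion histograms in Figure 2 appear to be.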

Figure 3. Same as Figure 2, but for accretion rate. Panels (e and f) are based on Equation 7, using 2.5 M data points. Note that estimates from KK and Equation 7 are so similar that the density scatter plots do not show any obvious differences.

Results from the standard ML model are further compared with those from the parameterization proposed in Khairoutdinov and Kogan (2000), dubbed the KK parameterization. KK describes Pau and Pac as:
urn:x-wiley:00948276:media:grl61710:grl61710-math-0004(1)
urn:x-wiley:00948276:media:grl61710:grl61710-math-0005(2)
where ρ is the air density and all variables are in SI units. As shown in Figures 2e, 2f, 3c, and 3d, the KK parameterization predicts Pac reasonably well, with errors smaller than 40%, but significantly overestimates low Pau and slightly underestimates high Pau. The performance of KK is consistent with the findings in Wood (2005).

4 Parameterizations Based on a Simple Form

While the ML model can be used to predict Pau and Pac successfully, extracting the bulk dependencies that shed light on the underlying physics is not straightforward. To characterize the physical relationships, we assume that process rate, P, representing either Pau or Pac, can be parameterized as:
$$P = k\, q_c^{a}\, N_c^{b}\, q_r^{c}\, N_r^{d} \tag{3}$$
This form is similar to the KK parameterization (Equations 1 and 2), and the parameterization in Liu and Daum (2004), dubbed the LD parameterization, in which
urn:x-wiley:00948276:media:grl61710:grl61710-math-0007(4)
Unlike the KK and LD parameterizations, we include all observables on the right-hand side of Equation 3 to cover any possible dependencies, but will examine if all are necessary.
To find the optimal values for parameters a–d and k in Equation 3, we take the logarithm of both sides, leading to
$$\log P = \log k + a \log q_c + b \log N_c + c \log q_r + d \log N_r \tag{5}$$
This linear equation allows us to set up a least squares problem, solved with a limited memory quasi-Newton method (L-BFGS; Byrd et al., 1995) with wide bounds on the parameter values. The bounds are wide enough to ensure the best values do not end up close to the bounds. Uncertainty estimates are obtained from the inverse of the Hessian. The mean value of k has been corrected for the fact that k is implicitly assumed to be lognormally distributed because of the log transformation between Equations 3 and 5.
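Because the log-transformed equation is linear in log k and the exponents, the fit can be illustrated with ordinary least squares. The sketch below deliberately swaps the bounded L-BFGS solver described above for the normal equations, omits the measurement-error weighting and the lognormal correction to k, and uses synthetic data generated from made-up parameters; all names are hypothetical.

```python
import math
import random


def solve_linear(a_matrix, b_vector):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b_vector)
    m = [row[:] + [rhs] for row, rhs in zip(a_matrix, b_vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_power_law(rates, predictors):
    """Fit log P = log k + sum_i e_i log X_i; returns (k, [e_1, e_2, ...])."""
    rows = [[1.0] + [math.log(v) for v in vals] for vals in zip(*predictors)]
    y = [math.log(p) for p in rates]
    size = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(size)]
    coef = solve_linear(ata, aty)
    return math.exp(coef[0]), coef[1:]


# Synthetic data: P = 10 * qc^2 * Nc^-0.75 with lognormal scatter (made-up values).
rng = random.Random(0)
qc = [10 ** rng.uniform(-5, -3) for _ in range(400)]   # kg m^-3
nc = [10 ** rng.uniform(7, 9) for _ in range(400)]     # m^-3
p = [10.0 * q ** 2.0 * n ** -0.75 * math.exp(rng.gauss(0.0, 0.05)) for q, n in zip(qc, nc)]

k_fit, (a_fit, b_fit) = fit_power_law(p, [qc, nc])
print(f"k={k_fit:.3g}, a={a_fit:.3f}, b={b_fit:.3f}")
```

A bounded quasi-Newton solver becomes useful when the parameters must stay within physical limits or when the misfit is weighted by measurement uncertainty, which plain normal equations do not handle.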

We use the testing data set with 2.5 M points to derive the parameters in Equation 3. The in situ qc and qr have an uncertainty of 30%, while Nc and Nr have an uncertainty of 50% and 20%, respectively (Glienke & Mei, 2019, 2020; Mei et al., 2020). These uncertainties are accounted for in Equation 3, leading to an additive error of 0.8 in Equation 5. Since it is possible that not all the variables constrain the solution, we systematically reduce the number of variables and adjust the additive error accordingly in the minimization.

Table 1 summarizes the parameter estimates and error statistics for predicting Pau and Pac, based on the testing data set but with qr ≥ qr,crit. The reason for this restriction on qr is that the power laws are unable to fit a range of qr spanning 18 orders of magnitude. As a result, the sample size was reduced from 2.5 to 2.3 M. If we must predict Pau from qc and Nc alone, as do existing parameterizations, the corresponding exponents are about 2.90 and –1.69, respectively. Our qc exponent is closer to LD's value (a = 3) than KK's (a = 2.47), and our Nc exponent is closer to KK's value (b = –1.78) than LD's (b = –1). As shown in Table 1, adding qr into parameterizations produces a better correlation in Pau predictions. However, in general, the parameterizations involving Nr tend to perform best and have smaller errors. Once Nr is considered in the physical relationship, the exponents for both qc and Nc, that is, the sensitivities of Pau to these two variables, are reduced. Interestingly, we also find that the exponents of Nr and Nc are nearly reciprocal. The key role of Nr in autoconversion rate is counter-intuitive and will be discussed in the next section.

Table 1. Optimal Parameters for Representing Autoconversion and Accretion Rates in the Form of $P = k\, q_c^{a} N_c^{b} q_r^{c} N_r^{d}$, where P is in kg m–3 s–1, qc in kg m–3, Nc in m–3, qr in kg m–3, and Nr in m–3, Using 2.3 M Data Points

| Corr. | Error (%), 25th/50th/75th | k | a | b | c | d |
| --- | --- | --- | --- | --- | --- | --- |
| Autoconversion | | | | | | |
| Nonturbulent conditions | | | | | | |
| 0.96 | −39/−8/55 | 2.44 ± 0.05 | 2.0681 ± 0.0007 | −0.7760 ± 0.0007 | −0.1285 ± 0.0004 | 0.7844 ± 0.0005 |
| 0.96 | −39/−9/55 | 5.9 ± 0.1 | 1.9839 ± 0.0008 | −0.7496 ± 0.0007 | −0.0642 ± 0.0005^a | 0.7043 ± 0.0006 |
| 0.92 | −52/−9/90 | (164 ± 2) E7 | 2.2742 ± 0.0007 | −1.0930 ± 0.0006 | 0.3177 ± 0.0002 | |
| 0.93 | −49/−9/77 | (71.1 ± 1.4) E7 | 2.5247 ± 0.0009 | −1.0548 ± 0.0008 | 0.4185 ± 0.0004^a | |
| 0.96 | −39/−8/55 | 16.8 ± 0.3 | 2.0150 ± 0.0006 | −0.7461 ± 0.0006 | | 0.6403 ± 0.0003 |
| 0.88 | −59/−8/129 | (375 ± 4) E12 | 2.8957 ± 0.0005 | −1.6945 ± 0.0004 | | |
| Turbulent conditions with a dissipation rate of 400 cm2 s−3 | | | | | | |
| 0.96 | −39/−8/53 | 11.1 ± 0.2 | 1.9777 ± 0.0007 | −0.7366 ± 0.0006 | | 0.6511 ± 0.0003 |
| Nonturbulent conditions, but only using points with Pau/Pac ≥ 1 | | | | | | |
| 0.96 | −31/1/49 | (201 ± 7) E6 | 1.7699 ± 0.0018 | −0.7975 ± 0.0014 | 0.8043 ± 0.0009 | |
| 0.96 | −31/1/48 | 1.22 ± 0.05 | 1.7656 ± 0.0017 | −0.7929 ± 0.0013 | | 0.8432 ± 0.0008 |
| 0.84 | −56/3/129 | (73 ± 2) E10 | 2.611 ± 0.0014 | −1.512 ± 0.001 | | |
| Nonturbulent conditions, but only using initially drizzle-free cloud size distributions | | | | | | |
| 0.89 | −54/−2/113 | (4 ± 2) E17 | 4.08 ± 0.02 | −2.25 ± 0.02 | | |
| Accretion | | | | | | |
| Nonturbulent conditions | | | | | | |
| 0.996 | −22/−3/22 | (94.8 ± 1.5) E5 | 1.4030 ± 0.0007 | −0.3147 ± 0.0006 | 1.3069 ± 0.0004 | −0.2389 ± 0.0004 |
| 0.997 | −25/−3/29 | (89 ± 1) E3 | 1.4159 ± 0.0006 | −0.3018 ± 0.0005 | 1.1172 ± 0.0001 | |
| 0.960 | −64/27/244 | (631 ± 9) E–4 | 1.9603 ± 0.0006 | −0.6487 ± 0.0005 | | 1.2153 ± 0.0001 |
| 0.996 | −23/−6/21 | 69.5 ± 0.2 | 1.1476 ± 0.0002 | | 1.1587 ± 0.0001 | |
| Turbulent conditions with a dissipation rate of 400 cm2 s−3 | | | | | | |
| 0.997 | −20/−5/17 | 55.2 ± 0.1 | 1.1287 ± 0.0003 | | 1.1462 ± 0.0001 | |

  • The correlation coefficient (Corr.), the 25th, 50th, and 75th percentile errors (%) are also listed. The rows in bold indicate the set of parameters used for comparisons to the ML model in Figures 2 and 3.
  • ^a Instead of qr, we use qr/(qc + qr) in the formula, motivated by Seifert and Beheng (2001).

For accretion, the key roles of qc and qr are consistent with existing parameterizations. Taking the KK parameterization as an example, the exponent for qc and qr is 1.15, which is very close to our exponents, 1.148 for qc and 1.159 for qr.

Based on the performance in errors in Table 1 and the minimum number of variables required to reach a high correlation, we select the following power laws for predicting Pau and Pac:
$$P_{au} = 16.8\, q_c^{2.0150}\, N_c^{-0.7461}\, N_r^{0.6403} \tag{6}$$
$$P_{ac} = 69.5\, q_c^{1.1476}\, q_r^{1.1587} \tag{7}$$
Figures 2g, 2h, 3e, and 3f show that the parameterizations from Equations 6 and 7 perform well, but not as well as the standard ML model. The ML model is therefore the desired choice, but Equations 6 and 7 remain a good option compared to KK.
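For readers who want to try the selected power laws, the sketch below evaluates them with coefficients read from Table 1: the nonturbulent autoconversion row that uses (qc, Nc, Nr) and the nonturbulent accretion row that uses (qc, qr). Treating those rows as the selected ones is an inference from the text, the function names are hypothetical, and the sample inputs are illustrative values rather than measurements.

```python
def autoconversion_rate(qc, nc, nr):
    """Pau in kg m^-3 s^-1 from qc (kg m^-3), Nc (m^-3), and Nr (m^-3)."""
    return 16.8 * qc ** 2.0150 * nc ** -0.7461 * nr ** 0.6403


def accretion_rate(qc, qr):
    """Pac in kg m^-3 s^-1 from qc and qr (both in kg m^-3)."""
    return 69.5 * qc ** 1.1476 * qr ** 1.1587


# Illustrative stratocumulus-like values (assumed, not from the campaign).
qc, nc = 3e-4, 1e8   # 0.3 g m^-3 and 100 cm^-3
qr, nr = 1e-5, 1e4   # 0.01 g m^-3 and 0.01 cm^-3

p_au = autoconversion_rate(qc, nc, nr)
p_ac = accretion_rate(qc, qr)
print(f"Pau = {p_au:.2e} kg m^-3 s^-1, Pac = {p_ac:.2e} kg m^-3 s^-1")

# Sensitivity to droplet number: doubling Nc scales Pau by 2^-0.7461.
ratio = autoconversion_rate(qc, 2 * nc, nr) / p_au
print(f"Pau ratio after doubling Nc: {ratio:.3f}")
```

The last line makes the reduced Nc sensitivity concrete: doubling Nc lowers Pau by roughly 40%, compared with roughly 70% for an exponent near −1.7.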
We also performed the minimization technique on the drizzle initialization data set using only qc and Nc, resulting in
$$P_{au} = 4 \times 10^{17}\, q_c^{4.08}\, N_c^{-2.25} \tag{8}$$
This is in very close agreement with the analytical expression in Seifert and Beheng (2001) that was based on a gamma distribution for the cloud droplet mass and the Long kernel, in the limit of drizzle initiation.

5 The Dependence of Autoconversion on Drizzle Number Concentration Nr

Results in Sections 3 and 4 demonstrate that both Pau and Pac are influenced by cloud and drizzle simultaneously. The influence of cloud and drizzle properties on accretion makes sense and is consistent with collision-coalescence theory and existing parameterizations, but the influence of drizzle on autoconversion is less straightforward.

From the definition of autoconversion, one should not expect a causal relationship between Nr and Pau. Instead, the dependence of Pau on Nr represents the influence on Pau from the evolution stage of the cloud DSD, which is related to the appearance of raindrops. Such an influence was first pointed out by Cotton (1972), followed by Seifert and Beheng (2001) who incorporated this associated relationship using the ratio of qr to total water content (shown as one of the options in Table 1). Zeng and Li (2020) also demonstrated that qr is a good predictor of the width of the cloud droplet size distribution, and thus Pau. However, it remains unclear what is the best form to describe this associated relationship, and whether qr and Nr are equally effective predictors.

To understand whether Nr contains different information from qr and whether Nr is an effective predictor for all coalescence regimes, we conducted a number of ML tests (see Table S2). For the regime of Pau/Pac ≥ 1, we found the use of (qc, Nc, qr, Nr) remains the best, and the performances from (qc, Nc, Nr) and (qc, Nc, qr) are similar. For all regimes, the use of (qc, Nc, qr, Nr) is better than (qc, Nc, Nr), and the latter is better than (qc, Nc, qr). These results suggest that qr and Nr contain different information. They also suggest that qr and Nr are equally effective predictors in the autoconversion-dominant regime, but Nr is a better choice for all regimes. This is understandable as qr depends also on the accretion rate, while Nr is not affected by accretion.

We further turn to the SCE used to generate the data set to seek a theoretical basis for the associated relationship. The rate of change in the total drop number concentration (N) can be derived from
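$$\frac{\partial N}{\partial t} = -\frac{1}{2}\int_{0}^{\infty}\!\int_{0}^{\infty} n(x)\, n(y)\, K(x,y)\, dx\, dy \tag{9}$$
in which x and y represent the mass per drop; n(x) is the drop number concentration as function of mass; and K(x,y) is the collection kernel.
Defining cloud droplets as those drops with mass smaller than x0, and rain drops as those with mass larger than x0 allows us to decompose Equation 9 as:
$$\frac{\partial N}{\partial t} = -\frac{1}{2}\int_{0}^{x_0}\!\int_{0}^{x_0} n(x)\, n(y)\, K(x,y)\, dx\, dy - \frac{1}{2}\int_{0}^{x_0}\!\int_{x_0}^{\infty} n(x)\, n(y)\, K(x,y)\, dx\, dy - \frac{1}{2}\int_{x_0}^{\infty}\!\int_{0}^{x_0} n(x)\, n(y)\, K(x,y)\, dx\, dy - \frac{1}{2}\int_{x_0}^{\infty}\!\int_{x_0}^{\infty} n(x)\, n(y)\, K(x,y)\, dx\, dy \tag{10}$$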
The first integral in Equation 10 denotes the changes in N due to interactions between cloud droplets, that is, via self-collection and autoconversion. The second and third integrals denote interactions between cloud and rain drops, that is, via accretion. The last integral contains interactions between rain drops, that is, via self-collection of rain drops. To proceed analytically, we use the kernels of Long (1974),
$$K(x,y) = \begin{cases} k_c\,(x^2 + y^2), & \text{both drops with radius} \le 50\ \mu\text{m} \\ k_r\,(x + y), & \text{otherwise} \end{cases} \tag{11}$$
as an example, leading to:
urn:x-wiley:00948276:media:grl61710:grl61710-math-0024(12)
where urn:x-wiley:00948276:media:grl61710:grl61710-math-0025 is defined as:
urn:x-wiley:00948276:media:grl61710:grl61710-math-0026(13)
proportional to the radar reflectivity in the Rayleigh regime. As explained above, the underlying processes for each term in Equation 12 are:
urn:x-wiley:00948276:media:grl61710:grl61710-math-0027(14)
where the subscript “sc” denotes self-collection.
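The decomposition of the number tendency into cloud-cloud, cloud-rain, and rain-rain interactions can be checked numerically for a discrete DSD, where the double integral reduces to a double sum. The sketch below assumes Long's piecewise kernel with the commonly quoted coefficients converted to SI units (kc = 9.44 × 10^9 m3 kg−2 s−1, kr = 5.78 m3 kg−1 s−1) and a 50 µm switch radius; the four-size DSD and all names are hypothetical.

```python
import math

RHO_WATER = 1000.0  # kg m^-3
K_C = 9.44e9        # m^3 kg^-2 s^-1, Long (1974) small-drop coefficient in SI
K_R = 5.78          # m^3 kg^-1 s^-1, Long (1974) large-drop coefficient in SI


def drop_mass(radius_m):
    """Mass (kg) of a spherical water drop."""
    return 4.0 / 3.0 * math.pi * RHO_WATER * radius_m ** 3


X_LONG = drop_mass(50e-6)  # mass at which Long's kernel switches form


def long_kernel(x, y):
    """Long's polynomial collection kernel (m^3 s^-1) for drop masses x, y (kg)."""
    if max(x, y) <= X_LONG:
        return K_C * (x * x + y * y)
    return K_R * (x + y)


def number_tendency(masses, conc, x0):
    """Split dN/dt (m^-3 s^-1) of a discrete DSD into interaction types."""
    parts = {"cloud-cloud": 0.0, "cloud-rain": 0.0, "rain-rain": 0.0}
    for xi, ni in zip(masses, conc):
        for yj, nj in zip(masses, conc):
            term = -0.5 * ni * nj * long_kernel(xi, yj)
            if xi < x0 and yj < x0:
                parts["cloud-cloud"] += term
            elif xi >= x0 and yj >= x0:
                parts["rain-rain"] += term
            else:
                parts["cloud-rain"] += term
    return parts


# Hypothetical DSD: two cloud sizes and two drizzle sizes, split at 25 um.
x0 = drop_mass(25e-6)
masses = [drop_mass(r) for r in (10e-6, 20e-6, 40e-6, 100e-6)]
conc = [5e7, 1e7, 1e4, 1e3]  # m^-3

parts = number_tendency(masses, conc, x0)
total = -0.5 * sum(
    ni * nj * long_kernel(xi, yj)
    for xi, ni in zip(masses, conc)
    for yj, nj in zip(masses, conc)
)
print(parts, total)
```

The cloud-cloud part bundles self-collection and autoconversion together; separating those two requires tracking which collisions produce a drop heavier than x0, which is the step the analytical derivation addresses.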
Similar to Equation 9, we can perform the integrals for qc, that is,
urn:x-wiley:00948276:media:grl61710:grl61710-math-0028
which leads to
urn:x-wiley:00948276:media:grl61710:grl61710-math-0029(15)
Let us define
urn:x-wiley:00948276:media:grl61710:grl61710-math-0030(16)
where urn:x-wiley:00948276:media:grl61710:grl61710-math-0031 is the average mass per cloud droplet. Then, we find
urn:x-wiley:00948276:media:grl61710:grl61710-math-0032(17)
Since
urn:x-wiley:00948276:media:grl61710:grl61710-math-0034(18)
we can combine Equations 17 and 18 to express the autoconversion process rate as:
urn:x-wiley:00948276:media:grl61710:grl61710-math-0035(19)
Inserting Equations 14–16 into Equation 19 yields
urn:x-wiley:00948276:media:grl61710:grl61710-math-0036(20)
with kr = 5.78 m3 kg−1 s−1 and kc = 9.44 × 10^9 m3 kg−2 s−1 (Long, 1974).
There are a number of interesting features in Equation 20. First, the right-hand side is independent of qr. Second, the first term on the right-hand side has a remarkable resemblance to our power laws, showing the dependence of Pau on Nr and the reciprocal feature between Nc and Nr. Since the separation in Long's kernels was 50 µm radii, we extend the derivation of Equation 20 for cases in which the cutoff size is smaller than Long's separation value. As shown in supporting information Text S1, Equation 20 can be rewritten as:
urn:x-wiley:00948276:media:grl61710:grl61710-math-0038(21)
where Tc is the third moment of the mass distribution, and urn:x-wiley:00948276:media:grl61710:grl61710-math-0039 is the number concentration in the range between x0 and the drop mass at 50 µm radius.

To examine the role of the first term in Equation 21, we calculate all terms for a wide range of size distributions approximated by various combinations of lognormal and gamma distributions with realistic cloud and drizzle properties (Figure S1). Our results show that although the first term alone cannot completely replicate Pau, compared to the second, third, urn:x-wiley:00948276:media:grl61710:grl61710-math-0040-related, and the last term, the first term is closer to Pau than each of them in 95%, 100%, 71%, and 100% of all cases, respectively, with the relative contributions of terms dependent on the size distributions. This supports the finding in Table 1 that the first term is a good predictor of Pau, and provides evidence as to why our Pau estimates show a dependence on Nr, and why the inclusion of Nr in the autoconversion parameterization is beneficial.

6 The Effect of Turbulence

The ML model and power-law parameterizations introduced above are based on SCE calculations in nonturbulent conditions. Since small-scale turbulence can enhance the collection rate (e.g., Ayala et al., 2008; Chen et al., 2018; Grabowski & Wang, 2013; Wang & Grabowski, 2009), we evaluate the turbulence impact by incorporating the enhancement of the collision efficiency tabulated in Wang and Grabowski (2009). Using the enhancement factor under turbulent cloud conditions with a 400 cm2 s−3 dissipation rate, we found that the exponents in power-law relationships did not change significantly (see Table 1). Ignoring the turbulence effects in the ML model leads to (−15% ± 13%) errors in Pau and (−10% ± 7%) errors in Pac. The medians of the error histograms are −18% for Pau and −7% for Pac. Compared to the median errors introduced by the KK parameterization, which are respectively 45% and −20% for Pau and Pac (see Figures 2 and 3), the errors due to turbulence collision effects in our ML model are smaller and can be accounted for if the dissipation rate can be estimated from radar or lidar observations.

7 Summary

We have built machine-learning models to predict autoconversion and accretion rates from cloud and drizzle properties, using cloud probe measurements from the ACE-ENA campaign in the Azores and the stochastic collection equation formulated as a two-moment bin model. Overall, the estimated autoconversion and accretion rates from the machine-learning model agree with the observed rates to within 15% and 5%, respectively. The standard model requires concurrent, separated cloud and drizzle water contents and number concentrations, which can be obtained from in situ observations or retrievals from remote sensing measurements (e.g., Fielding et al., 2015; Mace et al., 2016; Rusli et al., 2017; Wu et al., 2020).

The joint analyses from the machine-learning model and optimization techniques led to a robust dependence of autoconversion on drizzle number concentration. The dependence on drizzle number concentration also shows a reciprocity with the dependence on cloud droplet number concentration. These findings are unexpected, because the autoconversion process represents the coalescence between cloud droplets and is causally only related to cloud properties. However, drizzle number concentration does contain information on the width and evolution of the DSD, and hence indirectly on the autoconversion rate. By using simple collection kernels, we replicate the dependence and reciprocity in theoretical derivations. This implies that these features are physical and can be incorporated to improve parameterizations of autoconversion rate. The power-law parameterizations also suggest that the autoconversion rate relates to cloud droplet number concentration with an exponent of −0.75, that is, smaller in magnitude than often assumed, which will affect precipitation susceptibility, and therefore warrants further investigation.

Acknowledgments

This research was supported by the Office of Science (BER), DOE under Grants DE-SC0021167, DE-SC0013489, DE-SC0020259, and DE-89243020SSC000055. Van Leeuwen was supported by the European Research Council under the CUNDA project 694509.

Data Availability Statement

ARM data are available online through http://www.archive.arm.gov. The work on machine learning used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, under DOE Contract No. DE-AC05-00OR22725. The training and testing data sets, and the machine-learning trained models are available freely in the ARM Archive and on GitHub (https://github.com/yang0920colostate/AuAc).