Multiple‐Instance Superparameterization: 1. Concept, and Predictability of Precipitation

We have investigated the predictability of precipitation using a new configuration of the superparameterized Community Atmosphere Model (SP‐CAM). The new configuration, called the multiple‐instance SP‐CAM, or MP‐CAM, uses the average heating and drying rates from 10 independent two‐dimensional cloud‐permitting models (CPMs) in each grid column of the global model, instead of a single CPM. The 10 CPMs start from slightly different initial conditions and simulate alternative realizations of the convective cloud systems. By analyzing the ensemble of possible realizations, we can study the predictability of the cloud systems and identify the weather regimes and physical mechanisms associated with chaotic convection. We explore alternative methods for quantifying the predictability of precipitation. Our results show that unpredictable precipitation occurs when the simulated atmospheric state is close to critical points as defined by Peters and Neelin (2006, https://doi.org/10.1038/nphys314). The predictability of precipitation is also influenced by the convective available potential energy and the degree of mesoscale organization. It is strongly controlled by the large‐scale circulation. A companion paper compares the global atmospheric circulations simulated by SP‐CAM and MP‐CAM.


Introduction
Since the 1960s, low- and medium-resolution atmospheric models, including global circulation models (GCMs), have used cumulus parameterizations to represent the effects of unresolved convective clouds (e.g., Arakawa & Schubert, 1974; Kuo, 1974; Manabe et al., 1965). The parameterizations determine "tendencies" due to convective transports and phase changes, for example, rates of heating and drying.
Most existing parameterizations are deterministic, in the sense that for a given state of the simulated large-scale circulation, the tendencies produced by the parameterization are fully and unambiguously determined by the resolved-scale weather simulated by the model. In a few cases, the parameterizations include prognostic variables of their own (e.g., Pan & Randall, 1998), but the tendencies produced by such a parameterization are still deterministic. Deterministic parameterizations are intended to give the "expected values" of the convective heating and drying rates. These can be interpreted as ensemble averages over the many possible realizations that are consistent with a given large-scale weather state (e.g., Arakawa, 2004). Although spatial averaging plays an explicit and key role in the derivations of the equations used in a deterministic parameterization (e.g., Arakawa & Schubert, 1974), ensemble averages are never explicitly introduced. For this reason, the idea that today's deterministic parameterizations represent ensemble means appears to be based on a hopeful interpretation rather than a demonstrated fact.
Recently, there has been increasing interest in stochastic parameterizations, in which the tendencies include random contributions (e.g., Berner et al., 2012; Buizza et al., 1999; Keane et al., 2016; Plant & Craig, 2008; Shutts et al., 2008). Stochastic parameterizations are intended to generate individual realizations of convective activity, which are samples chosen from the ensemble of possible realizations, like individual cards drawn from a deck. Stochastic parameterizations have led to some improvements in the representation of precipitation variability (Groenemeijer & Craig, 2012; Keane et al., 2016; Wang et al., 2016).
The stochastic fluctuations of precipitation arise from sensitive dependence on initial conditions (e.g., Lorenz, 1969). To see how this works, consider an ensemble of forecasts performed with a high-resolution cloud-resolving model. Suppose that the various ensemble members start from slightly different initial conditions. Individual realizations diverge with time because they are subject to the chaotic, unpredictable growth and decay of convective instabilities. Ensembles of this type have been analyzed by Xu et al. (1992) and Jones and Randall (2011). These ideas are consistent with the studies of Weisman et al. (2008) and
In addition to SP-CAM, superparameterization has been implemented in the GEOS5 model of the National Aeronautics and Space Administration's Goddard Space Flight Center (Tao et al., 2009), in a version of the National Centers for Environmental Prediction's Climate Forecast System used by the Indian Institute of Tropical Meteorology (Goswami et al., 2015), in the Integrated Forecast System of the European Centre for Medium Range Weather Forecasts (Subramanian & Palmer, 2017), and in the Global Forecast System of the U.S. National Centers for Environmental Prediction. Results reported in over 100 journal publications by many different authors have demonstrated that superparameterization leads to major improvements in simulations of the Madden-Julian Oscillation (e.g., Benedict & Randall, 2009), the diurnal cycle of precipitation over land (e.g., Pritchard & Somerville, 2009), the Asian summer monsoon (e.g., DeMott et al., 2011; Goswami et al., 2015), African easterly waves (e.g., McCrary et al., 2014), and the frequency and intensity of precipitation (e.g., DeMott et al., 2007; Kooperman et al., 2016).
The simulations reported in this paper are based on the "special release" version of SP-CAM 2.0, which is part of the Community Earth System Model 1.1.1 (Randall et al., 2013). It is based on Version 5.2 of the CAM (Neale et al., 2012). We use the CAM's finite-volume dynamical core, which solves the quasi-static equations of motion on a 1.9° × 2.5° latitude-longitude grid. The GCM has 26 levels with a terrain-following hybrid σ-p coordinate. Radiation calculations are performed every GCM time step (15 min) using CAM radiation (Neale et al., 2010). Observed climatological sea surface temperatures are prescribed and temporally interpolated to each GCM time step.
A superparameterization is a stochastic parameterization, because a CPM is a nonlinear dynamical system that is sensitively dependent on its initial conditions. A superparameterization simulates a single realization of a cloud field. The stochasticity of a superparameterization arises naturally, without ad hoc assumptions or random number generators, in much the same way as the stochasticity of real cloud systems.
We have created a model that uses multiple CPMs in each GCM grid column in order to simulate an ensemble of possible cloud-system realizations for a given state of the large-scale weather as simulated by the GCM. We call the new model MP-CAM, where the "M" stands for "multiple." All of the CPMs see exactly the same large-scale weather, as simulated by the CAM, but the different CPMs produce different realizations of the convective activity because they start from slightly different initial conditions. Because the MP framework generates an ensemble of realizations for a given large-scale state, we can use the spread of the ensemble as a measure of predictability. We also interpret the ensemble-averaged heating and drying as an approximation to a deterministic parameterization, although we can only approximate the feedback that would occur with an infinite ensemble. The approximation should become more accurate as the number of CPMs increases. Naturally, the ensemble-mean feedback is temporally and spatially smoother than the feedback from any one of the MP-CAM CPMs, and it is also temporally and spatially smoother than the feedback from the single CPM in SP-CAM. Examples are presented in the next section. Figure 1 compares the convective parameterizations of CAM, SP-CAM, and MP-CAM.
The results discussed in this paper are based on the use of 10 CPMs in each GCM column. The CPMs are configured identically to those used in SP-CAM. Each CPM is initialized with a unique set of random initial thermal perturbations. These perturbations are added on the first time step only; no additional perturbations are applied as the simulation proceeds.
The use of the ensemble-mean feedback does not ensure that the individual CPMs have precisely the same mean state as the GCM column, although the analysis of Randall et al. (2016) implies that the ensemble average of the mean states of the individual CPMs still satisfies this constraint. We have confirmed that the mean states of the CPMs actually do stay close to the mean state of the GCM and to one another. In order to do this, we have analyzed the similarity of the CPM solutions within a given GCM column. Using data for every GCM time step (15 min), the root-mean-square differences of the ensemble member values from the MP CPM ensemble mean temperature and specific humidity at the 850-, 500-, and 300-hPa pressure levels were calculated and averaged over sample months of January (Figure 2) and July (not shown). For both variables, CPM differences are found mainly in the presence of clouds. The maximum differences are on the order of 1 K for temperature at any level and up to about 1 g/kg for specific humidity at low levels. Average global values are about an order of magnitude smaller, with differences in cloudy regions typically less than one third of the maximum. In a relative sense, average absolute specific humidity differences are approximately 1% at 300 hPa, 0.5% at 500 hPa, and 0.3% at 850 hPa, and average absolute temperature differences are less than 0.01% at all levels. Average deviations computed from daily mean data are approximately two thirds the magnitude of those computed with 15-min sampling, and maximum deviations are about half of those obtained with 15-min sampling (not shown).
Because each CPM contributes only one tenth of the ensemble-averaged feedback, the coupling of the individual CPMs to the GCM is "looser" than in SP-CAM. As an example, suppose that, in a particular GCM grid column, only 1 of the 10 CPMs is simulating active convection. One effect of convective feedback is the tendency to reduce the CAPE. Because the feedback from the sole "active" CPM is divided by 10, this effect is greatly weakened. As a result, the CAPE may become larger than it would have been with full feedback, and so the CPM may simulate stronger convection than it would have with full feedback. Later, we show some evidence of this effect.
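The dilution described above can be illustrated with a minimal sketch. It assumes that the feedback passed to the GCM is a simple arithmetic mean of the per-level tendencies over the 10 CPMs; the function name, the three-level column, and the heating values are hypothetical and chosen only for illustration.

```python
from statistics import mean

N_MEMBERS = 10  # number of CPMs per GCM column, as stated in the text


def ensemble_mean_feedback(member_tendencies):
    """Average per-level tendencies over all CPM members.

    member_tendencies: one list per CPM, each holding a tendency
    (e.g., a heating rate in K/day) at every model level.
    """
    n_levels = len(member_tendencies[0])
    return [mean(m[k] for m in member_tendencies) for k in range(n_levels)]


# Hypothetical column: one convectively active CPM, nine inactive ones.
active = [2.0, 6.0, 4.0]    # K/day, illustrative values only
inactive = [0.0, 0.0, 0.0]
members = [active] + [inactive] * (N_MEMBERS - 1)

feedback = ensemble_mean_feedback(members)
print(feedback)  # each level carries one tenth of the active member's heating
```

In this toy case the GCM column receives only one tenth of the stabilizing tendency that the active member would have supplied on its own, which is the sense in which the coupling is "looser."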
MP-CAM was run for just over 23 simulated years on the National Center for Atmospheric Research's Yellowstone supercomputer. Output was recorded as monthly and daily time averages for both models, but practical complications yielded a data set with different availability for different variables. Higher-frequency output was also created for January and July subsets of the full MP-CAM simulation. Three-hourly averaged data were created for January of Year 8 and July of Year 9 as well as for August of Year 20 through the end of the simulation. Records were saved every 15 min (every GCM time step, with averages only over the CPM subcycle, hereafter abbreviated as ETS) for individual days in a simulated January and July. ETS data were also created for SP-CAM for a single January day. A detailed documentation of the data sets is given by Jones (2017).

Global Characteristics
From a forecasting point of view, the most basic precipitation question is the following: Will it rain, yes or no? Figure 3 shows the fraction of occurrence, considering the full MP daily data set, where MP CPMs unanimously report zero precipitation, unanimously report any amount of precipitation, or are in some disagreement as to whether there will be any precipitation. The members frequently agree that there will be zero precipitation over deserts and oceanic stratocumulus regions. They frequently agree that there will be nonzero precipitation over much of the extratropics and polar regions, as well as locations frequently populated by the cirrostratus anvils of deep convection. The CPMs share a particular fondness for precipitation over the islands of the maritime continent. On the other hand, they tend to disagree about the occurrence of precipitation in regions frequently populated by fair-weather cumuli, where the annual mean precipitation rate is in the range 2-4 mm/day. These are the boundary regions between high and low precipitation, where the weather is monotonous. It is highly likely that in many cases, some CPMs in the ensemble are producing trace amounts of precipitation, while the others produce none at all.
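The three-way classification underlying this kind of occurrence map can be sketched as follows. This is only an illustration: the category names, the zero threshold, and the sample values are assumptions, not details taken from the analysis itself.

```python
def classify_agreement(member_rates, threshold=0.0):
    """Label one ensemble of daily precipitation rates (mm/day)."""
    n_wet = sum(1 for r in member_rates if r > threshold)
    if n_wet == 0:
        return "all_dry"      # unanimous: zero precipitation
    if n_wet == len(member_rates):
        return "all_wet"      # unanimous: some precipitation
    return "split"            # members disagree on occurrence


def occurrence_fractions(daily_ensembles):
    """Fraction of days falling in each agreement category at one grid point."""
    counts = {"all_dry": 0, "all_wet": 0, "split": 0}
    for rates in daily_ensembles:
        counts[classify_agreement(rates)] += 1
    total = len(daily_ensembles)
    return {name: n / total for name, n in counts.items()}


# Four hypothetical days at a single grid point, 10 members each.
days = [
    [0.0] * 10,          # all dry
    [1.2] * 10,          # all wet
    [0.0] * 9 + [3.0],   # one member rains
    [0.0] * 10,          # all dry
]
fractions = occurrence_fractions(days)
print(fractions)
```

Raising the hypothetical `threshold` above zero would move days with only trace amounts from "split" to "all_dry", which bears on the trace-precipitation caveat noted above.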
In the MP framework, a wide range of precipitation rates and physical tendencies across the ensemble is indicative of low predictability. As an example, Figure 4 shows April precipitation rate statistics from the MP run, averaged over five Aprils. The average precipitation maps for the individual ensemble members are nearly indistinguishable, and the global mean precipitation rates span the narrow range from 2.67 to 2.70 mm/day. A close look at the top two rows of the figure reveals small differences at the edges of the more heavily precipitating regions. The standard deviation panel (Figure 4l) shows the time average of the standard deviation of the precipitation rate across the ensemble. Maxima of the standard deviation tend to be correlated with the locations of mean precipitation maxima (e.g., along the Intertropical Convergence Zone), though there are exceptions (e.g., along the northern Pacific storm track). This pattern is also apparent in the zonal mean (Figure 4n).

Figure 2. The root-mean-square difference from the MP CPM ensemble mean temperature and specific humidity at the 850-, 500-, and 300-hPa pressure levels computed at every GCM time step (15 min) and averaged over one particular simulated January. <x> denotes a spatial average: the upper left value, with superscript Tr, is an average from 20°S to 20°N, representing tropical values, and the upper right value is the global mean. The value at the bottom right is the maximum value obtained at that level for the month.

Journal of Advances in Modeling Earth Systems
Our initial attempts at quantification of the CPM ensemble spread were based on the ensemble standard deviation divided by the ensemble mean. This ratio is called the "coefficient of variation" or COV. High values of the COV indicate strong disagreement and low predictability. Figure 4o shows the COV averaged over time and plotted only where the ensemble-mean precipitation rate exceeds 5 mm/day. The zonal mean of the COV is shown in Figure 4n. Large values are most common in the tropics, peaking near ±15° latitude. Figure S1 in the supporting information is as presented in Figure 4 but for a single day in September of Year 5. Here it is clear that the ensemble members are producing different results at some locations. Even at the daily level, elevated COV values are found in the same tropical locations, with much lower values (indicating CPM agreement) throughout the extratropics and near the center of broad, intensely precipitating systems. The edges of these systems, where the mean precipitation rate is weaker, also exhibit somewhat elevated COV. In both the long-term average and single-day cases, the vast majority of GCM grid cells (Figures 4m and S1m) show strong agreement, and areas of strong disagreement are confined to small areas. The exception to this is found in the largest-magnitude bin of the histograms; this is discussed further below.

Point Characteristics
Results from a selection of representative points are summarized in Table 1. At the extratropical point, E, there is general agreement on the domain-mean precipitation. This is typical of the extratropics. Point C2 is similar to the equatorial Pacific point labeled C1; both exhibit a high COV. During the period shown, the CPM members of C1 and C2 never agree on the precipitation rate. Only rarely do more than half of the members agree that there will be precipitation. Those that do precipitate sometimes produce heavy rain. For most of the example period, only one or two members report precipitation. Exceptions occur near Days 20, 30, and 37, when the members agree that there is no precipitation. In contrast to C1, C2 shows far greater temporal variation in the ensemble-mean precipitation rate, ranging from 0 to nearly 40 mm/day (not shown). Figure 5 shows data from two tropical points, for the same time period. T1 lies just north of the equator near the coast of Guinea in Western Africa. For this month, the location is characterized by moderate rates of precipitation and lies on the eastern border of a local precipitation maximum that stretches westward over the Atlantic Ocean. Although the CPM patterns for T1 are not as neatly coherent as for Point E, there is agreement as to whether or not precipitation is occurring. Nevertheless, the rain rate varies considerably (Figure S2).
Figure 4. Precipitation rate statistics from multiple-instance superparameterized Community Atmosphere Model for an average over 5 years of daily data from the month of April: (a-j) the mean precipitation rate for each of the 10 cloud-permitting model ensemble members; the time average of (k) the average MP ensemble precipitation and (l) the standard deviation of the MP ensemble precipitation; (m) the truncated histogram of grid points by coefficient of variation (COV); (n) zonal means of the ensemble mean (solid black), standard deviation (dot-dash), and COV (red); and (o) the time-averaged COV where the ensemble average precipitation rate was greater than 5 mm/day.

T2 is in the central Indian Ocean, just south of the equator. For Days 15 through 30, the ensemble members disagree about the presence and magnitude of precipitation, which is light in the mean. Without strong forcing from the GCM, CPM disagreement at T2 tends to be high. Near the end of the time series, the members come into better agreement that there is strong precipitation. This rain event is associated with the passage of a low-pressure system, a quick rise in precipitable water (PW), and a 2-K cooling (not shown), indicating the passage of a large-scale weather system.
In summary, the lowest COV, indicating greatest predictability, is associated with the extratropical point. Tropical points, T1 and T2, have larger COVs. The COV for C2 is considerably higher. For C2, the CPMs rarely agree on the magnitude of precipitation, except when they all report zero. Finally, the highest COV was reported for C1, where typically only one ensemble member reported precipitation.

What Happens Inside the CPMs
Figures 6 and 7 show Hovmöller plots of 3-hourly precipitation rate for a sample July across the domain of each CPM at two different GCM points, namely, extratropical (E) and tropical with high COV (C1). The movements of individual convective cells are apparent in each case. For Point E, it is clear that while the progression of CPM means agrees rather closely (Figure 5a), the details of the precipitation patterns differ on the CPM grid scale. In strong contrast, at Point C1, only one member is heavily precipitating at any given time. Figures S3 and S4 provide additional detail about the CPM states at the time indicated by the black box in Figure 7. These figures show the relatively disturbed state of Ensemble Member 8 at this time and the movement of its convective cell with time as well as smaller disturbances in Members 9 and 10 that do not lead to precipitation.
In the latter part of the C1 sequence, the strong precipitation shifts from Member 8 to Member 2 and then back to Member 8. Strong activity decays in Member 8, and strong new perturbations appear in Members 2, 4, 5, and 7 (Figure S5). Thirty hours later, only Member 2 has any remaining active precipitation (Figure S6). During this transition, very little changes in the local large-scale circulation (not shown). The CAPE is relatively steady, increasing slightly as convection decays in Member 8, though it undergoes much more significant fluctuations during the periods of single-member domination. The large-scale state could support strong convection, but vigorous triggering would be required to overcome strong convective inhibition (CIN). Active convection in one CPM was strong enough to counter the local large-scale destabilization.
In this example, Members 2 and 8 are extreme outliers, the only ones producing intense precipitation. The domain averages for Members 2 and 8 frequently exceed 50 mm/day, allowing the ensemble average to exceed 5 mm/day (Figure S7). In fact, 5-10 mm/day is a fairly common precipitation rate in the high-COV region surrounding this GCM grid column. In SP, a single CPM in this region often produces domain-averaged precipitation rates of 5 mm/day. In MP, under very similar large-scale conditions, the ensemble average returns a similar distribution of precipitation rates, but the CPMs are in disagreement, and a single CPM produces 50 mm/day for a month or longer, while the nine remaining CPMs produce almost nothing.

Limitations of the COV as a Measure of Predictability
Is the equatorial Pacific point, C1, typical of high-COV points? Figure 8 shows CPM-reported daily mean precipitation in July of Year 7, including only points for which the ensemble mean precipitation exceeded 5 mm/day and the COV was greater than 3. These conditions serve to select the highest COV bins in the histograms of Figures 4m and S1m. Contributing events are ranked top to bottom from the highest to lowest ensemble-mean daily precipitation rate, and individual ensemble member daily mean precipitation rates within each contributing event are sorted left to right from the lowest to highest, in order to visually isolate the outlying member, which would otherwise be randomly placed. White indicates zero precipitation, and the thin dark bins indicate light precipitation. The figure shows that C1 is indeed typical of the high-COV points; only a few events feature some precipitation from all of the ensemble members. Incidents of single-member nonzero precipitation, which is notably heavy here due to the 5 mm/day precipitation threshold, account for less than 0.2% of all cases in this period.
The time-averaged COV exhibits coherent spatial patterns similar to those of the CAPE. This suggests that predictability is small where the CAPE is large. An example, taken from 10 years of daily data from the MP simulation, is shown in Figure 9 for points where the ensemble mean precipitation rate is greater than 5 mm/day. The boundaries of the high-COV regions are surprisingly sharp. The CAPE and COV both peak within the tropics, and their maxima and minima tend to be in the same regions. The pattern of COV is consistent with Table 1, indicating greater predictability in the extratropics. Although the annual-mean spatial patterns of the CAPE and COV have a correlation of 0.842, local temporal correlations (i.e., correlations through time at a single point in space, with zero lag) are not consistently strong. In fact, the temporal correlations are negative where the COV is the largest.
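The distinction between the spatial correlation of time-mean maps and the local temporal correlation at each point can be made concrete with a small synthetic sketch. The data layout (`field[p][t]` for point `p` and time `t`), the function names, and the numbers are all hypothetical; the example is constructed so that the two kinds of correlation have opposite signs, as can happen in the results described above.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spatial_correlation(cape, cov):
    """Correlate the time-mean maps of the two fields across points."""
    return pearson([mean(s) for s in cape], [mean(s) for s in cov])


def local_temporal_correlations(cape, cov):
    """Zero-lag correlation through time, computed separately at each point."""
    return [pearson(c, v) for c, v in zip(cape, cov)]


# Synthetic data: time means of CAPE and COV rise together from point to
# point, but at any one point the COV falls when the CAPE rises.
anomaly = [-1.0, 0.0, 1.0]
cape = [[500.0 * p + 50.0 * a for a in anomaly] for p in range(1, 5)]
cov = [[0.5 * p - 0.1 * a for a in anomaly] for p in range(1, 5)]

r_s = spatial_correlation(cape, cov)
r_t = local_temporal_correlations(cape, cov)
print(r_s, r_t)
```

A global mean of the entries of `r_t` would play the role of the global-mean temporal correlation; significance screening is omitted from this sketch.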
We further examined daily data from July of Year 7 of the MP simulation. In this subset, the COV and CAPE are correlated at 0.711 for instances where the precipitation rate exceeds 5 mm/day. This is consistent with $r_{S/T}$, the correlation coefficient between CAPE and COV considering all points in space and time of Figure 9, and is shown in Figure 10a. Here we see that some of the data are clustered in horizontal stripes (also see Figure S1). The red points that cluster close to 3.2 are those for which only one CPM member reports precipitation (as in Figure 8). The other two stripes sit near COV values of 2.1 (khaki) and 1.6 (blue). If we ignore the stripes, the remaining black points show an intriguing linear relationship.
Isolating geographic locations where the long-term local temporal correlations shown in Figure 9c are less than −0.5, we note the predominance of the positive linear relationship found in Figure 10a where the COV is less than two (figure 5.17 of Jones, 2017). However, this relationship is dominated by a large proportion of events along COV values near 2.1 and 3.2, which occur for a wide range of CAPE values and make the overall correlation reverse sign.
The stripes reveal a problem with the utility of the COV for a variable like precipitation rate, which can have many zero values. It is easy to show that for any CPM ensemble that reports 9 zero values and 1 nonzero precipitating value, the COV will be equal to $\sqrt{10}$, which is about 3.16. This upper limit is an artifact, determined entirely by the sample size (i.e., 10). Similarly, the stripes in Figure 10a near 2.1 and 1.6 are, respectively, associated with states in which exactly two or three members are precipitating, while the rest of the members are nonprecipitating.
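The stripe values can be checked with a short calculation. The sketch below assumes that the COV uses the sample standard deviation (divisor n − 1), an inference from the quoted value of about 3.16, since the population form would give exactly 3. It further assumes, for the two- and three-member stripes, that the active members report equal rates; the closed-form helper is a hypothetical name introduced here.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(rates):
    """Sample standard deviation divided by the mean (NaN if the mean is 0)."""
    m = mean(rates)
    if m == 0:
        return math.nan
    return stdev(rates) / m  # stdev uses the n - 1 divisor


def cov_k_equal_active(k, n=10):
    """Closed form for k members reporting the same nonzero rate, n - k zeros."""
    return math.sqrt(n * (n - k) / (k * (n - 1)))


for k in (1, 2, 3):
    ensemble = [0.0] * (10 - k) + [37.5] * k  # 37.5 mm/day is arbitrary
    print(k, round(coefficient_of_variation(ensemble), 3),
          round(cov_k_equal_active(k), 3))
```

Under these assumptions the result does not depend on the magnitude of the nonzero rate, only on how many members are active, which is why such events collapse onto horizontal stripes.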
There is a second issue. Points similar to C1 (Figures 5 and S6) have high COV and sometimes large CAPE, with only one or two members reporting precipitation. The ensemble-mean feedback used in MP makes it possible for the CAPE to remain large despite heavy precipitation in a single ensemble member, because the feedback from that member is divided by 10. While convection acts to reduce the CAPE by converting it into convective kinetic energy, the CAPE-reducing effects of a single member will only weakly reduce the CAPE when its physical tendencies are averaged with the other nine inactive members. Because of this, there can be a tendency for large COV to be associated with high CAPE. More typically for near-maximal COV events, though, lower precipitation and CAPE values are present. The problem is reduced when using larger precipitation thresholds, but this also reduces the sample size.

Measuring Predictability With Proportional Variability
For the reasons discussed above, a better or at least an additional measure of predictability is needed. Heath and Borowski (2013) developed a measure known as proportional variability (hereafter PV, an unfortunate initialism for atmospheric science applications). The PV is defined as the average ratio comparison of all possible combinations of numbers in a set. It is bounded between zero and one, where zero indicates no spread and one indicates a very large spread. As described by Heath and Borowski (2013), for a given data set of n nonnegative numbers, $z_i \ge 0$, the PV is defined by

$$\mathrm{PV} = \frac{1}{C} \sum_{i<j} D(z_i, z_j),$$

where the relative difference, $D(z_i, z_j)$, is given by

$$D(z_i, z_j) = 1 - \frac{\min(z_i, z_j)}{\max(z_i, z_j)},$$

and $C = n(n-1)/2$ is the number of unique pairs, $(z_i, z_j)$. Sample size and zero-valued data have less impact on the PV than on the COV. In particular, the PV has the nice property that the case of a single active member is assigned low PV, properly reflecting the fact that the majority of the ensemble members are in agreement. Adam (2009) notes that the PV is useful with highly non-Gaussian data that include many zeroes and/or are strongly skewed.

Figure 10b is analogous to Figure 10a but uses the PV instead of the COV. It shows that the correlation of the PV of precipitation to the CAPE is 0.733, higher than the correlation of the COV with CAPE. The red points of Figure 10a that were associated with poor COV-based predictability have larger PV-based predictability.

Figure 9. For points where 10 years of the MP daily-mean precipitation rate exceeds 5 mm/day, (a) the time-averaged daily mean convective available potential energy, (b) COV of cloud-permitting model member daily precipitation rate, and (c) the local temporal correlation between the convective available potential energy and COV. Shading on the correlation plot indicates significance at the 95% confidence level. $r_{S/T}$ is the correlation coefficient considering all points in space and time, $r_S$ is the spatial correlation of (a) and (b), $r_{GMT}$ is the global mean of the local temporal correlation coefficients, and $r_{GMT95}$ is the global mean of the significant local temporal correlation coefficients. COV = coefficient of variation.
The time-averaged relationship between the CAPE and PV is shown in Figure 11, which can be compared to Figure 10b. PV minima are often associated with CAPE minima, yielding a long-term mean spatial correlation of 0.907, considerably higher than with the COV. The global mean of the local temporal correlations is also increased, owing to near-global positive correlations. CAPE above 1,000 J/kg is almost always associated with wide CPM precipitation rate spreads. For lower CAPE values, the PV increases by approximately 0.08 for every 100 J/kg increase in CAPE. In short, PV provides a more consistent measure of predictability, with a smaller dependence upon precipitation magnitude. We will use the PV through the rest of this paper.
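A minimal sketch of the PV calculation follows. It assumes the pairwise form attributed above to Heath and Borowski (2013), namely the mean over all unique pairs of one minus the ratio of the smaller to the larger value, and it assumes that a pair of two zeros is assigned a relative difference of zero; the latter convention is an inference from the statement that a single active member yields a low PV. Function names are hypothetical.

```python
from itertools import combinations


def relative_difference(a, b):
    """1 - min/max for a pair of nonnegative numbers."""
    hi = max(a, b)
    if hi == 0:
        return 0.0  # assumed convention: two zeros are in full agreement
    return 1.0 - min(a, b) / hi


def proportional_variability(values):
    """Mean relative difference over all unique pairs of values."""
    pairs = list(combinations(values, 2))
    if not pairs:
        raise ValueError("need at least two values")
    return sum(relative_difference(a, b) for a, b in pairs) / len(pairs)


single_active = [0.0] * 9 + [50.0]   # the C1-like case, arbitrary rate
print(proportional_variability(single_active))         # 9 of 45 pairs differ
print(proportional_variability([4.0] * 10))             # identical members
print(proportional_variability([0.0] * 5 + [8.0] * 5))  # evenly split ensemble
```

With 10 members there are 45 unique pairs, so the single-active-member case gives 9/45 = 0.2, whereas an ensemble split evenly between zero and nonzero members gives 25/45, or about 0.56. This ordering is the behavior that motivates preferring the PV to the COV for zero-heavy data.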

Bulk Measures
To understand how the large-scale state of the atmosphere influences the predictability of precipitation, correlation analyses like those in Figure 11 were performed for various daily-mean variables. We computed local temporal correlation coefficients of various fields with the PV of the daily-mean precipitation rate. The global means of these correlations, where significant at 95% confidence, are denoted by $r_{GMT95}$ in the figures discussed below. We visually scanned joint probability density functions to identify clear, nonlinear associations between the PV and GCM-scale variables.
Variables with strong correlations to the PV of the precipitation rate include the CAPE, already discussed above, and the low-cloud fraction. As shown in Figure 12, the local temporal correlations of low-cloud fraction with the PV are strongly and significantly negative. This suggests that dense cloud cover is associated with better predictability of precipitation, possibly because dense low-cloud cover is associated with active weather systems. The strong spatial correlation is insensitive to the imposed precipitation threshold.
A number of cloud- and moisture-related variables exhibit strong negative correlations with the PV because they are associated with weather systems. These include cloud amounts at all levels, cloud ice and liquid concentrations, precipitation frequency, the precipitation rate itself, relative humidity, and midlevel (but not low- or upper-level) specific humidity. The correlation with vertical pressure velocity is also indicative of weather systems.
Some variables show spatially varying relationships to PV. For instance, the sensible (not shown) and latent heat flux (Figure 13) and the boundary layer depth (Figure S8) tend to show positive correlations over land and in the southern extratropics and negative correlations over the tropical oceans. Global means of the local temporal correlations for these variables exhibit strong seasonality and are near zero in the annual mean. Strong positive correlations over land are most prominent in the summer hemisphere, matching the associations with surface temperature and the tendency for disorganized convection.
In addition to CAPE, strong positive correlations with the PV were found for net shortwave radiation at the surface, midlevel drying tendencies from the CPM, low-level positive temperature and specific humidity anomalies, and midlevel turbulence kinetic energy and cloud mass fluxes in the CPM. These positive correlations are associated with scattered convection unrelated to organized weather systems. Within the CPMs, the occurrence of smaller precipitating cells is very strongly associated with high PV (Figure 14). When precipitating cells cover the full horizontal CPM domain, the PV is small, indicating strong predictability. This is consistent with greater predictability in the presence of organized weather systems.
A number of associations with parameters related to convective organization support this argument. A negative correlation is present across the globe with vertical wind shear, which tends to organize convection into predictable mesoscale systems (e.g., Liu & Moncrieff, 2001). Similarly, we use an organization parameter, $V_{ORG}$, in which each term represents the vertical integral of the variance of a wind component within the CPM. Small values indicate the presence of mesoscale organization due to large variance in the horizontal wind speeds. As such, significant global mean correlation coefficients with $V_{ORG}$ are positive at 0.32, indicating greater predictability in the presence of organization. Additionally, the Richardson number, essentially the ratio of CAPE to the vertical wind shear, tends to be large in the presence of pulse storms and small under strong, organized systems. It is positively correlated to PV in the storm track regions of the extratropics, averaging approximately 0.4. These results are consistent with the analysis of Zawadzki et al. (1994).
Daily means are not ideal for analyzing the predictability of precipitation, because many convective systems have life cycles shorter than a day. We now present an analysis of 3-hourly data produced for Years 21 through 23 of the MP simulation, which provides the same number of samples as the 24-year daily data set.
While there is little correlation between CIN and PV (not shown), the joint variations of CIN and CAPE may have some bearing on the predictability of precipitation because a trigger can be needed to set off convection when the CIN is large. Figure 15 shows the PV binned as a function of both CAPE and CIN. For a given CAPE, the PV tends to increase as the CIN increases. There is a tendency for PV to maximize near CAPE values of 1,000 J/kg when CIN is less than 20 J/kg and near 500 J/kg when CIN is above 40 J/kg. Sampling issues for high CIN make the latter hard to discern with confidence. With positive CAPE and strong CIN, convection is possible, but its probability can be reduced because a suitable trigger may not be available. In such a case, the CPM ensemble members are likely to disagree. The figure also shows that the PV decreases for the highest CAPE values when CIN is low. In those situations, instabilities are realized more effectively and regularly, leading to agreement among the ensemble members.

Journal of Advances in Modeling Earth Systems
The PV also depends on the PW amount (Figure 16b). As is well known, the precipitation rate is a very strong function of the PW (Bretherton et al., 2004), with maximum precipitation rates occurring for PW greater than 60 kg/m² and a very quick transition to extreme precipitation rates for larger values. Almost regardless of CAPE, the PV is also maximized along that transition. Previous studies have shown that the SP-CAM simulates the observed relationship between tropical precipitation and PW more successfully than the standard CAM (Khairoutdinov et al., 2005; Thayer-Calder & Randall, 2009; Zhu et al., 2009).
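Binning PV jointly by two large-scale variables (CAPE and CIN, or CAPE and PW) amounts to a two-dimensional conditional mean. The following is a minimal sketch of that operation, not the procedure actually used for the figures; the function name `binned_mean_2d`, the bin edges, and the `min_count` mask for sparsely sampled bins are all assumptions.

```python
import numpy as np

def binned_mean_2d(x, y, values, x_edges, y_edges, min_count=1):
    """Mean of `values` in each (x, y) bin, plus the sample count per bin.

    Bins holding fewer than `min_count` samples are returned as NaN so that
    poorly sampled corners of the phase space are not over-interpreted.
    """
    x, y, values = (np.ravel(np.asarray(a, dtype=float)) for a in (x, y, values))
    bins = [np.asarray(x_edges, dtype=float), np.asarray(y_edges, dtype=float)]
    total, _, _ = np.histogram2d(x, y, bins=bins, weights=values)
    count, _, _ = np.histogram2d(x, y, bins=bins)
    mean = np.full(count.shape, np.nan)
    ok = count >= max(min_count, 1)  # never divide by an empty bin
    mean[ok] = total[ok] / count[ok]
    return mean, count

# Tiny hand-made sample: CAPE (J/kg), CIN (J/kg), and a PV value for each sample.
cape = [100.0, 150.0, 900.0]
cin = [5.0, 5.0, 50.0]
pv = [1.0, 3.0, 10.0]
pv_mean, n = binned_mean_2d(cape, cin, pv, [0, 500, 1000], [0, 20, 100])
```

Returning the counts alongside the means makes it straightforward to flag bins, such as those at high CIN, where sampling is too thin to support a conclusion.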

Critical Phenomena
We now turn to the role of critical phenomena in precipitation predictability, which was mentioned in section 1. Neelin et al. (2008) argue that the high variability near the PW critical point is an intrinsic property of the system that occurs independent of scale and is indicative of the system's extreme sensitivity. They found large and variable CAPE values to be associated with this transition, in agreement with the results presented above. They found no discernible association with CIN. Neelin et al. (2009) found that precipitation variance is maximized for a critical value of PW, $w_c$, which is an increasing function of the vertically averaged tropospheric temperature, $T$. Based on linear regression analysis of the data reported in their paper, we find that the critical value satisfies $w_c = 2.3714\,T - 579.3$, where $T$ is in kelvins and $w_c$ is in kilograms per square meter. Figures 16c and 16d show mean precipitation and PV for bins of CAPE and the ratio $w/w_c$, for the tropical belt within ±20° latitude. Self-organized criticality predicts that precipitation variance should be maximized for $w/w_c = 1$. In agreement with the ideas of Neelin et al. (2008), we see that the PV has a maximum when the critical ratio is equal to one and just above one. COV does not capture this relationship well: although it increases as $w/w_c$ decreases from high values toward 1, it continues to increase at lower critical ratios (and low precipitation rates), reaching misleading maxima where such conditions are most prevalent.
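The linear fit for the critical PW quoted above is simple to evaluate. The sketch below applies it to a column-mean tropospheric temperature and forms the critical ratio; the function names are hypothetical, and applying the fit outside the tropical temperature range sampled by the underlying data would be an extrapolation.

```python
import numpy as np

def critical_pw(t_mean_k):
    """Critical precipitable water (kg/m^2) from the fit w_c = 2.3714*T - 579.3.

    t_mean_k: vertically averaged tropospheric temperature in kelvins.
    """
    return 2.3714 * np.asarray(t_mean_k, dtype=float) - 579.3

def critical_ratio(pw, t_mean_k):
    """Ratio w / w_c of the actual PW to its critical value."""
    return np.asarray(pw, dtype=float) / critical_pw(t_mean_k)

# For T = 270 K the fit gives w_c of roughly 61 kg/m^2, so a column holding
# 55 kg/m^2 sits below the critical point and one holding 65 kg/m^2 sits above it.
w_c = critical_pw(270.0)
ratios = critical_ratio([55.0, 65.0], 270.0)
```

Under the self-organized criticality picture, columns with ratios near 1 are where the ensemble spread in precipitation is expected to peak.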
The presence of a predictability minimum under weakly forced or transitional conditions is also suggested by a tendency for peak PV values to be isolated near weakly rising 500-mb pressure velocities of −0.25 Pa/s for a wide range of relative humidity values (not shown). Under near-zero velocities or subsidence, predictability increases with drying, and under stronger rising motion, predictability increases with moistening.
Severe weather indices that have been developed for use in operational forecasting can similarly exhibit convective threshold behavior. We test four of these. The total totals index (Miller, 1972) is used to predict the likelihood and nature of thunderstorms based on vertical temperature and dewpoint structure. The modified K-index (Charba, 1977) and the lifted index (Galway, 1956) are also used to indicate convective potential. An additional instability index, named simply the "instability index" by Raymond et al. (2015), is based on vertical differences in low-level and midlevel saturated moist entropy. Figure 17 shows the mean PV values for bins of these stability indices and CAPE. In operational meteorology, the total totals index purports to indicate likely isolated thunderstorms for values above 44, becoming progressively more intense. Operationally, the modified K-index (modified to include lower-level mean contributions rather than specified-level contributions), which provides a measure of air mass thunderstorm likelihood, is expected to indicate high convective potential and organization for values above 30. Consistent with this threshold, PV increases strongly above 30 K, peaking at and above 40 K. There is more of a dependence on CAPE with this index and a broadening of the high-PV modified K-index range with increasing CAPE. These results are consistent with those of Davies et al. (2013), who show the greatest precipitation variability for modified K-index values over 30 K in observational data collected near Darwin, Australia.
The lifted index indicates extreme instability for values less than −8 K, which is where we see a sharp maximum in PV for all CAPE values. Higher values of the lifted index show more CAPE dependence, and the PV maximum at the lifted index of 20 K tends to occur at a number of points in the tropics, particularly on the northern and southern edges in the Pacific. These results support those of Zhang et al. (2003). Collectively, these are coincident with the edge of the precipitation maximum in a manner similar to the ratio $w/w_c$, and these indices show appropriate local temporal correlations that agree with the relationships shown. While several bulk quantities show some association with precipitation predictability as measured by the PV of CPM ensemble precipitation rates, the most direct associations are those relating to critical phenomena. Based on the information analyzed here, states nearing critical points are those where precipitation is the least predictable.
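For readers unfamiliar with these indices, the sketch below gives their standard textbook forms from temperatures and dewpoints on pressure levels. These are assumed definitions for illustration only: the modified K-index used in the analysis replaces the 850-hPa terms with lower-level means, which is not reproduced here, and the instability index of Raymond et al. (2015) is omitted. Inputs are taken to be in degrees Celsius, following the usual convention.

```python
def total_totals(t850, td850, t500):
    """Total totals index: vertical totals (T850 - T500) plus cross totals (Td850 - T500)."""
    return (t850 - t500) + (td850 - t500)

def k_index(t850, td850, t700, td700, t500):
    """Standard (unmodified) K-index.

    Combines the 850-500 hPa lapse rate, low-level moisture (Td850), and the
    700-hPa dewpoint depression, which penalizes dry midlevel air.
    """
    return (t850 - t500) + td850 - (t700 - td700)

def lifted_index(t500_env, t500_parcel):
    """Lifted index: environment minus lifted-parcel temperature at 500 hPa.

    Negative values mean the parcel is warmer than its surroundings (unstable).
    """
    return t500_env - t500_parcel

# A warm, moist hypothetical sounding (all values in degrees Celsius).
tt = total_totals(t850=20.0, td850=16.0, t500=-10.0)
ki = k_index(t850=20.0, td850=16.0, t700=8.0, td700=2.0, t500=-10.0)
li = lifted_index(t500_env=-10.0, t500_parcel=-4.0)
```

With this sounding, all three indices fall on the unstable side of the operational thresholds mentioned in the text (total totals above 44, K-index above 30, and a negative lifted index).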

Summary and Concluding Discussion
The CPM used as a superparameterization in SP-CAM is a stochastic parameterization because the solutions that it produces are sensitively dependent on their initial conditions. We have created a variant of the SP-CAM, which we call MP-CAM, in which 10 CPMs with slightly different initial conditions all see the same large-scale weather as simulated by the GCM. The feedback to the GCM is the ensemble mean of the feedbacks produced by the individual CPMs. In this way, we have approximated a "deterministic" parameterization.
When we compare results from the SP-CAM and MP-CAM, we are comparing the results from deterministic (MP-CAM) and nondeterministic (SP-CAM) parameterizations, where the underlying formulation of the parameterization (i.e., the CPM) is the same in both cases. How different are the climates simulated by SP-CAM and MP-CAM? This is discussed in a companion paper (Jones et al., 2019). In the present paper, we concentrate on an analysis of the predictability of the precipitation rate, as simulated by the individual CPMs.
There were a number of surprises along the way regarding the specific ways in which the CPMs sometimes handle the generation of precipitation and the methods one can use to reliably quantify predictability given an ensemble of realizations with limited sample size. It was determined that the PV, used to provide a relative, scale-aware measure of ensemble spread, is a reasonable choice for small samples of precipitation data that are often inundated with zero values. Even with its noted faults, PV is a conceptually superior measure compared to the COV.
By comparing the PV to a large variety of large-scale parameters, it was determined that the predictability of precipitation is modulated by a number of environmental factors. Multiple tested bulk parameters had some statistically significant correlation to the PV, either in the global mean or at specific locations or times of the year. Strong surface forcing tends to indicate poor predictability, particularly in summer months, and strong forcing from the GCM tends to indicate better predictability. This was evidenced by negative PV relationships with features indicating the widespread presence of cloudiness or strong moisture anomalies. Notable among the basic indicators of poor predictability was the large-scale potential for convective activity, CAPE. When CAPE is large, there is at least the possibility of strong convection.
Indicators of potential mesoscale organization within the CPM domain, whether inferred from GCM parameters or by investigating the state of the CPMs on their own grid, were weakly correlated with better predictability. Since most organization of convection derives from the state of the environment, like vertical wind shear, and because organization has an element of self-sufficiency once it is initiated, the likelihood for CPM ensemble member agreement is increased on average. The relationship is probably not stronger due to the barriers to initiating convection that is organized or convection in general; it is very likely that crossing the threshold from scattered to organized convection will occur by way of chance variations on the CPM scale in the absence of clear, sustained direction from the GCM.
The relationship found here between critical phenomena and precipitation predictability lends support to previous theoretical work. Crossing thresholds of certain parameters, including the ratio of the actual GCM column water vapor to its critical value and certain values of a number of commonly used convective instability indices, was the most reliable indicator of poor predictability. This makes sense intuitively, as critical change phenomena are, by their nature, confined to a limited parameter space. Falling on one side of the value or the other can yield vastly different results within an ensemble of possibilities and therefore large values of PV.
With the aim of extending these results, one can proceed in varied directions. There is much room to make modifications to the MP framework. One could envision employing a larger number of ensemble members to develop even better predictability statistics or to arrive at more consistent expected values for the purposes of a more deterministic model. However, the ways in which such an ensemble delegates the duties of reducing convective instabilities remain uncertain, and the possibilities shown here were not always encouraging. For this reason, one may try a different formulation for the tendencies from the CPM ensemble. For instance, since simple averages can be biased for skewed data, we might be better off applying some formulation which approximates the median or otherwise weighted CPM result.

10.1029/2019MS001610
Simple ensemble averaging is not the only possible way to combine the CPM tendencies for feedback to the GCM. For example, it would be possible to weight the tendencies by some measure of convective activity, such that the more active CPMs feed back more strongly. We have not yet explored such alternatives.
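The alternatives mentioned here can be compared side by side in a few lines. The sketch below combines a set of per-member tendency profiles by simple mean, by level-wise median, and by an activity-weighted mean; the function name, the (members, levels) array layout, and the use of a per-member scalar as the "activity" weight are all assumptions, since these options have not been implemented in MP-CAM.

```python
import numpy as np

def combine_tendencies(tend, method="mean", activity=None):
    """Combine CPM tendencies, shaped (n_members, n_levels), into one profile.

    method: "mean" (the simple ensemble average), "median" (level-wise median),
    or "weighted" (mean weighted by a nonnegative per-member activity measure).
    """
    tend = np.asarray(tend, dtype=float)
    if method == "mean":
        return tend.mean(axis=0)
    if method == "median":
        return np.median(tend, axis=0)
    if method == "weighted":
        if activity is None:
            raise ValueError("method='weighted' requires an activity array")
        weights = np.asarray(activity, dtype=float)
        if weights.sum() <= 0.0:
            return tend.mean(axis=0)  # no active member: fall back to the mean
        return np.average(tend, axis=0, weights=weights)
    raise ValueError(f"unknown method: {method}")

# Three hypothetical members, two levels; the third member is a strong outlier.
heating = [[1.0, 2.0], [3.0, 4.0], [100.0, 6.0]]
mean_profile = combine_tendencies(heating, "mean")
median_profile = combine_tendencies(heating, "median")
active_profile = combine_tendencies(heating, "weighted", activity=[0.0, 0.0, 1.0])
```

One design caveat: a level-wise median need not correspond to any single member's profile and does not in general preserve column-integrated quantities, so a practical scheme built on it would likely need an additional conservation step.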
With regard to determining predictability relationships, there are many more parameters to be tested, chief among these being the vast array of existing stability indices. Additionally, consideration of parameter combinations seems as though it may be a fruitful pursuit. For example, one may be interested in how moisture convergence or upper-level dynamical forcing pairs with the stability indices to indicate precipitation predictability. Also, we have noticed similarities in the geographical structures of PV with those of certain cloud regimes as presented by Rémillard and Tselioudis (2015) and Jin et al. (2017). Jones (2017) explores which regimes show greatest similarities.
Our results provide evidence for the existence of large-scale indicators of the predictability of precipitation and can provide some guidance for the further development of stochastic parameterizations. With some additional effort, it should also be possible to apply these results to obtain forecast improvements, either through model development or more directly, by running more ensemble members or models of higher resolution in areas identified by large-scale parameters to be of poor predictability. At the very least, we have identified a way to know when a forecast might be unreliable, which might prove to be equally valuable.