Assessing Uncertainties and Approximations in Solar Heating of the Climate System

In calculating solar radiation, climate models make many simplifications, in part to reduce computational cost and enable climate modeling, and in part from lack of understanding of critical atmospheric information. Whether known errors or unknown errors, the community's concern is how these could impact the modeled climate. The simplifications are well known and most have published studies evaluating them, but with individual studies it is difficult to compare. Here, we collect a wide range of such simplifications in either radiative transfer modeling or atmospheric conditions and assess potential errors within a consistent framework on climate‐relevant scales. We build benchmarking capability around a solar heating code (Solar‐J) that doubles as a photolysis code for chemistry and can be readily adapted to consider other errors and uncertainties. The broad classes here include: use of broad wavelength bands to integrate over spectral features; scattering approximations that alter phase function and optical depths for clouds and gases; uncertainty in ice‐cloud optics; treatment of fractional cloud cover including overlap; and variability of ocean surface albedo. We geographically map the errors in W m−2 using a full climate re‐creation for January 2015 from a weather forecasting model. For many approximations assessed here, mean errors are ∼2 W m−2 with greater latitudinal biases and are likely to affect a model's ability to match the current climate state. Combining this work with previous studies, we make priority recommendations for fixing these simplifications based on both the magnitude of error and the ease or computational cost of the fix.

Forecast System meteorological data that drives the CTM simulation of atmospheric chemistry (M. J. Prather et al., 2017;Szépszó et al., 2019). This CTM + Solar-J model quantified the errors related to spherical geometry of the atmosphere (M. J. Prather & Hsu, 2019; hence P2019). Here, we use that code and diagnostic framework (i.e., January average of hourly global calculations at horizontal resolution of 1.1° including water vapor and ice-/liquid-water clouds) for an extensive set of case studies. In addition, we embed the AER RRTMG-SW version 4.0 (Clough et al., 2005;Mlawer et al., 1997) directly in the CTM, providing a platform for parallel comparison of the two codes in a realistic climate. Combining solar heating and photolysis in one code is obvious as both solutions use the same atmospheric data and we move to ESMs that require both. Nevertheless, compromises between the two requirements, Watts versus photons s −1 , are identified here.
In this study, we examine additional classes of approximations or uncertainties beyond P2019: examples of the historical improvement in infrared heating codes and the ongoing work at major climate centers (Section 2); the use of bands to integrate over spectral features (Section 3); multiple-scattering approximations that alter the scattering phase function for clouds, aerosols, and gases (Section 4); uncertainty in ice-cloud optics (Section 5); treatment of fractional cloud cover including cloud overlap (Section 6); and approximation of ocean surface albedo (Section 7). Each one of these sections has been the focus of major research studies that we briefly review. Bringing these together with common climate metric allows for the broad comparison here. Section 8 reviews our findings from Sections 3 through 7 and makes priority recommendations for improving climate models' solar heating codes.
For most cases here, we find global mean absolute errors, or uncertainties, ranging from 0.5 to 5 W m −2 with larger systematic latitudinal or root-mean-square errors. Such error levels are likely to shift ESMs into different climate regimes for the current reference period as they are comparable to the changes in climate forcing by greenhouse gases from pre-industrial to present (Myhre et al., 2013). Such shifts are likely to force parametric retuning of other ESM processes to match observed surface temperatures. Either way, errors on this scale are important and must eventually be addressed.

Evolving Solar RT Codes
Many of the case studies here follow the ongoing efforts of the scientific community to improve the solar heating codes in current climate models. Most of this work is occurring within the major climate modeling centers. For example, the U.S. Department of Energy's Energy Exascale Earth System Model (E3SM) is implementing a new toolbox that seeks to balance accuracy, efficiency, and flexibility in the solar RT code (Pincus et al., 2019). The Canadian Centre for Climate Modeling and Analysis (CCCma) examined the importance of expanded scattering phase functions for clouds and aerosols possible with a 4-stream RT code (Li et al., 2015, hence LBYY) as well as the use of satellite-derived cloudy atmospheres to compare 3-D RT with the 1-D climate model codes (H. W. Barker et al., 2012;2015. The Korean Integrated Model has updated their ice cloud treatment (Baek & Bae, 2018), as have also Zhao et al. (2018) for the Community Atmosphere Model Version 5. The European Centre for Medium-Range Weather Forecasts (ECMWF) now includes innovations such as horizontal and vertical cloud structures (R. J. Hogan & Bozzo, 2018; J. K. P.  based on Hogan et al.'s (2016 solar RT code that combines Tripleclouds and 3D effects in a 1D solution. The Australian Community Climate and Earth System Simulator (ACCESS) has also adopted Tripleclouds (Franklin et al., 2013). The French ARPEGE-Climat model (Séférian et al., 2018) has adopted the Z. H. Jin et al. (2011) ocean surface albedo model. With variants of the Solar-J code we can compare many of these improvements including vertical cloud overlap, but the 3D effects of H2016 are outside of Solar-J's current capability.
To calibrate the climate metric used here to assess errors and uncertainties, we test a sequence of solar RT codes that represents successively improved spectroscopy and modeling of water vapor lines in the infrared. Our reference code, the standard Solar-J version (H2017, denoted SJ), uses Cloud-J (M. J. Prather, 2015) spectroscopic data for the ultraviolet and visible wavelengths (0.18-0.78 µm, bins #1-18) with a single value of solar flux (in both photons cm −2 s −1 and Watts m −2 ) and cross sections (cm 2 ). SJ adds 9 broad infrared (IR) bands (0.78-12.2 μm, bins #19-27) taken directly from the RRTMG-SW version 4.0 code (Clough et al., 2005;Mlawer et al., 1997). These 9 IR bands plus the overlap bin #18 (0.485-0.778 µm) contain a total of 78 + 5 g-points (sub-bins) for gas absorption. A 'high-accuracy' SJ version in terms of infrared gas absorption is derived from the benchmark code RRTM-SW with the same 9 IR broad bands, but 144 g-points, and is denoted SJ/RRX. We made Solar-J variants using water vapor absorption as parameterized by older codes. The oldest and least accurate LLNL code (Grant & Grossman, 1998) has three IR bands (0.69-3.85 μm) with 21 sub-bins (denoted SJ/LLNL). The CLIRAD code (Chou & Suarez, 1999, revised with HITRAN 2012 data), used until recently by the Goddard Space Flight Center climate model, has three IR bands (0.70-10.0 μm) including a total of 30 sub-bins (denoted SJ/CLIRAD).
We compare these codes by running their SJ variants with only water vapor as an absorber in the IR and clear skies (no clouds or aerosols, see Figure 1a). Compared to SJ, SJ/CLIRAD, and SJ/LLNL have about 5 and 7 W m −2 less atmospheric absorption, respectively, most of which is absorbed at the surface with small excess fraction (∼1 W m −2 ) being reflected (see Table 1/Rows 1 & 2, hence designated T1/R1&2). The more accurate SJ/RRX code has only ∼0.25 W m −2 less atmospheric absorption than SJ (T1/R3). Assuming that SJ/RRX is the most accurate code for water vapor absorption, we find that in terms of re-partitioning the solar heating between atmosphere and surface (5-7 W m −2 ) these historical improvements are greater than any of the next generation of errors assessed here. In terms of incident or reflected flux errors, however, they are smaller than the cloud uncertainties, spherical geometry, or even issues of wavelength resolution in the visible.   Table S1); CLIRAD and LLNL, to the courser resolutions (SJ/CLIRAD and SJ/LLNL). (b) All sky with averaged clouds and no infrared (IR) gas absorption, emphasizing the resolution of cloud absorption. Differences are relative to SJ-66b (high-resolution infrared bins for clouds). SJ/noIR has the standard 9 IR RRTMG bands, and SJ/CLIRAD/noIR has 3. (c) Averaged liquid-only clouds shown for a range of re-scalings of the Mie scattering phase function (HG,. These are all evaluated within the 8-stream SJ code. Also shown is the difference RRTMG minus SJ/δ1, where much of the difference, especially in atmospheric heating, is due to the 2-stream minus 8-stream difference. See Table S1

Solar-J Spectral Model and Band Resolution
Practical RT solutions for the complex absorption features of atmospheric gases require selecting a limited number of representative wavelengths or wavelength bands for the calculation. Ideally, one picks as broad a band as possible that has nearly the same attenuation of sunlight across all wavelengths and can be accurately represented with a single RT solution. In wavelength regions with many sharp absorption lines, such as the Schumann-Runge bands of O 2 (177-202 nm), the opacity distribution method (Fang et al., 1974) sorts the lines into bins of non-contiguous wavelength micro-bands with similar opacity. In parallel, the calculation of heating rates uses the similar correlated-k distribution (Goody et al., 1989;Lacis & Oinas, 1991) to sort the near infrared and infrared absorption lines of water vapor and other greenhouse gases into bins of similar opacity, designated g-points in RRTMG. Both methods select broad wavelength bands where the opacities of more continuum absorbers like clouds and aerosols can be treated as a constant. The art here lies in selecting the minimum number of total bins, each requiring a separate RT calculation, which can still reproduce, within a specified error, results from spectrally resolved models using thousands of wavelength intervals and calculations.
Solar-J's underlying spectral model and scattering code is Fast-J, which is used for chemical photolysis rates and thus requires a reference solar spectrum in photons cm −2 s −1 and photon-weighted average cross sections. With the added capabilities for cloud overlap and solar heating, Fast-J has become Cloud-J (M. J. Prather, 2015), which has become the core of Solar-J. Fast-J developed optimized wavelength binning for the stratosphere (175-298 nm, bins #1-12, Bian & Prather, 2002) and the troposphere (298-800 nm, bins #13-18, Wild et al., 2000), and builds upon 4 decades of RT development for atmospheric chemistry (Logan et al., 1978;Olson et al., 1997;PhotoComp, 2010). The 8-stream Feautrier based scattering code (Prather, 1974;Wild et al., 2000) calculates mean specific intensities for the photolysis rates as well as flux divergence across each model layer. The cross sections used for atmospheric attenuation, photolysis rates, and absorption in bins #1-18 are photon-weighted. Heating rates use a reference solar table (Watts m −2 ) for each large bin. Atmospheric attenuation and absorption in IR bands #19-27 is calculated with the RRTMG spectral model; Solar-J extends its reference solar table (Watts m −2 ) to match RRTMG; and photon fluxes are not used since photolysis only occurs in bins #1-18. For the tropospheric wavelengths (#13-18), the bins are contiguous wavelength bands, and only ozone absorption and Rayleigh scattering affect the fluxes, see Figure 2.

The Wavelength Region 290-778 nm
We expected a simple, or at least explicable comparison of SJ with the RRTMG-SW v4.0 standalone code (RRTMG) under clear skies because the codes are essentially identical for IR wavelengths >778 nm, and the UV and VIS region <778 nm consists of well quantified, continuous cross sections for ozone absorption and Rayleigh scattering. The primary differences in the codes are the wavelength bands and scattering: SJ uses Rayleigh-phase scattering with an 8-stream code while RRMTG uses isotropic scattering with a 2-stream code. The difference RRTMG minus SJ under clear sky gave a surprisingly large difference: (+1.3, −0.5, −0.8 W m −2 ) for the three primary globally averaged components (reflection, atmospheric absorption, and surface absorption), respectively (T1/R4). For the UV region <298 nm, the solar flux (13.6 out of 1360.8 W m −2 ) is absorbed in the stratosphere, and both SJ and RRTMG agree on this. So focusing on the 298-778 nm region that covers the visible, UVA, and part of UVB (for convenience designated as UV-VIS here) we try to understand these differences.
This UV-VIS region has the most solar flux (721 W m −2 ) and for the most part is optically thin. The primary attenuators (molecular Rayleigh, O 3 ), solar flux, and wavelength bands used in RRTMG and SJ are shown in Figure 2. For 298-320 nm (13 W m −2 ) the O 3 total optical depth is greater than about 0.5 (0.05 in the troposphere) and much is absorbed in the stratosphere. SJ uses 3 narrow bands to resolve the rapid changes in O 3 optical depth and attenuation of sunlight as well as a fourth narrow band to 345 nm where transmission is controlled by Rayleigh scattering. This level of resolution is necessary to calculate photolysis rates. The Rayleigh optical depth (mostly tropospheric) drops rapidly from 1.0 at 300 nm to 0.04 at 700 nm. For wavelengths >345 nm, the optical depths of Rayleigh and ozone are relatively small, and accurate photolysis rates need at most two bands if individual species' absorption cross sections are photon-weighted. O 3 absorption in the Chappuis bands (450-800 nm) is not important for the photochemistry, but it is an important source of atmospheric heating. Likewise, Rayleigh scattering in the visible plays a minor role in photolysis rates but is an important source of reflected sunlight. A first-order estimate based simply on weighting tropospheric extinction with solar flux has Rayleigh scatter (forward and back) of 123 W m −2 , O 3 absorption of 2.7 W m −2 , and surface reflection (ocean albedo = 0.05) of 36 W m −2 . This should be a relatively easy wavelength region to model accurately.
To cover the 345-778 nm region, RRTMG has three broad bands and 20 total g-points (sub-bins) within them. Each g-point has its own Rayleigh cross section derived from the correlated-k distribution and its own O 3 absorption cross section. Solar-J has only two broad bands with no sub-bins and clearly does not resolve the different patterns of Rayleigh and O 3 extinction, see Figure 2. Thus, we built a high-resolution-visible Solar-J version (SJ/hrv), putting 18 bands in the UV-VIS region (black dots in Figure 2) to achieve resolution comparable to RRTMG. The SJ/hrv minus SJ differences, (+0.3, −0.5, +0.1 W m −2 ) (T1/R5), can explain the RRTMG v. SJ differences in atmospheric heating, but still leaves a large discrepancy in the reflected flux.
Is this difference caused by Solar-J's use of photon-, not Watt-weighted averages for these UV-VIS bands? In  VIS bands in Solar-J using both photon and Watt weighting derived from the same high-resolution solar spectrum. When the bands are narrow (e.g., 320-345 nm) the difference between these two averaging methods in negligible, but for the two broad VIS bands, the Watt-averaged Rayleigh cross sections are 4%-7% larger, and would result in greater reflected flux as seen in RRTMG and SJ/hrv. Thus Solar-J's requirement for calculating both photolysis and heating rates from the same calculation points out a fundamental conflict in the case of broad bands. For the narrow tropospheric UV (298-345) the current SJ bands are fine, but for the VIS we must increase the number of bands beyond 2, and look at Watt-weighting of Rayleigh and O 3 cross sections but not the photolyzed species. We expect that accurate clear-sky solar heating can be computed with 5-10 broad bands and probably does not need the 20 g-points of RRTMG. One must remember that clouds and aerosols are also important for the 721 W m −2 in the UV-VIS. Clouds have little spectral dependence in the UV-VIS and can thus be calculated accurately with very broad bands. Aerosols extinction often scales inversely with wavelength (see Figure 2) and thus aerosol radiative forcing of climate needs to be resolved by the broad band structure and not with g-points.

Cloud Absorption in the IR
The accuracy of the calculated cloud absorption is determined by the number and spacing of the IR broad bands, not the number of sub-bins or g-points in each, because models assume constant cloud optics across each band. Edwards and Slingo (1996) recognized this problem and devised an ingenious spectral averaging of the single-scattering albedo (SSA) of ice-and liquid-water clouds over the broad infrared bands. Their thick-cloud averaging enabled the UK climate model to run successfully with four broad bands for solar RT, but that model and a 24-band model still had large errors compared to their 220-band reference model. How does this problem look with the RRTMG bands? Figures 3a and 3b shows the SSA of typical ice-and liquid-water clouds over the infrared spectrum. The refractive indices for ice and liquid water are similar but with distinct wavelength shifts: the second deep SSA minimum for liquid occurs about 1.9 µm, while that for ice occurs about 2.0 µm. The largest differences in the liquid versus ice curves here are caused by particle size, with smaller particles having larger SSA. The 9 infrared bands (#19-#27 in Figure 3) are denoted with the vertical dashed lines. The average cloud SSA in each bin is shown by the horizontal bar with a circle.
To resolve the variation in cloud absorption within the broad bands, we recalculated a set of narrow bands (0.05-0.10 μm wide), yielding 66 IR bands shown as black squares. The Solar-J cloud optics are calculated using a flux-weighted refractive index for liquid or ice water and Mie theory for a wide range of effective radii (R eff ) and a dispersion of spherical particles similar to stratus clouds. The Mie calculation returns SSA, extinction coefficient (Q), and seven Legendre coefficients of the scattering phase function (g 1:7 , see Table 2). For ice clouds, a non-Mie scattering phase function is used and examined in Sections 4 and 5. The Mie calculation of SSA and Q is surprisingly accurate. In the top panel of Figure 3 we compare our narrow-band Mie-ice SSA results with values taken from Ping Yang's library (Bi & Yang, 2017) of ice crystal optics: the four thin colored lines represent four crystal habits (droxtals, 8-columns, 10-plates, small bullet rosettes, all with roughness value of 003) that have a wide range of geometries (maximum to effective diameter ratios of 1.2, 3.3, 7.6 and 2.5 respectively). The SSA is primarily a function of R eff and refractive index, and the Mie approach fits within the range of crystal habits.
The standard Solar-J cloud optics, based on average refractive indices (H2017), should probably be re-calculated with the Mie calculation at fine wavelength intervals and then with flux-averaging of SSA, Q, and g 1:7 , as in Edwards and Slingo (1996). Looking forward, however, we should not have bands or sub-bins with such large variations in refractive index. The narrow bands have smaller changes in refractive index across each band and will be minorly affected by the method of averaging. SJ/66b is the Solar-J variant using these 66 narrow bands. Lu et al. (2011: LZL) noted that RRTM's wide infrared bands average over both reflecting and absorbing wavelengths in clouds. LZL use a correlated k-distribution that combines the line-by-line water vapor with the continuum liquid-water cloud optical properties to produce a set of sorted cloud + vapor bins within each wide band. LZL apply this new hybrid model to a sample column of liquid cloud layers and show that including variable cloud absorption within the broad RRTM bands results in more scattering, less atmospheric heating with both more reflected flux and more surface heating. They attribute these errors to the The RRTMG infrared (IR) bins, designated #18 through #27 are demarcated by dashed vertical lines with average SSA denoted (large circle with cross bar and light-colored fill). The solar energy (W m −2 ) in each bin is denoted, and the reflected flux (%, red, above flux) for overhead sun, zero surface albedo, and an optically thick ice-water cloud is also given. Average SSA is calculated from a flux-weighted refractive index for each bin assuming spherical (Mie) particles, including for ice-water. The liquid particles have an effective radius of 12 µm; and the ice particles, 48 µm. The black squares show values used in SJ/66b, where the resolution ranges from 0.05 µm (0.7-2.4 µm) to 0.10 µm (2.5 µm-4.0 µm). Four thin colored lines plotted on top of the ice cloud data are taken from Ping Yang's library (Bi & Yang, 2017) of ice crystal optics for droxtals, 8-columns, 10-plates, and small bullet rosettes, all with roughness value of 003). The R eff values of 48 µm correspond to maximum diameters of 112, 312, 738, and 238 µm, respectively. The Zhao et al. (2018) new derivation of ice-cloud single scattering albedo for the RRTMG bands is shown as a light cross (X) to compare with the pink/red circles. (c) Profile of in-cloud rates for stratus (optical depth OD = 12, liquid water) and cirrus (OD = 2, ice water) from SJ/noIR and SJ/66b (also no IR gas absorption by design) for overhead sun and surface albedo of 0.05.  spectral correlation of gas and cloud absorption and make the case for including ice clouds in a three-parameter correlated k-distribution as the path forward.
We extend the LZL work with SJ/66b by evaluating the RRTM-band cloud errors in a climate-relevant framework and including ice clouds. We do not have the capability to generate correlated-k sub-bins for the 66 IR bands and thus cannot include infrared (IR) gas absorption. In a single column model like LZL, we calculate cloud heating rate profiles for sample stratus and cirrus clouds at high sun ( Figure 3c) using SJ/66b and compare it with Solar-J without IR absorption (SJ/noIR). For stratus clouds, the excess heating using SJ/noIR ranges from +5% (bottom) to +25% (top), with similar values for cirrus ice clouds. At the top of a stratus cloud, SJ/66b heating rates are 1.8 K per day, while SJ/noIR calculates 2.3 K per day (both are without IR gas absorption). We can include IR gas absorption in SJ and find that heating rates in the upper layers of the stratus cloud are reduced by about a third because of the absorption by water vapor above the cloud deck; this difference is much less with high cirrus clouds. The relative differences between 9 bins and 66 infrared bands (i.e., 5%-25%) should remain. Cloud heating errors at this level are likely to affect the lifetime and stability of clouds (Wood, 2012). Overall these results are similar to LZL's liquid clouds and confirm that ice clouds also need to be corrected.
In terms of our climate metrics (e.g., global zonal-mean heating rates for January in Figure 1b), SJ/noIR has 1.7 W m −2 more atmospheric heating than SJ/66b, while SJ/CLIRAD/noIR, with only three IR bands, has 3.8 W m −2 more (T1/R7&8). Coarse wavelength resolution of cloud absorption clearly results in more in-cloud heating balanced almost equally by less reflection and surface absorption. Our global monthly values are about 20 times lower than LZL's case study of a very thick low cloud with overhead sun, and this is probably not inconsistent. LZL attribute these errors to the correlation of gas and cloud absorption, but our results show that a significant component of the error is simply the failure to separate the highly scattering wavelengths from the partly absorbing wavelengths and is not tied to water vapor absorption. This information may help in the development of a combined gas-liquid-ice model. Improving the accuracy of the gaseous absorption is an ongoing effort (Etminan et al., 2016;Mlawer et al., 2012;Paynter & Ramaswamy, 2014;Pincus et al., 2015;Radel et al., 2015), but parallel efforts on cloud absorption are limited.

Scattering Phase Functions and Multiple Scattering
Ideally, the sunlight scattered by clouds, aerosols and gases is resolved semi-continuously in all directions within the atmosphere, but in practice, RT solutions for solar heating keep track of a limited number of angles (streams) in upward and downward directions and average over the azimuth angle. Solar-J uses eight streams (four up, four down) to resolve multiple scattering and this RT solution is implemented in many global chemistry models (M. J. Prather, 2015). RRTM (Mlawer et al., 1997) uses 16-stream scattering but is not implemented in global models; instead, RRTMG (Clough et al., 2005) with 2-stream scattering is used in many climate models. The Geophysical Fluid Dynamics Laboratory (AM3) and the Canadian Center for HSU AND PRATHER 10.1029/2020MS002131 9 of 24 Note. This example assumes liquid water cloud (R eff = 12 μm, wavelength = 600 nm, ω 0 = 0.99999). The Mie phase function is truncated after P 7 . The Henyey-Greenstein phase function is expanded to P 7 is using only the first asymmetry term of the Mie phase function. The δ-0, δ-1 and δ-2 phase functions include at most P 0 and P 1 . All of these SJ variants use 8-stream scattering. Climate Modeling and Analysis (CanAM4) with 4-stream RT (Li & Ramaswamy, 1996; LBYY) appear to be the most accurate scattering codes currently used in climate models.
The number of scattering angles determines how well the scattering phase function, P(Θ), is resolved, where Θ is the angle between incident and scattered light. Nominally, these phase functions are calculated using Mie theory for spherical droplets or other approximations for aspherical ice or dust particles (Mishchenko et al., 2016;Yang et al., 2018) and expanded in Legendre polynomials, P L (Θ), where the first two terms are P 0 ≡ 1 and P 1 ≡ cos(Θ). See Figure 1 of LBYY for the Legendre coefficients for water clouds and dust. Delta-Eddington methods for solving the RT problem (Joseph et al., 1976) approximate this phase function as a forward delta function plus a two-term expansion of the phase function, P 0 and P 1 , see Table 2. M-stream RT methods W. J. Wiscombe, 1977) extend this to include the first M terms in the Legendre expansion. All of these methods reduce the coefficient of P 1 from 3g 1 to a lesser value 3g* (g* is often called the asymmetry factor, see Table 2), and likewise reduce the scattering optical depth. The reduction in cloud optical depth applies only to the scattering optical depth τ sca , while the absorbing τ abs is unchanged. These δ-scaling methods are chosen to avoid the unphysical negative phase functions that result from truncation of the Legendre series. The 8-stream in Solar-J does not use delta-M scaling because early tests showed that simply truncating the phase function for liquid water clouds at P 7 (giving a phase function that oscillates in sign) still produced accurate, non-negative fluxes and mean intensities compared to 160-stream solutions expanded to P 159 . In these tests, the mean intensity differed by 1% throughout most of the atmosphere, with a worst case of 8% in the uppermost layers of an optically thick cloud and overhead sun (Wild et al., 2000). A major difference between these methods is that Solar-J retains the correct direct flux at the surface while δ-scaling methods can greatly exaggerate it.

Phase Function Approximations for 2-Stream Assessed With 8-Stream Scattering
The method designated δ-1 here adopts the commonly used Henyey-Greenstein (HG) phase function (Boucher, 1998) to estimate the Legendre coefficient 5g 2 (i.e., g 2 = g 1 2 ), see Table 2. This coefficient becomes the scaling factor f = g 1 2 and is used to calculate a reduced asymmetry factor g* = (g 1 -f)/(1 -f) and reduced τ* sca = (1 -f) τ sca . This scaling, for example, is applied in the Mie expansion for liquid-water clouds of RRT-MG code used here. Another alternative, designated δ-2 (δ-M method with M = 2), uses the Mie phase function's second term, f = g 2 and the revised g* and τ* sca are calculated as for δ-1. For comparison, an isotropic-equivalent method, designated δ-0 here, drops the asymmetry factor in the phase function and calculates a reduced τ* sca using f = g 1 and the above formulae. From the example in Table 2, δ-0 is the least forward scattering with the largest reduction in τ sca (1-f = 0.14); δ-2 has the next largest reduction (1-f = 0.20); while δ-1 has the least (1-f = 0.25), reducing τ sca by only a factor of 4. SJ versions have been coded that rewrite the cloud optical depth and scattering phase function in accord with δ scaling and are designated eponymously as SJ/δ0, SJ/δ1, and SJ/δ2. In addition, version SJ/HG uses 8-stream scattering and an 8-term HG expansion based only on the first term of the Mie phase function: P HG (Θ) = 1 + Σ L=1:7 (2L + 1) (g 1 ) L P L (cos[Θ]). Rayleigh scattering by air (an important component in the visible region, see above) must also be approximated in 2-stream codes, truncated from P Ray = ¾(1 +cos 2 [Θ]) to P Ray = 1, removing the forward-backward scattering lobe of the phase function to make it isotropic; this is implemented here as version SJ/Ray. All these versions use the standard 8-stream scattering code. The δ-scaling and HG approximation tests here are run without ice clouds to aid in comparisons with RRTMG.
Approximating Rayleigh scattering as isotropic, required in all 2-stream codes, fortunately has inconsequential errors: all three global mean error metrics (reflection, atmospheric absorption, surface heating) are measurable but within ±0.01 W m −2 (T1/R17). There is little systematic geographic or zenith angle errors since the root-mean-square (rms) errors are ≤0.1 W m −2 . This test was done with the 8-stream code.
For clouds, the δ-scaling errors are modest in terms of global mean, within ±0.4 W m −2 for any of the three primary components (T1/R10-11-12). The pattern is interesting in that all three methods show a similar −0.3 W m −2 error in reflected flux, but the surface absorption error shifts from +0.07 to +0.23 to +0.44 W m −2 in the order δ-1 to δ-2 to δ-0, being caused by the successively greater reductions in τ sca by the scaling factors of 0.25, 0.20, and 0.14. Thus in δ-1 the reduction in reflected flux goes into atmospheric heating; while in δ-2 and δ-0, it goes into surface heating. Global mean HG errors are (−0.05, 0.14, −0.09 W m −2 ) for the three primary components (T1/R9). This pattern--less reflection and less forward scattering to the surface-is caused by the weaker forward and backward scattering peaks in the HG phase function. Although these mean errors are modest, there is no basis and no cost advantage to using an 8-term HG phase functions. For a complete analysis on the radiative flux biases due to HG approximations on the clouds and dust aerosols, see LBYY.
The δ-scaling 2-stream models are optimized to give reasonable averages, but with large opposite-sign errors at different SZA (Joseph et al., 1976;W. J. Wiscombe, 1977). Thus, we examine the geographic pattern of δ-scaling errors for fixed sun (00Z) in Figure 4 and find that SZA ∼ 40° (green dashed oval) is the zero-error point for all three δ-scaling methods. Figures 4a-4c columns show the three primary components: reflected flux, atmospheric absorption, and surface absorption, respectively. The first three rows (i)-(iii) show the sequence δ-0 to δ-2 to δ-1, respectively, in order of decreasing scaling factors. Looking at the region with SZA < 40° (inside the green oval), we see that the error in reflected flux is positive and greatest for δ-0 and decreases along the sequence. Similar dipole patterns as a function of zenith angle can be seen in Figure 2 of LBYY with the relative errors about 20% for thin liquid clouds (optical depth, τ < 1) to about 5% for thicker clouds (τ ∼ 10). The zero-error point moves from SZA of about 60° at τ ∼ 0.01 toward smaller SZAs with increasing τ up to 10. For thick clouds (τ > 10), the relative errors and the dipole patterns are comparatively minor. Similar patterns are also shown in Figures 2c and 5a of B2015 for the visible band and the near-infrared band respectively. However, the zero-error point in B2015 stays in the SZA = 40-50° range.
The vertical profiles of the atmospheric absorption errors in Figure 4 are shown in Figure S1. The error at high sun (i.e., inside the oval SZA < 40°) goes from overall negative for δ-0 (-2.7 W m −2 ), to small positive for δ-2 (+0.8 W m −2 ), to strongly positive for δ-1 (+2.9 W m −2 ), and the profiles are consistent with the means. This sequence is similar to the sequence for global mean atmospheric heating, +0.07 to +0.23 to +0.44 W m −2 . When we mix 2-stream versus 8-stream (RRTMG vs. SJ/Mie, red line in Figure S1), the profile errors are large and oscillate. While the column mean difference is small in this case, the rms difference is as large as the worst case δ-0. In high solar latitudes (50°S-70°S and 30°N-50°N in January) where SZA remains large at all hours, the monthly mean averages have consistent errors of 1-2 W m −2 at the surface (Figure 1c). The δ-scaling approximations studied here have largest impact on the reflected and surface fluxes rather than atmospheric absorption.

2-Stream versus 8-Stream and Exact Scattering Solutions
Here we assess the errors in 2-stream scattering by comparing with 8-stream solutions standard in Solar-J using our metric. Fortunately, two key studies have carried this further with exact solutions, using a 128-stream RT code (LBYY) and Monte Carlo simulations (B2015). Räisänen's (2002) work on tuning 2-stream approximations used comparisons similar to ours with a 16-stream discrete ordinate method. For Solar-J, we coded a 2-stream RT solver based on the Feautrier method using δ-1 scaling, and this version is designated SJ/2S. With this we have a clean, bias-free comparison of 2-stream versus 8-stream under clear skies (T1/R6) and with liquid clouds using δ-1 scaling for both models (T1/R13).
Under clear sky conditions (no clouds, no aerosols) the only scaling in SJ/2s is to make Rayleigh scattering isotropic, which has negligible effects as noted above (T1/R17). The pattern of 2-stream errors ( Figure S2, left column) is unusual, with reflected flux being much larger in regions with high surface albedo (Antarctica, dry sub-tropics). The enhanced reflection is matched by reduced atmospheric absorption. Our surface reflection is Lambertian (isotropic), and it is clear that having only a single zenith angle of 55° for the reflected light reduces the average path length through the atmosphere for the reflected flux and results in a shift from absorbed to reflected. There is also a slight increase in the downward scattered light reaching the surface with only one angle. We looked at the B2015 case studies, but could find no corroboration since they used black surfaces. We did find this pattern in the direct comparison of SJ/δ1with RRTMG (Figure 4, row iv) where RRTMG's 2-stream code also picks out the high albedo land masses with enhanced reflection. Unfortunately, we cannot compare with the LBYY climate model results (their Figure 6) because their 'clear sky' includes, and clearly shows, the importance of northern continental pollution and southern ocean sea salt aerosols in their 2-stream errors.   Table 2. Row (iv) shows the difference, RRTMG minus SJ/δ1. All calculations use grid-cell averaged liquid clouds only. The green dashed line encloses the region with SZA < 40 o . The Henyey-Greenstein extension of the phase function using the asymmetry parameter g (value shown at L = 1) shows a reasonable approximation to the Mie phase functions, but not those for ice crystals. The T-matrix hexagonal/cold phase function is similar to three of the four habits shown here, but they all differ from the "10 plates" habit (except for roughness 050). The 10 plates habit has the most asymmetric shape with a maximum length to effective diameter ratio of 7.5 (vs. 1.0 for a sphere).  ative of the convergent error reduction from 2-stream to 4-stream. If we extrapolate to 8-stream, we would see the errors in δ-4SHE drop by a factor of 2-4. We looked separately at the error caused by δ-1 scaling of ice clouds only and found it to be surprisingly small with global mean differences of < ±0.02 W m −2 . So straightforward δ-M scaling of ice clouds, here and in LBYY, does not seem to cause much error, but we cannot easily judge the other more parameterized polynomial fits for 2-stream use (Fu, 1996;Zhao et al., 2018).
In terms of atmospheric heating by clouds, LBYY's parametric study (their Figure 3) shows reduced values over most ranges of optical depth and sun angle, and this error is clearly associated with the number of streams. Similar results are shown in B2015 Figures 5c and 5f. B2015 evaluated 2-stream δ-Eddington errors for a typical mixed-cloud atmosphere derived from high-resolution satellite observations (see their Figures 7 and 8). The atmospheric absorption error is typically 0 to −5 W m −2 , while the surface heating error is 0 to +8 W m −2 . These errors are similar to our 2-stream minus 8-stream differences (both δ-1 scaling, T1/ R13) shown in Figure S2 (middle column).
Atmospheric heating rates by clouds are driven predominantly in the IR region (>778 nm) where SJ and RRTMG have identical bands and atmospheric properties (unlike the UV-VIS region discussed above). The geographic pattern of atmospheric absorption at 00Z for RRTMG minus SJ/δ1(liquid-only) (T1/R15, Figure 4iv) is consistent and uniformly negative. Since both use δ-1 scaling, this supports the result that 2-stream scattering consistently underestimates the atmospheric absorption by clouds. The RRTMG(δ-scaling, 2-stream) minus SJ(8-term Mie or ice scattering phase function, 8-stream) comparison clearly show less heating by liquid clouds below 6 km altitude (Figure 5a), Results in Figure 5a used the full-phase-function SJ (T1/R14), and if we use SJ/δ1 (T1/R15) as the reference, differences are similar. But if we go to our 2-stream, δ1-scaling code (SJ/δ1/2S, T1/R16), the differences with RRTMG are greatly reduced. When we add ice clouds, the difference is much larger, from −0.05 to −0.1 K per day throughout most of the troposphere (Figures 5b and 5c). With ice clouds, there is some reversal of differences in the lower mid-troposphere presumably due to cloud overlap. In terms of global mean, atmospheric heating error jumps from about −1 W m −2 to −2 or −3 W m 2 (T1/R14&15 v. T1/R20&21).
A clear result from B2015, LBYY, Räisänen (2002) and this study is that 2-stream RT codes systematically underestimate the atmospheric absorption by all clouds, and for ice clouds the error is larger in absolute amount. They also overestimate the surface flux, and in all of Räisänen's efforts to tune a 2-stream code, these fundamental errors remain. The ability to resolve the scattered light across multiple zenith angles is critical in calculating heating rates. With only a 55° zenith angle path, the 2-stream scattered light escapes from liquid clouds more easily. Ice cloud errors, however, either reflection or absorption, are not driven by δ-M scaling but by the more parameterized non-δ-Eddington scaling used in many 2-stream codes. Reflection errors for liquid clouds are different and appear to result primarily from δ-Eddington scaling, and in this case expansion to 4-stream or above significantly improves the accuracy (Räisänen, 2002; LBYY) because a more accurate scattering phase function can be used.

Ice Cloud Optics
Ice particles in cirrus or mixed-phase clouds come in a wide range of sizes and crystal habits (Kärcher et al., 2014;Platt & Martin, 1997) with a dizzying array of optical properties (Mishchenko et al., 2016;Yang et al., 2018). In part because of their importance in climate and remote sensing, ice clouds are an intense research area (Baum et al., 2005;Heymsfield et al., 2017;Holz et al., 2016;Platnick et al., 2017). The accurate treatment of ice clouds in climate models remains a fundamental uncertainty because-like the case with aerosol size and chemical composition affecting RT-we can observe, but not accurately characterize the full mix of size and crystal habits within ice clouds (Bailey & Hallett, 2009).
Even when the ice cloud is fully characterized, the RT solution remains difficult and approximations are numerous, for example, RRTMG primarily uses a Fu (1996) parameterization for optical depth, single scattering albedo, and asymmetry parameter. Solar-J inherited the Cloud-J photolysis method for ice clouds including effective diameters (Heymsfield et al., 2003) and T-matrix scattering phase functions (Mishchenko et al., 2004) for the visible (∼600 nm) using two cloud types (pure hexagonal crystals used for T ≤ −40°C, and irregular crystals used for warmer ice clouds). Solar-J's use of a scattering phase function across all wavelengths has minimal error for highly scattering wavelengths (SSA > 0.9) because the ice crystal effective diameters are much larger than the wavelength, and thus the asymmetry parameter g is nearly constant at ∼0.8 in agreement with Bi and Yang (2017). When we compared atmospheric heating rates between RRT-MG and SJ for our January climate simulation (averaged clouds, no overlap model used), the importance of ice clouds stands out. Both models can use the same optical properties for liquid clouds (with RRTMG using δ-1 scaling), and thus the differences in tropospheric heating rates are systematic but modest (±0.05 K per day, Figure 5a, also see case study Figure 4 in H2017). When ice clouds are included, the two models clearly diverge (Figure 5b). Unlike for liquid clouds, SJ cannot simply match the RRTMG's ice-cloud parameterization (Fu, 1996), because it is not a δ-M scaling. RRTMG heating rates are 0.1-0.2 K per day (10%-20%) less than those in SJ throughout the middle-upper troposphere, with the pattern reversed for liquid water clouds in the tropics (2-6 km) presumably due to less shielding by ice clouds. Running both models with fractional clouds assuming MAX-RAN overlap, gives similar results (Figure 5c). While errors in the parameterization of ice optics will contribute to this error, it is most likely caused by 2-stream models as discussed in Section 4.2.
The ice-cloud phase functions shown in Figure 6a show the coefficients of the Legendre expansion to L = 7 scaled by 1/(2L+1) for Mie, T-matrix, Bi-Yang and some HG expansions. Where possible, values are calculated for R eff = 48 µm and 600 nm. In the Mie calculation, the difference between liquid and ice water is small as expected, but the T-matrix coefficient drops quickly (becoming more isotropic), but then levels off, becoming more forward peaked than the Mie at L > 5 for hexagonal ice and L > 7 for irregular ice. The HG phase function matches the Mie out to L = 5 before becoming more isotropic. For ice clouds (T-matrix), the HG diverges by L = 2 and this is why LBYY show a pronounced error in HG versus their 4-stream RT that uses the L = 2 & three terms. The divergence of the ice-cloud phase function from any HG-like extrapolation becomes even greater for L = 4-7, which can be included in the 8-stream calculation here. We compare in Figure 6b our T-matrix expansion with those calculated from the Bi-Yang database used in LBYY. Our hexagonal T-matrix fits within the class of ice crystals from Bi-Yang except for the extremely forward peaked 10-plates example, which has the greatest difference from a sphere: the ratio of maximum diameter to effective diameter is 7.5 versus 1.2 for droxtals. As expected, our irregular T-matrix phase function has a larger fraction of isotropic-equivalent scattering, similar to what happens with increased roughness in the Bi-Yang crystals. All of these phase functions have similar asymptotic behavior. The Solar-J treatment of ice clouds certainly needs to be updated (e.g., Bi & Yang, 2017;Heymsfield et al., 2013), but is solidly linked to the optical properties of ice clouds, and is adequate for the comparisons here.
Some recent efforts have updated the RRTMG options for ice clouds. Zhao et al. (2018) combined more realistic mixtures of ice crystals (Baum et al., 2011) with the Yang library of optical properties for a wide range of wavelengths, sizes and habits (Bi & Yang, 2017;Yang et al., 2013) to create a new Baum-Yang ice cloud model for RRTMG (BY). We checked our ice cloud optics against Zhao et al.'s tables and found consistent values in the scattering regions (VIS and IR bands #S19-22) for SSA ( Figure 3a) and g, but we calculate about 10% more extinction for the same effective diameter and ice water content. We were not able to implement the new RRTMG(BY) code in Solar-J and can only assess their published results: RRTMG(BY) has weaker in-cloud heating than RRTMG (Fu) (their Figures 2d−f). As shown in LBYY, this is a standing problem with all 2-stream codes, and the new ice optics appear to have made this error worse.
Baek and Bae (2018) updated the Korean Integrated Model (KIM) radiation code based on RRTMG using the Yang et al (2013) ice optics library and tested with the KIM forecasting system. They show that use of the updated ice optics versus the RRTMG (Fu) parameterization leads to more in-cloud heating by about 0.1 K per day and less reflection by about 3 W m −2 (their Figure 8). These results appear to correct much of the error in ice-cloud heating found here and in LBYY and B2015, but they are difficult to understand or consolidate with RRTMG simulations here because KIM uses their adaptation of RRTMG's 2-stream solver. We find that the large error in ice cloud heating is due to 2-stream methods and not scaling methods. When we truncate the ice-cloud phase function with δ-1 scaling (SJ/δ1ice vs. SJ, T1/R18) we find very small differences in reflection or atmospheric heating ( Figure S2, right column). The SJ ice-cloud phase functions are consistent with the Y2013 library, and so if Baek and Bae are to correct much of the ice-cloud errors within a 2-stream solution then we need to understand how these new ice-cloud optics are implemented.
The treatment of ice-water clouds remains a large source of uncertain error in solar-heating codes. A more accurate treatment of ice water clouds will combine the physics of individual particles (e.g., Mishchenko et al, 2016;Yang et al., 2018) with the actual mix of such particles observed in the atmosphere (Heymsfield et al., 2017;Thornberry et al., 2017). We concur with LBYY that multi-stream scattering codes, such as CCCma's 4-stream or Solar-J's 8-stream, will become more essential for accuracy and also more cost effective in climate models. They are also necessary to assess ice cloud optics since the 2-stream problems obscure such improvements.

Cloud overlap
Treatment of overlapping clouds provides a challenge for climate models in solving the RT problem as well as other physical processes such and precipitation and scavenging of trace species (e.g., . If we can specify the cloud overlap in terms of separated independent column atmospheres (ICAs, with multiple 1-D RT solutions) or in terms of 3-D cloud fields (a 3-D RT problem) then we can evaluate the errors of different RT methods for case studies using highly accurate methods. One challenge lies in developing an RT solution that works efficiently in climate models, but a greater uncertainty lies the specification of cloud structures. Fortunately, with modern observing systems and cloud resolving models, this problem is becoming less of an unknown (H. W. Barker & Li, 2019; B2015).

One-Dimensional ICAs
When grid-cell layers specify cloud fraction (presumably in terms of areal coverage), explicit information or an algorithm is needed to describe exactly how the cloud is distributed within the grid cell and how the cells overlap. Most climate models do not resolve the horizontal scale of clouds and simply report a cloud water column and fractional coverage in the cell. A typical algorithm is MAX-RAN (Briegleb, 1992): the cloud is assumed to be uniform in the cloudy fraction of the cell; when two adjacent vertical layers have clouds, they are maximally overlapped; but when two cloudy layers, or two groups of contiguous (maximally) overlapped clouds, are separated by a clear layer, they are randomly overlapped (e.g., see figures in Neu et al., 2007). More realistic cloud-overlap algorithms have been developed based on observations showing that cloud overlap has a vertical decorrelation length (H. W. Barker, 2008;Bergman & Rasch, 2002;R. J. Hogan & Illingworth, 2000;. This EXP-RAN method assumes an exponential decorrelation length for connected cloud layers but random overlap across clear layers (A. M. Tompkins & Di Giuseppe, 2007).
In a manner similar to Hogan and Bozzo's (2018) deterministic cloud-cover generator that goes from MAX-RAN to EXP-RAN, Cloud-J developed a deterministic ICA generator for MAX-RAN and then adapted it to use vertical decorrelation lengths in its MAX-COR algorithm (M. J. Prather, 2015). Chemistry models need the selection of ICAs for any overlap method to be deterministic because many critical applications require perturbation-control pairs without stochastic noise (e.g., M. J. Prather & Hsu, 2010). Thus Solar-J cannot use a stochastic cloud generator (e.g., Räisänen et al., 2004), and this drove the structure of our cloud overlap algorithm. MAX-COR was designed to be (i) deterministic, (ii) linear in cost with increasing numbers of layers, and (iii) robust when cloud data are averaged in time or space, because such averaging tends to eliminate cloud-free layers and revert to MAX overlap. Based on observations of decorrelation length (Kato et al., 2010;Naud et al., 2008;Oreopoulos et al., 2012;Pincus et al., 2005), MAX-COR defines 6-layer groupings by altitude range. Because decorrelation is small across the vertical range of each group, we assume MAX overlap within each group and a decorrelation of the overlap of each MAX group with its neighbor. Adopting terminology of climate community, MAX-COR is effectively a MAX-EXP algorithm. By quantizing the cloud fraction to the nearest 10% and allowing an independent cirrus shield at the top, the absolute maximum number of ICAs under MAX-EXP is <5 × 10 6 and thus ICAs can be rapidly defined and binned with low computational overhead. Deterministic EXP-EXP or EXP-RAN models in our code would have to enumerate up to 2 33 ICAs for our model that has potentially 33 cloudy layers, which is truly prohibitive and not linearly scalable with resolution. We believe that a MAX-COR or MAX-EXP algorithm is likely the most stable and scalable deterministic ICA generator for vertical cloud decorrelation algorithms. The RRTMG v4.0 code available at the time of this study uses primarily MAX-RAN cloud overlap, but the new v5.0 code includes an EXP-RAN option. Thus, our comparisons of cloud-overlap results with the RRTMG code are limited to MAX-RAN. Within Solar-J we can run both MAX-RAN (SJ/RAN) and the standard MAX-COR (SJ) and thus compare with J. K. P. , as discussed below.
Let us accept that ICAs generated by cloud overlap algorithms can be solved with 1D RT as horizontally homogeneous plane parallel layers, then the next step is how to solve the RT problem for all ICAs and average the results. The number of ICAs are often numerous enough that no practical climate RT code can solve them all, and most codes do not even count them all (Räisänen et al., 2004). RRTMG randomly selects an ICA for each wavelength bin in the RT solution, a method designated Monte Carlo ICA (McICA, Pincus et al., 2003). McICA has errors at each time step by mixing ICAs across wavelengths and by not accurately sampling the average of ICAs (e.g., average cloud optical depth) in that time step. McICA is intended to deliver the correct mean when averaged enough times over the same cloud system, but it has hourly grid-cell rms errors of 40 W m −2 (H. W. Barker et al., 2008;Pincus et al., 2003). A key underlying premise is that solar heating errors propagate symmetrically and linearly in the climate system and average out, as was found for simple forecast models. Assessing net bias errors caused by noisy heating rates would need to examine nonlinear processes in hydrology, cloud systems, ecosystem productivity, and air quality in Earth system models (e.g., Pincus & Stevens, 2013).
With a deterministic ICA generator, we can calculate an "exact" non-stochastic answer as was done for limited test cases in M. J. Prather (2015), but we could not afford to do this for our January climate metric. Solar-J identifies and sorts all ICAs by cloud optical depth and then selects up to four representative quadrature column atmospheres (QCAs) each with a fractional area to represent the distribution of ICAs. The full-wavelength RT solutions are completed for each QCA (Neu et al., 2007). See Figure S3 for a global picture of the average frequency of occurrence of the 4 QCA bins for January 2015. Cloud quadrature does a very good job of averaging over the ICAs with net bias errors of ∼1% in solar intensity and rms errors of 2%-4%. To reach equivalent accuracy for a single time step using random selection would require about 50 ICAs each with full wavelength calculation (not as in McICA) versus an average of 2.8 QCAs (many grid cells have less than 4 QCAs).
The binary cloudy-or-clear within a grid cell fails to account for varying cloud thickness, and J. K. P. Shonk and Hogan (2008) invented the Tripleclouds algorithm to represent in-cloud heterogeneity with a thick homogeneous core cloud surrounded by a thinner homogeneous edge cloud. In the binary cloudy-clear approach, two layers can generate 2 2 ICAs from their overlap combinations, but in the Tripleclouds system, it is 3 2 . Tripleclouds address a significant source of error in solar heating calculations, but there is considerable uncertainty in partitioning the cloud water path into thick and thin regions (J. K. P. . Tripleclouds has not been implemented in Solar-J, and its primary use is in the RT codes of the ECMWF (R. J. Hogan & Bozzo, 2018) and ACCESS (Franklin et al., 2013).
The cloud overlap assumption is clearly an important source of error. Hogan (2010, hence SH2010) demonstrated this using 4 months of ERA-40 data and cloud optics similar to the January IFS cloud data used in Solar-J here. SH2010s calculation using Tripleclouds showed that EXP-RAN produced about +4 W m −2 more reflected sunlight than MAX-RAN, as expected because EXP-RAN has greater global cloud cover. Similarly, our calculation (using binary clouds) shows MAX-COR has +1.4 W m −2 more reflection and similar reduction in surface heating versus MAX-RAN (T1/R19). This reduced effect is understandable because MAX-COR does not shift to random overlap if there is a gap in one layer. Both models show that increased reflection is balanced by reduced surface heating, with little change in atmospheric heating. A direct comparison is not possible since SH2010 did not calculate EXP-RAN for binary clouds. SH2010s EXP-RAN appears to separate cloud layers more than the MAX-COR sub-groups, but their Tripleclouds may also affect the EXP-RAN results. SH2010 find that the Tripleclouds versus binary clouds is a larger effect (−6 W m 2 ). We support the SH2010 results that cloud overlap is major uncertainty in current models. Unfortunately

Three-Dimensional Cloud Fields
How will the next generation of solar heating codes deal with cloud structures, both vertically and horizontally? The vertical coherence of clouds has become standard in RT codes; and the horizontal coherence, including the variations in cloud water path within a cell is now being modeled explicitly with Tripleclouds (J. K. P. Shonk & Hogan, 2008). This work points toward new approaches for understanding how ICAs within a grid cell column interact with each other. As the horizontal resolution in climate models drops to cloud-resolving scales then it is obvious that neighboring column atmospheres are not independent. B2015 used 3D coherent clouds fields from the NASA A-train (MODIS, CloudSat, CALIPSO) with 3D Monte Carlo RT codes to show that 1D minus 3D RT errors in heating and reflection were as large as 2-stream minus exact multi-stream errors. Recent comparison of chemistry models with aircraft measurements (Hall et al., 2018) showed that scattered light from neighboring clouds alters photolysis rates in clear-sky ICAs.
While 3D RT codes are beyond reach for climate models, R. J. Hogan et al. (2016) developed the Speedy Algorithm for Radiative Transfer through Cloud Sides code as a new operational 2-stream RT code for the ECMWF models (R. J. Hogan & Bozzo, 2018). This code includes first-order 3D elements by coupling the 2-stream column solutions across the three elements in each layer. H. W. Barker and Li (2019) investigate how to average the solar heating over a domain consisting of many columns by using more frequent RT calculations but for an objectively selected sub-sample of the atmospheres therein based on the cloud distributions (Partitioned Gauss-Legendre Quadrature). This exciting approach shares many features with Solar-J QCAs (used for ICAs within a single cell), such as producing small rms errors relative to random sampling.

Ocean Surface Albedo
Climate models often assume that the ocean surface albedo (OSA) is constant in the visible, typically 0.06 for all incident solar direct and diffuse radiation. OSA varies greatly with incident angle and somewhat with wavelength, wind speed, and chlorophyll concentrations (Z. Jin et al., 2004;Li et al., 2006;Taylor et al., 1996 and see Figure S4). Recently, this interactive parameterization of OSA (Z. H. Jin et al., 2011) has been implemented in two Earth system models (Séférian et al., 2018) and shown to better match satellite derived OSA. Here we take the FORTRAN module directly from Séférian with only minor modifications. Because Solar-J resolves the downward diffuse radiation with four angles, we calculate the albedo specifi-cally for those four angles plus the direct solar beam, and do not use the OSA averaged albedo for "diffuse radiation," which is necessary when using 2-stream RT codes.
Solar-J lower boundary condition is 2nd-order in finite-difference RT solution and has not changed since the original Fast-J documentation (Wild et al., 2000). The interactive OSA requirement that each angle has a different albedo required a rewrite of the 2nd-order lower boundary condition. The Fast-J Feautrier solver for scattered light (see Equations 9 & 19 of Wild et al., 2000) uses odd-even (leap-frog) first-order finite-difference equations, solving at the lower boundary for j n .
where u n (n = 1:4) are the cosines of the zenith angles for the scattered intensity (I). The angles are Gauss points with weights w n . We assume a Lambertian reflective surface, and hence I up (+u n ) is isotropic and denoted simply as I up . The solution requires a linear equation relating I up to the intensities at the four angles j n=1:4 . For notation below, we use Σ to denote the sum over the quadrature angles n = 1:4. The upward flux from the lower boundary is the cosine-weighted sum of the specific intensity The upward flux can also be calculated in terms of the downward incident fluxes at the four quadrature angles and direct beam, but with angle-specific albedos A n and A 0 .
Substituting I down n = 2 j n -I up from Equation (1) With A n depending on u n , we derive the new lower boundary condition for I up .
  up solar 0 0 4 / 1 2 ¼ n n n n n n n Evaluation of interactive OSA (SJ minus SJ/OSA) uses the full spherical geometric atmosphere with MAX-COR cloud overlap. The global mean errors with fixed OSA are (+0.7, +0.2, −0.8 W m −2 ) for (reflection, atmospheric absorption, and surface absorption), respectively (T1/R22). The global mean error, fixed minus interactive, can be adjusted to near zero by selecting the fixed OSA, but there remains a strong latitudinal error of 3 W m −2 in ocean heating associated with high sun, see Figure 7. The zonal rms errors are large, 2-8 W m −2 , because of the wide diurnal range of solar zenith angles over the day, but given the thermal inertia of the upper ocean layers, this probably averages out. Overall, these results are similar to those found in Séférian et al. (2018).

Findings and Recommendations
Our goal here is to provide an extensive analysis of the many uncertainties or known errors in our climate model calculations of solar heating rates, and to do this in a consistent framework with climate-relevant diagnostics. We wanted to review as many of the uncertainties/errors as we could, from the mundane (Rayleigh scattering) to the formidable (cloud overlap), so that we could assess priorities for improving the accuracy of solar radiation in climate models. We began this study with the hope of comparing Solar-J directly to AER's RRTMG version 4.0 code, and thus we ran both codes within our chemistry-transport model to integrate over forecast modeled atmospheres. We became quickly humbled when even the clear-sky differences in the two, at the 1 W m −2 level, could not be fully resolved without a complete rewrite of one code or the other. Nevertheless, the parallel simulations with RRTMG were informative (e.g., 2-stream problems) and helped our analysis here. These findings are based primarily on the extensive variants and adaptations made to the Solar-J RT code. In some cases, our comparisons with other published work has provided insight and allowed us to draw insight beyond our own simulations.
Here, we make recommendations based on the magnitude of error and the difficulty or extra computational cost in improving the models. The levels of ranking include 0 (inconsequential errors), 1 (modest errors and easy/cost-effective fix, or significant errors but hard to fix), 2 (significant errors and ready/cost-effective fix).
1. Spherical, refracting atmosphere, level 2. Flat-atmosphere models have 1.9 W m −2 less incident sunlight and 1.1 W m −2 less heating of the climate system (atmosphere and surface) than do spherical, refracting atmospheres (T1/R23&24). Errors can reach 4 W m −2 (monthly averages) at high latitudes. Spherical solar ray-tracing with refraction can and should be readily implemented with simple ray tracing code (P2019) and incorporated in standard 2-stream codes (Spurr & Natraj, 2011). There will be minor costs in that about 56% of the Earth, rather than 50%, will require radiation calls every time step. 2. Geometrical, expanding atmosphere, level 1. Shifting from geopotential to geometric coordinates is a conceptual change and will need more thought on how to account for the extra mass in the upper layers as well as the extra solar heating (P2019) (T1/R25). 3. Stratospheric heating, level 1. Differences in stratospheric heating rates between RRTMG-SW and Solar-J are large (∼10%, see Figure 2 of H2017). These are likely caused by the different cross section for O 3 absorption and the lack of O 2 absorption in RRTMG (significant in the mid stratosphere). These differences can and should be readily resolved with some group efforts like PhotoComp (2010) but with diagnosed heating rates, and with the inclusion of O 2 photolysis as heating. (probably no impact on overall cost). 4. Resolving UV-VIS absorption and scattering, level 1. The 300-700 nm sunlight that reaches the surface interacts primarily with broad band features of O 3 absorption (Hartley-Huggins and Chappuis bands) and Rayleigh scattering that vary differently and widely across these wavelengths. The RRTMbased codes accurately resolve the O 3 and Rayleigh features with g-points. Solar-J bands are optimized for photolysis but not heating rates. Its number of visible broad bands needs to be expanded beyond 2, and those cross section adjusted to Watts weighting. Aerosol extinction, but not clouds, also varies widely across visible wavelengths, and all solar RT codes may consider testing aerosol wavelength dependence across their broad bands. (important, not too difficult, but limited to Solar-J, hence level 1). 5. Resolving IR cloud absorption, level 2. The infrared wavelength bands (#19 through #27 in RRTM) need to be reformulated to more accurately account for the absorption spectrum of liquid and ice water. The error is great in terms of atmospheric absorption (global average of ∼2 W m −2 excess heating) and cloud top heating (25% too great), likely affecting the life cycle of clouds. The LZL approach of doing a double correlated-K for water vapor and liquid water within the RRTM bands may be difficult when ice water is added (which it must be) since the absorption features are shifted in wavelength from those of liquid water. Our finding like Edwards and Slingo (1996) is that most of this error is simply failure to resolve the widely different SSA within each broad RRTM band. We suggest it is time to drop the RRTM band structure and re-group a set of non-contiguous narrower bands into a single group with similar cloud SSA, and then redo the correlated-K for water vapor within that new group. This would probably not increase the computational cost of the RT code, but does involve substantial new research. 6. Rayleigh scattering, level 0. Forcing Rayleigh scattering to be isotropic, as required in current 2-stream codes, is inconsequential. 7. δ-scaling of the cloud scattering phase function, no assessment. The errors caused by δ-Eddington and other scaling methods are significant, causing systematic errors at the 1-2 W m −2 level across latitudes. Because of the large reduction (factors of 4-7) in liquid cloud optical depths, these scaling methods project much greater solar flux impacting the surface at the SZA rather than as diffuse, scattered light. In terms of reflected light, LBYY show large dipole errors with SZA in reflected flux for δ-Eddington scaling. Fortunately, δ-scaling appears to have little impact on atmospheric heating. There is no assessment here because any solution requires multi-stream scattering, a more formidable task.
8. Multi-stream scattering, level 2. From the experiments here and the careful error analysis of B2015 and LBYY, it is clear that 2-stream scattering is a major source of error in many codes, including RRT-MG-SW. 2-Stream RT codes uniformly underestimate atmospheric heating rates by more than 2 W m −2 . They appear to have bias errors in reflected flux that depend on surface albedo. 2-stream requires some δ-scaling method with its own source of errors (vii above). In terms of solar heating errors, the 2-stream errors are comparable to the errors in moving from 1D to 3D RT (B2015). While full 3D RT is beyond the capabilities of climate models, multi-stream is clearly not. The GFDL and CCCma models can currently use 4-stream RT, and it is likely that 6-or 8-stream RT could be optimized for modern processors and software so as to be affordable. For example, the 8-stream Fast-J code is being implemented in E3SM for chemistry-climate modeling. Based on B2015, any of these options would be a significant improvement for climate modeling. 9. Ice clouds, level 1. The range of approaches to parameterizing ice clouds in current solar RT codes reflects both the lack of multi-stream capability to resolve the ice crystal scattering and the fundamental uncertainty in prescribing the mix of crystal habits over the wide atmospheric range of ice clouds. With the recent work by colleagues (e.g., 2013, 2018) quantifying the scattering, and the better atmospheric data from Heymsfield and colleagues (e.g., 2017), we are in a position to remove ice-cloud parameterizations and apply the ice-cloud physics directly to solar RT codes. Two recent efforts to do this have been restricted to RRTMG-like codes and thus resorted to polynomial fits for a 2-stream code. It is clear that we need a good database for typical ice clouds that includes the basic physics, like the phase function needed for 4-stream and higher codes. 10. Cloud overlap, level 2. The representation of sub-grid unresolved cloud overlap is critical in solar heating. The simplistic averaging of clouds over the grid cell produced huge errors (+20 W m −2 in reflected flux in our case) and was quickly dropped in favor of more realistic cloud overlap models such as MAX-RAN. The latest type of cloud schemes includes the successive decorrelation with altitude separation and include EXP-RAN, EXP, and MAX-COR (used here). All of these decorrelation methods increase the effective cloud cover and the reflected solar flux. In our case, MAX-COR has on average 1.4 W m −2 greater reflection and less surface absorption, similar to other approaches, including more exotic approaches (J. K. P. Shonk & Hogan, 2008) that add a second, thinner extension around the primary cloud. The range of approaches reflects the basic uncertainty in mapping climate model cloud data (cloud fraction and cloud water content) into 3D fields of clouds. For the 1D, or quasi-1D (R. J. Hogan & Bozzo, 2018) RT codes in climate models, it would be useful to establish some standard, community-wide, satellite-based cloud overlap models (H. W. Barker, 2008;Bankert et al., 2015;Ham et al., 2015;Kato et al., 2010; along with a simple ICA generator (e.g., M. J. Prather, 2015) to provide a basis for comparisons. 11. Monte Carlo or other ICA averaging, level 1. Using cloud quadrature QCAs to average over ICAs would greatly reduce the numerical noise generated by McICA random selection. It is easy to implement in any code, but would increase the computational costs by a factor of 2.8. There is a broad interest reducing the stochastic noise in heating rates, particularly at high resolution (H. W. Barker & Li, 2019). 12. Ocean surface albedo, level 2. Use of an interactive ocean surface albedo that depends on SZA, wind and wavelength would eliminate a latitudinal mean bias of 3 W m −2 in surface absorption. This easy fix, can be used in RRTMG, and has already been implemented in the ARPEGE and LMDZ models (Séférian et al., 2018). 13. Photosynthetically Active Radiation, no assessment. PAR is calculated in Solar-J with 4 downward diffuse streams and no δ-scaling of cloud optical depth, which is notably more accurate than 2-stream codes where much of the diffuse light is reported as direct beam. Thus, 2-stream methods have large errors in the diffuse:direct ratio of PAR under clouds or aerosols. We estimate that PAR errors are level 2, but a more thorough analysis would need to couple the direct and diffuse PAR to a land biosphere model to evaluate the errors in primary productivity (equivalent of W m −2 ).

Data Availability Statement
Dataset, Solar-J source code, and scripts for generating figures and tables are concurrently available to DRY-AD University of California, Irvine with DOI https://doi.org/10.7280/D1PQ3W.