A Path-Tracing Monte Carlo Library for 3D Radiative Transfer in Highly Resolved Cloudy Atmospheres
Abstract
Interactions between clouds and radiation are at the root of many difficulties in numerically predicting future weather and climate and in retrieving the state of the atmosphere from remote sensing observations. The broad range of issues related to these interactions, and to three-dimensional interactions in particular, has motivated the development of accurate radiative tools able to compute all types of radiative metrics, from monochromatic, local, and directional observables to integrated energetic quantities. Building on this community effort, we present here an open-source library for general use in Monte Carlo algorithms. This library is devoted to the acceleration of ray tracing in complex data, typically high-resolution, large-domain grounds and clouds. The main algorithmic advances embedded in the library are related to the construction and traversal of hierarchical grids accelerating the tracing of paths through heterogeneous fields in null-collision (maximum cross-section) algorithms. We show that with these hierarchical grids, the computing time is only weakly sensitive to the refinement of the volumetric data. The library is tested with a rendering algorithm that produces synthetic images of cloud radiances. Other examples of implementation are provided to demonstrate potential uses of the library in the context of 3D radiation studies and parameterization development, evaluation, and tuning.
Key Points
- A path-tracing library is distributed for flexible implementation of Monte Carlo algorithms in cloudy atmospheres
- Null-collision algorithms and hierarchical grids are combined to accelerate ray tracing in large volumetric data
- Insensitivity of radiative transfer computational cost to surface and volume complexity is achieved
1 Introduction
Radiative transfer, within the scope of atmospheric science, describes the propagation of radiation through a participating medium: the atmosphere, bounded by the Earth's surface. While many components of the Earth system interact with radiation, clouds play a key role because of their strong impact (globally cooling the Earth; Ramanathan et al., 1989), their high frequency of occurrence (Rossow & Dueñas, 2004), and their inherent complexity in both space and time (Davis et al., 1994). Radiation and its interactions with clouds are involved in various atmospheric applications at a large range of scales: from the Earth's energy balance and cycle relevant to numerical weather predictions (Hogan et al., 2017) and climate studies (Cess et al., 1989; Dufresne & Bony, 2008) to the inhomogeneous heating and cooling rates that modify dynamics and cloud processes at small scales (Jakub & Mayer, 2017; Klinger et al., 2017, 2019), and to the retrieval of atmospheric state and properties from radiative quantities such as photon path statistics, spectrally resolved radiances, and polarized reflectances (Cornet et al., 2018), observed by both active and passive remote sensors.
The three-dimensional (3D) radiative models developed in atmospheric science represent the interactions between clouds and radiation very accurately, but one-dimensional (1D) models are preferred in operational contexts for their simplicity and efficiency. This is a demonstrably poor approximation in cloudy conditions (Barker et al., 2003, 2015), particularly in broken cloud fields, where cloud sides play an important role in the radiative fluxes' distribution and divergence, as they account for a large portion of the interface between clouds and clear air (Benner & Evans, 2001; Davies, 1978; Harshvardhan et al., 1981; Hinkelman et al., 2007; Kato & Marshak, 2009; Pincus et al., 2005). A large-scale parameterization for 3D effects was recently developed (Hogan & Shonk, 2013; Hogan et al., 2016, 2019; Schäfer et al., 2016), leading to the very first estimation of the broadband, global 3D radiative effect of clouds (around 2 W/m^{2} after Schäfer, 2016). Approximate radiative models representing 3D effects at smaller scales are also available for high-resolution atmospheric models (Jakub & Mayer, 2015; Klinger & Mayer, 2016; Marshak et al., 1998; Wapler & Mayer, 2008; Várnai & Davies, 1999). These advances were made possible by the long-term efforts of a pioneering group of cloud-radiation scientists who, over the past 40 years, have been developing and using reference 3D radiative transfer models to analyze and document cloud-radiation 3D interactions (see Davis & Marshak, 2010; Marshak & Davis, 2005, and references therein). These 3D models can be divided into two categories: those using deterministic approaches (e.g., the Spherical Harmonics Discrete Ordinate Method; Evans, 1998) and those using statistical approaches, that is, Monte Carlo (MC) methods (Marchuk et al., 1980a). Our proposal builds upon one of the major strengths of MC models: that the computing time is only weakly sensitive to the size of the geometrical and spectral data set.
Building on this strength, the present work proposes:
- connections to the literature and practices of the computer graphics community, and
- a freely available C library for general use in MC problems involving large cloud scenes above complex surfaces.
Although we also present a rendering code implemented using the library, we do not wish to focus on this particular example but rather on the library itself, which is designed to facilitate the coding of a wide diversity of MC algorithms while taking advantage of recent developments in computer graphics. In today's path-tracing MC codes, complicating the ground description has no significant impact on the computing time. We show that by using the null-collision method (known in atmospheric science as maximum cross-section; Marchuk et al., 1980b) together with computer science advances in the handling of large geometric data, computing time insensitivity can also be reached when increasing the cloud field resolution.
Section 2 briefly recalls the principle of the acceleration grids used to achieve the insensitivity of computing times to ground resolution and explains why, until very recently, the same techniques could not be directly applied to volumes. Section 3 describes a new, free library, the purpose of which is to facilitate the implementation of MC algorithms by providing tools for handling large amounts of data. The algorithmic advances embedded in the library, which are at the heart of our proposal, are (i) the construction of hierarchical grids for accelerating ray tracing in both surfaces and volumes, and (ii) the filtering functions used as an abstraction to allow strict separation of the ray-tracing procedure from the MC algorithm itself. It is demonstrated in section 4 that the objective of achieving a computing time insensitive to cloud field resolution is reached. This is illustrated using a rendering algorithm that produces synthetic images (fields of radiances) of scenes representing cloudy atmospheres, which we apply to a variety of cloud fields: stratocumulus, cumulus, and congestus. In the last section, the present work is summarized; other examples of MC codes implemented with the library and dedicated to the study of 3D radiative effects of clouds are mentioned (section 5.1); and the technical state of the library, along with its current limitations, is discussed (section 5.2).
2 Acceleration Grids for Large Surface and Volume Data Sets
First, we present the principle of acceleration structures for efficient ray tracing in surfaces, a common practice in the field of computer graphics. Since most MC codes remain sensitive to the size and refinement of the volume description due to the nonlinearity of Beer's extinction law, the end of this section is devoted to the well-established family of null-collision algorithms (NCAs), presented here as a way to bypass this nonlinearity (Galtier et al., 2013), thus opening the door to acceleration grids for volumes as well. To the best of our knowledge, the most advanced proposal along these lines in the field of cloud radiation is that of Iwabuchi and Okamura (2017). However, while they use NCAs in acceleration grids, they do not achieve insensitivity of computing times to the resolution of the volumetric data. With distinct applicative objectives, strong efforts have also been made by the film industry, especially by Disney Research, which revisited NCAs and transformed them into a validated industrial practice (Kutz et al., 2017; Novák et al., 2018, 2014).
2.1 Why Can Monte Carlo Codes Be Insensitive to the Complexity of Ground Surfaces?
MC codes simulating radiation above a highly refined ground surface (discretized as millions of triangles) must find the triangle that intersects the current ray, if any. This is a quite simple geometric problem, but speed requirements have motivated the development and use of acceleration structures to increase the efficiency of ray tracing (see Appendix A.1 for a brief historical description). The triangles are represented in memory in such a way that only the triangles neighboring the ray need be checked for intersection. In practice, there is a precomputation phase in which the triangles are virtually gathered into bounding boxes. When a ray is traced into the scene, only the triangles inside the crossed bounding boxes are tested for intersection. When dealing with large numbers of triangles, any such strategy reduces the computing time drastically by comparison with systematic testing of all the triangles in the scene. However, quite sophisticated acceleration structures were required before the cost of ray-tracing procedures became fully independent of the number of triangles in the scene. It is the hierarchical nature of the acceleration structures that allows the computing time to be insensitive to the complexity of the ground description (see Figure 1). Such structures are made of coarse bounding boxes that are recursively subdivided when they include too many triangles, yielding an adapted multilevel subdivision of space. They are now well documented, and numerous libraries are available for rapid implementation.
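The bounding-box culling principle described above can be sketched in a few lines. The following is a minimal, illustrative Python sketch (not the library's API; all names are ours): triangles are gathered into axis-aligned boxes during a precomputation phase, and at trace time a ray is tested against each box with the standard slab test, so that only triangles inside intersected boxes are candidates for the (more expensive) ray-triangle test.

```python
# Illustrative sketch of bounding-box culling; names are hypothetical.

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does the ray (origin, direction) intersect the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        if abs(direction[axis]) < 1e-12:
            # Ray parallel to this slab: it misses unless the origin lies inside.
            if not (box_min[axis] <= origin[axis] <= box_max[axis]):
                return False
            continue
        t0 = (box_min[axis] - origin[axis]) / direction[axis]
        t1 = (box_max[axis] - origin[axis]) / direction[axis]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def candidate_triangles(origin, direction, boxes):
    """Return only the triangles whose bounding box is crossed by the ray.

    boxes: list of (box_min, box_max, triangles_in_box)."""
    hits = []
    for box_min, box_max, triangles in boxes:
        if ray_hits_box(origin, direction, box_min, box_max):
            hits.extend(triangles)
    return hits
```

A hierarchical structure applies this same test recursively, descending into a box's children only when the box itself is hit, which is what makes the cost nearly independent of the total number of triangles.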
2.2 The Nonlinearity of Beer's Extinction Forbids the Straightforward Use of Acceleration Grids for Volumes
When estimating the optical thickness along a ray of length L in a direction s, a standard MC approach would estimate the line integral of the extinction coefficient as τ ≈ (L/N) Σ_{j=1}^{N} k(x_{l_j}), where N is a large number of realizations and x_{l} is a point location at distance l along s, with l sampled uniformly over [0, L]. The corresponding data-access difficulties are then reduced to retrieving the extinction coefficient at the sampled locations, and this could be efficiently achieved by using an appropriate memory representation of the elementary volumes, that is, acceleration grids (regular grids being one example of such).
However, this simple integral over the extinction coefficient cannot be statistically combined with the other integrals over photon paths γ (over scattering angles, wavelengths, etc.) because it appears inside Beer's exponential. The nonlinearity of the exponential imposes that τ be evaluated either in a deterministic way (abandoning the MC approach for this part of the algorithm) by successively crossing the elementary volumes as in Figure 2a, or by using a nonlinear MC approach to handle these two nonlinearly combined integrals simultaneously. Until recently, reported attempts to extend MC to nonlinearly combined processes were scarce (Dauchet et al., 2018). The deterministic approach, intrinsically resolution dependent, has often been retained.
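The deterministic evaluation mentioned above (Figure 2a) can be sketched as follows; this is an illustrative Python fragment, not the library's code, assuming a ray that crosses a sequence of elementary volumes of equal length. It makes the resolution dependence explicit: the loop visits every cell crossed, so halving the cell size doubles the work while leaving the result unchanged.

```python
import math

def transmissivity_deterministic(extinction_per_cell, cell_length):
    """Beer-Lambert transmissivity exp(-tau), with tau accumulated cell by cell.

    extinction_per_cell: extinction coefficient k in each crossed elementary volume.
    The cost of this loop grows linearly with the number of cells crossed."""
    tau = 0.0
    for k in extinction_per_cell:  # one visit per elementary volume
        tau += k * cell_length
    return math.exp(-tau)
```

For a homogeneous medium with k = 0.1 over a path of length 10, the result is exp(-1) whether the path is discretized into 10 cells or 20: only the cost changes with refinement.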
2.3 NCAs and Their Integral Formulation Counterparts
1. Set x = x_{0}.
2. Trace a ray in the scene as if the volume were empty, originating from x in the direction ω, until either a surface is intersected or the ray reaches the TOA.
3. If a surface is intersected, return w = 0 (the ground is opaque).
4. If no surface is intersected, trace a ray in the homogeneous volume of uniform majorant extinction k̂:
   - Compute τ̂ = k̂ L, where L is the distance from x up to the TOA in direction ω.
   - Sample an optical thickness τ according to Beer's extinction law.
   - If τ > τ̂, no collision is detected: return w = 1.
   - If τ < τ̂, a collision is detected: set s = τ/k̂, move to the collision location x_{s} = x + sω, and access the local value k(x_{s}) of the field of extinction coefficient.
   - Sample a random number ϵ uniformly in the unit interval in order to decide between a true and a null collision: the collision is true if ϵ < k(x_{s})/k̂ and null otherwise.
     - If the collision is true: return w = 0.
     - If the collision is null: proceed to step 5.
5. Set x = x_{s} and loop to step 4.
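The volumetric part of this algorithm can be sketched in Python as follows. This is an illustrative, minimal sketch rather than the library's C implementation: it ignores surfaces, assumes a scalar extinction field given as a callable k(s) of the distance along the ray, and a uniform majorant k_hat (the true plus null-collision extinction). Sampling the free path from an exponential law of rate k_hat is equivalent to sampling an optical thickness from Beer's law and dividing by k_hat.

```python
import math
import random

def transmissivity_sample(k_of, k_hat, L, rng=random):
    """One realization w of the direct transmissivity over a ray of length L.

    k_of: callable returning the true extinction k at distance s along the ray.
    k_hat: uniform majorant extinction (must satisfy k_hat >= k everywhere)."""
    s = 0.0
    while True:
        # Sample a free path in the homogeneous k_hat medium (Beer's law).
        s += rng.expovariate(k_hat)
        if s >= L:
            return 1.0  # no collision before the end of the ray: w = 1
        if rng.random() < k_of(s) / k_hat:
            return 0.0  # true collision: the path is extinguished, w = 0
        # null collision: keep tracing from the collision location

def transmissivity_estimate(k_of, k_hat, L, n, seed=0):
    """MC estimate of the transmissivity as the average of n realizations."""
    rng = random.Random(seed)
    return sum(transmissivity_sample(k_of, k_hat, L, rng) for _ in range(n)) / n
```

For a homogeneous field (k = 0.5, L = 2), the estimate converges to exp(-1) ≈ 0.368 for any valid majorant; a larger k_hat only increases the number of (null) collision events, not the expectation.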
Through these simple examples, NCAs are presented as an entirely new family of formulations, beyond simple rejection algorithms. Indeed, while in the first formulation 2, the treatment applied to null-collision events is a simple rejection (a purely forward scattering event), the handling of null-collision events in the derived formulation 3 is more complex (although straightforward enough): It requires the computation and storage of a new quantity (η, see equation 4). This need for flexibility inside the ray-tracing procedure required close attention when designing the library (this point will be discussed in section 3.2). The family of null-collision formulations is notably different from standard MC algorithms in that the integral of k along the line of sight no longer appears inside the exponential, and hence acceleration strategies can be deployed (see Figure 2c). However, this comes at the price of increasing the recursivity level of the path statistics: The events induced by the added virtual colliders can lead to a significant increase of computational cost, especially in domains where heterogeneities cover large ranges of scales. There is therefore a compromise to be reached between the number of such events and the number of grid cells intersected during ray tracing. This point is developed in the next subsection.
2.4 The Expected Features of Acceleration Grids for Path-Tracing in NCAs
Among the first consequences of the analysis of NCAs in their integral forms is the fact that acceleration grids could indeed be introduced for volumes (Iwabuchi & Okamura, 2017; Kutz et al., 2017; Novák et al., 2018). Such structures are expected to ensure fast traversal of the k̂ field used in NCAs, and fast access to the true k value when a collision is found in the transformed field, all the while minimizing the computational cost of handling null collisions by locally adjusting the k̂ field to the true k field. It is indeed not necessary to add null colliders until the whole field of the extinction coefficient is uniform: It is only required for the spatial variations of k̂ to be simple enough to allow fast sampling of the next collision location. If k̂ is entirely uniform, then the sampling is ideally fast, but it remains fairly simple if k̂ is only uniform by parts. Therefore, the acceleration grid should be composed of voxels where k̂ is uniform (supercells in Iwabuchi & Okamura, 2017), and the voxels should be constructed with the constraint that the null-collision coefficient k_{n} = k̂ − k be small enough, ideally null, so as not to add too many null collisions.
However, a fast traversal is only achieved when few voxels are intersected by traced rays. This means that k_{n} should not always be close to 0: If k̂ matches k very closely, then the acceleration grid will be very refined (in the extreme, as refined as the original field) and traversing the acceleration grid will be as expensive as computing the optical thickness deterministically (the number of intersected voxels will be the same). A compromise needs to be found between grid refinement and collision frequency. This is precisely the issue that was investigated by the computer graphics community when trying to accelerate ray-surface intersections, and the same solution can be used for volumes: hierarchical grids, refined as a function of collider density (the extinction coefficient field). The original grid resolution will be preserved in the densest regions, while contiguous optically thin regions will be merged into a unique voxel of uniform k̂, thereby reducing the number of voxel intersections. The optical thickness τ̂ in the voxels of the acceleration grid is a key quantity: There is no reason for k̂ to match k closely as long as τ̂ remains small, since few collisions will occur anyway. This question will be investigated later in section 4.3. The next section is dedicated to the path-tracing library that was developed to facilitate the implementation of efficient NCAs.
3 A Path-Tracing Library
Section 2 stated that NCAs can be seen as a way to bypass the nonlinearity of Beer's extinction law, thus making it possible to develop acceleration strategies to trace rays into volumes, while benefiting from similar developments made for surface treatment in computer graphics. This section describes the path-tracing library at the heart of our proposal: a collection of low-level functions that facilitate the implementation of MC codes involving large geometric models and large volumetric data sets. The library elements remain independent of the specificity of the (null-collision) MC algorithm. In this sense, the present contribution is conceived in the spirit of the I3RC Community MC model (Cahalan et al., 2005; Jones & Di Girolamo, 2018; Pincus & Evans, 2009), and the more recent RTE+RRTMGP (Radiative Transfer for Energetics + Rapid Radiative Transfer Model for GCMs, Parallel; Pincus et al., 2019), designed as a platform to facilitate the development of atmospheric radiative transfer codes by radiation physicists in a wide range of applicative contexts. Sharing their concerns regarding flexibility, replaceability, and traceability, we have paid particular attention to the abstractions used when splitting the library into elementary functions. Section 3.1 describes how hierarchical grids can be constructed using the library, while in section 3.2, special attention is paid to filtering functions, a feature of the ray-tracing procedure designed to facilitate the coding of algorithms obtained by manipulation of integral formulations.
As our first concern when developing the acceleration structure was to be able to handle large data sets, an illustration of the data typically output from high-resolution atmospheric models is presented in Figure 3a. It shows a vertical cross section of the liquid water mixing ratio in a highly refined cloud field produced by the Meso-NH (Lac et al., 2018; Lafore et al., 1997) Large Eddy Model, with a 5-m resolution in all three directions, on a 5 × 5 × 5 km^{3} domain. The initial conditions and model setup for this simulation (but with a 50-m resolution) are described in Strauss et al. (2019). The 3D fields of liquid and vapor water, temperature, and pressure are partitioned into regular grids of 1,000^{3} cells, which represents about 38 GB of data. To these physical 3D fields, a spectral dimension issued from a k-distribution model (Iacono et al., 2008; Mlawer et al., 1997) is added, multiplying the amount of data by the 30 quadrature points used in the visible part of the solar spectrum. Details on the production of the physical data and the optical properties of cloud droplets and gas are presented in Appendix C.
As many grid cells are cloud-free in most simulated 3D cloud fields, thus hardly contributing to the scene optical depth, the benefits of using NCAs combined with acceleration structures are expected to be significant. In Iwabuchi and Okamura (2017), a first step in the hierarchical treatment of these clear cells consists of separating the cloudy layer from the clear layers that stand above and below and then generating acceleration grids at fixed resolutions that differ in clear and cloudy layers. Here, we show that we can go one step further by generating acceleration grids that, by their recursive nature, handle the horizontal and vertical variations of the extinction field at all scales. This is illustrated in Figure 3b, which represents a cross section of the 3D acceleration grid constructed from the 3D 5-m-resolution cloud field of Figure 3a.
3.1 Construction and Use of Hierarchical Grids
A development environment constituted by a set of independent free libraries is available online (Meso-Star, 2016). They were designed for radiative transfer specialists who are either developing new MC codes or upgrading the ray-tracing routines in existing ones. Independent modules offering functionalities such as random sampling of pdfs, parallel integration of a realization function, sampling and evaluation of scattering and reflection functions, and ray tracing in surfaces and volumes are described in Table D1 of Appendix D. The module that handles ray tracing in surfaces is based on the Embree library (Wald et al., 2014), the common standard in computer graphics. However, although solutions for rendering complex volumes exist for production purposes (see, e.g., the OpenVDB library; Museth, 2013), it is our understanding that the management of volumetric data has not yet reached the same level of maturity as surface rendering.
3.1.1 Construction
In our library, we chose to implement one specific type of acceleration structure: octrees, hierarchical grids that partition 3D data. To construct these hierarchical grids, groups of 2^{3} cells containing the data (e.g., extinction coefficients) are recursively tested for merging. Since strategies for merging voxels control the balance between the costs of traversal versus null collisions, they should be considered together with the specificity of the implemented algorithm. This is why no assumption is made about the input/stored data or the merging strategy at the library level: It is left entirely to the responsibility of the physicist.
The hierarchical grid illustrated in Figure 3 is built using an optical depth criterion: If the residual vertical optical depth of the merged voxel is greater than this criterion, then the merging is rejected. Following Novák et al. (2014), the residual vertical optical depth is defined as the difference between the maximum and minimum extinction coefficients of the region tested for merging, times its vertical depth. This ensures that homogeneous regions are merged even if optically thick and that optically thin regions are merged even if heterogeneous: In both cases, the residual optical depth is small. The vertical dimension is chosen here because in the reverse solar algorithms we implement, rays are most frequently traced upward in the direction of the Sun. Other strategies might be more appropriate depending on the algorithm.
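The recursive construction with the residual-vertical-optical-depth criterion can be sketched as follows. This is an illustrative Python sketch under assumed names (build_octree, tau_c, etc.), not the library's C API: a cubic region is kept as a single merged voxel when (k_max − k_min) times its vertical depth stays below the threshold, and is otherwise split into its 2^{3} octants, which are tested in turn.

```python
# Illustrative octree construction; all names are hypothetical.

def build_octree(k, origin, size, dz, tau_c):
    """Recursively merge a cubic region of the extinction field.

    k: function (i, j, l) -> extinction coefficient at cell indices.
    origin: (i0, j0, l0) lower corner in cells; size: edge length in cells
    (a power of 2); dz: vertical extent of one cell; tau_c: merging threshold."""
    i0, j0, l0 = origin
    values = [k(i, j, l)
              for i in range(i0, i0 + size)
              for j in range(j0, j0 + size)
              for l in range(l0, l0 + size)]
    k_min, k_max = min(values), max(values)
    # Residual vertical optical depth of the candidate merged voxel:
    residual_tau = (k_max - k_min) * size * dz
    if size == 1 or residual_tau <= tau_c:
        # Merged leaf: store the majorant used for null-collision tracking.
        return {"origin": origin, "size": size, "k_hat": k_max}
    half = size // 2
    children = [build_octree(k, (i0 + a * half, j0 + b * half, l0 + c * half),
                             half, dz, tau_c)
                for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    return {"origin": origin, "size": size, "children": children}
```

Note how the criterion behaves as described in the text: a homogeneous region merges into one leaf even if optically thick (k_max − k_min = 0), while a heterogeneous but optically thin region also merges because its residual optical depth is small.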
3.1.2 Storage
Since the paths will be tracked in the hierarchical grids, it is no longer necessary for the raw data to fit into the main memory. The original input data are stored on disk and loaded into memory whenever a collision is found and its nature needs to be tested. The immediate benefit is that calculations in large cloud fields that would not fit into memory are now possible. Of course, time is then spent loading and unloading chunks of data (fragments of contiguous data in memory or disk space) into and from the main memory, which can rapidly become prohibitive in terms of computational effort. As of now, the octrees are still stored in main memory; hence, building octrees with a coarser (suboptimal) refinement might prove necessary when handling huge data sets.
However, strategies to improve performance have been anticipated in the library implementation. The library registers the voxels in a Morton order that preserves the spatial coherence of the 3D data in memory or on disk (Baert et al., 2013). The data are fragmented into fixed-size memory blocks (Laine & Karras, 2010), which can be efficiently (un)loaded by the operating system to handle out-of-core data (Tu et al., 2003). This ensures that whenever a ray interacts with several voxels in a limited spatial region, the relevant data are available in memory as of the first interaction necessitating the loading of the corresponding data chunk.
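The Morton (Z-order) encoding mentioned above maps a 3D voxel index to a 1D index by interleaving the bits of the three coordinates, so that voxels close in space tend to land close in memory or on disk. A minimal sketch (a straightforward bit-interleaving loop; production codes typically use faster bit tricks or lookup tables):

```python
# Illustrative Morton (Z-order) encoding of 3D voxel indices.

def morton3d(i, j, l, bits=10):
    """Interleave the low `bits` bits of i, j, l into a single Morton code."""
    code = 0
    for b in range(bits):
        code |= ((i >> b) & 1) << (3 * b)      # bit b of i -> position 3b
        code |= ((j >> b) & 1) << (3 * b + 1)  # bit b of j -> position 3b+1
        code |= ((l >> b) & 1) << (3 * b + 2)  # bit b of l -> position 3b+2
    return code
```

Sorting voxels by this code groups each 2 × 2 × 2 block of cells into 8 consecutive indices, which is exactly the locality exploited when (un)loading fixed-size chunks.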
3.1.3 Crossing
The last important functionality implemented in the library is the crossing of the hierarchical grid. The ray-tracing procedure can be seen as a sophisticated “do-while loop”: it is an abstract procedure that iterates in an ordered fashion over the voxels intersected by the ray. At each intersection, a filtering function (the “loop body”) is called. No assumption about either the nature of the data contained in the voxels or the treatment that will be applied by the filtering function upon voxel intersection is made at the library level: Again, this is left to the responsibility of the physicist. By enabling the requisite independence between ray tracing and intersection treatment, this choice of abstraction responds to physics-driven considerations detailed in the next subsection.
3.2 Integral Formulations and Filtering Functions
As mentioned before, in designing the library, particular attention was devoted to the separation of concepts. Coherence with computer graphics libraries (Pharr & Humphreys, 2018; Wald et al., 2014) was sought, but possible connections with the integral formulation concepts of the radiative transfer community were favored above all. The specificities of NCAs were illustrated in section 2.3, where a sensitivity algorithm was derived, in which an additional quantity had to be computed at each null-collision event. Differentiation to evaluate sensitivities is only one example of transformation based on the manipulation of integral formulations. Other examples include the handling of negative null-collision coefficients (Galtier et al., 2013) and the sampling of absorption lines when the gaseous part of k cannot be precomputed in line-by-line MC algorithms dealing with large spectroscopic databases (Galtier et al., 2016). As soon as the introduction of null collisions is perceived as a formal way to handle the nonlinearity of Beer's extinction in heterogeneous fields, interpretation of the modified NCAs may depart widely from the intuitive adding of virtual scatterers.
Filtering functions are used to facilitate the implementation of such algorithms. They isolate the part of the code that is associated with the recursivity of the ray tracing from the physical part of the code where, for example, the treatment of true scattering events is implemented. The same concept was introduced by the computer graphics community in order to deal with surface impacts that require a specific treatment inside the ray-tracing function itself, for instance, filtering out (ignoring) intersections with transparent surfaces. The objective is for the ray-tracing procedure to not be exited at each intersection but rather only when a true collision is found. To that end, a filtering function implemented by the physicist is called by the ray-tracing procedure itself at each intersection, to decide whether to exit or proceed with the traversal. Filtering functions for volumes filter out the intersected voxels where no collision or null collisions occur. More sophisticated computations specific to the treatment of null-collision events should also be implemented in the filtering function.
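The separation between the traversal (owned by the library) and the filtering function (owned by the physicist) can be sketched as follows. This is an illustrative Python sketch with hypothetical names (traverse, filter_fn), not the library's C API: the traversal iterates in order over intersected voxels and calls back the user filter at each one, exiting only when the filter reports a true collision.

```python
# Illustrative sketch of the filtering-function abstraction; names are ours.

def traverse(voxels_along_ray, filter_fn, context):
    """Iterate in order over the voxels intersected by a ray.

    The traversal makes no assumption about voxel contents: it only asks the
    filter whether to exit (True: true collision) or continue (False)."""
    for voxel in voxels_along_ray:
        if filter_fn(voxel, context):
            return voxel   # true collision found: exit the traversal
        # intersection filtered out (empty voxel or null collision): keep going
    return None            # the ray escaped the scene

def make_null_collision_filter(rng, k_hat):
    """Build a filter that rejects null collisions against a majorant k_hat.

    Any algorithm-specific quantity (e.g. the sensitivity accumulator of
    section 2.3) would be updated through `context` here, without touching
    the traversal code."""
    def filter_fn(voxel, context):
        context["events"] += 1          # bookkeeping owned by the physicist
        return rng.random() < voxel["k"] / k_hat
    return filter_fn
```

The point of the design is visible in the sketch: the traversal never needs to know what "k" means or what the context holds, so the same crossing routine serves plain transmissivity estimates, sensitivity algorithms, or spectral sampling alike.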
4 Implementation and Performance Tests
Simulating all flow structures from turbulence at metric scales to organized convection at mesoscale, above a possibly complex surface, is a relatively recent achievement permitted by the increase in computational power and heavy parallelization (Dauhut et al., 2016; Heinze et al., 2017). These high-resolution, large-domain simulations unlock new possibilities but come with limitations related to the amount of produced data. Post-treatment and analysis are becoming difficult, and the outputs of such simulations are not always employed to their full potential, at least as far as studies of cloud-radiation interactions are concerned. This is what motivated us to develop radiative tools that would scale with this increasing amount of data. In this section, a rendering algorithm implemented using the library described above is presented. A cloud field typical of today's large LES (1,000 × 1,000 × 1,000 cells) is used to show that NCAs that track paths in hierarchical structures allow the computation of radiance fields of clouds described by large data sets and that the rendering time is almost insensitive to the resolution of the cloud field. This is the main achievement reported in this paper, and this entire section is dedicated to the analysis of performance in terms of rendering time, as a function of the amount of volumetric data, the type of clouds, and the merging strategy used when constructing the acceleration grids.
4.1 The Algorithm
The rendering of images of highly resolved clouds is challenging in terms of computational resources, yet 3D visualization of atmospheric data is useful in assessing the realism of high-resolution simulations and provides information on the 3D paths of light and their interactions with clouds. Such rendering algorithms are also useful for evaluating the inversion procedures used to retrieve cloud parameters from satellite images. To render a virtual cloud scene, a virtual camera is positioned anywhere in 3D space, and its position, target point, and field of view define an image plane, which is discretized into a given number of square pixels. For each pixel, three independent MC simulations are run to estimate the radiance incident at the camera, integrated over the small viewing angle defined by the pixel size and over the solar spectrum weighted by the responsivity spectra of the three types of human eye cone cells (Smith & Guild, 1931). Pixels are distributed among the different nodes and threads whenever parallelization is active. Once the three spectral components of the radiance field have been computed in each pixel, the map is converted into a standard Red Green Blue (sRGB) image for visualization (see Appendix D).
The retained backward algorithm is as follows: Paths are initiated at the camera. A direction ω is sampled in the solid angle defined by the pixel size and position in the image plane. A wavelength is sampled following the responsivity spectrum of the current component. The narrow band in which the sampled wavelength lies is found in the k-distribution data. A quadrature point is sampled in the narrow band. The contribution of the direct Sun is computed as follows: If the current direction of propagation ω lies within the solar cone and no surface intersection is found along the ray trajectory, then the ray is traced into the volume to compute the direct Sun transmissivity as per the algorithm described in section 2.3, but additionally using a variance reduction technique called decomposition tracking (Kutz et al., 2017; Novák et al., 2014). Otherwise, the direct contribution is null. Then, the path is tracked in the (null-collision) scattering medium to compute the contribution of the diffuse Sun. Direct transmissivity between successive reflection or scattering events is evaluated in the absorbing volume and cumulated along the path. When the ray hits a surface, the reflectivity of the ground is recovered and termination of the path is sampled accordingly. When a scattering event occurs, local scattering coefficients of the gas mixture and the cloud droplets are recovered, and the species responsible for the scattering is sampled accordingly. Then, the surface or volume event is treated by sampling a new direction of propagation, following the appropriate scattering function (Henyey-Greenstein [HG] for cloud droplets, Rayleigh for gas molecules, and Lambertian for surfaces), and the ray is traced again in this new direction. The HG phase function is used along with the asymmetry parameter and single scattering albedo issued from Mie computations, at the wavelength lying at the center of the narrow band.
It is used instead of the true Mie phase function to prevent convergence issues associated with its strong forward peak within the context of the local estimate method (see, e.g., Marchuk et al., 1980b or Mayer, 2009, for a description of the local estimate, and, e.g., Buras & Mayer, 2011; Iwabuchi & Suzuki, 2009, for solutions to reduce the variance of MC estimators related to the Mie phase function). Following the local estimate, the path weight is updated at each surface and volume event by adding the Sun direct transmissivity from the TOA to the event location, weighted by the probability of reflection or scattering from the Sun direction into the tracked direction and by the transmissivity cumulated along the tracked path from the event location to the camera. The path is terminated when reaching the TOA or upon absorption by the ground or the volume (if the direct transmissivity between two events is null). A schematic illustration of the algorithm is presented in Figure 4, along with an example of a produced image of a cloud field.
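Sampling a new scattering direction from the HG phase function relies on a closed-form inverse CDF for the cosine of the scattering angle. A minimal sketch (standard formula, not the library's code):

```python
import random

# Illustrative sampling of cos(theta) from the Henyey-Greenstein phase
# function with asymmetry parameter g, via the standard inverse-CDF formula:
# cos(theta) = (1 + g^2 - ((1 - g^2) / (1 - g + 2 g xi))^2) / (2 g).

def sample_hg_cos_theta(g, rng=random):
    xi = rng.random()
    if abs(g) < 1e-6:
        return 2.0 * xi - 1.0  # isotropic limit g -> 0
    frac = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
    return (1.0 + g * g - frac * frac) / (2.0 * g)
```

By construction the samples lie in [−1, 1] (ξ = 0 and ξ = 1 map to −1 and +1), and their mean converges to the asymmetry parameter g, which is the defining first moment of the HG phase function.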
4.2 Insensitivity of Computing Time to the Amount of Volumetric Data
 τ̄ = 1: voxels are merged while the residual vertical optical depth of the merged region is less than 1, or
 τ̄ = 0: voxels are never merged; hence, the acceleration grid is at the same resolution as the original data grid.
Fields of radiances are then rendered with the same camera and Sun setup and the same number of pixels and paths per pixel (the resolution of the radiance field is independent of the resolution of the cloud field itself by virtue of the camera abstraction). To measure the performance of the rendering algorithm, each tracked path is timed. As the duration of a path is a random variable, it is treated as such, yielding estimates of the mean and standard deviation of the rendering time per realization, t̄ and σ_{t} respectively. To compare performances for the cloud fields of varying resolution, the times presented in Figure 5 are relative rendering times: the mean rendering time per realization in the given cloud field, relative to the mean rendering time per realization in the original 5-m-resolution cloud field (using τ̄ = 1). The figure shows that the rendering time for computations with merged hierarchical grids (full line) is almost constant, while the rendering time for computations with unmerged hierarchical grids (dashed line) increases exponentially with the resolution of the field due to the increased number of voxel intersections. Sensitivity of the computing time to the merging criterion is further investigated in the next subsection.
4.3 Comparative Tests for Typical Boundary-Layer Cloud Fields
The next performance tests make use of three idealized LES fields representative of the diversity of boundary layer clouds (BLCs): continental cumulus clouds (ARM-Cumulus; Brown et al., 2002) run at 25-m resolution; marine, trade wind cumulus at 25-m resolution (BOMEX; Siebesma et al., 2003); and a stratocumulus case at 50-m resolution (FIRE; Duynkerke et al., 2004). They are less challenging than the previously studied congestus in terms of amount of data (respectively 256×256×160, 512×512×160, and 250×250×70 grid cells), but they are typical of our practice of using high-resolution simulations to study small-scale processes and support the development of parameterizations in larger-scale models. BLCs are of particular interest since they are a frequent regime in time and space and their radiative impact is key to the energetic balance of the Earth system and hence to the evolution of its climate (Bony & Dufresne, 2005). It is important that the acceleration techniques implemented in the library be performant for all types of BLCs. Here, we show how the path-tracing library, through the rendering algorithm presented before, behaves when confronted with various BLCs. Images of these scenes are shown in Figure 6. The renderer is applied to the same cumulus field in Figures 6b and 6c, but the surface is a plane in Figure 6c while it represents a complex terrain in Figure 6b.
For each image, Table 1 gives the image-mean time per realization (path), its standard deviation (computed over all realizations), the total rendering time over 40 threads, and the equivalent speed in number of realizations per second. We have shown that the amount and complexity of surface or volumetric data do not impact the rendering time. Images of pixel-mean rendering times, shown in Figure 7, are used to analyze the differences in rendering times between the various scenes. They highlight the strong contrast between cloudy and cloud-free pixels and between optically thick and thin clouds or parts of clouds. The amount of visible cloud, related to the camera setting, explains the difference in rendering time between ARMCu 1 and 2 (where the cloud field is the same and only the viewpoint and Sun position change). Indeed, cloudy pixels take longer to render than clear-sky pixels because of high-order multiple scattering. The optical thickness of the clouds is another factor that affects the mean path rendering time: Optically thick clouds take longer to render because the number of scattering events is greater than in thin clouds. This is illustrated by the Congestus 5m and ARMCu 2 images, where the number of cloudy pixels is lower in the former, yet the rendering time is almost double that of the latter.
Image  t̄ (μs)  σ_{t} (μs)  Total rendering time  Speed (# paths/s) 

Congestus 5m  110.883  0.005  9 hr 38 min  326,546 
BOMEX  37.255  0.001  2 hr 59 min  1,054,433 
ARMCu 1  105.049  0.0018  8 hr 22 min  375,983 
ARMCu 2  60.425  0.001  4 hr 59 min  631,249 
FIRE  122.061  0.0016  10 hr 01 min  314,049 
 Note. Images were computed with 3 (channels) × 1,280 × 720 (pixels) × 4,096 (paths) = 11,324,620,800 sampled paths, over 40 threads of a CPU clocked at 2.2 GHz. All computations were performed on a supercomputer (BULL DLC B710). Times per realization and their standard deviations σ_{t} are given for one thread. Total rendering time and speed are given for parallel computation over 40 threads.
As stated in section 2.4, the acceleration potential of null collisions used in combination with hierarchical grids depends on a compromise between the cost of traversing the grid (increasing with the hierarchical grid resolution, e.g., when fewer voxels are merged) and the cost of rejecting many null collisions (increasing when too many voxels are merged). This ratio of costs is therefore controlled by the construction strategy of the hierarchical grid. We show how the rendering time, and its partitioning into crossing voxels and rejecting null collisions, are impacted by the optical depth threshold τ̄ used to merge voxels when building the hierarchical grids.
Figure 8a shows that an optimum value for τ̄ seems to lie between 1 and 10 for all tested scenes. For these values, grids are such that one to ten collisions occur on average in each voxel. Although for all cloud fields computations are faster when using an optimum hierarchical grid, fields with smaller volume fractions of cloudy cells seem to benefit more from the hierarchical grids than globally cloudier fields: the rendering time for BOMEX is about 5 times shorter when using the optimum τ̄ than for τ̄ = 0 (more null collisions but fewer intersected voxels), while for FIRE the acceleration ratio is less than 1.5. Looking at the partitioning into (i) crossing and accessing acceleration structure voxels (SVX) versus (ii) accessing raw data and testing collision nature (NCA), Figure 8b shows that, as expected, the optimum strategy for building a hierarchical grid is between the limits of systematically intersecting each voxel (small τ̄) and using a fully homogenized collision field (large τ̄).
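The merging criterion can be illustrated with a deliberately simplified 1D analog; the actual library coarsens a 3D octree, but the logic per column is the same. All names here are hypothetical, and τ̄ is passed as `tau_bar`.

```c
/* Illustrative 1D analog of the merging criterion: walk a column of
 * extinction coefficients k (cell size dz) and close a merged region
 * whenever its accumulated optical depth reaches tau_bar.  A small
 * tau_bar yields many fine regions (more voxel crossings); a large
 * tau_bar yields few coarse regions (more rejected null collisions). */
static int merge_column(const double *k, int n, double dz, double tau_bar) {
  int regions = 0;
  double tau = 0.0;
  for (int i = 0; i < n; i++) {
    tau += k[i] * dz;
    if (tau >= tau_bar) { regions++; tau = 0.0; }
  }
  if (tau > 0.0) regions++; /* last, partially filled region */
  return regions;
}
```

With τ̄ between 1 and 10, each merged region hosts on average one to ten collisions, which is the regime found optimal in Figure 8a.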
5 Outlook and Discussion
 The null-collision method (maximum cross-section) is revisited. It is an unbiased method that consists in artificially homogenizing the medium to simplify the sampling of the next ray-medium interaction. It is presented as a way to bypass Beer's law nonlinearity, which makes the ray-tracing procedure independent of the native data grid; however, on its own this method is not efficient in highly heterogeneous media.
 The novelty is that the null-collision method is used in combination with recursive, hierarchical grids (octrees) inspired from the cinema industry, the purpose of which is to accelerate ray tracing. The computing time becomes independent of the data amount and resolution.
 The benefits of writing and manipulating the integral formulation equivalent to the MC algorithm are highlighted. Simultaneous evaluation of sensitivities (the Jacobian matrix) is given as an example of an algorithm derived from integral reformulation.
 The concept of filtering functions is presented as an abstraction that creates a true separation between the algorithm and the ray-tracing procedure, facilitating the implementation of non-analog integral formulations.
 A free library consisting of several low-level modules associated with distinct MC concepts is available online. One of the modules is dedicated to accelerating ray tracing in surfaces; another to accelerating ray tracing in volumes. Null-collision algorithms (NCAs) and hierarchical grids can be implemented in a flexible way using the library, regardless of the application objective.
 A free renderer that can be used to generate synthetic images of simulated cloud fields is also available online. Such images can serve to assess the realism of high-resolution models, as a tool to analyze cloud-radiation interactions, or in the context of satellite observation. The source code of this application can serve as a guiding example of how to implement other algorithms using the library for the interested physicist.
This library can be used to implement various applications, for instance, to study surface-radiation or cloud-radiation interactions, and to support the development, evaluation, and tuning of parameterizations. Next, other examples of MC algorithms are given to show the potential of the library for further application (section 5.1). The technical state and current limitations of the library are then discussed in section 5.2.
5.1 Other Examples of Implementation for Cloud-Radiation Interaction Studies
The work reported within this paper was initiated in the context of a study on 3D radiative effects of BLCs, with the aim of better understanding them and helping to improve their representation in largescale models. To that end, MC algorithms evaluating metrics other than radiance fields were developed and implemented, using older versions of the library. An example is illustrated here to show the potential for broader use of the library, beyond the rendering application.
In this example, solar radiative transfer is simulated through BLCs (the eighth hour of the ARM-Cumulus LES) at various solar zenith angles (SZA). Reference 3D MC results are compared to computations performed by the radiation scheme ecRad (Hogan & Bozzo, 2018). Accurate predictions of the partitioning of the surface solar flux into its direct and diffuse components are in increasing demand, as they are important for various applications, such as solar energy and photosynthesis by vegetation (which in turn relates to the carbon cycle of the Earth and thus to its climate). Since biases of opposite signs on the diffuse and direct components might compensate each other and still yield an accurate prediction of the total flux, the ratio of direct-to-total surface fluxes is used as the target metric in this comparison.
In the broadband solar forward MC, horizontally and spectrally integrated downward direct, diffuse, and total fluxes are output at the surface. Paths contribute to the diffuse flux if they have been scattered or reflected at least once, and otherwise to the direct flux. To allow comparison, wavelengths are sampled according to the Rapid Radiative Transfer Model for GCMs (RRTMG; Iacono et al., 2008; Mlawer et al., 1997) k-distribution model, in the solar interval ([820–50,000] cm^{−1}). Input gas profiles are taken from the I3RC cumulus case file provided with the ecRad package. Only vertical variations of gas absorption coefficients are considered. Possible solver choices implemented in ecRad include Tripleclouds, a 1D two-stream solver that represents subgrid horizontal variability of the medium by defining three regions in each layer (Shonk & Hogan, 2008), and the SPARTACUS solver (Hogan & Shonk, 2013; Hogan et al., 2016; Schäfer et al., 2016), which is based on Tripleclouds but additionally represents the effect of subgrid horizontal transport on the vertical fluxes (3D effects).
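The spectral sampling step above reduces to inverting a discrete cumulative distribution built from the quadrature weights. A minimal sketch (illustrative names, not the library's API):

```c
/* Sample a spectral quadrature point from its normalized weights w by
 * inverting the discrete cumulative distribution at uniform deviate u;
 * the selected absorption coefficient is then used as if the transfer
 * were monochromatic. */
static int sample_quadrature_point(const double *w, int n, double u) {
  double cdf = 0.0;
  for (int i = 0; i < n; i++) {
    cdf += w[i];
    if (u < cdf) return i;
  }
  return n - 1; /* guard against rounding when u is close to 1 */
}
```

For the few tens of RRTMG quadrature points a linear scan is adequate; a precomputed alias table would be preferable for much larger spectral grids.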
In two-stream solvers, the direct/diffuse partition is biased by the use of delta-scaling approximations (Potter, 1970). This approximation is widely used in the presence of liquid clouds to correct their otherwise overestimated reflectivity: using only two slantwise directions to propagate diffuse fluxes fails to represent the fact that clouds scatter a large amount of radiation in a very small solid angle around the forward direction, which tends to enhance their transmissivity. In this approximation, the phase function is truncated, and the optical depth and asymmetry parameter are scaled in compensation. With the appropriate scaling, this leads to a correct estimation of the total flux, but the scaled direct flux is larger than the unscaled (physically correct) direct flux. After evaluating the total flux using scaled parameters, some models perform one additional simulation using unscaled parameters in order to compute the physical direct flux. Quantifying the error in the scaled direct flux could help derive solutions to correct it instead of running the radiative scheme again. Two MC simulations are presented to assess the impact of the delta-scaling approximation on direct fluxes: one using the true Mie phase function and one using the HG phase function with scaled asymmetry parameter and scattering coefficients, following the delta-Eddington model (Joseph et al., 1976).
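The delta-Eddington rescaling is short enough to state explicitly; the relations below are the standard ones of Joseph et al. (1976), with truncated fraction f = g², written here as a sketch.

```c
/* Delta-Eddington scaling (Joseph et al., 1976): a fraction f = g^2 of
 * the phase function is truncated into the forward peak, and the
 * optical depth tau, single scattering albedo omega, and asymmetry
 * parameter g are rescaled in compensation. */
struct de_scaled { double tau, omega, g; };

static struct de_scaled delta_eddington(double tau, double omega, double g) {
  struct de_scaled s;
  double f = g * g;
  s.tau = (1.0 - omega * f) * tau;            /* reduced optical depth  */
  s.omega = (1.0 - f) * omega / (1.0 - omega * f);
  s.g = g / (1.0 + g);                        /* reduced asymmetry      */
  return s;
}
```

Because the scaled optical depth is smaller than the true one, the scaled direct transmission exp(-τ'/μ₀) exceeds the physical exp(-τ/μ₀), which is precisely the direct-flux overestimation quantified in Figure 9.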
The direct-to-total flux ratio at the surface is plotted in Figure 9 as a function of SZA. The effective cloud cover increases when the Sun is low in the sky; hence, much of the direct beam is intercepted by cloud edges in addition to cloud tops. In 1D (Tripleclouds, black dash-dotted line), ecRad fails to represent this loss of direct flux at large SZA. When 3D effects are included, however (SPARTACUS, black dashed line), ecRad agrees very well with the 3D MC computation that uses the same assumptions (delta-scaling, red full line). As expected, using the delta-Eddington approximation (scaled optical depth) in MC computations yields an overestimated direct flux at the surface (red full line vs. blue full line). In the operational ecRad configuration, delta-scaling and ignoring 3D effects both work to overestimate the direct flux at the surface; therefore, estimates of the direct/diffuse partition should be exploited with caution or corrected in relevant applications.
Other null-collision MC algorithms, such as a backward algorithm that simultaneously estimates the local monochromatic ground flux density and its derivative with respect to the single scattering albedo of cloud droplets, and a forward algorithm that keeps track of horizontal distances traveled by MC photons, were also implemented using the library. They are only mentioned here as other examples of applications used in studies of 3D radiative processes, such as scattering through cloud sides and entrapment (Hogan et al., 2019).
5.2 State of the Library and Current Limitations
We distribute the free library online, together with atmospheric data and a rendering code that produces synthetic images of cloud fields. It is coded in C, for CPU technology. Each module of the library exposes its functionalities through standard C interfaces that can be easily bound to other languages (e.g., Fortran 2003 and beyond). Part of the library is based on Embree. The low-level modules (detailed in Appendix D) are elementary bricks that implement well-separated concepts and are easily maintained: They are bound to evolve as needs for improvement arise.
While computing time is now insensitive to the amount of data, the construction of the grids is not: One needs to browse through the data in order to precompute the grids; hence, for huge data sets the cost of construction might overwhelm the cost of iterating over path samples. However, there is much room for improvement and optimization in this procedure. For instance, the number of grids to construct could be reduced by combining varying optical properties across the spectrum into a unique structure (in the current version, one hierarchical grid is constructed per spectral quadrature point).
In order to handle large data sets that do not fit into main memory, the out-of-core paradigm should be adopted for the whole library: All the data should be stored on disk and (un)loaded on demand. Both the raw data and the acceleration grids were conceived with this objective in mind. However, this will only be efficient if the algorithms that request the data are designed according to their out-of-core nature. For instance, the strategy implemented in Hyperion, Disney's out-of-core renderer (Burley et al., 2018), consists of tracking paths in bundles instead of individual rays, thus making intensive use of the loaded data before unloading it when memory space runs out.
Algorithmic developments could be undertaken to improve the convergence of the estimators described in our examples. We do not expect any technical difficulties in implementing existing or new solutions to, for example, the convergence issues related to the peaked Mie phase function in solar algorithms using the local estimate (Iwabuchi & Suzuki, 2009; Buras & Mayer, 2011). Further work is certainly needed to better understand which strategy is most appropriate for building the grids, depending on the cloud field, its spectral properties, and the algorithm. The treatment of ice crystals, aerosols, or varying liquid droplet size distribution would require extending the library to load additional 3D fields. We do not expect any technical difficulties here either. Our focus until now has been on the raytracing procedure. Further developments should yield a more comprehensive toolbox capable of handling more complex atmospheric fields.
Acknowledgments
We are sincerely grateful to Robert Pincus and two anonymous reviewers for their fruitful comments and feedback, thanks to which the originally submitted manuscript was greatly improved (versions prior to revision are available on arXiv: https://arxiv.org/abs/1902.01137). Our many thanks also go to F. Brient for providing us with the FIRE stratocumulus LES field, to C. Strauss, D. Ricard, and C. Lac for providing us with the 5-m resolution congestus LES field, and to C. Coustet for useful discussion. We acknowledge support from the Agence Nationale de la Recherche (ANR, grants HIGH-TUNE ANR-16-CE01-0010 [http://www.umr-cnrm.fr/high-tune] and MCG-RAD ANR-18-CE46-0012), from the French Programme National de Télédétection Spatiale (PNTS-2016-05), from Région Occitanie (Projet CLE 2016 EDStar), and from the French Ministry of Higher Education, Research and Innovation for the PhD scholarship of the first author. The data and sources described in this paper are available online (https://www.meso-star.com/projects/high-tune/high-tune.html).
Appendix A: Brief History of Path Tracing in Surfaces and Volumes
The content of this appendix is not a rigorous review. Our understanding of the history of path tracing inside scenes involving large geometric models of complex surfaces is briefly summarized, with specific attention paid to the computer science literature devoted to physically based rendering, which indeed addresses the very same radiative transfer equation as ours (section A.1). Recent developments made in the handling of complex volumes by both this community and the engineering physics community (for infrared heat transfer and combustion studies) are then listed in section A.2. Based on our understanding of this literature, a non-comprehensive comparative table of the state of the art of both communities—computer graphics and atmospheric radiative transfer—is presented in section A.3.
A.1 Path Tracing and Complex Surfaces
Image synthesis is the science that aims to numerically produce images from descriptions of scenes. It was developed in the 1970s, when the field of computer graphics started to expand. At first, the focus was on surface rendering, often assuming that the objects in a scene were surrounded by vacuum. Among the diverse existing techniques, we mention here only a few that gradually led to the use of MC-based path-tracing methods to render 3D scenes. Methods that were dominant in practice (e.g., micropolygon rendering or rasterization) are missing from this text, and we refer the interested reader to more complete presentations of the field's history, for example, in section 1.7 of Pharr and Humphreys (2018).
The initial concern was to determine which objects in a scene were visible from a given point of view. Appel (1968) first introduced the ray casting method as a general way to solve the hidden surface problem, by casting rays from the camera to the objects in the scene and detecting intersections. This opened up a whole field of investigation dedicated to optimizing intersection tests between rays and large numbers of primary shapes (see Wald et al., 2001; Wald, 2004; Wald et al., 2014, and references therein).
The next question was to determine how these visible surfaces were illuminated by sources and other surfaces, which was referred to as the global illumination problem. Whitted (1980) first used ray tracing and random sampling around optical directions to correct the unrealistically sharp gradients of intensity due to otherwise perfectly specular reflections. Cook et al. (1984) then generalized this approach to multivariate perturbations in the distributed ray tracing method. This was the first algorithm able to render all the major realistic visual effects in a coherent way.
A couple of years later, Kajiya (1986) developed the formal framework of the rendering equation, the integral formulation of the radiative transfer equation in vacuum, focused on light-surface interactions. His path tracing model was the first unbiased scene renderer to be based on MC ray tracing. While revisiting this proposal, Arvo and Kirk (1990) found inspiration in the experienced community of particle transport sciences, where MC methods were already commonly used and studied. They introduced variance reduction techniques to the image rendering community.
Another important step toward efficiency was Veach's pioneering thesis (Veach, 1998). Using his mathematical background, he introduced a new paradigm in which radiative quantities were formally expressed as integrals over a path space, decoupling the formulation from the underlying physics: The formulations were no longer analog (i.e., based on intuitive pictures of the stochastic physics of particle transport). This allowed him to explore sampling strategies in full generality and to then apply them to path tracing, giving birth to several low-variance algorithms such as Bidirectional Path Tracing (Veach & Guibas, 1995) and Metropolis Light Transport (Veach & Guibas, 1997).
 It was eventually perceived that MC methods allow independence between the rendering algorithm and the complexity of the scene, thus providing artists with unprecedented freedom.
 They allow a unified, physical representation of the interaction of light with surfaces, removing the need for artists to modify surface properties in order to achieve a specific effect, since they could now rely on physics.
 Improvement of filtering methods has allowed cheap image denoising, thus bypassing the need for more expensive, well-converged MC simulations.
A.2 Path Tracing and Complex Volumes
A major difficulty in MC methods is the treatment of complex heterogeneities in volumes, for example, cloudy atmospheres. For decades, the computer graphics industry handled the question of volumes in much the same way as other MC scientists; their expertise in designing performant ray-tracing tools reached its limits in dealing with volume complexity. In section 2, it is claimed that the issue resides in the nonlinearity of Beer's law of extinction: The expectation of a nonlinear function of an expectation can no longer be seen as one expectation only. The method of null collisions can be seen as a way to bypass Beer's nonlinearity.
In neutron transport, this method was first described by Woodcock et al. (1965) under the name Woodcock tracking. In plasma simulations, it first appeared in Skullerud (1968). Soon after, Coleman (1968) gave a mathematical justification for this method, demonstrating its exactness. In the atmosphere, it was first published by Marchuk et al. (1980b) under the name maximum cross-section. Koura (1986) developed it for rarefied gas under the name null collisions. The computer graphics community has used it as Woodcock tracking, for the first time in Raab et al. (2006).
Only with Galtier et al.'s (2013) seminal paper did it become clear that null-collision methods allowed a reformulation of the integral solution to the radiative transfer equation in which the difficulties related to the nonlinearity of Beer's law disappear: The data-algorithm independence, also strongly highlighted by Eymet et al. (2013), is not a consequence of introducing null collisions, but rather of the underlying integral reformulation.
This explicit framework opened doors to new families of MC algorithms, with potential for solving various problems that were until then considered impossible: nonlinear models (Dauchet et al., 2018), coupled radiation-convection-conduction in a single MC algorithm (Fournier et al., 2016), energetic state transitions sampled from spectroscopy instead of approximate spectral models (Galtier et al., 2016), symbolic MC in scattering media (Galtier et al., 2017), etc. Some of these methods are transposable to atmospheric radiative transfer with large benefits for our community, for example, conducto-radiative MC models to investigate atmosphere-city interactions, or line-sampling methods for benchmark spectral integration, to develop, tune, and test spectral models. Over the past few years, the computer graphics community has been similarly impacted by this new paradigm. Kutz et al. (2017) show how integral formulations of NCAs can be used to derive more efficient free-path sampling techniques. Novák et al. (2018) provide a good review of the different free-path sampling methods, with a focus on NCAs and their newly perceived interest: Acceleration structures that were already used for surfaces could now be used for volumes.
A.3 Comparison of the Computer Graphics and Atmospheric Science Literatures
A non-comprehensive summary of contributions from the computer graphics and atmospheric radiation communities is presented in Table A1. Only the techniques related to the library are cited. Other techniques, such as variance reduction methods, are mentioned in the text but do not appear in Table A1.
Method  Computer graphics  Atmospheric radiation 

Null-collision algorithms  Woodcock tracking  Maximum cross-section 
(Raab et al., 2006)  (Marchuk et al., 1980b)  
Acceleration for surfaces  Bounding Volume Hierarchy  No standard 
(Wald et al., 2014)  (Mayer et al., 2010)  
(Iwabuchi & Kobayashi, 2006)  
Acceleration for volumes  Octrees  No standard 
(Burley et al., 2018)  (Iwabuchi & Okamura, 2017)  
Memory management  Out-of-core  — 
(Baert et al., 2013) 
Appendix B: Setup of Rendered Scenes
Table B1 describes the setups of the scenes shown in section 4.
Sun  Camera  

Zenith  Azimuth  Position (km)  Target (km)  FOV  Boundary  
Scene  θ (°)  ϕ (°)  X  Y  Z  X  Y  Z  (°)  conditions 
Congestus 5m  25  230  −2.89  1.98  0.80  7.90  2.14  2.36  60  Open 
BOMEX  40  0  2.22  3.68  1.49  8.21  4.47  −0.39  70  Cyclic 
ARMCu 1  60  225  10.24  0.61  0.42  −2.98  6.83  0.84  30  Cyclic 
ARMCu 2  85  130  4.66  0.97  0.83  0.45  7.05  1.58  70  Open 
FIRE  65  340  −3.06  11.70  3.80  10.86  3.68  0.47  70  Cyclic 
 Note. All images shown are constituted of 1,280 × 720 pixels and rendered using 4,096 paths per pixel component, with three components per pixel. All scenes use the same Mie and clear-sky data. Boundary conditions apply to the 3D LES domain that is embedded in a 1D atmosphere. The Sun azimuth angle origin is at X>0, Y=0 (to the east) and oriented to the north. FOV stands for field of view. Position and target point values were rounded for readability. The data and files describing the scenes are distributed in the starter pack, available online. LES = large-eddy simulation.
Appendix C: Physical and Optical Properties of the Cloudy Atmosphere
As mentioned in the text, our MC codes handle liquid clouds and atmospheric gas, whose contents and optical properties we describe below. These data are provided with the library. The only particularity in the implementation of the low-level libraries themselves is that, because 3D cloud fields are embedded in 1D gas profiles, the sky module combines the 3D and 1D data wherever the domains intersect and then uses low-level procedures to build the hierarchical structures.
C.1 Physical Properties of the Atmosphere
C.1.1 Clear Sky
The clear-sky atmospheric column is described from ground to space by vertical profiles of temperature, pressure, water vapor mixing ratio, and a mix of other gases (CO_{2}, CH_{4}, N_{2}O, CFC-11, CFC-12, O_{2}, and O_{3}). The I3RC cumulus case profiles provided with the ecRad package (the radiative transfer model developed at the ECMWF; Hogan & Bozzo, 2018) are used.
C.1.2 Clouds
The realistic 3D cloud fields are produced by the Méso-NH model (Lac et al., 2018; Lafore et al., 1997) used in LES mode, at resolutions between 5 and 50 m. The subgrid microphysics is a bulk, one-moment scheme (ICE3; Caniaux et al., 1994). No subgrid cloud scheme is used; that is, the cells are assumed to be homogeneously filled with condensed water when saturation is reached. The 3D turbulence scheme (Cuxart et al., 2000) is closed with a mixing length based on Deardorff (1980). The model outputs include 3D fields of liquid and vapor water mixing ratio, potential temperature, and pressure.
C.2 Optical Properties of Gas and Clouds
C.2.1 Gas Molecules
The radiative properties of the atmospheric column are computed via the ecRad software, which we use as a front end for the production of the Rapid Radiative Transfer Model for GCMs (RRTMG; Iacono et al., 2008; Mlawer et al., 1997) k-distribution profiles for 16 spectral intervals in the longwave ([10–3,500] cm^{−1}) and 14 spectral intervals in the shortwave ([820–50,000] cm^{−1}). Each quadrature point is provided with a quadrature weight that is used by our algorithms as a probability for the sampling of absorption coefficient values, which are then practically used as if radiative transfer were monochromatic. As the impact of the horizontal variations of temperature and pressure on the absorption is negligible in solar computations, absorption coefficient profiles are computed from vertical profiles of horizontally averaged temperature and pressure fields. The effect of water vapor variations on the absorption is parameterized using the fact that the absorption coefficients of the gas mixture are roughly linear (in log-log space) in the water vapor molar fraction. The ecRad software is used in a preliminary step to compute and tabulate absorption and scattering coefficients for the 1D atmosphere, for each spectral interval, quadrature point, atmospheric layer, and value of the water vapor molar fraction in a given discretized range. The resulting lookup table is then used within the MC algorithm to retrieve the local k values. Details describing the model and the interpolation procedure are given in the supporting information. The maximum relative error between two profiles computed analytically from RRTMG versus interpolated absorption coefficients is around 1.2%, which is around half the maximum relative error found between profiles computed by ecRad versus analytically, both from RRTMG data (2.6%).
C.2.2 Cloud Droplets
The method developed by Mishchenko et al. (2002), implemented in Fortran as in Mishchenko et al. (1999), is used to solve far-field light scattering by spherical particles using the Lorenz-Mie theory. The main assumptions are that droplets are homogeneous and that polarization is ignored. As with ecRad for gaseous absorption, this code is used externally to compute the single scattering albedo, the extinction, scattering, and absorption coefficients, the asymmetry parameter, and the phase function, all of which are averaged over the size distribution. We also compute the cumulative phase function and its inverse to allow efficient sampling of scattering directions. The MC algorithm accesses these data via lookup tables and performs spectral averaging over the narrow bands used in the k-distribution described above: The Mie data are uncorrelated from the gas spectral data, and the same lookup table can be used with various spectral models. The specific table used for the simulations of section 4 is available as a NetCDF file in the starter pack (https://www.meso-star.com/projects/high-tune/starter-pack.html). The size distribution is lognormal, with an effective radius of 10 μm and a standard deviation of 1 μm.
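Sampling a scattering direction from an inverse cumulative phase function can be illustrated with the HG case, where the inversion is analytic (for the tabulated Mie data, the same principle applies with a table lookup in place of the closed form); this sketch is illustrative, not the library's API.

```c
#include <math.h>

/* Sample the cosine of a scattering angle from the Henyey-Greenstein
 * phase function by analytic inversion of its cumulative distribution,
 * given a uniform deviate u in [0,1). */
static double hg_sample_cos(double g, double u) {
  if (fabs(g) < 1e-6) return 1.0 - 2.0 * u;    /* isotropic limit */
  double s = (1.0 - g * g) / (1.0 - g + 2.0 * g * u);
  return (1.0 + g * g - s * s) / (2.0 * g);
}
```

The azimuth is then sampled uniformly in [0, 2π), and the new propagation direction is built in the frame of the incoming ray.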
Appendix D: Description of the Set of Libraries
- Low-level modules (random sampling, surface and volume data structuring and ray tracing, and scattering), implemented as libraries and forming the generic development environment, available at https://gitlab.com/mesostar/starengine/. They implement true abstractions of MC concepts that can be used regardless of the scientific field of application, but their level of abstraction means that mastering them requires some time and investment;
- Data-oriented modules (3D atmospheric fields, and cloud and gas optical property data), also implemented as libraries, although not directly available in the development environment as they are already oriented toward atmospheric applications. Using these modules requires the user to produce data in the same format as ours. Other data-oriented modules can be developed to interface new input data with higher-level modules;
- Application-oriented modules (sky, ground, camera, and Sun), not implemented as libraries, developed in the context of the renderer application. They can be used for other projects implementing atmospheric radiative transfer models; the sky module in particular implements the construction of the hierarchical structures for the volume data loaded via the data-oriented modules.
Table D1. Module names, descriptions, and example functions.

Low-level (https://gitlab.com/mesostar/starengine/)
- StarSamPle (ssp): generate reproducible sequences of pseudorandom numbers (compatible with parallelization), sample and evaluate various probability density functions. Examples: ssp_rng_canonical; ssp_ran_exp_pdf; ssp_ran_hemisphere_cos
- Star3D (s3d): define shapes, attach them to a scene, trace rays in the scene, filter hits. Examples: s3d_scene_create; s3d_scene_view_trace_ray; s3d_hit_filter_function_T
- StarVoXel (svx): define voxels, partition them into a hierarchical structure (tree), trace rays in the tree, filter hits. Examples: svx_octree_create; svx_tree_trace_ray; svx_hit_filter_T
- StarScatteringFunctions (ssf): set up, sample, and evaluate scattering functions for surfaces and volumes. Examples: ssf_specular_reflection_setup; ssf_phase_sample; ssf_fresnel_eval

Data-oriented (https://www.mesostar.com/projects/hightune/hightune.html)
- HighTune: Cloud Properties (htcp): describe 4D atmospheric fields. Example: les2htcp (bin)
- HighTune: Mie (htmie): describe the optical properties of water droplets. Examples: htmie_fetch_xsection_scattering; htmie_compute_xsection_absorption_average
- HighTune: Gas Optical Properties (htgop): describe the optical properties of atmospheric gas mixtures. Examples: htgop_get_sw_spectral_interval; htgop_layer_lw_spectral_interval_tab_fetch_ka

Application-oriented (https://www.mesostar.com/projects/hightune/man/man1/htrdr.1.html)
- htrdr_sky: build the acceleration grid for the atmospheric volume data (3D clouds embedded in 1D gas) in the context of null-collision algorithms, trace rays in the atmospheric volume, access null-collision and raw data. Examples: htrdr_sky_create; htrdr_sky_fetch_raw_property; htrdr_sky_fetch_svx_property; htrdr_sky_trace_ray
- htrdr_ground: build the scene and acceleration structure from an input OBJ file describing the ground as a set of triangles, trace rays in the scene. Examples: htrdr_ground_create; htrdr_ground_trace_ray
- htrdr_sun: implement a Sun model, sample the solar cone, access Sun data. Examples: htrdr_sun_create; htrdr_sun_sample_direction; htrdr_sun_get_radiance
- htrdr_camera: implement a pinhole camera model, trace a ray originating from the camera lens. Examples: htrdr_camera_create; htrdr_camera_ray
In addition, an application (htrdr) was developed that uses the different modules to implement an MC algorithm. Typical functions associated with the different modules are cited as illustrations in Table D1. The sources can be downloaded online (https://www.mesostar.com/projects/hightune/hightune.html), and user guides are provided on the website. A starter pack with the data and scripts necessary to reproduce the examples of section 4 is also provided. The setup of the scenes is summarized in Table B1. However, the most useful user guide for the interested reader is the commented code that implements the renderer using the various functions of Table D1. Indeed, this code was in part developed to illustrate the use of the different libraries and modules, to serve as a basis for further developments, and as an example for implementing new algorithms.
Specific to the rendering application, software (htpp) was developed to convert spectral radiances from the colorimetric space that models human color vision (XYZ) into sRGB images. It is distributed with the library, and documentation describing the conversion process is available at https://www.mesostar.com/projects/hightune/man/man1/htpp.1.html.
To test these tools in the context of multiple scattering, we implemented several benchmark experiments and compared our calculations against published results, for example, Table 1 of Galtier et al. (2013), and against the solution of the well-validated 3DMCPOL (Cornet et al., 2010) on the IPRT cubic cloud case (Emde et al., 2018; see supporting information). Agreement was found within the MC statistical uncertainty, thus validating our implementation.