The CNRM Global Atmosphere Model ARPEGE‐Climat 6.3: Description and Evaluation

The present study describes the atmospheric component of the sixth‐generation climate models of the Centre National de Recherches Météorologiques (CNRM), namely, ARPEGE‐Climat 6.3. It builds up on more than a decade of model development and tuning efforts, which led to major updates of its moist physics. The vertical resolution has also been significantly increased, both in the boundary layer and in the stratosphere. ARPEGE‐Climat 6.3 is now coupled to the new version (8.0) of the SURFace EXternalisée (SURFEX) surface model, in which several new features (e.g., floodplains, aquifers, and snow processes) improve the water cycle realism. The model calibration is discussed in depth. An amip‐type experiment, in which the sea surface temperatures and sea ice concentrations are prescribed, and following the CMIP6 protocol, is extensively evaluated, in terms of climate mean state and variability. ARPEGE‐Climat 6.3 is shown to improve over its previous version (5.1) by many climate features. Major improvements include the top‐of‐atmosphere and surface energy budgets in their various components (shortwave and longwave, total and clear sky), cloud cover, near‐surface temperature, precipitation climatology and daily‐mean distribution, and water discharges at the outlet of major rivers. In contrast, clouds over subtropical stratocumulus decks, several dynamical variables (sea level pressure, 500‐hPa geopotential height), are still significantly biased. The tropical intraseasonal variability and diurnal cycle of precipitation, though improved, remained area of concerns for further model improvement. New biases also emerge, such as a lack of precipitation over several tropical continental areas. Within the CMIP6 context, ARPEGE‐Climat 6.3 is the atmospheric component of CNRM‐CM6‐1 and CNRM‐ESM2‐1.

In the following, we describe the latest version of ARPEGE-Climat, Version 6.3 (ARPEGE-Climat 6.3 hereafter), which is based on Cycle 37 of ARPEGE/IFS (declared in late 2010). Even though several components of ARPEGE-Climat are shared with its NWP counterpart, the CNRM climate group has developed and implemented over almost the past decade several new elements in ARPEGE-Climat, mostly with regard to the atmospheric and land surface physics. Besides, compared to its previous version, ARPEGE-Climat 5.1 (based on Cycle 32 of ARPEGE/IFS), this period was also taken as an opportunity to catch up with several recent developments and updates introduced since the 2000s in the ARPEGE NWP operational version (especially for the representation atmospheric turbulence and microphysics). This is part of the seamless strategy of the CNRM modeling effort.
The form of climate model documentation in the referred literature is challenging and the level of detail that can (or should) be reached varies significantly among models and journals. Of course, users of climate models and of their output should interact with model developers, for instance, as soon as they address a specific model behavior or aspect related to its parameterizations, to ensure (i) a correct understanding of the provided diagnostics and model physical content and hypotheses and (ii) appropriate feedbacks to the model development (and improvement) process. But scientific studies, which make use of a climate model either as a modeling tool or as part of a multimodel framework (e.g., through CMIP exercises), also require well-documented and peer-reviewed references to understand and emphasize model specificities. Such references are also crucial for modeling groups to document and share the key choices they make along the development of their model, which can be discussed and hopefully better justified along the peer-reviewed process and subsequent comments from the scientific community.
As a consequence, the main objective of this paper is to provide a comprehensive overview of the physical content of ARPEGE-Climat 6.3 and to illustrate the main properties of its simulated climate, when the model is forced by observed sea surface temperatures (SSTs) and sea ice concentrations (SICs). Such a detailed presentation for the ARPEGE-Climat model family has not been done in the refereed literature since Déqué et al. (1994), and ARPEGE-Climat was mostly documented in the gray literature as part of scientific and technical reports. Section 2 briefly introduces ARPEGE-Climat 5.1 before emphasizing retrospectively its evolution toward ARPEGE-Climat 6.3, in terms of both parameterization developments and their integration in the model. Section 3 comprehensively describes the ARPEGE-Climat 6.3 dynamics and physics. Its land surface component is more briefly presented as Decharme et al. (2019) extensively address it. Section 3 also details the tuning strategy. The simulation, which is then used to assess the performance of ARPEGE-Climat 6.3 to capture the main features of the observed climate, is introduced in section 4. This simulation corresponds to the CMIP6 Diagnostic, Evaluation and Characterization of Klima (DECK) Atmospheric Model Intercomparison Project (amip) simulation Eyring et al. (2016). It is compared to the equivalent ARPEGE-Climat 5.1 simulation produced for CMIP5, in order to document the progresses that have been achieved, and also the deficiencies that continue to be rooted in the ARPEGE-Climat sequence of models. Section 5 focuses on the global energy budget of the model, section 6 on the model mean climate, and section 7 on the model climate variability. Section 8 finally summarizes the behavior of ARPEGE-Climat 6.3 and provides a few priorities for ongoing ARPEGE-Climat improvements. Note that ARPEGE-Climat 6.3 is the atmospheric component of the CNRM ocean-atmosphere climate model CNRM-CM6-1  and of the Earth System Model CNRM-ESM2-1 (Séférian et al., 2019).

From Version 5.1 to Version 6.3 2.1. ARPEGE-Climat 5.1 Brief Overview
ARPEGE-Climat 5.1 is the atmospheric component of CNRM-CM5.1 and is briefly described in Voldoire et al. (2013). It is based on Cycle 32 of the ARPEGE/IFS code. Its dynamical core and its radiation transfer, orographic gravity wave drag parameterizations have only marginally evolved in Version 6.3 and thus will be described in section 3. As the major developments toward ARPEGE-Climat 6.3 mostly addressed the moist physics, namely, the turbulence, microphysics and convection parameterizations, we only emphasize these components for ARPEGE-Climat 5.1.
Since the early years of ARPEGE-Climat (Déqué et al., 1994), deep convection has been parameterized following the work of Bougeault (1985). The scheme follows a traditional mass-flux approach. Deep convection is triggered when total (resolved plus subgrid-scale) moisture convergence occurs in the low levels and if the moist static energy atmospheric profile is unstable. In that case, convection adjusts the unstable profile to a cloud profile close to a moist adiabat with constant entrainment. Following Kuo (1965), the convective closure assumes that the total moisture convergence in the column is precipitated or detrained in the convection environment. Turbulence aims at computing subgrid-scale vertical fluxes of dry static energy, water vapor and momentum. The turbulent diffusion coefficients are proportional to a mixing length parameterized through the quadratic profile of Lenderink and Holtslag (2004) and to the square root of the subgrid turbulent kinetic energy (TKE). The latter is computed based on a diagnostic budget equation, which incorporates the water vapor and liquid water effects (Galperin et al., 1988;Mellor & Yamada, 1974;1982;Yamada & Mellor, 1975). This scheme is coupled to the subgrid-scale cloud and condensation parameterization of Bougeault (1981Bougeault ( ,1982 and Ricard and Royer (1993), as it provides the standard deviation of the local saturation deficit and thus determines the variance of its exponential-Gaussian mixed distribution (see also hereafter as this parameterization has been only slightly updated in ARPEGE-Climat 6.3). Microphysical autoconversion and collection processes are parameterized with the simple approach of Smith (1990). Evaporation of precipitation is diagnosed with the Kessler (1969) formulation. ARPEGE-Climat 5.1 is coupled with the surface through the SURFEX platform , which includes surface schemes for natural land, inland water and open ocean. Natural land surfaces are represented with the "Interaction between Soil, Biosphere and Atmosphere" (ISBA) model (Noilhan & Planton, 1989;Noilhan & Mahfouf, 1996). It uses the so-called force-restore method to compute the surface energy and water budgets and a one-layer snowpack scheme (Douville et al., 1995). As they only marginally evolve in ARPEGE-Climat 6.3, the atmosphere couplings with inland water bodies and the ocean are described in section 3.3.

Model Developments Toward Arpege-Climat 6.3
The early stages of the physics now operational in ARPEGE-Climat 6.3 dates back to the late 1990s. It started with the endeavor to improve the turbulent mixing in the boundary layer, which was often too strong and too intermittent in ARPEGE-Climat previous version Siebesma et al., 2004). The first attempt was to implement the turbulent parameterization based on a TKE pronostic equation and used in the PERIDOT model (also developed at Météo-France; Bougeault & Lacarrère, 1989;Therry & Lacarrère, 1983). Its dry formulation was however soon found to be a severe limitation for its use in the CNRM climate model. In the meantime, Cuxart et al. (2000) developed a new TKE-based turbulent parameterization incorporating moist thermodynamics, and which was first targeting both mesoscale and large-eddy simulation (LES) models (e.g., the Meso-NH model; Lafore et al., 1998). In the early 2000s, under the international dynamics of the Working Group 1 of the Global Energy Water cycle EXperiment (GEWEX) Cloud-System Study (GCSS; Browning, 1993), the European funded EUROCS (European Cloud Systems) project, and their overarching goal to improve the representation of clouds in climate models, the TKE scheme of Cuxart et al. (2000) was implemented in ARPEGE-Climat and further tested and evaluated in the now widely used single-column model (SCM)-LES framework (e.g., . The addition of a more detailed description of microphysical processes (Lopez, 2002) and that of a mass-flux shallow convection scheme (Bechtold et al., 2001) strongly improved the model behavior over shallow cumulus regimes (Lenderink

Dynamical Core
The dynamical core of ARPEGE-Climat 6.3 is derived from Cycle 37 of the ARPEGE/IFS system developed jointly by Météo-France and ECMWF. The dynamical core resolved the vorticity and divergence form of the primitive equations, with temperature T and surface pressure logarithm being the thermodynamic state variables. It also computes the advection of specific humidity q v , eight microphysical species (see Sections 3.2.3 and 3.2.4), ozone and possibly other tracers. The primitive equations follow the standard hydrostatic and thin layer hypotheses. Linear terms are computed using a spectral transform on the sphere operating at a T127 triangular truncation, while all nonlinear terms (and physical tendencies) are computed on the associated reduced Gaussian grid (Hortal & Simmons, 1991), equivalent to a spatial resolution of about 150 km in both latitude and longitude. ARPEGE-Climat 6.3 is a "high-top" model, with 91 vertical levels following a progressive hybrid σ-pressure coordinate system (Simmons & Burridge, 1981). The model vertical resolution is illustrated and compared to that of ARPEGE-Climat 5.1 (31 levels) in Figure 1. The first and last model full levels are near 6 m and 82 km (or 0.01 hPa), respectively (30 m and 33 km in ARPEGE-Climat 5.1). ARPEGE-Climat 6.3 vertical resolution ranges from 20 to 200 m in the boundary layer below 1 km, around 400-500 m in the free troposphere and near the tropopause (the resolution is slightly finer there than in the Horizontal diffusion is required for three main reasons: (i) sufficiently dampen the accumulation of kinetic energy at the smallest scales resolved by the model, (ii) efficiently absorb gravity waves that propagate vertically up to the top of model, and (iii) crudely represent unresolved subgrid-scale mixing through some kind of eddy viscosity. As a result, horizontal diffusion is generally key to stabilize the model and keep long time steps for model integration. The horizontal diffusion calibration is rather empirical. Its effects are tentatively minimized, especially for the larger scales, using a simple linear and implicit diffusion scheme of order 2r = 6. In ARPEGE-Climat 6.3, it is applied in the spectral space to the main model prognostic variables X (vorticity, divergence, temperature, specific humidity, and ozone), along the hybrid coordinate surfaces. Its form reads where a is the Earth radius, N s = 127 the model truncation, τ a diffusive time scale, which corresponds to the e-folding time of the largest wave number, and h X a tunable coefficient of order Oð1Þ, specific to each model state variable. The f(p) is constant equal to 1 below approximately 50 hPa. Above, it increases as 1/p with a fixed upper bound to avoid too much diffusion in the high stratosphere and the mesosphere. In that region, f also depends on wavenumber to further increase diffusion of the smallest scales and reduce model instabilities. Horizontal diffusion is three times smaller for the vorticity ζ than for the divergence D (h D = 3h ζ with h D = 1). This tuning mainly follows historical reasons, with the physical idea that divergence has more small scales in it than vorticity and that the higher coupling between divergence and diabatic heating requires stronger diffusion to keep the model numerically stable (see also Wan et al., 2008;Lott & Guez, 2013). Horizontal diffusion for temperature and specific humidity is 2 orders of magnitude weaker. Ozone concentration horizontal diffusion magnitude is similar to that of vorticity. The diffusive time scales in the troposphere for the highest model wavenumber (127) are given for each model state variable in Table 1. Finally, note that the present implementation does not account for the conservation issues induced by horizontal diffusion.
In addition to the horizontal diffusion increase in the upper levels, a sponge layer is used to reduce spurious reflections of vertically propagating gravity waves at the model top. It consists in the introduction of a simple linear relaxation of the wind toward 0. It is active above a specified pressure fixed at 3 hPa. The relaxation time scale increases from the model top (about 0.5 days at 1 Pa) to infinity at 3 hPa (i.e., no relaxation), as τ m × p/(300−p) with τ m ≈ 200 days.

Diabatic Processes
Diabatic processes are associated with (i) small-scale (or subgrid-scale) fluid dynamics which cannot be explicitly represented because of the finite resolution in space and time of the model dynamical core, (ii) water phase transition, and (iii) radiative transfer. Their effects on the larger scale, that is, on the resolved model-state variables, are provided by a full suite of parameterizations which are described hereafter.
The physics of ARPEGE-Climat follows a quasi-parallel framework: at each time step, each parameterization starts from the same state vector and all tendencies are summed together at the end of the time step to increment the state vector at once. Some sequentiality is however required by the various couplings between processes (e.g., turbulence and clouds, clouds and radiation) or for their numerical resolution (e.g., implicit treatment of turbulent and convective vertical mixing and implicit treatment of the surface-atmosphere coupling). The time step also starts with a dry adiabatic adjustment of the temperature profile in the free troposphere to remove any instability with respect to the dry adiabatic lapse rate and with a correction of negative humidity values, which can potentially emerge from the semi-Lagrangian dynamics. In one model time step, the total diabatic tendency is computed before the call to the dynamical core, as the sum of all parameterization tendencies. It is then interpolated at the departure point of the semi-Lagrangian trajectory, so that advection considers the departure point state variables updated with this diabatic tendency.
All the values used for the tuning parameters introduced in the main text are given in the supporting information (Tables S1-S4).

Turbulent Processes
The turbulence parameterization aims at representing the small-scale turbulent mixing, from the boundary layer to the upper atmosphere. It follows the approach of Cuxart et al. (2000), considering additional effect of condensation. The main component of the turbulent mixing is supposed to be vertical and to follow the eddy diffusivity framework: where w is the vertical velocity, ϕ is any variable impacted by turbulent mixing (e.g., dry static energy, moisture, momentum, and tracers), and K ϕ is the eddy viscosity. Bars denote grid-scale values, while primes indicate subgrid-scale departure from these grid-scale values. K ϕ is classically parameterized as where ē is the grid-scale TKE, L m is a length characterizing the scale of turbulent eddies, which is often called the mixing length, C ϕ a tunable coefficient and ϕ φ a stability function. Note all constants of the turbulent scheme follow the proposal of Cheng et al. (2002) and are documented in Table S1. The parameterization determines the eddy viscosity coefficient for the horizontal wind and the conservative variables of Betts (1973), namely, the total water content q t = q v + q c and the liquid water potential tem- L v C p q c , with θ being the potential temperature, L v the latent heat of water vaporization (or ice sublimation), C p the specific heat of air at constant pressure and q c the specific mass of cloud condensed water. The fluxes of θ l and q t are then combined to derive those needed for the model state variables, namely, T, q v , and q c (see hereafter for w′q ′ c ). For the horizontal wind and the TKE, the stability functions are supposed to be equal to 1. They are the same for θ l and q t : where C 1 is a tuning constant, g is the gravity acceleration, L ϵ is the dissipation length (see below), and The TKE equation. The mean subgrid-scale TKE ē is described with a prognostic equation, following the 1.5order scheme of Cuxart et al. (2000): where ρ is the air density, u and v the zonal and meridional wind, and C ϵ a closure constant. The buoyancy production term is evaluated using the fluxes w ′ θ ′ l , w ′ q ′ t , and w ′ q ′ c . The latter is diagnosed in the cloud parameterization which describes the subgrid-scale variability of water (Bechtold et al., 1995; see also section 3.2.2).
The mixing and dissipation lengths. The mixing and dissipation length scales are supposed to be equal (L ϵ = L m ). The mixing length follows the nonlocal approach of Bougeault and Lacarrère (1989). It corresponds to the maximum vertical displacement of a parcel to consume its available TKE, considering only the work of its buoyancy: L up and L down are then combined as follows: In stable conditions, especially in the free troposphere, a minimum mixing length L min m is considered to maintain a minimum vertical mixing. Close to the surface, the mixing length is also supposed to remain greater than κz where z is the altitude above the surface and κ is the von Karman constant.
Top-PBL vertical entrainment. When strong inversions occur at the top of the planetary boundary layer (PBL), entrainment and radiative/evaporative cooling processes might locally enhance the vertical mixing at these inversion (e.g., Wood, 2012). To account for such processes, the approach of Grenier and Bretherton (2001) is adopted. The eddy diffusivity coefficient K ϕ at the height of the inversion z inv is replaced by a new coefficient K inv ϕ , which relates the subgrid turbulent buoyancy flux to the mean buoyancy jump. The inversion height is determined as the level where the TKE becomes lower than a minimal threshold e min = 0.01 m 2 s −2 . K inv ϕ reads where < e> inv is the mean TKE between the surface and the inversion height and L inv = 0.085z inv and N inv are the top-PBL entrainment mixing length and the Brunt-Väisälä frequency at the inversion, respectively. The nondimensional coefficient A inv is a function of the inversion strength, empirically defined to reproduce the observed sharp weakening of top-PBL entrainment from shallow cumulus to stratocumulus regimes. No evaporative cooling effect is considered in the current implementation.
Turbulence in the upper atmosphere. The TKE approach does not provide enough mixing in the upper atmosphere, possibly because of missing sources. It is replaced above 50 hPa by a K-profile formulation based on Louis (1979) and Louis et al. (1982): l turb,ϕ is taken constant and F φ is a stability function depending on the Richardson Number Ri.

Large-Scale Cloud Cover and Condensation
The cloud scheme is based on a statistical joint distribution of q t and θ l following Sommeria and Deardorff (1977), Bougeault (1981), and Ricard and Royer (1993). Assuming that the subgrid fluctuations of q t and θ l are weak, the treatment can be simplified using a unique variable s, which quantifies the distance to the saturation in the q t −θ l space (Mellor, 1977). The s is thus equivalent to a local saturation deficit, generalized to account for θ l fluctuations. The s reads where q sat is the saturated specific humidity and T l is the liquid temperature. This generalized saturation deficit is then normalized by its variance σ s : From Cuxart et al. (2000), the second-order fluxes ðq ′ t Þ 2 , ðq ′ t θ ′ l Þ, and ðθ ′ l Þ 2 are related to the mixing length L m , the temperature eddy diffusivity coefficient K T (as defined in Equation 3), and the TKE. This leads to the parameterized turbulence variance: where C 2 is a tuning parameter. The cloud fraction CF stat is then defined as the saturated part of the distribution of r: G is assumed to follow a mix between a symmetric (Gaussian) and an asymmetric (exponential) probability distribution function (Bougeault, 1981;Bechtold et al., 1995). The relationship between Q 1 and the cloud fraction CF stat is precomputed in ARPEGE-Climat. The choice for the G distribution implies that a cloud fraction larger than 50% only occurs for oversaturated grid boxes (Q 1 > 0) and increases with decreasing turbulence. If the grid box is not saturated (Q 1 < 0), cloud fraction remains lower than 50% and decreases with decreasing turbulence (see also Brient et al., 2019).
G is also used to compute the condensed water content ðq c Þ stat and the second-order covariance s ′ q ′ c : Following the dimensional argument of Bechtold et al. (1995), s ′ q ′ c is used to estimate the vertical flux of cloud water content w ′ q ′ c as γðQ 1 Þ × w ′ s ′ × s ′ q ′ c =σ 2 s , which is required in the turbulent scheme to compute the buoyancy flux w ′ θ ′ vl in cloud layers (Equation 5). Finally, note that a lower bound σ min s to σ s was introduced for numerical purposes. This lower bound is rather crudely increased in the presence of convective ice above 500 hPa, to account for the increase of subgrid-scale variability in the free troposphere by deep convection. Further development is required in the future to improve the interaction between the convection and turbulence schemes.

Microphysics
The microphysical scheme used in ARPEGE-Climat 6.3 was first developed by Lopez (2002). It mainly follows the approach proposed in Fowler et al. (1996), but with a reduced complexity. The scheme represents the microphysical processes indicated on Figure 2. A prognostic treatment of the specific mass of four microphysical species (cloud liquid water, cloud ice, rain, and snow) is adopted, with two main arguments for using prognostic equations to describe precipitating condensates: (i) given the relatively short time step of the model (15 min), it provides a finer description of the time evolution of the precipitation vertical distribution and associated processes (especially for snow) and (ii) it will allow a more direct approach for future data assimilation of precipitation data in the ARPEGE NWP version, so that its use in climate application contributes to the seamless approach sought at Météo-France (Bouteloup et al., 2011).
The generic equations for the specific mass of the scheme four species (namely, q l , q i , q r , and q s for cloud liquid water, cloud ice, rain, and snow, respectively) read Source and sink terms in these equations (see Figure 2 for notations) and associated hypotheses are briefly described in the following paragraphs. Some of these terms are obviously associated with heating/cooling terms in the temperature equation and with drying/moistening terms in the specific humidity equation. Note that Table S2 documents the chosen values of the parameterization tuning parameters.
Condensation and evaporation of cloud condensates. The condensed water content ðq c Þ stat is an input variable computed by the statistical cloud scheme, as described in section 3.2.2. It is first partitioned between liquid water ðq l Þ stat ¼ ½1 − δ ice ðT Þðq c Þ stat and ice ðq i Þ stat ¼ δ ice ðT Þðq c Þ stat when the temperature is below 0 o C, with where H is the Heaviside function, T t is the water triple point temperature, and δ T a parameter that controls the amount of supercooled liquid water for temperature below 0°C. The chosen value of δ T = 11.82 K allows for a significant amount of cloud liquid water up to −40°C (e.g., Pruppacher & Klett, 1998). This function, already used in ARPEGE-Climat 5.1, was shown in Cesana et al. (2015) to provide a rather satisfying liquid/ice cloud partitioning, although the occurrence of liquid clouds at very low temperature might still be underestimated (note that the observational uncertainty is however quite large; e.g., McCoy et al., 2016). The condensation/evaporation terms are then written as an instantaneous adjustment of cloud water to that provided by the cloud scheme: Autoconversion processes. Autoconversion rates follow the simple parameterization proposed by Kessler (1969), who introduced a critical water content above which the autoconversion process occurs: where (q x ) crit is a threshold value and τ x a time scale characterizing the autoconversion process efficiency. For the liquid phase, (q l ) crit and τ l are constant, while for the ice phase, they both depend on the temperature: (q i ) crit thus decreases with temperature from ðq i Þ max crit near 0°C to ðq i Þ min crit at very low temperature, which allows the autoconversion process to occur even in very cold clouds, but with a reduced efficiency (Pruppacher & Klett, 1998). The α crit and β crit are two parameters to tune this transition. They are computed so that ðq i Þ max crit corresponds to a temperature of 0°C and 1.5ðq i Þ min crit to a temperature of −80°C. The β i is meant to crudely capture the temperature dependence of ice crystal structure and its impact on aggregation efficiency (Lin et al., 1983; see also next paragraph).
Collection processes. The treatment of precipitation processes requires hypotheses regarding the particle diameter, mass, and fall speed distributions. For rain, Lopez (2002) classically integrated the continuous collection equation over the Marshall and Palmer (1948) exponential particle spectra, using the work of Sachidananda and Zrnic (1986) for the fall speed distribution. For ice, he followed similar hypotheses adding the temperature dependence of the intercept parameter of the Marshall-Palmer exponential distribution suggested by Houze et al. (1979) and using the mass-diameter relationship from Cox (1988). He thus did not account for the variety of shapes or densities of ice particles. As a result of this integration, plus some slight linearizations of all powers close to unity, the three collection rates of the present scheme (see also Figure 2), namely, accretion (COL l/r ), aggregation (COL i/s ), and riming (COL l/s ), are written as COL l=r ¼ K l=r ε l=r q l q r (24) The collection coefficients K x/y are those derived from the collection equation integration and thus depends on the distribution parameters (see Lopez, 2002, for their formulation). The collection efficiencies ε x/y are tunable parameters, supposed to be of the order Oð1Þ.
Evaporation of precipitating condensates. Evaporation rates of precipitation result from the integration of the diffusional growth equation of an individual liquid or ice particle, over the hypothesized diameter, mass, and fall speed distributions Lopez (2002). They can be expressed as (x = r, s) K x accounts for the thermal conduction of humid air, while D x represents vapor diffusion in the air. They both depend on pressure and temperature. Relative humidity RH is taken with respect to liquid water for rain (x = r) and ice for snow (x = s). N 0x is the intercept parameter of the diameter distribution for rain (N 0r ) or 10.1029/2020MS002075

Journal of Advances in Modeling Earth Systems
The a x , b x and d x are constant parameters resulting from the integration. c x is a function of pressure only.
Melting and freezing of precipitating condensates. When snow falls within a layer where temperature is greater than T t , it is supposed to instantaneously melt. The opposite is also true as rain meets temperature lower than T t .
Sedimentation. Although Lopez (2002) could derive formulations of rain and snow fall speed consistent with the microphysical distributions mentioned earlier, he proposed to use constant fall speeds for the sake of both simplicity and efficiency of the parameterization. He showed that such assumptions performed rather well in the context of weather forecasting. Thus, they were kept for ARPEGE-Climat and further applied to liquid and ice cloud water particles. Bouteloup et al. (2011) more recently developed a numerical probabilistic resolution of the sedimentation equations, which is more efficient than classical eulerian or Lagrangian algorithmic approaches, and well adapted for the rather long time step of the model.

Dry, Moist, and Precipitating Convection
Dry, moist, and precipitating convection are represented with a unified mass-flux framework, based on the work of Piriou et al. (2018). It follows the ideas of Gueremy (2011) for the convective profile and closure, and those of Piriou et al. (2007), which proposed to explicitly separate the convective vertical transport from the convective microphysical processes. For any variable ϕ (dry static energy, specific humidity, momentum, and tracer concentrations), the convective tendency reads where S ϕ is the microphysical convective source of ϕ, α u the updraft area fraction, ω u the updraft vertical pressure velocity, and ϕ u the profile of ϕ within the updraft. The microphysics of convection is thus almost entirely separated from the cloud model. It uses the same microphysics scheme as for its large-scale counterpart (section 3.2.3), thus with the same microphysical variables, namely, the specific mass fractions of convective cloud liquid water q lc , cloud ice q ic , rain q rc , and snow q sc . These new model-state variables introduce a symmetry between the convective and large-scale microphysics, notably in terms of entrainment and detrainment processes. For instance, the evolution equations of q lc and q le read where A is the advection operator, ω e is the compensating subsidence vertical velocity in the convective environment (ω e satisfies (1− α u )ω e + α u ω u = 0), E and D are the convective entrainment and detrainment, respectively, and S lc and S le are the microphysical sources in the convective updraft and its environment, respectively. The convective profile provides the condensation source as an input to the other microphysical processes (section 3.2.3). Note α u q lc ¼ q lc and ð1 − α u Þq le ¼ q l correspond to the grid-scale values of the convective and large-scale cloud liquid water, respectively. Similar equations are used for ice cloud water, rain and snow.
The Convective Profile. The thermodynamical properties of the convective updraft are determined using a buoyant parcel that is raised from a given level below the minimum of equivalent potential temperature, following first a dry adiabat (thus conserving dry static energy) below its lifting condensation level and then a moist pseudo-adiabat (conserving moist static energy) above (Gueremy, 2011). The parcel experiences dilution through entrainment processes, which will be described hereafter. Its buoyancy is then used to compute the updraft vertical pressure velocity ω u , following the work of Simpson and Wiggert (1969): where T v = T(1+0.608q v −q c ) is the virtual temperature accounting for the weight of the condensates (T vu is the virtual temperature within the updraft, T v is the grid-scale virtual temperature), γ vm is a virtual mass 10.1029/2020MS002075

Journal of Advances in Modeling Earth Systems
parameter accounting for acceleration due to pressure effects (Simpson, 1971), ε is the fractional entrainment, and K d is an aerodynamic drag parameter. The sign of the updraft vertical velocity determines the triggering of the convection scheme. Entrainment and detrainment processes are decomposed as the sum of two components (Tiedtke, 1989): the organized component aims at representing the mesoscale convergence/divergence of the convective flow, while the turbulent component targets the turbulent mixing at the edge of the convective ascent. Organized fractional entrainment ε o and detrainment δ o are determined using the buoyancy-sorting approach of Bretherton et al. (2004), assuming a uniform distribution of mixtures between the updraft and its environment: where μ 0 characterizes the partition between buoyant and nonbuoyant mixtures, as computed through the buoyancy sorting approach. The term in absolute value corresponds to the maximum possible organized entrainment or detrainment, as diagnosed from the mass conservation equation. The turbulent entrainment and detrainment are taken as equal and are parameterized as a function of the updraft vertical velocity ω u : where ε min t and ε max t are minimum and maximum values for the turbulent entrainment, respectively, f ε is a transition function (a square sinus) between these two regimes, the transition velocities being between ω min u and ω max u (see Table S3 for parameter values). The f ε is such that, for weak ascent, as in shallow convection, entrainment is large, close to ε max t , while for strong ascent, generally associated with deep convection, entrainment is much weaker, close to ε min t (e.g., Tiedtke, 1989). The aerodynamic drag K d is parameterized in a similar way, introducing K min d and K max d . Finally, the condensation rates for cloud liquid water and ice (Equation 29, included in S lc and S ic ) read Convective closure. Using the mass conservation equation, the vertical profile of the updraft area fraction (α u ) is determined to within a constant. This constant is computed assuming a Convective Available Potential Energy (CAPE) closure, with a relaxation time scale proportional to the time needed by a buoyant parcel to travel from the base of the convective ascent to its top. The closure is done at the base of the ascent and the proportionality constant is a tuning parameter. Besides, in order to keep the area fraction within an acceptable range, consistent with the hypothesis underlying the convection scheme, a maximum value α max u of only 4% is allowed. The convective tendencies of dry static energy, moisture, microphysical species, tracers, and momentum is then diagnosed using Equation 28. To improve numerical stability of the scheme and of its interaction with the turbulent parameterization, convective and turbulent transports are computed together through an implicit approach.
Downdrafts. The convection scheme includes a simple parameterization of downdrafts to account for the transport of low moist static energy from the midtroposphere into the boundary layer Gueremy (2011). The bulk downdraft is supposed to originate from environmental air saturated by the evaporation of convective precipitation falling in it (similar to Tiedtke, 1989, but without any mixing with the updraft). It thus starts from the wet-bulb thermodynamical properties of the environment, if such a parcel is negatively buoyant. The thermodynamical profile first follows a saturated pseudo-adiabat above the lifting condensation level and then an unsaturated dry adiabat below this level. The downdraft vertical velocity ω d is diagnosed from the steady version of Equation 31, using constant turbulent entrainment (ε max t ) and organized entrain- (i.e., no buoyancy sorting). This implies that the downdraft area fraction α d is constant along the vertical. It is related to the updraft area fraction at the base of the convective It is thus supposed to cover one fourth of the updraft fraction and to be further 10.1029/2020MS002075

Journal of Advances in Modeling Earth Systems
modulated by the convection intensity (and thus the evaporation intensity of convective cloud condensates and rainfall). The downdraft has no microphysics so that its impact on the large scale only occurs through vertical transport, in a similar way as in Equation 28. Convective cloudiness. Convective clouds are simply diagnosed as being proportional to the updraft area fraction α u : CF conv = k c α u , k c being the tuning proportionality constant.

Radiative Transfer
The longwave radiation scheme is based on the version of the Rapid Radiation Transfer Model (RRTM; Mlawer et al., 1997) optimized for general circulation models (GCMs) and included in Cycle 37 of the ARPEGE/IFS system (very close to that in Cycle 32; Morcrette et al., 2008). The RRTM scheme follows the two-stream approach and computes upward and downward radiative fluxes in 16 spectral bands encompassing the 10-to 3,000-cm −1 range. The parameterization uses the correlated-k approach with 140 g points (the number per spectral band depends on the absorption coefficient variations within each band and on the spectral band contribution to the total flux). It includes line absorption by the model prognostic gases, namely, H 2 O and O 3 , and by uniformly mixed gases, namely, CO 2 , CH 4 , N 2 O, CFC-11, CFC-12, CFC-22, and CCl 4 , based on the HITRAN 1996 spectrocospic database (Rothman et al., 1998) and on the CKD_2.2 water vapor continuum model (Clough et al., 1989).
The shortwave radiation scheme, originally developed by Fouquart and Bonnel (1980) and which further evolved in the IFS system until its Cycle 32r2 (Morcrette et al., 2008), integrates the fluxes over the whole shortwave spectrum between 200 and 4,000 nm. The scheme includes Rayleigh scattering, absorption by water vapor and ozone, both varying in space and time, and by CO 2 , which is treated as a uniformly mixed gas. The parameterization resolved the radiative transfer equations in six spectral bands, three bands being in the UV and visible spectral range (185-250, 250-440, and 440-690 nm) and three bands covering the near infrared range 190,1,380, and 2,380-4,000 nm).
The radiation time step for both the longwave and shortwave radiation schemes in ARPEGE-Climat 5.1 equaled 3 hr. It has been updated to 1 hr in this new version to improve cloud-radiation interactions. Consistently with Zhao et al. (2018), this modification mainly improves shortwave fluxes. The ratio between cost and benefit when calling radiative transfer every time step was found rather weak. In between two radiation time steps, constant values of the shortwave transmissivities are used to rescale the shortwave fluxes according the solar zenith angle and the surface albedo, while constant values of the longwave emissivities are used to update the longwave fluxes with the evolving surface temperature (e.g., Hogan & Bozzo, 2015).
The shortwave and longwave radiation schemes are evaluated for four clear and clean sky cases against lineby-line reference radiative transfer simulations in Pincus et al. (2015). Biases in the longwave spectral range are lower than 2 to 3 W m −2 , both at the surface and at the top of the atmosphere, which is of similar amplitude in comparison to other state-of-the-art radiative transfer code used in climate models. In the shortwave spectral range, however, ARPEGE-Climat strongly overestimates atmospheric absorption, with a bias that can reach 10 W m −2 . Errors of the CO 2 radiative forcing, as estimated from quadrupled present-day CO 2 concentration experiments over the same four cases, were also estimated. Simulated forcing are rather reasonable for the longwave component, while shortwave errors can be as high as 3 W m −1 at the surface or at the top of the atmosphere, with generally an overestimate of the shortwave absorption forcing.
The following paragraphs describe the treatment of aerosol and cloud optics, required in radiative transfer computations. Note that surface albedo and emissivity are discussed later in sections 3.3 and 4.1 and gas concentrations in section 4.1.
Aerosol radiative properties. Five classes of tropospheric aerosols are considered: sulfate, organic, black carbon, sea salt, and sand dust. Two additional types are included to represent volcanic and stratospheric aerosol types. Aerosols radiative properties are taken from Hess et al. (1998), with updates from Nabat et al. (2013) for single scattering albedo and asymmetry factor. As hydrophilic aerosols, sea salt, organic and sulfate aerosols have radiative properties, which depends on the ambient relative humidity. This dependence is accounted for through modifications of the look-up tables, following the work of Mallet et al. (2003). No aerosol scattering effect is considered in the longwave. In ARPEGE-Climat 6.3, aerosols are prescribed using 2-D maps of monthly-mean aerosol optical depths (AODs), which are combined with given normalized aerosol extinction profiles, constant in space and time for each class of aerosol (Tanré et al., 1984). The preparation of these 2-D maps is discussed in section 4.1.
Cloud radiative properties. The shortwave radiative scheme expects the cloud optical depth, single scattering albedo, and asymmetry factor, while the longwave radiative scheme expects the cloud emissivity. Cloud radiative properties are computed in each spectral band using simple functions of the cloud condensate amount and the effective radius of cloud particles, based on the Mie theory. The effective liquid radius r l e reads where m l is the mass of cloud liquid water, CDNC is the cloud droplet number concentration, ρ w the liquid water density, and κ = 1.1 a coefficient, which accounts for the width of the liquid water droplet radius distribution (Martin et al., 1994). The r l e is limited to the 2-to 24-μm range. The simple CDNC parameterization from Menon et al. (2002) is used to account for the first aerosol indirect effect. It is aimed to represent that, at constant cloud liquid water content, increasing aerosol concentration leads to a larger concentration of cloud droplets of smaller radius and thus increases cloud reflectivity (e.g., Twomey, 1977). CDNC (in cm −3 ) is computed as where m SO4 , m SS , and m OM are the mass concentration of sulfate, sea salt (particles smaller than 0.5 μm), and hydrophilic organic aerosols, respectively. The b, a SO4 , a SS , and a OM are tuning parameters (2.20, 0.20, 0.05, and 0.13, respectively; see Michou et al., 2020).
The effective ice diameter D i e has a bulk formulation, which depends on both ice water mass m i in the cloud and temperature. It follows the proposal of Sun and Rikus (1999), with the correction of Sun (2001): The cloud optical depth, single scattering albedo, and asymmetry factor are computed following the work of Slingo (1989) and Fu (1996) for liquid and ice clouds, respectively. The mass absorption coefficient calculations (used to compute the cloud emissivity) are based on Smith and Shi (1992) and Ebert and Curry (1992) for liquid and ice clouds, respectively. In all cases, cloud radiative properties are linear, sometimes polynomial, functions of r x e or 1=r x e (x = l, i), which coefficients depend on the spectral bands used in the radiative transfer code. The cloud optical depth is finally rescaled using an inhomogeneity factor, both in the shortwave and longwave ranges (0.71 and 0.90, respectively). These coefficients are supposed to account for horizontal and vertical cloud inhomogeneity in the model grid cell (e.g., Cahalan et al., 1994;Tiedtke, 1996) and were crucial in the final steps of the model tuning (see section 3.4). Note that these parameterizations are not fully consistent with the cloud droplet distribution hypotheses used in the microphysics parameterization. This consistency will be addressed in the future.
Cloud overlap. As in ARPEGE-Climat 5.1, convective and stratiform cloud fractions (CF conv and CF stat , respectively), convective and stratiform cloud water content (ðq c Þ conv and ðq c Þ conv , respectively) are combined in the horizontal using a maximum overlap hypothesis: In the vertical, a random cloud overlap approximation is used. Finally, note that the condensates seen by the radiation scheme come from the stratiform cloud parameterization (ðq c Þ stat ) or from the convection parameterization (ðq c Þ conv ), thus before any microphysical processes. This is done to keep consistency between cloud water content and cloud fraction and avoid the computation of a new cloud fraction after microphysical processes occur. Thus, in some sense, the radiative transfer parameterization sees rain and snow concentration, using the same radiative properties as for cloud liquid water and ice, and with an incorrect vertical profile as rain and snow have not fallen yet.

Subgrid-Scale Orography Drag
Gravity wave generated by unresolved orography Gravity waves can be excited when a neutrally or stably stratified flow is impinging irregular terrain. Under certain circumstances, these wave propagate upward, often very high in the atmosphere before being dissipated or absorbed near a critical level in the mean flow. The most important waves in such effects have been shown to have horizontal wavelengths of only a few tens of kilometers (e.g., Clark & Miller, 1991;Palmer et al., 1986), and therefore, their effect on the large-scale flow needs to be parameterized and related to subgrid-scale features of the orography.
The orographic gravity wave drag parameterization used in ARPEGE-Climat 6.3 is the same as in its previous version (5.1). Most of it is described in Déqué et al. (1994) and Geleyn et al. (1994), with a few updates introduced by Catry et al. (2008). The parameterization follows the linear steady-state theory of vertically propagating gravity waves (e.g., Klemp & Lilly, 1980). It accounts for gravity wave dissipation, resonance, and reflexion effects. The wind stress τ ! ¼ ρðu′w′; v′w′Þ reads Boer et al. (1984). The ρ s is the surface air density, N b,s the Brunt-Vaisala frequency at the surface, σ h is the root-mean-square of the unresolved orography variance, ⃗ u s the horizontal wind speed in a surface layer (see below for its definition) and K GW a tuning coefficient (see Table S4 for all tuning parameters), which is supposed to characterize the aspect ratio of the subgrid-scale orography. Note that in this section, for the sake of simplicity, overbars, which indicate grid-scale averages, are omitted.
The momentum flux remains constant along the vertical as long as the linear theory hypotheses are fulfilled. However, as the air density decreases with height, the wave amplitude increases so that there may exist a critical level where the gravity wave breaks. The linear deposition rate is computed under the assumption that the flux is just saturating the Lindzen (1981) convective instability criterium at the surface and is being limited by the surface value during the upward propagation of the wave. The dissipative flux is thus proportional to the square of the local Froude number Nonlinearities are introduced for two types of atmospheric behavior: 1. Gravity waves can be trapped below a neutral or unstable layer, which induces reflexion and downstream dissipation. It is parameterized as where z refl is the reflexion level, that is, the first level where the Brunt-Vaisala frequency N b (z) is equal to 0. Γ 2 (z) equals 0 for z ≥ z refl .
2. The interaction between the vertical wave length and the depth of the neutral layer can generate resonant damping or amplification of the wind stress within this layer (e.g., Peltier & Clark, 1986). The critical level z crit of the resonance is taken as the first level where the Lindzen criterium Γ 1 (z) < 1 is met. The resonance parameterization follows the work of Peltier and Clark (1986). It depends on f r ðθÞ ¼

Journal of Advances in Modeling Earth Systems
level. It is calibrated so that f r captures well the experimental results of Peltier and Clark (1986). For levels below the critical level, the parameterization reads Γ 3 (z) equals 0 for z ≥ z crit . Note that when f r (θ) < 1, the total wind stress profile corresponds to a "fully trapped" profile.
Two other effects, which modulate the orographic gravity wave drag, are introduced in the parameterization: • To reduce the parameterization dependence on the vertical resolution, we consider an effective surface layer of thickness K h σ h instead of the lowest model layer. K h is a empirical coefficient and is used to compute the wind and square Brunt-vaisala frequency averages in this surface layer (i.e., ⃗ u s and N 2 b;s , respectively). To maintain continuity of Γ(z), the wind profile is linearly modified in the surface layer (between the surface and K h σ h ).
• Following Baines and Palmer (1990), anisotropy effects of the subgrid-scale orography are introduced using the anisotropic coefficient γ and angle ψ (see below for their computation). Using idealized elliptical mountains characterized by γ and ψ parameters, Phillips (1984) computed the associated linear wave drag on the atmosphere. These results are used to parameterized anisotropy effects through the introduction of a fictitious surface layer wind ⃗ u s;f ¼ ðu s;f ;x ; u s;f ;y Þ, which is the effective wind parallel with (but opposite to) the surface drag given by Phillips (1984): A and D are functions of γ derived from elliptical integrals (see appendix in Catry et al., 2008, for their formulation).
Blocking and lifting effects induced by unresolved orography. Subgrid-scale orography limits vertical motion close to the surface and part of the flow goes around the mountains, so that friction is enhanced.This blocking effect is parameterized following the ideas of Lott and Miller (1997). It consists in multiplying the wind stress ‖ τ ! ðzÞ‖ computed above by the following coefficient: Orographic obstacles, together with terrestrial rotation, induce a transversal lift force on the impinging flow. This force has an horizontal component that is perpendicular to the incident flow in the linear quasi-geostrophic context and that is proportional to the mountain volume (Smith, 1979). The α is a shape parameter between 0 and 1. The parameterization then follows ideas from Lott (1999) and is described in detail in Catry et al. (2008). The lift force is applied for z < K h σ h , and the associated horizontal wind tendencies read 10.1029/2020MS002075

Journal of Advances in Modeling Earth Systems
where f * can be seen as an additional contribution to the Coriolis force: , the depth over which the influence of the subgrid-scale orography on the flow is felt) equals the integral of L t f over the depth σ h of the subgrid-scale orography.
Topographic parameters. Subgrid-scale orographic effects requires to diagnose three parameters: the subgrid-scale variance σ 2 h of the unresolved orography, its anisotropic coefficient γ and angle ψ. The input orographic database comes from the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010; Danielson & Gesch, 2011), which provides terrain elevation at a 30-arc sec spatial resolution (∼1 km). Note that this is an update compared to ARPEGE-Climat 5.1, which used the GTOPO30 30-arc sec elevation data provided by the U.S. Geological Survey's Earth Resources Observation Systems (EROS) Data Center (Gesch, 1994;Gesch & Larson, 1996). Orographic data are first band-pass filtered to keep only horizontal scales between 10 km and the model grid scale (∼150 km). Vertical propagation of gravity waves becomes limited for small horizontal scale and a 10-km scale was considered reasonable. This is of the same order of magnitude as the 5-km scale indicated by Beljaars et al. (2004). The subgrid-scale variance is then easily computed. Anisotropy parameters are computed following Baines and Palmer (1990): γ 2 and ψ are the ratio of the two eigenvalues of the subgrid-scale orography gradient correlation tensor and the angle between the longitude axis and the first eigenvector of this tensor, respectively.

Gravity Wave Generated by Nonorographic Sources
Nonorographic sources (convection, fronts, and jets) can emit gravity waves with non-zero phase speed. These gravity waves play an important role in the dynamics of the mesosphere and tropical middle atmosphere (e.g., Holton, 1983;Shepherd, 2002). The vertical transport and deposition of momentum by these waves are described by a physical parameterization based on the stochastic approach described in Lott et al. (2012). A broadband spectrum of gravity waves is represented via the superposition of a large ensemble of statistically independent monochromatic ones. The nonorographic gravity waves are related to convective sources via the surface precipitation translated into diabatic heating (Lott & Guez, 2013) and are related to frontal sources via the potential vorticity (de la Cámara & Lott, 2015). The implementation and tuning details of this parameterization in ARPEGE-Climat 6.3, and its impact on the simulated climate, especially in the stratosphere and the mesosphere, will be documented in a future study.

Linear Scheme for Stratospheric Ozone
The ozone mixing ratio is treated as a prognostic variable with photochemical production and loss rates computed from a simulation with ARPEGE-Climat in which the REPROBUS chemistry scheme was activated . The linearization of the net photochemical production in the ozone continuity equation basically follows the work of Cariolle and Teyssèdre (2007), with some update to implicitly account for heterogeneous chemistry in the scheme coefficients. Further details can be found in Michou et al. (2020).

Water Vapor Sources in the Upper Atmosphere
To maintain a reasonable water vapor concentration in the upper atmosphere, the simple methane oxidation parameterization of Untch and Simmons (1999) was introduced. The associated source of water vapor is applied over the whole model domain and consists in a simple relaxation of the upper-atmospheric specific humidity toward 4.25 mg kg −1 , with a time scale depending on altitude (100 days above 50 Pa, about 2,000 days at 10 hPa; see Untch & Simmons, 1999). The parameterization also includes a water vapor sink due to photolysis in the mesosphere.

Interaction With the Surface: The SURFEX Suite
SURFEX (a French acronym for SURFace EXternalisée) is a numerical platform which simulates surface fluxes at the Earth's surface . It was first designed to facilitate its implementation in all atmospheric models of Météo-France and, second, to use the same code in off-line applications driven by atmospheric observations. It is coupled inline to ARPEGE-Climat 6.3 with an implicit scheme and shares the same grid and time step. SURFEX follows a tile approach, considering three surface types: ocean (including sea ice), lakes, and land (including urban areas treated as rock surface).

The Ocean-Atmosphere and Sea Ice-Atmosphere Interfaces
Over ocean, turbulent fluxes of momentum, heat and water vapor are computed based on an updated version of the Exchange Coefficients from Unified Multi-campaigns Estimates (ECUME) scheme already used in CNRM-CM5.1 (Belamari, 2005;Voldoire et al., 2013). The ECUME scheme is a bulk iterative parameterization, which assumes a local equilibrium between the roughness lengths and the 10-m wind under neutral conditions (i.e., it does not explicitly parameterize roughness lengths as in, e.g., Fairall et al., 1996). This latter assumption is consistent for momentum with the work of Edson et al. (2013). Further details about the ECUME Version 6 parameterization can be found in the supplementary material.
In contrast to the simple sea surface albedo parameterization of ARPEGE-Climat 5.1 (Taylor et al., 1996), ARPEGE-Climat 6.3 now uses the parameterization of Séférian et al. (2018), which provides a spectrally resolved sea surface albedo for both the direct and diffuse components of the incoming solar radiation, and accounts for whitecaps (as a function of surface wind speed) and for the ocean interior radiation pathway.
Finally, while they were prescribed in ARPEGE-Climat 5.1, the surface temperature and surface albedo over sea ice are now interactively computed using the physics of the sea ice model GELATO Version 6 included in the CNRM coupled model (see details in Salas y Mélia, 2002;Voldoire et al., 2019Voldoire et al., , 2013. As the sea ice dynamics and horizontal advection are not activated in the 1-D version of GELATO, SICs must be prescribed to remain consistent with SSTs (see section 4). In contrast, sea ice thickness remains a prognostic variable of the model, which reaches equilibrium within about 10 years.

Lakes
Lakes are represented using the bulk FLake model (Mironov, 2008), which computes the temporal evolution of the vertical lake temperature profile from the surface mixing layer to the bottom including ice and snow. A skin temperature of a 1-mm thickness is simulated as representative of the energy budget at the lake surface. FLake computes fluxes of momentum and of sensible and latent heat. An iterative routine computes turbulent fluxes using Monin-Obukhov similarity relationships. The scheme incorporates a fetch-dependent formulation for the aerodynamic roughness of the water surface, advanced formulations for the roughness lengths for potential temperature and specific humidity in terms of the roughness Reynolds number, and free convection heat and mass transfer laws to compute fluxes of scalars in conditions of vanishing mean wind. Finally, the spatial distribution of lakes at the global scale is given by the ECOCLIMAP-II database (see also section 4.1) while the lake depth was specified from the 1-km Global Lake Depth database (Kourzeneva et al., 2012). More details can be found in Le Moigne et al. (2016). Note that in CNRM-CM6-1, both the Caspian and the Aral seas are also treated as lake surfaces.

Land-Atmosphere Interactions
The land surface physical processes are represented using the new ISBA-CTRIP coupled system , which is a major update of the ISBA-TRIP version used in CNRM-CM5.1 (see also Voldoire et al., 2019). The ISBA land surface model calculates the time evolution of the energy and water budgets at the land surface. It explicitly solves the one-dimensional Fourier and Darcy laws throughout the soil (14 layers), accounting for the hydraulic and thermal properties of soil organic carbon. It also uses a 12-layer snow model of intermediate complexity to simulate the snowpack processes. The CTRIP river routing model simulates river flooding dynamic, water table depth evolution in unconfined aquifers and river discharges up to the ocean from the surface runoff, soil drainage, and groundwater recharge computed by ISBA. CTRIP has his own space-time resolution (0.5°resolution with a 30-min time step). It is coupled to SURFEX every hour using the OASIS-MCT coupler (Voldoire et al., 2017). The two-way coupling between ISBA and CTRIP allows to account for (i) floodplain interaction with the soil and the atmosphere through free-water evaporation, infiltration and precipitation interception, and (ii) groundwater exchanges with the river and the superficial soil through upward capillarity fluxes.

Journal of Advances in Modeling Earth Systems
Surface turbulent fluxes are calculated based on the classical aerodynamic bulk formulae, with drag coefficients depending on the bulk surface Richardson number (Louis, 1979;Mascart et al., 1995;Noilhan & Mahfouf, 1996). The total flux of water vapor is the sum of the direct evaporation over the free water intercepted by the canopy and the floodplains, the evaporation or sublimation over bare soil, the transpiration from plants and the sublimation from snow. Bare soil evaporation depends on surface soil moisture conditions and is equal to the potential evaporation when the superficial water content exceeds the field capacity corresponding to a soil matric potential of −0.33 bar. Plant transpiration is controlled by the stomatal conductance of leaves, which depends both on carbon cycling in vegetation and on the air CO 2 mixing ratio (Calvet et al., 1998). Carbon assimilation, and hence stomatal conductance and transpiration, are limited when soil moisture content drops below field capacity but also when the atmospheric water vapor deficit exceeds a certain threshold (Joetzjer et al., 2014). It stops when the water content in the root zone is below the wilting point corresponding to a matric potential of −15 bar. Note that the leaf area index is prescribed in CNRM-CM6-1 (but not in CNRM-ESM2-1; see Séférian et al., 2019).
Finally, to account for heterogeneities in biophysical soil and vegetation properties within a model grid cell, the land surface tile is further divided into 12 patches. These patches result from the aggregation of 500 land cover units given by the 1-km ECOCLIMAP-II database .

Tuning Approach of ARPEGE-Climat 6.3
The calibration or "tuning" of uncertain model parameters is a key step in the course of model development (e.g., Hourdin et al., 2017;Schmidt et al., 2017). It is often tightly coupled to model development itself as several iterations between these two processes are generally required to achieve a reasonable model behavior. For instance, in the course of the development of ARPEGE-Climat 6.3, it often happened that once a model configuration was found acceptable for a (reduced) given set of metrics, the analysis at a few more metrics or model behaviors revealed some undesirable features of the model, which required further understanding and to come back at the parameterization level, possibly in a lower-complexity configuration. This led to corrections or new developments, which then implied to retune the model. This is probably all-the-more true when a model physics is revisited in depth, as it was the case here from ARPEGE-Climat Version 5.1 to Version 6.3.
Within each iteration, the first step was to achieve a first guess of uncertain parameters based on a single-column approach over a few cases (mostly shallow and deep convection casese.g., see references in section 2.2). This framework provides some confidence in the ability of the model parameterizations to represent appropriately key atmospheric processes and allows in-depth comparisons with observations or with simulations explicitly resolving the targeted processes. Then, uncertain parameters were further constrained using atmosphere-only simulations (from a few years to 30 years depending on the metrics and their robustness with regard to model internal variability). Note that some parameters cannot be well addressed with the currently available single-column cases (especially those controlling the cloud radiative properties and some aspects of cloud microphysics), so that their calibration mainly relies on large-scale constrains and can hide compensating errors. The main targets of the model calibration in the atmosphere-only configuration were the following: 1. One key objective is to couple ARPEGE-Climat to an ocean model  and thus to avoid any large long-term drift in the global mean surface temperature. This requires to achieve a mean energy budget close to 0.0 W m −2 in the coupled model in preindustrial conditions. This was translated into a targeted energy balance at the ocean surface of about −0.8 W m −2 in atmosphere-only present-day simulations, accounting for enthalpy fluxes due to the runoff from land to ocean and at the interface between sea ice and ocean (approximately −0.8 W m −2 , not diagnosed in atmosphere-only simulations), the flux adjustments due to the coupling itself (∼1.1 W m −2 , especially associated with surface turbulent fluxes) and the present-day imbalance (∼0.5 W m −2 , e.g., Hansen et al., 2011). 2. Key metrics then includes global means of the various components of the top-of-atmosphere and surface (especially ocean) energy budgets (e.g., shortwave and longwave fluxes, clear-and cloudy-sky contributions). 3. A second class of metrics targeted the mean climate, that is, 2-D latitude-longitude maps or zonal averages of various seasonally or annually averaged fields, such as shortwave and longwave fluxes, cloud radiative effects, precipitation, cloud fractions, surface pressure, zonal wind, temperature, or specific humidity.

10.1029/2020MS002075
Journal of Advances in Modeling Earth Systems 4. Even though they were not addressed at each iteration, other aspects of the climate system were also from time to time included and yield further parameter calibration: distribution of daily precipitation over several regions, tropical intraseasonal variability, and quasi-biennal oscillation.
This model calibration followed a rather empirical iterative approach, varying one or two parameters only at a time. These parameters mostly included those involved in convection entrainment, cloud microphysics (autoconversion and sedimentation), and cloud radiative properties (inhomogeneity scaling factors for SW and LW cloud optical thickness). Finally, note that the land surface uncertain parameters were calibrated independently with an offline configuration forced by realistic atmospheric parameters , and not retuned when coupled to ARPEGE-Climat.

Prescribed Data for the CNRM-CM6-1 DECK amip Simulation
The main features of the climate as simulated by ARPEGE-Climat 6.3 are evaluated in the remaining of this paper using the CNRM-CM6-1 DECK amip simulation performed in the context of the CMIP6 exercise. From now on, ARPEGE-Climat 6.3 and CNRM-CM6-1 will be used indifferently as they refer to the same model configuration in the context of an amip simulation (same between ARPEGE-Climat 5.1 and CNRM-CM5.1). Such a use of the model is dependent on a variety of prescribed data, which are further detailed below.

Prescribed Data and Forcings
The various forcings used for this simulation follows the CMIP6 recommendations (Eyring et al., 2016) and are detailed below. The main differences with the corresponding CNRM-CM5.1 amip experiment performed within the CMIP5 framework are emphasized at the end of this section. The ARPEGE-Climat 6.3 configuration presented here does not include interactive aerosols (see Michou et al., 2020, for a configuration with interactive aerosols). To account for their effects on radiation and clouds, monthly averages of AOD are prescribed for the five tropospheric aerosol categories (sulfate, black carbon, organic matter, sea salt, and dust) and for one stratospheric aerosol category. These AODs are then vertically distributed based a given profile constant in space and time, and specific to each of aerosol category. Tropospheric aerosols optical depths are derived from an amip-type simulation in which the Tropospheric Aerosols for ClimaTe In CNRM-CM (TACTIC) interactive aerosol scheme  is activated. This simulation follows the CMIP6 specification in terms of anthropogenic and biomass burning emissions of short-lived climate forcers. It covers the period 1850 to 2014 using also Version 1.1.3 of the PCMDI AMIP data set. Model internal variability is removed using an 11-year moving average. Regarding stratospheric aerosols, mainly emitted by volcanic eruptions, CNRM-CM6-1 uses the Thomason et al. (2018) data set, adapted for the 550-nm wavelength. This data set also provides a significant background concentration of stratospheric aerosols The ozone linear parameterization uses monthly parameters, which vary in space and time. They were computed for the 1950-2014 period as a three-member ensemble mean of amip-type simulations performed with a slightly earlier version of ARPEGE-Climat (6.2.4) in which the interactive chemistry scheme REPROBUS was activated (see Michou et al., 2020).
The CNRM-CM6-1 orography and related parameters are derived from the GMTED2010 data set (see also section 3.2.6; Danielson & Gesch, 2011). It is spectrally fitted to ensure consistency between the grid point and spectral spaces at the model resolution. This induces orographic ripples, sometimes far from the location of orographic features.
Finally, CNRM-CM6-1 land cover distribution and properties are now based on the ECOCLIMAP-II data set . The mean annual cycles of the snow-free surface albedo and the leaf area index are 10.1029/2020MS002075

Journal of Advances in Modeling Earth Systems
derived from the 1-km Moderate Resolution Imaging Spectroradiometer (MODIS) products. The soil textural properties (clay and sand) and the soil organic carbon content are given by the 1-km Harmonized World Soil Database (Fischer et al., 2008). More details about the prescribed data for continental surfaces can be found in Decharme et al. (2019).
The forcings used for the CNRM-CM5.1 amip simulation followed the CMIP5 protocol (https://pcmdi.llnl. gov/mips/cmip5/forcing.html) and differs from those used with CNRM-CM6-1 by the following: • SST and SIC boundary conditions were taken from an earlier version of the PCMDI data set, prepared for CMIP5. • GHG concentrations were taken from the data set prepared for CMIP5 and the solar irradiance from the SOLARIS data set (https://solarisheppa.geomar.de/cmip5) • Tropospheric AOD were taken from the Szopa et al. (2013) aerosol data set. The aerosol forcing thus corresponds to a major update from CNRM-CM5.1 to CNRM-CM6-1. A dedicated effort has thus been undertaken to achieve consistency between ARPEGE-Climat Version 6.3 and its ESM counterpart in which aerosols are interactively simulated (Séférian et al., 2019;Voldoire et al., 2019). This leads to significantly weaker AOD over the recent decades in CNRM-CM6-1, mainly due to a reduction of the sulfate aerosol concentration, partly compensated by increased natural aerosol concentrations, especially for sea salt . • Stratospheric AOD were based on the Ammann et al. (2007) data set, which mainly differs from the Thomason et al. (2018) data set during the 19th century and exhibits only a few differences during the twentieth century, in terms of both intensity and duration of volcanic eruption emissions. Note also that the Ammann et al. (2007) data set does not provide any background concentration for stratospheric aerosols, while it is significant in Thomason et al. (2018). • The ozone forcing is similar except that the linear parameterization coefficients do not include implicitly heterogeneous chemistry (see section 3.2.8). • CNRM-CM5.1 orography was based on the GTOPO30 30-arc sec elevation data (Gesch, 1994;Gesch & Larson, 1996). It mainly differs from GMTED2010 over Antarctica and Greenland (not shown). • Land surface properties were based on the ECOCLIMAP data set (Masson et al., 2003). The main differences with respect to ECOCLIMAP-II include the snow-free surface albedo and the leaf area index.

Model Initialization
The CNRM-CM6-1 DECK amip simulation covers the period from 1979 to 2014. It consists of a subperiod of the so-called CNRM-CM6-1 amip-hist experiment (in the CMIP6 vocabulary), performed within the framework of the Global Monsoon MIP (Zhou et al., 2016). This amip-hist experiment covers the period from 1870 to 2014. It is initialized after the following spin-up phase: starting from Year 0 of the CNRM-CM6-1 piControl experiment, CNRM-CM6-1 was run for 40 years in an AMIP configuration with constant forcing as in 1850, except for the SSTs and SICs for which the period 1870-1889 was repeated twice. This ensures a reasonable spin-up of the model, especially for the continental surfaces. Note that a 10-member ensemble of amip-hist simulations was performed, following the spin-up procedure detailed above, and using the same starting date sample from the piControl simulations as for the CNRM-CM6-1 historical simulations (see namely, Years 0, 33,91,110,140,195,229,258,364, and 419 of the piControl experiment Voldoire et al., 2019). This ensemble will be used in the following to assess the model internal variability.

Global Energy Budget in ARPEGE-Climat 6.3
As mentioned in section 3.4, the critical tuning targets are an approximate energy balance at the top of the atmosphere (TOA) and at the surface of the ocean, with an appropriate partition between their various components. Several authors provide estimates of the TOA and surface energy budgets, using various sources of information ( (Table 4). The overall energy conservation of the atmosphere is also improved: the loss of energy corresponded to a global flux of 2.8 W m −2 in CNRM-CM5.1 while it is now reduced to 0.8 W m −2 . Note that this nonconservation of the model energy is a consequence of the dynamical core only (see section 3.1) and will be addressed in the future. Finally, the zeroth-order metric of −0.8 W m −2 for the net energy flux at the ocean surface is achieved (Table 3), as required for stable coupled simulations with ARPEGE-Climat 6.3 (section 3.4). In particular, this modeling constraint does not allow a proper comparison of the net TOA and surface fluxes between model (in this amip framework) and observations. In its coupled version (historical simulation), CNRM-CM6-1 agrees much better with the estimates indicated in Tables 2 and 3 (not shown).
The atmospheric energy flow is further synthesized in Figure 3. CNRM-CM6-1 also exhibits a rather good agreement with the reference estimates (note that no uncertainty range is indicated for the sake of readability), and generally better performances than CNRM-CM5.1. As noted above, the downward SW radiative flux is overestimated, mainly in relation with a lack of absorption within the atmosphere (see also Pincus et al., 2015). Besides, albeit within the reference uncertainty range, the downward longwave flux at the surface is likely to be underestimated (the most recent figures suggest values higher than 341 W m −2 ). This is consistent with a troposphere which is slightly too dry (see section 6.5).

Mean Climate of ARPEGE-Climat 6.3
In this section, the ability of ARPEGE-Climat 6.3 (or CNRM-CM6-1 amip experiment) to reproduce basic features of the recent climate is assessed, in particular in comparison to its previous version, ARPEGE-Climat 5.1 (CNRM-CM5.1 CMIP5 amip experiment). Unless otherwise stated, the following diagnostics are computed using the 10-member amip-hist ensemble mean over the 30-year period 1979-2008. Note that a single member is available for CNRM-CM5.1. Reference data sets derived from observations or reanalyses are detailed in Appendix A. Unfortunately, they often do not cover the same period or even a 30-year period. Therefore, for some data sets, the period of coverage used in the present evaluation only partially intersects that of the simulations. Nevertheless, this is not expected to significantly impact the main strengths or weaknesses of the model that will be emphasized in the following.

Overview
Figure 4 provides an overview of the CNRM-CM6-1 behavior for a few key variables that have been followed rather closely along the model development (2-m air temperature, precipitation, radiative fluxes, and dynamics). Note however that such a synthetic diagnostic tool was not available along the model development, so that it remains a rather blind validation of CNRM-CM6-1. The diagnostic, proposed by Gleckler et al. (2008), provides a synthetic comparison of a set of simulations. The CMIP5 ensemble is here taken as a background, which allows to position CNRM-CM6-1 against the previous generation of climate models, and in particular against CNRM-CM5. . For a given model, blue colors thus indicate that the model performs better than the median model, while red colors correspond to weak model performance compared to the median model. Note that the present diagnostic does not give any information about the raw RMSEs or the raw improvement of the models. Improvements or deteriorations are fully relative. Reference data sets used for the RMSE computation are documented in Appendix A1.
CNRM-CM6-1 provides several significant improvements compared to CNRM-CM5.1, especially in terms of 2-m air temperature, TOA radiation (except clear-sky longwave radiation, see section 6.2), precipitation (except in SON), total cloud cover and low-level (850 hPa) winds. Other dynamical variables (sea level pressure, upper-level wind, 500-hPa geopotential height, and 200-hPa temperature) have either similar or slightly increased error amplitude, which is generally larger than many CMIP5 models. The worse cases are for surface pressure in winter and 500-hPa geopotential height. The former is mainly dominated by stronger biases in the southern hemisphere (see section 6.6), while the latter is consistent with a global cold bias in the free troposphere which increases with height (see section 6.5). Overall, CNRM-CM6-1 compares well with the CMIP5 models and would have belonged to the skilled CMIP5 models.

10.1029/2020MS002075
Journal of Advances in Modeling Earth Systems

Radiation and Clouds
ARPEGE-Climat 5.1 is known to lack from low-level clouds within the tropics and subtropics, so that the "too few, too bright" low-level cloud syndrome is particularly strong in this model (e.g., Nam et al., 2012). The lack of stratocumulus (clouds with high fractions) in CNRM-CM5.1 implies a strong underestimate of the SW cloud radiative effects in the eastern part of the subtropical ocean basins (in comparison to CERES data set, Figure 5c), which has to be compensated with brighter cumulus clouds in the trades to achieve a reasonable global energy balance in the model. Similarly, CNRM-CM5.1 also lacks from low-level clouds over the midlatitude continental regions (e.g., northern America and Eurasia, Figures 6c  and 6d), leading to overestimated shortwave cloud radiative effects in these regions. This bias is consistent with the development of summer warm biases in these regions (e.g., see also section 6.4 and Cheruy et al., 2014). Shortwave cloud radiative effects are overestimated over the midlatitude oceans (e.g., over the austral ocean or the northwest Pacific, Figure 5c), although the simulated cloud cover is lower than or similar to the CALIPSO estimates (Figures 6c and 6d). This shortwave bias is presumably increased by a lack of supercooled liquid water clouds, which would increase the cloud albedo in these regions (e.g., Bodas-Salcedo et al., 2014;Cesana et al., 2015;Kay et al., 2016). Longwave cloud radiative effects are mainly overestimated in the outflow regions of the Intertropical and South Pacific Convergence Zones (ITCZ and SPCZ, Figure 5d), where it is consistent with slightly overestimated high-cloud fractions (Figure 6d). Note that the known clear-sky outgoing LW radiation dry bias (e.g., Sohn et al., 2006) is taken into account as the CERES GCM-consistent clear-sky fluxes are used here (see section A1) Most of these biases are reduced in CNRM-CM6-1 (Figures 5e-5h, 6b, 6e, and 6f). Shortwave cloud radiative effects are improved over the midlatitude oceans and the eastern tropical ocean basins, although they remain highly underestimated close to the coast of Peru, Namibia, and California (see also Brient et al.,

10.1029/2020MS002075
Journal of Advances in Modeling Earth Systems 2019). They are also improved over the midlatitude continental regions. In both cases, this is consistent with the increase of cloud fractions, especially those associated with boundary layer clouds (not shown). Cloud radiative effect biases are also reduced in convective tropical regions and their outflow, in particular due to improved precipitation climatology (see section 6.3) and associated high clouds. In the convectively active tropics, CNRM-CM6-1 cloud fractions are generally similar to the CALIPSO estimates (Figure 6f), while they induce there too much reflection of shortwave radiation back to space (Figures 5e and 5g). This is thus probably related to excessive cloud condensed water or errors in cloud radiative properties. Though still significant, this bias was partly reduced through the increase of subgrid-scale variability of total water in the presence of convective ice (see section 3.2.2). Note finally the increased positive bias of cloud cover over ice-covered regions (Figures 6d and 6f), possibly related to the missing supersaturation process.

Journal of Advances in Modeling Earth Systems
Overall, although the improvements in radiation and cloud features probably arise from a better representation of the underlying moist processes with the renewed set of parameterizations, the increased number of degrees of freedom from ARPEGE-Climat 5.1 to 6.3 and a better and more formalized model calibration process, which includes radiation and cloud metrics as key targets (see section 3.4), are also likely to contribute significantly to these improvements.

Precipitation
Precipitation biases in CNRM-CM6-1 are slightly reduced as compared with CNRM-CM5.1 (Figure 7). Their spatial patterns strongly differ between the two versions. The double ITCZ syndrome, much present in CNRM-CM5.1 (e.g., Oueslati & Bellon, 2013, 2015, almost disappeared in CNRM-CM6-1. It is possibly associated with the use of stronger entrainment rates in the new convective scheme, which makes convection more sensitive to the free-troposphere humidity (see also section 2.2), increases the dilution of convective updrafts, and thereby inhibits deep convection in subsidence regions, as over the southeastern Pacific or in the subtropics (Oueslati & Bellon, 2013). The summer Indian monsoon was also a strong weakness of CNRM-CM5.1, with a large deficit of precipitation over India and the Bay of Bengal and too much precipitation over the tropical Indian Ocean. The spatial pattern of the Indian monsoon is significantly improved in CNRM-CM6-1, albeit with too much precipitation in the windward side of the Ghats and in the Bay of Bengal (also over Southeast Asia and the West Pacific). Note that these biases are similar when using

Journal of Advances in Modeling Earth Systems
TRMM 3B43 Version 7 precipitation estimates, and slightly stronger by 2 to 3 mm day −1 when using those from the GPCP Version 2.3 data set (not shown).
On the opposite, tropical Africa is particularly dry in CNRM-CM6-1 all year long (Figures 7e and 7f). The summer West African monsoon (and to some extent the Australian monsoon in boreal winter), which was reasonably well captured by CNRM-CM5.1 (Figure 7d; see also Roehrig et al., 2013), exhibits a strong precipitation deficit in CNRM-CM6-1, especially over the Eastern Sahel (Figure 7f). Yet, Martin et al. (2017) showed that the diabatic heating and moistening simulated by CNRM-CM6-1 are significant over the Sahel and generally in good agreement with those that can be estimated from reanalyses. This emphasizes that a good precipitation climatology, albeit a simple metric crucial for impact studies, is not a guaranty for an appropriate representation of monsoon processes. Several hypotheses need to be analyzed in depth in the future to fully understand this model behavior over West Africa: • In the semiarid conditions of the Sahel, Couvreux et al. (2015) showed that the CNRM-CM6-1 convective scheme tends to produce more congestus-like convective regimes than deep convective ones. • The update of the surface albedo, aerosol climatology and aerosol optical properties strongly impacts the radiative budget over the Sahara, which is argued in Martin et al. (2017) as key for the West African monsoon rain band position. bias (c and d, period 1979-2008), and CNRM-CM6-1 (e and f, period 1979-2008, 10-member amip-hist ensemble mean bias). The bottom row indicates zonal averages of the raw shortwave (g) and longwave (h) cloud radiative effects for the three data sets (CERES EBAF Edition 4.1 in black, CNRM-CM5.1 in red, and CNRM-CM6-1 in blue).

Journal of Advances in Modeling Earth Systems
• Evaporation of precipitation below the cloud base is key in organizing convection over the Sahel (e.g., Lafore et al., 2017), but, in the current convection parameterization, it likely significantly cools the boundary layer and acts as a strong negative feedback on convection.
The CNRM-CM5.1 dry biases over Amazonia (winter and summer) and the maritime continent (winter) are mostly reduced, with the notable exception of the southern part of Brazil and the northern part of Argentina in winter. Summer dry biases over North America and Central Eurasia are also weaker in CNRM-CM6-1 than in CNRM-CM5.1.

Near-Surface Air Temperature
Near-surface (2-m) air temperature biases are generally of similar amplitude in CNRM-CM5.1 and CNRM-CM6-1, both in winter and summer (Figure 8). In winter, the eastern Siberia warm bias and the China cold bias are increased. These differences between the two model versions are likely related to drastic changes in the snow cover and properties . Note however that the three reference data sets used here (see section A4) exhibit large uncertainties over these two regions (see also Decharme et al., 2019). Uncertainty is also large over Greenland and over Antarctica (only the Berkeley Earth Surface Temperature (BEST) data set provide temperature estimates over the latter region), which makes the apparent improvement over these regions difficult to fully confirm.
In summer, CNRM-CM5.1 has a strong warm bias over the midlatitude continent, as many CMIP5 models (e.g., Cattiaux et al., 2013;Ma et al., 2014). The strong warm bias over North America in CNRM-CM5.1 was attributed to underestimates of surface shortwave cloud radiative effects and low-and high-cloud fractions, as well as to an overestimate of surface evaporation in the previous months Van Weverberg et al., 2018). This warm bias is significantly reduced in CNRM-CM6-1, consistently with the increased cloud fraction and cloud radiative effects over North America. A similar behavior, albeit weaker, is also observed over Eurasia and Siberia. Over the tropical land of Africa and northern Australia, CNRM-CM6-1 develops warm biases, stronger than those in CNRM-CM5.1. They mainly follow the annual migration of the ITCZ and are likely associated with the lack of precipitation discussed in section 6.3. Figure 9 shows the annual-and zonal-mean temperature differences with respect to AIRS data set (see section A6) for CNRM-CM5.1 and CNRM-CM6-1. Biases are relatively similar, with a large cold bias in the tropical upper troposphere. This cold bias is only partly reduced in CNRM-CM6-1 and now extends further downward in the free troposphere. Cold biases at the extratropical tropopause are also reduced. These cold biases are common in recent climate models (e.g., Held et al., 2019;Mauritsen et al., 2019) and are generally attributed to deficiencies in semi-Lagrangian advection schemes. The zonal-mean specific humidity is compared with AIRS data set on Figure 10. As CNRM-CM5.1, CNRM-CM6-1 is drier than AIRS in the tropical PBL and free troposphere, where CNRM-CM5.1 exhibits a strong wet bias over the whole column. The latter is mainly located in the Indian-Pacific ITCZ region (not shown), possibly emphasizing an overly active deep convection scheme in CNRM-CM5.1. This wet bias almost vanishes in CNRM-CM6-1, consistently with the use of increased entrainment rates in the convection scheme, which tends to inhibit deep convection. In contrast, the CNRM-CM6-1 zonally averaged dry bias around 10°N mostly emerges from the significant precipitation underestimate over the African continent (see section 6.3). In ): average between the MSWEP Version 1.2, GPCP Version 2.3 and TRMM 3B43 Version 7 raw precipitation data set (a and b, see section A3), CNRM-CM5.1 bias (c and d), and CNRM-CM6-1 bias (e and f). The bottom row indicates zonal averages of the raw precipitation averages (g and h, reference data set in black with minimum/maximum in gray shading, CNRM-CM5.1 in red and CNRM-CM6-1 in blue).

Atmospheric Temperature and Moisture
the tropical upper troposphere, the CNRM-CM5.1 dry bias is replaced by a moist bias in CNRM-CM6-1. Subtropical and extratropical moist biases are significantly reduced in CNRM-CM6-1.

Mean Circulation
The representation of the mean circulation by CNRM-CM6-1 is first addressed with the sea level pressure, in both winter and summer, and for both the Northern and Southern Hemispheres (Figures 11 and 12). Note that the reference data sets provide a single realization, while the interannual variability is known to be high in the middle and high latitudes. However, given the amplitude of most sea level pressure biases, and the inter-member variability in the CNRM-CM6-1 ensemble (not shown), we expect those biases to be robust. Besides, the sea level pressure consists of a significant extrapolation in elevated region (Källén, 1996), so that biases in such regions should be considered with care. Nevertheless, reanalyses and climate models uses

10.1029/2020MS002075
Journal of Advances in Modeling Earth Systems similar formulas for this extrapolation, so that strong biases are expected to be robust and reflect surface pressure and air temperature biases.
In winter, the Northern Hemisphere is characterized by two pressure lows, over Iceland in the Atlantic and over the Pacific, while a pressure high prevails over the Eurasia continent (Figure 11a). In CNRM-CM5.1 (Figure 11c), the Atlantic low extends too far southeastward, so that the low-level circulation is too zonal. In contrast, the Pacific low is slightly displaced to the north. While CNRM-CM6-1 has improved over the northern Pacific, the Atlantic low remains similarly biased. Over Eurasia, the pressure high is slightly displaced southward, inducing overestimated sea level pressure over the Himalayas, consistently with the cold surface bias (section 6.4). In summer, the position and intensity of the Northern Hemisphere pressure highs are mostly improved from CNRM-CM5.1 to CNRM-CM6-1. The positive bias in sea level pressure over the Arctic remains significant, albeit slightly more confined to Greenland and the North Pole.
In the Southern Hemisphere, the winter and summer mean circulation consists of a pressure low over the Antarctic coast and the surrounding oceans, and a band of high pressure over the austral ocean, which drives strong westerlies in between (Figures 12a and 12b). This circulation is stronger during the austral winter (JJAS). It is mostly deteriorated from CNRM-CM5.1 to CNRM-CM6-1, both in winter and summer (Figures 12c-12f). The austral summer (DJFM) bias keeps approximately the same but enhanced pattern: The pressure low is positively biased thus too weak, while the high-pressure band is negatively biased, thus too weak also. As a result, the latitudinal mean pressure gradient is underestimated and the strength of associated low-level circulation in the austral ocean is too weak in both models. In austral winter (JJAS), although the sea level pressure bias over the Antarctic plateau is significantly reduced, the CNRM-CM6-1 bias remains rather similar to that during the austral summer (Figure 12f).
In terms of zonal-and annual-mean zonal wind biases (Figure 13), the position of the subtropical jet streams are well simulated. In the tropical upper troposphere, a positive bias appears in CNRM-CM6-1, associated with a inaccurate Walker circulation, especially over the African region. The latter bias is consistent with a lack of convective activity which thus prevent the reacceleration of the tropical easterly jet over central Africa.

Near-Surface Air Temperature Variability
In the present section, we evaluate both the intraseasonal and diurnal variability of near-surface (2-m) air temperatures. The intraseasonal variability of daily temperature is assessed following Fischer et al. (2012). First, at each grid point, we subtract from the 1979-2008 time series of daily mean raw temperatures a long-term trend and a periodic annual cycle in order to only retain intraseasonal time scales. The long-term trend is computed as a linear regression fitted on yearly averages and then extrapolated over all Figure 10. Same as Figure 9 but for the specific humidity bias (in g kg −1 ). Contours indicate AIRS specific humidity raw values from 1 to 15 g kg −1 ,every 2 g kg −1 .

Journal of Advances in Modeling Earth Systems
days; it is centered so that its time average is 0. The annual cycle is computed as multiyear averages for each calendar day, then smoothed by periodic splines with 12 degrees of freedom. The intraseasonal variability is finally measured as the standard deviation of the resulting temperature anomalies, for both summer (JJAS) and winter (DJFM) seasons. Note that although this diagnostic also includes interannual variability, it is mostly driven by intraseasonal time scale (not shown).
In the reference data set (ERA5, see section A4), the intraseasonal temperature variability is stronger (i) in winter, (ii) over land, and (iii) at high latitudes; the standard deviation of intraseasonal temperature anomalies can reach 10 K in Siberia and North America (Figure 14). Note that the use of other reference data sets,

Journal of Advances in Modeling Earth Systems
such as BEST , provides similar results (not shown). CNRM-CM5.1 performed reasonably well in simulating the daily variability: global mean bias of 0, RMSE of 0.6 K in winter and 0.7 K in summer. Global values nevertheless hide regional biases, such as a generally overestimated variability over land. Over ocean, intraseasonal variability is slightly underestimated, possibly due to the SST forcing that uses monthly means. CNRM-CM6-1 have comparable global performances. Though, it is slightly deteriorated in winter, in particular, over northern midlatitude land regions where it exhibits a strong negative bias at around 60°N compensated by a strong positive bias at around 30°N. The origins of this zonal structure in the variability require further investigation but seem consistent with the zonal bias of the midlatitude low-level circulation in this region, which is exacerbated in CNRM-CM6-1 (see Figure 11e). In summer, CNRM-CM6-1 performs slightly better than CNRM-CM5.1 over midlatitude Figure 12. Same as Figure 11, but for the Southern Hemisphere.

Journal of Advances in Modeling Earth Systems
land areas. Note that both model versions have strong biases over high latitudes, ice-covered areas. CNRM-CM5.1 strongly underestimated the temperature variability over the Arctic in DJFM and over the Antarctic in JJAS, due to the prescription of monthly-mean surface temperature over sea ice. In contrast, the inclusion of an explicit parameterization of sea ice surface temperature (see section 3.3.1) now yields an overestimate of the intraseasonal variability of temperature over sea ice in CNRM-CM6-1. It is likely related to biases in the sea ice thermodynamical properties and will be further investigated in the future.
The diurnal variability of near-surface temperature is evaluated using the diurnal temperature range (DTR), which is simply the difference between the daily maximum and the daily minimum temperatures (e.g., Cattiaux et al., 2015). The DTR is stronger over land in the HadEX2 reference (see section A4), and maximum (i) in summer and (ii) at low latitudes (Figures 15a and 15b); it can locally exceed 20 K on average over JJAS in California (Figure 15b). The use of other reference products provides similar findings (e.g., BEST, not shown). CNRM-CM5.1 globally overestimated the DTR by about 1 K, this bias being much stronger (5 to 10 K) in subtropical land areas, resulting in high RMSEs. These biases have been significantly reduced in CNRM-CM6-1 (lower RMSEs), although the global mean bias (now negative) remains of the same order of magnitude (about 1 K).
These biases in both the temperature intraseasonal variability and the DTR have direct consequences for the representation of daily temperature extremes, as assessed from the annual minimum of minimum temperature (TNn) and the annual maximum of maximum temperature (TXx) in Figure S3 (e.g., Sillmann et al., 2013). In particular, the overestimation of TXx over midlatitude land areas by CNRM-CM5.1 has been reduced in CNRM-CM6-1, but the representation of TNn has been slightly deteriorated over Eurasia.

Daily Rainfall Distribution
In the present section, we assess the ability of CNRM-CM6-1 to simulate the distribution of daily precipitation. First, we define rainy days as days with mean precipitation greater than 1 mm day −1 (Figure 16). The GPCP v1.3 and TRMM 3B42 V7 reference data sets indicates a mean frequency of rainy days of about 60% to 80% in the ITCZ region, while it falls below 20% over a large part of the subtropics (Figure 16a). Over the midlatitudes, it rains about 50% of the time. The third reference data set MSWEP v1.2 indicates a much higher frequency of rain events by 10% to 20% (Figure 16b). Given that this rainfall product combines both observational and reanalysis data, such a difference is expected as numerical models have been shown to overestimate the frequency of light precipitation (e.g., Dai, 2006;Flato et al., 2013).
CNRM-CM5.1 was much concerned by this climate model systematic error. It strongly overestimates the frequency of rainy days over most of the tropics, except in the eastern subtropical ocean basins where it is rather consistent with the observations (Figures 16b-16d). Similarly, the model also rains too often over the midlatitude storm tracks. CNRM-CM6-1 significantly reduces these biases over most of the globe, except over tropical continental areas and over the maritime continent (Figures 16b, 16e, and 16f). There, the rainy 10.1029/2020MS002075

Journal of Advances in Modeling Earth Systems
day frequency is underestimated, indicating that the dry biases mentioned in section 6.3 are partly related to a lack of rainy events. Figure 17 now focuses on the full distribution of daily precipitation associated with rainy days and evaluates it in the two CNRM models based on a quantile-quantile approach. Tropical land and ocean are separated to account for different regimes of convection and different model behaviors. Several reference data sets are used to emphasize the uncertainty around this distribution (gray markers). It is rather large over the ocean, especially for the largest quantiles of the distribution (Figure 17b). As most climate models (e.g., Chen & Dai, 2019;Dai, 2006;Sun et al., 2006), CNRM-CM5.1 overestimates the frequency of light rain events (typically below 5 mm day −1 ), especially over ocean (inset in Figure 17b), and underestimates that of moderate-to-heavy rain over both regions. The most extreme events (above the 99.5% quantile) are generally too intense. They are most probably associated with the so-called grid point storm syndrome (e.g., Scinocca & McFarlane, 2004). The simulated precipitation distribution is dramatically improved in CNRM-CM6-1 and becomes very similar to that of TRMM 3B42 V7 (presumably because we mainly used this data set during the development of the model). Nevertheless, the model still

Journal of Advances in Modeling Earth Systems
produces too many rainy events in the range 5-20 mm day −1 over ocean (inset in Figure 17b). Similar improvements are found over the midlatitudes (not shown).

Precipitation Variability
Over the tropics and the subtropics, precipitation is strongly modulated at both the intraseasonal and diurnal time scales.
The intraseasonal variability of tropical precipitation can first be quantified as the variance of precipitation intraseasonal anomalies. These anomalies are computed by removing from raw precipitation the first three harmonics of the mean daily annual cycle. To the first order, the improvements from CNRM-CM5.1 to CNRM-CM6-1 in the representation of precipitation intraseasonal variability are consistent with those of the precipitation climatology ( Figures S4 and 7), especially over the tropical ocean. Using a wavenumber-frequency filtering procedure, the precipitation intraseasonal variability can be further decomposed into different modes, namely, the Madden-Julian Oscillation and convectively coupled tropical waves (Roundy & Frank, 2004;Schreck et al., 2012;Wheeler & Kiladis, 1999). The filter domains of Kiladis et al. (2005) for the MJO and Wheeler and Kiladis (1999) for all other tropical modes are applied to precipitation intraseasonal anomalies. The variance associated with each mode is further averaged between 10°S and 10°N ( Figure 18). Voldoire et al. (2019) pointed out that the coupled version of CNRM-CM6-1 simulates a much weaker MJO than the coupled version of CNRM-CM5.1. Here, in SST-imposed configurations, both

10.1029/2020MS002075
Journal of Advances in Modeling Earth Systems models underestimates the precipitation variance in the MJO wavenumber-frequency domain (30-to 90-day period, Wavenumbers 1 to 6, Figures 18a and S5). For CNRM-CM5.1, this is consistent with previous studies (e.g., Leroux et al., 2016). Nevertheless, the MJO signal in CNRM-CM6-1 is slightly more active over the Indian-Pacific region than in CNRM-CM5.1. This might be consistent with an improved representation of the shallow convective moistening, similarly to the mechanism reported by Hirota et al. (2018) for another climate model.
Although CNRM-CM6-1 generally underestimates the variance associated with convectively coupled equatorial waves, the zonal distribution of their associated variance is improved with respect to CNRM-CM5.1 (Figures 18c-18e). An exception concerns equatorial Rossby waves, which are now overly active over the maritime continent and the western Pacific (Figure 18b). Similarly to their respective coupled versions , the Kelvin wave propagation speed decreases from CNRM-CM5.1 to CNRM-CM6-1 ( Figure S5). The changes from CNRM-CM5.1 to CNRM-CM6-1 are likely due to the major update of the atmospheric physics. Nevertheless, further investigation is required to understand which parameterization feature is key and whether it is related to changes in process representation or changes in the model mean state (Leroux et al., 2016).
The diurnal cycle is also a fundamental property of precipitation variability. It has been shown to peak in the early evening and night over continental areas, and in the late night and early morning over the ocean (e.g., Yang & Slingo, 2001;Figures 19a and 19b). These properties of the diurnal cycle are particularly difficult to capture by state-of-the-art climate models (e.g., Covey et al., 2016;Dai, 2006). CNRM-CM5.1 reasonably captures the night peak over the ocean, albeit slightly too early (Figures 19c and 19d). However, over land, the precipitation maximum is stuck to the local maximum of solar insolation, with almost no variability in space. Its amplitude is also significantly overestimated especially over Central Africa and South America in boreal winter ( Figure S6). CNRM-CM6-1 improves the phase of the diurnal cycle. Although the precipitation peak over land remains generally too early, the spatial variability of the diurnal phase is now much more similar to that of the reference, especially across coastlines (e.g., the Maritime Continent in winter- Figure 19e-or Central America in summer- Figure 19f) or near orographic features (e.g., South America near the Andes). Consistently with the dry bias over tropical land regions, the diurnal cycle amplitude is now significantly underestimated, in particular over Africa and Amazonia ( Figure S6). Over ocean, the phase of the diurnal cycle is generally delayed by a few hours, thus in closer agreement with the reference.

Variability in the Midlatitudes
A common way to characterize the midlatitude storm tracks is to compute the standard deviation of the 2-to 6-day band-pass-filtered daily geopotential height at 500 hPa or mean sea level pressure (Harvey et al., 2015;Hoskins & Hodges, 2002). Figure 20 shows this metric using the 500-hPa geopotential height for ERA5, CNRM-CM5.1, and CNRM-CM6-1. Both CNRM-CM5.1 and CNRM-CM6-1 capture well the location of the maxima, over the North Atlantic and the North Pacific, but the storm activity is similarly underestimated in both models.
The midlatitude variability can be further characterized by the alternating of weather regimes (e.g., Michelangeli et al., 1995;Robertson & Ghil, 1999), among which the blocking regimes is of particular importance for the occurrence of extreme weather events over North America and Europe, either cold episodes in winter (e.g., Pfahl & Wernli, 2012;Sillmann et al., 2011) and warm episodes in summer (e.g., Cassou & Cattiaux, 2016). Atmospheric blockings are persistent high-pressure systems that temporarily block the midlatitude westerly flow. They mostly occur over eastern parts of the Pacific and Atlantic basins. The representation of blockings by the CNRM-CM models is evaluated using the index proposed by Scherrer et al. (2006), which is a two-dimensional generalization of the one-dimensional Tibaldi-Molteni index (Tibaldi & Molteni, 1990). Each grid point is considered as blocked when the local meridional gradient of the 500-hPa geopotential height verifies two criteria: (i) the northern gradient must be lower than −10 m per degree of latitude and (ii) the southern gradient must be greater than 0 m per degree of latitude. Figure 21 shows the wintertime region. Note that all data sets are conservatively regridded onto the CNRM-CM6-1 grid before computing the quantiles. (b) Same as (a) but for tropical ocean grid points (note that CHIRPS data set is land only). Only one member is used for each of the models.

Journal of Advances in Modeling Earth Systems
climatological frequency of blockings for the ERA5 reanalysis (see section A5), CNRM-CM5.1 and CNRM-CM6-1. In ERA5 (Figure 21a), blockings usually occur over the Northern Pacific, Greenland and Northern Europe. The spatial distribution of blockings is well captured by the CNRM-CM models but their frequency of occurrence is underestimated (Figures 21b and 21c). This behavior is common for most CMIP5 models and can be explained by biases in the mean state (Davini et al., 2017;Woollings et al., 2018). Nevertheless, the blocking frequency underestimate is reduced from CNRM-CM5.1 to CNRM-CM6-1, consistently with the improvement in the representation of the midlatitude dynamics in the Atlantic and Pacific sectors (Figures 11c and 11d), that of the low-level eddy-driven jet found in Oudar et al. (2020) and that of the North Atlantic Oscillation found in Voldoire et al. (2019).

Annual Cycle of River Discharge
Discharges at the outlet of major rivers integrate the water cycle (surface water budget, drainage, and runoff processes) over large river basins and therefore partly reflects the skill of the land-atmosphere model to capture the annual cycle of precipitation and evaporation at the regional scale, as well as the high-frequency dynamics of precipitation (as it relates to runoff for instance). Such diagnostics are thus relevant to assess the ability of climate models to represent some of the main water cycle properties. Decharme et al. (2019) evaluate the river discharges at the outlet of major rivers of the world as simulated by the ISBA-CTRIP land surface model (section 3.3.3), using an "offline" approach in which the land surface model is driven by two sets of "debiased" meteorological forcing (see their Figure 14). They found major improvements from the land surface model used in CNRM-CM5.1 to the ISBA-CTRIP version used in CNRM-CM6-1, in particular due to the addition of new processes: stream flow variable velocity, diffusive soil scheme, floodplains, and aquifers. Here, using the same observations as in Decharme et al. (2019; see also  (e and f) same as (a) and (b), respectively, but for CNRM-CM6-1. The mean diurnal cycle is computed from 3-hourly average precipitation rates. The phase then corresponds to that of the mean diurnal cycle first harmonic (e.g., Covey et al., 2016). Note that all grid points with a first harmonic amplitude lower than 0.1 mm day −1 are masked. Only one member is used for each of the models.

Journal of Advances in Modeling Earth Systems
section A7), we assess the ability of CNRM-CM6-1 to reproduce the mean annual cycle of the discharges of a few major rivers and to improve it over CNRM-CM5.1 (Figure 22). The main improvements found in the "offline" approach of Decharme et al. (2019) remain valid when ISBA-CTRIP is coupled to ARPEGE-Climat 6.3, emphasizing the key role of surface water processes to represent the seasonal dynamics of river discharges. It also demonstrates that CNRM-CM6-1 captures the main features of the annual cycle of precipitation over several river basins. Figure 22 indicates that CNRM-CM6-1 has a remarkable ability to properly represent the seasonal flow of all Arctic rivers (represented here by the Mackenzie, the Ob, and the Lena) even if a too early springtime snowmelt shifts the annual peak of discharges. While this problem is well known (see Decharme et al., 2019), the better seasonal dynamics of river discharges found here is especially due to both the new ISBA-CTRIP physics and to the improved annual cycle of precipitation over the associated basins.
In the Northern Hemisphere midlatitudes, the improved climatology of precipitation in CNRM-CM6-1 over the eastern United States (despite an overestimation of spring precipitation) allows to better represent the flows of the Mississippi compared to CNRM-CM5.1. In Europe, a persistent overestimation of precipitation during winter and spring leads to an overestimation of Danube discharges throughout the year, both in CNRM-CM5.1 and CNRM-CM6-1. The improved seasonal dynamics of the simulated discharges in these two basins is mainly related to the addition of groundwater processes, which contributes to delay the intense flows of rivers from the wet season to the dry season . This also explains why, despite underestimated summer precipitations in these regions, the river flows remain overestimated during the summer, in connection with heavy winter or spring precipitations.
In the tropics, the seasonal dynamics of river flows is reasonably well captured by CNRM-CM6-1. Over the Amazon and Paraná basins in South America and over the Congo basin in Central Africa, the strong underestimation of precipitation leads to a strong underestimation of the simulated annual discharges. In contrast, over South Africa, the overestimation of precipitation leads to an overestimated discharge of the Orange River. In West Africa, the case of Niger with its endorheic zones not represented in CNRM-CM6-1 (as in CNRM-CM5.1) remains problematic : even if the West African monsoon precipitation is underestimated in CNRM-CM6-1, the Niger flow remains largely overestimated. The CNRM-CM6-1 underestimated monsoon precipitation over southwest Asian regions (in particular the foothills of the himalayas, Figure 7). It induces an underestimated flow of the Ganges. In southeast regions (Cambodia and Vietnam), the annual cycle flow of the Mekong is significantly improved in CNRM-CM6-1 compared to CNRM-CM5.1.

Conclusions
The present paper presents and evaluates the latest version (

Journal of Advances in Modeling Earth Systems
model CNRM-ESM2-1 (Séférian et al., 2019) developed at CNRM and CERFACS and used to perform many experiments within CMIP6 context. ARPEGE-Climat 6.3 consists of a significantly increased vertical resolution and of major updates of its parameterizations, including that of turbulence, convection and microphysics processes. All of the new parameterizations add new prognostic variables to the model, thus increasing the memory of the atmospheric state from one time step to the other. Consistency and better coupling between the parameterizations were also major objectives, even though further work is still Figure 22. Comparison between the mean annual cycle (mm day −1 ) of simulated and in situ measured daily river discharges near the outlet of the major basins of the world during the 1979-2008 period. Observations (see section A7) are in black, CNRM-CM5.1 in blue, and CNRM-CM6-1 in red where shading shows ±1.64 times the intermember standard deviation. The annual simulated discharge ratio to observation (r d ) as well as the annual ratio of simulated to observed precipitation rates (r p ) are also shown for each basin and model. The observed precipitation rate used to calculate this ratio is the average of the same three products as in Figure 7.

10.1029/2020MS002075
Journal of Advances in Modeling Earth Systems required. Its coupling with the surface through the SURFEX platform, also involves new schemes, especially over sea ice with an explicit representation of surface temperature and albedo, and over land where the new ISBA-CTRIP model now includes floodplains, aquifers and more complex snow and soil parameterizations. These developments and their integration in ARPEGE-Climat took more than a decade before the full model achieves an acceptable behavior. ARPEGE-Climat 6.3 thus also results from long-term efforts in model calibration conducted in the CNRM climate modeling team. In particular, a specific attention was given to several features of the global energy budget (global mean and regional patterns), as required by the coupling of ARPEGE-Climat 6.3 to the ocean. This achievement limits the drift in long-term coupled integrations .
A general evaluation of the CMIP6 DECK amip experiment of CNRM-CM6-1 provides an overview of the new model skills in simulating the present-day mean climate and variability, and how several of them have been improved over its previous version CNRM-CM5.1. The global energy budget, both at the top of the atmosphere and at the surface, is consistent with the various estimates provided in the refereed literature. The regional details of this energy budget are also significantly improved, thanks to a better representation of clouds (radiative effects and cover). Precipitation biases are also reduced, especially over the tropical Pacific, Asia and the midlatitudes, while they are strongly increased over Africa. Circulation biases are of similar amplitude between CNRM-CM5.1 and CNRM-CM6-1 in the Northern Hemisphere, and however strengthened in the Southern Hemisphere. Several features of the climate variability are significantly improved in CNRM-CM6-1, including the diurnal variability of near-surface air temperature, the daily precipitation distribution from light to extreme rainy events and the spatial variability of the precipitation diurnal cycle. Finally, as an integrator of the water cycle over land, major improvements are found regarding the river discharges at the outlet of several major rivers of the world, especially over middle and high latitudes. Most of them are related to the improved precipitation annual cycle and to the update or increased complexity of the land surface representation. Although they were not documented here, this new ARPEGE-Climat version also improves several features of the mean state and variability of the stratosphere, which will be detailed in future studies. Most of these improvements remain valid in the coupled version of CNRM-CM6-1 Voldoire et al. (2019) and in its Earth system version (CNRM-ESM2-1, Séférian et al., 2019).
Of course, many biases remains: lack of low-level clouds in the eastern parts of ocean basins, weak tropical intraseasonal variability, too early peak in the precipitation diurnal cycle over land. A few new biases also emerge, such as the lack of precipitation over several tropical continental areas (e.g., Africa and North Australia). Several of these biases will be investigated in detail in the future, in particular to understand whether they are related to calibration issues or to structural limits of ARPEGE-Climat parameterizations (e.g., as in Brient et al., 2019).
Further validation efforts will be undertaken to benefit from the various model configurations. Starting at the process level with single-column simulations, the ARPEGE-Climat atmospheric physics will be more systematically assessed against a wide diversity of well-documented cases. Nudged or initialized configurations have also shown their potential to better disentangle physical processes from the dynamics and to facilitate the comparison with observations, especially from field experiments (e.g., Brient et al., 2019;Diallo et al., 2017). The ARPEGE-Climat 6.3 atmospheric physics is also the atmospheric component of the CNRM regional model CNRM-ALADIN63 used over several regions, such as the Euro-Mediterranean area  or the southeast tropical Atlantic (Mallet et al., 2019). This provides opportunities to get new and original looks at the ARPEGE-Climat 6.3 physics. Besides, the ongoing convergence of ARPEGE-Climat with the ARPEGE NWP version used for operational activities at Météo-France will ultimately ensure a robust validation of the model on a day-to-day basis.
ARPEGE-Climat 6.3 has also been used recently at higher horizontal resolution (T359, ∼50 km) within the CMIP6 framework (CNRM-CM6-1-HR), keeping everything the same as in the presently described version except a slightly retuned horizontal diffusion. It is currently under evaluation, but preliminary results suggest that the increased resolution only marginally improved the model performance, except in the context of tropical cyclones (Roberts et al., 2020).
ARPEGE-Climat 6.3 results from a major update of most ARPEGE-Climat physical parameterizations. As such, several behaviors of the model still need to be documented and understood, and the representation of several processes still needs to be improved. In particular, a few priorities emerge: • Energy conservation of the model, though improved in this new version, still requires further investigation, at least because it might question long-term integration of coupled configurations (e.g., Boville, 2000). • The physical consistency and coupling between parameterizations requires further improvements, such as for the microphysical hypotheses, which are currently different between the microphysics and radiation schemes, or for the coupling between the convection, turbulence and cloud schemes, which is rather crude yet. • The use of a unified and continuous approach for representing shallow and deep convection raised several difficulties during the process of model calibration. It needs to be further assessed, in particular against spectral parameterization approaches. • The model calibration remains critical but yet still empirical. New approaches, based on more advanced statistical tools, emerge (e.g., Williamson et al., 2013Williamson et al., , 2017 and seem promising to first constrain model free parameters at the process level, as in a single-column model framework, and then use this process-level information to further constrain them in the final 3-D model configuration.
This new version of ARPEGE-Climat thus serves as a new starting point for further model development at CNRM, which will hopefully continuously increase the model performance in terms of climate representation and climate-relevant processes.

Appendix A: Reference Data Sets
The present appendix provides a few details about the reference data sets used for model evaluation. All data were interpolated on the CNRM-CM6-1 model grid using a bilinear algorithm, except for radiation and precipitation for which a first-order conservative algorithm was used.

A1. Radiation
The CERES Energy Balanced and Filled (EBAF) Edition 4.1 data set provides TOA and surface radiation fluxes for the period 2001-2018 at a 1°× 1°resolution. Edition 4.0 is described in Loeb et al. (2018) and Kato et al. (2018). Edition 4.1 mainly differs from 4.0 by an update of the aerosol and cloud data sets used for the computation of surface fluxes (see CERES website for more detail). Edition 4.1 provides clear-sky fluxes estimated in a similar way as usually done in climate models, that is, using the information of the total-sky column but ignoring clouds (no screening of clear-sky pixel only). The clear-sky outgoing LW radiation dry bias (e.g., Sohn et al., 2006) is thus reduced when using these estimates. These latter fluxes are used to derive TOA cloud radiative effects. In comparison to those computed using observed-only clear-sky fluxes, globally averaged cloud radiative effects are reduced by 0.5 and 2.2 W m −2 in the shortwave and the longwave, respectively. Such a reference for cloud radiative effects is more consistent with the way they are usually computed in climate models.

A2. Clouds
Cloud cover is documented using observations from the Cloud Aerosol Lidar with Orthogonal Polarization (CALIOP, Winker et al., 2007) lidar onboard CALIPSO. CALIPSO Vertical Feature Mask Version 4.20 (VFM; Vaughan et al., 2009) consists in a binary cloud mask on a high-resolution grid: 333 m in the horizontal and 30 m in the vertical up to 8.2 km, 1 km in the horizontal and 60 m in the vertical from 8.2 to 20 km. Several products are available (see Chepfer et al., 2013;Hagihara et al., 2014, for comparisons) and the VFM is used because of its ability in detecting very thin high clouds. The cloud cover is computed from the instantaneous orbital data on the model grid as the fraction of the grid point that is covered by clouds. Data are then averaged at a monthly time scale and cover the period 2007-2017.