Observed and CMIP5‐Simulated Radiative Flux Variability Over West Africa

We explore the ability of general circulation models in the Coupled Model Intercomparison Project Phase 5 (CMIP5) to recreate observed seasonal variability in top‐of‐the‐atmosphere and surface radiation fluxes over West Africa. This tests CMIP5 models' ability to describe the radiative energy partitioning, which is fundamental to our understanding of the current climate and its future changes. We use 15 years of the monthly Clouds and the Earth's Radiant Energy System Energy Balanced and Filled (EBAF) product, alongside other satellite, reanalysis, and surface station products. We find that the CMIP5 multimodel mean is generally within the reference product range, with annual mean differences (CMIP5 multimodel mean minus EBAF) of −0.5 W m−2 for top‐of‐the‐atmosphere reflected shortwave radiation and 4.6 W m−2 for outgoing longwave radiation over West Africa. However, the range in annual mean of the model seasonal cycles is large (37.2 and 34.0 W m−2 for reflected shortwave radiation and outgoing longwave radiation, respectively). We use seasonal and regional contrasts in all‐sky fluxes to infer that the representation of the West African monsoon in numerical models affects radiative energy partitioning. Using clear‐sky surface fluxes, we find that the models tend to have more downwelling shortwave and less downwelling longwave radiation than EBAF, consistent with past research. We find that models that are drier and have lower aerosol loading tend to show the largest differences. We find evidence that aerosol variability has a larger effect than water vapor in modulating downwelling shortwave radiation in EBAF, while the opposite effect is seen in the majority of CMIP5 models.


Introduction
The global balance at the top-of-the-atmosphere (TOA) between net incoming shortwave radiation and outgoing longwave radiation (OLR) emitted by the Earth and its atmosphere is a World Meteorological Organization essential climate variable (GCOS, 2010) and fundamentally determines the energy budget of our climate system. Long-term observations of reflected shortwave radiation (RSR) and OLR underpin our understanding of the TOA radiation budget, leading to estimates of the global annual mean OLR and RSR (Brindley & Bantges, 2016; Hartmann et al., 2013; Schuckmann et al., 2016). Global surface budgets, for which models are often used due to the scarcity of direct observations, have also been extensively studied (Hatzianastassiou et al., 2005; Li et al., 1997; Wild et al., 2005, 2013; Wild, 2009). Studied together, TOA and surface radiation fluxes have informed our understanding of atmospheric absorption (Arking, 1996; Hakuba et al., 2016; Miller et al., 2012) and of the energy flows of the Earth-atmosphere system (Kiehl & Trenberth, 1997; Stephens et al., 2012; Trenberth et al., 2009; Wild et al., 2015). Both natural and anthropogenic forcings, as well as feedbacks from (predominantly, but not limited to) clouds, aerosols, water vapor, and surface characteristics, lead to regional and temporal variability in RSR and OLR within these global budgets. Disentangling these underlying influences on the radiation budget is difficult, requiring coincident measurements of a variety of relevant atmospheric variables.
Earth and Space Science 10.1029/2019EA001017
It is crucial that general circulation models (GCMs) are able to capture the observed variability in the TOA radiation balance in order to improve confidence in future climate projections. The Coupled Model Intercomparison Project, Phase 5 (CMIP5; Taylor et al., 2012) provides a set of model simulations of both historical and future climates. These have been extensively compared to observations for model evaluation and to help understand inter-model spread due to uncertain feedbacks, such as those associated with clouds (Dolinar et al., 2015; Stanfield et al., 2014; Trenberth et al., 2015; Zhang et al., 2005). Globally, although CMIP5 mean biases of TOA fluxes are much reduced from CMIP3, an earlier phase of the intercomparison project, this is largely due to error cancelation, and regional biases remain high, particularly in convectively active regions (Li et al., 2013).
Indeed, while much focus has been on the global mean radiation budget in CMIP5, the ability of GCMs to describe observed patterns in radiation variability at regional scales is equally important. For example, accurate estimates of changes to synoptic-scale weather systems and associated changes in temperature, atmospheric water vapor, aerosol loading, and surface characteristics are needed on a regional scale to support appropriate planning and mitigation. In order to understand how these systems may change, and the subsequent regional impact on the radiation budget, it is vital to understand to what extent they can be accurately modeled in historical simulations. West Africa, the region focused on here and described in more detail in section 2, is one such region: its radiation budget is heavily determined by the progression of the West African monsoon (WAM; Sultan & Janicot, 2003) and by other competing factors, such as aerosols from mineral dust and biomass burning.
Satellite data are typically recorded at appropriate spatial resolution, coverage, and time scales for GCM evaluation. One such product, the Clouds and the Earth's Radiant Energy System (CERES; Wielicki et al., 1996) Energy Balanced and Filled (EBAF; Loeb et al., 2009) TOA and surface product, has provided a key part of our understanding of radiation budgets and has been used extensively (e.g., Calisto et al., 2014; Dolinar et al., 2015; Li et al., 2013). However, surface radiation fluxes and atmospheric absorption retrieved from satellite data are not an observational "truth," and while there has been validation on a global scale (Rutan et al., 2015), significant observational uncertainties remain.
Our aim here is twofold. The first is to understand to what extent the CMIP5 models are able to describe the regionally integrated seasonal variability in surface and TOA radiation fluxes over West Africa, using satellite products, reanalysis, and surface measurement sites as references. Biases in the radiation fields are examined in relation to cloud radiative effects, water vapor, and aerosol. Second, we attempt to interpret reference and model differences common to many CMIP5 models. In particular, we explore (a) to what extent radiation biases associated with the WAM progression can be attributed to biases in coupled model sea surface temperatures and (b) the influences of aerosols and water vapor on both atmospheric clear-sky shortwave absorptivity and clear-sky downwelling radiation at the surface.
The three regions chosen here are defined as follows: the Sahel as 10-20°N, 12°W to 15°E, similar to that used by Zhou et al. (2007); the Sahara as 20-30°N, 10°W to 20°E; and the coastal region as 5-10°N, 8°W to 8°E. The wider West African region is defined as 5-30°N, 15°W to 20°E. The Sahara is arid and is characterized by a dry, dusty climate and high surface albedo. The Sahel is a semiarid region: typically dry and dusty with a high surface albedo in the dry season, contrasting with higher humidity and precipitation in the wet season, which lead to a large increase in vegetation and a decrease in surface albedo (Milton et al., 2008, 2009). For a more comprehensive overview of radiative processes in the Sahel, we refer the reader to our previous study (Mackie et al., 2017). The Sahel is also particularly vulnerable to changes in climate, having already suffered from extensive droughts in the past decades (L'Hôte et al., 2002), and has been identified as one of a number of climate change hot spots (Diffenbaugh & Giorgi, 2012).
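The region definitions above can be sketched as latitude-longitude boxes with a simple area-weighted regional average. This is a minimal illustration only: the function name, the data layout (nested lists indexed by latitude then longitude), and the cos(latitude) weighting are assumptions made for the example, not the method stated in the text.

```python
import math

# Region boxes as (lat_min, lat_max, lon_min, lon_max) in degrees.
# West longitudes are negative. Bounds are taken from the text above.
REGIONS = {
    "sahel": (10.0, 20.0, -12.0, 15.0),
    "sahara": (20.0, 30.0, -10.0, 20.0),
    "coastal": (5.0, 10.0, -8.0, 8.0),
    "west_africa": (5.0, 30.0, -15.0, 20.0),
}


def regional_mean(field, lats, lons, region):
    """Area-weighted mean of field[i][j] (lat index i, lon index j) in a box.

    Weights are cos(latitude), a common approximation on regular
    latitude-longitude grids (an assumption, not the paper's stated method).
    """
    lat_min, lat_max, lon_min, lon_max = REGIONS[region]
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                total += weight * field[i][j]
                weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no grid points fall inside region " + region)
    return total / weight_sum
```

On a coarse model grid the coastal box, which spans only 5° of latitude, may contain very few grid-cell centers, so the `ValueError` branch is worth keeping when the same function is applied across models of differing resolution.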
A number of large and rapidly growing cities on the Guinea Coast produce increasing emissions of greenhouse gases and anthropogenic aerosols, in addition to the naturally occurring marine and biogenic aerosols (Knippertz, Coe, et al., 2015). The migration of the WAM determines the seasonal cycle of the radiation fluxes across West Africa, and we illustrate its progression using OLR from CERES (Figure 1), which decreases as cloud cover increases. The main part of the monsoon, characterized by deep convective cloud, high precipitation, and typical OLR values of 180-220 W m−2, is over the ocean, and therefore south of the landmass shown, from December to February (Figure 1a). By April it passes over the coastal region (Figure 1b), reaching its northernmost extent over the Sahel during July/August (Figure 1c), before passing back over the coastal region as it retreats in the autumn (Figure 1d). As a result, the coastal region experiences a "little dry season," a period of comparatively dry weather lasting a few weeks in late July-August (Adejuwon & Odekunle, 2006; Omotosho, 1988).
The increased cloud cover associated with the monsoon leads to an increase in the shortwave radiation reflected to space, thus increasing all-sky RSR and decreasing all-sky downwelling shortwave radiation (DSR) at the surface. This leads to a maximum in RSR and a minimum in DSR in August in the Sahel (Figures 2b and 2j) and the coastal region (Figures 2c and 2k). Conversely, as the clouds are cooler than the underlying surface, they have the effect of reducing OLR, with an annual decrease observed first in the coastal region, and then in the Sahel (Figures 2f and 2g). The increase in cloud cover also leads to an increase in all-sky downwelling longwave radiation (DLR) at the surface in the monsoon months (Figures 2n and 2o) from increased downwelling emission from clouds in addition to the increased emission from higher summer atmospheric temperatures. In this study, we use radiative fluxes, in particular OLR, as a proxy for monsoon position.
Figure 2. Climatology of monthly all-sky TOA and surface radiation fluxes: reflected shortwave radiation (RSR, a-d), outgoing longwave radiation (OLR, e-h), downwelling shortwave radiation (DSR, i-l), and downwelling longwave radiation (DLR, m-p) in the study regions (section 2) from EBAF (full years, 2001-2015 inclusive), GERB/SEVIRI (2005-2014), SARAH (2005-2014), and CMIP5 and AMIP models (1990-2004) (Table 1). Light shading marks the range of CMIP5 models; dark shading marks one standard deviation. Thin dashed line marks the range of AMIP models. Inset numbers indicate the annual average of the climatology, with colors corresponding to data sets as marked in the legend.

Previous Modeling Studies: Common GCM Problems
The physical mechanisms that link the WAM to the Intertropical Convergence Zone and sea surface temperatures (SSTs), and therefore have a strong influence on African rainfall (Caniaux et al., 2011; Vizy & Cook, 2002), have proven challenging for GCMs. For example, annual mean zonal SST gradients are reversed with respect to observational gradients in most CMIP5 models, with only small improvements from CMIP3 (Richter et al., 2014). Martin et al. (2014) find that spatial distribution biases of SSTs in CMIP5 lead to incorrect teleconnections between Atlantic multidecadal variability and Sahelian rainfall. These coupled model SST biases influence WAM dynamics, leading to a southward shift in the Intertropical Convergence Zone in comparison to the Atmospheric Model Intercomparison Project (AMIP), the atmosphere-only, prescribed-SST modeling experiment of CMIP5 (Roehrig et al., 2013). Interannual variations in the monsoon, for example, through Sahelian and coastal rainfall, are also linked to variability in SSTs (Tippett & Giannini, 2006). Difficulties with modeling the WAM do not originate with Atlantic SSTs alone, however. Using AMIP simulations, Hannak et al. (2017) find that too much solar radiation reaches the surface in the coastal region, a consequence of too little cloud cover, with these clouds also being too high. Low cloud cover over the coastal region has a large effect on solar radiation (Knippertz et al., 2011). For a more detailed description of the atmospheric dynamics of the WAM, we refer the reader to Cook and Vizy (2006), and for a comprehensive overview of the WAM in CMIP5, to Roehrig et al. (2013).
Aerosols also have a significant impact on the radiation budget in the region (Ansell et al., 2014; Banks et al., 2014; Milton et al., 2008; McFarlane et al., 2009; Ridley et al., 2014; Slingo et al., 2006), which can also lead to modeling biases. In particular, a number of studies have linked GCM overestimations of clear-sky shortwave fluxes at the surface to both water vapor and aerosol optical depth (AOD) (Freidenreich & Ramaswamy, 2011; Wild, 1999, 2006), particularly in regions with high dust or biomass burning aerosol loadings.

Reference Products and CMIP5
We use a range of reference products to evaluate the CMIP5 models, the primary of which is CERES EBAF. Measurements from the CERES instruments form the basis of CERES SYN1deg Edition 4 (Doelling et al., 2016; Rutan et al., 2015), which provides global surface and TOA radiative fluxes on a 1° × 1° grid. Closely linked is EBAF Edition 4, which adjusts SYN1deg to be compatible with observed estimates of ocean heat storage on a decadal global mean scale (Loeb et al., 2009, 2018). We use monthly TOA radiation fluxes: incident solar radiation (ISR) and both all-sky and clear-sky RSR and OLR. For OLR, we take the positive direction to be radiation to space. We also use both all-sky and clear-sky DSR, DLR, and upwelling longwave radiation (ULR) from EBAF Surface. From SYN1deg we use AOD at 500 nm and total column water vapor (TCWV), both of which are used as inputs to compute radiation fluxes in SYN1deg (Rutan et al., 2015). Water vapor profiles are input from the Goddard Earth Observing System model Versions 4 and 5.2, while AOD comes from the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Multi-scale Atmospheric Transport and Chemistry model (Collins et al., 2001), a chemical transport model that assimilates MODIS data. CERES data are available from March 2000 to the present.
We compare EBAF and SYN1deg against other products and observations, at regional scales and also at four surface measurement stations. Using a number of independent observational products provides an indication of the uncertainty range of the references, which can then be used to assess the CMIP5 outputs. For the regional analysis, we use two Climate Monitoring Satellites Application Facility (CMSAF) products from February 2004 to April 2015: GERB/SEVIRI Edition 1 TOA all-sky broadband radiation fluxes (Clerbaux et al., 2017) and DSR from the Surface Solar Radiation Data Set-Heliosat (SARAH) Edition 2 (Pfeifroth et al., 2017). We also use TCWV data from the European Centre for Medium-Range Weather Forecasts ERA-Interim reanalysis (ERA-I; Dee et al., 2011). We download GERB/SEVIRI, SARAH, and ERA-I at 1° × 1° resolution. Surface measurements are scarce in this region: we use measurements from two surface measurement sites that are part of both the Baseline Surface Radiation Network (BSRN; Ohmura et al., 1998) and the Aerosol Robotic Network (AERONET; Holben et al., 1998), and an additional two sites from AERONET only. These sites, Tamanrasset, Algeria (22.8°N, 5.5°); Ilorin, Nigeria (8.5°N, 4.6°W); Banizoumbou, Niger (12.2°N, 1.4°W); and Ouagadougou, Burkina Faso (13.5°N, 2.7°W), are marked in Figure 1. At Tamanrasset and Ilorin, all-sky and clear-sky DSR and all-sky DLR are inferred from the BSRN instruments, using the method from Long and Ackerman (2000) and Long and Turner (2008) for clear-sky inferences. Due to insufficient data availability at Ilorin and Tamanrasset, clear-sky DLR estimates are not available. AOD and TCWV are available from all four sites. We use AOD at 500 nm, except at Ouagadougou, where we use 440 nm. As can be seen in Figure 1a, Tamanrasset is within the Saharan region, Ilorin is within the coastal region, and Banizoumbou and Ouagadougou are in the Sahel.
We use data from these sites to validate the satellite and reanalysis products at the closest grid points. All data products used in the regional analysis are listed in Table 1, and those used in the validation and the surface sites in Table 2.
As part of CMIP5, a number of "historical" coupled runs were performed, forced by variations in ISR, and observations of atmospheric composition and land-use changes (Taylor et al., 2012), from at least 1860-2005, which we use here at their native resolution. We select models by data availability: we restrict our analysis to models where upwelling and downwelling, shortwave and longwave radiation fluxes are available, in both all and clear skies (Table 3). A subset of these models also provide AOD at 550 nm, indicated in the table.
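The availability-based model selection described above amounts to a set-inclusion test. The sketch below is illustrative only: the variable short names follow the CMIP5 monthly atmospheric table, but the exact set of variables required for the analysis is an assumption, and the function and model names are hypothetical.

```python
# Assumed set of required flux variables (CMIP5 short names).
# The precise list used in the study is not stated in the text.
REQUIRED_FLUXES = frozenset({
    "rsdt",              # TOA incident shortwave
    "rsut", "rsutcs",    # TOA outgoing shortwave, all-sky and clear-sky
    "rlut", "rlutcs",    # TOA outgoing longwave, all-sky and clear-sky
    "rsds", "rsdscs",    # surface downwelling shortwave
    "rsus", "rsuscs",    # surface upwelling shortwave
    "rlds", "rldscs",    # surface downwelling longwave
    "rlus",              # surface upwelling longwave
})


def select_models(available):
    """Return the sorted names of models that provide every required flux.

    `available` maps a model name to an iterable of the variable names
    that the model archive provides.
    """
    return sorted(
        name
        for name, variables in available.items()
        if REQUIRED_FLUXES <= set(variables)
    )


# Hypothetical archive listing: model_b lacks clear-sky surface longwave.
example = {
    "model_a": REQUIRED_FLUXES | {"od550aer"},
    "model_b": REQUIRED_FLUXES - {"rldscs"},
}
selected = select_models(example)
```

Optional fields such as AOD at 550 nm (`od550aer` here) are deliberately not part of the required set, since only a subset of the selected models provides them.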
As well as the coupled model runs, we use simulations from AMIP (Gates, 1992;Taylor et al., 2012), which use observed SSTs and sea ice as boundary conditions for atmosphere-only simulations.

Methods
We first examine the range of the mean annual cycles of radiative variables, TCWV, and AOD from the reference data sets. We then evaluate the ability of the CMIP5 GCMs to simulate these mean annual cycles. GCMs such as those used in CMIP5 are not intended to exactly replicate the state of the atmosphere at a specific time or place, but rather the general patterns of variability. Therefore, we examine the average seasonal cycles over up to 15 years. We calculate the average seasonal cycles both integrated over the regions defined in section 2 and at the surface measurement sites. We use full years for our analysis: for EBAF TOA and surface, and ERA-I, we use 2001-2015 inclusive; for CMSAF GERB/SEVIRI and SARAH, we use 2005-2014 inclusive. Restricting EBAF TOA and surface, CMSAF GERB/SEVIRI and SARAH, and ERA-I to the overlapping years of 2005-2014 does not significantly alter the results. For the surface measurement sites, we use all available data between 1992 and 2017 inclusive, though the number of individual months used varies significantly depending on the variable and site. However, for each variable and site, the number of data points used in the monthly mean is approximately equal throughout the annual cycle (not shown). For the coupled and atmosphere-only simulations of CMIP5 models, we use 1990-2004 inclusive. We define the dry season as October-March inclusive, and the wet season as April-September. While this definition is less appropriate for the coastal region (see the description of the "little dry season" in section 2), we apply it to all regions to enable a consistent analysis.
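The averaging steps described above (a mean seasonal cycle over full years, followed by dry and wet season means) can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout and function names are hypothetical, and season means are taken as unweighted averages of the climatological months, which may differ from the weighting used in the study.

```python
# Season definitions from the text: dry = October-March, wet = April-September.
DRY_MONTHS = (10, 11, 12, 1, 2, 3)
WET_MONTHS = (4, 5, 6, 7, 8, 9)


def monthly_climatology(records, first_year, last_year):
    """Mean seasonal cycle from (year, month, value) records.

    Only years in [first_year, last_year] are used. Returns a dict mapping
    month (1-12) to the mean value; months with no data are omitted.
    """
    sums = {}
    counts = {}
    for year, month, value in records:
        if first_year <= year <= last_year:
            sums[month] = sums.get(month, 0.0) + value
            counts[month] = counts.get(month, 0) + 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


def seasonal_mean(climatology, months):
    """Unweighted mean over the climatological months of one season."""
    values = [climatology[m] for m in months if m in climatology]
    if not values:
        raise ValueError("no climatological months available for this season")
    return sum(values) / len(values)
```

For example, `monthly_climatology(records, 2001, 2015)` would correspond to the EBAF averaging period, and `seasonal_mean(clim, WET_MONTHS)` to the April-September mean of the resulting cycle.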
Our analysis utilizes data from both point measurements (surface sites) and area averages (satellite products and CMIP5 output). This requires care when making comparisons. Hakuba et al. (2014) found that the climatological mean DSR from surface stations has a spatial sampling error of 0.1 W m−2 for the Ilorin and Tamanrasset sites within a 1° grid with respect to colocated gridded data. Moreover, Schwarz et al. (2017) showed that time series of monthly mean DSR from surface sites can be considered representative of an area 100-600 km around the site's location. This suggests that the climatological annual cycles of DSR obtained from BSRN sites are reasonably representative of the grid averages of the other reference products. Additionally, we find that the sites have qualitatively similar climatologies to the surrounding regions (see sections 4.1 and 5.1), which further supports our analysis approach.

Results: All-Sky Radiation Fluxes
In this section, we first evaluate the agreement of the reference products in order to understand the spread in observational uncertainty in all-sky TOA and surface radiation flux. For the surface radiation fluxes we evaluate both the surface sites and also the wider regions. Second, we place the CMIP5 model outputs in the context of the references, to characterize key differences in the multimodel mean and range. Finally, we interpret these differences with respect to monsoon timing and progression.

Observational Range in All-Sky TOA and Surface Radiation Fluxes
At the TOA, we use the EBAF and GERB/SEVIRI products as our primary references. We find that for all-sky monthly radiation fluxes, the annual averages of the satellite products agree to within 4 W m−2 in all-sky RSR and OLR over West Africa and the individual regions (Figures 2a-2h). When analyzed over all West Africa, the differences between GERB/SEVIRI and EBAF RSR are larger in the dry season (October-March, 1.0 W m−2, Table 4) than in the wet season (April-September, −0.2 W m−2). However, the largest differences are in the wet season in the coastal region (−3.2 W m−2). Seasonal contrasts between GERB/SEVIRI and EBAF are less pronounced in OLR, with dry season biases (−2.4 W m−2 over all West Africa, Table 4) smaller than wet season biases (−2.8 W m−2). Differences between GERB/SEVIRI and EBAF are smaller in the Sahel (−1.4 W m−2 in the annual mean) than in the coastal region and Sahara (−3.1 and −3.4 W m−2, respectively).
For the surface radiation fluxes, our primary reference comes from the BSRN surface measurement sites. We validate the other reference products (EBAF and SARAH) at the nearest grid point to each site. At Tamanrasset, the annual average of all-sky DSR from EBAF agrees to within 5 W m−2 with the BSRN value of 263 W m−2, with the annual average from SARAH markedly higher at 276 W m−2 (Figure 3a). There is a less consistent picture at Ilorin (Figure 3b), though the annual mean from EBAF is within 6 W m−2 of the BSRN value of 203 W m−2, with the annual average from SARAH again larger at 221 W m−2. However, due to the limited data available from Ilorin, this may not be an accurate representation of the annual surface insolation cycle. The annual cycle in all-sky DSR at Tamanrasset is largely similar in shape to that of the wider Saharan region (correlation coefficient r² = 0.95 between BSRN DSR, Figure 3a, and EBAF in the Saharan region, Figure 2l), as is that at Ilorin to the coastal region (r² = 0.77 between BSRN DSR, Figure 3b, and EBAF in the coastal region, Figure 2k). Over the wider regions (Figures 2i-2l), we find that the EBAF all-sky DSR product is consistently lower than the other products, with the annual average over all West Africa 13.8 W m−2 higher in SARAH than in EBAF (Table 4). This difference is highest in the coastal dry season (25.1 W m−2) and lowest in the Saharan dry season (7.6 W m−2).
The DLR fluxes at Tamanrasset from BSRN and EBAF have similar annual cycles (correlation coefficient r² = 0.97, Figure 3e), with annual averages of 328 and 351 W m−2, respectively, and the largest differences in the wet season. Differences between reference products are smaller at Ilorin (Figure 3f), with annual averages of 406 and 395 W m−2 for EBAF and BSRN, respectively, which are largely in line with the wider coastal region (407 W m−2, Figure 2o). However, at both Tamanrasset and Ilorin, it is notable that EBAF has consistently higher values of DLR than BSRN, suggesting that EBAF may be overestimating the DLR in this region.
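The r² values quoted above compare the shapes of two 12-month climatological cycles. A minimal sketch of such a comparison is given below, assuming that r² denotes the squared Pearson correlation coefficient; the text does not state the estimator explicitly, and the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Assumed here to be the statistic behind the quoted r² values; the
    inputs would typically be two 12-month climatological cycles.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)
```

Because r² is insensitive to a constant offset or scaling between the two series, a high value indicates a similar seasonal shape but says nothing about mean bias, which is why the annual averages are reported separately.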

All-Sky TOA and Surface Radiation Fluxes in the CMIP5 Ensemble
We now examine the coupled CMIP5 models' all-sky output in the context of the reference data outlined above. At the TOA, we find that the multimodel mean all-sky RSR from the coupled CMIP5 models agrees with EBAF to within 4 W m−2 in the annual mean, slightly larger than the 1-2 W m−2 difference between CMSAF GERB/SEVIRI and EBAF (Figures 2a-2d). However, there is a large spread across the models, reaching ∼70 W m−2 in the annual mean (Table 4). In the longwave, the multimodel mean all-sky OLR agrees with the EBAF values to within 9 W m−2 in the annual mean, again larger than the ∼3 W m−2 difference between CMSAF GERB/SEVIRI and EBAF (Figures 2e-2h).
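The ensemble statistics discussed above (multimodel mean, its difference from the reference, and the spread across models) can be sketched as below. The function name, the model names, and the numbers in the example are hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def ensemble_summary(model_annual_means, reference):
    """Summarize an ensemble of annual-mean fluxes against a reference value.

    `model_annual_means` maps model name to annual-mean flux (W m-2).
    The bias is defined as multimodel mean minus reference.
    """
    values = list(model_annual_means.values())
    mean = statistics.fmean(values)
    return {
        "mean": mean,
        "bias": mean - reference,
        "range": max(values) - min(values),
        "std": statistics.pstdev(values),  # population form: an assumption
    }


# Illustrative values only, not taken from the study.
summary = ensemble_summary(
    {"model_a": 240.0, "model_b": 250.0, "model_c": 260.0},
    reference=245.0,
)
```

A small bias in the multimodel mean can coexist with a large range, which is the situation described above for RSR: individual model errors of opposite sign partly cancel in the mean.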
There are distinct contrasts in all-sky OLR between regions and seasons in the CMIP5 models with respect to EBAF, which we interpret as a consequence of the timing and progression of the WAM in the models. Two aspects of the EBAF and CMIP5 model differences in particular point to this. First, CMIP5 mean differences with respect to EBAF are larger for all-sky OLR in the coastal dry season (16.9 W m−2, Table 4 and Figure 2g) and Sahelian wet season (9.6 W m−2, Table 4 and Figure 2f). Second, there is a distinct difference in the shape of the seasonal cycle in all-sky OLR between the reference products and CMIP5 models over the coastal region. EBAF and GERB/SEVIRI show an increase in the later part of the wet season, approximately July/August, which is not captured by the CMIP5 multimodel mean. Both of these aspects suggest that the OLR-reducing deep convective clouds associated with the monsoon in the models reach the coastal region later in the year than observed. Additionally, in the models the deepest convective clouds do not progress over the coastal region and into the Sahel, the progression that leads to the coastal "little dry season," as they do in radiation observations, consistent with previous research (e.g., Dunning et al., 2017; Roehrig et al., 2013). It should be noted, however, that the coastal region as defined here spans only 5° in latitude. Depending on the model resolution, there may not be many grid points within this region, and the number of land and ocean grid points may differ from model to model. This may account for some of the model range observed.
Our interpretation is supported by two derived products in the Sahel and coastal region: longwave atmospheric cloud radiative effect (CRE LW, Figures 4b and 4c) and longwave cloud radiative forcing at the TOA (CRF TOA LW, Figures 4f and 4g). Here, we focus only on the longwave, as the effects of the monsoon are more clearly seen. By taking the difference between the clear-sky (CS) and all-sky (AS) radiative fluxes, CRF TOA LW represents the radiative effect of the presence of clouds on the TOA longwave fluxes: as the presence of clouds reduces OLR, this is a positive quantity. The difference between the net longwave fluxes entering and leaving the atmospheric column in the all-sky and clear-sky cases defines the longwave atmospheric cloud radiative effect, CRE LW. This gives a measure of the change in longwave radiation entering and leaving the atmospheric column due to clouds, where a positive CRE indicates that the presence of clouds warms the atmosphere. For a more detailed discussion of these derived variables, see Miller et al. (2012).
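The two derived quantities can be written out explicitly. The sketch below is one reading of the definitions above, with sign conventions chosen so that both quantities are positive when clouds reduce OLR and warm the atmospheric column; the function names and the numbers in the example are illustrative assumptions, and Miller et al. (2012) should be consulted for the formal definitions.

```python
def crf_toa_lw(olr_clear, olr_all):
    """Longwave cloud radiative forcing at the TOA (W m-2).

    Clear-sky minus all-sky OLR, so the value is positive when clouds
    reduce the longwave radiation escaping to space.
    """
    return olr_clear - olr_all


def atmos_net_lw(ulr_sfc, dlr_sfc, olr):
    """Net longwave flux retained by the atmospheric column (W m-2).

    Longwave enters the column from the surface (ULR) and leaves it to
    space (OLR) and to the surface (DLR). This bookkeeping is assumed.
    """
    return ulr_sfc - dlr_sfc - olr


def cre_lw(all_sky, clear_sky):
    """Longwave atmospheric cloud radiative effect (W m-2).

    All-sky minus clear-sky net atmospheric longwave; positive values
    mean clouds warm the atmosphere. Arguments are dicts with keys
    ulr_sfc, dlr_sfc and olr.
    """
    return atmos_net_lw(**all_sky) - atmos_net_lw(**clear_sky)


# Illustrative fluxes only, not values from EBAF or any model.
all_sky = {"ulr_sfc": 450.0, "dlr_sfc": 400.0, "olr": 230.0}
clear_sky = {"ulr_sfc": 450.0, "dlr_sfc": 380.0, "olr": 270.0}
forcing = crf_toa_lw(clear_sky["olr"], all_sky["olr"])
effect = cre_lw(all_sky, clear_sky)
```

In this example the clouds lower OLR by 40 W m−2 but also raise DLR by 20 W m−2, so the column retains a net 20 W m−2 more longwave than under clear skies.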
As with the all-sky OLR, the "little dry season" in the coastal region is evident in the CRE LW from EBAF (Figure 4c), with two distinct peaks in May and October as the main part of the monsoon passes overhead, but not in the CMIP5 models. Similarly, the CRE LW and CRF TOA LW in the Sahelian wet season (Figures 4b and 4f) show a distinctly different seasonal cycle between the CMIP5 models and EBAF. In August, EBAF has a maximum in CRE LW and in CRF TOA LW, which are not captured by the CMIP5 models. This supports the hypothesis that the clouds which pass over the coastal region into the Sahel at the northernmost extent of the monsoon are not as well developed, and thus reduce OLR to a lesser extent, in the coupled CMIP5 models than in the references. We note that models and observations define "clear-sky" in different ways: while CERES classifies a footprint as clear-sky if the cloud fraction, as determined by MODIS, is ≤0.1 (Loeb et al., 2018), models compute the flux as if there were no clouds present. This sampling error can lead to a dry bias in satellite estimates of clear-sky radiative fluxes, which in turn impacts CRF TOA LW, especially over convective regions (Allan & Ringer, 2003; Sohn et al., 2006). A dry bias would lead to an overestimate in clear-sky EBAF OLR and subsequently an overestimate in EBAF CRF TOA LW. This effect may account for some of the CRF TOA LW difference between EBAF and the CMIP5 models discussed here, though it is unlikely to account for all of the difference, given the EBAF and CMIP5 differences in all-sky OLR (Figures 2e-2h).
We now examine the ability of the coupled CMIP5 models to simulate the mean seasonal cycle of the all-sky surface fluxes. While the regional CMIP5 multimodel mean lies within the range of the reference products for all-sky DSR, the standard deviation of CMIP5 values is generally outside this range, especially in the coastal region and Sahelian wet season (Figures 2i-2l). The seasonal and regional contrasts in the range of CMIP5 models suggest that the monsoon effect on the radiation fluxes may also be the cause of this spread. At Tamanrasset, DLR in the CMIP5 models is lower than the direct observations from BSRN (Figure 3e), a typical feature of GCMs (Wild et al., 1995, 2013, 2015). Furthermore, over the wider regions the multimodel annual mean is consistently smaller than the EBAF product (Figures 2m-2p).

Effect of Imposed SSTs
As discussed in section 1, coupled models are known to suffer from SST biases, which have been linked to biases in the WAM (Roehrig et al., 2013;Dunning et al., 2017). To test to what extent the differences in radiative variables discussed above can be attributed to the SST biases in coupled models, we repeat the analysis with model output from the atmosphere-only experiments, AMIP, which use observed SSTs.
We observe closer agreement of the AMIP models with EBAF and GERB/SEVIRI in some of the identified variables. For example, the shapes of the seasonal cycles in all-sky OLR, atmospheric CRE LW, and CRF TOA LW in the coastal region show considerable improvement (Figures 2g, 4c, and 4g). However, differences between EBAF and the AMIP multimodel mean remain, especially in the coastal region, with annual mean differences remaining high (10 W m−2) and the decrease in OLR in the multimodel mean still lagging that in EBAF by 1-2 months (Figure 2g). There is also little change from the coupled model results in the Sahel (Figures 4b and 4f). This implies that the clouds associated with the monsoon in the AMIP simulations do not have as strong an effect in the longwave as in EBAF in the early part of the monsoon, but the cloudiest part of the monsoon does progress over the coastal region into the Sahel, leading to the coastal "little dry season." However, the limited improvements over the Sahel suggest that this progression does not extend as far northward as that indicated by EBAF. Negligible differences are seen between the multimodel mean values of DSR and DLR for the AMIP and coupled model results (Figures 2i-2p). With the exception of the coastal wet season (Figure 5c), we also see negligible differences in regional TCWV between the coupled and atmosphere-only models (Figures 5b-5d), indicating that the dry bias seen in many CMIP5 models is not linked to SST biases.
Comparing the model range in all-sky radiative fluxes (purple dashed lines compared to shading, Figure 2) in AMIP and the coupled models, we find there are negligible differences at the TOA in most regions, with the exception of the all-sky OLR in the coastal region, where the range reflects the "little dry season" increase in July-August. At the surface, the all-sky DSR and DLR AMIP range is reduced, especially the DSR in the Sahara. There is also a reduction in AMIP model range in CRE LW and CRF TOA LW with respect to the coupled models (Figure 4). Due to the influence of SSTs on interannual variability in monsoon progression, a reduced range in the models is expected.

Results: Surface Clear-Sky Fluxes, Aerosols, and Water Vapor
In this section, we focus on the effect of two factors, AOD and TCWV, on the downwelling clear-sky fluxes at the surface, DSR and DLR. As discussed previously, clear-sky DSR and DLR have long been noted as having consistent biases in GCMs with respect to surface measurement stations (Rutan et al., 2015; Wild et al., 1995, 2013). We begin by discussing the range in clear-sky surface fluxes, AOD, and TCWV from our reference products. We then examine how the CMIP5 models perform relative to this range, particularly by interpreting differences in surface clear-sky fluxes with respect to differences in AOD and TCWV. Finally, we extend this analysis by examining how differences between EBAF/SYN1deg and CMIP5 downwelling fluxes relate to differences in AOD and TCWV in individual models.

Observational Range in Clear-Sky Surface Fluxes, Water Vapor, and Aerosol
In the shortwave, we see that SARAH has consistently higher DSR than EBAF under clear-sky conditions across all regions (Figures 6a-6d). This is consistent with the observations at Ilorin and Tamanrasset (Figures 3c and 3d), which show that DSR from SARAH more closely fits with the seasonal cycle as inferred from the BSRN data.
Next, we examine the TCWV and AOD fields in the reference products, using output from ERA-I and observations from AERONET as our primary reference, respectively. TCWV retrievals from AERONET are consistently lower than the other reference products (Figures 7a-7d), consistent with evidence that AERONET TCWV measurements suffer from a dry bias with respect to the more accurate, but not yet widely available, GPS and radiometry methods (Pérez-Ramírez et al., 2014). Using GPS as a standard, Kishore et al. (2011) find that among model outputs ERA-I performs well in its representation of TCWV, suggesting that this may be a better standard to use to evaluate CMIP5. We therefore use SYN1deg and ERA-I as our primary references for TCWV in the following section, rather than those from AERONET. SYN1deg and ERA-I have differences in annual average water vapor ≤0.3 cm, likely because the former is also based on reanalyses, albeit from a different system (section 3). We find similar (correlation coefficients r² > 0.94) ERA-I TCWV annual cycles in Tamanrasset, Ilorin, Banizoumbou, and Ouagadougou (Figures 7a-7d) as in the corresponding wider regions (Figures 5a-5d).
Figure 5. Regional annual cycles (as described in section 3) from SYN1deg, CMIP5, AMIP, and ERA-I. Dark gray shading indicates CMIP5 standard deviation, light gray shading indicates CMIP5 range, and thin purple lines indicate AMIP range (a-d only); thin magenta lines indicate range of models using an interactive aerosol scheme (e-h only). Inset numbers indicate annual average of climatology, with colors corresponding to data sets as marked in legend.
AOD is also similar at the AERONET sites to their wider regions (cf. Figures 7e to 5h, r² = 0.74; Figures 7f to 5g, r² = 0.86; and Figures 7g and 7h to 5f, r² = 0.72 and r² = 0.43). AOD at Tamanrasset from AERONET has a slightly later peak in July than the June peak of SYN1deg (Figure 7e). In Ouagadougou (Figure 7h), AOD from AERONET is much lower than that in Banizoumbou (Figure 7g), and Ouagadougou is the only site where SYN1deg has a consistently higher aerosol loading than AERONET. This is indicative of the high spatial and temporal variability in AOD: despite the relative proximity of Banizoumbou and Ouagadougou, the AERONET values differ by a factor of ∼2.
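The site-to-region comparisons above rest on the squared correlation between two 12-month annual cycles. A minimal sketch of that calculation follows, assuming each cycle is simply a list of 12 monthly climatological values; the function name `r_squared` and the numbers are illustrative placeholders and are not taken from the paper or its data sets.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical January-December climatologies (illustrative values only).
site_cycle = [0.5, 0.6, 0.9, 1.4, 2.0, 2.6, 3.0, 3.1, 2.7, 1.8, 0.9, 0.6]
region_cycle = [0.6, 0.7, 1.0, 1.6, 2.2, 2.7, 3.2, 3.2, 2.8, 2.0, 1.1, 0.7]

r2_site_region = r_squared(site_cycle, region_cycle)
```

With only 12 points per cycle, a high r² mainly indicates that the two cycles share the same seasonal shape; it says nothing about an offset between them, which is why the annual means are compared separately.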

Surface Radiation Fluxes, Water Vapor, and Aerosol in the CMIP5 Ensemble
In general, the CMIP5 models have higher values of DSR than EBAF under clear-sky conditions (Figures 6a-6d). The CMIP5 models also show a large range in the wet season, particularly in the Sahel (62.5 W m −2, Table 4). Despite this, the multimodel mean is within the range given by EBAF and SARAH across the regions (Figures 6a-6d) and agrees within 5 and 1 W m −2 with the BSRN observations at Tamanrasset and Ilorin, respectively (Figures 3c and 3d). While there are no clear-sky DLR inferences at the surface sites, we see that EBAF DLR is consistently higher in the regional integrations than the CMIP5 multimodel mean (Figures 6e-6h). This difference is largest in the Saharan wet season (−18.0 W m −2, Table 4).
The multimodel mean from CMIP5 has a dry bias with respect to ERA-I both at the sites (Figures 7a-7d) and across the wider regions (Figures 5a-5d). This is especially the case in the dry season, where the ERA-I and SYN1deg value is outside of the range given by the CMIP5 models. This drier atmosphere in the CMIP5 models is consistent with the higher clear-sky DSR and lower clear-sky DLR with respect to EBAF.
At the sites, the CMIP5 multimodel mean in AOD is generally lower than the AERONET AOD observations (Figures 7e-7h). The exception to this is Ouagadougou, where the models have a higher aerosol loading than AERONET. Comparison of Ouagadougou and Banizoumbou in the CMIP5 models and SYN1deg, however, shows that they have very similar AOD at these two sites, indicating that neither has the high spatial variability of the observations. We note that although taking the multimodel mean of the models with an interactive aerosol scheme (Wilcox et al., 2013) at the regional level (Figures 5e-5h) increases the AOD by ∼0.1, thus slightly decreasing the difference to SYN1deg, the range of the models is negligibly different from the full set of coupled models. It is harder to link general patterns in clear-sky surface fluxes with AOD than with TCWV: the CMIP5 models have a very wide range of values with respect to that from SYN1deg (Figures 5e-5h), though the multimodel mean is generally lower. Lower atmospheric dust aerosol loading in the CMIP5 models would also result in more solar radiation reaching the surface, and less downwelling longwave radiation.
Figure 7. Annual cycles (as described in section 3) of TCWV at AERONET sites Tamanrasset, Ilorin, Banizoumbou, and Ouagadougou. Dark gray shading indicates CMIP5 standard deviation; light gray shading indicates CMIP5 range. Inset numbers indicate annual average of climatology, with colors corresponding to data sets as marked in legend.
While some disagreement exists between reference products, and there is a suggestion that EBAF may underestimate the all-sky DSR and clear-sky DLR, we continue our analysis using just the CMIP5 models and EBAF. This is for two reasons: First, in order to evaluate radiation fluxes with respect to other variables, we require self-consistent data sets, such as that provided by EBAF/SYN1deg and CMIP5. Second, while EBAF Surface/SYN1deg is essentially also model output, comparison of the sensitivity to aerosols and water vapor of clear-sky surface fluxes between EBAF and the CMIP5 models is of interest.

Link of Clear-Sky Fluxes to AOD and TCWV
In this section, we aim to address two questions to determine the relative effect of TCWV and AOD on the clear-sky downwelling fluxes by examining the CMIP5 models individually. The first is whether models which have a larger discrepancy in AOD or TCWV with respect to EBAF have a larger discrepancy in shortwave radiation absorbed by the atmosphere (Figure 8) or downwelling longwave radiation at the surface (Figure 9). The second is to probe how changes in water vapor and aerosol loading may affect the clear-sky fluxes, and how this might be different in EBAF and the CMIP5 models.
We approach the first of these questions in the following way: We start by calculating the time series of the proportion (%) of shortwave radiation reaching the surface for each model individually, in each region, by taking the ratio of the DSR and the incident solar radiation at the TOA; next, we calculate the mean of the model − EBAF discrepancy in this proportion over that time period; we then repeat this for AOD and TCWV for each model, to obtain the mean discrepancy with SYN1deg. Figures 8a-8d show the CMIP5 mean model − EBAF discrepancy of the percentage of shortwave radiation reaching the surface (DSR/TSI), plotted against the mean model − SYN1deg discrepancy in AOD and TCWV, integrated across the regions. Figures 9a-9d show the same for clear-sky downwelling longwave radiation at the surface. We see that, in general, the largest differences in DLR and DSR are in the bottom left quadrant, where the models are drier and have lower aerosol loading than SYN1deg. Indeed, as the model positions are the same in Figures 8a-8d and 9a-9d, we see that some of the models that are considerably more transparent in the shortwave than EBAF (dark red in Figures 8a-8d) also have the smallest downwelling longwave radiation values with respect to EBAF (dark blue in Figures 9a-9d). We note that over all West Africa and in the coastal region, models with the largest difference in TCWV also have the largest difference in DLR. Additionally, there are models that have more DLR than EBAF over some regions, in particular, the coastal region. These models generally have a small difference in TCWV (±0.2 cm) compared to SYN1deg. There is a less clear picture with AOD, though there is also no clear evidence that models with interactive aerosol schemes (marked with diamonds) have smaller differences to EBAF than those using climatologies.
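The procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: each input is taken to be a regional-mean monthly time series held in a plain list, the function names `surface_fraction_percent` and `mean_discrepancy` are hypothetical, and all numbers are invented for the example.

```python
def surface_fraction_percent(dsr, toa_incident):
    """Percentage of TOA incident shortwave reaching the surface, per month."""
    if len(dsr) != len(toa_incident):
        raise ValueError("series must have the same length")
    return [100.0 * d / t for d, t in zip(dsr, toa_incident)]


def mean_discrepancy(model_series, reference_series):
    """Time mean of (model - reference) over a common period."""
    if len(model_series) != len(reference_series):
        raise ValueError("series must have the same length")
    diffs = [m - r for m, r in zip(model_series, reference_series)]
    return sum(diffs) / len(diffs)


# Hypothetical three-month series in W m-2 (illustrative values only).
toa_incident = [400.0, 420.0, 440.0]
model_dsr = [300.0, 315.0, 330.0]      # clear-sky DSR from one model
reference_dsr = [280.0, 294.0, 308.0]  # clear-sky DSR from the reference

model_pct = surface_fraction_percent(model_dsr, toa_incident)
reference_pct = surface_fraction_percent(reference_dsr, toa_incident)
dsr_fraction_discrepancy = mean_discrepancy(model_pct, reference_pct)

# The same mean discrepancy is then taken for AOD (and likewise TCWV).
model_aod = [0.20, 0.30, 0.25]
reference_aod = [0.40, 0.50, 0.45]
aod_discrepancy = mean_discrepancy(model_aod, reference_aod)
```

In this toy case the model passes 75% of the incident shortwave to the surface against 70% in the reference, while carrying less aerosol, which would place it in the lower-AOD half of a plot like Figures 8a-8d.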
To answer the second question, we extend our analysis to look at the correlations between anomalies in clear-sky DSR and DLR and anomalies in AOD and TCWV (Figures 8e and 8f and 9e and 9f). For each model, and for EBAF, we start by calculating the anomaly of the regional mean for each month from the 15 year monthly mean of the variables. We then use a least squares regression to relate the anomaly in AOD and TCWV to the anomaly in clear-sky DLR and DSR. The squared Pearson correlation coefficient, r², for each regression gives a measure of the proportion of variability in the flux anomaly explained by the variability in AOD or TCWV. Figures 8e-8h show the r² of DSR with AOD, plotted against that of DSR with TCWV. Figures 9e-9h show the same but for DLR. On both plots, the correlations within EBAF/SYN1deg are given by crosses, for comparison with the individual models.
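A minimal sketch of the anomaly and regression steps described above is given here. It assumes a monthly regional-mean series that starts in January and spans whole years; the helper names `monthly_anomalies` and `least_squares` are hypothetical, and the synthetic series covers 2 years rather than 15 purely for brevity. The flux is constructed to respond linearly to the water vapor perturbation, so the fit should recover that slope with r² close to 1.

```python
def monthly_anomalies(series):
    """Remove each calendar month's multi-year mean from a monthly series."""
    if len(series) == 0 or len(series) % 12 != 0:
        raise ValueError("series must cover a whole number of years")
    n_years = len(series) // 12
    climatology = [sum(series[m::12]) / n_years for m in range(12)]
    return [value - climatology[i % 12] for i, value in enumerate(series)]


def least_squares(x, y):
    """Ordinary least squares fit y = slope * x + intercept, with r squared."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series cannot be regressed")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


# Synthetic data (illustrative only): seasonal cycle plus month-to-month noise.
seasonal = [1.0, 1.2, 1.5, 2.0, 2.8, 3.5, 4.0, 4.2, 3.6, 2.5, 1.6, 1.1]
noise = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1, 0.2, -0.2,
         -0.1, 0.1, 0.2, -0.3, 0.1, 0.0, 0.3, -0.2, 0.1, 0.2, -0.1, 0.0]
tcwv = [seasonal[i % 12] + noise[i] for i in range(24)]
dlr = [350.0 + 15.0 * seasonal[i % 12] + 10.0 * noise[i] for i in range(24)]

tcwv_anom = monthly_anomalies(tcwv)
dlr_anom = monthly_anomalies(dlr)
slope, intercept, r2 = least_squares(tcwv_anom, dlr_anom)
```

Removing the monthly climatology first matters: without it, the shared seasonal cycle would dominate the correlation and mask how the flux covaries with AOD or TCWV from one year to the next.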
We see that, for EBAF, clear-sky DSR has a far higher correlation coefficient with AOD than with TCWV. While some CMIP5 models do show this behavior, the majority of the CMIP5 models show the opposite: a stronger degree of covariability between clear-sky DSR and TCWV than between clear-sky DSR and AOD. A number of models with a large difference in the proportion of shortwave radiation reaching the surface (dark red) have a particularly weak correlation with AOD. The differences in sensitivity of clear-sky fluxes to AOD between EBAF and the models in CMIP5 could arise for a number of reasons: the lower variability in AOD in the models; the fact that EBAF uses AOD directly (as given by the Multi-scale Atmospheric Transport and Chemistry model and MODIS, see section 3) while AOD in GCMs is calculated from aerosol properties; differences in horizontal and vertical resolution between the models and EBAF; or the fact that EBAF uses 18 shortwave channels (Rutan et al., 2015), more than most GCMs (cf., e.g., the six shortwave channels used in the Met Office Hadley Centre models; Walters et al., 2017). This analysis also suggests that the clear-sky DSR is more sensitive to variations in water vapor in most of the CMIP5 models than in EBAF. Again, with this measure there is no evidence that AOD has a larger impact on clear-sky DSR variability in models with an interactive aerosol scheme than in those with a climatology.
For clear-sky DLR in EBAF, there is a very low correlation with AOD and a correlation of r² ∼0.6 with TCWV in all regions. While the CMIP5 models also generally have a low correlation coefficient with AOD, there is a larger spread in r² value for TCWV. Models that covary more strongly with water vapor also appear to have less clear-sky DLR with respect to EBAF (dark blue), implying that they are more sensitive to variations in water vapor than EBAF. This is particularly the case over the Sahel, the Sahara, and over the wider West African region. The exception to this is the coastal region. Again, this analysis suggests that clear-sky DLR in the majority of CMIP5 models is more sensitive to water vapor than in EBAF.

Discussion and Conclusions
In this study, we compare TOA and surface fields of all-sky and clear-sky radiation fluxes over West Africa in a variety of reference products to model output from the fifth phase of the Coupled Model Intercomparison Project (CMIP5). Through use of multiple reference data sets, including satellite-derived products, ERA-Interim reanalysis, and surface station measurements, we are able to evaluate the CMIP5 models within the context of these references. This also places our primary reference, CERES EBAF, in the context of other reference products, allowing us to proceed with derived radiative products from EBAF (cloud effects and forcing) and auxiliary products relating to water vapor and aerosols, which are only possible with complete, self-consistent data products such as EBAF/SYN1deg, while being aware of their limitations. A particular focus is the contrast between different seasons and regions within West Africa. Here, we aimed to address the extent to which the CMIP5 models agree with the range given by the reference products in the TOA and surface radiation fields. Our analysis gives rise to further questions, which we address: To what extent can we link coupled model SSTs to monsoon-related differences in radiative variables, and how do AOD and TCWV modulate clear-sky downwelling surface radiation fluxes in both EBAF and the CMIP5 models?
For the first of these questions, we find that differences between the reference data sets are generally smaller than those with respect to many CMIP5 models. Though the multimodel annual mean agrees with the satellite products within 1 and 5 W m −2 when averaged over West Africa for RSR and OLR respectively, there is a large range in behavior of CMIP5 models, particularly in the coastal wet season and Sahelian dry season. At the surface, we find that, in general, the CMIP5 models overestimate the downwelling shortwave radiation, and underestimate the downwelling longwave radiation with respect to EBAF. However, analysis at Tamanrasset and Ilorin suggests that EBAF may underestimate the clear-sky shortwave radiation reaching the surface. At these sites (Figures 3c and 3d), and also regionally (Figures 6a-6d), the CMSAF SARAH clear-sky product agrees more closely with the CMIP5 multimodel mean than with EBAF.
There are a number of aspects in our analysis that link radiation biases to the representation of the WAM. First, from examination of the coastal region OLR patterns (Figure 2g), there is a clear split between the satellite-derived products (GERB/SEVIRI and EBAF) and the CMIP5 models. In particular, the coupled models are generally unable to recreate the "little dry season," seen in the seasonal evolution of the OLR (Figure 2g) and CRE LW (Figure 4c). This southward shift in CMIP5 models has been well documented (Roehrig et al., 2013) and is also present in ERA-Interim (Dunning et al., 2016; Hill et al., 2016), though here we focus on its impacts on the radiation budget. We use output from atmosphere-only experiments, AMIP, to explore whether differences in radiation fluxes can be attributed to coupled model SSTs. We find that although atmosphere-only models are able to capture the coastal "little dry season," the advance of the deepest convective clouds still occurs later in the year than in the observations and does not extend as far north into the Sahel. This is consistent with Hill et al. (2016), who note the overestimation of OLR in the Sahel in AMIP with respect to CERES products, indicating that although the southward shift is reduced, it is not limited to coupled models. Also in the shortwave, coastal low-level clouds associated with the monsoon in AMIP models have been linked to a reduction in shortwave radiation reaching the surface (Hannak et al., 2017). This suggests that factors other than the SST biases are causing differences in the representation of the WAM which lead to radiation biases.
A limitation of this study is the relatively short period of observations with respect to known interannual variability in the WAM. However, using an ensemble of models for the multimodel mean reduces the impact of this. Comparisons to AMIP experiments with observed SSTs show little reduction in model range for the TOA radiation fluxes, though larger reductions in model range are observed in the surface fluxes, as well as in the atmospheric column quantities CRE LW and CRF TOA LW. There are indications of other sources of radiation differences not linked to coupled model SST, especially in the downwelling surface fluxes. Our analysis of the clear-sky dependencies on TCWV and AOD shows that there are clear differences between EBAF/SYN1deg and the CMIP5 models. Most models have a drier atmosphere, with lower aerosol loading, than EBAF, and we link this to both lower clear-sky DLR and higher clear-sky DSR and find some evidence that in some models these effects may cancel each other. Additionally, we find that water vapor in almost all the CMIP5 models contributes more to the variability in clear-sky DSR than in EBAF, but AOD contributes less. This suggests that, in the shortwave, aerosols in the models do not modulate the atmospheric absorptivity as strongly as in EBAF, and water vapor has a larger effect. The picture is less clear in the longwave, with aerosols having little effect, but there is some evidence that models which have a larger difference in clear-sky DLR with respect to EBAF are most affected by water vapor. This implies the GCMs and EBAF have different sensitivities to water vapor and AOD.
There are a number of questions raised by this analysis, particularly about the sensitivity of surface radiation fluxes in GCMs to variations in water vapor and aerosols. The regional analysis here uses large area integrations, and there are currently insufficient coincident radiation, aerosol, and water vapor profile data in West Africa to properly analyze whether EBAF is an appropriate reference to use over larger regions. Use of more accurate water vapor retrievals from, for example, GPS, could improve confidence in the covariance of radiation fluxes with water vapor. However, such measurements are scarce, particularly those which are made simultaneously with measurements of radiative and aerosol properties. Another aspect which could provide insight would be to directly contrast the radiative transfer codes of GCMs and EBAF. Additionally, examination of these fields in regions other than West Africa, where aerosol loadings and origin are different, would be an interesting contrast.
In summary, we find that all-sky radiative biases in CMIP5 GCMs are linked to the biases in coupled model SSTs, via the subsequent misrepresentation of the WAM. However, even with atmosphere-only versions of these GCMs that used fixed SSTs, we find some indications that the northwards extent of the monsoon, as measured by OLR, remains too far south. We also find that many of the CMIP5 models show contrasting