Estimating Shortwave Clear‐Sky Fluxes From Hourly Global Radiation Records by Quantile Regression

Estimates of radiative fluxes under cloud‐free conditions (“clear‐sky”) are required in many fields, from climatic analyses of solar transmission to estimates of solar energy potential for electricity generation. Ideally, these fluxes can be obtained directly from measurements of solar fluxes at the surface. However, common standard methods to identify clear‐sky conditions require observations of both the total and the diffuse radiative fluxes at very high temporal resolution of minutes, which restricts these methods to a few, well‐equipped sites. Here we propose a simple method to estimate clear‐sky fluxes only from typically available global radiation measurements (Rsd) at (half‐)hourly resolution. Plotting a monthly sample of observed Rsd against the corresponding incoming solar radiation at the top of atmosphere (potential solar radiation) reveals a typical triangle shape with clear‐sky conditions forming a distinct, linear slope in the upper range of observations. This upper slope can be understood as the fractional transmission of solar radiation representative for cloud‐free conditions of the sample period. We estimate this upper slope through quantile regression. We employ data of 42 stations of the worldwide Baseline Surface Radiation Network to compare our monthly estimates with the standard clear‐sky identification method developed by Long and Ackerman (2000, https://doi.org/10.1029/2000JD900077). We find very good agreement of the derived fractional solar transmission (R2 = 0.73) across sites. These results thus provide confidence in applying the proposed method to the larger set of global radiation measurements to obtain further observational constraints on clear‐sky fluxes and cloud radiative effects.


Introduction
Solar radiation provides the main energy input to the Earth system and is of great importance for the local climate and human activities such as agriculture and solar energy generation. Diurnal and seasonal variations in surface solar radiation are mainly determined by astronomical settings which determine the incoming solar radiation at the top of atmosphere (in the following referred to as potential solar radiation). Further variation is caused by the atmospheric conditions with clouds being the most dominant source of variation. The radiative properties of clouds contribute to most of the uncertainty in our ability to model and predict current and future climate (Bony et al., 2015). Apart from clouds, the solar beam is attenuated by the presence and concentration of gases and aerosols, which also cause variations of Rsd in space and time (Iqbal, 1983).

©2019. The Authors. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

10.1029/2019EA000686
Key Points:
• Estimate monthly fractional clear-sky shortwave transmission from commonly available global radiation measurements
• Validated against a standard procedure by Long and Ackerman (2000) across the Baseline Surface Radiation Network
• No further input variables are required to obtain monthly shortwave clear-sky radiative fluxes for various applications

Supporting Information:
• Supporting Information S1
• Figure S1

The solar fluxes under cloud-free (used here synonymously for clear-sky) conditions serve as an important baseline describing the potential of solar energy and are used as a reference to quantify the radiative effects of clouds. Water vapor and aerosols from natural and anthropogenic sources further affect the clear-sky solar radiation and contribute to climate change. The contribution of aerosols in particular remains rather uncertain for the past, present, and future (Boucher et al., 2013). The ability to resolve clear-sky fluxes, and especially the absorption of solar radiation in the cloud-free atmosphere, was found to correlate with the hydrological sensitivity of climate models (DeAngelis et al., 2015). Although remote sensing-based surface solar radiation products are available to compare climate models at global scale, these surface fluxes are modeled by radiative transfer with atmospheric profiles obtained from numerical weather models, which leads to large uncertainties in the fluxes. Direct observational estimates are still required to validate the clear-sky fluxes from radiative transfer models (Zhang et al., 2019).
Such a comparison of clear-sky fluxes directly estimated from surface radiation measurements with historical runs of climate models has actually revealed substantial model biases and helped to obtain a global cloud-free energy balance (Wild et al., 2019). Their comparison was based on 54 Baseline Surface Radiation Network (BSRN) stations across the globe. These measurements follow a strict protocol and include direct radiation measured with pyrheliometers, as well as diffuse radiation and the total incoming solar radiation (or global radiation) measured with pyranometers at high temporal resolution (minute data; Driemel et al., 2018; Ohmura et al., 1998).
There are many different approaches to determine clear-sky fluxes described in the literature (see Gueymard et al. (2019) and Ruiz-Arias and Gueymard (2018) for recent reviews). The simplest approach is based on a comparison of incoming total surface shortwave radiation (Rsd) with potential solar radiation at the top of atmosphere (Rsd,pot). The ratio, often referred to as "clearness index," is then used as a threshold above which a period is identified as clear. Yet it is unclear which threshold should be used, with values reported in the literature ranging from 0.6 to 0.7 (Reno & Hansen, 2016). Further approaches also use observations of diffuse and direct normal radiation at the surface to classify clear conditions (e.g., Perez et al., 1990; Long & Ackerman, 2000). Another approach is to directly model the clear-sky irradiance as a function of atmospheric conditions. These models range from simple Linke turbidity factors (Ineichen & Perez, 2002; Linke, 1922) to radiative transfer models (Gueymard, 2008; Lefèvre et al., 2013). The latter approaches require additional input data on the atmospheric composition such as water vapor and aerosol loads (Reno & Hansen, 2016), whereas the former approaches are based on surface observations only.
The method by Long and Ackerman (2000) only requires surface shortwave radiation measurements and has found wide application as a reference (Gueymard, 2012; Kim & Ramanathan, 2008; Reno & Hansen, 2016; Wild et al., 2019). This approach, however, requires global and diffuse radiation at high temporal resolution (1 min) to estimate clear-sky fluxes (Long & Ackerman, 2000). These requirements strongly restrict the available data to a few sites and limit the evaluation to a short period after 1990. In contrast, pyranometers, which measure global radiation, are much more widely used by the various weather services and by research networks (e.g., FLUXNET). To make use of these widely available records, we present a simple, robust methodology which is based on global radiation with (half-)hourly temporal resolution.
To overcome these restrictions of the standard clear-sky identification approaches, we propose a signature-based approach which takes the observations of global radiation out of their temporal context and relates them to the potential solar radiation. The potential solar radiation describes the radiation which would be received at the surface under a nonattenuating atmosphere at a certain location and only depends on the date, time, and latitude. Figure 1 illustrates such a relationship with half-hourly data of one month at the BSRN site Lindenberg, Germany. Data recorded under cloud-free conditions form a linear slope in the upper range of observations, whereas cloudy conditions with reduced global radiation fall below this upper linear slope. Such signatures are found to be rather typical and thus can be used to estimate the fractional clear-sky transmission of solar radiation (also known as clearness index), representative for a given period. To estimate this upper linear slope, we employ quantile regression to estimate conditional quantile functions (Koenker, 2005). The proposed method requires a relatively large sample of clear-sky conditions to obtain statistically significant estimates. Here we used a monthly sampling period with (half-)hourly data. Therefore, the resulting fractional clear-sky transmission is only representative for the sampling period of one month. Hence, the method is not suitable for daily or hourly assessments but is applicable for climatological analyses, such as seasonal changes of clear-sky fluxes.

Figure 1. Illustration of the quantile regression approach. The scatterplot shows a monthly sample of 30-min observations of incoming shortwave radiation versus the corresponding potential shortwave radiation (Lindenberg, Germany, August 2003). The scatter forms a well-defined right-angled triangle with clear-sky conditions close to the upper boundary. The quantile regression method (Koenker, 2005) allows this upper slope to be estimated for a given quantile ω (here 85%).
In order to test and validate the proposed methodology, we use data from the BSRN network with a sparse but global coverage from the tropics to the poles. The BSRN network provides 1-min data of diffuse and global shortwave radiation (Driemel et al., 2018). Reference clear-sky fluxes are computed by the standard method of Long and Ackerman (2000). We then employ the quantile regression methodology on monthly samples of half-hourly aggregated data of global radiation only. We use fixed parameters for all sites to test if the method can be applied to any site and season without a priori knowledge. Results are reported for 54 sites of the BSRN network in terms of monthly clear-sky shortwave fluxes and the fractional shortwave transmission. We compare the new estimates with monthly estimates from the Long and Ackerman (2000) reference method using scatter diagrams and standard error characteristics.

Modeling Potential Solar Radiation
Potential radiation at the horizontal surface can be modeled as a function of the solar zenith angle θ and top of atmosphere solar radiation S0 (Long & Ackerman, 2000):

$$R_{sd,pot} = S_0 \cos(\theta)^{b} \qquad (1)$$

The exponent b modulates the form of the cosine and accounts for the modulation of the optical path through a curved, refractive atmosphere compared to a plane-parallel, nonrefractive atmosphere, which will be discussed below. The solar zenith angle θ is a function of latitude φ, the declination δ between the Sun and the equator determined by the day of the year, and the hour angle H given by the time of the day (Iqbal, 1983):

$$\cos\theta = \sin\varphi \sin\delta + \cos\varphi \cos\delta \cos H \qquad (2)$$

The Sun-Earth orbit eccentricity correction is applied in the calculation of the declination δ. Specifically, we used the function fCalcPotRadiation of the R package REddyProc (Wutzler et al., 2018). That function implements the correction by Spencer (1971). Hence, the potential radiation can be computed when location and time are known.
The potential radiation formulation as such assumes a plane-parallel atmosphere through which the light beam travels. This causes a small nonlinearity in the relationship between observed and potential radiation, which is exemplified in Figure 2 for a clear-sky day at the site Lindenberg. Assuming a plane-parallel, nonrefractive atmosphere (b = 1) we find a small but consistent curvature (Figure 2a) which shows larger deviations from the linear fit at lower potential solar radiation (morning and evening; see Figure 2b). To account for the effects of a curved, refractive atmosphere, we use the exponent b in equation (1) similar to Long and Ackerman (2000). We find that the nonlinear deviations can be largely resolved with an exponent b = 1.2 (see Figure 2). The exponent b = 1.2 is also used in the formulations for quality control of surface shortwave radiation measurements for BSRN sites (Long & Shi, 2008; Roesch et al., 2011). In contrast to Long and Ackerman (2000), we choose to use a fixed exponent b = 1.2 for all sites and periods for two reasons. First, we find that the residual squared error of the quantile regression (explained below) shows a minimum close to the exponent b = 1.2 across sites and seasons. Second, we find that the slope of the quantile regression covaries linearly with the exponent b, which limits the parameter identifiability. Since we are mainly interested in a monthly change in fractional clear-sky transmission, we keep exponent b constant, enabling better comparability across sites and seasons.
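The potential radiation calculation described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the REddyProc implementation: the solar constant value, the use of true solar time as input, the application of the eccentricity factor to S0, and all function names are assumptions made here. The declination and eccentricity series are those of Spencer (1971).

```python
import math

SOLAR_CONSTANT = 1361.0  # W/m^2; assumed value, not stated in the text


def spencer_terms(day_of_year):
    """Declination (rad) and eccentricity factor after Spencer (1971)."""
    g = 2.0 * math.pi * (day_of_year - 1) / 365.0  # day angle in radians
    decl = (0.006918 - 0.399912 * math.cos(g) + 0.070257 * math.sin(g)
            - 0.006758 * math.cos(2 * g) + 0.000907 * math.sin(2 * g)
            - 0.002697 * math.cos(3 * g) + 0.00148 * math.sin(3 * g))
    ecc = (1.000110 + 0.034221 * math.cos(g) + 0.001280 * math.sin(g)
           + 0.000719 * math.cos(2 * g) + 0.000077 * math.sin(2 * g))
    return decl, ecc


def cos_zenith(lat_deg, day_of_year, solar_hour):
    """Cosine of the solar zenith angle from latitude, declination, hour angle."""
    lat = math.radians(lat_deg)
    decl, _ = spencer_terms(day_of_year)
    # Hour angle: 15 degrees per hour away from solar noon (assumes true solar time)
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    return (math.sin(lat) * math.sin(decl)
            + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))


def potential_radiation(lat_deg, day_of_year, solar_hour, b=1.2):
    """S0 * cos(theta)^b in W/m^2, set to 0 when the Sun is below the horizon."""
    mu = cos_zenith(lat_deg, day_of_year, solar_hour)
    if mu <= 0.0:
        return 0.0
    _, ecc = spencer_terms(day_of_year)
    return SOLAR_CONSTANT * ecc * mu ** b
```

Because cos(θ) lies between 0 and 1 during daytime, an exponent b = 1.2 lowers the potential radiation relative to b = 1, most strongly at low Sun elevation, which is the direction needed to straighten the morning and evening curvature discussed above.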

Clear-Sky Signature and Quantile Regression Approach
The key of the proposed methodology is to establish and quantify the relationship of the observed global radiation with the potential solar radiation. Potential radiation is only a function of date, time, and location and can thus be calculated for any site a priori. The relationship can be best visualized by plotting observed global radiation (Rsd) against potential radiation (Rsd,pot) in a scatterplot. This takes the time series observations out of their temporal context (Figure 1) and reveals an approximately linear relationship of Rsd to Rsd,pot close to the upper boundary. Clouds mostly reduce Rsd when direct radiation is blocked, but they can enhance Rsd under partial cloud cover, when diffuse radiation is enhanced while direct radiation still reaches the surface.
We assume that the slope close to the upper linear boundary represents the fraction of transmitted solar radiation under clear-sky conditions. To estimate this slope from the data we employ quantile regression. Quantile regression is an emerging statistical method for linear and nonlinear response models (Koenker & Machado, 1999) which provides an estimate of a conditional quantile function (Koenker, 2005). Here we use the univariate linear form relating response Y to forcing X through

$$Q_Y(\omega \mid X) = \alpha(\omega) + \beta(\omega) X \qquad (3)$$

where Q_Y(ω | X) denotes the ω quantile of Y conditional on X. Thereby we estimate the intercept α and slope β conditional on the quantile ω. The resulting regression line, say for a quantile ω = 0.85, is positioned such that about 85% of the data lie below the estimated line (see Figure 1). The coefficients are estimated using the default method recommended by the author of the quantreg R package (Koenker, 2005).
The estimated slopes should be in the range 0 < β(ω) < 1, since observed global radiation should not exceed the potential radiation. The intercept term α should also be close to 0. Deviations from 0 indicate systematic biases in the measurements, such as problems of timing of measurements or issues with the mounting of the pyranometers.
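The estimation of the intercept and slope at a given quantile can be illustrated with a small, dependency-free Python sketch. It is not the algorithm used by the quantreg package: it relies on the fact that, for a straight line, at least one minimizer of the quantile loss passes through two data points, and simply tries every pair. This is only practical for small samples, and the function names are hypothetical.

```python
from itertools import combinations


def check_loss(residuals, q):
    """Sum of residuals u weighted by q (u >= 0) or q - 1 (u < 0)."""
    return sum(u * (q - (1.0 if u < 0 else 0.0)) for u in residuals)


def quantile_line(x, y, q=0.85):
    """Univariate linear quantile regression by exhaustive search.

    Tries the line through every pair of points with distinct x and keeps
    the one with the smallest check loss. Returns (intercept, slope).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = check_loss(
            [yk - (intercept + slope * xk) for xk, yk in zip(x, y)], q)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


# Synthetic sample: at each level of potential radiation, two clear-sky
# values (80% transmission) and two cloudy values (40% and 20%).
x_demo, y_demo = [], []
for level in range(100, 1100, 100):
    for transmission in (0.8, 0.8, 0.4, 0.2):
        x_demo.append(float(level))
        y_demo.append(transmission * level)

alpha, beta = quantile_line(x_demo, y_demo, q=0.85)
```

In this synthetic sample half of the points lie exactly on the clear-sky line, so the 85% quantile line coincides with it and the search recovers a slope of about 0.8 with an intercept of about 0. For monthly samples of real half-hourly data a dedicated solver such as the one in quantreg is the appropriate tool.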
The quantile regression approach allows the standard deviation of the derived slope and intercept terms to be estimated and thus provides a quantitative assessment of the uncertainty. Goodness of fit is evaluated through a pseudo-R² measure R¹ suggested by Koenker and Machado (1999). The rationale is to assess the goodness of fit at a certain quantile by comparing the sum of weighted deviations of the model of interest (ρ_mod) with the sum of weighted deviations from a model in which only the intercept appears (ρ_1):

$$R^1 = 1 - \frac{\rho_{mod}}{\rho_1} \qquad (4)$$

with ρ determined by the model residuals u weighted by the quantile ω:

$$\rho = \sum_i u_i \left(\omega - I(u_i < 0)\right) \qquad (5)$$

where I(·) is the indicator function, which equals 1 for negative residuals and 0 otherwise.

The only free parameter in the proposed method is the quantile ω used in the quantile regression. A very high ω would be sensitive to short periods with very large values of shortwave radiation, for example, caused by reflections during partial cloud cover. This effect is particularly important when using data at 1-min temporal resolution, but it can be reduced by temporal aggregation to half-hourly or hourly values. A small ω will lead to smaller β, which will be important during periods with almost permanent cloud cover. In order to facilitate the choice of ω, we performed a sensitivity analysis with varying values of ω, using the standard method as reference at all available sites.
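The goodness-of-fit measure R¹ of Koenker and Machado (1999) described above can be sketched in Python as follows. The function names are hypothetical, and the intercept-only model is fitted here by searching the observed values for the one with the smallest weighted deviation, which is one simple way to obtain a sample quantile.

```python
def check_loss(residuals, q):
    """Sum of residuals u weighted by q (u >= 0) or q - 1 (u < 0)."""
    return sum(u * (q - (1.0 if u < 0 else 0.0)) for u in residuals)


def empirical_quantile(values, q):
    """An observed value that minimizes the check loss of a constant model."""
    return min(values, key=lambda c: check_loss([v - c for v in values], q))


def pseudo_r1(x, y, intercept, slope, q=0.85):
    """R1 = 1 - rho_mod / rho_1 for a fitted line at quantile q.

    Raises ZeroDivisionError if all y are identical (rho_1 = 0).
    """
    rho_mod = check_loss(
        [yk - (intercept + slope * xk) for xk, yk in zip(x, y)], q)
    constant = empirical_quantile(y, q)
    rho_1 = check_loss([yk - constant for yk in y], q)
    return 1.0 - rho_mod / rho_1
```

A line that reproduces the data exactly gives R¹ = 1, while a line that is no better than the constant quantile gives R¹ = 0.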
The slopes β(ω) estimated by the quantile regression represent the fractional transmission of shortwave radiation under cloud-free situations for a given period. Assuming that this slope, representing the fractional clear-sky transmission, is constant for that period, we can estimate the clear-sky flux, denoted here as R_{sd,cs}, by multiplying the slope with the potential solar radiation:

$$R_{sd,cs} = \beta(\omega) \, R_{sd,pot} \qquad (6)$$

Baseline Surface Radiation Network
The BSRN provides high-quality downwelling radiation observations at 1-min resolution for more than 50 sites across the globe (Driemel et al., 2018; Ohmura et al., 1998). Here we use the majority of sites (54) for which monthly clear-sky fluxes are available from the standard method as used by Wild et al. (2019). The location of the sites is shown in the map in Figure 3. Metadata including site codes are tabulated in Table 1.

Data Preparations
We flag suspect data following the BSRN policy (Long & Shi, 2008; Roesch et al., 2011) and set values outside the "extremely rare" limits to missing. Three sites showed nighttime values below the extremely rare minimum limits (SBO, SOV, TAM). This issue, which is related to the thermal offset caused by longwave loss of the instrument, has often been reported (Driemel et al., 2018; Dutton et al., 2001). To cope with these issues, we set negative values below −10 to missing. Remaining negative values during periods with high solar zenith angle (θ > 80°) have been set to 0.
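The two rules for negative values can be written as a short Python sketch. The function and parameter names are hypothetical, missing values are represented by None, and the fluxes are assumed to be in W/m². The BSRN "extremely rare" limit checks are not reproduced here.

```python
def clean_negative_values(rsd, zenith_deg, lower_limit=-10.0, zenith_limit=80.0):
    """Apply the two negative-value rules to a series of global radiation.

    rsd: list of floats or None (missing), assumed to be in W/m^2.
    zenith_deg: solar zenith angle in degrees for each time step.
    """
    cleaned = []
    for value, zenith in zip(rsd, zenith_deg):
        if value is None or value < lower_limit:
            cleaned.append(None)   # already missing, or strongly negative
        elif value < 0.0 and zenith > zenith_limit:
            cleaned.append(0.0)    # small negative value at high zenith angle
        else:
            cleaned.append(value)  # kept unchanged
    return cleaned
```

For example, a value of −15 becomes missing, a value of −3 at a zenith angle of 95° becomes 0, and the same value at 40° is left unchanged.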
For analysis we use the unshaded pyranometer measurements of global radiation when available. If these are not available for a given time step, we use measurements of direct ($R_{sd,DIR}$) and diffuse ($R_{sd,DIF}$) solar radiation to compute total radiation with $R_{sd,tot} = \cos(\theta) R_{sd,DIR} + R_{sd,DIF}$. We prefer the use of unshaded pyranometer measurements of global radiation since these data are broadly available in other measurement networks, in contrast to direct normal and diffuse radiation.
To aggregate the data, which are available in 1-min resolution, we follow the recommendations by Roesch et al. (2011). We first calculate 15-min averages, for which at least 20% of the data must be available. From these 15-min aggregates we compute 30-min aggregates, which require two values. For hourly aggregates we require three nonmissing 15-min values. Monthly fluxes are obtained by first computing the monthly mean diurnal cycle from the 30-min data, whereby 26 nonmissing values for each time step are required. Then the monthly mean is calculated from the monthly mean diurnal cycle, requiring no missing values. These procedures aim to reduce sampling biases, which can be significant due to the pronounced diurnal cycle.
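The first aggregation steps can be sketched in Python as follows. This is an illustration of the coverage rules stated above, not the authors' code: the function names are hypothetical, missing values are represented by None, and the input is assumed to start at a full hour so that blocks align with clock time.

```python
def mean_if_enough(values, min_count):
    """Mean of the non-missing values, or None if fewer than min_count exist."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_count:
        return None
    return sum(valid) / len(valid)


def aggregate(values, block, min_count):
    """Average consecutive, complete blocks of `block` values with a coverage rule."""
    return [mean_if_enough(values[i:i + block], min_count)
            for i in range(0, len(values) - block + 1, block)]


def to_half_hourly(minute_values):
    """1-min to 15-min (at least 3 of 15, i.e. 20%), then to 30-min (2 of 2)."""
    quarter_hourly = aggregate(minute_values, 15, 3)
    return aggregate(quarter_hourly, 2, 2)


def to_hourly(minute_values):
    """1-min to 15-min (at least 3 of 15), then to 60-min (at least 3 of 4)."""
    quarter_hourly = aggregate(minute_values, 15, 3)
    return aggregate(quarter_hourly, 4, 3)
```

With this two-step scheme a single 15-min gap invalidates the corresponding half-hourly value but not the hourly value, which still has three of its four 15-min inputs.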

Application of the Quantile Regression Approach to the BSRN Data
The quantile regression (QR) approach is applied for each month at all available sites with sufficient data. Here we use half-hourly data as the basis for the regression. The resulting estimates are very similar when hourly aggregates are used.
Persistent cloud cover can reduce the slope of the quantile regression because of insufficient cloud-free measurements. This is identified by a poor fit according to the R¹ goodness-of-fit statistic and by a deviation of more than 25% from the site average value, which is obtained by regression of all data at one site. If R¹ > 0.75 and the slope is within a 25% range of the site average value, the estimate is retained. If not, we increase the sampling period by including the preceding and following month. We used three-, five-, and seven-month sampling windows, which allow estimates for most observations in the BSRN database.
At higher latitudes (|φ| > 67°) the Sun will not rise above the horizon during winter days and therefore Rsd,pot = 0. However, many sites still measure very small amounts of global radiation (usually below 30 W/m²). During these winter months the potential radiation was observed to be lower than the measurements, resulting in β(ω) > 1. To account for these issues we constrain the quantile regression to periods with more than 100 values of Rsd,pot > 10 W/m². If this constraint is not met, the coefficients are set to missing, while for estimating the monthly mean clear-sky flux we simply use the potential radiation and set β to 1. These considerations allow us to calculate annual average clear-sky fluxes for most of the sites in the BSRN network.
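The selection rules of the two paragraphs above can be summarized in a short Python sketch. All names are hypothetical, and `fit_window` stands for any routine that performs the quantile regression on a window of the given length; it is a placeholder, not part of any library.

```python
def enough_daylight(rsd_pot, min_count=100, min_flux=10.0):
    """True if more than min_count potential radiation values exceed min_flux."""
    return sum(1 for v in rsd_pot if v > min_flux) > min_count


def accept_fit(slope, r1, site_slope, min_r1=0.75, max_rel_dev=0.25):
    """Retain an estimate with a good fit and a slope near the site average."""
    return r1 > min_r1 and abs(slope - site_slope) <= max_rel_dev * site_slope


def select_slope(fit_window, site_slope, windows=(1, 3, 5, 7)):
    """Try increasingly long sampling windows centred on the target month.

    fit_window(n_months) is a hypothetical callback returning (slope, r1)
    for a window of n_months, or None if the daylight constraint fails.
    Returns (slope, n_months), or (None, None) if no window is accepted.
    """
    for n_months in windows:
        result = fit_window(n_months)
        if result is not None and accept_fit(result[0], result[1], site_slope):
            return result[0], n_months
    return None, None
```

As a usage example, if the one-month fit yields a slope of 0.5 with R¹ = 0.6 at a site with an average slope of 0.8, it is rejected on both criteria, and the three-month window is tried next.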

Reference Clear-Sky Detection Approach
In order to evaluate the performance of the quantile regression approach we use the Long and Ackerman (2000) clear-sky detection approach, which provides clear-sky fluxes of shortwave radiation. The approach is based on observations of total and diffuse hemispheric broadband shortwave radiation with a temporal resolution of 1 min. It employs a series of four tests to identify clear-sky episodes and then estimates the clear-sky fluxes of each day with these data using a power law formulation with the cosine of the zenith angle as independent variable. Two tests involve thresholds for acceptable ranges of total and diffuse radiation under clear skies. Two further tests use temporal variability to identify clear-sky conditions, one for total radiation and one for the ratio of diffuse to total radiation normalized by the cosine of the zenith angle. All tests require thresholds to be set, which are based on experience and can be locally different. To automate the process, Long and Ackerman (2000) use an iterative procedure of clear-sky identification and parameter fitting. A core limitation is posed by days and periods with persistent cloud cover with less than 2 hr of clear-sky conditions. The fluxes of those days are estimated by linear interpolation and then monthly fluxes of clear-sky solar radiation are computed. For sites with persistent cloud cover, the Long and Ackerman algorithm uses a different estimation technique which is based on monthly input and historical data to derive clear-sky coefficients (Long & Gaustad, 2004). This especially affects (sub)tropical sites in ocean environments (BER, COC, ISH, KWA, MAN, NAU) and cloudy extratropical sites (LER, CAM, SPO). Also, at a few sites the estimates of the Long and Ackerman method were marked as doubtful (ALE, ILO, SOV). These sites have been excluded from the comparison of monthly estimates.
The approach by Long and Ackerman (2000) has become a standard clear-sky detection approach. For example, it was used to establish a climatology of clear-sky fluxes of 53 BSRN stations to evaluate the modeled clear-sky fluxes of global climate models (Wild et al., 2019). Here we use these climatological clear-sky fluxes on a monthly mean basis for evaluation of the simpler quantile regression approach.

Statistical Evaluation
We use standard methods to calculate the goodness of fit through Pearson correlation and linear regression. Additionally, we employ the mean square error skill score (MSES) to evaluate the skill of the derived clear-sky solar radiation relative to a reference estimate:

$$MSES = 1 - \frac{MSE_p}{MSE_{ref}} \qquad (7)$$

where MSE_p is the mean squared error of the prediction and MSE_ref that of the reference. As reference we use a constant fractional solar transmission β_ref = 0.81 multiplied by the potential solar radiation, representing the average across all sites. An MSES of 1 indicates perfect skill, an MSES of 0 indicates no improvement over the reference, and negative values indicate that the prediction is worse than the reference. The MSES can be interpreted as the reduction of variance compared to the reference.
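The skill score can be computed with a few lines of Python. The function names are hypothetical; `target` stands for the clear-sky fluxes of the standard method, `prediction` for the quantile regression estimates, and `reference` for the estimates with constant transmission.

```python
def mean_squared_error(estimate, target):
    """Mean of the squared differences between estimate and target."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def mses(prediction, reference, target):
    """Mean square error skill score: 1 - MSE_p / MSE_ref."""
    return 1.0 - (mean_squared_error(prediction, target)
                  / mean_squared_error(reference, target))
```

As a worked example with made-up numbers, a prediction with MSE_p = 50 against a reference with MSE_ref = 200 gives MSES = 1 − 50/200 = 0.75, that is, the prediction removes 75% of the error variance of the reference.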

Sensitivity of the Quantile Used in Regression
The proposed method relies on quantile regression, which requires choosing a quantile ω to estimate the fractional clear-sky shortwave transmission for a given sample. Using a very large ω close to 100% would lead to high β due to short periods with very large values of shortwave radiation, for example, caused by reflections and enhanced diffuse radiation during partial cloud cover. A smaller ω will lead to smaller fractional transmissions, which will be important during periods with almost permanent cloud cover.
To assess the sensitivity to ω, we repeated the quantile regression for different ω in the range between 70% and 99%. Then skill scores (MSES, R²) were computed using the clear-sky fluxes of the standard method as the reference. To evaluate the skill, we use a constant β = 0.81 as reference across all sites and conditions. When data of all the sites are pooled together, we find the best performance in terms of bias and correlation at ω = 85% (see Figure 4a). The explained variance of the monthly fractional solar transmission estimates is R² = 0.79 at ω = 85%, which is within a flat peak of the maximum correlation. One can note from Figure 4a that the bias is more sensitive to the choice of ω than the correlation. In the supporting information the relationships of the skill score to ω are shown for each site (Figure S1). There is a broad agreement with a maximum skill score at ω = 85%; however, some sites show deviations from the global pattern. The sites ASP, BRB, CAR, DRA, SMS, and TAM show their maximum MSES skill at distinctly lower values of ω of approximately 70%. Hence, at these sites we must expect a positive bias relative to the standard method when using a global value of ω = 85%. At three sites (BOU, TAT, XIA) we find the maximum skill at ω = 95%, indicating an underestimation by the QR approach with ω = 85% at these sites.
An increase in ω will lead to a larger estimate of fractional clear-sky transmission. However, the sensitivity study also revealed that the sensitivity of the clear-sky fluxes to the choice of ω is rather small. Figure 4b shows the distribution of the difference between ω = 90% and ω = 85% for monthly clear-sky fluxes, which is relatively uniform across the sites. Overall, an increase of ω by 5 percentage points leads to an increase of 3 W/m² when averaged across all sites. Given the measurement uncertainty, this should be acceptable when no further information, such as diffuse radiation and direct normal radiation, is available.
Given the relatively small uncertainty in setting ω and because the skill scores for bias and correlation have their maxima close to ω = 85% we use that value for subsequent analysis for all sites.

Evaluation of Fractional Shortwave Transmission
We applied the quantile regression approach to each month of the available data using a fixed ω = 85% for all sites and conditions. This yields a regression slope which can be interpreted as fractional clear-sky solar transmission. Overall, the temporal and spatial variability of the fractional clear-sky transmission as estimated by the Long and Ackerman method is well captured by the QR approach, as shown in the scatterplot in Figure 5. Most estimates follow the 1:1 line and yield an R² = 0.73 across the 42 sites. The derived estimates and statistics are very similar when using hourly (R² = 0.71) or half-hourly data (see Figure S2 in the supporting information, which is based on hourly aggregated data).
Time series of the fractional clear-sky transmission of the QR method and the Long and Ackerman method are shown for five selected sites in Figure 6 (all sites are shown in Figure S3 in the supporting information). Here we used an automatic scheme (see section 3) to detect overcast conditions, where months with poor fit are replaced by a fit using a longer sampling period. These conditions are marked in Figure 6 with crosses. At the site PAY there are frequent overcast conditions during winter which would lead to underestimation of β. These conditions are often characterized by small fractional transmissions accompanied by a poor R¹ of the quantile regression and can be effectively identified.
The time series of both methods show a pronounced seasonal cycle at extratropical sites (see Figures 6 and S3). During winter there is generally a higher fractional shortwave transmission than in summer. Toward the tropics the seasonal variation is strongly reduced. Furthermore, we find that continental sites show larger seasonal variations than sites in an oceanic environment at comparable latitude (compare sites E13 in Central United States and BER at the Bermuda Islands in the Atlantic; Figure S3). At the poles, especially at the sites in Antarctica, we find very high fractional solar transmission above 1 during the dark winter months. In these periods the potential solar radiation is very small or zero, but there are still diffuse light sources which contribute to the measured solar fluxes. In these cases, the fractional solar transmission can be larger than 1. This is, however, less relevant for the estimated clear-sky fluxes since the potential solar radiation in these winter months is very small or zero.

Comparison of Clear-Sky Fluxes
Next, we compare the clear-sky shortwave radiation fluxes obtained from the quantile regression approach with the results obtained from the standard method proposed by Long and Ackerman (2000). We use the monthly average fluxes for comparison. The scatterplot in Figure 7 is based on monthly data of 42 sites. The estimates follow the 1:1 line with only a slight overestimation (slope of best linear fit 1.01) and low error (RMSE = 8.5 W/m²).
Since the main signal in the clear-sky estimate comes from the seasonal course of potential solar radiation we further compare our results with a reference case which uses a constant fractional transmission β ref = 0.81. Figure 8 shows the residuals (difference between the QR estimate and the standard method) ordered by the clear-sky fluxes. Both residual distributions (right panel of Figure 8) have their center around zero, but the constant fractional transmission estimate shows a much larger spread. Hence, the quantile regression method as applied to the monthly data is able to reduce the residual variance with MSES = 74% compared to the reference with constant solar transmission across all the sites.

Site-Level Evaluation
The BSRN stations are distributed across the globe and provide measurements in all climate zones. This wide spectrum allows us to assess if the proposed method can be applied at any site with solar radiation measurements. The evaluation statistics of all 54 sites are reported in Table 1. Time series of the fractional clear-sky solar transmission of each site are shown in Figure S3 in the supporting information. Note that the Long and Ackerman method used a different estimation technique for sites with persistent cloud cover (Long & Gaustad, 2004), which does not allow a fair comparison of the time series. These sites have been excluded from comparison (see section 3.4).
At the site TAM the QR method shows a positive trend in β which is not as strong for the Long and Ackerman method (see also Figure 6c). This site is located in the Sahara, which is characterized by high loads of dust which can reduce solar radiation at the surface (Polo et al., 2009). More recently, it was noted that the ratio of diffuse to total radiation used as one of the tests of the Long and Ackerman method identifies many periods as cloudy which were rather hazy. The tropical site in continental climate (PTR) shows no correlation of monthly β time series (R² = 0.04), which can be due to the low seasonal variation of the Long and Ackerman estimate for this site. At the site XIA there is a positive trend in β by the QR approach, which is not seen by the Long and Ackerman method.
The remaining 39 sites show good agreement with the Long and Ackerman method for the clear-sky shortwave fluxes, with an RMSE below 18.5 W/m² at every site and 7.7 W/m² on average. The explained variance of the fractional clear-sky transmission is on average R² = 0.63 and exceeds 0.24 at every site. Given that only global radiation is used to diagnose the clear-sky fluxes, we believe that the deviations at these sites are acceptable and illustrate that the QR method is a reliable tool to estimate fractional solar transmission and clear-sky fluxes at a monthly time scale in various environments and atmospheric conditions.

Interpretation
We introduced and successfully tested a simple methodology to estimate the monthly fractional clear-sky transmission by only using global radiation measurements at a half-hourly resolution. The proposed QR method can be considered a clear-sky identification approach, which differs from approaches that model the radiation transfer through the atmosphere. The latter approaches have to link column-integrated data of water vapor and aerosols with surface radiation measurements, which may introduce further uncertainties (Ruiz-Arias & Gueymard, 2018).
The QR method exploits the commonly observed linear relationship of surface solar radiation measurements with incoming solar radiation at the TOA under cloud-free conditions, which forms a rather distinct slope close to the upper envelope. This linear relationship is reliably estimated with quantile regression when using a rather high quantile. The slope of this linear relationship, β, can be thought of as the fractional solar transmission under cloud-free conditions, and it is typically smaller than 1. Here we find an average of β = 0.84 (see Table 1). The complement, that is, (1 − β), represents fractional absorption and backscattering of shortwave radiation in the atmosphere. Fractional absorption was found to be around 0.23 ± 0.02 under all-sky conditions (Hakuba et al., 2016) and 0.21 under clear-sky conditions (Wild et al., 2019). Note that Hakuba et al. (2016) used the difference of top of atmosphere and surface net shortwave radiation under all-sky conditions from the CERES-EBAF remote-sensing product, while here we used cloud-free conditions of the downwelling radiation only. Nevertheless, there is broad agreement, indicating that a large part of the reduction of solar radiation at the surface under cloud-free conditions can be attributed to the absorption of shortwave radiation within the atmosphere.
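The slope estimate described above can be illustrated with a minimal sketch. It assumes a regression through the origin (no intercept), which is an assumption made here for simplicity; in that case the quantile-regression slope equals the weighted ω-quantile of the ratios of observed to potential radiation, with the potential radiation as weights. The function name `quantile_slope` and the synthetic data are hypothetical and serve only as illustration, not as the implementation used in this study.

```python
import random


def quantile_slope(r_pot, r_sd, omega=0.85):
    """Slope beta of a no-intercept quantile regression r_sd ~ beta * r_pot.

    For positive r_pot, the check-loss minimiser is the weighted
    omega-quantile of the ratios r_sd / r_pot with weights r_pot.
    """
    # Keep daytime samples only and sort them by their transmission ratio.
    pairs = sorted((y / x, x) for x, y in zip(r_pot, r_sd) if x > 0)
    if not pairs:
        raise ValueError("no samples with positive potential radiation")
    target = omega * sum(weight for _, weight in pairs)
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return ratio
    return pairs[-1][0]  # guard against floating-point round-off


# Synthetic month (illustrative assumption): 30% of the samples are
# cloud-free with a transmission near 0.8, the rest are reduced by clouds.
rng = random.Random(0)
r_pot, r_sd = [], []
for _ in range(600):
    pot = rng.uniform(50.0, 1200.0)          # potential radiation, W/m2
    if rng.random() < 0.3:
        trans = 0.8 + rng.gauss(0.0, 0.01)   # clear-sky sample
    else:
        trans = rng.uniform(0.1, 0.75)       # cloudy sample
    r_pot.append(pot)
    r_sd.append(trans * pot)

beta = quantile_slope(r_pot, r_sd, omega=0.85)
print(f"estimated fractional clear-sky transmission: {beta:.3f}")
```

In this synthetic case the estimate falls within the clear-sky cluster because cloud-free samples occupy the upper range of the ratios. A general-purpose quantile regression routine would be needed if an intercept or further predictors were included.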
We find that the monthly time series of fractional clear-sky transmission show a pronounced seasonal cycle at extratropical sites (Figures 6 and S2), with higher fractional shortwave transmission during winter than during summer. The seasonality can be linked to the seasonality of water vapor in the atmosphere, which is an important absorber of shortwave radiation (DeAngelis et al., 2015). This is also consistent with the reduced seasonality toward the tropics and the lower seasonality in maritime environments compared to continental climates.

Limitations
The proposed QR method strongly relies on sufficient samples of cloud-free conditions. Here we used one month as the sampling period, which is sufficient for most sites to obtain reliable estimates; however, persistent overcast conditions can lead to distinctly smaller estimates of fractional clear-sky transmission. Here we used the R1 goodness-of-fit metric and a threshold of 25% of the site-average estimate to identify these periods. Increasing the sampling period allowed us to obtain estimates throughout the year for almost all sites and months.
The method also requires choosing a quantile ω for which the slope β is estimated. Generally, this quantile ω should lie above the 75th percentile and below the 97th percentile. At the highest quantiles the performance decreases drastically. This is because partly cloudy conditions can enhance global radiation through enhanced diffuse radiation combined with high direct radiation. Hence, the occurrence of such conditions at a given site can influence the choice of ω for quantile regression. Further research, which should include diffuse radiation measurements, is required to improve the approach at specific sites.
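The role of ω can be made explicit with the standard quantile regression objective (Koenker, 2005). The notation below is an assumption made for illustration: R_{sd,i} denotes the observed global radiation, R_{pot,i} the potential solar radiation of sample i, and the model is written without an intercept.

```latex
\hat{\beta}(\omega) = \arg\min_{\beta} \sum_{i=1}^{n}
  \rho_{\omega}\!\left( R_{sd,i} - \beta \, R_{pot,i} \right),
\qquad
\rho_{\omega}(u) = u \left( \omega - \mathbb{1}[u < 0] \right)
```

Observations above the fitted line are weighted by ω and observations below it by (1 − ω). As ω approaches 1, the fitted line therefore moves toward the upper envelope of the sample, where the cloud-enhanced values described above gain influence on the slope.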
Here we used the Long and Ackerman method as a reference and found that the 85% quantile provides the best results across the whole set of sites. There are sites where lower quantiles would improve the skill of the method, and further investigation of the radiative conditions at these sites could improve the comparison with the standard method. However, the sensitivity of the correlation and the bias shows a rather flat plateau around the 85% quantile, which indicates that the uncertainty stemming from the choice of ω is relatively small. Given further uncertainties in the measurements, we believe that the good error statistics for most sites demonstrate the applicability of the QR approach.

Potential Applications
Global radiation is commonly measured at national meteorological sites and in many research networks (e.g., FLUXNET), which allows the method to be applied over a much wider spatial and temporal domain than is available with the BSRN network. Thereby, one can obtain relevant climatic information on fractional solar transmission, which is shaped by absorption of solar radiation in the atmosphere and from which clear-sky fluxes and the shortwave cloud radiative effect can be computed. Spatial and temporal trends are of high interest for understanding, for example, the reasons for global dimming and brightening (Wild et al., 2005). Station-based estimates of fractional clear-sky transmission and the cloud radiative effect could help to attribute the decadal changes in global radiation observed, namely, in Europe, Northern America, and Asia (Sanchez-Lorenzo et al., 2017; Wild, 2016).
With the dense network of global radiation measurements, it is possible to obtain estimates of clear-sky radiative fluxes at the monthly time scale. These may be used for the validation of remote sensing-based estimates (e.g., CERES), as a reference for climate models (Wild et al., 2019), or for the planning of photovoltaic installations (Ruiz-Arias & Gueymard, 2018).
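How a monthly transmission estimate translates into clear-sky fluxes and a shortwave cloud radiative effect can be sketched as follows. The sketch assumes that the clear-sky flux is the product of β and the potential solar radiation, and that the cloud radiative effect is taken as the mean difference between the observed (all-sky) and the clear-sky downwelling flux. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def clear_sky_flux(beta, r_pot):
    """Clear-sky surface flux (W/m2): transmission times potential radiation."""
    return [beta * pot for pot in r_pot]


def shortwave_cre(r_sd, r_sd_clear):
    """Mean shortwave cloud radiative effect (W/m2): all-sky minus clear-sky."""
    if not r_sd or len(r_sd) != len(r_sd_clear):
        raise ValueError("need two non-empty series of equal length")
    return sum(obs - clear for obs, clear in zip(r_sd, r_sd_clear)) / len(r_sd)


# Hypothetical values: a monthly transmission estimate and a short series
# of potential and observed global radiation in W/m2.
beta_month = 0.8
r_pot = [0.0, 200.0, 600.0, 1000.0, 600.0, 200.0]
r_sd = [0.0, 120.0, 480.0, 500.0, 300.0, 160.0]

r_clear = clear_sky_flux(beta_month, r_pot)
cre = shortwave_cre(r_sd, r_clear)
print(f"mean shortwave cloud radiative effect: {cre:.1f} W/m2")
```

A negative value indicates that clouds reduced the surface flux relative to cloud-free conditions over the averaging period.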

Conclusions
We proposed a novel, parsimonious approach to derive clear-sky solar radiation fluxes using only commonly measured global shortwave radiation observations at (half-)hourly resolution. The key idea is that measurements taken under clear-sky conditions are linearly related to potential solar radiation. The slope of this relationship can be reliably estimated with the statistical method of quantile regression (Koenker, 2005). We used data of the BSRN network, which provides the highest-quality observations of surface radiation, to evaluate the proposed method against a well-accepted standard approach (Long & Ackerman, 2000). The standard approach is based on data at high temporal resolution and also requires diffuse solar radiation measurements, so it can only be applied at well-equipped sites such as those of BSRN. By comparing the fractional solar transmission derived by the novel approach and by the standard approach, we found the best agreement with the quantile ω = 85%, which is the only parameter that must be set a priori. The estimated clear-sky shortwave radiation reproduces the temporal variation with low bias and high correlation across the 42 sites evaluated here. These results provide confidence to apply the novel method at meteorological sites where global radiation is measured accurately. Given the much larger coverage of available global radiation records, the proposed method has the potential to provide estimates of monthly fractional clear-sky shortwave transmission and clear-sky fluxes with higher spatial and temporal coverage than before. These estimates can be used to analyze the attenuation of solar radiation by gases and aerosols and the cloud radiative effect on shortwave radiation from a much larger observational network.
These aspects are key to improving and validating remote sensing-based as well as model-calculated estimates of shortwave radiation; they can improve the planning of solar energy installations and provide observational constraints on cloud and aerosol effects relevant to the surface climate.