Comparing Approaches to Deal With Non-Gaussianity of Rainfall Data in Kriging-Based Radar-Gauge Rainfall Merging

Merging radar and rain gauge rainfall data is a technique used to improve the quality of spatial rainfall estimates and in particular the use of Kriging with External Drift (KED) is a very effective radar-rain gauge rainfall merging technique. However, kriging interpolations assume Gaussianity of the process. Rainfall has a strongly skewed, positive, probability distribution, characterized by a discontinuity due to intermittency. In KED rainfall residuals are used, implicitly calculated as the difference between rain gauge data and a linear function of the radar estimates. Rainfall residuals are non-Gaussian as well. The aim of this work is to evaluate the impact of applying KED to non-Gaussian rainfall residuals, and to assess the best techniques to improve Gaussianity. We compare Box-Cox transformations with k parameters equal to 0.5, 0.25, and 0.1, Box-Cox with time-variant optimization of k , normal score transformation, and a singularity analysis technique. The results suggest that Box-Cox with k 5 0.1 and the singularity analysis is not suitable for KED. Normal score transformation and Box-Cox with optimized k , or k 5 0.25 produce satisfactory results in terms of Gaussianity of the residuals, probability distribution of the merged rainfall products, and rainfall estimate quality, when validated through cross-validation. However, it is observed that Box-Cox transformations are strongly dependent on the temporal and spatial variability of rainfall and on the units used for the rainfall intensity. Overall, applying transformations results in a quantitative improvement of the rainfall estimates only if the correct transformations for the speciﬁc data set are used.


Introduction
Accurate spatially distributed estimates of precipitation are desirable in many water-related fields, like hydrology, water quality monitoring, and water resources management. Merging weather radar rainfall estimates and rain gauge data has proved to be a successful solution compared to the use of a single source of information (Goudenhoofdt & Delobbe, 2009;Haberlandt, 2007;Jewell & Gaussiat, 2015;Schuurmans et al., 2007;Wilson, 1970). Several techniques have been developed for merging radar and rain gauge data, many of which are based on kriging interpolation methods (Goudenhoofdt & Delobbe, 2009;Li & Heap, 2011;McKee & Binns, 2015). In many comparative studies, kriging with external drift (KED-also known as universal kriging, or regression kriging) is reported to be one of the best performing merging methods (Boudevillain et al., 2016;Goudenhoofdt & Delobbe, 2009;Jewell & Gaussiat, 2015;Li & Heap, 2011;Nanding et al., 2015;Velasco-Forero et al., 2009). Other methods that perform comparably well, usually require considerably more computational effort, making use of large covariance matrices or Monte Carlo methods (e.g., Scheidegger & Rieckermann, 2014;Todini, 2001) and this high computational cost is an obstacle for their operational usage, especially for large areas, or for long-time series. In spite of the good performance of KED, the method has some limitations that may introduce errors in the rainfall estimates. One of the limiting factors of many kriging-based merging techniques is their applicability only to Gaussian variables (Diggle & Ribeiro, 2007;M€ uller, 2007). In particular, KED requires the residuals between the rainfall process and its mean (or drift, modeled as a linear function of the radar estimate) to be Gaussian. Although in many rainfall data merging applications the Gaussian approximation is adopted (e.g., Delrieu et al., 2014;Schiemann et al., 2011;Verdin et al., 2015), rainfall intensity and accumulation have a strongly non-Gaussian, skewed, probability distribution, with only positive values and a discontinuous peak for the value zero. The residuals are not known a priori, but can be approximated by the difference between rain gauge values (assumed as the true rainfall at point locations) and a linear function of the radar values at rain gauge locations, derived from a linear regression (Delrieu et al., 2014). Since the difference between two Gaussian variables results in a Gaussian variable, often Gaussianity of residuals is achieved applying Gaussian transformations on rain gauge and radar rainfall data. This leads to the use of trans-Gaussian kriging methods, i.e., to the application of kriging to variables transformed with analytical or numerical Gaussian transformations (Cressie, 1993). Various transformations can be applied to rainfall, like square root (Sideris et al., 2014), fourth root (van de Beek et al., 2011), Box-Cox with variable k parameter (Erdin et al., 2012), logarithmic (Sinclair & Pegram, 2005), or normal score transformation (de Wit et al., 2008;Germann et al., 2009;Villarini et al., 2014). Beyond trans-Gaussian kriging, some other approaches can be used to address non-Gaussianity of rainfall data. A possible approach is to separate the wet areas from the dry ones at each time step and do the merging only in the wet ones, to eliminate the discontinuity in the probability distribution corresponding to the zero values (Barancourt et al., 1992;Braud et al., 1994;Schleiss et al., 2014). The extension of this approach is to separate different rain intensity intervals and krige them separately. Indicator kriging transforms the continuous variable in a categorical one, allowing to krige separately the values in different intervals, thus approximately reproducing the original cumulative density function (Remy et al., 2012). It has been applied to KED, but an improvement is not always observed (Berndt et al., 2014;Haberlandt, 2007). Disjunctive kriging uses a nonlinear formulation, based on Hermite polynomials, to include a nonlinear bijective relationship between the actual variable and a normal one (Yates et al., 1986a(Yates et al., , 1986b(Yates et al., , 1986c. It has been applied to cokriging merging, with good results, but the method is complex, computationally intensive, and its application to KED is not straightforward, given the different ways that radar and rain gauges are integrated in KED (Azimi-Zonooz et al., 1989). The local singularity analysis (SA) proposed by Wang et al. (2015), separates the non-Gaussian features in the spatial field (local singularities), and performs kriging only on the nonsingular part, approximately Gaussian, recovering the singularities on the merged result. It has been applied to the Kalman filter, Bayesian merging technique proposed by Todini (2001), but it has not been applied to Kriging with External Drift so far. The methodology is used to deal with the smoothing in rainfall fields due to the application of Gaussian merging methods.
Although many methods of different complexity are available, not much work has been done on the comparison between them and on the uncertainty reduction deriving from their application. Kirstetter et al. (2010) showed that the probability distribution of rainfall residuals conditioned on the rainfall intensity is not as far from Gaussian as the probability distribution of rainfall, suggesting that a transformation of rainfall may be not necessary at all for KED application. Erdin et al. (2012) studied the effects of using Box-Cox transformations in KED merging, comparing the use of different k parameters. The work acknowledges that the rainfall is best transformed to a normal variable using a transformation close to logarithmic (k ! 0), but at the same time, an almost logarithmic transformation introduces biases in the back transformation; therefore, the study recommends a parameter close to k50:2, and proposes a Bayesian approach to infer the best parameter at each time step. It must also be noticed that the logarithmic transformation cannot be applied to rainfall data, because of the impossibility of transforming the zero values; therefore, low values of k need to be tested instead. Although Erdin et al.'s findings are particularly significant to the aim of this work and are compared to the outcome of it, the study does not compare Box-Cox transformations with numerical transformations or alternative methods. The Normal Score Transformation (NST), also known as Normal Quantile Transformation (NQT), is a technique widely used in hydrology (Bogner et al., 2012;Montanari & Brath, 2004;Villarini et al., 2014;Weerts et al., 2011). It is based on the idea of finding an empirical transformation to convert the quantiles of the available data set probability distribution, to the quantiles of a standard normal distribution. Its application to rainfall presents some difficulties that need to be addressed. The application of NST should be done on variables with a strictly increasing, and continuous distribution (Herr & Krzysztofowicz, 2005;Kelly & Krzysztofowicz, 1994). When this is not the case, like in the case of rainfall, numerical approximations need to be introduced. For example, the discontinuity of the rainfall probability distribution in zero is addressed by Lien et al. (2013), or the back-transformation of values outside the observed domain is addressed by Bogner et al. (2012). Similar solutions are addressed and discussed in the methodological part of this paper.
The objectives of this work are: 1. To compare different rainfall products generated using different approaches to address the rainfall non- This work provides a comprehensive comparison of the most popular methodologies to address the Gaussian approximation in radar-gauge rainfall merging and the results are of practical use for the scientific community and operational agencies. The work confirms and expands the limited literature existing in this field. A broader comparison, both in terms of methodology and investigation techniques, is implemented. The results reinforce the validity of previous findings, also thanks to the application to a different case study and to different data sets, and show more insight into the applicability of the different methodologies.
Additionally, some important considerations on the variability of the optimal transformation with time and space variant rainfall intensity and used units are included.
The document is organized as follow: the case study and the used data sets are presented in section 2; section 3 presents all the tested methods to correct for Gaussianity; section 4 introduces the techniques used for the evaluation of the results; the results obtained by the application of the methods to the case study and the quantitative and qualitative evaluations are presented in section 5 and discussed in section 6; the conclusions are drawn in section 7.

Data Sets and Case Study
The proposed case study is based on 9 months of data, from January 2016 to September 2016, in the UK. The data represent different weather conditions: In January, February, and the beginning of March the rainfall events were mainly stratiform; in June, July, and August, strongly convective events occur; in April, May, and September rainfall is characterized by a mix of frontal rainfall and convective precipitation. The analysis is conducted only on ''wet hours'' defined as those hours in which both the rain gauges and the radar recorded rainfall in the studied area, the mean of all the radar pixels is greater than 0.001 mm/h and at least the 1% of pixels record rainfall. The thresholds are empirical, based on the specific case study and resulted in 3,968 h of data available for the study.
The considered area is a large portion of Northern England, 200 km by 200 km wide. The area covers flat and hilly orography, as well as more urban environment in the South and more natural in the North.
The area contains 212 tipping bucket rain gauges from the Environment Agency, with a tipping resolution of 0.2 mm. The data set is freely available upon request (national.requests@environment-agency.gov.uk) under the Open Government Licence. The data set was accumulated to hourly time steps. The data set was quality checked, to eliminate the stations that reported an excessive number of missing values, and to eliminate those presenting suspicious behaviors (signs of blocking, strong, and consistent disagreement with neighboring stations and the radar, or unrealistic prolonged dry spells). This operation was performed manually and expert judgment was used to recognize errors. The Environment Agency rain gauge network is represented in Figure 1.
The radar rainfall data mosaic is provided by the UK Met Office at 5 min resolution on a 1 km by 1 km grid (Met Office, 2003). The study area is covered by three C-band weather radars: Hameldon Hill, High Moorsley, and Ingham. The radar rainfall product is provided by the UK Met Office through the BADC (British Atmospheric Data Centre) portal already corrected for beam blocking, clutter, attenuation, anomalous propagation, vertical reflectivity profile, bright band, and orographic enhancement; it is also bias-corrected with rain gauge measurements (Harrison et al., 2000(Harrison et al., , 2009. The rain gauge network used by the UK Met Office for bias correction is different and independent from the one used in this work, managed by the Environment Agency. In 2016, the Hameldon Hill and the Ingham radars were already converted to polarimetric, while the High Moorsley radar still used single-polarization. The data, as provided by the UK Met Office, are at 5 Water Resources Research 10.1002/2016WR020330 min resolution and presents some gaps over time. Any missing 5 min resolution radar scans were interpolated by advecting the rainfall radar field with a nowcasting model (maximum of 3 h), and the 5 min radar data are accumulated at hourly time steps. However, the nowcasting interpolations represent less than the 0.8% of the data. A residual 0.6% of radar data remains not available.

Methods
Starting from the same data set, different rainfall products are obtained by applying different techniques to correct the rainfall non-Gaussianity. The transformed rain gauge and radar data are then merged and back transformed. The original radar data and an ordinary kriging interpolation of rain gauges without the use of radar data are added to the comparison as references. The methods and the products are then compared under different aspects, presented in section 4. The compared products are: 1. Radar only-RD 2. Ordinary Kriging (using only rain gauges)-OK 3. KED with no transformation applied-NO 4. KED applying Box-Cox transformation with k50:1 (almost logarithmic)-BC 0.1 5. KED applying Box-Cox transformation with k50:25 (fourth root)-BC 0.25 6. KED applying Box-Cox transformation with k50:5 (square root)-BC 0.5 7. KED applying Box-Cox transformation with time-variant optimization of k-BC opt 8. KED with Normal Scores Transformation-NS 9. KED with adapted Singularity Analysis-SA The proposed methods aim at transforming the rainfall data sets to be merged and back transform the results of the KED merging. The KED merging algorithm is therefore not affected and is the same for all the products. We use the formulation included in the gstat R package, based on Cressie (1993).

Ordinary Kriging (OK)
Ordinary Kriging (OK) is one of the most popular interpolation methods and can be summarized as a weighted average interpolation, where the weights are a function of the distance between the rain gauge observations and the estimated point, and of the correlation characteristics of the modeled phenomenon as observed from the available data. OK does not use radar information for the interpolation. In this work, a global average is used, i.e., all the rain gauge observations are used to estimate each point on a 1 km by 1 km grid. The rainfall is estimated in a generic point x 0 as follow: where b R OK x 0 ð Þ is the estimated rainfall in a generic point x 0 , R x a ð Þ are the measured values in the rain gauge locations x a , n is the number of observations, and w a are the kriging weights, estimated minimizing the variance under unbiased conditions, which results in solving the kriging system (Cressie, 1993): Where C d ð Þ is the covariance function at distance d, x a and x b are generic rain gauge locations, and l 1 is a Lagrange parameter (Cressie, 1993). In this paper, the covariance function is estimated at each time step as illustrated in section 3.7. When the kriging system is solved, the optimized kriging variance for each point x 0 is equal to: where c is the sill parameter, explained in section 3.3.

Kriging With External Drift (KED)
Universal kriging, as opposed to ordinary kriging, considers the mean of the studied field R x ð Þ nonstationary in space: where m x ð Þ is the mean and d x ð Þ is a zero-mean stationary random process (Cressie, 1993). Kriging with External Drift (KED) is a special case of universal kriging, in which the mean is considered a linear function of external factors. In the presented case, the mean of the rain gauge interpolation is considered a linear function of the radar rainfall estimate: where r x ð Þ is the radar rainfall estimate at location x, whereas a and b are linear coefficients to be determined.
KED equations are very similar to the ones for OK. The rainfall estimation at location x 0 , b R KED x 0 ð Þ, is derived from equation (3) but the weights are calculated as follows: where r x ð Þ is the radar observation in a point x and l 2 is a second Lagrange parameter. Similarly, the kriging variance equation for each point x 0 is modified as follows:

Variogram and Covariance Functions
The correlation characteristics of the modelled processes are estimated at each time step after the data set transformation. However, KED requires the variogram to be estimated on the rainfall residuals, which are not known a priori, therefore a four-passage process is followed to estimate the residual variogram.
1. The rainfall variogram is estimated at each time step using a subset of the wet radar pixels. Radar estimates are used instead of rain gauges, to have a sufficient number of observations at each time step. In particular, if less than 1,000 wet pixels are available, all of them are used (however, at least 400 pixels-1% of the total-need to be wet to consider the time step ''wet''), otherwise the subset is limited to 1,000 pixels randomly selected, to limit the required computational time. The empirical variogram is calculated using 1 km bins, and then fitted with an exponential function: where h is the range parameter, c 0 is the nugget, and c is the sill.
2. OK is performed on the rain gauges as illustrated in section 3.1, obtaining the field b R OK x ð Þ. 3. The residuals are estimated as the difference between the OK field and a linear function of the radar estimate: where a and b are estimated fitting a linear model between the OK field b R OK x ð Þ and the radar field r x ð Þ.
4. An empirical variogram is calculated as in point 1, using the residual field y x ð Þ instead of the radar field.
The covariance function C d ð Þ is directly derived from the variogram function c d ð Þ, thanks to the assumption of stationary random function:

Box-Cox With Fixed k
The Box-Cox transformation is one of the most well-known analytical transformations used to improve the Gaussianity of a given data set. The transformation is dependent on only one parameter k: where y is a nontransformed variable of interest and y Ã is the variable after transformation.
In this work, three fixed k parameters are tested. k50:1 is used to test an almost-logarithmic transformation, since the logarithmic one with k50 cannot be applied to the rainfall data set, because of the presence of zero values. k50:25 corresponds to a fourth root of the variable, i.e., to a square root-square root transformation. Finally, k50:5 corresponds to a square root transformation of the variable. The same transformation is applied to both the radar and the rain gauge data set before merging. KED provides us with mean and variance of the merged field, in the transformed domain. For the square root and for the fourth root transformations, there is an analytical formula to obtain the mean and the variance in the untransformed space. Nevertheless, for consistency with all the other methods that do not have an analytical backtransformation, the back-transformation is accomplished applying the inverse of the analytical transformation function to 99 quantiles and calculating the mean of the back-transformed quantiles. This is done for each pixel at each time step, based on the kriging mean and variance.

Box-Cox With Time-Variant Optimal k
The Box-Cox transformations with a fixed k parameter assume that the probability distribution of rainfall is stationary in space and time. While we have no means to estimate separately the probability distribution of Water Resources Research 10.1002/2016WR020330 the different points in space, it is possible to assess at each time step what is the best transformation to apply. Erdin et al. (2012), propose a method to estimate the optimal k parameter at each time step using Bayesian inference.
The assumption that in the transformed space rain gauge measurements should be equal to a linear function of the radar measurements (drift) plus an additive normal noise that is independent and identically distributed is used to construct a likelihood function. This likelihood function is used to arrive at the posterior distribution of k; and the value at the maximum posterior probability density is chosen.
In our analysis, we prefer to use a different objective function for two reasons. First, maximizing the posterior density does not ensure to maximize the Gaussianity of the sample. Second, posterior probability tends to be highly sensitive to the choice of our prior distribution, given the limited number of spatial observations at each time step.
We use an approximation of negentropy as objective function. Negentropy tests are based on the information theory principle that, given a certain variance, entropy is maximum for the Gaussian distribution (Cover & Thomas, 1991). Negentropy, defined as differential entropy, is zero for Gaussian variables, and larger than zero otherwise, but it requires knowledge of the variable probability distribution to be calculated. Approximations of negentropy are therefore used instead. Here, we use an approximation based on the use of nonquadratic functions (Hyv€ arinen & Oja , 2000): Where E 2 f g represents the expected value operation, J y ð Þ is the negentropy of the normalized residual variable y; J a y ð Þ is its approximation, z is a normal variable, and G is a nonquadratic function, in this case exponential. The approximation of negentropy J a y ð Þ ranges between zero and 11, and for Gaussian distributions it is equal to zero. Therefore, we select the k that, applied to rain gauge and radar data, minimizes the approximation of negentropy of the residuals. As in Erdin et al. (2012), the tested k values are limited to the interval [0.2, 1.5]. To avoid skewing the residual distribution using an ordinary kriging interpolation of the rain gauges, the residuals are calculated only at rain gauge locations. However, to ensure that at each time step a sufficient number of points is available to optimize k; a time window centered in the time step of interest is used, so that at least 500 points are available. The time window for this case study is never larger than 11 h.

Normal Score Transformation
Normal score transformation (NST) is an empirical transformation method, that associates each quantile of a given probability distribution to the corresponding quantiles of a standard normal probability distribution, thanks to multiplicative factors (scores, Bogner et al., 2012). The method is simple and very effective, but requires the data set to transform to have a continuous strictly increasing cumulative distribution. Rainfall data sets do not respect these conditions and therefore the method has to be modified, and some approximations need to be done.
In particular, two problems are here addressed: 1. The data sets contain a large number of zeros, meaning that the cumulative distribution has an initial jump for the value zero. Additionally, the same problem appears for values multiples of 0.2 mm because the rain gauges have a tipping bucket resolution of 0.2 mm; therefore, the data set is not strictly continuous. This results in the fact that equal values in the nontransformed domain are associated to multiple different values in the transformed domain. The method used to solve the problem is similar to the one proposed by Lien et al. (2013): to all the equal values in the nontransformed domain is associated the median of all the different values that would have been obtained in the transformed domain applying NST without correction. Figure 2 illustrates the problem and the adopted solution for the zero value as an example. Considering both solutions, the normal score transformation is applied as follow: 1. At each time step, both the radar and the rain gauge data sets are considered, and all the values are sorted in increasing order. 2. The corresponding number of quantiles from a standard normal distribution are calculated. 3. A lookup table is created matching the quantiles from the original distribution to the corresponding quantiles of the standard normal distribution. 4. Both the radar and the rain gauge data sets are transformed using the lookup table. 5. The maximum normal quantile corresponding to zero NS 0;max is recorded, before applying the correction method for repeated values presented above. All the values in the transformed domain corresponding to equal values in the original domain are substituted with their median.
At this point the KED merging can be performed, resulting in a probabilistic estimation of rainfall on a grid in the transformed domain, described by a mean value and a variance for each pixel. The back transformation is calculated as follow: 1. For each pixel, given the mean and the variance obtained from the KED merging, 99 quantiles are calculated. 2. For each quantile, if it is lower than the zero threshold NS 0;max , it is back-transformed to zero, otherwise, the two values above and below are found in the lookup

Singularity Analysis
The singularity analysis method, introduced by Cheng et al. (1994) to identify geochemical anomalies, was adapted to identify and separate singularities in rainfall by Wang et al. (2015). According to Wang et al. (2015 the term singularity refers to ''an anomalous amount of energy release or mass accumulation [. . .] often associated with structures depicting fractality or multifractality.'' It must be noted that the aim of the method is not strictly to obtain a Gaussian field, but to separate singularities, which are characteristic of non-Gaussian structures and carry information relative to moments beyond the second. However, removing singularities in the merging phase has also the effect of obtaining a field closer to Gaussian; recovering the singularities in the merged products should help reconstruct a field with a probability distribution similar to the one of the original data. Rainfall singularities are characterized by the fact that the areal averaged rainfall centered in x follows a power function of the area (Wang et al., 2015): where 5l=L, i.e., the side of the square under consideration l divided by the side of the maximum considered square L; q x; ð Þis the average rainfall intensity in a square centered in x with side l, c SA x ð Þ is a constant intensity value, a SA x ð Þ is the singularity index, and E SA 52 is the Euclidean dimension. The squares considered for each pixel are selected to have l51; 3; 5; 7; 9 km.
Þcan be calculated as the intercept and the slope of a line fitting log q x; ð Þ ð Þas function of log ð Þ: Then, c SA x ð Þ can be considered the nonsingular rainfall value. In reality, to reach stability, the process needs to be iterated: where c SA 21 x ð Þ is the original value and k50; 1; . . . n. A number of iteration equal to 5 (k54) is selected, as it should be sufficient to remove the singularities (Wang et al., 2015). Wang et al. (2015) applied the singularity analysis to the Kalman filter, Bayesian radar-gauge merging technique (Todini, 2001). The Bayesian merging technique does not require to remove the singularities from rain gauge data, because a block kriged rain gauge field is used to be merged with the nonsingular radar field, which already produces a smoother and more Gaussian field. In kriging with external drift, instead, the rain gauge point measurements are used directly, without previous interpolation. The main limiting factor for the application of the singularity analysis method to KED is that the singularity removal cannot be applied to rain gauge data, because they lack the areal characteristics necessary to apply the method. The singularity identification and removal is therefore only applied to radar. An attempt to modify rain gauge values using the same factor of the radar pixel containing the rain gauge was made, but the results were not satisfactory, due to the very large representativeness difference between the point measurement and the 1 km by 1 km radar pixel value, especially for the convective events that generate the most intense singularities.
The removed singularities are reintroduced by multiplying the nonsingular KED rainfall estimation b R KED;SA x ð Þ by the ratio between the calculated nonsingular radar field r SA x ð Þ, and the original radar field r x ð Þ:

Evaluation Techniques
The studied methods are evaluated under three points of view, as reported in Figure 3. The first evaluation stage is based on the ability to transform the rainfall residuals in Gaussian variables. The second evaluation stage is based on the ability to reproduce the original probability distribution after back-transformation of the merged products. The true probability distribution of rainfall is not known, but it is assumed that the radar Probability Density Function (PDF) is proportional to it, and therefore this test is done comparing the radar data and the back-transformed KED product with QQ-plots. Finally, the most important evaluation stage assesses the quality of the final product. The KED rainfall estimates are validated with ''leave one out'' cross-validation. Figure 3 illustrates when each evaluation technique is used in the merging process, as well as what questions it aims at answering.

Gaussianity of the Residuals
Although transformations are applied to radar and rain gauge rainfall, KED requires rainfall residuals to be Gaussian. A Gaussianity test is applied at each time step to the residuals, as calculated in section 3.5. There are several methods used in literature to assess the Gaussianity of a data set. Two of the most used indicators are kurtosis and skewness, which give a measure of the ''tailedness'' and asymmetry of a distribution, Water respectively. Both are informative measures, but are not univocally indicators of Gaussianity (Joanes & Gill, 1998), since they measure the fourth and the third moment, respectively. A third method, the approximation of negentropy, is based on the notion that entropy is maximum for a Gaussian variable, for a given variance. The presentation of the three indicators together gives a more complete overview of the data set Gaussianity. The indicators are calculated at each time step, using a time window as illustrated in section 3.5. The values are then averaged over each month.

Kurtosis
Kurtosis is a measurement of the fourth standardized moment, and has been proposed in different forms.
In this work we adopted the formulation used in many statistical software packages, like SAS, SPSS, and EXCEL (Joanes & Gill, 1998): where y i are the available residual values, y is their mean, and n is the number of samples. Kurtosis G 2 y ð Þ ranges between 21 and 11, and for Gaussian distributions it is equal to zero.

Skewness
Skewness is a measure of asymmetry of the distribution and is defined as the third standardized moment. The formulation adopted here is the one adopted in statistical packages, similarly to the kurtosis one, (Joanes & Gill, 1998): Skewness G 1 y ð Þ ranges between 21 and 11, and for Gaussian distributions it is equal to zero.

Approximation of Negentropy
The approximation of negentropy is a fast and robust method to estimate the Gaussianity of a variable. The formulation in section 3.5 is used to evaluate the Gaussianity of the residuals as well.

Coefficient of Determination R 2 of QQ-Plots
The second mean of evaluation of the studied methods is the comparison between the original rainfall probability distribution, and the probability distribution of the merged rainfall products after backtransformation. Since the probability distribution of the true rainfall is not known, the probability distribution of the original radar data set is compared to the probability distribution of the KED back-transformed rainfall estimates. Although rain gauges are assumed to be representative of the true rainfall, we decided to use the radar data for two reasons: 1. Radar rainfall is a continuous measurement, while rain gauge data is discretized due to the bucket resolution equal to 0.2 mm. 2. Rain gauge data points are available at given locations only, and they may miss the extremes of the distribution.
The 99 quantiles of the radar PDF are plotted against 99 quantiles of the KED products. Since bias may be present in the radar data and may have been adjusted through the merging process, we do not expect the data to lay on a x5y line, but rather on a generic straight line. For this reason, a linear model is fit to the QQ-plots and a coefficient of determination R 2 is calculated using the fitted line. R 2 ranges from 0 to 1, with 1 indicating perfect correlation. If the R 2 is close to 1, regardless of the slope and intercept of the fitted line, the method is assumed to reproduce the probability distribution of the original rainfall data.

Rain Gauge Validation
The ability of a method to produce Gaussian residuals and to back transform the final merged rainfall products to the original probability distribution are important for the kriging application, given the Gaussianity assumption underlying the method. Nevertheless, the primary purpose of the Gaussianity correction is to improve the quality and the reliability of the final rainfall estimates. For this reason, the most important evaluation is the validation of the final merged products.
A cross-validation exercise is carried out. The KED estimation is performed leaving one rain gauge out, and estimating the value in its location. Then the exercise is repeated for each rain gauge. The KED estimations at rain gauge locations are compared to the rain gauge values. Two indicators are used to quantitatively compare the KED estimates and the rain gauge measurements: mean root transformed error and bias. The estimators are calculated at each time step and presented averaged monthly, to compare the different performance over the change of season and rainfall conditions.

Mean Root Transformed Error
The Mean Root Transformed Error (MRTE) is a measurement of the difference between observations and modeled data that is preferred in meteorology to the more common Root Mean Square Error (RMSE) because of the lower weight given to high intensity rainfall errors. Its values range from 0 to 11, with optimal value equal to zero. It is calculated as: where b R KED x i ð Þ is the KED prediction at rain gauge location x i , R x i ð Þ is the corresponding rain gauge observation, and n is the number of rain gauges.

Bias
To measure the bias in the estimated rainfall, the mean of the difference between the estimated value and the observed value is calculated. The bias value defined this way can range from -1 to 11, with optimal value equal to 0:

Results
The three evaluation passages shown in Figure 3 and explained in section 4 are used to evaluate the performance of all the tested kriging methods applied to the case study. The results are reported in this section. Additionally, a qualitative evaluation is reported.

Gaussianity of the Residuals
The first evaluation stage assesses how well the methods perform in transforming rainfall residuals to a Gaussian variable. Although the transformations are applied to radar and rain gauge rainfall data, their effects are expected to improve also the Gaussianity of the residuals. For a qualitative understanding of the rainfall and the residuals probability distributions, Figure 4 shows the histograms of radar rainfall and residuals, approximated as explained in section 3.5, with and without transformations, for an example time step  Figure 5 shows how the methods compare to each other and to the untransformed data, in terms of the three selected indicators: kurtosis, skewness, and approximation of negentropy.
The results show that all the methods improve the Gaussianity of the rainfall residuals. The SA is performing worst, probably affected by the impossibility of transforming rain gauge data. The best performing method is the NS, but satisfying results are obtained also with Box-Cox with a fixed k50:1, k50:25; and with the optimal k. The optimized Box-Cox transformation does not perform as well as k50:1; because it is limited in the range [0.2, 1.5]. In accordance to the results published by Erdin et al. (2012), the majority of the parameters are close to the lower limit of k 5 0.2 as shown in Figure 6. This explains why the results for the fixed k 50:25 and the time-variant k are very similar.

Coefficient of Determination R 2 of QQ-Plots
The second evaluation stage assesses the ability of each method to reproduce the original rainfall probability distribution, approximated by a linear function of the radar probability distribution. The 99 quantiles from the KED distribution are plotted against 99 quantiles from the original radar distribution. To account for possible radar biases, the methods are compared using the coefficient of determination R 2 relative to a linear regression fitting the QQ-plots, regardless of the slope and intercept. Example QQ-plots for the same time step considered in Figure 4, 04:00 of the 28 March 2016, are presented in Figure 7, while the quantitative comparison of the R 2 for the whole study, averaged on each month, is reported in Figure 8.
The results of the second evaluation stage show that all methods achieve a satisfactory reproduction of the original rainfall probability distribution, resulting in R 2 values for the QQ-plots above 0.8 in all analyzed cases, and very little differences can be noticed across the studied methods. Ordinary Kriging is included in the comparison as well. The Box-Cox transformation with k50:1 shows worse results overall, while the use of no transformation achieves comparably good results in this evaluation stage, meaning that the use of transformations does not bring particular benefits in terms of reproducing the original rainfall probability distribution. The reproduction of the original probability distribution by KED products seem more effective in winter months, when more stratiform precipitation is observed.

Rain Gauge Validation
The last, but most important stage of evaluation assesses the quality and the reliability of the merged products. Validation of the results is performed with cross-validation. The performance is evaluated with two indicators, namely the Mean Root Transformed Error (MRTE) and the bias. The comparison between the performance of the radar without merging, the OK results, the KED merged product without Gaussianity corrections, and the KED with the different tested correction methods are reported in Figure 9. The results confirm that KED, even without transformations, produces better results compared to the noncorrected radar rainfall estimates. Nevertheless, OK performs better than KED without transformation. The use of a Box-Cox transformation with k50:1, produces significantly worse results. The use of Box-Cox with: k50:5 and of the Singularity Analysis produce nonsatisfactory results as well, comparable to the use of no transformation. Improvements are instead observed using the optimized Box-Cox, Box-Cox with fixed k 5 0.25 and the normal score transformation, both in terms of MRTE and bias.

Qualitative Evaluation
In addition to the skill scores used to quantitatively assess the performance of the studied methods, an example of a visual comparison of the different rainfall estimates is reported in Figure 10. The presented rainfall estimations are relative to the same example presented in Figures 4 and 7, on 04:00 of the 28 March 2016.The visual comparison of the results provides additional information about the features of each method that may not be obvious looking only at the performance indicators. The original radar and OK are compared to the KED results without transformations and with the application of the analyzed methods. In addition, the rain gauge values are superimposed, to understand how well the merging process reproduces Figure 5. Performance of the tested methods in terms of the ability to transform the rainfall residuals into a Gaussian variable. Three indicators are shown: Kurtosis, Skewness, and Approximation of Negentropy, for the studied 9 months. The methods' acronyms are the ones introduced in section 3.

10.1002/2016WR020330
the observed rainfall. The example was selected because it presents at one time many characteristics that appear in other cases as well and can be discussed: radar artifacts, evidence of spatially heterogeneous radar bias (the radar overestimates the rainfall intensity in the south part of the image and underestimates it in the north part), and spatial variability of rainfall. In the radar image, some artifacts are evident: it is possible to identify some partial radar beam blockage in the north east part of the domain; there is also high reflectivity values in the south-east part of the image, not observed by the rain gauge network; on the contrary, the intensity in the north part is underestimated; there are patches of lower reflectivity in the north part of the image, and finally it is possible to identify where the acquisitions from the three radars are joined in one composite, resulting in a sharp change of reflectivity along three straight lines cutting the image in three sectors. These effects are partially corrected in most of the KED results.
As expected, all the KED rainfall estimates with and without transformations show a better agreement with rain gauges than the radar. The ordinary kriging is designed to show perfect agreement where the rain gauges are available, and shows a smoother behavior elsewhere. However, the high rain gauge density allows OK to reproduce remarkably well the spatial features of the rainfall field. The KED with SA shows enhanced spatial features compared to the rest of the products. BC 0.1 presents higher intensity values, compared to the other products.  6. Discussion Figure 4 shows that residuals do have a more Gaussian distribution compared to rainfall, as observed by Kirstetter et al. (2010), but the application of transformations to rainfall can benefit the residuals as well. In particular, for the specific time step, the necessary transformation is well addressed by the optimized Box-Cox transformation and the normal score transformation. From the results in section 5.1, showed in Figure  5, the ability of the different methods to transform the rainfall residuals in a Gaussian variable is assessed. All the tested methods improve the Gaussianity of the residuals. However, poor results are achieved by KED with the Singularity Analysis (SA). In the SA, the residuals are calculated between the untransformed rain Figure 8. Coefficients of determination R 2 relative to the linear models fitting the QQ-plots. The R 2 is considered an indicator of the ability for each method to reproduce the original rainfall probability distribution. The methods' acronyms are the ones introduced in section 3. gauge data and the transformed radar data and this results in a non-Gaussian distribution, which affects all the KED rainfall methods and the quality of the results. The results in section 5.3 do not show a significant improvement in using SA compared to no transformations. For this reason, the SA is considered not appropriate for KED merging, because of the impossibility to apply a transformation to point measurements, and also because the method is computationally very expensive.
As concerns the other tested methods, they all significantly improve the Gaussianity of the residuals. The results confirm that the Box-Cox transformations tend to perform better in transforming rainfall data to Gaussian with a lower k. Nevertheless, transformations with a low k present problems in the backtransformation phase, and therefore in the evaluation phases presented in sections 5.2 and 5.3 in this paper. The results are in agreement with the conclusions of Erdin et al. (2012). The normal score transformation, being explicitly designed to produce Gaussian residuals, achieves the best results overall. The second evaluation stage assesses the ability of the methods to reproduce the original rainfall probability distribution back-transforming the KED merged products. The results are presented in Figure 8, section 5.2. It is remarkable that all the methods perform well, with R 2 values above 0.8, especially OK and KED without transformations. Only the Box-Cox with k50:1 shows an appreciable lower performance due to the exponential ''stretch'' in the back transformation. In fact, what is called an introduced bias in Erdin et al. (2012), is actually not uniform on the rainfall intensity scale, and therefore noticeable in the QQ-plot linear fit. In Figure 7, the QQ-plots for an example time step are reported. The plot corresponding to BC 0.1 has the lowest R 2 values and accentuates the deviations from the linear regression. As concerns the other methods, it seems that KED, especially when transformations are applied, is particularly beneficial in winter months rather than in the summer. This is probably due to the fact that during convective events, the spatial variability of rainfall at subpixel scale causes the rain gauges and the radar to disagree often, therefore the KED finds a weaker correlation between the rain gauges and the radar. When this happens, the radar spatial information is not used to improve the estimates correctly. The SA performs well in terms of reproducing the radar probability distribution, being able to reintroduce all the singularities that the KED merging process would remove. Nevertheless, this is not always a positive feature, as illustrated later in this section.
The exponential stretch effect in Box-Cox with k50:1 is evident in the results of the third evaluation stage too, presented in Figure 9 and section 5.3. The third evaluation stage is the most important, because it assesses the quality of the resulting merged products: the purpose of merging is indeed to improve the quality and the reliability of rainfall estimates, while the probability distributions of the transformed and back-transformed variables are important only if they lead to a higher-quality final product. Indeed, transforming radar and rain gauge data sets with a Box-Cox transformation with k50:1 leads to a good transformation of the variables to Gaussian, but fails in producing a high-quality back-transformed rainfall product, as can be observed in the results of both the second and third evaluation stages; the KED products obtained with the Box-Cox 0.1 transformation shows larger error (MRTE) and larger bias. KED with square root transformation (BC 0.5) or with the Singularity Analysis performs better, but still worse than the other presented methods and do not bring significant advantages compared to the use of KED without transformations. Ordinary Kriging produces outstanding results in this evaluation stage. This may be due to the high-density EA rain gauge network, and also partially due to the rain gauge-based cross-validation method. The KED without transformations performs well, as shown by all the indicators used in the third phase of evaluation (MRTE and BIAS) and it improves the results of the uncorrected radar, which confirms that KED merging, even without transformations, is a positive and reliable practice to improve the quality of radar rainfall estimates. Some transformation methods do improve the results, compared to KED without transformations. The results for the optimized Box-Cox and Box-Cox with k50:25, show remarkably similar results, with a lower MRTE and a lower bias compared to KED without transformations. In fact, as shown by Figure 6, the optimal k is close to the 0.2 lower boundary in the majority of the cases. However, it is important to note that the optimal Box-Cox transformation is very case-specific. In particular, the Box-Cox function has the power to stretch the probability distribution for values below 1 and compress the probability distribution of values above 1. This has two important consequences: (1) the optimal k is a function of the units and (2) the optimal k is a function of the temporal accumulation. It is important therefore to be very careful in selecting the optimal Box-Cox transformation for the specific case study. The use of the Box-Cox optimization method is; therefore, recommended over the use of the k50:25 case, although in this specific case study they work comparably well.
The Normal Score transformation, instead, is not affected by the used units or temporal accumulation. It produces interesting results overall: it shows low MRTE and bias, and performed very well in the transformation and the back-transformation evaluation stages. The qualitative comparison in Figure 10 gives an overview of the methods' characteristics. In particular, it is easy to note the differences between the Box-Cox methods, the Normal Score Transformation, and the Singularity Analysis. As mentioned before, the Singularity Analysis is very good in reconstructing and enhancing the small-scale features. Nevertheless, it is not able to discern between the rainfall field features and the radar artifacts. Methods like the Box-Cox with k5 0:5 and k50:25, and the Normal Score, have a different way to preserve the spatial structures, which can better eliminate the radar artifacts and maintain the rainfall field structures (for example they partially mitigate a lower intensity sector in the top-right corner, or they improve the sharp intensity jump between the Water Resources Research 10.1002/2016WR020330 different radar composite). The way Box-Cox and Normal Score methods preserve the spatial features is using the transformed radar image as a drift, and adjusting it with the rain gauge observations. The spatial features that have a good agreement with rain gauge measurements are maintained, while the ones that are in disagreement with the ground measurements (like the artifacts) are smoothed. In the singularity analysis, instead, the radar field is smoothed before merging, and all the singularities are reintroduced after, without the possibility to compare them with the ground reference.

Conclusions
Kriging merging methods are among the most popular and effective methods to improve radar rainfall estimates and, among them, KED proved to be effective and reliable. Kriging methods are based on the assumption of Gaussianity, and one could argue that their application to rainfall data sets could introduce errors. The objectives of this paper are to understand how significant the uncertainty introduced by the Gaussian approximation is, and what are the best methods to correct it.
Starting from the latter, some methods are identified to be not suitable for KED applications. In particular, Box-Cox transformations with a k close to zero, approximation of the logarithmic transformation, introduces a ''stretch'' in the back transformed value, resulting in unrealistic high intensities of rainfall. This result is consistent with the results of Erdin et al. (2012). The Singularity Analysis does not produce particularly positive results as well, and two reasons are identified. On the one hand, the method relies on a transformation based on areal rainfall characteristics; since the KED merging scheme uses point rain gauge measurements, the method cannot be applied to correct rain gauge data and is applied only on the radar fields, which are used only as a drift. This may limit its effectiveness in improving the merging process. On the other hand, the transformed radar images that are used to preserve the rainfall spatial characteristics in KED are smoothed by the transformation, and therefore during the merging process; the local singularities, i.e., the spatial details, are then reintroduced only after the merging process. This does not allow to compare the spatial features with the ground measurements and therefore to distinguish between the real rainfall features and the radar artifacts, due for example to radar beam blockage, attenuation, clutter, or radar compositing, which are preserved and even enhanced in the merged product.
It is fair to say that the Met Office is currently upgrading all weather radars in the UK to dual-polarization weather radars. This will enable better identification of nonmeteorological echoes (Rico-Ramirez & Cluckie, 2008), improvements of attenuation correction algorithms (Bringi et al., 2001;Rico-Ramirez, 2012), and better rainfall estimates in the presence of partial beam blockage (Lang et al., 2009). The SA method could be more effective in the absence of these weather radar artifacts.
The Normal Score Transformation produces overall positive results in all the stages of the analysis. It is effective in transforming the variables to Gaussian and back-transforming the KED products. The final KED rainfall estimates have a low MRTE and a low bias compared to other methods. However, the method is not as simple to implement for rainfall data as it is when applied to different data sets. The application to rainfall data, with a noncontinuous, not strictly increasing cumulative distribution introduces some challenges, and the back transformation requires a linear interpolation/extrapolation for each pixel, for a sufficient number of quantiles (in this case 99), which results in a computationally slow method. Nevertheless, the Normal Score Transformation has the advantage of being independent on the variable values, and therefore independent on the units or on the temporal accumulation of rainfall.
In terms of Box-Cox transformation, instead, it is necessary to be careful in identifying the optimal transformation for the specific case study. In case units in millimeters per hour are used, like in this case, the majority of the rainfall values fall below the threshold of 1 mm/h in most of the considered time steps, therefore the optimal Box-Cox parameter tends to be low. The Box-Cox transformation with k50:25 produce positive results, under all the examined points of view. It is effective in transforming the rainfall residuals to more Gaussian variables before merging, and to back-transform the merging results to the original distribution; it generates good-quality merged rainfall estimates, with low MRTE and bias. The advantage in selecting this transformation method is that it has an analytical back transformation, although in this work the back transformation is done using the mean of the back transformation of 99 quantiles in order to be consistent with the other studied methods (Sideris et al., 2014).

10.1002/2016WR020330
Therefore, the use of Box-Cox transformation with k50:25 is usually a good solution, when [mm/h] are used as rainfall units, but this transformation needs to be used carefully, due to its dependency on the rainfall numerical values. An optimization of the k parameter can be beneficial and the method proposed in this paper is simpler and faster than the Bayesian approach by Erdin et al. (2012). Nevertheless, the Normal Score Transformation is preferable to Box-Cox transformations, being independent on the rainfall values. This work illustrates how to overcome the limits of its applicability only to variables with continuous strictly increasing cumulative distributions.
Finally, as concerns the impact of the Gaussian approximation on the final rainfall estimate uncertainty, the quantitative evaluation shows that the use of a transformation can improve the KED estimates, if the transformation is correctly selected.