Tropospheric delays in ground‐based GNSS multipath reflectometry—Experimental evidence from coastal sites

Recent studies have demonstrated the utility of ground‐based Global Navigation Satellite Systems‐Multipath Reflectometry (GNSS‐MR) for sea level studies. Typical root‐mean‐square (RMS) differences of GNSS‐MR‐derived sea level time series with respect to nearby tide gauges are on the order of 6–40 cm, sufficiently accurate to estimate tidal and secular sea level variations but are possibly biased due to delay of the signal through the troposphere. In this study we investigate the tropospheric effect from more than 20 GNSS coastal sites located from several meters up to 280 m above sea level. We find a bias in the estimated heights that is elevation and height dependent and can reach orders of 1 m for a 90 m site. Without correcting for tropospheric delay we find that GNSS‐MR‐estimated tidal coefficients will be smaller than their true amplitudes by around 2% while phases seem unaffected. Correcting for the tropospheric delay also improves leveling results as a function of reflector height. Correcting for the tropospheric delay in GNSS‐MR for sea level studies is therefore highly recommended for all sites no matter the height of the antenna above the sea surface as it manifests as a scale error.


Introduction
The propagation delay of microwave signals from satellites and radio sources due to the neutral atmosphere (or troposphere) is one of the major error sources in the analysis of satellite and space geodetic systems [Bevis et al., 1992;Davis et al., 1985;Herring et al., 1990;Tralli and Lichten, 1990;Tralli et al., 1992;Treuhaft and Lanyi, 1987]. The tropospheric propagation delay is typically separated into two parts, the hydrostatic and wet delay, and modeled as a combination of a zenith delay for each part and a corresponding mapping function that gives the slant delay at a given elevation angle [Davis et al., 1985]. Most current Global Navigation Satellite Systems (GNSS) processing schemes estimate tropospheric zenith delays (or zenith delay residuals) as unknowns alongside position coordinates using a mapping function such as the Vienna Mapping Function [Boehm and Schuh, 2004;Boehm et al., 2006]. The coefficients of the mapping functions are generally derived from ray tracing through a numerical weather model such as the European Centre for Medium-Range Weather Forecasts (ECMWF). residuals in the height estimations over a 12 min interval was consistent with tropospheric turbulence. Neither paper, however, reported the actual delay experienced. In most subsequent papers on coastal sea level measurements using GNSS-MR, the tropospheric delay has mainly been ignored [Larson et al., 2013a[Larson et al., , 2013bLofgren et al., 2011;Löfgren et al., 2011Löfgren et al., , 2014. This is presumably because no obvious effect could be seen in the data, and the reflector heights (RH) were sufficiently small (on the order of 10 m) that the tropospheric effect was thought to be insignificant. More recently, Roussel et al. [2014] developed a simulator to accurately predict the specular point off the reflecting surface. They included a tropospheric effect by using an adaptive mapping function [Gegout et al., 2011] to calculate the change in incidence angle (with respect to the vacuum angle) due to bending of the radio waves and found that, for receiver heights greater than 5 m and elevation angles less than 10°, tropospheric error has a significant effect on the specular point position. They did not however calculate the effect on propagation delays or reflector heights, which was also left for future investigation in their follow-on paper [Roussel et al., 2015]. Santamaria-Gomez et al., [2015] noted an elevation dependence in estimated leveling heights between observations using satellite arcs with mean elevations below 12°compared to those above 12°. Both had estimated heights that were generally smaller than expected, but the higher elevation arcs were closer to the expected values. They attributed at least some of this systematic bias to tropospheric delay but speculated that some of the error could be due to sea surface roughness. They also found an error in the estimated heights that was proportional to the reflector height which varied due to tidal conditions at the sites. This scale error was on the order of 0.6-1.4 cm/m. In a follow-on paper using a test site at Spring Bay, Australia, Santamaría-Gómez and Watson [2016] found a consistent elevation-dependent error for angles below 12°. They attributed this to the bending effect of the incident angle and used local atmospheric pressure and temperature to calculate the change in elevation angle [Bennett, 1982], which is then applied to the vacuum elevation angles before calculating the estimated heights. The bias was generally reduced when this correction was applied. They also noted that they got similar results when calculating the bending correction using temperature and pressure data from ECMWF model. Tropospheric delays also featured in GNSS-R phase altimetry, in which direct and reflected signals are tracked separately and later compared in the complex domain. For example, Semmling et al. [2012] utilized a ray tracing tool to account for tropospheric refraction over a spherical Earth. Fabra et al. [2012] reused zenith delay estimates from GNSS positioning, combined with Niell's mapping function and an exponential vertical decay model (of a given tropospheric scale height). In this paper we use data from more than 20 GNSS coastal sites to evaluate the effect of tropospheric delay on estimated reflector heights and subsequent derived products such as amplitude and phase of tidal constituents. The sites were chosen for their range of antenna heights above the water surface, large azimuthal and elevation field of view, and varying tidal conditions. Section 2 is a description of the data set used in this study. In section 3 we introduce the tropospheric delay model we can apply to the data. The data processing is described in section 4. In section 5 we present the GNSS-derived results, analysis of their dependence on height and satellite elevation angle, and a comparison with the derived tropospheric delay model. We also examine the effects of tropospheric delay on derived tidal parameters. Finally, in section 6, we present our conclusions.

Data
For this study we chose 22 GNSS stations with heights above mean sea level (AMSL) ranging from 5 m to 276 m and tidal ranges or peak-to-peak amplitude from essentially zero to over 7 m. One site in Kentucky, KYDH, is approximately 90 m above Dale Hollow Lake, but for simplicity in the text we will refer to heights above water as AMSL. We used 1 Hz data at all sites and used both L1 and L2 signal-to-noise ratio (SNR) observations (legacy L2P and, where possible, modern L2C). Each site was established to measure primarily land deformation for various geophysical phenomena such as tectonics, glacial isostatic adjustment, and sea level studies using commercial off-the-shelf geodetic quality receivers and antennas. No modifications have been performed to the equipment for multipath reflectometry studies, and they are therefore simply sites of opportunity. Details about the sites are provided in Table 1, including their geographical position, height AMSL, number of days of data, equipment, approximate tidal range, and the satellite elevation range used. In order to study the effects of tropospheric delay on the GNSS-MR measurements, it would be preferable to know both the total water level during the observations from a nearby tide gauge and the height of the antenna above the tide gauge zero (TGZ, the reference point for tide gauge measurements). This is traditionally assessed through leveling to land-based benchmarks and from there to the GNSS antenna Journal of Geophysical Research: Solid Earth 10.1002/2016JB013612 reference point (ARP) [Woodworth et al., 2015]. In this way, and assuming there are no other sources of error, the difference between the estimated and known reflector height would be absolute. However, at some sites we do not have all of this information but the results are still useful as we can investigate the tropospheric delay effect in a relative sense, e.g., between different satellite elevation intervals [Santamaria-Gomez et al., 2015].
Most sites have over half a year of data except for NOMI, situated at around 276 m above sea level on the island of Santorini, Greece. This site, while still active, only has 20 days where 1 Hz data were recorded. At that height AMSL, such high sampling rate is essential [Santamaria-Gomez et al., 2015] rather than preferred as for most of the other sites, so as to fulfill Nyquist sampling criterion. Furthermore, if tropospheric delays are height dependent, then this site should be extremely useful to this study, because it is 3 times higher than the next highest site. NOMI also has some other advantageous aspects that help the study: it is situated on the edge of a cliff so that there are few obstructions, and the sea in this region of the Mediterranean is relatively calm with tides of only a few centimeters in amplitude. Sites NYAL, NYA1, and NYA2 are also interesting because they form a cluster with 250 m radius.

Geometric and Tropospheric Delay Models
The SNR varies as a function of the satellite geometry and local environment. Using simple models we can relate the two and use this observable for environmental studies. SNR may be modeled as (1) the sum of a trend SNR T and a sinusoid having amplitude SNR A and phase φ = k τ, where k = 2π λ À 1 is the wave number and τ = τ r À τ d is the interferometric (reflection minus direct) delay. We will model only the geometrical and tropospheric components: Starting with a static surface in a vacuum, Figure 1 (left) shows the geometric delay model for multipath [Elosegui et al., 1995] which gives the familiar equation where H G is the geometrical height of the antenna above the reflecting surface and ε is the satellite elevation angle. We assume that the surface is horizontal and neglect phase contributions from the antenna radiation pattern as well as from the surface composition and small-scale roughness caused by, for example, salinity and winds, respectively. Now turning to the interferometric tropospheric delay, it can be expressed as τ T = τ Tr À τ Td , in terms of the direct tropospheric delay, τ Td , and the reflection tropospheric delay, τ Tr . Each of the two terms, in their turn, can be defined as the difference, e.g., τ Td ≡ τ Ld À τ Gd , between the familiar vacuum distance or geometrical delay τ Gd and the optical length or radio delay τ Ld = ∫n(l)dl experienced by the electromagnetic wave propagating in the atmosphere, where n is the index of refraction at a given ray path distance l.
Accounting for angular refraction, the ray path is allowed to bend. In a vertically stratified atmosphere, the refracted or apparent elevation will always be larger than the vacuum or geometric elevation. Santamaría-Gómez and Watson [2016] assumed that the predominant tropospheric effect is this bending δε of the radio wave elevations [Bennett, 1982], particularly at low-elevation angles [Anderson, 2000]. Consequently, they input the bent elevation angle in the geometric delay formula equation (3), so that the tropospheric delay is the difference with respect to the vacuum specular delay: This heuristic is somewhat inconsistent, as it neglects linear refraction along the propagation path. They found that this approach removed a large part of the elevation-dependent bias, but some residual error was still evident which they attributed to either deficiencies in the bending estimate or other nontropospheric effects.
We take a different approach and account for linear refraction only. As we neglect angular refraction, we continue to refer to vacuum or geometric elevation angles as "elevation angles" for simplicity. We assume that the ray path, and respective delay, taken by the direct and reflected signals are equal from the satellite down to the height of the antenna. The interferometric tropospheric delay is then simply twice the delay difference between the delay at the antenna height and the delay at the reflector surface ( Figure 1, right), such that τ T = T 1 + T 2 = 2 T, where T is the excess path delay roughly coincident with the geometric path, S 2 .
There is already a substantial body of work devoted to the study of path delays in the neutral atmosphere for geodetic instruments. These include the development of mapping functions, derived from ray tracing models, that map zenith delays down to any elevation angle; see Nilsson et al. [2013] for a detailed discussion. We therefore exploit this expertise by using the VMF1 mapping functions [Boehm et al., 2006] together with the Global Temperature and Pressure (GPT2w) model [Böhm et al., 2014] to derive a combination of hydrostatic and wet tropospheric delay: where Δτ z = τ z (ÀH) À τ z (0) is the zenith delay difference across antenna and surface positions and m is the mapping function, separately for each hydrostatic and wet components. These are much faster to calculate than the computational burden of ray tracing and include the bending effect in the direct path [Boehm et al., 2006;Davis et al., 1985]. Throughout, we neglect differential angular refraction in the reflected path.
The direct slant tropospheric delay, shown in Figure 2 (top left), is nearly 25 m at 5°and around 5 m at 30°. The tropospheric delay difference, T, shown in Figure 2 (top right), is just under 3.5 cm at 5°and nearly 0.6 cm at 30°. However, it is not the delay itself that affects reflector height retrievals but its rate of change as a function  Table 1 for coordinates). (top right) Tropospheric delay difference between an antenna at 10 m elevation and at the reflecting surface, T in equation (3) , who showed that if the reflector surface is moving, then the estimated reflector height is biased. We can derive a similar equation for changing tropospheric delay. If we differentiate equation (2) with respect to 2 sin(ε) then we get an apparent reflector height H = H G + H T that includes the desired geometrical height H G plus a nuisance tropospheric height bias H T : which we evaluated numerically via finite differences. Figure 2 (bottom left) shows the instantaneous tropospheric height bias for three antenna heights at 5 m, 10 m, and 15 m. We include the term instantaneous because in actual measurements the height is calculated over a given elevation range so that the true height bias in the observations will be a local average. At 5°the bias is 15 cm, 30 cm, and 45 cm for the three heights, respectively. The bias is always negative-meaning that H underestimates H G -and is both elevation and height dependent.
In order to test the appropriateness of mapping functions for our purpose, we compared it to [Bennett, 1982] bending model as used by Santamaría-Gómez and Watson [2016]. There is a small offset between the two (around 3 cm at high-elevation angles), but the elevation-dependent trend is well represented in both ( Figure 2, bottom right). We also calculated the hydrostatic tropospheric delay using atmospheric ray tracing with the CIRAq climatology [Nievinski and Santos, 2010]. We tested both bent and straight line ray tracing models and found a maximum difference at low-elevation angles (5°) of less than 3% of the height bias (bias for straight line model is larger, which is consistent with Fermat's least time principle [Nievinski and Santos, 2010]). The difference was independent of antenna height and decreased to zero at higherelevation angles. Comparison of the VMF1/GPTw and ray traced hydrostatic delay showed a similar level of difference in the instantaneous tropospheric height bias, with the VMF1/GPT2w results approximately 3% larger at 5°, decreasing to just over À1.5% at high elevations, with the two models agreeing near 10°e levation. There appears to be a slight (less than 0.1%) dependence on reflector height. We found a large difference between the hydrostatic only and the total (hydrostatic plus wet) tropospheric height bias. At 90+ m height, we found that the total bias was around 30% larger than the hydrostatic only bias and varied from around 33% at 5°reducing to 27% at 90°. The results were only slightly smaller at 10 m elevation. These results were calculated for the site KYDH, which is not close to sea level, and may well vary at different sites, but it shows that both components are important when calculating the bias.

Analysis
We processed the SNR data in a manner similar to Larson et al. [2013aLarson et al. [ , 2013b, and Löfgren et al. [2014]. We used the broadcast ephemeris to compute the observed azimuth and elevation for all GPS satellite tracks. Each track was split into ascending and descending passes and the data screened using a simple elevation and azimuth mask to remove observations from directions where the sea surface is not expected to be the dominant reflection. The masks were further tweaked once all the data were processed. The SNR data were converted to a linear scale and the trend is removed by fitting a low-order polynomial to the whole satellite pass [Bilich et al., 2008]. Each satellite pass was then split up into subarcs of 1024 s (~17 min) and stepped forward by 64 s. Although this overlap introduces a relatively large correlation between retrievals, we considered this useful for extracting the elevation-dependent tropospheric delay pattern. If we had not overlapped our subarcs, then we would only have a few independent elevation bins to compare with the modeled results. We ran simulations to test correlation and found that for 10 subarcs the correlation was equivalent to having three independent points. When estimating tidal coefficients (section 5.3) the correlation increased the uncertainties by around 10%. At some sites, for example, Newlyn (NEWL), we took a slightly different approach and binned the data into several overlapping elevation intervals. This was to ensure that both approaches (binning by elevation and time) gave similar results. It also assisted for testing for a tropospheric bias in amplitude and phase of the main tidal constituents during the harmonic analysis.
Returning to equation (1), we note that, to first order and in the presence of a strong reflecting surface, the observed SNR has a periodic component that is a function of satellite elevation, wavelength, and the reflector height. Therefore, spectral analysis with sin(ε) as the independent variable will yield a dominant peak at a frequency related to the reflector height. Lomb-Scargle periodograms [Lomb, 1976;Scargle, 1982] of the Journal of Geophysical Research: Solid Earth 10.1002/2016JB013612 detrended SNR data were formed using oversampling factor calculated to produce a resolution of 1 mm in the height retrieval. The peak of the periodogram, within plausible upper and lower limits, indicates the dominant reflector height and was adopted as the retrieved or measured value, H M . The mean elevation for each segment was also recorded. The retrieval was flagged if the spectral power at the peak is below some threshold with respect to the background noise based on an autoregressive fit to the residuals. Where possible, further screening was done by comparing the L1 with the L2 retrievals.
We applied a height rate correction, introduced by Larson et al. [2013b]: where _ H ¼ ∂H=∂t and _ ε ¼ ∂ε=∂t are, respectively, the temporal rates of height and elevation. The elevation rate was calculated from ephemerides. The height rate was calculated directly from tidal measurements where available, i.e., _ H ¼ _ H TG . Where no nearby tide gauge existed, the height rate correction was calculated in conjunction with the tidal harmonic analysis (as detailed in section 5.3). This assumes that the height rate is purely a function of the tide and not due to, e.g., storm surges.
To focus only on the satellite elevation angle dependence, we removed the sea level variations, H SL , from GNSS-MR retrievals, H M , thus obtaining reflector height deviations: δ = H M + H SL (Note that H M is measured from the antenna down and H SL is AMSL hence the positive sign in the equation). Again, where a nearby tide gauge was available, we used it directly, H SL = H TG . Otherwise, sea level variations were approximated via harmonic synthesis of the tidal coefficients fit to the GNSS-MR retrievals themselves. It should be noted that at those sites without tide gauge data (Table 1), the deviations after removing the tidal harmonics will still include nontidal effects, whereas at sites where we did subtract the tide gauge measurements directly, all sea level variations are effectively removed. Therefore, at tide gauge sites the remaining variations are mainly of tropospheric origin, apart from very local differences between the tide gauge location and the footprint of the reflections. Conversely, at non-tide-gauge sites there is additional noise when looking at elevation dependence. Finally, notice that the deviations δ will not be zero mean. Rather, they will be close to, but biased from, the static antenna height AMSL, H 0 (measured, for instance, using conventional leveling [Woodworth et al., 2015]). Further removing this H 0 from deviations δ produces reflector height residuals, r = δ À H 0 .

Elevation Angle Dependence
For each site, we binned the L1 results by the mean elevation ε ð Þ of each subarc and calculated the mean reflector height deviation ( δ ). In general, the bin size was 0.1°except for BUR2, where the bin size was increased to 0.5°, and TN01 and DUDE, where a bin size of 1°was used. As an initial test, we calculated the difference in δ between two specific bins, 15°and 8.5°(with mean 11.75°and range 6.5°): Δδ ¼ δ 15°À δ 8:5°. These are plotted as a function of the antenna height AMSL (Table 1, column 4) in Figure 3 (top left). The 1 sigma error bars plotted are simply the standard deviations of the individual differences Δδ for each station scaled after fitting a straight line to the data. There is a clear trend in Δδ over site height AMSL, H 0 ; we calculate a ratio of Δδ=H 0 = 8.7 ± 0.3 mm/m. In Figure 3 (top right) we took this point farther and calculated the ratio Δδ=H 0 across all stations for each elevation pair possible and plotted the results as a function of the elevation range (difference in elevation between bin pairs, 6.5°for the initial example above) and mean elevation (average elevation for the bin pairs, (8.5 + 15.0)/2,11.75°for the above example). There is a clear pattern seen in these results, with Δδ=H 0 proportional to the elevation range and inversely proportional to mean elevation. Lastly, in Figure 3 (bottom) we show a time series of ĤM (not δ) for site KYDH, color coded to reflect ε. Apart from the water level variations (shown by the offset blue line), there is a clear variation in the estimated height as a function of elevation angle. Indeed, measurements near 17°are approximately 1 m greater than the measurements near 9°.
To further illustrate the elevation dependence of the results, we plot the binned reflector height deviations ε as a function of mean subarc elevation ε for eight sites (Figure 4). In the rest of the paper we refer to the mean elevation for a subarc as elevation angle, for simplicity. These sites illustrate the main tropospheric effect but also highlight other issues that are, as yet, unknown. Shown overlaid are the tropospheric height bias predicted from the two models (hydrostatic ray trace and VMF1/GPT2w) derived in section 3; the input Journal of Geophysical Research: Solid Earth 10.1002/2016JB013612 heights used in the models are best fit in a least squares sense to the observations. Again, there is clear elevation dependence of the estimated reflector heights and that dependence is larger for sites that are located at greater heights AMSL. NOMI shows a 6 m difference over the interval of elevations observed. KYDH, AC12, HONS, and ANDE all show variations that are close to predicted from the tropospheric delay models. NYA2 clearly shows dependence on elevation, but it is noticeably less than that predicted from the models. Both BRST and TGDE show more variable reflector heights. At BRST the elevation dependence is still evident but displays an oscillation around 15°and 20°. The calculated tropospheric elevation dependence at TGDE, which is only 5 m AMSL, is minute (~5 cm); therefore, other, as yet unknown, nontropospheric effects appear to dominate. It is possible that the same secondary effect is present in the other sites but is swamped by the tropospheric delay and would indicate that they do not scale with reflector height. For instance, both KYDH and NOMI appear to have a slight inflection point between 10°and 15°. It is hard to tell from the other sites because the elevation range is small; however, HONS does appear to dip at around 14°.

Scale Errors
A common systematic effect in conventional tide gauge measurements, found when comparing collocated instruments of different technology (e.g., pressure and acoustic), is scale error [Martín Míguez et al., 2008,  [2015] found that the error in height retrieval was proportional to height itself, at least at their sites with the largest tidal range. They reported scale errors on the order of 6 to 14 mm/m and found that similar errors were found when using only higher satellite elevation angles.
In Figure 5 we plot the residual reflector height, r (after further removing any remaining average, r), for both L1 and L2 as a function of height AMSL for 18 sites where we have a nearby tide gauge. To eliminate effects due to changes in elevation angle, we used only the measured H M from the bottom 10% of elevation angles as it will maximize the effect. At low tide the reflector height is largest, and therefore, the tropospheric height bias is also largest, causing the residual reflector height, r, to be smaller than its average, r. Conversely, at high tide r will still be shorter than the static height, H 0 but will be higher than the average residual reflector height, r . The estimated scale errors together with the predicted scale errors from the VMF/GPT2 model are plotted in Figure 5 (bottom). We get grand medians of 13 and 15 mm/m for L1 and L2, respectively, similar to that found in Santamaria-Gomez et al. [2015]. The scale errors appear to have no obvious correlation to Some anomalies stand out, for instance, BRMU, KYDH, and SC02 (L2). To investigate the issue, we note that the current tide gauge near BRMU is situated at Esso Pier on the North West Coast of St. George's Island, whereas BRMU is at the Biological Station on the South East Coast of St. George's Island. Although they are only around 600 m apart, the tides may be different. We can use an earlier tide gauge operated at the Biological Station until 1994 which overlaps for~5 years with the newer gauge. Using 1 year of data from both gauges, we found a scale difference of 35 mm/m which would partly explain some of the anomalous behavior. The negative rates at KYDH may be related to a couple of issues. First, the measurements are on a lake with no tidal variations, the variations in height are predominantly annual in nature as the lake fills and drains, and other effects could also have an annual signal. Second, the lake level measurements are made at the dam nearly 18 km away, so some scale error could manifest from this. We do not know why the L2 scale error at SC02 is so negative.

Tidal Harmonic Analysis
Given that a tropospheric scale error was found, this would also have implications on the estimation of tidal parameters from GNSS-MR measurements. If we assume at a particular site we use a fixed elevation range to estimate the reflector height, then to first order we note that the tropospheric height bias H T is a linear function of the reflector height, i.e., At BRST, for example, α was found to be À0.0121 m/m for a fixed elevation range of 5°to 15°. Note that α is negative here as we imply that δ is a correction to apply to the known H (for instance, if you know the sea level and the height of the GPS antenna above the datum of the sea level measurements). For an estimated reflector height we need to look at a simplified equation for the system where H 0 is the height of the antenna phase center (APC) above some nominal datum (such as mean sea level), H M is the measured reflector height, H TG is the sea level above the same nominal datum, H _ H : is the height rate correction due to the time variation of sea level over the measurement period [Larson et al., 2013b]. Rearranging gives us or If we ignore the H _ H : component for now, then we see that And, therefore, a corrected H M is GNSS-MR amplitude by the tide gauge-derived amplitude, see Figure 6 (top). We then recalculated this ratio after we correct for tropospheric bias prior to the tidal analysis, see Figure 6 (bottom).
We find that all large tidal harmonics estimated from GNSS-MR are, as predicted, smaller than that estimated from the nearby tide gauge. When we correct for the tropospheric delay, the majority of the harmonics shift closer to a ratio of 1. If the tropospheric delay was the only error, then the corrected estimates of the amplitude would then line up on the 100% line and any additional unbiased noise would produce amplitudes that scatter around the 100% line. There is still perhaps a small negative bias since only five amplitudes are larger than the tide gauge-derived amplitude. Some of this can be attributed to the uneven distribution of the GNSS-MR measurements. If the same plot is derived using the tide gauge measurements, but interpolated to the GNSS-MR epochs, then we see a more even scatter of the amplitudes around the "true" value. Note that the year to year scatter in the tidal amplitudes is around 1% and due to the tidal cusps which are either real, for instance, due to internal tides or due to nonlinearity in the tide gauge, for example, clock errors [Munk and Cartwright, 1966]. Although real variations are not necessarily an issue here, since the amplitudes are estimated over the same period, errors at the tide gauge may be a factor. The error bars reported are purely statistical, based on the scatter in the GNSS-MR results, and do not account for other effects and so are likely underestimated. Note, however, that the remaining scatter is no greater than the 1% variations in the yearly estimates. In Figure 7 we plot the phase difference between the two estimates. There is very little difference in the plots where we did or did not account for tropospheric delay indicating that the effect is indeed a scale error.
At site NEWL where we used specific elevation ranges we also performed a tidal analysis on each separate range. The predominant tide at Newlyn is the M 2 tide with amplitude of~1.7 m. Figure 8 shows the estimated M 2 tide for each satellite elevation band. We see a steady increase in the amplitude as the elevation angle increases. This happens for both L1 and L2 but L2 does seem to decrease again at higher elevation angles which must indicate some other effect. Also shown are the predicted M 2 amplitudes derived from the VMF/GPT2 delays. The small variations are due to uneven time sampling in the different elevation bands.

KYDH Site-Intraannual Variations
Site KYDH is rather unique compared to the other sites in this study, being over 90 m above Dale Hollow Lake in Kentucky. The antenna is attached to the chimney of a resort hotel overlooking the lake and in the Journal of Geophysical Research: Solid Earth 10.1002/2016JB013612 direction of the lake; there is very little multipath clutter apart from the lake itself in the far field and the local roof and building in the near field. In addition, the receiver at this site has been recording L2C from the more recent Block IIR-M satellites which produce more robust signals [Larson et al., 2010]. For these reasons, the results from this site appear to be very clean and allow more detailed investigation. For the L1 and L2C results we removed the lake level variations and split the deviations into their individual satellite and elevation bands. Then for each time series we fit an intercept and an annual signal using least squares. We also did the same thing for the tropospheric height biases predicted from the VMF/GPT2 model, which depends on day of year. We assumed a nominal antenna to mean lake level height of 93.27 m which was estimated to fit the whole data set in a least squares sense; a marginally different height would not affect the prediction significantly. Figure 9 shows an example of the fits for L1 of satellite PRN18 (a similar figure for L2C of PRN 12 is given in the Supporting Information S1). A clear seasonal signal is seen in both the estimated and modeled residuals especially at low-elevation angles. The reduction in annual amplitude with respect to elevation angle indicates that this is not due to an actual annual signal in the lake level, for instance, as that would not be elevation angle dependent and has been removed prior to these calculations.
In Figure 10, the estimated parameters for L1 are plotted for all  satellites and elevation bands (again a similar figure for L2C is given in the Supporting Information S1). We see that the estimated height above the lake agrees well between satellites and with the VMF/GPT2 model. Good agreement is also found in the estimated day of the year for the trough of the annual signal (annual phase). The estimated annual amplitude agrees approximately with the modeled estimates, but the variations are quite large, particularly for PRNs 1 and 13. Figure 10 (bottom right) shows the RMS of the residuals after removing the least squares fit as a function of the elevation angle. For all satellites the RMS reduces as the elevation angle increases, and for the highest elevation band it reaches a low of just over 40 mm. More typical RMS values are around the 100 mm range [Löfgren et al., 2014;Santamaria-Gomez et al., 2015;Santamaría-Gómez and Watson, 2016].
The large annual amplitude for low-elevation angles may explain why the scale errors estimated for KYDH in section 5.2 appeared to be anomalous. The scale errors calculated from the VMF/GPT2 model for that section were only calculated for a specific day of the year, but if there is sufficient annual variation, it would cloud the estimation of the scale errors in both the model and the data. The reduction in annual amplitude as a function of elevation angle may also explain the reduction in RMS as the angle increases. The VMF/GPT2 model only allows for annual and semiannual variations; any tropospheric delay due to other, higher-frequency, variations will presumably also decrease in magnitude leading to a cleaner signal at higher-elevation angles. This is however offset with the higher likelihood of other multipath obstructions in the near field (higher elevation angles) and the reduction in the visibility of signals in the SNR data at higher-elevation angles.

Absolute Leveling Results
Tropospheric delay cannot explain any heights larger than expected. We have 11 sites where there are both a nearby tide gauge and in situ leveling measurements in order to examine absolute height estimates. The difference between the leveling and the mean GNSS-MR estimates after true sea level variations are removed are shown in Figure 11. The results were calculated for both L1 and L2 (mixture of L2 and L2C) and with and without a tropospheric correction. The weighted average height difference is À7.4 ± 4.3 cm and À13.0 ± 4.9 cm for L1 and L2, respectively, when no tropospheric delay is removed and 16.6 ± 2.2 cm and 6.1 ± 2.6 cm when the delay is corrected for. So after removing the tropospheric delay, we now have on average heights that are larger than expected. This could imply that the model overestimates the delay, but we can see from Figure 11 that before removing the tropospheric delay there is obvious trend in the height difference as a function of reflector height (sites are ordered with respect to increasing reflector height) whereas afterward, there is no obvious trend. The L1 bias appears to be larger than the L2 bias which agrees with the findings of Santamaria-Gomez et al. [2015]. Since the tropospheric delay would be equal for both frequencies (troposphere is nondispersive), we attribute this bias most likely to antenna phase center issues (but not the phase center offset as this has been accounted for in the data where we add the  [2016] found at Spring Bay, Australia, that the 12 cm leveling bias in L1 from their previous paper reduced to 6 mm when they optimized the oscillations in the SNR data by converting to a linear scale and using a bandwidth of 7 Hz. We used SNR values converted to a linear scale but implicitly assumed a receiver bandwidth of 1 Hz, so this could partly explain some of the leveling difference.

Discussions and Conclusions
We are not the first to use or mention tropospheric delay in reflectometry studies. Santamaria-Gomez et al.
[2015] mentioned it as a possible source of error but largely dismissed it. We have shown here, however, that some of the other observed effects they saw were likely the result of tropospheric delay-namely, the scale errors as a function of sea level height. Santamaría-Gómez and Watson [2016] successfully applied a tropospheric correction which improved the results at low-elevation angle, but they only used a correction based on the bending angle, whereas Treuhaft et al. [2001] estimated altimetric height along with zenith tropospheric delay using a mapping function. Anderson [2000] used ray tracing results as a black box, with no attempt at obtaining an analytical closed-form approximation. Roussel et al. [2014] considered the effect of angular refraction in displacing the specular point, neglecting the effect on ranging delay, including linear refraction. We are, however, the first to look at the tropospheric delay effect in detail.
It is evident from the results above that tropospheric delay is a significant error in GNSS-MR measurements. The zeroth-order effect is a decrease in measured height as a function of height above the reflector surface and the satellite elevation angle. However, for sites close to the reflector surface and using only higher- elevation angles, the conclusion that the effect can be ignored is incorrect. The tropospheric delay is such that measured heights are always smaller than the true geometric heights. If the reflecting surface is changing in height, then the tropospheric delay will vary (approximately linearly) as a function of the height change and will therefore induce a scale error in the measurements. This is largely independent of the height of the antenna above the reflecting surface. The scale error is roughly similar if you have a high antenna with low tidal range or a low antenna with a high tidal range. Without correcting for tropospheric delay we predict that estimated tidal coefficients will be smaller than the true amplitudes by around 2% (2 cm/m). Also, before removing the tropospheric delay, there is an obvious trend in the leveling results as a function of reflector height whereas afterward, there is no obvious trend.
We used both atmospheric ray tracing of a simple climatology and the VMF/GPT2 models to calculate the tropospheric delay, both of which calculate the total delay, and find that the hydrostatic delays agree to within a few percent of each other. The VMF/GPT2-derived delays are computationally fast and so offer a convenient way to correct GNSS-MR estimates for tropospheric delay; not only in sea level studies but also in snow depth and soil moisture estimates. We also found an excellent agreement between the VMF/GPT2-derived delay and the measurements for the annual component in the delay at site KYDH. If you always measure at a site with the same satellite elevation range, then it is relatively straightforward to calculate a scale factor, α, with which to calculate the correction. Since for a zero reflector height the delay would be zero, this scale factor allows you to calculate the absolute delay for a given reflector height and the change in delay as the reflecting surface changes height. For more complicated measuring schemes then the predicted delay can be computed fairly easily for each individual measurement using the VMF/GPT2 model.
These elevation angle, and antenna-dependent biases require more study in order to improve GNSS-MR measurements for sea level particularly if the results are to be used for leveling at tide gauges, studies of mean dynamic topography, or height systems unification [Woodworth et al., 2012[Woodworth et al., , 2015. In the future subdaily GNSS-derived estimates of ZHD and ZWD together with standard mapping functions could be used to estimate the tropospheric delay or there is the possibility that the GNSS-MR results themselves can be used to derive the delay particularly since established mapping functions are not tuned for coastal atmospheric variability. Also, an assessment of the angular refraction and linear refraction approaches by comparison against a superior method deserves further attention in the future.