Bayesian Inversion of Wrapped Satellite Interferometric Phase to Estimate Fault and Volcano Surface Ground Deformation Models

Bayesian inference and an improved downsampling method is used to determine earthquake and volcano source parameters using a popular geodetic observation method, satellite radar interferometry. The main novelty of the proposed approach is that the interferometric wrapped phase can be directly inverted, circumventing the ill‐posed phase unwrapping processing step. Phase unwrapping errors severely affect the estimation of earthquake and volcano source parameters using interferometric observations. Therefore, it is desirable to avoid phase unwrapping completely. To overcome the need for phase unwrapping, we propose a downsampling algorithm and a method to estimate the covariance function of the wrapped phase and establish an appropriate misfit function between the observed and simulated wrapped phase. Uncertainties in source parameters are assessed with a Bayesian approach, and finally, the robustness of the inversion methodology is tested in multiple simulations including variable decorrelation and atmospheric noise simulations. The method is shown to be robust in challenging noise scenarios. It features an improvement in performance with the Bayesian approach, compared to similar previous methods, avoiding any influence of seed starting models and escaping local minima. The impact of a small percentage of incorrectly unwrapped phase observations in current state‐of‐the‐art methods is shown to strongly affect the estimation process. We conclude that in the cases where phase unwrapping is difficult or even impossible, the proposed inversion methodology with wrapped phase will provide an alternative approach to assess earthquake and volcano source model parameters.


Introduction
Phase unwrapping is the process of recovering the absolute phase from unambiguous wrapped phase values, measured in modulo 2 radians. The aim of phase unwrapping is to reconstruct the absolute value, because it is proportional to the difference in path length for a time-separated pair of SAR images. It provides a quantitative observation, which can be used to interpret ground deformation due to earthquake or volcanic processes, among others.
From a mathematical point of view, phase unwrapping is an inverse problem; however, it is ill posed and notoriously difficult to solve in the presence of noise. The difference between the absolute phase and wrapped phase is an integer number of 2 radians, and the solution is nonunique unless further constraints are given. Although several assumptions have been made in the development of phase-unwrapping algorithms, a challenge still remains to overcome two common problems: noise and discontinuities in the absolute phase map (Huang et al., 2017;Werner et al., 2002). One constraining condition in such algorithms is a phase continuity assumption adopted by path-following algorithms, such as branch cut (Goldstein et al., 1988), quality guided (Xu & Cumming, 1996), or the minimum discontinuity approach (Flynn, 1997). Under this assumption, the absolute phase difference between any two neighboring pixels is assumed to be less than . Various methodologies have been proposed to construct the path connecting pixels, to enable the absolute phases of all pixels to be determined through spatial or temporal integration. However, the phase continuity assumption is not satisfied everywhere, and it would be violated in the presence of high noise level or discontinuous terrain or surface displacements. High noise level could cause unwrapping errors. Indeed, an error at a single point will propagate into the rest of the unwrapped phase signal along the integration path. For areas where true discontinuities occur, the absolute difference between two adjacent pixels is more likely to be over , and the integration paths between those two pixels would not be allowed, so several disconnected regions would thus occur. Another popular constraining condition is to find the global minimum of an energy function based on the L p norm of the difference between absolute phase differences and wrapped phase differences. When p = 2, the L p norm-based phase unwrapping method is transformed to the least-squares phase unwrapping method, and when p = 1, it is changed to the minimum cost flow phase unwrapping method (Chen & Zebker, 2001;Costantini, 1998). The dilemma for solving the global minimization is that noisy pixels will distort the solution, thus affecting the noise-free areas. Usually, a phase filtering step is applied before unwrapping; however, phase information contained in noisy pixel areas can be lost.
Unwrapping error decreases the accuracy of geophysical models constrained using InSAR observations and their subsequent analysis (Huang et al., 2017;Whipple et al., 2016). One remedial measure is to correct the error after the unwrapping procedure. Some InSAR practitioners have attempted to manually correct unwrapping errors by adding integer multiples of 2 to badly unwrapped regions of pixels (Biggs et al., 2007), but it is a time-consuming process. Others designed an iterative unwrapping technique to identify and mitigate unwrapping errors, and this could reduce the number of pixels with unwrapping error to a lower level (Hussain et al., 2016). Ultimately, however, unwrapping errors are difficult to mitigate and in some circumstance cannot be unambiguously detected.
A potential solution to avoid the unwrapping error issue completely is to carry out a geophysical inversion directly on the wrapped phase observations. A first pioneering approach to this was proposed a decade ago by creating an appropriate cost function based on the residual between observed and modeled wrapped phase (Feigl & Thurber, 2009). In their original work the authors directly used the observed wrapped phase, and this was further refined in Ali and Feigl (2012), who used the downsampled phase gradient in range change (wrapped phase). In order to reduce the computation complexity, singular value decomposition technique can be used during the nonlinear inversion (Fornaro et al., 2012). However, neither of the existing approaches considered estimating a variance-covariance matrix in their proposed cost function. Ignoring the observation correlations leads to inaccurate uncertainty bounds on the geophysical model parameters (Lohman & Simons, 2005). A second approach was to create a global network connecting the observed pixels and calculate the difference in the observed wrapped phase between connected pixels (Galetto et al., 2020;Hooper, 2010;Wang et al., 2014). The cost function to minimize is the relationship between observed and modeled differential wrapped phase among connected pixels. The potential risk might be a huge matrix describing the relationship between connected pixels, which could cause numerical computation problems.
In this paper, we adopt the ideas of the first approach to minimizing the residual between observed and modeled wrapped phase and also incorporating the wrapped phase gradients (Ali & Feigl, 2012;Feigl & Thurber, 2009). We propose an improved downsampling scheme for the wrapped phase and solve a variance-covariance matrix to evaluate the observations. Furthermore, the inverse problem is solved using state-of-the-art techniques for geodetic source model inversions, accounting for full model uncertainties through a Bayesian approach (Anderson & Segall, 2013;Gombert et al., 2018;González et al., 2015;Ragon et al., 2018). Rather than providing only one set of optimal parameters (maximum likelihood estimations), Bayesian approaches describe the optimum model as probability density function (PDF). They thus assess Journal of Geophysical Research: Solid Earth 10.1029/2019JB018313 the model's uncertainties and have gained popularity recently as implemented in packages such as AlTar (Minson et al., 2013), BEAT (Vasyura-Bathke et al., 2017), GBIS (Bagnardi & Hooper, 2018), and slipBERI (Amey et al., 2018).
The workflow of the manuscript is presented in Figure 1. We begin by providing the methodology to construct the likelihood function, including the new algorithm to downsample the wrapped phase and a spectral method to estimate data correlation in the wrapped phase observations. A likelihood function suitable for Bayesian techniques is then derived. This is followed by a description of the sampling technique using a Markov chain Monte Carlo (MCMC) method, which produces source models based on the classical Metropolis-Hasting (MH) rule. Finally, the proposed methodology is tested on synthetic cases with variable noise and one real earthquake case. We show that the performance of our proposed methodology is complementary and, under certain conditions, superior to the method using the unwrapped phase.

Methodology: WGBIS
Solving a geophysical inverse problem must start with a clear understanding of the observed data, d, and its relationship, G(m), with the model parameters m. Such a relationship should also account for the existing of error sources, .
Bearing this framework in mind, we describe in this section how to treat the observed data, d, referring to the downsampled wrapped phase and weighting scheme. Then, we explain how to estimate the model parameters using a Bayesian approach built upon a state-of-the-art method, GBIS (Bagnardi & Hooper, 2018), hence the name WGBIS.

Data Downsampling and Weighting
Many downsampling algorithms have been developed for spatial downsampling and averaging of interferograms. The number of data points, as well as the noise, is reduced while as much information as possible is retained (Jonsson et al., 2002;Lohman & Simons, 2005;Simons et al., 2002). Most although not all downsampling algorithms recursively divide the data into quads, until the variance of the data within each smaller quad is below a preset threshold. Feigl and Sobol (2013) extended the quadtree downsampling approach to the wrapped phase. However, their approach estimates the gradient of range change by averaging the difference between rows or columns, which could be sensitive to the noise on the edge of the patch. We propose a modification that takes advantage of the redundancy of data within each patch, by using robust fitting of a bilinear ramp. Our experiments show that an iterative fitting method is more robust against noise than a least-squares algorithm. We also remove such trends to obtain an unbiased estimate of the variance of the wrapped phase data.

Data Downsampling
Our proposed methodology to downsample the wrapped phase, , is to subdivide the phase map recursively into patches for which we estimate (1) a circular mean phase value, , (2) an estimate of the discrete spatial derivative of range change with respect to the east coordinate (or X-direction), X , and (3) an estimate of the discrete spatial derivative of range change with respect to the north coordinate (or Y -direction), Y .
It is assumed that a given patch of size n pixels contains wrapped phase values (i, j), where i and j are the indexes in the X-direction and Y -direction. Note that the following equations are valid for circular statistics (Mardia & Jupp, 2000). For each patch and at each iteration, the procedure is as follows: (1) To estimate the circular mean phase of the patch, , (2) The range change gradients X and Y are calculated in X-direction and Y -direction. The phase in space domain is projected into the spectral domain with a one-dimensional Fourier transform in X-direction and Y -direction, separately. We take advantage of the observation that the power of specific wavenumbers corresponds to the number of fringes in the space domain. Using this approach, it can be empirically determined if a given patch has less than one fringe. Hence, there should be less than two dominant frequencies with high power in both X-direction and Y -direction, and a robust linear regression algorithm, "robustfit", is directly used for the wrapped phase to estimate the gradients. A more rigorous approach is possible, but this straightforward method performs well. We define a frequency to be of high power, P X (f X ) or P Y (f Y ), if the power spectrum is higher than three standard deviations from the mean power spectrum.
where (P X ) and (P Y ) are the mean power spectra and (P X ) and (P Y ) are their standard deviations.
For the patch having no more than one frequency with high power in both X-direction and Y -direction, a bilinear function is applied to fit the phase and estimate the gradient.
where ΔX and ΔY are the coordinates with respect to the center of the patch.
Second, we remove the ramp from the phase and calculate the circular mean deviation, 2 , of the deramped phase (Nikolaidis & Pitas, 1998).
Different criteria were studied for downsampling circular mean phase and gradient of range change. We concluded that for any given quad, the following conditions must be checked; if any of them is met, the quad is subdivided, and then, Steps (1) to (3) are again calculated in the child quads: (a) if there is more than one wavenumber with high power, as defined in Step (2) along X-dierction or Y -direction; (b) if the circular mean deviation of phase is higher than a predefined maximum threshold for circular mean deviation, 1 > 1,crit ; and (c) if the circular mean deviation of the deramped phase exceeds a second predefined maximum threshold of circular mean deviation, 2 > 2,crit . Ultimately, the iteration is forced to stop if the number of pixels within the current patch is less than a given minimum pixel threshold, n 2 ≤ n 2 thresh . The initial inspiring subdividing methodology for wrapped phase, PHA2QLS.C (Feigl & Sobol, 2013), estimates the phase gradient by averaging the gradient between consecutive pixels in rows (or columns) within each patch. We noted that this approach can be sensitive to phase noise on the edge of the patch. Therefore, the performance in the estimation of phase gradients was compared between (a) the PHA2QLS algorithm, (b) a simple least-squares bilinear ramp method, and (c) a bilinear ramp fitted using a robust iterative least-squares method. Our experiments show that the robust fitting method provides a significant improvement against noise in the estimation of the gradients, with only a modest increase in computation time.

Data Weighting
Accounting for all noise sources in satellite radar interferograms is complex (González & Fernández, 2011) but essential to accurately estimate geophysical model parameters (Lohman & Simons, 2005). Here, we focus on the characterization of the spatial pattern of noise. This involves estimating the data variance-covariance matrix, C, to enable calculating a data weighting matrix, W. The input data are the observed wrapped phase. For data preparation, the observation wrapped phase is reduced to downsampled mean phase and phase gradient with the quadtree algorithm described in section 2.1.1. The variance-covariance matrix for data weighting is calculated for circular mean phase and phase gradient, respectively, in section 2.1.2. For Bayesian inversion in the green box, the likelihood function is calculated from the downsampled wrapped phase and variance-covariance matrix, which is explained in section 2.2. According to the Monte Carlo Markov chain algorithm, the new inversion model m i+1 is randomly generated, based on its previous model, m i . The Metropolis-Hasting rule is applied to decide whether to accept or further update the new model on the basis of the likelihood function. The posterior probability density function of the model is the collection of accepted models after the Bayesian inversion.
We start with the gradient of range change, , this being relatively easier as it is a continuous differentiable field (Sandwell & Price, 1998). Therefore, we are able to calculate the corresponding covariance function directly using popular geostatistical methods. Adopting the semivariogram, first, an empirical semivariogram is first estimated and then a theoretical semivariogram model fitted to the estimated empirical observations. The general idea is to exploit the correlation of values of pairs of pixels. A closer pair is more similar than those farther apart, so the spatial decay of the correlation provides useful information on the noise characteristics. We search for pixel pairs with a given distance Δh i and calculate the variance (Δh i ).
where M is the number of pixel pairs with the distance Δh i . The series of Δh i and (Δh i ) are used to estimate the semivariogram parameters.̂( where n is the nugget variance, s is the sill variance, and h is the range. Three semivariogram parameters are then used to construct its covariance function, and the covariance value between two pixels with distance h is (h).
For circular mean phase, the circular mean deviation of the deramped phase, 2 , described the noise level within each patch, so we make use of the square of the circular mean deviation, 2 2 , as the diagonal term of the variance-covariance matrix. However, the circular mean phase is a spatially discontinuous function, and we are not able to evaluate the characteristics of the spatially correlated noise directly. Therefore, we ignore the correlation of circular mean phase between patches.

Bayesian Parameters Estimation
In this section, we present the methodology in detail, as shown in Figure 2.

Bayesian Framework
In the Bayesian framework, the unknown model parameters are described by PDF, which is updated after adding the new information source. Prior information on model parameters is named a prior PDF. After adding more information provided by likelihood function, p(d|m), the Bayesian approach updates the prior PDF, p(m), to posterior PDF, p(m|d), as described in equation (11).
where p(d|m) is the likelihood function based on residuals between the observed data d and the model predictions G(m) and p(d) is a constant number independent of m. In this methodology, the prior PDF, p(m), is assumed to be a uniform distribution with the lower and upper bounds based on our previous knowledge, and it means that the probability value is the same for all the trial models. This assumption is reasonable, especially for the blind fault we had never previously studied. Thus, we will later focus on the likelihood function, since the posterior PDF, p(m|d), is proportional to the likelihood function, p(d|m), and they share a similar PDF.
For consistency in the statistical treatment of the unbounded phase gradients, we approximate the phase as a wrapped normal distribution, instead of the von Mises distribution used in Feigl and Thurber (2009). The wrapped normal distribution and the von Mises distributions are important in the analysis of circular data, and they are similar in some ways (Kent, 1978): The von Mises distribution converges toward the wrapped normal distribution when its concentration parameter ( ) tends toward infinity (Pepe, 2019). The convolution of two wrapped normal variables is also wrapped normal (Jammalamadaka & SenGupta, 2001), but the multivariate von Mises distribution requires a rather complex estimation procedure (Mardia & Voss, 2014). The maximum likelihood parameter estimation in multivariate von Mises is also still an open problem (Nodehi et al., 2018). Assuming that the error in equation (1) follows the multivariate wrapped normal distribution with zero mean and variance-covariance matrix C, the likelihood function p(d|m) is as expressed in equation (13).
where r is the residual between the model predictions G(m) and the observed data d, N is the number of data points, and i is an integer. Using the wrapped normal distribution is ambiguous to the integer k. However, in our specific case, the spatial distribution of the InSAR phase values permits discarding ambiguous models during the Bayesian estimation. In addition, the observed phase gradient is also incorporated in the inversion, and phase gradient is very sensitive to the integer k: An incorrect estimate of k would severely affect the observed phase gradient and dramatically increase the residual.
After phase downsampling, we will have three downsampled data sets (circular mean phase and phase gradients), and we combine three observation sets in the inversion as equation (14) shows.
For the variance-covariance matrix C in equation (14), if neglecting the correlation between observations sets, then C is a diagonal variance-covariance, and thus, C 1 , C 2 , and C 3 are each diagonal. If one instead

10.1029/2019JB018313
considers the spatial correlation between patches but neglects the correlation within an individual patch, then C becomes a block-diagonal matrix with C 1 diagonal but C 2 and C 3 full.
For circular mean phase, the residual phase r 1 is the wrapped difference of the observed and modeled circular mean phase, r 1 = wrap(G 1 (m), d 1 ). Note that the wrap(.) operator is not a linear operator because The residual phase of circular mean phase can be calculated in MATLAB as equation (15) (Feigl & Thurber, 2009).
The modeled circular mean phase G 1 (m) is the absolute phase in line-of-sight direction.
where u is the modeled three-component displacement vector. Generally, s is the unit vector for projecting the observation vector from Cartesian coordinates to line-of-sight direction with incidence angle , and angle ′ = + is the combined effect of heading, , and, , squint angles (González et al., 2015), and c is the constant value for converting units from meters to radians with radar microwave wavelength .
For the gradient of range change, the residuals r 2 and r 3 are calculated in equation (17).
Note that the unit is radians/meter for G 2 (m), G 3 (m), d 2 , d 3 . The modeled gradient of range change in X-direction, G 2 (m), and in Y -direction, G 3 (m), is the projection of the east and north components of phase gradient into the line-of-sight direction.
where u x is the displacement gradient in X-direction and u is the displacement gradient in Y -direction. Assuming K-independent interferograms, the overall likelihood function is written as the product of the likelihood functions for each data set.

Sampling Approach and Acceptance Rule
Theoretically, Bayesian sampling will produce models from anywhere in the model space and characterize the posterior PDF according to the probability density. As a result, the sets of models consistent with the prior PDF and the data are sampled more often than those with lower probability (Anderson & Segall, 2013;Minson et al., 2013). Following the Bayesian sampling algorithm provided by GBIS (Bagnardi & Hooper, 2018), we adopt the MCMC algorithm as the probability sampling approach based on the MH rule.
The advantage of MCMC is that it can sample the vector space of the model parameters without specifying their PDF. In the MCMC sampling algorithm, a random sample moves forward to the next random sample, without dependencies on any samples before the previous one.
where a, ranging from −1 to 1, is a random number from a uniform distribution and Δm is the model step defined in the input file.
The MH rule is thought to be the most important technique for controlling random walk and is able to converge the sampling toward the target distribution (Anderson & Segall, 2013). In MH, the acceptance probability, b, is a random number following the uniform distribution ranging from 0 to 1. If the ratio, = p(d|m i+1 )∕p(d|m i ), is higher than b, the new sample m i+1 will be accepted and updated to the new model, and vice versa.

Results: Validation Experiments
In section 2, we presented our methodology for constructing a likelihood function between observed and estimated wrapped phase and the corresponding variance-covariance matrix and also the algorithm to generate the samples for the posterior PDF of the estimated model parameters. In this section 3, we validated the performance of our proposed methodology by applying it to two main synthetic cases. These exemplify the significant concerns about the most popular volcanic and earthquake ground deformation models.

Choice of Likelihood Function
In the likelihood function, the key elements affecting the inversion include the observed data and the weighting matrix. The observed data in conventional InSAR inversion are the unwrapped phase, whereas for our approach, it is wrapped. The weighting matrix, which is the inverse of the variance-covariance matrix, describes the noise contained in the observed data. If the weighting matrix is diagonal, it infers that the observation is independent and the spatial correlation among observation is ignored. Hence, the aims of this section are (1) to compare the effect of using observed data by choosing either wrapped or unwrapped phase and (2) to explore the role of variance-covariance by choosing whether to use diagonal or block variance-covariance.

Experiments Designed With Various Noise Levels
Three groups of experiments were designed: Groups A, B, and C, as shown in Table 1. The difference between Groups A and B is whether spatial correlation is considered in the variance-covariance matrix, and the difference between Groups B and C is whether wrapped or unwrapped phase is used.
In the experiments, patterns of differential interferometric phase change, Δ , were simulated as the addition of three main components as shown in equation (21), where disp is the surface displacement phase, atm is the atmosphere noise phase, and rand is the random noise phase (e.g., decorrelation and thermal noise).
Wrapped phase, , was calculated by wrapping the differential interferometric phase, , onto the interval The displacement phase ( disp ) was simulated by using a volcano source (Mogi, 1958) with a spherical shape. The synthetic parameters are listed in Table 2. Figure 3a shows the simulated displacement phase. Atmosphere noise phase ( atm ) has two physical origins: vertical stratification and turbulent mixing. The vertical stratification is correlated with topography and is temporally more stable. However, the turbulent mixing is governed by strongly nonlinear processes, and it is commonly regarded as the most important source of Y center (m) −2,000 −3,000 −10,000 10,000 Depth (m) 9,000 13,500 100 10,000 Volume change (m 3 ) −3× 10 7 −4.5 × 10 7 −10 9 −10 2 uncertainty in interferograms. Therefore, we generated realistic atmospheric noise phase based on the turbulent mixing in the lower troposphere. Based on the 100 × 100-km interferograms, Hanssen (2001) studied the tropospheric signal and proposed three scaling regimes between wavenumbers and power spectrum. One regime, where the power spectrum is proportional to the −3/8 power of the wavenumbers, was the most often identified in previous research (Ferretti et al., 1999;Goldstein, 1995). For simplicity, only the regime with a power exponent −8/3 and a scale between 0.25 and 1.5 km was applied in the simulation. Furthermore, we considered three sizes of atmosphere noise phase and explored how well the model parameters could be reconstructed from them. The signal-to-atmosphere noise ratio, SNR (atmosphere noise), is 2, 5, +∞ for the experiment with large, small, or no atmosphere noise phase, which corresponded to the range 50%, 20%, and 0% of the amplitude range of the deformation signal. Figure 3b shows a simulated atmosphere noise phase with large magnitude. The random noise phase ( rand ) is generally caused by thermal noise in the instrument and by decorrelation or incoherence (Fattahi, 2015). The random noise in this research was assumed to be spatially uncorrelated white noise, and we contaminated our synthetic data with uniform random noise within the ranges 50%, 20%, and 0% of the amplitude range of the deformation signal. The amplitude range of the deformation signal is with respect to 2 for wrapped phase or the peak amplitude of displacement phase for the unwrapped phase. The signal-to-random noise ratio, SNR (random noise), is 2, 5, +∞ for the experiments with large, small, or no random noise phase. Figure 3c shows an example of a strong decorrelation (random) noise phase component.
In Figure 4, the simulated noise-plus-deformation wrapped interferogram is shown, as listed in Table 1, and the various noise levels are clearly seen. Then, the wrapped phase is downsampled following the algorithm in section 2.1.1. Figures 5a and 5b show the downsampling results for the case without and with the noise. To estimate the covariance function, we chose an area far from the deformed region and assumed that the phase is dominated by noise, as shown in Figure 6a. Then, we apply the algorithm in section 2.1.2, which was applied. For phase gradient, the theoretical semivariogram function (Figures 6c and 6d) was fitted from  (Mogi, 1958). One colored fringe corresponds to one cycle of phase change. (b) Simulated atmosphere noise phase with large magnitude and SNR (atmosphere noise) = 2; that is, the range of atmosphere noise phase is equivalent to 50% of the amplitude range of the displacement phase. (c) Simulated random noise phase with a large magnitude and SNR (random noise) = 2; that is, the range of random noise phase is equivalent to 50% of 2 . The area of the region in the black box is around 100 × 100 km, and this region contains most of the displacement signal. Therefore, we use this region for phase downsampling. The region in the red box contained few displacement signals, and the phase is dominated by the noise phase, so we use it to calculate the covariance function for data weighting. the downsampled phase gradient (Figures 6a and 6b). Note that the downsampled phase gradient was here calculated based on the phase gradient in the patch with uniform size n 2 thresh = 256.

Evaluation of the Results
After inversion, for each experiment, we assessed the normalized deviation R of source parameters between the synthetic value m and the estimated valuem.
where n is the number of source parameters, m i is the ith synthetic source parameter value, andm i is the ith estimated source parameter value. The normalized deviation R was evaluated from two perspectives: One concerns the source location, where m ∈ { X center, Y center, depth }; the other is related to the source strength, where m ∈ { volume change }.
Under the assumption that the posterior PDF of source parameters follows the normal distribution  , the mean value and variance 2 were then estimated from normalized deviation R.  1. Influence of variance-covariance (Group A vs. Group B): The spatial correlation between patches is not considered to construct the likelihood function in Group A. Thus, the downsampled patches are independent and have the same weight, while the block variance-covariance considering the spatial correlation is applied to weight the residual in Group B. For each corresponding experiment, the normalized deviation is greater in Group A where the spatial correlation is neglected. 2. Influence of observed data (Group B vs. Group C): The uncertainties of the normalized deviation derived from the wrapped phase (blue distribution) being narrower than that from the unwrapped phase (yellow distribution). Note that the results should be the same from both wrapped phase and unwrapped phase without unwrapping error. However, the correlation between circular mean phase and phase gradients or between phase gradients is not considered in WGBIS, and each wrapped data set is independent, which leads to the underestimated uncertainty. 3. Influence of atmosphere noise phase (Group B): For experiments using the same SNR (random noise) (the same column in Figure 7), the mean value of the normalized deviation and the uncertainties became greater with increasing SNR (atmosphere noise). 4. Influence of random noise phase (Group B): For experiments using the same SNR (atmosphere noise) (the same row in Figure 7), the normalized deviation maintains similar distribution with increasing SNR (random noise).

Strong Phase Gradient Scenario
One situation, where it is difficult to perform inversion, is when there is surface rupture with a strong phase gradient, because unwrapping errors are more likely to occur due to the discontinuous phase (Bacques et al., Figure 6. Covariance function estimation from the phase in the nondeformed region for Case 1 (volcano deformation model). The chosen region for covariance estimation is the undeformed region for the experiment containing noise phase (the noise level is the same as the center figure in 5). Images (a) and (b) are the downsampled phase gradients in X-direction and Y -direction with minimum pixels n 2 thresh = 256 as the downsampling threshold. In (c) and (d), the experimental (rectangular) and theoretical (solid line) semivariograms are shown for phase gradients in X-direction and Y -direction, estimating from (a) and (b) according to equation (9). 2018; Shugar et al., 2010). Although the conventional methodology based on the unwrapped phase with unwrapping error might not provide a robust satisfying result in such a scenario, our methodology allows us to explore the inversion results based on the wrapped phase. Therefore, we simulated a fault with surface rupture and explored the performance with the wrapped and unwrapped phase.

Experiments Designed With Strong Phase Gradient
We designed three sets of experiments: Groups D, E, and F, as shown in Table 3. The difference between Groups D and E is that they are either wrapped or unwrapped phase, and the difference between Groups E and F is that it is whether they contain the unwrapping error. Note that the atmospheric noise and random noise are not considered in these cases. The displacement filed in this case is simulated by half-space fault dislocation model (Okada, 1985) with a major strike slip and minor dip slip. The synthetic parameters are listed in Table 4. Figures 8a and 8c show the wrapped and unwrapped displacement phase. The maximum slip at surface corresponds to five to six fringes in the line-of-sight direction. An unwrapping error 2 k in magnitude, where k is an integer, might occur during phase unwrapping procedure. In this earthquake case, the top depth of the fault plane is as shallow as 10 m, so it is more likely that unwrapping error would occur. Thus, we assume 4% of downsampled patches to be affected by unwrapping error, and k is a random number in {−2, −1, 1, 2}.
According to Figure 8a, the total slip at the surface corresponds to five to six fringes in the line-of-sight direction, and three to four of them are identified by the wrapped phase downsampling algorithm with fine details (downsampled circular mean phase in Figure 8b). The widths of two fringes that fail recognition are less than 10 pixels, and the sharp phase change contributes to a high circular mean deviation of phase, which may contribute to its exclusion. Similarly, the information within the same region is lost in the downsampled unwrapped phase (Figure 8d).

Evaluation of Results
Following section 3.1.3, the estimated source parameter was assessed by the normalized deviation R from two perspectives: One is concerned with the location of the sources, and the other with the source strength, where m ∈ { geodetic moment } and geodetic moment is the product of fault length, fault width, total slip, and shear modulus (rigidity). Figures 8f and 8g show normalized deviation R for source locations and strength (released moment). Our findings are listed below.
1. If the input data do not contain unwrapping error in either wrapped phase or unwrapped phase without unwrapping error, the methodology provided good inversion results. However, the mean value and uncertainty of normalized deviation based on the wrapped phase (blue distribution) were slightly lower than that based on the unwrapped phase (yellow distribution). As mentioned in section 3.1.3, the results  should be the same from either wrapped phase or unwrapped phase without unwrapping error, and the underestimated uncertainty for wrapped phase is because each wrapped data set is independent. 2. If the input data are the unwrapped phase, the addition of unwrapping error will corrupt the inversion result. The normalized deviation (black distribution) exceeded 0.6 when adding 4% of unwrapping error.

Choice of the Likelihood Function
The key in this Bayesian inversion methodology is the new likelihood function in which the residual between observed and modeled wrapped phase is weighted by the variance-covariance matrix. In this section, we will discuss the effect on the method's performance of the choice of observed wrapped phase and variance-covariance matrix.
In previous inversion methodologies based on the wrapped phase, the input data are limited to one observable. For example, Feigl and Thurber (2009) minimized the circular mean deviation based on the wrapped phase, and Ali and Feigl (2012) minimized that based on the phase gradient in Y -direction. In order to show the effect of input data on the inversion results, we used our Bayesian approach to design a set of experiments with different input data: (a) circular mean phase, (b) phase gradient in Y -direction, and (c) combinations of circular mean phase and phase gradients. As shown in Figure 9, these experiments suggest that the mean value of the normalized deviation based on the circular mean phase (a) is lower than that based on their combinations (c). In contrast, the inversion with only circular mean phase (a) is easily trapped into a local minimum due to a wrong estimation of integer k. We also noticed that the mean value of the normalized deviation based on the circular mean phase (a) is higher than that based on their combinations (c) and that the uncertainties based on the phase gradient (b) are one to six times wider than those based on their combinations (c). It can therefore be concluded that the inversion based on the combinations provides a more appropriate estimation for the source parameters.
As shown in Figure 7, for experiments based on wrapped phase and variance-covariance matrix considering the correlation between pixels (Group B), the uncertainty of normalized deviation is lower than that based on unwrapped phase (Group C). Unlike the independent observations among different interferograms, the three downsampled wrapped phases are not fully independent, explained by the change in the wrapped phase being correlated with that in phase gradient. Thus, the ignorance of the correlation between circular mean phase and phase gradients, or between phase gradients, would lead to an excessive fit with the observed data and underestimated uncertainties Fukuda & Johnson, 2008;Sun et al., 2013). Moreover, the difference between uncertainties based on wrapped and unwrapped phase decreased when the noise level rose. This infers that, under high noise conditions, the inversion based on wrapped phase could lead to a similar uncertainty of the model, compared with that based on unwrapped phase. Nevertheless, it would be better to apply an appropriate algorithm to estimate the correlation between the three downsampled wrapped phases, and this needs further exploration.

Effect of Increase in Random Noise Phase
In this research, the random noise phase is simulated as spatially uncorrelated white noise (a decorrelated effect), and the experiment in section 3.1 presents the inversion result with various noise levels. When comparing the experiments with the same atmosphere noise, using the proposed approach, the PDFs of the normalized deviation remain stable under various random noise levels ( Figure 10).
In order to explore the performance of the proposed approach under extreme noise conditions, we broadened the extent of the simulated random noise level. Noise levels are expressed as SNR (random noise) ranging from 1, 1.1, … , 10, +∞. This corresponds to the percentage of random noise phase with respect to 2 for wrapped phase cases or the peak amplitude of the deformation phase for unwrapped phase cases, 100%, 90%, … , 10%, 0%. Although the data with such low SNR (random noise), for example, 1 or 1.1, may not exist in the real case, it is still valuable to test the limits of our approach. In order to explore the effect of random noise phase, we plot the inversion results versus random noise levels. Figure 10 shows the mean value of the normalized deviation of source parameters with SNR (random noise) in horizontal direction. In Figures 10a and 10b, the inversion with wrapped phase stays robust with increasing random noise level. In comparison to the Bayesian inversion with unwrapped phase, the mean values of R(location) and R(strength) based on wrapped phase are lower, as shown in Figures 10a and 10c (or Figures 10b and 10d). It is also notable that the inversion with the unwrapped phase still had a good performance when SNR (random noise) = 1.2, assuming no unwrapping error. In reality, unwrapping errors might occur when SNR reaches such a low value, where random noise dominates the interferometric phase, and performance will be severely influenced. Therefore, the good performance when inverted by unwrapped phase with SNR (random noise) = 1.2 as input data is an ideal but unattainable case.

Effect of Increase in Atmosphere Noise Phase
Atmosphere noise has been widely regarded as the most difficult error to mitigate in surface deformation detection, because both signals are long wavelength and are not easy to separate. Although many methods have been developed to diminish the effect of atmosphere noise for the unwrapped phase, such as the topography-dependent method (Elliott et al., 2008) and the external data-based method (Jolivet et al., 2014), a general methodology applicable to all conditions is still under discussion Yu et al., 2018). In section 3.1, our analysis inferred that the normalized deviation of location and strength would be biased up to 0.08 with SNR (atmosphere noise) = 2. Similarly, Scott and Lohman (2016) demonstrated the biased fault parameters due to the impact of atmosphere noise.
In order to explore the performance of our approach under extremely strong atmosphere phase, we broadened the extent of simulated random noise level and the SNR (atmosphere noise) ranged from 1, 1.1, … , 10, +∞. This corresponds to the percentage range of atmosphere noise phase to the peak amplitude of the deformation phase, 100%, 90%, … , 10%, 0%.
In Figures 10a and 10b, for inversion with the wrapped phase, the mean value of normalized deviation distribution decreased quasilinearly with increasing SNR (atmosphere noise). For inversion with the unwrapped phase in Figures 10c and 10d, a linear relationship is still applicable. Another significant feature is the uncertainty of the distribution, which is five times wider than with wrapped phase on average, and it also decreases with decreasing SNR (atmosphere noise).

Effect of Strong Phase Gradient
In order to reduce the number of observations and also the influence of noise phase, several algorithms have been proposed to downsample the unwrapped phase (Jonsson et al., 2002;Lohman & Simons, 2005;Simons et al., 2002). In this paper, we proposed a new methodology to downsample the wrapped phase. For the case with low phase gradient, our algorithm provided good downsampled patches describing the deformation signal under SNR low to 1.2. However, it would be more difficult to downsample the wrapped phase with a high gradient, since the interferometric fringes are so closely spaced that aliasing occurs. In this case, the estimated gradient would be too low in absolute value. Therefore, the choice of downsampling criteria, such as circular mean deviation and minimum pixels in one patch, is critical to generate good downsampling results.
(1) The circular mean deviation affects to what extent the deformation signal could be detected. If the circular mean deviation threshold is overestimated, the fringe we prefer to keep could be lost, and the deformation signal would be ignored.
(2) The threshold of minimum pixels in one patch decides in which details the fringe pattern could be detected. If we double the threshold of the minimum pixels, the deformation signal would be averaged. For the region with high phase gradient, the width of the fringe is narrow, and the overestimated minimum pixels threshold will lead to the exclusion of the displacement signal. Thus, the initial minimum pixels threshold could be low and then adjusted to a suitable value according to the noise level and the pattern of deformation signal.
(3) The necessary condition for interferometry implies that the maximum detectable deformation gradient is one fringe per pixel (Massonnet & Feigl, 1998) but, in real data, phase gradients in excess of this threshold seem to be fairly rare. Strong tilts and rotations can cause the phase gradient to exceed this limit (Peltzer et al., 1994).

Effects of Searching Bounds and Starting Models
In order to find the global minimum of an objective function for nonlinear equations, global optimization algorithms, for example, Bayesian approach (Amey et al., 2018;Bagnardi & Hooper, 2018) and the simulated annealing algorithm (Cervelli et al., 2001;Jonsson et al., 2002), are widely used for geophysical problems, due to their high efficiency and capability of jumping out of local minimum regions of a high-dimensional parameter space. One concern about WGBIS is whether it could be caught in a suboptimal solution. The initialization procedure for these global optimization algorithms consists of two steps: (1) to set bounds on the values for all the model parameters and (2) to pick an initial starting model. We therefore set out to test the performance of WGBIS under various of starting models and searching bounds. The starting model is 0%, 10%, or 100% off the synthetic value, and the searching bound of volume change is in a narrow or wide range. The results revealed that WGBIS could retrieve the input model values with a normalized deviation as low as 0.01 in all cases.
In comparison, the program embedded with the simulated annealing algorithm, General Inversion for Phase Technique (GIPhT) (Ali & Feigl, 2012;Feigl & Thurber, 2009), is also applied to invert for the optimal model. The input phase is either circular mean phase (Feigl & Thurber, 2009) or phase gradient in X-direction (Ali & Feigl, 2012). The same starting model and searching bounds are set as mentioned above. The experiments revealed the following findings.
(1) With phase gradient in X-direction as input data, GIPhT successfully finds the global minimum with normalized deviation as low as 0.01.
(2) With circular mean phase as input data, GIPhT failed in one case, where the range of searching bounds is wide and the starting model is strongly biased (100%) away from the synthetic value. As demonstrated by Shirzaei and Walter (2009), the success of simulated annealing at finding a global solution critically depends on the cooling schedule. This is substantially problem dependent, so it seems impracticable to develop a global remedy for all problems.

Exploring the Performance in a Real Case
In order to test the performance of this methodology with actual data, we investigated the surface deformation caused by the 2019 Acipayam earthquake in Turkey. This M w 5.7 earthquake based on published focal mechanisms was generated by a normal fault. This is consistent with the regional tectonics in southwestern Turkey, which are dominated by extensional stresses. We computed a descending interferogram using the European Space Agency's Sentinel-1 satellite images in Terrain Observation by Progressive Scans mode from 3 November 2019 to 23 March 2019. This period spans the occurrence of the earthquake and should be dominated by any coseismic displacement. The area shows very good coherence, and a clear strong fringe pattern is visible (Figure 11a). The interferogram was computed using the JPL/Caltech/Stanford ISCE package and was spatially filtered.
The filtered interferometric phase pattern was fed directly into WGBIS, which downsampled the wrapped phase and phase gradients data, estimated the data variance-covariance matrix, and finally ran the Bayesian inversion for the fault rupture parameters. Figure 11b shows the downsampled observed wrapped phase and phase gradients, modeled phase, and their residuals. In Figure 12a, we also show very well-defined posterior PDFs for the nine estimated fault source parameters. The bottom row shows the histograms of marginal distributions for each parameter, and the remaining rows show the joint distributions between pairs of parameters. In Figure 12b, we show that the distributions of parameters in the prior have significantly narrowed down to the posterior distribution after the Bayesian inversion. The Bayesian inversion results reveal that coseismic surface displacement can be well explained by fault slip on a 9.4 ± 0.2-km-long fault and 9.2 ± 0.2-km-wide fault, striking 340 • ± 1.2 • and shallowly dipping at 29.3 • ± 1.5 • , with 14 ± 0.5 cm of slip in the down-dip direction and 7 ± 1 cm of sinistral strike-slip components. The estimated fault slip infers that this fault has a dominate extensional mechanism with minor left lateral strike-slip component, and this is consistent with geologic and geomorphologic investigation (İrem & Yaltrak, 2016). From the analysis of the marginal posterior probabilities (Figure 12a), correlations are observed between any two of the source parameters, with all of them reasonably well defined. Assuming a shear modulus of 3.32 × 10 10 N/m 2 , Poisson's ratio 0.25, and the marginal posterior probabilities for fault geometry and slip, the geodetic moment is 4.45 ± 0.2 × 10 17 Nm, equivalent to M w between 5.68 and 5.71. This finding is consistent with the estimated seismic moment M w 5.7 from the available earthquake catalogs (Turkey earthquake catalog, https://www.koeri.boun.edu.tr/sismo/2/earthquake-catalog/; GFZ earthquake catalog, https://geofon.gfz-potsdam.de/old/eqinfo/list.php; USGS earthquake catalog, https://earthquake. usgs.gov/earthquakes/search/). It should also be pointed out that, during the Bayesian inversion, the searching bounds for the fault parameters were selected over quite a broad range; for example, the striking angle is searched between 0 • and 360 • (see Table 5 for more details). This indicates that our methodology has the ability to distinguish two conjugate fault planes inferred from the seismic wave inversion. However, this could be due to the very shallow fault slip associated with this event. Furthermore, a strong indicator that our estimated fault plane is the causative slipping fault is that the spatial distribution of aftershocks is closely aligned to the modeled fault plane ( Figure 13). . The orange and blue dots are aftershocks with ML > 3.5 and ML < 3.5 in the first month following the main shock, retrieved from Turkey earthquake catalog: https:// www.koeri.boun.edu.tr/sismo/2/earthquake-catalog/.  Figure 10, and the fitted function is R = q 1 x −q 2 + q 3 −q 4 + q 5 x −1 −1 + q 6 , where R is the normalized deviation, x is SNR (random noise), y is SNR (atmosphere noise), and q is function coefficient. The solid line is the normalized deviation for WGBIS, and dashed line for GBIS. The dark blue region is the common extent where normalized deviation is lower than 0.1 for both WGBIS and GBIS. The light blue region is the extended application extent for WGBIS. The gray region stands for normalized deviations higher than 0.1 for both WGBIS and GBIS.

Conclusions
This research has developed a new methodology for estimating earthquake and volcano source parameters, by inverting the interferometric wrapped phase using a customized Bayesian approach. We first introduce a new downsampling algorithm suitable for the wrapped phase and another algorithm to estimate the variance-covariance matrix for describing the noise contained in the wrapped phase. Then, we propose a new likelihood function for the observed and modeled wrapped phase. Finally, a Bayesian approach is applied to generate the PDF of the likelihood function, which is proportional to the posterior PDF of the source model by assuming a uniform PDF. Benefiting from the proposed methodology, the advantages are demonstrated by the simulated experiments and are listed as follows.
(1) The wrapped phase is directly used in our inversion methodology, so the concerns about the unwrapping error are no longer worrisome. An improved downsampling algorithm provides a fine observed data for the inversion.
(2) Embedded with a Bayesian approach, the methodology provides an assessment of the model uncertainty, and it also has the capability to escape the local minimum. (3) The block variance-covariance describing the noise contained in the wrapped phase is considered in the likelihood function, thus providing an optimistic source model, better than that used when neglecting the off-diagonal variance-covariance term. (4) The robustness of our methodology was validated with multiple likelihood functions, in simulated cases with various noise levels, in a scenario with strong phase gradients, and in an actual earthquake case by using real SAR data sets. (5) In terms of its performance with respect to noise, the range of application for WGBIS is extended compared with that for GBIS ( Figure 14).