Multimodal Layered Transdimensional Inversion of Seismic Dispersion Curves With Depth Constraints

MuLTI (Multimodal Layered Transdimensional Inversion) is a Markov chain Monte Carlo implementation of Bayesian inversion for the probability distribution of shear wave velocity (Vs) as a function of depth. Based on Multichannel Analysis of Surface Wave methods, it requires as data (i) a Rayleigh‐wave dispersion curve and (ii) additional layer depth constraints, the latter we show significantly improve resolution compared to conventional unconstrained inversions. Such depth constraints may be drawn from any source (e.g., boreholes, complementary geophysical data) provided they also represent a seismic interface. We apply MuLTI to a Norwegian glacier, Midtdalsbreen, in which ground‐penetrating radar was used to constrain internal layers of snow, ice, and subglacial sediments, with transitions at 2 and 25.5 m, and whose Vs is assumed to be in the range 500–1,700, 1,700–1,950, and 200–2,800 m/s, respectively. Synthetic modeling demonstrates that MuLTI recovers the true model of Vs variation with depth. Significantly, compared to inversions without depth constraints, in this synthetic case MuLTI not only reduces by about a factor of 10 the error between the true and the best fitting model, but also reduces by a factor of 2 the vertically averaged spread of the distribution of Vs based on the 95% credible intervals. We further show that using frequencies above about 100 Hz lead to unconverged solutions due to mode ambiguities associated with fine spatial structures. For our acquired data on Midtdalsbreen, we use 14‐100 Hz data for which MuLTI produces a large‐scale converged inversion.


Introduction
Many inversions of geophysical data derived from a single type of geophysical instrument are underconstrained, a property that results from not only limitations of the size and accuracy of the data set, but inherent nonuniqueness: many models of the subsurface may be consistent with the single class of surface-based constraints. For this reason, joint inversions using data sets of mixed types can be a powerful method of constraining the model space, where ambiguities of one methodology are mitigated by resolution in the other. Over the past few decades several different joint inversion algorithms and techniques have been developed in various geological settings to overcome such underconstrained problems. Geoelectric and seismic surface wave data have been combined (Hering et al., 1995;Wisén & Christiansen, 2005) to improve the definition of layered structure in near surface environments (e.g., Ronczka et al., 2018). Similarly, reflection seismic and controlled source electromagnetics have been combined to estimate fluid properties of petroleum reservoirs, which could not be obtained from one survey-type alone (Hoversten et al., 2006). Colombo et al. (2017) used the joint inversion of seismic and airborne time domain electromagnetics to improve imaging of complex near surface structures to reduce risk in shallow petroleum exploration. In deep-Earth investigations, surface wave dispersion and teleseismic P wave receiver functions both supply constraints on the crustal and upper mantle geology and efficient algorithms for joint inversions have been developed (e.g., Bodin et al., 2012;Julià et al., 2000;Shen et al., 2012). Our paper focuses on the use of constrained inversions to characterize a glaciated subsurface by inverting Rayleigh (surface) wave data sets in the presence of depth constraints here provided by ground-penetrating radar (GPR) data.
Rayleigh waves are a type of seismic wave that travel along the ground surface, which, in an active seismic survey, are efficiently generated by a compressional wave source: such a source typically converts more than two thirds of the total seismic energy into Rayleigh waves (Richart et al., 1970). Using Multichannel Analysis of Surface Wave (MASW) methods, the dispersive properties of the Rayleigh waves can be utilized to infer the elastic properties of the subsurface , often expressed in terms of shear wave velocity (Vs), compressional wave velocity (Vp), and density. The MASW technique is most sensitive to Vs (Xia et al., 2003), being only weakly dependent on Vp and density (Wathelet, 2005). When inverted with no other constraint, Rayleigh wave dispersion curves have poor depth sensitivity (Foti et al., 2009) particularly given data collection issues including noise and finite bandwidth, the latter being problematic for resolving short length scales and near-surface structures. Indeed, a direct inversion of such data has a vertical resolution of only one third of the shortest wavelengths sampled (Gazetas, 1982), typically with 1-10 m resolution in active source surveys. In view of this limitation in resolution, many models of subsurface Vs structure may provide an acceptable level of fit to the observed data, therefore giving an ambiguous inversion.
An independent geophysical survey technique is GPR which is sensitive to changes in subsurface dielectric permittivity and can resolve the layered near-surface velocity-depth structure to centimeter accuracy (Booth et al., 2010). However, as with MASW, GPR data itself cannot unambiguously constrain subsurface structure, and often has limited depth penetration. By combining Rayleigh wave observations and depth information from GPR data, constrained inversion offers a powerful way to reduce the ambiguities inherent in single-technique inversions, provided that subsurface interfaces correspond to colocated contrasts in both elastic and electromagnetic properties. This assumption is likely appropriate in (for example) a glaciated environment with snow, ice, and a subglacial substrate (Tsuji et al., 2012); a permafrost environment featuring unfrozen and frozen ground (Kneisel et al., 2008); and hydrological settings such as the imaging of shallow aquifers (Cardimona et al., 1998). Such inversions both honor the centimeter-scale accuracy of the GPR constraints and, assuming the layered GPR model can be interpreted, their major advantage is the narrowed range of elastic properties that is permitted for each layer within the model. This latter effect vastly reduces the space of subsurface models consistent with the data, and thus significantly improves the resolution of any inversion. Furthermore, layer constraints mean that poor surface wave resolution in one layer does not necessarily spread to the adjacent layers. Of course, this approach is only viable when the contrasts detected by the seismic and GPR methods are likely the same; in situations where a change in electromagnetic properties would produce no linked elastic contrast (e.g., a salinity horizon within an aquifer), our approach would not be useful. In this study, we use synthetic and real data examples from a glaciological setting, modeling the subsurface distribution of snow, ice, and subglacial material. In this setting, we expect that elastic and electromagnetic horizons will be colocated. In practice, depth constraints could be added from any external data source-for example, from seismic reflection studies or borehole control.
Even when multiply constrained, inversions are seldom unique. In many inversions, regularization is employed to penalize small-scale roughness to produce a single smooth solution. However, single solutions may not be sufficient given the nonuniqueness issues in surface wave inversions. Bayesian Markov Chain Monte Carlo is a type of method that probabilistically quantifies the model space consistent with the observations. The application of the method results in a posterior probability distribution of the subsurface structure, numerically approximated by an ensemble of models, from which representative models such as the mean or mode can be obtained, in addition to rigorous estimates of uncertainties. This posterior probability distribution must lie within the bounds defined by a specified prior distribution, but is honed by the data into peaks that correspond to the most likely models. A refinement, applied here, is to use a transdimensional Bayesian method that allows each ensemble member to self-select both its model values and complexity based on the data. Averaging over the many ensemble members, to obtain a mean solution, then provides an effective smoothing. Furthermore, in the case where surface observations are weakly informative (as is often the case in surface wave dispersion), a Bayesian transdimensional approach will prefer models with fewer layers, and therefore larger length scales (MacKay, 2003), rather than models with many thin layers.
Here we base our method on existing implementations of such a transdimensional method, which have successfully inverted both single and multiple data sets (Bodin et al., 2012;Bodin & Sambridge, 2009;Livermore et al., 2018).
In this paper we present the algorithm MuLTI (Multimodal Layered Transdimensional Inversion), which implements a Bayesian inversion of surface wave data, honoring depth constraints. MuLTI is coded in Matlab and is freely available; we provide all data sets and scripts needed to reproduce the figures in this paper. The remainder of the paper is structured as follows. We first describe MuLTI in detail and demonstrate its ability to retrieve a known subsurface structure using synthetic data sets within a glacial setting. We then show an application of MuLTI to image the structure, both within and under a Norwegian glacier using a data set which we acquired in situ. Although we focus on glacial environments, MuLTI can be used in any layered geological environment where electromagnetic and elastic properties change at the same depths.

The MuLTI Algorithm
MuLTI is a Bayesian method which seeks to determine the posterior distribution of the Vs as a function of depth, for a prescribed profile of Vp and density (see also Bodin et al., 2012;Wathelet, 2005). Denoting the data set by d and the model description of Vs by m, using Bayes' theorem this can be written where p(m| d) is the posterior probability of the model given the data, p(m) is the prior information known about the model before introduction of the data, p(d| m) is the likelihood (probability of observing the measured data given a particular model (m)), and p(d) is the evidence. In what follows, we consider a Markov Chain Monte Carlo methodology to sample the posterior distribution, in which relative inference is sufficient: thus, here the evidence does not enter our analysis. The algorithm traverses the space of admissible models, sampling with greater frequency those models that are more likely. Provided that the chain of models is long enough, the statistics of the discretized ensemble will converge to those of the underlying posterior distribution.
It is important to note what we take to mean for the data, d. Both GPR and Rayleigh wave data sets are data in the geophysical sense but, because the GPR-derived layer depths are comparatively so well resolved, we use the two data sets in different ways. The GPR data we take to be part of our background knowledge of the system and are included in the prior, while the Rayleigh wave data we take to comprise the data (d), used in the likelihood. Hence, the two data sets are not treated on an equal footing.
MuLTI can run in two different modes. In the first, it conducts a constrained inversion using both seismic data and layer-depth constraints, here derived from GPR data and assuming an underlying infinite half-space. In the second mode, MuLTI runs with no predefined internal boundaries, and the subsurface here is described by an infinite half-space.

The Data
Rather than using raw data from synthetic waveform models, seismograms, or active seismic acquisitions, MASW uses the derived dispersion curve in the frequency-phase velocity domain. This dispersion curve is characterized by a discrete set of points that we term the data, d, by picking the phase velocity for a set of frequencies according to the spectral maximum (Foti et al., 2015). If N discrete points are picked in this domain, the data comprise N pairs of frequency (f) and Rayleigh wave phase velocity (PV) values with corresponding standard deviation σ: The standard deviation of the phase velocities is a measure of uncertainty in the picked dispersion curve, which is either prescribed as a constant vector (e.g., in a synthetic test), or determined as a function of 10.1029/2018GC008000 Geochemistry, Geophysics, Geosystems frequency from the width of the waveform dispersion image. For the latter case, we note that the resolution of the dispersion curve depends on the survey parameters used to acquire the seismic data. For example, a higher density of wavefield sampling, that is, more receivers and longer source-receiver offset ranges, produces better resolved dispersion curves; furthermore, in general, higher frequencies are typically better resolved than lower frequencies (Park et al., 2001).

Model Parameterization
We describe the 1-D variation of Vs with depth as a piecewise constant function using Voronoi nuclei (but see Dettmer et al., 2010, for an alternative), in which each layer is divided into a variable number of sublayers with constant velocities; at each depth, the value of Vs is determined by its nearest nucleus within the same layer (Bodin et al., 2012), see Figure 1 for an illustration of this model parameterization. To ensure that Vs within each prespecified layer is always described (requiring a minimum number of nuclei of 1 within each layer), we define a set of confined nuclei, which are confined to the given layers but are otherwise unconstrained in depth. The number of confined nuclei is equal to the number, l, of internal layers including the half space. All other k nuclei in the model are unconstrained in depth, and are termed floating nuclei. The model vector is then m ¼ dp 1 ; dp 2 ; …:; dp k ; Vs 1 ; Vs 2 ; …:; Vs k ; k; dpc 1 ; dpc 2 …:dpc l ; c; Vsc 2 ; …:; Vsc l ½ ; where dp i are the floating nuclei depths, Vs i are the floating nuclei wave speeds, dpc i are the depths of the confined nuclei, Vsc i are the wave speeds of the confined nuclei, and k is the number of floating nuclei. In our transdimensional framework, the number of floating nuclei k≥0, characterizing the complexity of the Vs profile, is a free parameter. All Voronoi nuclei are defined with depths ranging from 0 to a predefined maximum depth dp max . We note that the lowermost Voronoi cell is unbounded in downward vertical extent and describes an infinite half-space.

The Likelihood
The likelihood expresses the probability of observing the data (in our case, the picked dispersion curve) given a specific model m, here achieved by running a forward calculation of the frequency-PV response and comparing to the observed data (d). For a given frequency, there are multiple phase velocities at which the Rayleigh wave can travel; the slowest velocity is called the fundamental mode, the next highest velocity is called the first higher-order mode, then the second higher-order mode, and so forth . Early models of surface wave inversion only considered the fundamental mode with simple near surface environments , although it has been shown subsequently that higher order modes are preferred over the fundamental mode in several types of velocity structures (e.g., when a high velocity layer overlies a low velocity layer; Gucunski & Woods, 1992). Here we calculate all relevant modes (fundamental, first and second higher-order modes) and we assume that the probability of the ith datum PV i (f i ) is normally distributed about the nearest modal value, c (f i ), at frequency f i (of the three calculated) with standard deviation σ i (f i ). We deliberately do not specify which mode should be associated with any given datum because it is often difficult in practice to unambiguously assign the correct modal index (see section 3.5). Assuming that each datum (indexed by i = 1, 2 … N data ) is independent, the likelihood p(m| d) is then proportional to We calculate the modal dispersion curves using the Geopsy dispersion curve algorithm of Wathelet (2005) by first converting our Voronoi cells to a layer-model. The Geopsy dispersion curve algorithm uses a propagator matrix method to find the eigenvalues of the dispersion equation (Wathelet, 2005). This is a fast algorithm suitable for running repeatedly within MuLTI. We note that there is an option within MuLTI to limit the calculation of misfit to use either only the fundamental mode, or to use only a subset of the frequency range.
Our framework can be easily altered to include a different definition of likelihood: the choice we made above is not unique. Other definitions include a characterization of the misfit in terms of a determinant, removing the need to calculate the modal curves (Maraschini & Foti, 2010), or use of a full wavefield inversion approach using the dispersion spectra instead of a picked curve (Dou & Ajo-Franklin, 2014).

Prior Information
The remaining key aspect of our Bayesian method is the prior information that is assumed for the model parameters: the number (k), depth (dp), and material properties (Vs) associated with the Voronoi nuclei. By conditioning on the value of k, the prior can be written as p m ð Þ ¼ p dp 1 ; dp 2 ; …:; dp k ; Vs 1 ; Vs 2 …:Vs k ; dpc 1 ; dpc 2 ; …:; dpc l ; Vsc 1 ; Vsc 2 ; …:; Vsc l jk ð Þ p k ð Þ : By further assuming that each pair (dp j Vs j ), for both the floating and confined nodes, is independent of the others, the above probability can be written as ∏ kþl j¼1 p Vs j ; dp j jk We assume that k is uniformly distributed between given bounds: we commonly take k~U[0,30], limiting the maximum number of floating nuclei to 30. We further assume that the depth of each Voronoi nucleus is independent and has a depth uniformly distributed within given bounds: either the limits defined by GPR-defined layers for confined nuclei or dp~U[0, dp max ] for floating nuclei, with dp max typically taken to be 40 m. The value of Vs attached to each nuclei is assumed to be uniformly distributed within bounds dependent on its assumed composition. Without GPR constraints, wide bounds are applied for the infinite half space, typically Vs~U[200,2800] m/s, but with GPR constraints the much narrower ranges are defined by the layers' composition. While all the model prior distributions are uniform, this does not mean the distributions of all diagnostics of the prior are also uniform. For example, the distribution of nuclei depths is not uniform (when using GPR constraints), because the existence of the confined nuclei that are tied to certain layers skews the distribution.

Numerical Sampling of the Posterior
We sample the posterior distribution by using the Monte Carlo Markov chain algorithm, in which we iteratively generate a long chain of models. The algorithm is very similar to that described in Gallagher et al. (2011) and Bodin and Sambridge (2009) and there is no need to reproduce all the details here; we present only the key features and any differences. At each step, a new model m 0 is proposed that differs from the current model by one of four perturbations (Figure 2), which depend on a set of user specified parameters (σ change , σ move , and σ birth ) whose values affect the speed of convergence to the posterior distribution: • change Vs: perturb the velocity of a randomly chosen nucleus by a random amount distributed as N(0, σ change 2 ). • move nucleus: alter the depth of a randomly chosen nucleus; if it is a floating nucleus it is perturbed by an amount distributed as N(0, σ move 2 ) and can move between the depth-derived layers; if it is a confined nucleus it is moved to a random depth distributed uniformly over that layer to which it is tied. • birth: add a floating nucleus to the existing model whose depth is uniformly distributed U[0, dp max ] and whose Vs is distributed N(v, σ birth 2 ), where v is the value of Vs based on the current nuclei distribution. • death: remove a floating nucleus from the existing model. Confined nuclei cannot be removed.
Each proposed model is tested to see if it satisfies a certain acceptance criterion which involves the quantity where q(m 0 | m) is the probability of moving from model m to m 0 . For the type of transdimensional proposal used in this approach, the Jacobian term (J) is unity (Bodin & Sambridge, 2009). The evaluation of α for the types of perturbation shown in Figure 2 is standard (and is not described here) except the move perturbation when considering floating nuclei. For this case, q(m| m 0 ) = q(m 0 | m), p(d| m 0 ) p(d| m) À1 is the ratio of the two model likelihoods and p(m 0 )p(m) À1 is the ratio of the ranges of Vs between the layers that the nucleus may move between. After α is evaluated it is compared to a random number u~U[0,1]: if α > u the proposed model is accepted, added to the chain, and becomes the current model; if it is not accepted the existing model is retained as the current model.
The method is initialized with a randomized model from the prior and the method is run for a burn-in period. At this point, the Markov chain is presumed independent of the initial condition and statistics of the chain are recorded from then on. This technique therefore differs fundamentally from techniques such as the Genetic Algorithm which relies on an initial reference model to start the inversion process (Hayashi, 2012). On completion, all diagnostics must be checked to ensure sufficient iterations have been taken to achieve convergence. Perturbations that improve the data fit are mostly accepted; those which decrease the fit are most likely to be rejected but are occasionally (and randomly) accepted. Proposed models that lie outside the prior bounds give p(m 0 ) = 0; it follows that α = 0 and such models are never accepted. Figure 3 shows a schematic view of the algorithmic core of MuLTI.
MuLTI produces a variety of diagnostic statistics of the ensemble. These include the posterior probability of the model of Vs as a function of depth; the best model sampled with the lowest calculated misfit; the average and modal models; 95% credible intervals on Vs with depth; posterior distribution of the number of nuclei; comparative plots of the observed data with dispersion curves for the best, average and modal models; and plots of the misfit against iteration count highlighting convergence of the solution. We reiterate that MuLTI can be used in any geological layered environment where electromagnetic and elastic rock property changes are coincident.

Case Studies Using MuLTI
This section describes applications of MuLTI to both synthetic test data created assuming a simple glacial structure and real data acquired from a Norwegian glacier. Because the output of MuLTI depends upon the resolution of the input surface wave dispersion curve, we first describe the Norwegian field setting and the data acquisition procedure in order to motivate the specific synthetic tests that will provide insight into the reliability and limitations of inversions from real data.

Data Acquisition
Data were acquired on Midtdalsbreen, an outlet glacier of Norway's Hardangerjøkulen ice cap (60.59°N, 7.52°E), with the aim of characterizing the Vs properties of the subglacial environment. These properties provide an important insight into subglacial water storage and the flow dynamics of the overlying glacier, thus motivating this study. The subsurface comprises fresh snow over~25 m of glacier ice and a substrate of unknown subglacial material (likely sediment). Midtdalsbreen is surrounded by mountains of crystalline rock, and it is thought that this hard rock lies below the subglacial material.
Seismic shots (e.g., Figure 4a) were recorded with 48 10 Hz vertical-component geophones at 2 m incremental offset from a hammer-and-plate source, and digitized using a Geometrics GEODE system. A GPR profile was acquired along the length of the seismic line with Malå Geosciences antennas of 200 MHz centerfrequency. The thickness of snow and ice layers were estimated from velocity analysis of a GPR common midpoint gather located half way along the seismic spread (Figure 4b), using the method described in Booth et al. (2010Booth et al. ( , 2011. The GPR velocity in the snow and ice layers was 0.213 ± 0.0014 and 0.172 ± 0.0015 m/ns, respectively, with depths to their base of 2.0 ± 0.07 and 25.5 ± 0.22 m, respectively. The small relative size of the layer

10.1029/2018GC008000
Geochemistry, Geophysics, Geosystems depth uncertainties compared with resolution of surface waves (typically 1-10 m) supports our choice of fixing the layer depths. The underlying half-space is assumed to be unconsolidated sediment and bedrock. The prior distributions on Vs (defining the upper and lower bounds) for each layer were defined as the values listed in Table 1, obtained from previous glaciological seismic studies (Peters et al., 2008;Podolskiy & Walter, 2016;Tsoflias et al., 2008). The shear wave speed Vs is narrowly constrained within the snow and ice layers but, given the uncertainty about the subglacial material properties, we permit a large range of Vs to encompass soft, wet sediment to hard frozen sediment and bedrock.

Synthetic Data Tests
To evaluate the performance of MuLTI, we constructed a synthetic version of our acquired data, but underpinned by a simple known subsurface structure. Figure 5a shows our 4-layer model of snow, ice, soft sediment, and hard sediment that plausibly represents our glacial setting.
We use the discrete wavenumber method (DWM; Bouchon & Aki, 1977) to generate a full synthetic waveform data set, based on the true model shown in Figure 5a. A dispersion image is calculated from the waveform created by transforming into the frequency-phase velocity domain where the dispersive pattern of the Rayleigh wave can be determined (Figure 5b). The maximum amplitudes of the frequency-phase velocity image are picked to create the Rayleigh wave dispersion curve, which is input into MuLTI as the data (d) along with an estimate of its width, the uncertainty σ(f), Figure 5c. For the first two examples (sections 3.3 and 3.4), the DWM parameters used to calculate the synthetic waveform were chosen to have a long offset and large number of receivers, in order to sample the full wavefield. However, in the final synthetic example (section 3.5), we limit (b) GPR CMP gather acquired half way along the line. GPR velocities were derived by matching the curvature of diffraction hyperbolae, highlighted in blue, which were used to determine the thickness of snow and ice layers via Dix inversion (Dix, 1955). GPR = ground-penetrating radar; CMP = common midpoint. Geochemistry, Geophysics, Geosystems the DWM parameters to those used when acquiring the real Norwegian data to see the effect of a reduced data set.
MuLTI was run with and without GPR constraints applied. With constraints, the bases of snow and ice layers were fixed at respective depths of 2 and 25.5 m. The maximum number of floating nuclei was set at 30 and the maximum depth, dp max , of the model was set to 40 m. The burn-in chain length was set to 10000. We used parameter values σ change = 20 m/s, σ move = 1 m, and σ birth = 400 m/s, and 1 million iterations (including the burn-in) were found to be enough for convergence of the posterior distribution sampled by the single Markov chain. Convergence was also confirmed by running multiple independent chains with different initial states. To test the Markov chain was sampling correctly, the likelihood was set to unity, effectively removing the data (and rendering equivalent the posterior and prior distributions) and so the Markov chain sampled the known prior distribution against which it was benchmarked. As a preliminary test, we confirmed that MuLTI reproduced the prior distributions with the likelihood set to unity.

The Effect of High Frequencies on the Inversion
We first assess whether or not MuLTI can recover the known subsurface structure from a full frequency spectrum. Figure 6 shows the results from MuLTI of the posterior Vs distribution with GPR constraints applied and frequency range of 1-140 and 1-100 Hz (bandlimiting the high frequencies). Although ostensibly including high frequency (>100 Hz) picks adds extra data, they cause ambiguities in higher order modes associated with these frequencies, where many different models of Vs can fit the observed data. This arises because we do not assign any specific mode to each frequency, an unavoidable consequence of poor resolution in real data (see section 3.5). The plethora of complex models that all have a low misfit overwhelms the natural parsimony of the Bayesian method, producing a posterior density that is far from the true model. The probability density distribution of Vs values within their 95% credible interval are plotted as colored contours alongside the true solution (black line). The highest density distribution (red) for each depth corresponds to the most likely values of Vs. These high frequency ambiguities are highlighted in Figure 6(I); even though the observed data fit the final solution, the underlying true solution is not recovered. Limiting the high frequency range to 100 Hz (Figure 6(II)) mitigates this problem and the true solution is almost exactly recovered (to within~60 m/s) in this inversion.

Model Uncertainties Caused by Finite Bandwidth
As a second test that more closely aligns with the frequency content of real acquired data (e.g., those recorded with 10 Hz geophones), a new synthetic data set was created with a frequency range of 14-100 Hz, avoiding the high-frequency ambiguities described above. Figure 7 displays the posterior Vs distribution, respectively, with and without GPR constraints applied. Figure 7(II) shows significant deviation between the true solution and the model ensemble particularly between 0-7 and 25-35 m depth, without

10.1029/2018GC008000
Geochemistry, Geophysics, Geosystems GPR constraints applied. With GPR constraints applied the results are much improved. The density plot in Figure 7(I) shows that MuLTI recovers a distribution that is everywhere peaked close to the true model. This difference in resolution can be quantified by comparing the range of Vs values between the lower and upper 95% credible interval boundaries. Without and with GPR constraints, respectively, this range is 1,140 and 618 m/s, on average over the whole depth range. Therefore, including GPR constraints yields a stark decrease in the range of Vs (and therefore a reduction of uncertainty) by a factor of about 2. Quantified in a different way, the inversion using GPR constraints has a depth-averaged absolute error between the modal and true values of 62 m/s, about a factor of 10 smaller than the comparable error of 540 m/s associated with the inversion without GPR data.
Here we briefly discuss the uncertainties in the recovered subsurface model which limit resolution at both small and large length scales. At a given frequency, the phase velocity PV specifies the resolvable wavelength λ associated with each datum (Stokoe et al., 1994) as and its associated resolvable scale is L = λ/3, assuming a one-third wavelength resolution criterion (Gazetas, 1982). If the data points are bandlimited in the frequency range, which is the case for most real MASW data

10.1029/2018GC008000
Geochemistry, Geophysics, Geosystems sets, this immediately limits the resolvable scales by an unconstrained inversion to L min ≤ L ≤ L max , where L min is the thinnest resolvable layer and L max is the maximum resolvable depth. However, the addition of independent depth constraints will improve the resolution beyond what is possible with surface waves alone, so in the constrained inversion case these bounds will widen. Because it is not possible to easily quantify these improved resolution bounds in a simple way, in what follows we use L min and L max as illustrative values. For the tests shown in section 3.4, the data points were bandlimited with frequency range 14-100 Hz (Figure 7), illustrating the nature of any real data set. Therefore, in this example, L lies in the illustrative range 5 m≤ L ≤ 50 m. Which frequencies remain sufficiently clear above the noise level to be picked depends both on survey design: the dispersion curves are better resolved for a longer source-receiver offset (Park et al., 1998(Park et al., , 2001) and on the specific frequency: lower frequencies have lower resolution and hence a larger error.
Although the final inverted Vs solution may contain shallow and layers thinner than L min , the calculated Vs values for these layers may be considered unreliable . An illustrative case is our first synthetic example, shown in section 3.3 (Figure 6b), where L min = 5 m which is greater than the imposed snow depth (layer 1). The structure within that layer is therefore not well resolved, as shown by the 95% credible intervals that span the majority of the prior range in Vs, hence the mismatch between the true and the modal posterior Geochemistry, Geophysics, Geosystems distribution of Vs in this layer. Put another way, in the snow layer the data are relatively uninformative and do not constrain the posterior much beyond what is already assumed in the prior. It is worth remarking however that the snow layer depth of 2 m, being less than the nominal resolution limit of 5 m from the surface waves, is well resolved by the additional GPR constraints.

Influence of Survey Design on MuLTI
In addition to the issues described above, there can also be resolution difficulties at large depths caused by cut-off at long λ (low frequencies), a realistic scenario when using dispersion curves derived from active source seismic data. If the signal-to-noise ratio is low, the dispersion curve can be difficult to pick out over the noise, and the low frequencies then unavoidably have a higher uncertainty.
We illustrate this effect by creating a final full synthetic waveform data set, designed to mimic as closely as possible the real data we acquired. As before, this is based on the same true model as in section 3.2, using the DWM (Bouchon & Aki, 1977). As for the usual processing, dispersion images are calculated from these waveforms created and dispersion curves picked from the images (Figure 8b). The DWM parameters used to calculate the synthetic waveform were chosen to be equal to the parameters used when acquiring our Norwegian data: 48 geophones with 2 m spacing. The maximum amplitudes of the frequencyphase velocity image are picked to create the Rayleigh wave dispersion curve. It is especially noteworthy that the picked dispersion curve no longer overlies the true modal lines (as for the examples in sections 3.3 and 3.4), due to the restriction to realistic survey parameters. In this synthetic example the lowest resolvable frequencies are limited to 14 Hz, where the dispersion curve becomes very wide (with large uncertainty) with no clear maxima to pick below this frequency (see Figure 8b). From this picked dispersion curve L min is 5 m and the maximum resolvable length scale L max (i.e., the maximum resolvable depth) is 49 m. The Rayleigh wave dispersion curve picked from the DWN generated synthetic waveform, along with an estimate of its width, were used as the data d and uncertainty σ(f) (section 2.3) in MuLTI. It is also noteworthy that the poor resolution of Figure 8b makes it difficult to uniquely identify the first and second order modes, motivating our methodological choice of defining a likelihood based on the nearest modal value.
With GPR constraints applied and using the same parameters as before, Figure 8b shows this DWM synthetic inversion. Within the resolvable depth interval, there is a high probability of the posterior Vs model (highlighted in red in the Vs plot) being very close to the true model. The Vs distribution scatter is however larger than the previous synthetic examples, due to the larger uncertainty caused by weak sensitivities of the observations to structures at 25 m depth, itself a result of the survey parameters used in this DWN synthetic. This example demonstrates MuLTI works well with dispersion curves picked from synthetic waveform data simulating a MASW data set, within the data's resolvable depths, and accurately reflects the sensitivities of the observations. Figure 9a shows the dispersion curve corresponding to the seismic data in Figure 4a, acquired on the glacier Midtdalsbreen. The dispersion curve from Figure 8b (derived from a synthetic model with a low velocity layer) and Figure 9a are visually comparable, suggesting that a low velocity zone under the glacier is plausible. From the picked dispersion curve, equation (9) suggests that the thinnest resolvable layer (L min ) in this real data set is 5 m, and the maximum resolvable depth (L max ) is 48 m. As elsewhere in this paper, we assumed 3-D effects were negligible: there were no surface objects on the glacier front and we modeled the subsurface as laterally homogeneous.

Application to the Midtdalsbreen Data Set
The same MuLTI inversion parameters were used for this real data example as the previous synthetic examples, and results are shown in Figures 9b-9d. The modal Vs within the ice layer is shown to be at the low end of the prior distribution, around 1,700 m/s. A low velocity zone is identified directly below the glacier (at 25 m depth), roughly 7 m thick, which could be due to unfrozen wet subglacial sediment. The high Vs zone at 32-40 m depth could be the hard bedrock boundary, underlying the sediment layer.
Lastly, Figure 9d shows a comparison of the best-fitting model to the data set, following comparable plots in previous sections. Although illustrative of data-fitting, it is possible that this end-member model, which is consistent with the prior, nevertheless shows some extreme physical properties. Figure 9d shows an increase with frequency of one of the modal velocity curves, whereas it might be expected that all such curves decrease with frequency. Physically, the best-fit model is largely comparable to the modal model of Figure 9b with the exception of a rapid adjustment at 35 m depth to a low-velocity half space (of about 700 m/s), which is consistent with the prior but physically unexpected and rather exotic.

Discussion and Conclusions
Many techniques have been recently established utilizing Rayleigh wave dispersion curves to infer the seismic structure of the subsurface. However, recurring problems in the inversion of Rayleigh wave dispersion curves are poor depth sensitivity, low resolution, and ambiguous, nonunique solutions. Here we have presented MuLTI, a novel tool for the inversion of Rayleigh wave dispersion curves with additional depth constraints drawn from any external data source.
MuLTI implements a Bayesian formulation using a Markov chain Monte Carlo approach to explore the dependence of Vs with depth, and rests on the assumption that subsurface interfaces correspond to colocated contrasts in both elastic and electromagnetic properties. It uses a new methodology to restrict the space of admissible subsurface models that are compatible with the observed Rayleigh wave data by adding fixed depth constraints, here using a GPR-derived layered structure. The uncertainty in the depth constraints applied is negligible (centimeter-accuracy) compared to the dispersion curve uncertainty (of meteraccuracy), motivating us to fix the internal interface depths. The constraining layered structure is implemented by narrowing the Vs bounds for each material layer defined from the GPR, however Vs variability within each layer is still permissible.
The Bayesian formulation within MuLTI is employed to produce a variety of diagnostic statistics of the ensemble used to assess the reliability of the solution for interpretation. The multiple different outputs available enables a variety of marginal posterior distributions to be examined, for example, the most likely Vs solution along with its uncertainty range, and the distribution of the number of nuclei. A potential criticism of our methodology is that we only invert for S-wave velocity while holding Vp and density constant throughout the velocity model. By deriving material-layer boundaries we are able to fix the Vp and density appropriately in each layer according to the material expected. This is an improvement from models that have no defined layers with Vp and density fixed as constants throughout the model space (Hayashi, 2012). However, a development of the algorithm would be, at increased computational cost, to also consider resolving Vp and density in each layer.
Using synthetic tests based on a glaciological snow-ice-sediment layered setting, we showed that the depth constrained inversion gives a marked improvement in accuracy (decreasing the absolute error between the best fitting and true model by a factor of 10) and depth resolution compared to inverting Rayleigh wave data in isolation. Based on the difference between the upper and lower 95% credible interval limits, the posterior distribution of Vs in the constrained inversion compared to the nonconstrained inversion is 2 times better resolved within the glaciological subsurface. We also demonstrate that including high frequency data (>100 Hz) causes ambiguities in the higher order modes associated with spatially fine scale structure, which overwhelms the natural parsimony of the Bayesian methods, producing solutions that are far from the true model. Therefore, it is important to take caution if using frequencies >100 Hz in any multimodal Rayleigh wave dispersion curve inversion.