Global‐Scale Full‐Waveform Ambient Noise Inversion

We present the first application of full‐waveform ambient noise inversion to observed correlation functions that jointly constrains 3‐D Earth structure and heterogeneous noise sources. For this, we model and interpret ambient noise correlations as recordings of correlation wavefields, which completely eliminates the limiting assumptions of Green's function retrieval, such as equipartitioning and homogeneous random noise sources. Our method accounts for seismic wave propagation physics in 3‐D heterogeneous and attenuating media and also for the heterogeneous and nonstationary nature of the ambient noise field. Designed as a proof of concept, the study considers long periods from 100 to 300 s, thus focusing on the Earth's hum. Treating correlations as self‐consistent observables allows us to make separate measurements on the causal and acausal branches of correlation functions, without any need to choose one of them or form the average. We validate our approach by assessing the quality of the obtained models and by comparing them to previous studies. This work is a step toward the establishment of full‐waveform ambient noise inversion as a tomographic technique with the goal to go beyond ambient noise tomography based on Green's function retrieval.


Introduction
Since Lobkis and Weaver (2001) demonstrated that recordings of a diffuse wavefield can be turned into Green's functions by computing correlation functions, studies of the Earth's ambient noise have evolved into their own field of research in seismology. At the beginning, the priority was on tomography using noise correlations (Sabra et al., 2005;Shapiro et al., 2005) and the theoretical foundation of the underlying principle of Green's function retrieval (Fichtner & Tsai, 2019;Sánchez-Sesma & Campillo, 2006;Wapenaar, 2004;Wapenaar & Fokkema, 2006;. It was established that correlation functions can be interpreted in terms of Green's functions when the ambient noise field is diffuse and equipartitioned or, equivalently, when it is excited by an isotropic distribution of monopolar and dipolar uncorrelated random sources. Although the nature of the ambient noise sources is interesting in itself, its investigation gained importance through the peculiarity of these assumptions. In fact, an overwhelming number of studies consistently show that they are not generally met and that the ambient noise field at any frequency is excited by a heterogeneous and nonstationary distribution of sources (Ardhuin et al., 2015(Ardhuin et al., , 2011Ermert et al., 2016Ermert et al., , 2017Gal et al., 2015Gal et al., , 2018Juretzek & Hadziioannou, 2016;Stehly et al., 2006). The reconstructed Green's functions are thus deteriorated, and information should only be extracted with caution (Delaney et al., 2017;Halliday & Curtis, 2008;Fichtner, 2014Fichtner, , 2015Kimman & Trampert, 2010;Snieder et al., 2006;Stehly & Boué, 2017;Tsai, 2009Tsai, , 2011. This problem largely prevents the application of full-waveform inversion techniques under the assumption of Green's function retrieval. Inspired by helioseismology (Duvall Jr et al., 1993;Gizon & Birch, 2002;Woodard, 1997), Tromp et al. (2010) introduced a new approach -in the following referred to as full-waveform ambient noise inversionthat acknowledges the fact that both the distribution of ambient noise sources and structure shape interstation correlations. Although Basini et al. (2013) followed with a comprehensive sensitivity analysis using observations, subsequent studies only used synthetic tests to improve our intuition for the physics behind correlation functions (Hanasoge, 2013(Hanasoge, , 2014Fichtner, 2014;Fichtner et al., 2017;Xu et al., 2019). Only recently, real data applications started to emerge, but they keep the structural model fixed and focus on the distribution of ambient noise sources (Datta et al., 2019; The development of full-waveform inversion techniques for earthquake and active-source recordings took a similar road, evolving from theoretical considerations (Bamberger et al., 1977(Bamberger et al., , 1979Lailly, 1983;Tarantola, 1988;Tromp et al., 2005) to real data applications (Igel et al., 1996;Chen et al., 2007;Fichtner et al., 2009;Tape et al., 2009Tape et al., , 2010. The fact that it advanced to an established tomographic method relatively quickly once the computational resources were available probably reflects that it could build upon a profound understanding of the physics, which was based on ray theory and its finite-frequency extensions (Dahlen et al., 2000;Friederich et al., 1993;Yomogida, 1992;Yoshizawa & Kennett, 2004). Although distributed sources are harder to grasp compared to point sources and we are still retraining our intuition of the physics, the evolution of full-waveform inversion makes us optimistic that full-waveform ambient noise inversion will continue on this road and develop into a standard tomographic technique in the future.
The following work is intended as a methodological proof of concept. We thus use relatively long periods and investigate a global data set focusing on the Earth's hum (Ardhuin et al., 2015;Nishida et al., 2000;Rhie & Romanowicz, 2006;Traer & Gerstoft, 2014). The long-wavelength structure of the Earth is generally well known, and both Nishida et al. (2009) andHaned et al. (2016) demonstrated that global tomography is possible with signals emerging in correlations of long-period ambient noise recordings. Furthermore, Nishida and Fukao (2007) and Ermert et al. (2016Ermert et al. ( , 2017 investigated the source distribution of the hum based on the inversion of correlation functions. A comparison of the source and structure components of our joint inversion to previous results is thus possible. An additional benefit of global-scale inversion is the absence of unphysical boundaries that need to be handled in regional-scale numerical simulations. The principal purposes of this work are (1) to demonstrate the technical feasibility of joint full-waveform inversion for noise sources and Earth structure, (2) to develop the necessary methods for numerical modeling and data management, and (3) to prepare the ground for future applications at higher frequencies, smaller spatial scales (reservoir to regional scale), and shorter time intervals (monitoring with variable noise sources over hours or days).
This manuscript is structured in the following way. We first describe the forward and inverse modeling theory in section 2 and list details of the data compilation and the inversion framework in sections 3 and 4. The inversion results are presented and compared to existing source and Earth models in section 5, followed by a discussion of potential next steps and conclusions in sections 6 and 7.

Theory
Not invoking Green's function retrieval and interpreting cross-correlation functions as self-consistent observables instead is not yet an established method. We thus briefly summarize the theory in the following. For simplicity, we present it in the frequency domain and omit dependencies on the angular frequency wherever possible. Further details and expressions in the time domain can be found in Ermert et al. (2017), , , and Tromp et al. (2010). We only consider recordings of the vertical displacement. Xu et al. (2019) and Wang et al. (2019) specifically consider radialand transverse-component correlations, and Paitz et al. (2018) provide a framework for different physical quantities.

Forward Modeling
The frequency-domain vertical displacement u z of the ambient noise field at position x 1 is given by where G z,z (x 1 , , ) indicates the vertical component of the Green's function at x 1 due to an impulse in time acting at in the vertical direction. The quantity N z ( , ) is the vertical component of the force as a function of position which excites the wavefield. We assume that vertical forcing at the ocean bottom ⊙ is sufficient to explain vertical noise recordings for periods between 13 and 300 s and that horizontal forcings can be neglected. Computing the cross-correlation function C zz between this recording and another, measured at where * denotes complex conjugation or, equivalently, time reversal. The term S zz ( , ′ ) = N z ( ) N * z ( ′ ) denotes the power spectral density distribution of the sources, and it captures the magnitude of the excitation of the ambient noise in space and frequency. Making the common assumption that the spatial correlation length of noise sources is much smaller than the seismic wavelength, several hundred kilometers in our case, effectively reduces S zz ( , ′ ) to the spatial autocorrelation S zz ( , ′ ) = S zz ( ) ( − ′ ) (Wapenaar, 2004;Wapenaar & Fokkema, 2006). This simplifies equation (2) to A nonzero correlation length for neighboring noise sources acts as a low-pass filter, which might render its accurate representation unnecessary. In turn, however, this potentially inhibits the inference of the noise source correlation lengths from seismic correlation functions. Interactions between spatially separated sources are thought to be reduced by averaging correlation functions over time (Bensen et al., 2007).
In this study, we focus on a narrow frequency band. This effectively allows us to partition the spatial and frequency dependence of the noise source power spectral density as S zz ( , ) = s( ) zz ( ).
Realizing that the part in square brackets in equation (3) plays the role of the forcing term in equation (1) allows us to interpret a correlation function as a recording of a correlation wavefield. In contrast to equation (1), however, the numerical evaluation of equation (3) is more involved and requires the following steps: 1. Numerically compute Green's function G z,z (x 2 , ) using source-receiver reciprocity, that is, inject the source at the location x 2 , and save the vertical displacement field where the power spectral density distribution is assumed to be nonzero. We also refer to the output as the generating wavefield in the following. 2. Combine the complex conjugate, or equivalently the time-reversed version of the filtered Green's function s( ) G * z,z (x 2 , ), with the power spectral density distribution  zz ( ). 3. Inject G * z,z (x 2 , )s( ) zz ( ) distributed on ⊙ into a numerical wave propagation solver and sample the resulting correlation wavefield where the other receivers are located, in this case at x 1 .
Compared to direct computations of the ambient noise field (equation (1)), the simulation of correlation functions neither requires accurate knowledge of the phase of the noise force N z ( ) nor multiple long time series realizations to approximate its statistical properties. Instead, correlation functions account for the magnitude of the excitation in terms of the power spectral density distribution. Our forward model is thus free of assumptions regarding diffuse and equipartitioned ambient noise fields invoked in derivations of Green's function retrieval . A movie illustrating step 3 of the forward modeling recipe is part of the supporting information.

Inverse Modeling
The numerical recipe above allows us to forward model synthetic correlation functions, given an Earth model and a power spectral density distribution. Conversely, we may infer both by comparing synthetic and observed correlation functions. Since the misfit functionals used to quantify the discrepancies between observations and synthetics depend nonlinearly on Earth structure and sources, we use iterative optimization algorithms. To steer the minimization, we derive efficient expressions for first derivatives based on adjoint techniques. We only present the final expressions for source and structure kernels and refer the reader to Fichtner et al. (2017), Tromp et al. (2010),  for details.

Source Kernels
To update the spatial component of the power spectral density distribution, we are interested in how changes in  zz (x) affect the misfit. In principle, we could go through the discretized model, perturb each grid point, and then simulate synthetic correlation functions. By evaluating the resulting misfit change, we could map the first-order change of the misfit functional with respect to  zz (x), also referred to as a source kernel. With an expensive forward model or a large model space, however, this brute force finite-differencing approach 10.1029/2019JB018644 is not feasible. We thus apply adjoint techniques, which gives us the same result at the cost of only one additional simulation, for the adjoint wavefield u † (x). The source kernel can then be composed according to In our case, the adjoint operator is also a wave equation (Fichtner, 2010) and we compute the adjoint wavefield with the same numerical solver by injecting adjoint source time functions, which depend on the deployed misfit functionals, at the receiver locations, that is, here at x 1 . From step 1 of the forward modeling recipe, we already have a stored version of Green's function G z,z (x 2 , x), and building a source kernel is thus relatively straightforward. An example is shown in Figure S1a in the supporting information. Source kernels by themselves contain valuable information on our ability to recover noise sources, and they provide insight into the physics of correlation wavefields ) that helps us to solve an ill-posed inverse problem in a meaningful way.

Structure Kernels
The Earth model, described by the vector m(x) containing all material parameters, influences both Green's functions in equation (3). Following the same logic as above, we again apply adjoint techniques. Since the correlation function is the result of two wavefield simulations (steps 1 and 3 in the recipe), the assembly of the structure kernels requires two adjoint simulations, subject to the same adjoint operator as in section 2.2.1 for the source kernel. The final expression is given by where (•) and C † (x) denote the forward modeling operator for a wavefield (solution of the seismic wave equation) and the adjoint correlation wavefield, respectively. The latter is explicitly defined by with u † z again being the vertical component of the regular adjoint field, emanating from a receiver location. For the elastic case, the wavefield modeling operator (•) can be written as with the density (x) and the stress tensor (x). According to equation (5), the density kernel can be obtained by For a comprehensive collection of expressions for sensitivities to different parameters for different physics, we refer the reader to chapter 9 in Fichtner (2010). An example for a structure kernel for a homogeneous distribution of ambient noise sources is shown in Figure S1b.

Notes on Implementation
Both the forward and inverse problem are based on the spectral element solver Salvus (Afanasiev et al., 2019). We thus operate in the time domain, and we can account for 3-D heterogeneous and attenuating media, as well as for topography and internal discontinuities. Noise sources are implemented as a superposition of different spectral basis functions and their spatial distributions. As mentioned above, we confine ourselves for this study to one basis function and one distribution. In order to keep track of the various steps required for the forward and inverse problem, we store simulations and their type in a database. This allows us to reuse existing simulations, for example, the generating wavefields for source inversions. This partly mimics the approach taken by Ermert et al. (2017) and Xu et al. (2019).

Data and Processing
We use the data set prepared by Ermert et al. (2016). Owing to the relatively long periods of the Earth's hum and its low amplitude, only data from STS-1 broadband seismometers were considered. Continuous recordings for 146 stations between 2004 and 2013 were retrieved from the IRIS data management center (ds.iris.edu/mda/_STS-1). A station map is shown in Figure 1. Cross-correlations were computed for ∼9 hr windows and then stacked, where each correlation function was normalized by the energy of both traces to downweight the contribution of large-amplitude (transient) signals. Since our forward model defined in equations (2) and (3) does not account for nonlinear processing, we specifically avoid further processing steps, which are typically applied in ambient noise studies (Bensen et al., 2007;Schimmel et al., 2011). In the following, we study the Northern Hemisphere winter, and so all Decembers, Januaries, and Februaries of the 10 years were stacked in order to increase the signal-to-noise ratio. The correlations were prefiltered with a passband between 20 and 500 s. For the inversion, we further narrow the frequency content and apply an additional zero-phase filter to focus on the period band between 100 and 300 s. Correlations are likely to be more contaminated by earthquakes for periods smaller than 100 s, and a more elaborate processing scheme would be necessary for periods longer than 300 s, for example, including barometric corrections (Bormann, 2012). Furthermore, a broader frequency range might require modeling a frequency-dependent spatial distribution of the sources. For a more detailed description of the data compilation and processing, we refer the reader to Ermert et al. (2016).

Initial Model, Misfit Functionals, and Optimization Scheme
This study primarily serves as a proof of concept, and we thus start with a model that is as unbiased as possible. The initial structure model is composed of PREM, including anisotropy and attenuation (Dziewoński & Anderson, 1981). It is parameterized in terms of vertically and horizontally propagating/polarized P and S velocity (v pv , v ph , v sv , v sh ), density , shear and bulk attenuation Q and Q , and the dimensionless parameter , which controls the incidence angle-dependent speed of seismic waves. We only invert for v pv , v ph , v sv , v sh , and density . Due to the coverage, the considered period band, and the extraction of mostly Rayleigh waves, we only interpret the model for v sv in the following. The remaining parameters are part of the model made available publicly (http://doi.org/10.5281/zenodo.3564935). Instead of restricting noise sources to the oceans, we allow for a nonzero power spectral density anywhere at the Earth's surface and take a homogeneous distribution as starting point. The source inversion is based on the asymmetry of correlation functions, quantified in terms of the logarithmic signal energy ratio , and for the structure inversion cross-correlation time shifts (Dahlen et al., 2000;Luo & Schuster, 1991) are measured on the surface wave packages on the causal and the acausal branches separately. An example of the measurement procedure is presented in Figure 2. The gradients drive a limited-memory BFGS method (Nocedal & Wright, 2006), and we apply a diffusion-based smoothing operator to the gradients, where the smoothing length is based on the dominant wavelength. We further impose the physical constraint that power spectral densities are equal to or larger than zero. Ermert et al. (2017) estimated the data error by comparing asymmetry measurements from correlations between quasi colocated stations of the Canadian Yellowknife array and all other stations. The maximum  Figure S1) for the period band between 100 and 300 s. The windows for the surface wave arrivals are indicated by gray boxes. The cross-correlation time shifts reveal that the synthetics are too slow, −8.0 and 6.5 s on the acausal and the causal branches, respectively (note the different sign convention for both branches). As expected and quantified by the asymmetry measurement, the synthetic correlation function is symmetric for a homogeneous distribution of noise sources. The observed correlation function, however, is asymmetric indicating stronger noise sources behind station CN.DRLN in the Atlantic Ocean.

Quality Control
distance between the stations of the array is 10 km and consequently much less than the seismic wavelength used in this study, which is on the order of 1,000 km. The correlation between individual recordings of the array and a common reference station should yield similar values for the asymmetry. Differences are expected due to instrumental noise and site-specific effects. For the considered period band, the data variance is relatively high and we estimate that the asymmetry misfit should only be reduced by ∼30%. In contrast, phase measurements seem relatively stable (Figure 2 in Ermert et al., 2017) and the variance is smaller than the misfit residual expected for the final model. Due to the relatively high level of incoherent noise in the data set, we apply different quality criteria to ensure extraction of meaningful information. We only consider correlation functions with a minimum number of 200 stacked windows. For the asymmetry calculation, we require a signal-to-noise ratio of 2 on either the causal or the acausal branch. Time shifts larger than half a dominant period are rejected, and we demand a cross-correlation coefficient of at least 0.6 between the windowed synthetics and observations. The windows for the source and structure inversions are 1,200 s wide and centered on the arrival of a Rayleigh wave with a group velocity of 3,700 m/s. Measurements with overlapping windows are discarded. Ermert et al. (2017) applied a similar strategy for the definition of the windows and reported satisfactory results for this simplistic approach.
Considering all the different quality criteria, we measure the asymmetry of 2,732 correlations and 1,658 time shifts. In 415 cases, traveltime information can be extracted on both the causal and the acausal branch of the same correlation function. Since the distribution of ambient noise sources is not homogeneous, differences in both time shifts are expected. The asymmetry in time contains valuable information on the noise source distribution (Stehly et al., 2007), and its usage is currently under investigation.

Measures to Reduce Computational Requirements
To model correlation functions with an accuracy that is sufficient for our data, we empirically found that a generating wavefield, that is, an acausal branch, of ∼6 hr duration is required. To limit the computational cost of these simulations, we take the following measures: 1. Only 50 receivers act as reference stations, often also referred to as virtual sources, while trying to ensure a good coverage. Virtual sources, receivers, and the respective ray coverage are shown in Figures 1 and 3. 2. If it is possible to define a maximum time for the latest measurement window, the simulation time can be adjusted accordingly . In our case, the data quality only allows us to make measurements on the minor-arc surface wave. We thus limit the causal branch to 7,000 s. 3. The shallowest discontinuity in PREM is at 15 km depth, which leads to small elements in the computational mesh and consequently to a small time step. Since we work at long periods above 100 s, we can remove the 15 km discontinuity and extend the first layer to the next discontinuity at 24.4 km depth. The resulting numerical error is negligible compared to the observational uncertainties in the data. While the latter constitutes the well-known case, where a single ray connects a station pair along the great-circle minor arc, the former is fundamentally different and rays start at each station following the great-circle major arc. Their lengths are limited to 10,000 km to mimic attenuation. In total, we obtain 2, 732 · 2 = 5, 464 rays for the source coverage and 1, 658 − 415 = 1, 243 rays (in 415 cases traveltimes could be extracted on both the causal and acausal branches) for the structure inversion.
4. The computational domain is reduced to a spherical shell with 1,000 km depth, which is possible because the correlations are dominated by fundamental surface waves. For a maximum period of 300 s, they are hardly sensitive to structure deeper than 700 km (Boschi et al., 2009). We do not observe significant changes in the simulated correlation waveforms by increasing the depth of the spherical shell, suggesting that 1,000 km is a reasonable choice. 5. We only store the generating wavefield where the power spectral density  zz (x) is a priori allowed to be nonzero, that is, at the surface of the Earth. This reduces storage requirements significantly. 6. As discussed in section 2, we approximate the power spectral density distribution S zz (x, ) for the period band between 100 and 300 s with one (Gaussian) spectral basis function s( ) and one corresponding spatial distribution  zz (x). Green's function in step 1 of the forward modeling recipe in section 2 is thus already modeled for the appropriate frequency range, which simplifies the convolution in step 2 and the assembly of the source kernel significantly. 7. Although feasible in principle and also employed in the synthetic study of , we do not perform a joint inversion by updating the source and structure models simultaneously. Instead, we update the models sequentially, which eliminates the difficult choice of an appropriate weight for the different misfit functionals. (A poorly chosen weighting leads to a slow convergence rate.) The sequential approach taken here is reasonable because the asymmetry measure used to infer noise sources is nearly insensitive to unmodeled Earth structure .

Estimation of Resolution
In order to estimate the potential resolution of the final models, we plot the ray density for both the source and the structure inversions in Figure 3, where we account for the respective quality control. Although finite-frequency effects exploited in full-waveform ambient noise inversion are neglected, ray coverage still provides a useful proxy. The approach in Figure 3a, for the noise source inversion, is inspired by the ray-based source imaging method introduced by Ermert et al. (2016). Since the noise source sensitivity for the asymmetry measurement is mostly located behind both stations (see Figure S1a), we plot rays along the great-circle major arc starting from each receiver. To imitate attenuation, we restrict the length of the rays to 10,000 km. Figure 3a reveals a relatively even coverage of the entire globe. The concentration of stations on the continents of the Northern Hemisphere leads to two broad bands of rays connecting Europe-South America-South Pacific-Asia-Europe and South America-North America-Asia-Indian Ocean-Atlantic Ocean-South America. The center of the Pacific Ocean and the southern part of the Atlantic Ocean is covered less but still shows a significant crossing of rays.
In the ray density plot of Figure 3b for the Earth structure inversion, the station pairs are connected by rays on the great-circle minor arc, which mimics the sensitivity to structure (see Figure S1b). The resulting coverage is significantly different with a higher density in the Northern Hemisphere, especially on the continents, compared to the Southern Hemisphere. We will thus mostly focus on the Northern Hemisphere for the interpretation of the final velocity model. A noteworthy feature is the relatively good coverage in North Asia, a region that is often problematic in tomographic studies due to the lack of earthquakes.

Results and Link to Existing Models
In total, we perform six iterations for the distribution of the ambient noise sources and eight iterations for 3-D velocity structure. The order of the different updates is chosen such that we can reuse simulations of Green's functions several times for the source inversion, while still providing insight into the interdependence of the different updates. The final order and the evolution of the misfits is plotted in Figure 4. The traveltime misfit is reduced by ∼55%, and further updates would, in principle, be possible. We observe, however, that artifacts caused by the simplistic initial model start to dominate further updates. For the source inversion, we reach the error level estimated for the asymmetry measurement after six iterations. This translates to a final misfit reduction of ∼30%. Each sequence of source updates has a minor influence on the following sequence of structure updates and vice versa. The related misfit changes are on the order of ∼1% (dashed lines in Figure 4). In addition, we present the evolution of the misfit in terms of histograms and examples of observed and synthetic waveforms in Figure 5.  . We focus the waveform plots on the surface wave signals on the causal and the acausal branches and remove the part in between (indicated with dots). Values for asymmetry differences and time shifts are indicated in the waveform plots, in the center or on the corresponding branches. Red and blue correspond to the initial and final models, respectively, and observed correlations are plotted in black.

Source Model
The final model for the distribution of the power spectral density is shown in Figure 6a. Smearing effects along the two broadbands of rays described above that connect Europe and South America with Asia are visible, but do not dominate the features that we interpret in the following. The interpretation is not straightforward due to the two-sided nature of the sensitivity kernels of the asymmetry measurement. We thus mostly focus on the anomalies with high power spectral density and avoid interpreting regions with magnitudes lower than average. The strongest sources are imaged in the oceans on the Northern Hemisphere, that is, along the coasts of the northern Pacific and in the Northeast Atlantic. We also observe strong contributions in the Coral and Tasman Sea, in the southern parts of the Pacific, Atlantic, and Indian Oceans. In general, sources on continents are weak, except for an anomaly in North Asia.
Most of the large-scale features observed in Figure 6a can be linked to the result obtained by Ermert et al. (2017) (see Figure 6b). They use a precomputed database of Green's functions computed for the 3-D Figure 6. Power spectral density (PSD) distribution of the Earth's hum in Northern Hemisphere winter obtain in (a) this study and (b) by Ermert et al. (2017). The values are normalized, and the color bars are clipped such that the mean values of the models have a similar color. The maps show relative PSDs, since the absolute values cannot be obtained from this particular data set due to processing requirements and the employed measurement type. Yellow regions thus indicate where the Earth's hum is predominantly generated. mantle model S40RTS (Ritsema et al., 2011) combined with the crustal model Crust2.0 (Bassin et al., 2000) to calculate correlation functions and source sensitivity kernels. The anomalies in our study are more confined in space, and more regions have zero power spectral density, which are probably consequences of reducing the misfit more, that is, by ∼30% instead of ∼18% in Ermert et al. (2017). This illustrates the benefit of choosing limited-memory BFGS over steepest descent, because a total of six iterations has been used in both cases. Major differences in our inversion result are weaker sources along the coast of Chile, stronger contributions in the southern Atlantic and in North Asia. All in all, we argue that our model supports the interpretation by Ermert et al. (2017) and their conclusions drawn regarding the excitation of the Earth's hum. We refer the reader to their publication for a detailed interpretation and a thorough comparison to previous studies (Bromirski & Gerstoft, 2009;Nishida & Fukao, 2007;Rhie & Romanowicz, 2004Traer et al., 2012). The effect of a different structure model on a source inversion is presented in Figure S2 in the supporting information.

Tomographic Model
The final v sv model from the inversion of correlation functions for a period band between 100 and 300 s is presented in Figure 7a. It is important to keep in mind that we only use ∼1,600 time shift measurements, 10.1029/2019JB018644 Figure 7. Comparison of (a) the final inversion result for v sv and (b) S20RTS plotted at 100 km depth. Both are presented in deviations from the mean. S20RTS was downloaded from Submachine (Hosseini et al., 2018).
In order to facilitate the interpretation and the identification of structural features, we plot S20RTS (Ritsema et al., 1999) in Figure 7b and a comparison to the Collaborative Seismic Earth Model, CSEM (Afanasiev et al., 2016;Fichtner et al., 2018) in Figure 8. The depth of 100 km for the comparison could be chosen differently to better match the depth sensitivity of Rayleigh waves (Boschi et al., 2009), but we expect a strong imprint of the crust and upper mantle, since we do not account for its variability in our 1-D starting model. The vertical resolution is relatively poor in our model, since we mostly exploit long-period Rayleigh waves and we applied a relatively strong radial smoothing. We therefore restrain ourselves from comparisons at various depths. A higher magnitude of the deviations from the mean in S20RTS and in CSEM can be explained by the exploitation of higher frequencies, a different inversion approach in S20RTS and more iterations for the regional refinements in CSEM.
Considering the wavelengths of several hundreds of kilometers and the relatively small number of time shift measurements in this study, we observe good agreement of the tomographic model with S20RTS. The cratons are faster on average, that is, eastern North America, central and western Australia, the eastern part of Eurasia, eastern South America, and western and southern Africa. We image a slow Pacific belt, the African rift system, and the northern part of the Mid-Atlantic ridge. Strong undulations of the crust caused by mountain ranges are imaged as slow anomalies, for example, the Himalayas, the Andes, and potentially also the Urals. Nishida et al. (2009) estimated a crustal correction for the considered period band, and we observe a strong correlation with the features described above. These anomalies may have been avoided by adding a crustal model to the initial model, but now also serve as criteria for a proof of concept. As expected by the ray coverage (Figure 3), the resolution in the Northern Hemisphere is significantly better compared to the Southern Hemisphere. The comparison to CSEM in Figure 8 thus focuses on the Northern Hemisphere. We observe similar anomalies where CSEM is not updated since it starts from S20RTS and regional refinements appear as smooth versions in our model.

Discussion
In the previous sections, we presented the first full-waveform ambient noise inversion that jointly constrains ambient noise sources and 3-D Earth structure, without any of the limiting assumptions of Green's function retrieval. The primary goal of this study is a methodological/technological proof of concept for full-waveform ambient noise inversion. In the following, we discuss advantages of the method and also current limitations that may, for instance, be related to computational simplifications.

Principal Advantages
The motivating advantages of full-waveform ambient noise inversion are (1) to overcome the requirements of Green's function retrieval that are not satisfied on Earth and (2) the related exploitation of complete correlation waveforms, including waveform features that are not part of the interstation Green's function. The latter is, as in deterministic source full-waveform inversion, beneficial for the improvement of tomographic resolution.
By design, our method indeed does not rely on Green's function retrieval, meaning that no assumptions on wavefield equipartitioning or the nature of the noise sources are required. As a consequence, all possible artifacts, such as amplitude and traveltime biases, that may result from the violation of these assumptions are automatically avoided.
In principle, full-waveform ambient noise inversion allows us, as the name suggests, to exploit complete correlation waveforms for the benefit of improved resolution. Still, we currently limit ourselves to time shift measurements on fundamental mode surface waves. This is mostly because other measurements would produce a stronger coupling between the structural and source parts of the joint inversion. Though this is not a problem in itself, it would likely require a larger number of iterations, and also a more careful analysis of source-structure trade-offs, which we wanted to avoid for this first technological feasibility study. Yet our method already allows us to measure time shifts on the causal and acausal correlation branches separately. This not only increases the amount of data but also avoids having to make a choice between the two or taking the average.
Besides already discussed shortcomings of traditional ambient noise tomography, its applicability is also limited in terms of the strength of source and structure heterogeneities. While strong variations of the ambient noise sources increase their associated bias, ray-based methods break down for large velocity contrasts. Full-waveform techniques are then the preferred approach. Suitable initial models for structure are typically derived from traveltime tomography (Fichtner, 2010;Modrak & Tromp, 2016), and a similar approach for source inversions was developed by Ermert et al. (2016).

Numerical Simplifications
The tomographic result should be interpreted as an effective model. We extend the first layer in PREM and neglect ocean loading and variations of the Moho. Crustal effects are consequently mapped into our structure model at various depths. In addition, we do not include topography, ellipticity, rotation, and self-gravitation. Full-waveform inversions are particularly affected by topographic fluctuations on the order of the minimum seismic wavelengths (Nuber et al., 2016), which we do not expect in our case. Rotation is of small concern, since it has a small effect on amplitudes and affects acausal and causal times in the same manner. Ellipticity and self-gravitation, however, should be accounted for in the future (Komatitsch & Tromp, 2002).

Prior Information on Noise Sources
Regarding the source inversion, introducing an additional constraint that only allows noise sources in the oceans should be taken into consideration. In general, the contribution on continents is relatively small in our model, except for an anomaly in North Asia. Performing the inversion without this constraint facilitates the comparison to Ermert et al. (2017) and further tests the hypothesis of atmospheric contributions to the Earth's hum (Nishida & Kobayashi, 1999;Nishida, 2014;Tanimoto & Um, 1999). Potential sources at arbitrary depths could have been easily included with our implementation. However, since there is currently no observational evidence for significant noise sources at depth, we limit ourselves to surface sources.

Source-Structure Trade-Offs
The interpretation of specific features, for example, the noise source anomaly in North Asia, would benefit from a resolution analysis that captures intraparameter and interparameter trade-offs. Hessian-vector products are, for instance, used in conventional full-waveform inversions (Fichtner & Trampert, 2011;Hagen et al., 2018;Pan et al., 2019) and have recently been derived for correlation functions . Source-structure trade-offs are expected to be small in our inversions thanks to the long periods and the misfit functionals that effectively decouple the joint inversion into inversions for noise sources and Earth structure. The specific choice of the misfit functions is the result of extensive forward and inverse modeling in . We found that traveltime shifts are primarily sensitive to velocity anomalies and are hardly affected by heterogeneous noise source distributions, which is to be expected (Froment et al., 2010;Tsai, 2009;Weaver et al., 2009). In contrast, the asymmetry measurement is particularly suited for source inversions, since it is mostly influenced by heterogeneous noise source distributions and is -at least to first order-insensitive to Earth structure. This is also confirmed by Figure S2 in the supporting information that compares two source inversions for two different Earth models. The results are hardly affected by the choice of the structural model, which indicates small source-structure trade-offs. A quantitative resolution analysis would provide further insights and train our understanding of the underlying physics, but is due to technical and computational limitations subject of ongoing research and as such beyond the scope of this methodological study.

Effects of Data Processing
The forward model defined in equations (2) and (3) does not incorporate nonlinear processing.  propose an approach based on transfer functions that map raw to processed correlation functions.
Applying the transfer functions also to the synthetics offers a possibility to account for nonlinear processing. An alternative approach is to keep the processing as linear as possible. Bowden et al. (2015Bowden et al. ( , 2017, for instance, increase the signal-to-noise ratio of their data set by using a representative spectrum for spectral whitening.

Updating Global Multiscale Earth Structure
Full-waveform ambient noise inversion provides a more consistent framework for the combination of transient and continuous sources. The principle of Green's function retrieval transforms continuous and quasi-random signals into transient deterministic signals, but ignores the heterogeneous and nonstationary nature of the ambient noise sources. Describing reality more accurately and quantifying inevitable source-structure trade-offs have the potential to connect correlations more consistently with other data sets.
With this successful proof of concept, we follow this line of thought and take the opportunity to contribute to the CSEM (Afanasiev et al., 2016;Fichtner et al., 2018). The first-generation CSEM contains 12 regional refinements built from waveform data with varying minimum period. One global inversion was performed to ensure the consistency of the different subregions. Similarly, we use the existing CSEM model and supply a global update using the data set and framework described above. The update is presented in Figure S3 in the supporting information.

Future Work and Potentials
Following a successful proof of concept, the question of what comes next naturally arises.
First and foremost, it is evident that a study at long periods with few reference receivers (virtual sources) can only be the starting point. However, successively incorporating shorter periods and more stations is primarily a matter of more bookkeeping and increased computational requirements.
Much of the success of full-waveform inversion in general results from the ability to use complete waveforms. Though we could already make independent time shift measurements on the causal and acausal branches, this potential has remained rather unexploited in this study. In particular, the extraction of spurious arrivals, which we are not able to reproduce with the inferred source distribution, is essential to improve the resolution capabilities of the source inversion . Going in this direction is also technically simple, but the inversion and its results would need to be complemented by a more thorough trade-off analysis, which is possible thanks to the development of second-order adjoints for noise correlations .
Over long time intervals, at long periods, and at the global scale, ambient noise sources appear relatively homogeneous. Therefore, we expect the benefits of this method to be more pronounced and practically relevant for higher frequencies and at regional scale. While theoretically straightforward, a regional-scale application would need to deal with ambient noise sources that are located outside a computational domain that must be finite to limit computational requirements. This difficulty is one of the main reasons why this proof of concept has been performed at the global scale.
Finally, full-waveform ambient noise inversion offers the potential to constrain ambient noise sources as a function of time. This suggests that time-variable noise sources may naturally be taken into account in noise-based seismic monitoring studies.

Conclusions
This study constitutes a successful proof of concept for full-waveform ambient noise inversion. It is as such the first application of the method to observed correlation functions that jointly constrains 3-D Earth structure and heterogeneous noise source distributions. The method is, by design, free from the limiting conditions of Green's function retrieval and in principle enables the exploitation of complete waveforms, even when they appear "spurious", that is, when they are not part of the correct Green's function.
The hum source model is consistent with earlier work by Ermert et al. (2017), and we identify tomographic features that are in accord with global seismic velocity models at length scales of several thousand kilometers. Full-waveform ambient noise inversion can be used as a tomographic method by itself and has the potential to provide complementary information in regions with poor earthquake coverage. Invoking the principle of Green's function retrieval is not necessary for the interpretation of correlation functions, and we are on the way to establish them as self-consistent observables.