Direct Inversion of S‐P Differential Arrival Times for 
 VpVs Ratio in SE Asia

Southeast Asia lies within one of the most complex tectonic settings on Earth and exhibits a range of features, including strongly curved subduction zones, arc‐continent collision, and slab break‐off, which are not well understood. To help gain insight into mantle structure and processes beneath this region, we perform an inversion for variations in Vp, Vs, and VpVs structure using arrival time information from the ISC‐EHB catalog. The oceanic lithosphere subducting beneath Java is imaged as a positive dVp and negative d(Vp/Vs) anomaly. At 200 km depth, the forearc mantle beneath Sumatra and Java is revealed by positive dVp and d(Vp/Vs) anomalies which cease at Sumba island, where negative d(Vp/Vs) anomalies mark the presence of cold Australian lithosphere (down to 200–250 km depth) which is colliding with Sundaland. These negative d(Vp/Vs) anomalies depict a ∼WE trending structure that appears to correspond with the underthrusting of Australian continental crust. One notable salient has a location and shape which appears to coincide with those of ancient terranes or a Gondwana‐related microcontinent reconstructed by paleogeographic studies and may have been entrained in the subduction process. The velocity and d(Vp/Vs) patterns beneath the Banda Arc support the existence of a single curved subducting slab associated with rollback. The extreme extensional strike‐slip setting in Seram produces the highest positive d(Vp/Vs) anomalies in the model which may be due to one or more of widespread serpentinization, high concentrations of intraslab fluid‐filled faulting, and mantle upwelling.


Introduction
The current configuration of landmasses which make up Southeast (SE) Asia is an assemblage of Tethyan sutures and Gondwana fragments (Sundaland) combined with island arcs. These structures were intertwined by the thousands of kilometers of oceanic lithosphere subduction that was precipitated by the closure of the Tethys, northward migration of the Australian-Indian plate, and westward convergence of the Philippine Sea plate from the Mesozoic to the Cenozoic (Hall, 2011(Hall, , 2017Hall & Spakman, 2015;. Today, the primary tectonic plates that underpin SE Asia are the Australian-Indian, Philippine Sea, and Eurasian plates (see Figure 1). Despite our broad understanding of SE Asian assemblage, several outstanding questions regarding its tectonic evolution still remain, including where and how collision and subduction interact across the Sunda and Banda arcs; whether tears or holes are present in subducting slabs (Hall & Spakman, 2015), for example, due to the high stress exerted by the rapidly moving Australian plate (Debayle et al., 2005); how and where the Australian plate interacts and sutures with SE Asian lithosphere (Bird, 2003); and if preexisting forearc terranes consisting of various metamorphic/volcanic rocks and Gondwana-related microcontinents influence the formation and evolution of subduction zones along the northern margin of Australia (Hall, 2017;Porritt et al., 2016).
Seismic tomography has been successful at imaging the seismic velocity structure of subducting slabs in Southeast Asia using global (e.g., Hall & Spakman, 2015) and regional (e.g., Zenonos et al., 2019) body wave arrival times. The seismic velocity of a material depends on its elastic properties and density, which in turn are influenced by variations in temperature, pressure, and composition (Cammarano et al., 2003). Consequently, P wave and S wave velocity models can provide useful insight into these material parameters, but in the absence of detailed prior constraints, a direct conversion is highly nonunique (Afonso et al., 2016). It has been demonstrated that well-constrained estimates of V p V s ratio provide additional insight, particularly into

RESEARCH ARTICLE
the presence of partial melt and fluids, beyond what can be obtained from P wave and S wave velocity alone (Bass, 1995;Duffy & Anderson, 1989;Goes et al., 2000;Vacher et al., 1998). Interpretations can occasionally be straightforward, at least at the regional scale: Low V p V s anomalies associated with high V p and V s appear consistent with the underside of subducting plates, while high V p V s and low V p and V s often correlate with a fluid/melt-rich mantle wedge (Reyners et al., 2006;Zhao et al., 2010). Other observations are less straightforward: A typical example is low V p V s associated with average-to-high absolute velocities, caused by seismic anisotropy (Hacker & Abers, 2012), while an increase in orthopyroxene concentration along the upper subducting slab (Wagner et al., 2008) may be associated with high velocities and low V p V s . This interpretation becomes particularly complex in the presence of ancient Gondwana blocks, such as those found across SE Asia (Hall, 2017), whose structure could cause low V p V s anomalies similar to those observed in continental cratons (Bastow et al., 2015;James et al., 2004). Section 2 discusses the different interpretations of velocity ratio variations at the regional scale.
In SE Asia, regional-scale studies of upper mantle seismic structure have largely used P wave velocity models and focused on subduction along the Sunda and Banda arcs. Global tomography models demonstrate that subducted slabs are characterized by anomalously high velocities, which likely have a dominantly thermal origin (e.g., Hall & Spakman, 2015). Bijwaard et al. (1998) and Ritsema et al. (2011) find low velocities in the upper mantle beneath Sundaland that are likely caused by elevated temperatures coupled with possible mantle composition changes and elevated volatile content, where wedge hydration from the slab, decompression melting, and slab-sediment melting all contribute to magmatism. The low upper-mantle velocities and elevated heat flow observations suggest the presence of a thin and weak lithosphere that extends hundreds of kilometers from volcanic arcs (Hyndman et al., 2005). These observations are associated with abundant evidence of magmatism during the Late Cretaceous to Eocene across the Sunda and Banda arcs (Hall, 2012). Widespread granite magmatism across southern Sundaland in the Cretaceous is not arranged in linear belts, which in turn suggests a period of protracted melting across the region rather than subduction-related plutonism (Conder & Wiens, 2006). The SE Asian volcanic arcs were active from the Cretaceous to Pliocene and are associated with gold mineralization in andesitic arcs (Garwin et al., 2005); the Indonesian arcs alone are estimated to host in excess of 2,500 tonnes of gold and 20 million tonnes of copper (Carlile & Mitchell, 1994). The intensive search for gold and copper in Indonesia has demonstrated a close association of magmatic arcs with mineralization, thus providing insight into the tectonic setting in which mineralization takes place (van Leeuwen, 1994).
Seismic studies have also focused on the shift from oceanic subduction beneath Sunda and Java to continental collision/subduction under the Banda Sea. Extreme subduction rollback of a single folded slab has been inferred from the combination of paleogeographic and global tomography images (Hall, 2002;Hall & Spakman, 2015). The presence of continental terrane currently subducting beneath the southern Banda Sea has been proposed based on results from ambient noise tomography studies (Porritt et al., 2016). In this paper, we use body wave arrival time tomography to constrain new 3-D models of upper mantle P and S wavespeed and V p V s ratio beneath Southeast Asia. The models provide important constraints on the dynamics and geometry of subduction zones as well as the composition and temperature of the lithosphere-asthenosphere system. V p V s ratio is particularly sensitive to the presence of melt/fluids and therefore has the potential to contribute to our understanding of plumbing systems associated with recent mineralization and volcanism. The best resolved anomalies in our models are associated with the Banda arc collision zone due to good crossing path coverage provided by seismicity from two subducting slabs which face each other (see Figure 2), with a south dipping slab in the northern part of the arc and a north dipping slab in the southern part of the arc.

Interpretation of Seismic Velocities and Velocity Ratio From Seismic Tomography
A number of studies have investigated the relationship between V p , V s , their ratio, and the physical properties of the Earth's mantle (Cammarano et al., 2003;Hacker & Abers, 2012 presence of volatiles and fluid migration from the upper part of the slab within the mantle (Chiarabba et al., 2008;Hyndman & Peacock, 2003). Hydrated minerals in the subducting slab and shallow mantle, such as serpentinite, can also produce elevated V p /V s ratios (Hyndman & Peacock, 2003;Reyners et al., 2006;Xia et al., 2008). Furthermore, at depths > 150 km, high velocity and low V p V s is often associated with the lower plate and core of subducting oceanic lithosphere (Zhao, 2009). This is due to slab dehydration liberating fluids which then enter the mantle wedge (Graeber & Asch, 1999;Reyners et al., 2006). Wiens and Smith (2004) and Wiens et al. (2006) (among others) describe the relationship between seismic velocities and mantle rock properties including temperature, percentage of partial melt, composition, and volatile content. The variations of P and S wave velocities with temperature are well known from laboratory experiments at ultrasonic frequencies. Experiments for single crystals suggest values of −0.62 × 10 −4 K −1 for lnVp∕ T and −0.76 × 10 −4 K −1 for lnVs∕ T for olivine at upper mantle pressures and temperatures (Wiens & Smith, 2004). A change of 100 • C would thus produce changes in P and S wave velocity of 0.10 and 0.08 km/s, respectively, or about 1.2% for P waves and 1.8% for S waves. Seismic wavespeed is not just a function of temperature; ultrasonic experiments on molten peridotite show a rapid drop in velocity if the melt fraction is larger than about 2%, after which the velocity decreases by about 2% for every 1% of partial melt . This effect is larger on S waves than P waves such that lnVs∕ lnVp = 2.2 for anomalies caused by melt or strongly altered mantle (Carlson & Miller, 1997) The effect of composition on seismic velocity in the upper mantle is lower than that of temperature and partial melting (Goes et al., 2000;Sobolev et al., 1996); however, strong variations in the Mg Mg+Fe ratio (often referred to as Mg#, which generally increases as a function of melt depletion) will have a significant effect on the seismic velocity. For example, seismic velocities are increased by ∼2% in highly depleted harzburgite periodites when compared to undepleted peridotites (Wiens & Smith, 2004). The most abundant hydrous minerals in altered ultramafic rocks is serpentine (∼Mg 3 Si 2 O 5 (OH) 4 ), whose presence substantially affects seismic velocities (Wiens & Smith, 2004). At active margins, volcanic forearcs are generally marked by low velocities and high V p V s , which is a sign of mantle hydration resulting in the formation of serpentine (serpentinization) (Graeber & Asch, 1999;Reyners et al., 2006). Regions of intense seismicity at the upper boundary of downgoing high-velocity slabs generally mark the product of dehydration, imaged as high V p V s above the seismic zone (Chiarabba et al., 2008(Chiarabba et al., , 2016. These potentially large volumes of fluids can reach depths of about 100 km, and as a consequence partial melting can be triggered in the overlying mantle wedge, which is the source of arc volcanism (Chiarabba et al., 2008;Hyndman & Peacock, 2003). Above 100 km depth, high V p V s ratios may also indicate the presence of fluids triggering high rates of seismicity in the mantle wedge (e.g., Halpaap et al., 2019;White et al., 2019) and promoting intraslab faulting (Chiarabba et al., 2008). Recent rock physics research demonstrates that highly pressurized fluids can produce high V p V s anomalies in subduction settings in the presence of intense microfracturing (Pimienta et al., 2018).
While partial melting and pressurized fluids in the mantle increase the V p V s ratio, pore fluids, water, cracks, fractures, and, especially, enrichment in silica can significantly decrease crustal V p V s ratios (Takei, 2002;Wagner et al., 2008;Zheng & Lay, 2006). If combined with low V p and V s , low V p V s ratios can mark the presence of quartz in the crust (Chiarabba & Amato, 2003). In the mantle, enrichment in silica combined with fluid saturation has been proposed to explain anomalously low V p V s in the mantle across the Sea of Okhotsk; in this relatively cold subduction setting, the flushing of fluids through the upper mantle inboard of the coast occurs progressively over time, building up the fluid content and boosting SiO 2 enrichment (Zheng & Lay, 2006). A low-curvature subducting slab (Liu et al., 2018) descending beneath the continental crust (Lange et al., 2018) represents the tectonic setting of Sumatra. This cold subduction setting underlying high-silica volcanism (e.g., Toba and Krakatau as stated by Budd et al., 2017;Dahren et al., 2012) can develop low-V p V s anomalies, marking depleted cold lithosphere, cooled trapped asthenosphere, and frozen pooled melt (Wagner et al., 2008). On the other hand, upper-mantle low-V p and low-V p V s anomalies above the subduction of oceanic lithosphere can also be explained by strong anisotropy in peridotite, rather than unusual composition (Hacker & Abers, 2012). This effect is most likely to manifest at shallowly dipping slabs, because a gently inclined slab likely promotes increased anisotropy, at least where volcanism is absent (Hacker & Abers, 2012). We provide a summary of the relationship between velocities and velocity ratios in the crust and upper-mantle in Table 1.

Data
To carry out our tomography study, we have used a recent version of the ISC-EHB data set, which is a groomed version of the International Seismological Centre (ISC) Bulletin, containing 136,024 seismic events from 1960 to 2013, with reanalysis of event data in the period 2000-2013 (Weston et al., 2018). All events were relocated using the EHB algorithm , in which a nonlinear relocation scheme is applied in the presence of the radially stratified ak135 velocity model developed by Kennett et al. (1995). The same selection criteria outlined in Zenonos et al. (2019), Bijwaard et al. (1998), and Amaru (2007) were applied to arrival times contained in the data set; these criteria permit the use of arrival time residuals from local events (< 25 • ) which are less than 7.5 s for P and S and teleseismic arrival time anomalies (> 25 • ) which are less than 3.5 s for P and 7.5 s for S. The aforementioned criteria were determined empirically from density plots of ISC delay times versus epicentral distance (Bijwaard et al., 1998). We have tested moderate departures from these values (3.5 ± 0.5 and 7.5 ± 1 s) and found that the final models are essentially been unchanged. As such, we feel that these selection criteria are robust for our purposes. While it is possible that strong velocity heterogeneity may be suppressed as a result, this is preferable to including data with high levels of noise-caused, for example, by misidentified seismic phases and mislocated hypocenters -which leads to the recovery of spurious structure.
We have also used the unsupervised machine learning approach described in Zenonos et al. (2019) to bundle rays with similar source-receiver geometries, thus removing redundant data and improving the reliability of the arrival time data set. The resultant number of P and S wave arrival times following the aforementioned procedure are 572,504 and 84,217, respectively. The uncertainty on the resultant arrival times was determined by taking the standard deviation of the arrival times for each ray bundle, which in turn was supplied by the unsupervised machine learning technique. These values are used to weight the data in the subsequent inversion via the data covariance matrix. To avoid the inclusion of unrealistically small uncertainty estimates, the minimum value is set to 0.1. This resulted in uncertainty estimates with a standard deviation of 0.40 s for S and 0.23 s for P, respectively. For the subsequent inversion, we reduce the initial data set to one in which all source-receiver combinations have both a P and S wave arrival. By excluding unique P and S wave arrival times we significantly decrease the number of picks, which results in 72,712 summary arrival times from 25,645 sources and 574 receivers. Of these, 5.4% are teleseismic (epicentral distance > 25 • ). This approach has the advantage of allowing for more direct comparison between the V p , V s , and Vp Vs models, which will all have similar raypath coverage. Potentially higher resolution models can be obtained for V p and V s by using all available data, and these are also presented in the supporting information (see Figures S1 and S2) along with comparable models, produced by Zenonos et al. (2019) (see Figures S3 and S4).

Method
We performed an iterative nonlinear tomographic inversion for V p and V s variations with the software package FMTOMO (de Kool et al., 2006;Rawlinson & Sambridge, 2004a, 2004bRawlinson & Urvoy, 2006) similar to what is described in Zenonos et al. (2019). FMTOMO uses the fast marching method (FMM) (Sethian, 1996;Sethian & Popovici, 1999), a grid-based eikonal solver, for the forward step of traveltime prediction. The main advantage of this method is that it can produce reliable traveltime predictions in regions comprising highly heterogeneous velocity structure (Rawlinson et al., 2008). The forward step was executed in parallel mode in order to compute traveltime predictions for our large data set. FMTOMO uses a subspace inversion scheme (a gradient-based technique) to solve the linearized inversion step where model parameters are adjusted to satisfy data observations subject to damping and smoothing regularization (see Kennett et al., 1988, for details). The iterative application of the forward and inversion steps accounts for the nonlinearity of the inverse problem. Detailed descriptions of the forward and inversion steps which underpin FMTOMO can be found in de Kool et al. (2006) and , respectively. The package uses cubic B splines to describe continuous velocity variations in 3-D, which are controlled by a regular grid of nodes in latitude, longitude, and depth. Similarly, cubic B splines are used to describe continuous surfaces with variable geometry. In the case of Southeast Asia, we use a single layer model with an interface at the surface. While a crustal layer could be included, our previous results (Zenonos et al., 2019) demonstrate that introduction of a prior crustal model makes little difference to the recovery of mantle anomalies. The starting model we use in this case is defined by the global ak135 reference model (Kennett et al., 1995). Note that all earthquakes, receivers, and raypaths lie within the defined model volume, including the 5.4% of arrivals that have an epicentral distance range defined as teleseismic.

Nonlinear Location of Events
The local linearity assumption which is used by FMTOMO (and most local earthquake tomography algorithms; e.g., ) to relocate events is not strictly valid in regions with highly heterogeneous velocity and/or poor initial source locations. This approach was used in Zenonos et al. (2019), but we explore the nonlinearity effect of source location more fully in this study by adopting the fully nonlinear relocation algorithm described in Pilia et al. (2013), which is applied after each velocity update. This technique makes use of the grid-based nature of FMM, which computes traveltimes to all the points in the medium from each source location in order to calculate the two-point traveltimes. Sources and receivers can be interchanged under the assumption of traveltime reciprocity in order to maximize the computational efficiency of FMM. Most of the computing time in FMM is required to calculate the traveltime field for each source. If there are more sources than receivers, this means that we can more efficiently calculate traveltimes to all the grid points from each receiver. Therefore, the fewer traveltime field calculations, the greater the saving in computing time. Having a traveltime to all points of a 3-D grid for each receiver also means that a grid search for the optimum source location can be done efficiently, irrespective of the complexity of the velocity model (Pilia et al., 2013).

Direct Inversion for Velocity ratio
In general, the simplest approach for calculating the V p V s ratio is to divide V p by V s after independently constraining them via body wave tomography. This approach is nonrobust and tends to produce V p V s models which are dominated by anomalies that are present in the V p model. This is due to the lower number of raypaths and hence poorer resolution of the V s model in most local or regional earthquake tomography studies (e.g., Zenonos et al., 2019), as well as the ad hoc regularization that is imposed, which results in poorly constrained amplitudes. For d(V p /V s ) (perturbations from a reference V p V s model), the relative amplitudes of V p and V s are of great importance, and different regularization choices introduce arbitrary anomalies in d(V p /V s ).
The drawbacks of dividing V p by V s in order to estimate V p V s have long been recognized. In the field of local earthquake tomography, the most commonly used approach to circumvent such issues is to directly invert differential S-P arrival times for perturbations in V p V s (e.g., Thurber, 1993;Walck, 1988). We adopt this direct inversion approach to estimate d(V p /V s ) in Southeast Asia and make use of a modified version of the original FMTOMO package that was first described in Pilia et al. (2013). The direct inversion approach is based on two assumptions: (i) For every P wave raypath there is a corresponding S wave raypath, and (ii) for a given source-receiver pair, the P and S raypaths are identical. In the latter case, the only way that this condition can be met is to set up the inverse problem within a purely linear framework and use a constant V p V s starting model. The linear formulation used by Pilia et al. (2013) which is based on Walck (1988) can be written as where t p and t s are the P and S wave traveltimes, respectively, L s is the S wave path, v is the velocity, l is the path length, and (1) shows that the V p V s ratio can be obtained without explicit knowledge of P or S wave velocities. Although the application of damping and smoothing regularization is required to obtain a robust solution, this dominantly influences the amplitude and wavelength of V p V s anomalies rather than the recovered pattern. If errors in P and S traveltimes are assumed to be independent, the error dt S−P in the weighted differential arrival times can be expressed . These values are used to weight the differential S-P arrival times in the inversion. Synthetic checkerboard tests in the next section show how d(V p /V s ) is recovered using this linear approach and how it compares to the method of dividing V p by V s sourced from separate V p and V s models. We find this to be a useful exercise, since it provides additional confidence in our results when the two models agree. It should be noted that since the P wave and S wave data sets are identical in terms of number of picks and source-receiver combinations used, issues relating to differences in path coverage are minimized. Furthermore, the division approach has the advantage that the nonlinearity of the inverse problem is accounted for. However, differences in regularization used to constrain the P wave and S wave inversions means that the direct inversion approach is still favored.

Solution Stability and Model Resolution
We use trade-off curves to investigate how data fit varies as a function of model perturbation (damping) and roughness (smoothing), with the goal of finding a model that contains features required by the data. Figure 3 illustrates these curves for P, S, and d(V p /V s ) models; we use the standard criterion that the optimum trade-off occurs at the point of maximum curvature but acknowledge that this is ad hoc and does not guarantee that the "best" model will be produced. Furthermore, some curves (Figures 3d and 3f) do not have clear optimum trade-off points, which may be due to the nonlinearity of the inverse problem and noise in the data, which may not match the assumption of a Gaussian distribution which underlies the least squares method. Consequently, we also include rougher and smoother V p V s models in the supporting information ( Figures S5 and S6, respectively). For our chosen damping and smoothing parameters, the recovered P and S wave tomographic models exhibit a data variance reduction of 43% for V p and 49% for V s , which corresponds to an absolute reduction of 527 ms for V p and 1,195 ms for V s from the initial fit. The average V p V s ratio in the upper mantle of SE Asia is found to be 1.81, and in subsequent illustrations of d(V p /V s ) from the direct inversion of S-P, we plot perturbations relative to this background average. For the direct division plots, we show perturbations relative to a horizontally averaged 1-D model.
We performed synthetic checkerboard tests that include the addition of Gaussian noise with a standard deviation of 0.40 s for S wave and 0.23 s for P wave synthetic traveltimes in order to simulate arrival time picking uncertainty of the observational data set. Checkerboard tests were performed to help evaluate the robustness of the P wave and S wave models as well as d(V p /V s ) obtained by direct inversion and division (see Figures 4, 5, S7, and S8). The grid spacing of the three checkerboard tests are coarse grid (80 km depth and 2.8 • longitude and latitude), medium grid (53 km and 2.0 • ), and fine grid (40 km and 1.2 • ). The positive and negative checkerboard anomalies are defined by two grid points in each dimension, with a further two required to transition between an adjacent positive and negative anomaly. Thus, the effective checkerboard size is up to 3 times the grid spacing. In the top 200 km of the model, the coarse grid is well resolved across almost the entire region (Figure 4). For the medium grid, significant smearing due to poor data coverage is visible (1) in regions southwest of the Java arc; (2) between Borneo, the Thai-Malay peninsula, and Vietnam; and (3) east of Sulawesi. The fine grid model is well reconstructed across the whole subduction arc between eastern India and east Java, north and south Sulawesi, and the Australian plate. The best resolution Figure 3. Procedure used to obtain the optimum damping and smoothing parameters for the dV p , dV s and d(V p /V s ) models. (a) We vary the smoothing parameter while holding the damping parameter fixed at = 1. We have identified the optimum value of to be 800. (b) We vary the damping parameter while holding the smoothing parameter fixed at = 800 which gives = 1, 000. (c) We repeat step (a) while holding the damping parameter fixed at the new optimum value. The original value of still appears to be an acceptable choice. Optimum damping and smoothing parameters for the P wave model are therefore chosen to be = 1, 000 and = 800, although we acknowledge that the low curvature of the trade-off plots means that there is no clear choice. Similarly, the aforementioned procedure was followed for S waves (panels d-f) with optimum parameters = 1, 000 and = 500 and d(V p /V s ) (panels g-i) with optimum parameters = 3, 000 and = 1, 000.
is obtained in the collision zone that includes south Sulawesi, Bali, the Banda arc, and the eastern and central regions of New Guinea. In areas with poorer coverage such as the Java arc and Borneo we obtain slightly better recovery with the d(V p /V s ) model obtained by direct inversion than by direct division (see Figure S7). Hence, when the two models disagree, we favor the direct inversion results. This is also supported by previous studies which widely agree that direct inversion is a more robust approach (e.g., Graeber & Asch, 1999;Pilia et al., 2013;Walck, 1988) and our observations of the division approach discussed in section 6.
In order to investigate the robustness of the crucial region between northern Australia and Indonesia, a synthetic test was performed in which a negative d(V p /V s ) anomaly simulating the Australian mantle lithosphere is recovered (see Figures S9-S12). The accuracy of this recovery, which is consistent with the checkerboard test results, suggests that the transition from the Australian continent to Southeast Asia is well resolved. Additional synthetic spike tests were performed that support the robust recovery of this transition zone by the tomography (see Figure S13).
To investigate the effects of source location accuracy, a fully nonlinear earthquake relocation scheme is applied at each iteration of the P wave and S wave tomography. After the source relocation, the average change in location of relocated events is 29.15 km, which ultimately makes little difference to the model, as illustrated in Figures S14 and S15. This suggests that the EHB catalog locations are sufficiently accurate with respect to the dominant wavelength of anomalies we recover. We check the robustness of the results against our previous tomographic models (Zenonos et al., 2019), where we included all the available P and S wave picks, the Moho interface, and crustal velocities from the crust1.0 model (Laske et al., 2013). Figures S1 and S3 show depth slices through the two P wave models, while Figures S2 and S4 show depth slices through the two S wave models. In both cases, the differences are not significant, particularly above ∼300 km depth. It is worth noting that a slightly different approach to choosing the optimum damping and smoothing parameters was used in the previous models of Zenonos et al. (2019), which explains why their anomaly amplitudes tend to be smaller, particularly at greater depth where path coverage is lower. P wave, S wave, and d ( V p V s ) models produced in this study using P and S wave arrival time data sets with identical source-receiver combinations are shown in Figures S16-S18 from 100 to 600 km depth.

Results and Discussion
In this section, we focus on presenting the new V p V s results, which are the primary outcome of this study. However, the associated V p and V s models are also included in some cases because they have a similar  Figures S3 and S4). In regions of good resolution, the two d(V p /V s ) models tend to be similar, but elsewhere, the differences can be quite pronounced. In the division model, we also see evidence of short scale length structures that appear to be largely inherited from the dV s model (e.g., Figure 6). This is likely caused by the P wave and S wave models having anomalies of similar magnitude (which in turn is controlled by regularization choices), but since the average P wave velocity is much higher than the average S wave velocity, the latter tends to dominate d(V p /V s ). In this scenario, the sign of the dV s and d(V p /V s ) anomalies are anticorrelated. Consequently, we only show the direct inversion results from now on, but include the division results in the supporting information. Figure 7 shows a series of slices through the final 3-D d(V p /V s ) model of SE Asia, as determined by the direct inversion of S-P arrival times (see Figure S19 for division results). Coherent anomaly patterns are clearly evident, which can be associated to large-scale structures in the upper mantle related to Australia-Eurasia collision and subduction of oceanic lithosphere. Peak anomalies are of the order of 5%, which corresponds to a range in absolute V p V s of approximately 1.81±0.09. This range may seem small in such a dynamic tectonic setting, where strong thermal anomalies, hydrated mineral assemblages and fluid/melt beneath subduction zones might be expected to produce large anomalies; for example, Koulakov, Yudistira, et al. (2009) anomalies as large as 10% in the uppermost mantle beneath the Toba Caldera complex in northern Sumatra. However, this was a much more local study that was able to image structures at scale lengths well below 100 km, which is where the largest anomalies were present. Thus, for our smoother regional model, the range of perturbations in d(V p /V s ) we obtain seem reasonable. Figure 8 shows horizontal sections taken at depths of 60, 160, and 260 km, through the dV p and dV s models and d(V p /V s ) model obtained by inversion of S-P arrival times (direct division results are shown in Figure  S20). This figure enables comparison of dV p , dV s , and d(V p /V s ) at similar resolution and clearly illustrates    Figure 9. Note: dV p , dV s , and d(V p /V s ) are plotted relative to their respective 1-D background depth average. that d(V p /V s ) does not simply mimic structures observed in dV p and dV s , which suggests a different sensitivity to physical properties as expected (Hyndman & Peacock, 2003;Wiens & Smith, 2004;Wiens et al., 2006). The P wave and S wave models do a good job at delineating the subducting slabs beneath the Sunda Arc, Philippines, and southern Celebes Sea, whereas d(V p /V s ) highlight regions where they are inconsistent. Figure 9 shows vertical S-N cross sections through the best resolved regions of the models at 118 • and 129 • longitude with seismicity superimposed (Figure 9) and main features for discussion highlighted (see Figure  S21 for a NE-SW slice through the Banda Sea). One of the most robust anomalies in our models is the W-E change in d(V p /V s ) below 100 km depth beneath Sumba (indicated in Figure 10): We thus separate the discussion between areas located to the west and to the east of this anomaly. We also present an overall interpretation figure (Figure 10) which is a 200 km depth slice through the dV p and d(V p /V s ) models with various features superimposed, including the plate boundary model of Bird (2003).
Our new models image several extensive subduction zones in the region, including the well-known Sunda-Java and Philippines subduction zones and the double subduction zone beneath the Molucca Sea (Figures 8 and 10). The most prominent subduction zone extends across Sumatra and Java; it appears as a positive dV p and dV s anomaly and a negative d(V p /V s ) anomaly down to 100 km depth. Previous studies have imaged this in dV p and dV s (Amaru, 2007;Bijwaard et al., 1998;Hall & Spakman, 2015;Widiyantoro et al., 2011) but this is the first d(V p /V s ) image of this feature. The cold subducted slabs are seismically active and imaged as high wavespeed and negative d(V p /V s ) anomalies in the mantle (Figures 8 and 9). The new d(V p /V s ) model shows a positive anomaly extending approximately W-E under Java in the shallow upper mantle which is interrupted under the island of Sumba by a wide negative dV s and d(V p /V s ) anomaly (Figures 8 and 10). This negative d(V p /V s ) anomaly extends south and underlies the northern reaches of the Australian plate at a depth of 200 km (Figure 10). Strongly negative d(V p /V s ) are associated with the curved subduction zone that is known (e.g., Widiyantoro & van der Hilst, 1997) to characterize the Banda arc 10.1029/2019JB019152  (Figure 9), an idea we expand on below. In the following discussions, our interpretation of the d(V p /V s ) results corresponds to the models produced by direct inversion unless stated otherwise.

North Borneo and Java
Borneo has a low density of seismic stations and few earthquakes, so the mantle beneath cannot be resolved in any detail. However, in the north it is consistently recovered as a region with positive dV p and dV s and negative d(V p /V s ) at the scale of the medium checkerboard test (Figures 8-10). This is consistent with an interpretation in terms of remnant subduction (see Table 1) (Zenonos et al., 2019) and correlates with the high P wave velocity anomaly observed in Hall and Spakman (2015); this could either be part of the northwest subducting Celebes Sea slab from the midlate Miocene or southeast subducting South China Sea slab from the mid-Miocene (Cottam et al., 2013), which subducted during convergence of the Dangerous Grounds Block and NE Sundaland.
We observe significant high-velocity anomalies corresponding to a subducting slab underlying Sumatra and Java (Figure 8), consistent with previous studies (Amaru, 2007;Bijwaard et al., 1998;Hall & Spakman, 2015;Widiyantoro & van der Hilst, 1996). The whole Java slab, including the forearc region between southern Sumatra and Sumba exhibits negative d(V p /V s ) above 100 km depth (Figure 8). Below this depth, the deeper part of the high-velocity subducting slab is also recovered as a negative d(V p /V s ) anomaly (Figure 9). The forearc mantle immediately south of the subducting slab is characterized by positive d(V p /V s ) anomalies (Figures 8-10). The upper side of the slab is delineated by seismicity down to depths of approximately 600 km as seen in Zenonos et al. (2019), with subduction continuing into the lower mantle.
Across Java, the lower part of the subducted Indo-Australian oceanic lithosphere is imaged as a classic postive dV p and dV s and large negative d(V p /V s ) anomaly (Figure 9) (cf. Reyners et al., 2006;Zhao, 2009). This is likely due to the lack of fluid phases in the core of the slab caused by rapid dehydration in the crust and uppermost mantle as the slab descends (Graeber & Asch, 1999;Reyners et al., 2006). We also observe negative dV p and dV s paired with large negative d(V p /V s ) anomalies both in the forearc and below the main volcanic centers down to 100 km depth (Figure 8). Such a pattern associated with deep subduction requires enrichment in silica of the magmatic fluids at crustal-shallow mantle scale, which are significant in the process of continentalization (Zheng & Lay, 2006), possibly combined with excess fluids and a relatively cold subduction setting (Wagner et al., 2008;Zheng & Lay, 2006). The presence of ancient Gondwana continental blocks north of the slab  could both provide the source of silica and lower temperatures that may explain these anomalies. Below 100 km depth, positive dV p and dV s associated with positive d(V p /V s ) ( Figure 10, black dotted line) likely reflects fluid migration in the forearc mantle due to slab dehydration (down to 400 km depth) (Chiarabba et al., 2008;Hyndman & Peacock, 2003).

Sumba and Banda Sea
Below 100 km depth and east of approximately 118 • E (Figures 8 and 10) there is a pronounced W-E transition from positive to negative d(V p /V s ), with the transition zone running almost perpendicular to the Java subduction trench (Figure 8). This change is not as clearly reflected in the P wave model, where low velocities appear under Sumba and Timor. Further north and between depths of 60 and 160 km ( Figure 8) the negative d(V p /V s ) anomaly underlies part of the Eastern Sulawesi Ophiolite Belt , which is distinct from the positive d(V p /V s ) anomaly under the Western Sulawesi Volcanic Arc. The suture between these structures identified by remote sensing and geological reconstruction studies (seismicity, volcanism, magnetic anomalies, and geodesy) (Bird, 2003;Hall, 2011Hall, , 2017 follows the contrast between positive and negative d(V p /V s ) anomalies almost exactly down to 160 km depth. This low-velocity and negative d(V p /V s ) anomaly is present between the Australian plate (characterized by high velocities), North Borneo and the Philippines down to a depth of 160 km (Figure 8), following known sutures of oceanic basins originating from the collision of the Philippine Sea and Australian plates with Sundaland (Bird, 2003).
The southern part of this extensive NS-trending negative d(V p /V s ) anomaly is still visible at a depth of 200 km, just south of Sulawesi ( Figure 10). It clearly juxtaposes the consistent high-velocity and positive 10.1029/2019JB019152 d(V p /V s ) W-E oriented anomaly which characterizes the mantle beneath the forearc across Java (Figure 9) against high amplitude negative d(V p /V s ) anomalies visible just east of Sumba and beneath Timor. At 200 km depth, a small bulge or salient of strongly negative d(V p /V s ) is evident in an otherwise continuous, negative d(V p /V s ) anomaly that extends from NW Australia across the Scott Plateau, Bali, Timor, and the Banda Sea to New Guinea ( Figure 10). The white dotted curved line in Figure 10 marks the transition from consistent high-velocity and negative d(V p /V s ) under the Australian plate to a low-velocity and positive d(V p /V s ) region that is approximately coincident with known continental convergent boundaries recognized by remote sensing (magnetic anomalies and geodesy) studies (Bird, 2003), and we believe that it likely represents orthopyroxene enriched xenoliths within the thick crust and cold continental Australian mantle lithosphere (Kennett et al., 2011;Wagner et al., 2008). Zenonos et al. (2019) previously observed a northward transition from faster to slower velocities in this region, which was interpreted in terms of a change from thick continental lithosphere to thinner oceanic/arc lithosphere, but the addition of d(V p /V s ) constraints from this study have allowed for a more detailed interpretation.
The negative 3% d(V p /V s ) bulge at the convergence of three plates ( Figure 10) (1) follows the contour of the continental Australian lithosphere but correlates with small and low amplitude high-V p anomalies (admittedly at the limit of our resolution) embedded in the wider low-velocity zone surrounding the collision and (2) is sandwiched to the east between oceanic subduction trenches ( Figure 10). While continental subduction is likely the dominant dynamic under the Banda Sea (Hall, 2012;Zenonos et al., 2019), the positive velocity anomaly and large negative d(V p /V s ) anomaly just east of Sumba and beneath Timor appears consistent with the presence of a small ancient microplate (Hall, 2017;. This old continental block and accompanying terranes, like the Banda Terrane imaged by Porritt et al. (2016) above 50 km depth in the same area, appears to be entrained within the wider subduction process (Harris, 2011). If we take the S-N transition from negative to positive d(V p /V s ) anomalies as the northern margin of the Australian plate, then the pronounced negative d(V p /V s ) anomaly forms an obvious bulge or salient (Figure 10), and is consistent with the presence of an uplifting forearc sliver inferred by Porritt et al. (2016) at crustal scale. The area broadly coincides with a block that includes south Sulawesi, Flores, and Sumba which greatly extended during the Neogene (Hall, 2017). The negative 3% d(V p /V s ), the lowest in our model at this depth, is consistent with a decrease in Poisson ratio that reflects extension.
The Banda Arc slab delineates a strongly curved Benioff zone which changes orientation by about 180 • (Cardwell & Isacks, 1978;Hamilton, 1974Hamilton, , 1979. The diagonal profile crossing Timor shows a continuous high-V p and negative d(V p /V s ) dipping anomaly that likely marks the lower plate of an apparently rolled-back slab across the southern Banda Sea ( Figure S21). These results suggest oceanic subduction and folding of a unique subduction arc . On the cross sections taken through the Banda Arc (Figures 9 and S21) we observe a continuous spoon-shaped high-velocity (first described by Widiyantoro & van der Hilst, 1997) and negative d(V p /V s ) anomaly down to ∼200 km depth, with no apparent double subduction from north and south ( Figure S21; see also Zenonos et al., 2019). This region includes some of the largest positive d(V p /V s ) anomalies in the model, and spans the Molucca Sea and Banda Sea (Figures 10  and S21). In Figures 8, 9, and S21 beneath the Banda Sea at about 50-150 km depth, we observe low V p , low V s , and positive d(V p /V s ) anomalies which suggest possible serpentinization, partial melting, presence of volatiles, and fluid migration from the slab within the mantle (Reyners et al., 2006). Down to 200 km depth, the positive dV p , dV s , and d(V p /V s ) might be evidence of a mantle wedge, similar to Java; in this scenario, the high concentration of volatiles would likely enhance serpentinization and volcanic activity in the area (Figure 1). Nevertheless, the high-compressional setting under the northern Banda Sea and, especially Seram, corresponds to the highest d(V p /V s ) in our model (+5%; Figure 10) and is consistent with strike-slip faulting produced by extreme horizontal extension driving mantle upwelling (Pownall et al., 2013), coupled with the likely presence of fluids sourced from depth. Fluid-pressurized regions have been recently recognized as a source of strong positive d(V p /V s ) variations in fault zones and subduction settings, independently of the accumulation of molten materials at depth (Pimienta et al., 2018). Fluid pressurized faulting in an extreme compressional setting is thus a possible explanation for the large positive d(V p /V s ) variations as well as for the complex spatial variation of the focal mechanisms determined in studies of the two-slab model for the Banda Sea (Das, 2004).

Conclusions
We present a new d(V p /V s ) model of SE Asia based on the inversion of differential S-P arrival times sourced from the EHB catalog. In order to carry out the inversion, we use an iterative nonlinear inversion scheme that combines a grid-based eikonal solver and subspace inversion method. After obtaining robust P and S wave velocity models for SE Asia a linear inversion of differential S-P arrival times and a direct division of V p and V s models were separately carried out to obtain estimates of d(V p /V s ) which were then compared for consistency. In regions of good path coverage the results were similar, but the model obtained from direct division of V p by V s appeared to preferentially mimic the S wave model, a trait that can be attributed to the use of ad hoc regularization. As such, the results from direct inversion of S-P arrival times were regarded as more robust and hence were used in the interpretation.
The new d(V p /V s ) results provide useful insights into the transition from oceanic subduction (Sumatra and Java) to continental subduction-collision (east of Sumba across the Banda Arc) located beneath Sumba and Timor. The lower part of the subducted Indo-Australian oceanic lithosphere along the Sunda to Banda arc is clearly imaged as a positive dV p and dV s and negative d(V p /V s ) anomaly up to 118 • E. A deep low-velocity and positive d(V p /V s ) mantle-wedge develops across the forearc up to 100 km depth; at shallow depths, negative d(V p /V s ) anomalies possibly have a correlation with the location of recently active volcanoes (see Figure 8) due to low-temperature and silica enrichment which play an important role in the process of continentalization (Zheng & Lay, 2006). At similar longitudes, the high-velocity and negative d(V p /V s ) anomaly beneath North Borneo supports the theory of remnant subduction dating from the Middle to Late Miocene.
The best resolution in our model is achieved where the Australian-Southeast Asia collision produces the largest negative d(V p /V s ) anomalies. At a depth of 200 km, these anomalies delineate the northern margin of the Australian continent. The negative d(V p /V s ) anomalies beneath Sumba and Timor, which are juxtaposed in the west against positive d(V p /V s ) anomalies associated with oceanic subduction beneath Java, extend north and east to delineate the Australian plate under the southern Banda Sea and the New Guinea highlands. A second trend of negative d(V p /V s ) anomalies extends north from Sumba, marking the Sulawesi Ophiolite Belt under east Sulawesi and continuing into the Philippines.
The pronounced low d(V p /V s ) salient (beneath Sumba, Figure 10) along the northern margin of the Australian plate spatially correlates with the location of preexisting crustal material such as terranes and/or Gondwana-derived microcontinents, previously inferred from seismic imaging and paleogeographic reconstructions. Our model suggests that this microcontinent is currently entrained in the wider zone of subduction of the Australian plate beneath Sumba and Timor. Oceanic crust continues to subduct under the southern Banda Sea producing apparent slab-rollback, and developing as a single curved spoon-shaped rather than double subduction zone. The largest positive d(V p /V s ) anomalies in the model are found under Seram, beneath the Molucca Sea and northern Banda Arc; while this may be evidence of serpentinization and intraslab faulting, such variations in velocity ratio can be caused by highly pressurized fluids and mantle upwelling in the region that has sustained the highest and most prolonged stress variations across the collision zone.