Tectonic Regionalization of the Southern California Crust From Tomographic Cluster Analysis

We map crustal regions in Southern California that have similar depth variations in seismic velocities by applying cluster analysis to 1.5 million P and S velocity profiles from the three‐dimensional tomographic model CVM‐S4.26. We use a K‐means algorithm to partition the profiles into K sets that minimize the inter‐cluster variance. The regionalizations for K ≤ 10 generate a coherent sequence of structural refinements: each increment of K introduces a new region typically by partitioning a larger region into two smaller regions or by occupying a transition zone between two regions. The results for K ≤ 7 are insensitive to initialization and trimming of the model periphery; nearly identical results are obtained if the P and S velocity profiles are treated separately or jointly. The regions for K = 7 can be associated with major physiographic provinces and geologic areas with recognized tectonic affinities, including the Continental Borderland, Great Valley, Salton Trough, and Mojave Desert. The regionalization splits the Sierra Nevada and Peninsular Range batholiths into the western and eastern zones consistent with geological, geochemical, and potential‐field mapping. Three of the regions define a geographic domain comprising almost all of the upper crust derived from continental lithosphere. Well‐resolved regional boundaries coincide with major faults, topographic fronts, and/or geochemical transitions mapped at the surface. The consistent alignment of these surface features with deeper transitions in the crustal velocity profiles indicates that regional boundaries are typically narrow, high‐angle structures separating regions with characteristic crustal columns that reflect different compositions and tectonic histories.


Introduction
The concept of tectonic regionalization is predicated on the notion that plate-tectonic processes produce lithospheric structures that are vertically correlated, regionally coherent, and recognizable from the surface. Lithosphere can be readily separated into regions of thicker continental crust and thinner oceanic crust, and the continents can be further regionalized into shields, platforms, and orogenic belts with distinguishable structural features (e.g., Chen et al., 2018;Jordan, 1981;Mooney et al., 1998). But how useful is the concept of tectonic regionalization at smaller scales?
We address this question by conducting a regionalization analysis of crustal structure in Southern California. Surface mapping (California Geological Survey, 2010;Jennings et al., 1977), in combination with studies of fault offsets, geodetic deformation monitoring, and other kinematic constraints, has taught us a lot about the tectonic assembly and disruption of its distinctive geologic units (Chapman, 2017;Chapman et al., 2012Chapman et al., , 2016Ducea et al., 2009;Jacobson et al., 2011;Saleeby, 2003;Sharman et al., 2015). Yet much remains to be learned about the tectonic relationships between surface geology and deeper crustal features. The crustal structure of Southern California is exceptionally heterogeneous, reflecting a long history of deformation that has juxtaposed crustal terranes of different composition and age, creating deep sedimentary basins flanked by steep mountain ranges (Atwater, 1970;McQuarrie & Wernicke, 2005;Whitmeyer & Karlstrom, 2007). In this study, we map crustal regions in Southern California that have similar depth variations in seismic velocities by a cluster analysis of velocity profiles from a three-dimensional (3D) tomographic model.
Here we regionalize the 3D velocity structure of Southern California according to similarities in the profiles of P-wave velocity α(z) and S-wave velocity β(z) at depths from z = 0 to 50 km. The α and β profiles were extracted from a fifth-generation SCEC model, CVM-S4.26 (Lee, Chen, Jordan, Maechling, et al., 2014), derived by full-3D waveform tomography (F3DT). The tomographic inversion process was initialized using a fourth-generation model, CVM-S4 (Kohler et al., 2003), and involved 26 linearized iterations that perturbed the two Lamé parameters of an isotropic model to fit a large set of earthquake waveforms and ambient-field correlagrams. Interspersed with these structural iterations were inversions that used the 3D structure to refine the earthquake source mechanisms and locations, which increased the accuracy of the tomography (Lee, Chen, Jordan, Maechling, et al., 2014). CVM-S4.26 provides good fits to waveforms of earthquakes recorded after the model was finalized Taborda et al., 2016;Lee & Chen, 2016). Moreover, it reproduces many crustal features first identified by surface mapping and active-source imaging, such as the geometry of major sedimentary basins , Lee, Chen, Jordan, Maechling, et al., 2014Lee & Chen, 2016) and the localized contrasts across faults that bound crustal blocks of different seismic velocities (Lee & Chen, 2017). Section 2 describes CVM-S4.26 in more detail.
We partition the set of 1.52 million velocity profiles from CVM-S4.26 into K distinct regions using the Kmeans clustering algorithm (MacQueen, 1967;Lloyd, 1982;Arthur & Vassilvitskii, 2007). Romanowicz (2011a, 2011b) have applied the K-means algorithm to the problem of mantle regionalization; our approach is similar, differing primarily in geographic scale and degree of geologic refinement. The algorithm iteratively categorizes the profiles into a specified number of clusters that minimize the total squared difference between the regionalized sets of profiles and their corresponding centroids. In Section 3, we show that the α and β profiles, treated separately, yield similar regionalizations for K ≤ 7, motivating us to use the concatenated profiles γ = {α, β} to fix the clusters. We highlight the γ regionalization for K = 7 as an especially informative result.
Section 4 compares the regional properties of the upper, middle, and lower crust and investigates vertical correlations in crustal properties using a simple isostatic model. In Section 5, we show that the regions identified by tomography can be associated with recognized physiographic and geologic provinces; moreover, their boundaries often coincide with major faults, topographic discontinuities, and geochemical transitions unknown to the tomographic model except as their manifestations in the seismic velocities. From these correlations, we argue that regional boundaries are typically localized, high-angle structures that separate crustal regions of distinctive compositions and tectonic histories.

Description of the tomographic model
The CVM-S4.26 study area covers Southern California and peripheral portions of Arizona, Nevada, and Baja California, extending from San Simeon on the Central Coast southeastward to the Colorado River Delta (Figure 1a). CVM-S4.26 is a digital model of isotropic seismic velocities comprising 992 × 1,536 geographic nodes spaced at 500-m intervals over an area of 380,928 km 2 . We express the latitude λ n and longitude φ n of the n th geographic node as x n ≡ (λ n , φ n ) and represent the velocity profiles at this location by two profile vectors, α n and β n . Each profile contains velocities α(x n , z) or β(x n , z) at 100 discrete depths z spaced at 500-m increments from the surface (z = 0 km) to the base of the model (z = 49.5 km). For any value of β and any point x n within the model area, we define z β to be the smallest depth where the shear velocity profile exceeds β. Thus, z 2.5 (x n ) maps the shallowest depth to the isovelocity surface for β = 2.5 km/s, which has been used as a proxy for basin depth in ground motion prediction equations (e.g., Campbell & Bozorgnia, 2008. The z 2.5 surface for CVM-S4.26 is given in Fig. 1d of Lee and Chen (2016).

Features of CVM-S4.26
The map of β(x n , z) at z = 2 km (Figure 1a) shows the major sedimentary basins as features of low shear velocity (β < 2.5, in warm colors). The San Joaquin Basin of the Great Valley is the deepest with z 2.5 ≈ 8 km. Also prominent are the Salton Trough and the basins of Santa Maria, Ventura, and Los Angeles. A number of the smaller, shallower basins are well represented in the model, such as those of the Indian Wells, Antelope, and San Fernando valleys (Lee & Chen, 2016).
Cross-sections through the model taken approximately perpendicular to the San Andreas fault system (Figures 1b,c,d) reveal regions of high velocity (β ≥ 4 km/s) at depths near 10 km, many beneath the deeper sedimentary basins. Most of these structures have been recognized in 2D-refraction studies, such as the highvelocity anomaly under the San Gabriel Basin, imaged by the first Los Angeles Regional Seismic Experiment, LARSE-I (Lutter et al., 1999), and beneath the Antelope Valley, imaged by LARSE-II (Lutter et al., 2004). Almost all of the major features appear in one or more of the earlier tomography models (Chen et al., 2007;Hauksson, 2000;Lin et al., 2010;Tape et al., 2010).
The particularly strong positive anomaly on cross-section A-A' is associated with the Great Valley ophiolite complex accreted to the western edge of the active continental margin in the Mesozoic (Godfrey et al., 1997;Godfrey & Klemperer, 1998). The cross-section shows that the high-velocity anomaly has an eastern boundary at the edge of the batholith and widens with depth as the top of the zone dips under the San Joaquin Valley. The shallow (~10°) dip of this upper interface in CVM-S4.26 is consistent with that inferred from refraction studies and potential-field modeling (Fliedner et al., 2000;Godfrey & Klemperer, 1998;Kaban & Mooney, 2001). This feature is also well-expressed in the tomographic model of Lin et al. (2010).
Shear velocity profiles at 15 selected geographic points, grouped by closest cross-section, are plotted in Figures 1e-g. Each profile was obtained by averaging the 2,601 profiles in a 25-km × 25-km square  Jennings et al. (1977). Numbered diamonds mark the locations of the profiles in the bottom panels; those in red were used in the supervised clustering ( Figure S6). (b)-(d) Cross sections through CVM-S4.26 showing β to a depth of 40 km along the profiles marked on the map; vertical exaggeration 2.4:1. White lines are the 4.3-km/s contours that demarcate the first depth to mantle velocities, which we take to be the nominal location of the M-discontinuity. Triangles mark the surface trace of the San Andreas Fault. (e)-(g) β profiles at points marked by numbers on the map in panel (a). The depth to the M-discontinuity is defined as the shallowest depths where the shear velocity reaches 4.3 km/s (dashed-dotted lines). Boundaries between crustal layers and between lower crust and mantle (horizontal dashed lines) are average depths for the study area, picked from gradients in the mean velocity profiles (dashed black profiles). centered on the point. The profiles show substantial point-to-point differences at all levels in the crust. The shear velocities in the upper crust rise rapidly with depth, forming the near-surface seismic waveguide; the near-surface gradients are typically highest where the surface velocities are lowest. For the purposes of structural classification, we take the nominal thickness of the upper crust to be 6 km, which approximates the average thickness of this surface waveguide.
Velocities in the mid-crust differ from place to place by as much as 20%, and the depth gradients are locally negative, indicating the presence of a mid-crust low-velocity zone (MC-LVZ) across much of the study area ( Figure 3). The extent of the MC-LVZ can be mapped as regions where the shear-velocity ratio β(8 km)/β(14 km) is greater than unity (red areas of Figure 3). Within the model area of high tomographic resolution (enclosed by the dotted line), the MC-LVZ distribution correlates with regions where pre-Cenozoic magmatism has produced rocks with continental compositional affinities. The MC-LVZ is strongest in the southern Sierra Nevada, the western Mojave block, and the San Gabriel Valley (dashed box).
The α and β profiles obtained by averaging CVM-S4.26 over the entire study region show a narrow zone with positive velocity gradients at 15-17 km depth ( Figure 1). In regions where the MC-LVZ is well expressed, this Conrad-type discontinuity forms the base of the lowvelocity zone, and we use its average depth of 16 km to separate the mid-crust from the lower crust.
The white lines on the cross-sections of Figure 1 delineate the nominal depth of the Mohorovičić (M) discontinuity, z M , which we approximate by z 4.3 . The M-discontinuity depths from this proxy are generally consistent with previous studies of crustal thickness (e.g., Fliedner et al., 2000;Ichinose et al., 1996;Lewis et al., 2000;Ozakin & Ben-Zion, 2015;Richards-Dinger & Shearer, 1997;Yan & Clayton, 2007;Zhu & Kanamori, 2000). Figure 4 compares them with the synthesis of the M-discontinuity depths by Tape et al. (2012). The overall agreement is fairly good on a large scale, although there are significant local differences. For example, the M-discontinuity in CVM-S4.26 is somewhat deeper beneath the Peninsular Ranges and Eastern Transverse Ranges, whereas it is a bit shallower beneath the Sierra Nevada. The tomographic model shows some small-scale features not in the Tape et al. (2012) synthesis, such as localized regions of shallow mantle depths in the Inner Borderland and the shallow occurrence of mantle velocities in the ophiolite belt along the west flank of the Sierra Nevada. The latter agrees with Godfrey et al.'s (1997) observation that mantle velocities in the ophiolite belt occur at depths of 11-16 km, which they call a "relic oceanic Moho." The average M-discontinuity depth in CVM-S4.26 is about 30 km.

Resolving power of the tomographic image
Based on checkerboard tests and sensitivity-density maps, the nominal resolution of CVM-S4.26 in the shallow crust near the center of the study region is about 5-10 km laterally and 3-5 km vertically (Lee, Chen, Jordan, Maechling, et al. , 2014). The resolving power of the tomography degrades with depth and towards the periphery of the model area, where the sensitivity of the CVM-S4.26 dataset drops off and model bias rises owing to the co-dependencies imposed by the starting model and inversion process. The model area where checkerboard tests show the tomographic resolution at 15-km depth to be relatively good at lateral scale lengths of 15 km (Lee, Chen, Jordan, Maechling, et al., 2014, Figure F1.e) is enclosed by the dotted line in Figure 3. Structures peripheral to this area can be inherited from the starting model and are therefore less reliable. The lack of an MC-LVZ in Proterozoic Terranes of western Arizona, for example, may reflect an inherited model bias unmodified during the CVM-S4.26 inversions owing to the lack of data coverage in the eastern corner of the model domain.
CVM-S4.26 also exhibits inversion artifacts that reduce its effective resolving power. Regular arrays of "bubble-like" features with horizontal spacings of 30-40 km are prominent in the velocity maps of Figure 2, especially at greater crustal depths, and these arrays are aligned with the CVM-S4.26 model grid. They exhibit Gibbs-like oscillations in the perturbation amplitudes, evidently due to a high-wavenumber cutoff applied to the 3D inversion operator. To a certain degree, these artifacts can be deconvolved by eye, and their effects on the regionalizations are reduced by the vertical averaging intrinsic to the cluster analysis. Nevertheless, they further compromise the reliability of the small-scale features of the regionalizations.

Cluster analysis
The CVM-S4.26 spatial grid comprises N T = 1,523,712 geographic nodes. The K-means cluster analysis divides the total set X T = {x n : n = 1, 2,…, N T } into K subsets (clusters), X k , k = 1, 2, …, K, that are non-empty, disjunct, and complete. Each cluster comprises N k profiles, where N k > 0 and ∑ K k¼1 N k ¼ N T . In the K-means analysis, the number of clusters, K, is fixed a priori.

Profile definitions
The criterion for cluster membership is based on the similarity of a velocity profile to the mean profile of a cluster. At the n th geographic point, CVM-S4.26 specifies P-wave and S-wave velocity profiles as 100component vectors, ; z ¼ 0:0; 0:5; 1:0; …; 49:5 km: The concatenation of these two vectors is a 200-component vector denoted by The factor of ffiffi ffi 3 p balances the weighting of profile differences according to a Poisson-solid scaling of the two seismic velocities. The mean profiles of the k th cluster, the cluster centroids, are The profile variations about these means are In this notation, α T ð Þ , β

Algorithm
We perform a partitional clustering analysis with the K-means function built into MATLAB®, which is based on the K-means++ algorithm of Arthur and Vassilvitskii (2007). At each step in this iterative procedure, a candidate profile is assigned to the cluster whose mean is closest to the candidate. We measure this closeness by the Euclidean distance, In K-means analysis of the γ profiles, the location x n is assigned to the cluster X k if k is the cluster index that minimizes Δγ k ð Þ n ; i.e., The clustering can either be supervised or unsupervised. In the supervised case, we specify initial values of the K cluster centroids. In unsupervised clustering, the K-means++ algorithm initializes the centroids sequentially, spreading them out by choosing the k th centroid from profiles assigned probabilities according to the squared distance from the (k − 1) previously chosen centroids. At each step, each profile is assigned to its closest centroid, and, after all assignments are made, the centroid for each k is recomputed. This procedure is iterated to convergence. In the case of unsupervised learning, we checked that the resulting minimum was global rather than local by running a number (10-100) of independently seeded iterations. For K ≤ 10, most converged to the same minimum of the Euclidean objective function. The γ regionalization of order K is the final set of clusters resulting from this procedure, and Δβ k ð Þ n produce regionalizations RK α and RK β that generally differ from RK γ .

Results
The three types of regionalization, RK α , RK β , and RK γ , were computed from CVM-S4.26 for K up to 10. The maps and centroid profiles for RK α and RK β are plotted for K = 3-10 in the Supplementary Information (S.I.). The RK γ regionalizations, which equally weight the α and β profiles, are displayed for K = 4-6 in Figure 5.
A simple measure of the efficacy of the cluster separation is variance reduction. The summed variance of a concatenated profile γ n about the mean profile γ k ð Þ is Δγ k ð Þ n 2 . We define δ k ð Þ n to be the membership indicator that assigns the n th profile to the k th region: Then, the summed variance over the K regions is Equation (9) is the objective function minimized by the K-means++ algorithm. Figure 6 plots the total variance for K = 1-10, normalized by the total (unregionalized) variance V 1 . The R3 γ regionalization reduces the variance by 42%. The reduction increases to 61% for K = 7 but flattens off thereafter, reaching only 66% for K = 10. RK α and RK β show similar curves, although the variance reduction of the latter is somewhat smaller, about 64% at K = 10.

R7 γ regionalization
One of the most informative results is R7 γ , the K-means regionalization of concatenated profiles for K = 7. Each of the seven regions can be associated with a recognized physiographic province (Table 1). These associations typify the regions and provide mnemonically useful regional labels ( Figure 7). The R7 γ centroid profiles, α k ð Þ z ð Þ and β k ð Þ z ð Þ, show correlated regional features ( Figure 8) that are further described in §4.1.
The associations among sedimentary basins and among plutonic provinces are geologically coherent. The regions comprise widely separated areas, some of which are physiographic provinces in their own right. The region associated with the Salton Trough, ST, also includes the Los Angeles basins, whereas GV, associated with the Great Valley, also includes the Santa Maria and Santa Barbara basins. The region associated with the Sierra Nevada, SN, includes the  eastern Transverse Ranges, whereas PR, associated with the Peninsular Ranges, includes the western Sierra Nevada. The SN-PR boundaries on the west side of the Sierra Nevada and the northeast side of the Peninsular Ranges approximate the previously recognized division of the batholiths into eastern and western zones (Chapman et al., , 2016Gromet & Silver, 1987;Jiang & Lee, 2017;Kimbrough et al., 2015;Sharman et al., 2015). The MD region, which includes most of the Mojave Desert but also southern parts of the Coast Ranges, becomes more consolidated if the right-lateral displacement across the San Andreas Fault is restored. The region comprising most of the Continental Borderland, CB, is almost exclusively the off-shore area of attenuated crustal thickness, including only a few insignificant bubblelike areas in the continental interior. These and other associations, described in §4.2, support the geologic plausibility of the regionalization.

Clustering verification and validation
We have explored the robustness of the low-order K-means regionalizations through numerical experiments and comparisons with other observations. Here we describe six robustness properties documented below and in the Supporting Information (S.I.): 1. The clustering results are insensitive to the cluster initialization. Initializations using randomly selected profiles as cluster seeds almost always converged to the same regionalization. As illustrated in Figure S1, experiments using supervised clustering recovered nearly identical regionalizations as unsupervised clustering. 2. The unsupervised regionalizations for K ≤ 7 from the full CVM-S4.26 study area show good agreement with those obtained from a reduced model domain that excluded the outer one-third of the study area ( Figure S2). 3. The agreement between RK α and RK β is consistently good for K ≤ 7 (cf. Figures S3 and S4). The minor differences are largely confined to small-scale features associated with region boundaries or the bubble-like inversion artifacts ( Figures S5-S7). 4. The regions defined by the clusters are multiply-connected, but they are largely composed of geographically coherent areas recognizable as distinct physiographic provinces. A given region typically comprises multiple provinces of similar physiography and geology. 5. Region boundaries often conform to major faults, topographic fronts, and geochemical transitions. 6. The progression of regionalizations with K generates a coherent set of structural refinements; each increment of K introduces a new region typically by partitioning a larger region into two smaller regions without reorganizing the other preëxisting regions, or by occupying a transition zone between two regions.
Because the number of regions K is finite and fixed, the objective function must decrease in every step and the K-means algorithm is thus guaranteed to converge to a local minimum. This minimum may not be global, however, owing to the nonlinearity of the classification scheme. Robustness properties (1) and (3) indicate that the global minima are achieved by the CVM-S4.26 regionalizations at the small K values investigated here.
We assessed the robustness of the tomographic clustering to the inherited model bias described in §2.2 by reducing the model domain. Perturbations to tomographic starting models obtained by quadratic optimization tend to be relatively weak and smooth in areas of poor data coverage. If the starting model is also smooth, then the profiles from these areas will be similar, which can bias the cluster centroids. In the case of CVM-S4.26, the poorest tomographic resolution is along the periphery of the model area. We therefore  Table 2. recomputed R7 γ over a domain that excluded the outer 50 km of CVM-S4.26, which reduced the total model area by 30% and eliminated most areas of poor tomographic resolution. As shown in Figure S2, the regionalization obtained from this trimmed tomographic model is very similar to that from the complete model (robustness property 2).
The similarity between two clustering results of the same order, RK 1 and RK 2 , can be measured by the fraction of profiles assigned to the same region, which defines an agreement score 0 ≤ a(RK 1 , RK 2 ) ≤ 100%. By this measure, we obtain a > 99.9% for the supervised and unsupervised R7 γ regionalizations of Figure S1, and a = 89.6% for the trimmed and untrimmed R7 γ regionalizations of Figure S2. In the latter case, a map of the inconsistent nodes shows that the largest area of disagreement involves the MD-PT regional boundary in the eastern Mojave Desert ( Figure S2c). Model trimming rotates this boundary westward, towards the Eastern California Shear Zone. Other areas of disagreement are largely confined to narrower zones distributed along region boundaries, particularly near the western corner of the model domain, where the regional coherence is relatively low.
Property (3) demonstrates the consistency of the structural information contained in the α and β profiles of CVM-S4.26. Lee, Chen, Jordan, Maechling, et al. (2014) inverted for both Lamé parameters, allowing the two velocities to vary differently in matching the data. These data included phase and group delays of wave types with different polarizations on all three displacement components, some sensitive primarily to β (e.g., S phases and Love waves) and others sensitive primarily to α (e.g., P phases and Rayleigh waves). Owing to correlations inherited from the starting model, as well as those imposed by the inversion operators, the α and β profiles are by no means statistically independent, but the derived Poisson ratios do contain structural information, especially in the central part of the model area where the tomographic resolution is highest. For example, averaging profiles over an area in the western Mojave Desert, Lee, Chen, Jordan, Maechling, et al. (2014, Fig. 10) found that Poisson's ratio σ increased by about 8% in the MC-LVZ (12-16 km depth) where α and β decreased by about 11%, suggesting that fluids may play a role in the development of this low-velocity zone.
The K = 7 agreement scores are Owing to the built-in correlation of RK γ with both RK α and RK β , the agreement score a(R7 α , R7 β ) must be less than both a(R7 α , R7 γ ) and a(R7 β , R7 γ ). The agreement scores among the three regionalization types are plotted as a function of K in Figure 9. They hold steady up to K = 7, hovering around 90% for the α-β comparison and around 95% for the α-γ and β-γ comparisons. We can see from the disagreement plots in the Supplementary Information (Figures S5-S7) that R7 γ negotiates a good compromise, assigning about half of the α-β discrepant points to the same region as R7 α and about half to the same region as R7 β . This explains the two approximate equalities we observe in Figure 9 for K ≤ 7: a(RK α , RK γ ) ≈ a(RK β , RK γ ), and a(RK α , RK β ) ≈ a(RK α , RK γ ) + a(RK β , RK γ ) − 1. Most of the discrepancies are along region boundaries or associated with bubble-like artifacts ( Figure S3).
Properties (4) and (5) are validations of the regionalization scheme in the sense that no physiographic provinces, faults, or topographic features were explicitly represented in the tomographic model , Lee, Chen, Jordan, Maechling, et al., 2014Lee & Chen, 2016, 2017. Basin structures of the Los Angeles region and the Salton Trough constrained by near-surface observations were built into the CVM-S4 starting model , but the CVM-S4.26 inversions modified these structures Figure 9. Agreement percentages between pairs of CVM-S4.26 profile regionalizations as a function of K. Blue curve is a(RK α , RK β ), red curve is a(RK α , RK γ ), and green curve is a(RK β , RK γ ).
where necessary and increased the area of deep sedimentary basins (z 2.5 > 1 km) by a factor of about three (Lee & Chen, 2016). Tomographic correlations with structures known from other information are illustrated in Figure 7, where the R7 γ regions show strong associations with major physiographic and geologic provinces. The spatial relationships are further investigated in §4.2 and §5.
The coherence of the structural refinements (property 5) supports the robustness of the regional classification. For K = 3, the cluster analysis identifies one region (red) comprising the major sedimentary basins; another (light green) extending along the batholithic axis from the Peninsular Ranges across the Mojave Desert and up through the Sierra Nevada; and a remainder (light blue) that includes two large subregions, the off-shore Continental Borderlands and the Proterozoic Terranes of eastern California and western Arizona (Condie, 1982). From the maps in Figure 5 and the S.I., we can see that, with each increment of K from 3 to 7, the cluster analysis for all three profile types, α, β, and γ, reorganizes the regionalizations primarily by partitioning a large region into two smaller regions or by differentiating a transition zone between two regions. In the broad-brush terms, these partitions can be sketched as follows: • R4 differentiates the Continental Borderlands (CB, light blue) from the Proterozoic Terranes (PT, dark blue), extending the latter into the eastern Mojave Desert. The α-β agreement declines for K > 7, dropping to 84% at K = 8 and to 72% at K = 9 (Figure 9). The disagreements are mapped in Figures S3-S5. The areas of α-β disagreement for R7 are small and scattered, but those for R8 and R9 are not. Both R8 α and R8 β introduce a new region that includes the Santa Maria Basin and the off-shore Santa Barbara and Ventura basins; however, in the case of R8 α , this new region also includes a transition zone between the Mojave and Proterozoic Terranes. In R9, the areas of α-β disagreement We will have more to say about these physiographic and geologic correlations in §5.

Regionalized structure of Southern California
The application of the K-means++ clustering algorithm to CVM-S4.26 generates a coherent sequence of regionalizations that can be associated with recognized physiographic and geologic provinces of Southern California. The signature of each region is its centroid velocity profile γ k ð Þ z ð Þ. Here we characterize these mean profiles by layer averages and investigate the regional isostatic balance implied by the inferred density profiles.

Properties of the R7 γ velocity profiles
To compare the regionalized velocity profiles, we obtained structural properties of the upper, middle, and lower crust by taking averages of the R7 γ centroid profiles α k ð Þ and β k ð Þ over the depth ranges 0-5 km, 10- Þ from these layer averages. We defined the regional depth of the M-discontinuity, z k ð Þ M , as the model depth closest to β k ð Þ z ð Þ ¼ 4:3 km/s, discretized in 0.5 km increments, and we estimated the mean elevation h (k) for each region using the ETOPO1 digital elevation model (Amante & Eakins, 2009). These regionalized properties are listed in Table 2.
The layer values of σ are plotted against the averages of α in Figure 11. Poisson's ratio is highest (0.31) in the sediment-dominated upper crust of the GV region. A similar anomaly is not observed in the ST upper crust, which is also sediment-dominated, suggesting that the GV value may be biased by near-surface structural complexity inherited from the CVM-S4.26 starting model. For all other regions, however, Poisson's ratios are typical of crustal rocks (0.24-0.27) (Brocher, 2005;Christensen, 1996). Among regions for a fixed layer, as well as among layers for a fixed region, the seismic velocities change in approximately equal Table 2 Regionalized structural properties of R7 γ .

Mean Elevation
Upper Crust (0-5 km)* Middle Crust (10-15 km)* Lower Crust (20-25 km)* proportions; hence, Poisson's ratio remains fairly constant across the α spectrum. In the two batholithic regions, SN and PR, α goes up by 0.06 km/s and σ by 0.015 from the middle to lower crust, consistent with a decrease in quartz content and an increase in anorthite content (Christensen, 1996). Similarly, the differences between the SN and PR lower crusts, Δα= 0.38 km/s and Δσ= -0.017, would be explained if the former were more plagioclase-rich and the latter more pyroxene-rich, consistent with the higher velocities and more mafic character of the PR region. These variations in σ are small, however, and their statistical significance is questionable.

M-Discontinuity Depth
The scatter plots of Figure 12 show that the lower-crustal compressional velocities are positively correlated with the mid-crustal compressional velocities and negatively correlated with M-discontinuity depths. The shear velocities display similar relationships, as implied by Figure 11. The first suggests that the loci of the mid-crust and lower crust on the felsic-to-mafic compositional spectrum are positively correlated; i.e., a more mafic lower crust is likely to be overlain by a more mafic mid-crust. The second indicates that the crust is thinner in regions with more mafic lower crust.

Isostatic model
The property correlations in Figure 12 (1)), and we calculated the   Table 2. Region labels and colors are the same as in Table 1. Colored dashed lines highlight regional comparisons made in the text. area-weighted averages [ρ uc , ρ mc , ρ lc ]. We assumed that the densities are constant within a layer, in which case the layer mass excesses of the k th region relative to the average are, We used the layer thicknesses Δz uc = 6 km, Δz mc = 10 km, and Δz lc = 14 km, the elevations h (k) and Mdiscontinuity depths z k ð Þ M given in Table 2, and an average upper mantle density ρ um ¼ 3; 200 kg/m 3 . The last term in equation (10) corrects the upper crustal mass for the fractional area of water f k ð Þ w of average elevation h k ð Þ w (a negative number) and density ρ w = 1,000 kg/m 3 . We only applied this correction to the CB region, which is largely offshore, where we obtained the region averages f CB um Δz um , we can then express isostatic balance as an equation for the regional upper-mantle density anomaly, Figure 13 plots the second term on the right side of equation (13) against the first term (in brackets); the antidiagonal lines indicate the inferred mantle density anomalies.
The strong anticorrelation of the region values in Figure 13 (Pearson correlation r = − 0.92) indicates that the lower-crustal mass differences are approximately balanced by those in the upper and middle crust. We see that the regions can be ordered by their projections onto the antidiagonal Δρ um = 0, which range from CB (heavy lower crust) to SN (light lower crust). The fact that this isostatic ordering disperses the regions along the anti-diagonal demonstrates that the diagnostic differences among CVM-S4.26 profiles-i.e., those that govern the relative positions of cluster centroids in γ-space-are not simply distinctive features confined to the upper crust (e.g., deep sedimentary basin) or the lower crust (e.g., shallow M-discontinuity depth). Instead, tectonic processes, balanced by isostasy, have produced crustal columns with vertically correlated regional properties (e.g., light upper crust implies heavy lower crust). The region-specific correlations of properties across the entire crustal column can be seen directly in the centroid-profile variations of Figure 8. These long-wavelength signatures are easily resolvable by the full-3D tomography, which helps to explain the robustness of the clustering results.
The upper-mantle density anomalies computed assuming isostatic balance vary from −62 kg/m 3 for the CB region to +53 kg/m 3 for the GV region, which lie within the range of upper-mantle density anomalies inferred from Kaban & Mooney's (2001) gravimetric analysis. For example, the positive anomaly we obtain for the SN region (+40 kg/m 3 ) is roughly consistent with the high upper mantle densities Kaban & Mooney (2001) found for the eastern zone of the Sierra Nevada batholith.

Physiographic and geologic correlations
A full interpretation of the regionalization results in the context of California tectonics is beyond the scope of this article (not to mention the competence of its authors). Here we focus on some key features of R7 γ that can be correlated with previously described geological structures of known tectonic significance. Figure 14 delineates a subset of the regional boundaries, labeled L1 through L13, that we judge to be robust features of the γ regionalizations. The boundary segments remain in approximately the same location when the R7 γ regionalization is applied to a reduced area that excludes the outer 50 km of the model domain, where the tomographic resolution is poor ( Figure S2). Moreover, once a boundary segment appears, it remains a feature of the regionalization as the number of regions increases to at least K = 10 ( Figure 10). One exception to these generalizations is L5, the MD-PT boundary, where the eastern side of the MD region rotates towards the Eastern California Shear Zone in the trimmed R7 γ regionalization and also in R10 γ . As discussed below, this rotation likely reflects a relatively wide MD-PT gradient zone in the eastern Mojave Desert.
The geographical endpoints of the line segments are listed in Table S2. In this section, we will explore the relationship of the 13 boundary segments in Figure 14 to faults and other structural transitions inferred from geological, geochemical, and geophysical mapping.

Deep sedimentary basins: GV & ST
Regions GV and ST are characterized by thick sediment columns that form near-surface waveguides with low velocity averages well constrained by the tomography. As noted in §4.1, GV has artificially high surface velocities inherited from the CVM-S4.26 starting model that may contribute to the regional differences. Aside from this near-surface artifact, the upper-crustal average velocities of the two regions are nearly the same (Figure 12a). However, the average mid-crustal and lower-crustal velocities of ST are higher and its M-discontinuity is shallower (Figure 12c).
Similar differences in lower-crustal structure explain the classification of the Santa Maria and Santa Barbara basins as GV and the Los Angeles basins as ST. The near-surface artifact of the GV region is absent in the  Table 2. The abscissa is the combined mass excess of upper and middle crust, and the ordinate is the mass excess of the lower crust, both expressed in upper mantle density units (equation (13)). Diagonal lines plot equal values of the upper-mantle density anomaly ρ um . Colored dashed lines highlight regional comparisons made in the text.
Santa Maria Basin (Figure 1e, profile 1). The association of these two provinces in R7 γ appears to reflect their geologic similarities; the Santa Maria Basin contains sediments of the Great Valley Group and, like the western Great Valley, is underlain by Franciscan metamorphic rocks (Chapman et al., 2016;Page, 1970). The higher mid-crustal velocities of the Salton Trough and Los Angeles basins are consistent with the emplacement of igneous and metamorphic rocks of more mafic composition during the rifting events that formed these basins (Christensen & Mooney, 1995;Godfrey et al., 2002;Persaud et al., 2016).
At the southwest side of the Great Valley, marked by L1 in Figure 14, the San Andreas Fault separates the Great Valley from the Salinian block. A regional boundary along this line is a consistent feature in all of the K-means clustering results (Figures 5, 7, 10; S.I.). L1 separates the GV region from the MD region in R7 γ . This regionalization largely classifies the Salinian block as MD within the model area where the midcrustal tomographic resolution is relatively high (Figure 7). Northwest along the axis of the Salinian block, MD transitions to PT and then to CB, but the tomographic coverage in the western part of the model area is poor and more sensitive to areal trimming ( Figure S2); for this reason, we discount the tectonic significance of this variable association.
The association of the Salinian Block with the MD region is consistent with the location of the GV-MD boundary marked by L13, which lies on the opposite side of the Salinian block, where the Nacimiento Fault separates granitic rocks from the Santa Maria Basin and its Franciscan basement (Chapman et al., 2016;Page, 1970). The Sur-Nacimiento fault has been variously modeled as a thrust fault (Ducea et al., 2009;Page, 1970), a sinistral strike-slip fault (Jacobson et al., 2011), or low-angle normal (exhumation) fault (Chapman et al., 2016). The association of the Salinian block with a coherent MD region is consistent with the pre-San Andreas palinspastic reconstruction of Chapman et al. (2016) that involves no sinistral motion and places the Nacimiento Franciscan Belt and Santa Maria Basin outboard of the Mojave Desert.
The Ventura Basin is regionalized as a narrow corridor of GV flanked by a larger area of ST. This mixed association, which can also be seen at the southern end of the Great Valley, reflects the close distance between the two centroid profiles (Figure 12). L12 is the ST-MD boundary, which is parallel to the San Cayetano- Figure 14. Regionalization map for R7 γ annotated with a subset of regional boundaries L1-L13 (dashed white lines) that delineate geologic structures discussed in the text.
Ventura-Pitas Point system of active thrust faults that separates the deep Ventura Basin from Franciscan and Mesozoic granitic basement (Hubbard et al., 2014;Rockwell, 1988). The L12 location is 15-20 km north of the active San Cayetano-Ventura-Pitas Point fault trace, but this is consistent with models in which the thrust ramp flattens into a nearly horizontal décollement at about 7 km depth and then steepens into a high-angle fault (Hubbard et al., 2014;Marshall et al., 2017). The high-angle master fault in the thrust-ramp model collocates with the major crustal boundary at L12. The boundary lies about 30 km south of the Big Pine Mountain Fault, usually identified as the southern limit of the Salinian Block (Chapman et al., 2014;Dickinson, 1996).
L6, which separates ST from PT on the northeast side of the Salton Trough, is another deep-basin boundary persistent across the range of regionalization experiments. This boundary is parallel to the southeastward extrapolation of the present-day San Andreas but offset from it by about 10-15 km to the northeast, consistent with the northeastern dip of the San Andreas Fault in the northern part of the Salton Trough (Fuis et al., 2012(Fuis et al., , 2017. Near the Mexican border, L6 accurately aligns with the surface trace of the Algodones-Altar fault zone, thought to be the main locus of the North-America-Pacific plate boundary during the Pliocene (Beard et al., 2016;Howard et al., 2015).
Several other fault-controlled edges of deep sedimentary basins correspond to region boundaries. The irregular ST-MD border on the west side of the Salton Trough (L7) roughly aligns with surface exposures of the Western Salton Detachment Fault, a major low-angle normal fault system that Mason et al. (2017) argue was active during the late-Miocene localization of the Pacific-North America plate boundary in the northern Salton Trough. The vertical alignment implied by the regionalization suggests that the low-angle detachment localized along the edge of a major structural boundary.
A major feature of R7 γ , not marked in Figure 14, is the GV-ST-MD transition that is aligned with the White Wolf Fault at the southern end of the Great Valley. This profound structural transition has a long history of thrust, detachment, and strike-slip faulting Malin et al., 1995).
In R7 γ , this batholithic dichotomy is expressed by the PR-SN cluster division, which occurs for all regionalizations with K = 7-10. The SN region comprises the eastern zone of the Sierra Nevada batholith, the central Transverse Ranges, and a narrow, broken-up zone on the eastern side of the Peninsular Range batholith. The PR region comprises the western zones of both batholiths. The integrity of the PR cluster is supported by the potential-field mapping, which shows positive isostatic gravity anomalies and positive geomagnetic anomalies congruent with both of the PR belts. The reader is referred to Figure 3 of Lee, Chen, Jordan, Maechling, et al. k2014), which plots the potential-field data from Simpson et al. (1986) and Roberts and Jachens (1999) on the CVM-S4.26 model grid.
The differences between SN and PR are also reflected in the initial strontium ratios ( 87 Sr/ 86 Sr) i measured across the batholiths. Geochemists use the initial Sr ratio to distinguish the lithospheric affinities of igneous rocks, usually associating low values (≲ 0.706) with oceanic lithosphere and high values (≳ 0.706) with continental lithosphere (Glazner & O'Neil, 1989;Kistler & Peterman, 1978;Silver et al., 1979). In the central Sierra Nevada, the PR-SN boundary marked by L3 approximates the 0.706 line used by Kistler and Peterman (1978) and Chapman et al. (2012) to separate the western and eastern plutonic zones (Figure 15a). In the Peninsula Ranges, PR-SN boundary marked by L8 lies 20-50 km west of the 0.706 line (Figure 15b), which runs along the San Jacinto Fault (Kistler et al., 2003;Langenheim et al., 2004). However, L8 accurately separates the petrologically distinct western-zone plutons from those of the eastern-zone, as mapped by Jiang and Lee (2017, Figure 1). This eastern PR border also closely tracks the distinct "Peninsular Range Batholith boundary" mapped by Langenheim et al. (2004) using gravimetric and magnetic data. The initial Sr values of rocks west of L8 are generally less than 0.705.
The centroid PR profile has higher velocities than the centroid SN profile throughout the crust (Table 2, Figure 12). The difference is about 0.4 km/s in the middle and lower crust, which agrees with the more mafic compositions observed in the western plutonic zone. The mantle density anomalies of the two regions obtained from the isostatic model of §4.2 are positive and of similar magnitude, Δρ PR ð Þ um ¼ 27 kg=m 3 and Δρ SN ð Þ um ¼ 40 kg =m 3 , consistent with a denser "arclogite root", or the remnants of such a root, beneath both regions (Saleeby, 2003;Saleeby et al., 2012).
The western border of the batholith at the north end of Peninsular Ranges abuts the sedimentary basins of the Los Angeles area along the PR boundary delineated by L9. The southern part of this line coincides with the Cristianitos Fault (Ehlig, 1979;Klotsko et al., 2015), an element of the San Gabriel-Chino Hills-Cristianitos fault system that acted as the primary transform plate boundary during the late Miocene formation of the Los Angeles Basin (Ingersoll & Rumelhart, 1999). L9 also corresponds to a NNW-striking gravity gradient in Langenheim et al.'s (2006) basement gravity map.
The GV-PR boundary delineated by L2 is a strong feature of the regionalization. It lies buried beneath the Great Valley sediments, where the eastern edge of the deep San Joaquin Basin abuts the western zone of the Sierra Nevada batholith. At these latitudes, the western zone comprises the Paleozoic Kings-Kaweah ophiolite belt, which was accreted to the continent circa 250 Ma and subsequently intruded by Mesozoic plutons Saleeby, 2011). L2 marks a nearly vertical discontinuity between this western plutonic zone and the Great Valley, where forearc strata were deposited above the Middle Jurassic basement of Coast Range ophiolite (Chapman, 2017;Godfrey & Klemperer, 1998). The mantle component of this ophiolite has been juxtaposed against the western zone at mid-crustal depths, forming a high-velocity anomaly that dips gently westward and widens beneath the Great Valley (Figure 1b).

Central Transverse Ranges: SN
A large area comprising the San Gorgonio Massif, the San Bernardino Mountains, and the eastern San Gabriel Mountains persistently associates with SN in all of the regionalization experiments (e.g., light green near the center of Figures 7 & 13). The central Transverse Ranges are known to have high crustal thicknesses and low crustal velocities, comparable to those of the Sierra Nevada batholith (Figure 4; Zhu & Kanamori, 2000;Yan & Clayton, 2007). Like the Sierra Nevada, they have negative isostatic gravity anomalies, indicating crustal compositional buoyancy.
On the south side of the central Transverse Ranges, the SN region includes the topographically elevated San Gorgonio, Yucaipa Ridge-Wilson Creek, and Morongo blocks that are moving relative to one another along the Mill Creek and Mission Creek strands of the San Andreas fault system (Spotila et al., 1998). This "San Gorgonio knot" forms the largest restraining step in the San Andreas Fault (Yule, 2009) and questions remain about the mechanical role it might play in governing the length of ruptures on the southern San Andreas (Beyer et al., 2018;Heermance & Yule, 2017). The SN region comprises the Big Bear block north of the San Andreas, blocks within the knot, and also the eastern zone of the Peninsular Ranges south of the San Andreas and east of L8 ( Figure 14). This association is consistent with the potential-field modeling of Langenheim et al. (2005), who inferred that eastern-zone Peninsular Ranges basement rocks have been wedged above the basement of the central Transverse Ranges in a layered structure that extends at least 10 km south of the Banning fault segment of the San Andreas.  (2012) and Kistler et al. (2003), plotted as colored dots on R7 γ regionalization maps of (a) the Sierra Nevada batholith and (b) the Peninsular Range batholith. Dashed black lines L3 and L8 mark the boundaries between the SN (light green) and PR (dark green) regions that approximate the division of the two batholiths into eastern and western zones. PR and SN correlate with low and high ( 87 Sr/ 86 Sr) i values, respectively. Maps are plotted at the same scale.
The knot thus appears to be a complication in the San Andreas kinematics as it jumps from inboard to outboard of the eastern batholithic zone, perhaps related to the rheology of this region's thicker, more felsic lower crust (Lyakhovsky & Ben-Zion, 2009). This kinematic resistance of the Central Transverse Ranges to translation by the San Andreas Fault likely plays a role in the partitioning of San Andreas motion onto the San Jacinto Fault and the Eastern California Shear Zone (Ozakin & Ben-Zion, 2015).
The SN-MD boundary on the north side of the Central Transverse Ranges (marked as L10 in Figure 14) coincides with the North Frontal Thrust of the San Bernardino Mountains (Meisling & Weldon, 1989;Spotila & Sieh, 2000). The frontal thrust system turns south near 117.2°W, forming the west side of the Big Bear Plateau, whereas the SN region extends westward across the lower topography of the southern Mojave block. Its northern boundary with MD meets the San Andreas Fault near Palmdale. This westward extension is consistent with seismic data showing a thick, low-velocity crust north of the San Gabriel Mountains on the Mojave side of the San Andreas Fault (Zhu, 2000). Combining the LARSE-I refraction lines with receiver functions stacked at SCSN stations, Yan and Clayton (2007) concluded that "the deep root of the San Bernardino Mountains most probably extends farther to the NW, where no surficial topography high exists," consistent with the regionalization results. The topographic difference remains unexplained, however.
The SN region of the central Transverse Ranges includes the eastern San Gabriel Mountains. Given the small area of overlap, this could be spurious association. It is interesting to note, however, that the western SN boundary, delineated by L11, aligns with a NW-trending reflector that Fuis Ryberg, Lutter, and Ehlig (2001) observed in the LARSE-I data and associated with the Vincent Fault. The Vincent Fault, once identified as a possible remnant of the Laramide subduction megathrust, has been recently reinterpreted as a shallow-angle normal fault involved in the exhumation of the Pelona Schist (Xia & Platt, 2018). Schists of this type are thought to underlie the San Gabriel Mountains at mid-crustal depths, where they have been associated with bright reflectors characterized by velocity inversions (Ryberg & Fuis, 1998). These reflectors could be the top of or within an MC-LVZ containing low-velocity Pelona Schist (Fuis, Ryberg, Godfrey, et al., 2001). The MC-LVZ of CVM-S4.26 is well expressed in the San Gabriel Mountains west of the Vincent Fault but weak in the SN crust of the central Transverse Ranges (Figure 3), consistent with Fuis, Ryberg, Godfrey, et al.'s (2001) conclusion that the Pelona Schist is not present at depth along the LARSE-I transect in the Mojave Desert.

The Continental Lithospheric Domain: SN, MD, and PT
The three R7 γ regions SN, MD, and PT include almost all of the upper crust in the study region known to be composed of igneous and metamorphic rocks derived from continental lithosphere; taken together, they form the "continental lithospheric domain" of Southern California. To a first approximation, the boundary of the SN-MD-PT super-region is congruent with the Sr 0.706 line ( Figure 15; see also Chapman et al., 2014, Figure 1). The crustal properties among the three regions of this domain (connected by orange dashed lines in Figures 12 & 13) show a variation SN→MD→PT from a thicker, more felsic crust to a thinner, more mafic crust.
The upper-crustal velocities of the three regions are similar and consistent with cracked rocks of felsic composition at low-pressure (Brocher, 2005;McCaffree Pellerin & Christensen, 1998); the upper-crustal densities inferred from Brocher's (2005) relation differ by only a few percent. The lower-crustal α increases from 6.3 km/s for SN to 6.6 km/s for MD to 6.8 km/s for PT, spanning a range from felsic granulite to mafic granulite in Christensen and Mooney's (1995) compositional spectrum. The mid-crust shows the same regional progression, although the velocity differences are only about one-third as large (Figure 12b). The Mdiscontinuity depths, which decrease from 35 km for SN to 28 km for PT, are inversely correlated with the lower crustal velocity (Figure 12c). According to the isostatic model of Figure 13, the mantle density of PT is lower than that of SN and MD by about 60 kg/m 3 , consistent with either eclogitic depletion of the uppermost mantle beneath PT or eclogitic enrichment beneath SN and MD, or both.
The crustal variation in the sequence SN→MD→PT can be explained by progressive crustal thinning accompanied by basaltic intrusion and densification of the lower and middle crust (e.g., Lachenbruch et al., 1994). According to this hypothesis, all three regions started with a thick, buoyant continental crust developed in the Mesozoic along convergent margin of the California Arc (Ducea et al., 2009;Tosdal & Wooden, 2015); subsequent extension thinned and densified PT more than MD, and MD more than SN. This ordering is consistent with the observed increase in heat flow from SN to MD to PT (Bonner et al., 2003;Sass et al., 1994). The primary mechanism that drove the regional extension is thought to be the gravitational collapse of orogenically over-thickened crust Rey et al., 2001;Saleeby, 2003). Within each region, extension was accommodated by detachment normal faulting in the upper crust and ductile flow in the lower crust (Brun et al., 2018;Davis & Lister, 1988;Platt et al., 2015;Wernicke & Axen, 1988). This hierarchical regional structure is reflected in the regional boundaries internal to the continental domain. The boundaries separating SN from MD along lines L4 and L10 are both expressed as topographic fronts, where the more buoyant SN region stands higher than MD ( Table 2). As we have noted, L10 coincides with the North Frontal Thrust of the San Bernardino Mountains.
L4 is an interesting lineament. It aligns with an irregular topographic front, expressed by the topographic gradient at 1,300 m contour, which cuts SSW-NNE across the Death Valley/Panamint Valley extensional domain at a high angle to the SSE-NNW axis of the late-Cenozoic pull-apart basins (Andrew & Walker, 2009;Norton, 2011). The L4 boundary corresponds to a strong Bouguer gravity gradient that marks the transition from the thicker SN crust to the thinner MD crust (Kucks, 1999). Its geologic expression is more cryptic, but it may be associated with a northeast-trending segment of the Early Permian continental margin, where compression was accommodated by faults of the southeast-verging Last Chance thrust system (Stevens & Stone, 2005).
The boundary between MD and PT in the eastern Mojave Desert, marked by L5, runs approximately northsouth along a longitude of 115.6°. This transition is probably not a sharp boundary, as indicated by the bubble-like appearances of MD in the PT region; moreover, the model-trimming experiment of Figure S2 suggests that its location may be biased eastward by the inherited structure of the PT region. Evidence for a lithospheric boundary in the eastern Mojave has been documented by Miller et al. (2000), who found that the geochemical signatures of Miocene basalts east of about 116°W indicate an ancient (Proterozoic) enriched lithospheric source, whereas the signature of enriched-mantle lithosphere appears to be absent west of this longitude. They also note that the dominant orientation of mountain ranges changes at this longitude and that it marks the eastern limit of schist exposures.

The Schist Problem
The Pelona-Orocopia-Rand schists of Late Cretaceous/Early Cenozoic age outcrop along detachment structures beneath older crystalline rocks at a number of localities east and west of the San Andreas Fault (Ehlig, 1968;Haxel & Dillon, 1978;Chapman et al., 2016). Geologists have hypothesized that the once-continuous California Arc was disrupted by Laramide subduction of an oceanic plateau in a 500-km-wide corridor between the southern Sierra Nevada and northern Peninsular Ranges (Saleeby, 2003;Saleeby et al., 2007;Nadin & Saleeby, 2008;Chapman et al., 2012). Comparing R7 γ with Chapman's (2017) pre-San Andreas palinspastic reconstruction shows that this batholithic gap is filled primarily by crust associated with the MD region.
According to this tectonic hypothesis, flat-slab subduction erosion removed the westernmost arc and inner forearc basin, as well as the sub-batholithic mantle lithosphere, shutting down arc magmatism and thickening the overriding crust. Detritus shed into the trench from the elevated, thickened crust formed the schist protoliths, which were then underthrust beneath the extinguished arc and accreted to its lower crust. Extension associated with the gravitational collapse of the elevated crust in the Late Cretaceous initiated large-scale detachment faulting that exhumed the schists to mid-crustal levels by the early Tertiary (Chapman, 2017;Chapman et al., 2012;Jacobson et al., 2011). Miller et al. (2000) speculate that 116°W marks a decrease in the overall strength of the lithosphere owing to the presence of mechanically weak Pelona-Orocopia-Rand schists in the crustal column to the west of this longitude. The hypothesis thus predicts that the upper-crustal igneous rocks of the MD region are underlain by schists at mid-crustal depths.
As Lee, Chen, Jordan, Maechling, et al. (2014) pointed out, the presence of the schist is likely to be associated with the MC-LVZ of CVM-S4.26. The MD region shows a shallow low-velocity channel (Figure 8), but its mid-crustal compressional velocity (6.15 km/s) and Poisson's ratio (0.26) are more consistent with a granitic gneiss than a schist (Christensen & Mooney, 1995;McCaffree Pellerin & Christensen, 1998). On the regional scale, the average schist fraction of MD mid-crust is likely to be rather small. Locally, however, velocities within the channel drop to values consistent with a much larger fraction of schist. The MC-LVZ is particularly well expressed in the southern Sierra Nevada, the western Mojave block, and beneath the Los Angeles and San Gabriel basins (Figure 3). This central part of the study area is where a schist-rich mid-crust is most likely to be found.

Conclusions
Our goal has been to map the variations in the crustal structure of Southern California on horizontal scales larger than the crustal thickness but smaller than the width of the North America-Pacific plate-boundary deformation zone. In this exploratory study, we applied K-means cluster analysis to a single tomographic model, CVM-S4.26. We did not attempt to characterize the aleatory variability of the tomographic imaging (e.g., due to seismic noise) or its epistemic uncertainty (e.g., due to modeling limitations), nor did we fully assess the effects of the model uncertainties on the cluster partitioning. Cluster analysis is a machinelearning technique susceptible to various types of bias, such as the inheritance bias due to poor data coverage on the periphery of the study area. Although we have tested the robustness of the regionalizations to this particular type of bias, our understanding of the clustering uncertainties is as yet rudimentary, and some of the associations we have drawn from CVM-S4.26 may turn out to be spurious.
An obvious next step towards validating the regional distinctions mapped here will be to compare the CVM-S4.26 clusters with those derived in a similar way from other tomographic models of Southern California, such as CVM-H15.1 (Shaw et al., 2015). These two SCEC-curated velocity models display substantial differences in crustal structure. For example, CVM-H15.1 is based on the Tape et al. (2012) M-discontinuity surface shown in Figure 4a, whereas CVM-S4.26 produces the surface in Figure 4b. Comparing regionalizations from alternative crustal velocity models should help to assess how epistemic uncertainties in the underlying 3D structure translate into uncertainties in 2D regionalizations.
Despite the limitations of this exploratory study, we are encouraged by its results. The K-means cluster analysis produces regionalizations that generally conform to the large-scale physiographic provinces of Southern California, and it does a good job in recognizing major tectonic features previously identified by geological, geochemical, and potential-field mapping, active-source profiling, and tomographic imaging. Notable examples from our seven-region model R7 γ include faults bounding the deep sedimentary basins of the GV and ST regions, as well as the west-east compositional dichotomy of the California batholith, represented by PR (western zone) and SN (eastern zone).
The three regions SN, MD, and PT comprise almost all of the upper-crustal igneous and metamorphic rocks in the CVM-S4.26 study area that show geochemical signatures derived from continental lithosphere. Our geophysical regionalization is thus consistent with the continent-ocean dichotomy expressed in the hardrock geochemistry of Southern California. Lower-crustal seismic velocities within this continental lithospheric domain are negatively correlated with crustal thicknesses, yet the crustal columns of the three regions are in approximate isostatic balance. These properties are consistent with an interpretation of SN→MD→PT as a regional progression in the extensional thinning of the crust and the concomitant mafic densification of the lower crust.
All of the R7 γ regions except CB comprise multiple provinces with similar physiographic and geologic characteristics (Table 1). In some cases, the provinces have a common tectogenesis; e.g., the east-west (SN-PR) dichotomy of the Sierra Nevada and Peninsular Range batholiths, and the association of Santa Maria Basin with the Great Valley in the GV region. In other cases, the similarity of the tectonic pathways is less obvious; e.g., the association of the Los Angeles basins with the Salton Trough as distinct components of the ST region.
This raises a general caveat. Only the present-day structure of the crust can be directly observed; hence, any tomographic regionalization can only be interpreted as the net product of long-term tectonic activity. Subregions are not required to share a common tectonic evolution, only similar velocity profiles. Each region may, in principle, contain multiple provinces that evolved along different tectonic pathways. That said, inter-region associations among geologic provinces, such as those seen in Figure 7, should stimulate further scrutiny of the tectonic commonalities that might explain these associations.
One commonality is regional isostasy. From the centroid velocity profiles and mean elevations, we constructed an isostatic model and found that the regionalized crustal columns are in approximate isostatic balance ( Figure 13). The mass anomalies of the upper and middle crust correlate negatively with lower-crustal mass anomalies, and the mantle density anomalies required for exact balance are within the range inferred from petrologic data for eclogitic enrichment and depletion of the mantle lid. The dispersion of regions along the anti-diagonal of Figure 13 demonstrates that the diagnostic differences among CVM-S4.26 profiles are not simply distinctive features confined to the upper crust or the lower crust, but rather region-wide properties vertically correlated across the entire crustal column. These regional signatures are characterized by correlated variations of seismic velocities in crustal layers with dimensions comparable to the crustal thickness -a vertical scale that is easily resolved by the CVM-S4.26 inversions in areas of good wavepath coverage (Lee, Chen, Jordan, Maechling, et al., 2014). The ability of the full-3D tomography to distinguish these nearly isostatic crustal signatures helps to explain the robustness of the regionalization results.
Well-constrained regional boundaries obtained from a whole-crust profile analysis match major faults, topographic fronts, and geochemical transitions mapped at or near the surface. These results indicate that major lateral transitions in the crustal structure of Southern California are fairly narrow features that are nearly vertically aligned throughout the crust. In some cases, the vertical alignment is to be expected, such as along the strike-slip fault boundaries of the deep sedimentary basins (e.g., L1 and L6). The near verticality of other boundaries is perhaps more surprising, such as the SN-MD boundaries marked by L4 and L10. As we have shown, potential-field modeling supports the geographic localization of these transitions.
Therefore, in Southern California, the remarkable alignment of regional boundaries with mapped surficial features suggests that the horizontal scale of the regional transitions is typically on the order of, or less than, a crustal thickness. We conclude that the crustal boundaries delineated in Figure 14 are high-angle structures marking relatively narrow transition zones between the characteristic crustal columns of the juxtaposed regions. One exception to this generalization is the MD-PT transition marked by L5, which may extend over a zone in the eastern Mojave several times wider than the crustal thickness.