Volume 128, Issue 5, e2022JF006810
Research Article
Open Access

Mapping Landslide Susceptibility Over Large Regions With Limited Data

J. B. Woodard (1), B. B. Mirus (1), M. M. Crawford (2), D. Or (3, 4), B. A. Leshchinsky (5), K. E. Allstadt (1), N. J. Wood (6)

1. U.S. Geological Survey, Geologic Hazards Science Center, Golden, CO, USA
2. Kentucky Geological Survey, University of Kentucky, Lexington, KY, USA
3. Division of Hydrologic Sciences, Desert Research Institute, Reno, NV, USA
4. Department of Environmental Systems Science, Soil and Terrestrial Environmental Physics, ETH Zürich, Zürich, Switzerland
5. Department of Forest Engineering, Resources and Management, Oregon State University, Corvallis, OR, USA
6. U.S. Geological Survey, Western Geographic Science Center, Portland, OR, USA

Correspondence to: J. B. Woodard, [email protected]

First published: 04 May 2023

Abstract

Landslide susceptibility maps indicate the spatial distribution of landslide likelihood. Modeling susceptibility over large or diverse terrains remains a challenge due to the sparsity of landslide data (mapped extent of known landslides) and the variability in triggering conditions. Several different strategies for sampling the landslide locations used to train a susceptibility model have been employed to mitigate this challenge. However, to our knowledge, no study has systematically evaluated how different sampling strategies alter a model's predictor effects (i.e., how a predictor value influences the susceptibility output), which are critical to explaining differences in model outputs. Here, we introduce a statistical framework that examines the variation in predictor effects and the model accuracy (measured using receiver operator characteristics) to highlight why certain sampling strategies are more effective than others. Specifically, we apply our framework to an array of logistic regression models trained on landslide inventories collected at sub-regional scales over four terrains across the United States. Results show significant variations in predictor effects depending on the inventory used to train the models. The inconsistent predictor effects cause low accuracies when testing models on inventories outside the domain of the training data. Grouping test and training sets according to physiographic and ecological characteristics, which are thought to share similar triggering mechanisms, does not improve model accuracy. We also show that using limited landslide data distributed uniformly over the entire modeling domain is better than using dense but spatially isolated data to train a model for applications over large regions.

Key Points

  • We use a statistical framework to investigate the influence of data sampling strategies on landslide susceptibility model performance

  • The framework shows that the predictor data effects on output probability vary drastically with the sampling strategy used

  • The best sampling strategy we evaluate uses landslide data sampled uniformly from the entire modeling domain

Plain Language Summary

Landslide susceptibility maps show which areas in a region are more prone to landsliding than others. These maps are created from attributes of mapped landslides. The variation in landslide attributes and amount of landslide data required makes it difficult to map landslide susceptibility accurately over large regions. It is unclear whether any previously proposed methods to overcome these difficulties produce accurate susceptibility maps. Here, we develop a framework that evaluates the effectiveness of the following methods: using landslide data sets from only a few locations where data are readily available, applying models only to regions presumed to have landslide attributes similar to the regions used to develop the models, or gathering a few uniformly distributed (i.e., spread approximately equally) landslide data points. We show that the wide variation in landslide attributes over large regions reduces the accuracy of landslide susceptibility models that are developed using data from only a few locations. Restricting model application to regions with presumed similar attributes does not improve model performance. However, using a limited landslide data set that covers the entire region produces accurate susceptibility maps.

1 Introduction

Landslides (defined here as any form of mass wasting, including debris flows, rock falls, or rotational slides) occur naturally across the world and cause substantial losses in life, infrastructure, property, and economies (Froude & Petley, 2018; Kirschbaum et al., 2015; Mirus et al., 2020; National Research Council, 1985; Varnes & IAEG Commission on Landslides, 1984), with the highest disaster risk occurring among the world's most vulnerable populations (Hallegatte et al., 2017). As the climate continues to change, the frequency and intensity of severe weather events are expected to increase in some parts of the world (Pendergrass & Knutti, 2018), which may result in increased landslides and associated losses (Froude & Petley, 2018; Haque et al., 2019; IPCC, 2019; Kirschbaum et al., 2015). As such, resources from many countries have been devoted to studying landslides and mitigating future losses. Products from these studies include hazard maps (e.g., Kirschbaum & Stanley, 2018; Micu et al., 2023; Nowicki Jessee et al., 2018), susceptibility maps (e.g., Crawford et al., 2021; Huang et al., 2020; Hughes & Schulz, 2020), early warning systems (e.g., Baum & Godt, 2010; Guzzetti et al., 2020), and emergency response plans (e.g., Godt et al., 2022; Wooten et al., 2017). Although these efforts have improved our knowledge of and response to landslides, additional work would be beneficial to further reduce landslide effects.

Susceptibility maps provide critical information on the spatial pattern and likelihood of landslide occurrence given the local terrain conditions (Reichenbach et al., 2018). In contrast, hazard maps quantify the timing and magnitude of landslides, and risk maps measure the expected losses from landslides. Regional-scale or larger (>1,000 km2) maps are fundamental for mitigating future losses from landslides by providing uniform information across scales relevant for land management, planning, infrastructure, and emergency response decisions (Godt et al., 2022). Several methods for categorizing landslide susceptibility have been used, including geomorphic mapping, heuristic methods, physically based methods, and data-driven statistical models (Reichenbach et al., 2018). Statistical methods are generally preferred when assessing landslide susceptibility over large regions due to their ability to provide estimates without the prohibitively detailed and extensive data necessary for the parameterization and evaluation of physically based methods. Statistical methods facilitate leveraging large and complex data sets that are often incomplete while outputting accurate results (Korup & Stolle, 2014). These models require the attributes (e.g., slope, soil thickness) of the modeling domain (i.e., the area where the model is applied) and landslide inventories that identify areas with geomorphic evidence of landsliding. The models output probabilities that indicate the relative level of landslide susceptibility within the modeling domain by estimating the probability of a location containing a mapped landslide. Since the 1980s, hundreds of papers have been published that evaluate the use of statistical models (Reichenbach et al., 2018). Common types of statistical models include logistic regression (Budimir et al., 2015), random forest (Chen et al., 2017; Tanyu et al., 2021; Trigila et al., 2015), generalized additive models (Bordoni et al., 2020; Steger et al., 2022), and deep learning (Thi Ngo et al., 2021). Often, the overall accuracies among these model types are comparable (Chen et al., 2017; Pradhan, 2013; Reichenbach et al., 2018; Trigila et al., 2015; Wang et al., 2019; Youssef et al., 2016). However, logistic regression is the most commonly used due, in part, to its simplicity and ease of implementation (Steger et al., 2016, 2017).

Most landslide susceptibility models (LSSMs) are, by design, dictated by input data. Here, LSSM refers exclusively to data-driven statistical methods. This reliance presents a fundamental challenge when trying to apply LSSMs at regional scales or greater due to the chronic lack of consistent, accurate, and representative landslide inventories over the entire modeling domain. The landslide inventories used to train the models should outline areas within the modeling domain with geomorphic evidence of landsliding. Despite the proliferation of new automated landslide mapping techniques that use increasingly available remote sensing data (e.g., Benz & Blum, 2019; Ghorbanzadeh et al., 2021; Nagendra et al., 2022), the landslide inventories required for LSSMs are still lacking over most of the world.

Previous attempts to create susceptibility maps over large areas have used several methods to model regions with little or no landslide data (Von Ruette et al., 2011). Van Den Eeckhaut et al. (2012) and Broeckx et al. (2018) created susceptibility maps over the European and African continents, respectively, using an inventory of landslide and non-landslide (i.e., no geomorphic evidence of landsliding) locations they describe as uniformly distributed in space. That is, the coverage of landslide locations is approximately equal across the area of interest. Van Den Eeckhaut et al. (2012) used an inventory of 1,340 landslides and Broeckx et al. (2018) used an inventory of 18,050 landslides. Although the inventories used were admittedly limited in terms of landslides per unit area, the authors argued that a uniformly distributed landslide inventory with samples from a range of different environments (i.e., across the entire modeling domain) creates an accurate LSSM. Stanley and Kirschbaum (2017) developed a heuristic fuzzy logic approach for modeling susceptibility on a global scale. The fuzzy logic approach combines heuristic assumptions about the effects of different predictors on susceptibility with the measured effects derived from the available landslide inventories to make an LSSM. A predictor is an environmental attribute (e.g., slope, soil thickness) used by the LSSM to evaluate landslide susceptibility. A predictor effect measures the change in the model outcome (i.e., the probability of landslide occurrence) due to a change in the predictor value and is often referred to as weights or coefficients, depending on the model used. The fuzzy logic method, in theory, helps overcome some of the data shortage problems by forcing the LSSM to include the expected effects of the predictors. The landslide inventory used by Stanley and Kirschbaum (2017) included one globally distributed database of 1,194 landslides derived largely from media and citizen scientist reports and eight higher-density inventories totaling 61,704 landslides mapped over targeted areas around the world that included individual U.S. states and a hurricane-affected area. Hervás (2007) and Hervás et al. (2010) outlined a framework for using a heuristic index-based approach to evaluate landslide susceptibility on regional scales with limited data. The index-based approach requires a user to assign a relative weight to each predictor based on their assumed effects on susceptibility. The user may then either use the purely heuristic weights or adjust them by evaluating their effectiveness on the available landslide data using different methods (e.g., analytic hierarchy process). As of 2018, the index-based approach was the second most used model for landslide susceptibility, comprising 29% of all publications on the subject (Reichenbach et al., 2018). Many authors have tried to refine this approach by defining different subregions of the mapping area of interest according to physiographic or climate attributes (Bălteanu et al., 2020; Günther et al., 2013; Malet et al., 2009; Wilde et al., 2018). By subdividing the modeled domain according to these attributes, the predictor effects may be better constrained due to similar triggering mechanisms and are expected to result in more accurate LSSMs. Despite these refinements, the index-based models remain largely dependent on expert opinion to determine susceptibility and may not account for variations in landslide characteristics obtained from data-driven approaches. 
Notwithstanding the variable approaches for building LSSMs with sparse and incomplete landslide inventories, the relative effectiveness of these approaches is still an open research question.

The purpose of these different methods for creating susceptibility maps is to obtain a representative model that accurately locates areas of mapped landslides over the domain of interest. For data-driven techniques, the level of model representation is largely dictated by the selection of landslide location data used to train the model (i.e., data sampling or model training strategy). A few notable studies have explored the impacts of different sampling strategies in detail. Tanyas et al. (2019) examined 25 earthquake-induced landslide inventories from around the world to evaluate which inventories were representative of others using logistic regression model performance metrics and k-means cluster analysis on the predictor data sets from each inventory. They show that the level of representativeness of a given event inventory varied across the other inventories and that grouping data sets by predictor similarities (k-means) did not significantly improve their representativeness. Petschko et al. (2013) analyzed the effects of dividing the study domain into distinct homogeneous subdomains with geological similarities for creating a regional-scale susceptibility map. They demonstrate that the most influential predictors were inconsistent between models trained on data from the different subdomains. This suggests differences in the level of similarity in landslide characteristics between each subdomain. However, the authors do not explore this phenomenon any further. Additionally, Kornejady et al. (2017) explored the differences between using random sampling and a Mahalanobis distance-dependent sampling strategy (Tsangaratos & Benardos, 2014). This technique prevents a purely random selection of data for model training and testing, instead creating a training data set that increases the variance in the landslide predictor values compared to most random sampling strategies. This approach improved the model's representativeness as reflected in an increase in model performance compared to random sampling. Despite this work, questions persist about the best sampling strategies to use over large regions and why some techniques perform better than others. The lack of consensus hinders the formulation of a consistent sampling strategy.

Validation procedures of susceptibility models generally use receiver operator characteristics (ROCs) (Ayalew & Yamagishi, 2005), qualitative evaluation based on expert opinion (Bălteanu et al., 2020), and comparisons to subregional-scale maps (Van Den Eeckhaut et al., 2012). The latter two methods are qualitative, preventing objective analysis of the accuracy of the LSSMs. The ROCs provide reproducible estimates of the effectiveness of the model at fitting the available data. However, these metrics are often measured without an objective evaluation of how factors within the model affect its outputs (Budimir et al., 2015; Reichenbach et al., 2018).

The purpose of this study is to evaluate how to improve susceptibility model performance over broad geographic regions when extrapolating to areas with limited landslide data. We do this by employing a statistical framework (workflow) that examines the predictor effects within LSSMs, which allows us to better understand how different sampling strategies affect the model outputs. This approach provides a more detailed understanding of the impact of different input data on model behavior and performance than previous efforts at evaluating data sampling strategies. We carry out several illustrative experiments to help determine the effectiveness of different sampling strategies for modeling susceptibility over large and diverse regions. The results of this study can help determine best practices and mitigate the misrepresentation of landslide susceptibility that results from models with low accuracies.

2 Methods

We employ a framework for understanding LSSM output and evaluating different sampling strategies for characterizing landslide susceptibility over large and diverse terrains. We compile landslide and predictor data from four regions across the United States with varying degrees of environmental similarity. Our data preprocessing steps are outlined in Figure 1 and follow the recommendations of previous work (e.g., Budimir et al., 2015; Chang et al., 2019; Ozturk et al., 2021; Segoni et al., 2020). After processing the data, LSSMs are created using Bayesian logistic regression (Das et al., 2012). Implementing the common logistic regression model within a Bayesian framework incorporates prior information to constrain the logistic coefficients and allows the explicit treatment of the uncertainty in the model's predictor coefficients and probability output (Korup, 2021; Korup & Stolle, 2014). After the different LSSMs are trained, we compare their logistic regression coefficients to measure the predictor effects derived from each model. We also estimate the ROC area under the curve (AUC) metric for each LSSM when applied to different data sets. Comparing the variation in predictor effects between the models with the estimated ROC-AUC values helps determine why some sampling strategies produce more accurate LSSMs than others.

Figure 1. Workflow of landslide susceptibility mapping.

We use our framework to carry out an array of experiments that illustrate three commonly used landslide data sampling techniques. First, we study the effects of applying a susceptibility model trained on data collected across diverse regions to areas with no data. We simulate this by training LSSMs on all the landslide inventories except one, testing the LSSM on the omitted site, and repeating so that each site is left out once. Second, we test whether limiting model development and application to regions with shared physiographic and ecologic characteristics improves model performance. Our selected landslide data sets include inventories with varying degrees of ecological and physiographic similarity. By training models on a single inventory and testing them on another inventory with shared ecological and physiographic properties, we can determine whether restricting model training and application based on these attributes improves model performance. Third, we evaluate the effectiveness of using a limited but uniformly distributed landslide inventory to develop LSSMs. That is, we intentionally do not include every known landslide within an area (limited) but instead randomly sample from a compilation of all landslides across the study areas (uniformly distributed) using a Mersenne-Twister random number generator (Matsumoto & Nishimura, 1998). To do this, we train a model on 5% of all the available landslide data from the compiled inventories and then test the model on the remaining 95% of the data. All the data processing and modeling are carried out in ArcGIS Pro (Esri Inc., 2021) and R (R Core Team, 2016).
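For concreteness, the leave-one-out training groups from the first experiment can be assembled as in the following R sketch; the site labels and the representation of each group as a character vector are illustrative, not the authors' code.

    # Minimal sketch: build the leave-one-site-out training groups
    sites <- c("Magoffin", "Doddridge", "Macon", "Elkhorn")
    groups <- lapply(sites, function(s) setdiff(sites, s))
    names(groups) <- paste0("All-", sites)
    # e.g., groups[["All-Doddridge"]] contains Magoffin, Macon, and Elkhorn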

2.1 Physiographic and Ecological Divisions

We use ecological regions (ecoregions) and physiographic provinces to subdivide the continental United States for susceptibility model tests (Figure 2). Level II ecoregions divide North America into 50 ecologically distinct areas at the subcontinent scale (Omernik & Griffith, 2014). Ecoregions are identified by analyzing spatial patterns of factors that affect the local ecosystem. Factors include geology, landforms, soils, vegetation, climate, land use, wildlife, and hydrology. The 25 physiographic provinces of the contiguous United States group areas with common topographic and geologic characteristics (Fenneman & Johnson, 1946). In contrast to ecoregions, physiographic provinces are distinguished by homogenous landforms that result from geologic structures and do not consider variations in climate or vegetation (Fenneman, 1917). Ecoregions and physiographic provinces may delimit regions with similar landslide triggering mechanisms (e.g., rainfall, spring thaw) and terrain attributes (Bălteanu et al., 2020; Günther et al., 2013; Malet et al., 2009; Wilde et al., 2018).

Figure 2. (a) Map of the level II ecoregions and physiographic provinces of the contiguous United States. Ecoregions (Omernik & Griffith, 2014) are colored according to the legend, while the physiographic provinces (Fenneman & Johnson, 1946) are outlined and labeled on the map. The locations of the four sites analyzed in this study are colored in red. (b) Zoomed-in map showing the location of the Elkhorn Ridge Wilderness. (c) Zoomed-in portion of the study sites in the eastern United States.

We chose four locations to explore the effects of physiographic and ecological grouping on LSSM performance (Figure 2). These locations were chosen for their variable levels of physiographic and ecological similarity and the systematic approaches used to compile their landslide inventories (see Section 2.2). Magoffin County, Kentucky, is in the Appalachian Plateaus physiographic province and the Ozark/Ouachita-Appalachian Forests ecoregion. Doddridge County, West Virginia, shares the same physiographic and ecological divisions as Magoffin County. Macon County, North Carolina, shares the same ecoregion as the previous two but is in the Blue Ridge physiographic province. Lastly, the Elkhorn Ridge Wilderness, California, is in the Pacific Border physiographic province and the Marine West Coast Forest ecoregion. Attributes of these ecoregions and physiographic provinces are shown in Table 1.

Table 1. Physiographic Province and Level II Ecoregion Attributes

Physiographic province:
Name                | Geology                                                  | Topography
Appalachian Plateau | Mostly undeformed Paleozoic sedimentary rocks            | Steep rugged terrain
Blue Ridge          | Extensively deformed Precambrian metamorphic rock        | Steep rugged terrain
Pacific Border      | Extensively deformed and faulted rocks of variable ages  | Steep rugged terrain

Level II ecoregion:
Name                               | Precipitation range (mm per year) | Vegetation
Ozark/Ouachita-Appalachian Forests | 900–1,500                         | Low mountain forests
Marine West Coast                  | 650–5,000                         | Coniferous forests

2.2 Landslide Inventory

We compiled existing inventories for Magoffin County (Crawford, 2023; Crawford et al., 2021), Doddridge County (Kite et al., 2019), Macon County (Wooten et al., 2017), and the Elkhorn Ridge Wilderness (Wills et al., 2016). The different inventories are publicly available and consist of polygons and points of landslide features (i.e., head scarp, flanks, toe slopes, and hummocky topography) apparent in base maps (e.g., slope, hillshade, curvature, contour) derived from digital elevation models (DEMs), aerial photography, and field investigations. Details of the different inventories are shown in Table 2. All four inventories were mapped by experienced geologists using a well-defined and systematic approach to landslide identification, as detailed in each reference above, with high-resolution DEMs or aerial imagery. As such, all the mapped locations of landslides considered in this study meet or exceed the criteria for good confidence (or level 3) defined by Mirus et al. (2020), meaning the landslide features are at or near (within the resolution and accuracy limits of the identification tools) their mapped locations. Although no landslide inventory is perfect, the dense coverage of mapped landslides and the systematic approaches of the different mapping teams make these inventories highly suitable for our study objectives. Shaded relief maps of sections of the counties with landslide points plotted are shown in Supporting Information S1 (Figures S1–S4).

Table 2. Landslide Inventory Attributes

Location | Mapping tools | Landslide data format | Landslide count | Location area (km2) | Landslide density (km−2) | Reference
Magoffin County, Kentucky | 1.5-m DEM and derivatives, aerial photography, field reconnaissance | Polygon (total affected area) | 2,003 | 800 | 2.5 | Crawford et al. (2021)
Doddridge County, West Virginia | 3-m DEM and derivatives, landslides verified by two independent surveyors | Point (at head scarp) | 1,731 | 829 | 2.09 | Kite et al. (2019)
Macon County, North Carolina | 6-m DEM and derivatives, aerial photography, geologic maps, field reconnaissance | Point (at head scarp) and Polygon (total affected area or deposit) | 640 | 1,347 | 0.48 | Wooten et al. (2017)
Elkhorn Ridge Wilderness, California | Variable-resolution DEMs and their derivatives, aerial photography, field reconnaissance, previous geologic data | Point (if too small to map on 1:24,000 map) and Polygon (total affected area or deposit) | 3,087 | 711 | 4.34 | Wills et al. (2016)

Landslide locations are standardized to points because of the inconsistent data formats between the inventories. We convert landslides mapped as polygons to points by finding the highest elevation point within the polygon. In cases where multiple pixels have the same maximum elevation, we select the pixel with the highest slope. The point of highest elevation within the polygon will more closely approximate the location of the landslide head scarp. The LSSMs also require training data that contain non-landslide points, which represent areas without any signs of landslide occurrence. To extract the non-landslide points, we randomly sample areas outside the mapped landslide polygons. For landslides originally mapped as points, we sample locations outside a buffer with a radius derived from the average area of the polygons within the same data set, where possible. All landslides in the Magoffin data set are mapped as polygons, so there are no points to buffer. Macon and Elkhorn use radii of 158 and 150 m, respectively. For Doddridge County, only point data of landslide head scarps exist, so we use the radius derived from Magoffin County (45 m) due to the shared physiographic province and ecoregion. Buffering the landslide point data helps prevent sampling landslide locations as non-landslide locations. Importantly, work by Zhu et al. (2017) and Nowicki Jessee et al. (2018) showed that varying the buffer size does not significantly affect susceptibility model output.
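The conversion rule can be sketched in R as follows, assuming the DEM, slope raster, and polygon mask are held as aligned matrices (an illustrative representation; the original processing was done in ArcGIS Pro):

    # Return the index of the highest-elevation cell inside a landslide
    # polygon, breaking elevation ties by the steepest slope
    polygon_to_point <- function(dem, slp, mask) {
      cells <- which(mask)                         # cells inside the polygon
      top <- cells[dem[cells] == max(dem[cells])]  # highest-elevation cells
      top[which.max(slp[top])]                     # tie-break on slope
    }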

The sampling ratio between landslide and non-landslide points can have significant effects on model outcomes due to potential sampling bias (King & Zeng, 2001; Nad'o & Kaňuch, 2018; Oommen et al., 2011). To help mitigate these effects, we followed the methods outlined by Oommen et al. (2011) and King and Zeng (2001) to detect the most appropriate sampling ratio. A frequentist logistic regression model is run on a series of sampling ratios of non-landslide to landslide points ranging from 100:1 to 1:1. Values of recall, precision, and the weighted harmonic mean of precision and recall (F-measure) are then calculated for the landslide and non-landslide classes. The sampling ratio that shows the most consistent recall, precision, and F-measure values within each class has the least sampling bias. We determine this by measuring the standard deviation of the three metrics within each class and finding the ratio with the smallest Euclidean distance of the two class standard deviations (i.e., treating the two standard deviations as a vector and taking its length). In this study, a sampling ratio of 1:1 proved best for mitigating sampling bias.
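Our reading of this selection criterion, expressed as a hedged R sketch (the bookkeeping and function names are illustrative, not the authors' code):

    # metrics[[i]]: 2 x 3 matrix of recall, precision, and F-measure for the
    # i-th candidate ratio (rows: landslide and non-landslide classes)
    pick_ratio <- function(ratios, metrics) {
      d <- sapply(metrics, function(m) {
        class_sd <- apply(m, 1, sd)  # SD of the three metrics per class
        sqrt(sum(class_sd^2))        # length of the two-class SD vector
      })
      ratios[which.min(d)]           # ratio with the most consistent metrics
    }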

2.3 Model Predictors

We compile an array of predictors to differentiate between landslide and non-landslide locations (Table S1 in Supporting Information S1). Predictors are chosen based on their effectiveness in other studies for determining landslide susceptibility (e.g., Budimir et al., 2015). They are designed to characterize the local geology (e.g., lithology), soil (e.g., soil thickness), topography (e.g., slope), hydrology (e.g., flow accumulation), anthropogenic impacts (e.g., proximity to roads), climate (e.g., mean annual precipitation), weather (e.g., precipitation frequency), and seismology (e.g., peak horizontal acceleration), all of which have the potential to influence the driving and resistive forces that affect slope stability. The raw predictor data are available in a variety of different resolutions and formats (i.e., vector and raster). Thus, all data are converted to raster and resampled to the same resolution as the DEMs used to derive the topographic predictors (i.e., 10 m) using the nearest neighbor interpolation method. We use a 10-m DEM from the U.S. Geological Survey 3D Elevation Program database (U.S. Geological Survey, 2019) because it is the finest resolution available over all the study sites, and coarser resolutions would obscure the predictor values that led to ground failure. Categorical predictors are converted to model matrices with K−1 different categories (McElreath, 2020, Section 5.4.2), where K is the number of categories within a given predictor. The Soil Survey Geographic Database (SSURGO) (U.S. Department of Agriculture, 2021b) is a high-resolution soil database but has some null data points within the study regions. Thus, we fill the null data with values from the coarser-resolution STATSGO data set (U.S. Department of Agriculture, 2021a). To increase computational efficiency in the susceptibility models, we standardize the compiled predictors of each location included in training the LSSM to have a mean of zero and a standard deviation of one (Kruschke, 2015, Section 17.2.1.1). Herein, we refer to the combined data sets of the locations used to train a particular model as a training group. Although some of the included predictors are not available in many parts of the world, the overall findings of our analysis are pertinent to any study that uses machine learning or other statistical methods to assess landslide susceptibility.
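Because test data are later standardized with the training group's statistics (Section 2.5.1), the standardization step amounts to the following R sketch (variable names are illustrative):

    # Standardize training predictors to zero mean and unit SD, and reuse the
    # training means/SDs on any test set (train_X, test_X: numeric matrices)
    mu  <- colMeans(train_X)
    sds <- apply(train_X, 2, sd)
    train_Z <- scale(train_X, center = mu, scale = sds)
    test_Z  <- scale(test_X,  center = mu, scale = sds)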

Correlation between predictors can cause inaccurate estimates of the measured predictor effects, which makes meaningful comparisons between the LSSMs difficult. Thus, we use the variance inflation factor (VIF), a measure of collinearity between predictors in the LSSM (James et al., 2013), to eliminate the predictors that are most correlated with others (Hong et al., 2015) using an iterative approach for each training group. For each iteration, we run a frequentist logistic regression model and eliminate the predictors in the highest tenth percentile of VIF values greater than five, continuing until all predictors have a VIF value less than five. A VIF value of five is a conservative threshold for eliminating collinearity within a statistical model (James et al., 2013). Using an iterative approach allows us to account for variations in the VIF values from changes in the predictor combinations. After the correlated predictors are removed from the LSSMs, predictors are matched between training groups by eliminating predictors absent in any member of the training groups. This allows us to analyze the variation in effects of the same predictors between training groups.
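A minimal sketch of the iterative VIF elimination, assuming the vif() function from the car package and continuous predictors (the text does not specify the authors' implementation):

    library(car)  # for vif(); an assumed tool, not stated in the text
    drop_collinear <- function(dat, response) {
      preds <- setdiff(names(dat), response)
      repeat {
        fit <- glm(reformulate(preds, response), data = dat, family = binomial)
        v <- car::vif(fit)
        high <- v[v > 5]
        if (length(high) == 0) return(preds)  # all VIFs are now below five
        # drop predictors in the highest tenth percentile of VIFs above five
        preds <- setdiff(preds, names(high[high >= quantile(high, 0.9)]))
      }
    }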

2.4 Modeling Strategy

Landslide susceptibility models are created and tested on the following training groups (Figure 3):
  1. Two models trained on a random sampling of half of the Magoffin landslide data set (Magoffin 1 and Magoffin 2) (Figure 3a). This training group acts as the control by measuring the model accuracies when the training and test data are from the same area.

  2. One training group for each location individually (four in total) (Figure 3b). Training LSSMs at one location and testing them on another will determine whether grouping data sets according to physiographic or ecological attributes is a meaningful division for creating broad-scale LSSMs.

  3. A training group that consists of data from all the locations (All) (Figure 3c). This training group is tested on the training group for each individual location to evaluate how the model behavior changes when it is trained on an aggregated data set compared to models developed on a subset of the data.

  4. Four training groups with all the data except one location (e.g., All-Doddridge indicates all locations except Doddridge) (Figure 3d). These training groups are tested on the withheld location. This experiment evaluates the expected model accuracies when applying an LSSM trained on a compilation of data from diverse regions to an area with no data.

Figure 3. Schematic examples of the different modeling strategies described in Sections 2.4 and 2.6. Red blobs illustrate the landslide training data, and the labels indicate the locations where the trained model is applied (i.e., tested).

The models trained on these groups are evaluated by comparing the measured predictor effects between the models and by testing each model on the other training groups' data and measuring its performance using ROC-AUC.

2.4.1 Logistic Regression

We use Bayesian logistic regression to model landslide susceptibility. Logistic regression estimates the log-odds, or logit function $\eta(P) = \ln[P/(1-P)]$, of a binary outcome (i.e., landslide occurrence or no landslide occurrence) given some predictor input data, where P is the probability of there being a mapped landslide. The logistic regression model for observation i and M input predictors is expressed as follows:

$$\eta(P_i) = \beta_0 + \sum_{m=1}^{M} X_{i,m}\,\beta_m \qquad (1)$$

where $X$ is an N by M predictor data matrix, N is the number of observations, $\beta$ is the coefficient vector of length M, $\beta_0$ is the intercept, and $\eta$ is the logit (log-odds) function. The logistic coefficient of a given predictor variable provides the change in log-odds with a unit change in the predictor variable. We use logistic regression because it is one of the most commonly used models for predicting landslide susceptibility (Reichenbach et al., 2018). Implementing a statistical analysis of the different logistic regression coefficients between the training groups allows us to statistically evaluate changes in predictor effects on landslide susceptibility.

Bayesian logistic regression incorporates uncertainty into the model by using probability distributions of the model parameters. While the frequentist methodology estimates the probability of the true value of a parameter being within a given range using confidence intervals established by the data, the Bayesian framework assumes the data to be fixed and knowledge about the parameter to be a distribution (van de Schoot et al., 2021). Bayesian methods incorporate prior knowledge about the parameters of interest, provide uncertainty estimates of the probability outputs, and offer greater flexibility in post-processing, which facilitates more transparent and interpretable results (Das et al., 2012; Korup, 2021; Loche et al., 2022). These benefits help prevent model users from making unjustified conclusions about the data compared to traditional frequentist models (Wasserstein & Lazar, 2016).

The basis of Bayesian analysis is that the probability of the unobserved parameter(s) of interest ($\theta$) given some data ($x$) is expressed by the posterior probability $P(\theta \mid x)$:

$$P(\theta \mid x) = \frac{P(\theta)\,P(x \mid \theta)}{P(x)} \qquad (2)$$

where $P(\theta)$ is the prior probability and $P(x \mid \theta)$ is the likelihood function. The prior probability is the estimated probability of $\theta$ before $x$ is observed. The likelihood function is the probability of observing $x$ for a given $\theta$ and is given by Equation 1. The posterior probability distributions using a logistic regression model have no analytical solution; thus, a Markov Chain Monte Carlo approach is used to numerically estimate $P(\theta \mid x)$ (Kruschke, 2015). We assume independent modeling regions and provide Gaussian priors with a mean of zero and standard deviations of 2.5 and 10 for the parameter coefficients ($\beta$) and intercepts ($\beta_0$), respectively. As the input data are large and standardized, these priors are considered weakly informative (Gelman et al., 2013). That is, with the amount of data we use in these models (Table 2), the likelihood function will dominate and the priors will not heavily affect the posterior probabilities. Posterior probabilities are estimated using the statistical software Stan (Stan Development Team, 2017) through the computational environment R (R Core Team, 2016). Stan is run with four chains at 4,000 iterations each and a 1,000-iteration warmup (burn-in) to omit the unrepresentative initial values of the model before it converges on the representative parameter distributions. Diagnostics run on the Markov chains indicate that they were well mixed (i.e., provide a representative sampling of the posterior distribution).
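The model specification can be approximated with the rstanarm interface to Stan, as in the sketch below; the data frame and the binary landslide column (slide) are illustrative, and the authors may instead have coded the model directly in Stan.

    library(rstanarm)  # Bayesian applied regression modeling via Stan
    fit <- stan_glm(
      slide ~ .,                        # landslide indicator vs. all predictors
      data   = train_data,              # standardized training group
      family = binomial(link = "logit"),
      prior           = normal(0, 2.5), # Gaussian prior on coefficients
      prior_intercept = normal(0, 10),  # Gaussian prior on the intercept
      chains = 4, iter = 4000, warmup = 1000
    )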

2.5 Model Evaluation

In addition to the widely used ROC metrics, we use an analysis of regression coefficients to understand the logistic regression model performance. Each method is explained in detail in the following sections. Although ROC metrics can provide meaningful insights into the overall model performance, they do not elucidate the controls of a given model's performance. A better understanding of the model by analyzing the predictor effects provides valuable information about why certain LSSMs perform better in different scenarios. This will inform modelers on the effects of the input data and the limits of a model's utility when applied to different settings.

2.5.1 Receiver Operator Characteristics

The AUC metric of ROC is used to evaluate the performance of each model applied to different training data. The ROC curves compare the true positive rate against the false-positive rate (see Oommen et al., 2011, for an overview). AUC values near one indicate perfect model accuracy (i.e., every landslide and non-landslide from the data is modeled correctly), whereas AUC values near 0.5 indicate the model classification is equivalent to random guessing. Generally, values from 0.5 to 0.6, 0.6 to 0.7, 0.7 to 0.8, 0.8 to 0.9, and 0.9 to 1.0 are classified as poor, average, good, very good, and excellent performance, respectively (Yesilnacar, 2005). Importantly, before applying the trained model on the test data sets, predictors are standardized using the means and standard deviations of the model's training group. We measure the ROC-AUC for each LSSM both on its training data set and for every permutation of the other training group pairs. These comparisons allow us to first evaluate how accurate the model is at recreating its training data and then determine how accurate the model is when applied to other landslide data sets. We use the posterior distributions of $\beta_0$ and $\beta$ to obtain distributions of ROC-AUC for each LSSM comparison.
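Because ROC-AUC is equivalent to the Mann-Whitney rank statistic, the AUC for each posterior draw can be computed with a few lines of base R; a minimal sketch:

    # AUC via the rank-sum identity: labels are 0/1 indicators of mapped
    # landslides, scores are the modeled landslide probabilities
    roc_auc <- function(labels, scores) {
      r  <- rank(scores)  # average ranks handle tied scores
      n1 <- sum(labels == 1)
      n0 <- sum(labels == 0)
      (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
    }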

2.5.2 Logistic Coefficient Comparison

Raw coefficients of uncorrelated predictors cannot be directly compared between LSSMs (Mood, 2010) without first converting the logistic coefficients to a measure of probability changes (e.g., average marginal effects, AME). In brief, the fixed variance of the logit function (Equation 1) requires that any variance in the unobserved response must be accommodated by a change in the logistic coefficients ($\beta$). A detailed explanation and proof of this concept is found in Mood (2010). The AME measures the average change in landslide occurrence probabilities attributed to a given predictor (m) and is given by

$$\mathrm{AME}_m = \beta_m \frac{1}{N} \sum_{i=1}^{N} f(z_i) \qquad (3)$$

$$f(z_i) = \frac{e^{z_i}}{\left(1 + e^{z_i}\right)^2} \qquad (4)$$

where $z_i = \beta_0 + \sum_{m=1}^{M} X_{i,m}\,\beta_m$ is the linear combination of the predictors with their coefficients ($\beta$) for the ith observation, N is the number of observations, and $f$ is the logistic probability density function given by the derivative of the logistic cumulative distribution function. Large magnitudes of AME indicate that the given predictor has a major influence on landslide susceptibility, whereas small AME magnitudes indicate that the predictor has only minor influence. The AME sign shows if the probability decreases (negative sign) or increases (positive sign) with an increase in the predictor value. In summary, AME distributions can be used to directly compare logistic LSSMs trained on different data.
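Equations 3 and 4 translate directly to R; the sketch below evaluates the AME of predictor m for a single posterior draw of the intercept and coefficients (function names are ours):

    # Logistic density f(z) (Equation 4) and the AME of predictor m (Equation 3)
    logistic_pdf <- function(z) exp(z) / (1 + exp(z))^2
    ame_m <- function(X, beta0, beta, m) {
      z <- beta0 + as.vector(X %*% beta)  # linear predictor per observation
      beta[m] * mean(logistic_pdf(z))     # average marginal effect
    }

Applying ame_m() across all posterior draws yields the AME distributions summarized in Figure 4.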

2.6 Limited Data Experiment

We simulate the effects of using a limited landslide inventory that is uniformly distributed in space to model landslide susceptibility over large areas by subsampling all the landslide data previously described (Sections 2.2 and 2.3). We randomly sample 5% of the compiled landslide and non-landslide inventory data using a 1:1 sampling ratio from all the study sites to optimize and train a Bayesian logistic regression model (Figure 3e). We then evaluate each model's ROC-AUC metric on all the landslide data not used in the model training phase (i.e., 95% of all the landslide data) using the means of the posterior distributions of the model parameters. We also analyze the models' AME distributions to compare them with the ROC-AUC metrics. We iterate this procedure 100 times to estimate the variation in model performance from different training data.
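A condensed sketch of one way to run this experiment in R, reusing roc_auc() from Section 2.5.1; fit_lssm() and the slide column are placeholders for the Bayesian fit and the binary landslide indicator (R's default random number generator is the Mersenne-Twister):

    # 100 iterations of the 5%/95% split; aucs collects the test ROC-AUCs
    aucs <- replicate(100, {
      idx <- sample(nrow(dat), size = round(0.05 * nrow(dat)))
      fit <- fit_lssm(dat[idx, ])  # train on the 5% sample (placeholder)
      p   <- predict(fit, newdata = dat[-idx, ], type = "response")
      roc_auc(dat$slide[-idx], p)  # test on the remaining 95%
    })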

3 Results

3.1 Predictor Effects

Eleven predictors were included in the LSSMs after the correlation and matching phases of the workflow (Figure 1). These predictors were soil thickness (Thick), available water holding capacity (AWC), slope, transformed aspect (Aspect; McCune & Keon, 2002), topographic roughness with 30- and 100-m windows (Rough30 and Rough100), flow length (FlowLength), flow accumulation (FlowAcc), topographic wetness index (TWI), proximity to rivers (ProxRivers), and proximity to roads (ProxRoads). Figure S5 in Supporting Information S1 shows that the non-standardized distributions of the slope, Rough30, and Rough100 predictors for landslides generally have elevated values compared to non-landslide locations. Most of the other predictors show little variation between landslide and non-landslide locations. Many of the predictors show different distributions between training groups.

Posterior distributions of the AMEs show large variation between locations for the logistic regression LSSMs (Figure 4). There is no consistent sign (i.e., direction) within many of the predictors, indicating that an average unit increase in predictor values may increase or decrease the probability of detecting a mapped landslide, depending on the location. For instance, Macon County shows a negative AME value for the slope, indicating that, on average, steeper slopes decrease the chances of capturing mapped landslides in that area, whereas in all the other study areas, increased slope values increase the chances of capturing a mapped landslide. In addition to the inconsistent AME signs, the AME magnitudes are highly variable. This indicates highly variable predictor importance between locations. The most consistent AME distributions are between the split Magoffin data distributions (Magoffin1 and Magoffin2) and the combined Magoffin data (Magoffin1+Magoffin2). The extreme AME values for FlowAcc are due to the highly skewed distribution of that predictor (Figure S5 in Supporting Information S1). However, the influence of the extreme values is minimized in the models due to their paucity. A more in-depth statistical analysis and interpretation of the AME posterior distributions is presented in Supporting Information S1 (Text S1, Figures S6 and S7). This analysis confirms that most of the predictors' AME distributions are credibly (95% credibility interval) different between locations. Finally, Figure 4 shows that many predictor AME distributions overlap with zero (e.g., Aspect), indicating that these predictors consistently have minimal influence on model performance. However, these predictors are influential in a minority of the models (e.g., Aspect for the Doddridge model). Other predictors show distribution magnitudes that generally have little overlap with zero (e.g., Slope and Rough30) suggesting that these predictors are consistently highly influential on the model outputs.

Figure 4. Posterior distributions of the parameter average marginal effects (AME) from the logistic regression model, plotted as boxplots. AME measures the average change in landslide occurrence probabilities attributed to a given predictor. The box hinges show the first and third quartiles, the whiskers extend to 1.5 times the inter-quartile range, and the vertical bars show the median values of the distributions.

3.2 Receiver Operator Characteristics

The ROC-AUC values show notable variations across the different model comparisons. All models perform satisfactorily (≥0.6) under resubstitution (i.e., when the same data are used to train and validate the model) (Figure 5). The Magoffin County self-comparisons show higher AUC values (mean of 0.70) compared to the other models with independent training and test data. By independent, we mean that no data are shared between the training and test data sets (e.g., the trained All-Locationx model evaluations), in contrast to ROC-AUC results from non-independent training and test data sets (e.g., the resubstitution model evaluations). However, comparisons of models within the same ecological or physiographic regions show no increase in model performance compared with areas outside a shared region. Models that include all the data and are applied to any specific location also perform better (average of 0.61) than models with independent training and test data. However, when the test location data are omitted from the compiled training set, model performance decreases (average of 0.55). The lack of overlap between many of the AUC distributions indicates that many of the differences in model performance are statistically meaningful (i.e., exceed random noise).

Figure 5. Area under the curve (AUC) distributions for receiver operator characteristics (ROC) calculated for various model experiments using logistic regression. The training data column indicates the (non-limited) data used to train the model, and the test data column indicates where the model was applied for measuring the ROC-AUC. Results are organized according to ecoregion and physiographic comparisons. All-Locationx indicates that the model was trained on data from every location except Locationx. The box hinges show the first and third quartiles, the whiskers extend to 1.5 times the inter-quartile range, and the vertical bars show the median values of the distributions.

3.3 Limited Data Experiment

Landslide susceptibility models developed with a limited but uniformly distributed landslide data set perform relatively well compared to models trained on areas outside their test areas (Figure 5). Figure 6 shows the ROC-AUC scores (gray dots) of the 100 model iterations trained on 5% of the compiled landslide inventory from all the study sites and tested on the remaining 95%. The average ROC-AUC value of the model iterations is 0.63. This is lower than the Magoffin self-comparisons, which used 50% of the landslide data, but higher than most of the other comparisons with independent training and test data sets (Figure 5). Even the minimum AUC score (0.615) from the limited data runs is higher than the AUC values of most of the All-Locationx and the ecoregion-segregated models. The LSSMs trained with limited data generally show predictor effects (i.e., AME distributions) consistent with those of the LSSMs trained using all the data (Figure 7). The magnitudes of the posterior AME values also reflect the influence of predictors in the combined domain. The high average magnitudes of slope, Rough30, and AWC suggest that, on average, these predictors are influential in the combined domain. In contrast, the predictors FlowLength, Thick, Aspect, FlowAcc, ProxRoads, ProxRivers, and TWI all have AME values that are indistinguishable from zero at high probability. This suggests that, on average, these parameters have little influence on the model output over the combined domain.

Figure 6. Kernel density plot of ROC-AUC scores from iterations of the logistic regression LSSMs trained on 5% of all the data and tested on the remaining 95%. The black line shows the kernel density, the red bar shows the mean ROC-AUC, and the gray dots show the mean ROC-AUC values for each model iteration, with vertical spacing set to facilitate visualization.

Figure 7. Posterior distributions of the logistic regression coefficient average marginal effect (AME) values from the limited-data experiment iterations. Multi-colored lines show the AME distributions for each of the 100 iterations, the bold black distribution shows the average distribution of all the iterations, and the bold red line shows the posterior AME distribution when using all the landslide data to train the model. Note that the center (i.e., mean) of all iterations (bold black line) is near the center (i.e., mean) of the posterior distribution developed using all the landslide data (bold red line).

4 Discussion

Our results illustrate why certain sampling (or model training) strategies designed for creating LSSMs for regional or greater scales perform better than others by examining the resultant variation in predictor effects. Through our analysis, we demonstrate the following:
  1. LSSMs trained with extensive local landslide data may still perform poorly when applied to other regions with no landslide data;

  2. Using a model in areas with shared physiographic provinces and level II ecological divisions does not guarantee improved performance;

  3. Data uniformly distributed (i.e., spread approximately equally) in extent but limited in quantity can produce relatively accurate LSSMs compared to the previous two approaches.

Thus, focusing efforts on obtaining uniformly distributed landslide inventories over the entire modeling domain will likely produce better results than using a few high-density but spatially isolated landslide inventories. Using spatially isolated landslide inventories to infer landslide susceptibility in data-poor regions has been common practice for regional and global applications of landslide susceptibility (Reichenbach et al., 2018; Stanley & Kirschbaum, 2017). Below we explore each of these points in detail.

4.1 Model Accuracy in Limited-Data Regions

We show that LSSMs are very sensitive to the local conditions of the different landslide inventories and that this sensitivity manifests in the predictor effects (Figure 4). Thus, applying LSSMs trained on a location with different predictor effects from the test location is likely to yield poor overall landslide susceptibility characterization, as indicated by the ROC-AUC scores (Figure 5). This variation in local conditions may indicate differences in the triggering events responsible for landsliding at our study locations. Although the landslide inventories used in this study do not include trigger mechanisms, the negative AME values estimated for the slope predictor at Macon County may indicate that landslides in that area are predominantly rainfall-triggered, whereas the other study areas may include more (or some) earthquake-triggered landslides (Marc et al., 2018; Meunier et al., 2008; Rault et al., 2019). Earthquake-triggered landslides generally cluster toward ridge crests, where slopes are the steepest, due to topographic amplification effects. Alternatively, this difference could reflect the influence of human activity on more accessible slopes within Macon County (Wooten et al., 2017). Finally, our framework explains why different methods used to determine landslide susceptibility over regions with limited data may not perform well.

The observed variable effects of the model predictors between locations indicate that applying LSSMs to limited-data regions may lead to spurious local results, as measured by the ROC-AUC and the divergence in predictor effects. While poor model performance when models are applied outside of their training domain is a commonly reported finding (e.g., Tanyas et al., 2019), our analysis highlights why this sampling strategy is ineffective. Applying LSSMs to regions where the model was not trained is often done to evaluate the versatility of the model. In most studies, this is carried out by dividing the landslide data set by random sampling (∼60% of publications), temporal attributes (∼20% of publications), or location (∼15% of publications) (Reichenbach et al., 2018). Thus, in most cases, the test data set is spatially near the training data set. The comparison between the split Magoffin data sets simulates the variation in local effects expected when testing LSSMs on areas spatially near, or overlapping, the training data (Figure 4). Although the local effects are not identical in sign and magnitude, they are relatively consistent, resulting in higher ROC-AUC values compared to other model evaluations that do not include the training data (Figure 5). However, when studies develop susceptibility models for areas that include regions with little or no landslide data, there is often no way to effectively evaluate the model performance over these regions. The low ROC-AUC scores of models applied to areas omitted from the training data indicate the model accuracies that might be expected when applying LSSMs to limited-data regions (Figure 5). A model applied to limited-data regions will likely omit important variations in predictor effects needed to accurately model landslide susceptibility (Figure 4).

Our observed differences in predictor effects between locations may partially explain the commonly poor results from heuristic and physically based susceptibility methods when applied to diverse terrains (e.g., Fusco et al., 2021). Like data-driven statistical methods, physically based methods are calibrated on available data from local observations that may not manifest the full range of attributes responsible for slope failure within the study site. Additionally, heuristic approaches (e.g., fuzzy logic and index-based methods) assume fixed predictor effects across the entire modeling domain. Thus, the susceptibility models developed using these methods may perform poorly when applied to areas with different landslide attributes (i.e., predictor values).

4.2 Ecological and Physiographic Divisions

We show that using continental-scale physiographic and ecological divisions to restrict where models are trained and applied sometimes produces maps worse than random susceptibility assignments (i.e., ROC-AUC values less than 0.5; Figure 5). Previous studies that use this approach assume that restricting the region where a model is trained and applied will lead to more uniformity in the predictor effects across the restricted domain and more accurate model performance. The logic for such restrictions appears sound because areas with similar climate and terrain are expected to have similar triggering mechanisms and landslide attributes. However, predictor effects are too diverse within subregions of the level II ecoregions and physiographic provinces for these divisions to improve the models' performances at the 10-m pixel scale used herein. Applying models to data sets with the same physiographic and ecological attributes as the training data did not help constrain the predictor effects between the training groups (Figure 4). This prevented any improvements in LSSM performance (Figure 5). It is possible that the physiographic and ecological divisions used herein are too broad to segregate the landslide inventories into representative groups or that they do not properly capture the environmental attributes that control slope stability. Work by Tanyas et al. (2019) attempted to segregate the modeling domain by clustering locations with similar landslide predictor values and also found negligible improvement in model performance compared to an aggregated model. Effective means of segregating a modeling domain into representative subdomains remains an open research question (Kornejady et al., 2017; Loche et al., 2022; Petschko et al., 2013; Tanyas et al., 2019). Future work could use our proposed framework to evaluate whether using more localized subdivisions reduces the variation of predictor effects in LSSMs.

4.3 Uniformly Distributed Landslide Inventory

A limited but uniformly distributed landslide inventory over the modeling domain performs relatively well (Figure 6); however, omitting any area from the model training phase may lead to poor susceptibility characterization of the withheld area despite assumed environmental similarities (Figure 5). The ROC-AUC values for the logistic regression models in the limited data experiment indicate above-average performance compared to the other model experiments with independent training and test data (Figure 6). Figure 7 illustrates the reason for this good performance: the predictor effects are relatively consistent between LSSMs trained on all the data (bold red lines) and LSSMs trained on only a small sample (multi-colored lines). In contrast, Figure 4 shows differences in the magnitudes and scales of the AME distributions between models. By using uniformly distributed training data, the model converges on the most representative coefficients over the whole domain, but at the expense of poorer accuracy at some smaller spatial scales whose predictor effects diverge from the average of all the data sets (see the Trained on All section of Figure 5; Figures 4 and 7). Other empirical observations in machine learning applications indicate that having at least 10 events (landslides) for every predictor is sufficient to estimate the predictors' coefficients within a prediction model (Moons et al., 2014; Pavlou et al., 2016; Peduzzi et al., 1996). As we sample 5% of the available data (373 landslides), we have ∼33 landslides for every predictor in our models, which may explain why the limited data experiments performed well on the 95% of data left for cross-validation.
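The sketch below (synthetic data; the predictor names, sample fraction, and coefficients are illustrative assumptions, not our model) shows one way to check both points numerically: it repeatedly fits a logistic regression on small uniform samples, compares each sample's average marginal effects (AME) against the full-data fit in the spirit of Figure 7, and reports the events-per-variable count underlying the 10-events-per-predictor rule of thumb.

```python
# Sketch only: AME stability under repeated small uniform samples, plus
# events-per-variable (EPV) for each sample.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 20000
X = pd.DataFrame({"slope": rng.uniform(0, 45, n),
                  "wetness": rng.normal(0, 1, n)})
y = (rng.random(n) < 1 / (1 + np.exp(-(-3 + 0.08 * X["slope"]
                                       + 0.5 * X["wetness"])))).astype(int)

def ame(model, X):
    # AME of predictor j in a logistic model: mean over points of
    # beta_j * p * (1 - p).
    p = model.predict_proba(X)[:, 1]
    return model.coef_[0] * np.mean(p * (1 - p))

full = LogisticRegression(max_iter=1000).fit(X, y)
print("full-data AME:", np.round(ame(full, X), 4))

frac = 0.05  # illustrative 5% uniform sample, echoing the experiment above
for seed in range(5):
    idx = np.random.default_rng(seed).choice(n, int(frac * n), replace=False)
    m = LogisticRegression(max_iter=1000).fit(X.iloc[idx], y.iloc[idx])
    epv = y.iloc[idx].sum() / X.shape[1]  # events (landslides) per predictor
    print(f"sample {seed}: AME = {np.round(ame(m, X.iloc[idx]), 4)}, "
          f"EPV = {epv:.0f}")
```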

While restricting the training data to so few landslides greatly limits how well the full range of possible environmental conditions, and hence external domains, is represented, using uniformly distributed but limited data is better than using more data that do not cover the entire modeling domain. For example, the large drop in ROC-AUC scores in regions not included in the compiled data sets illustrates the consequences of not having a uniformly distributed landslide inventory when using statistical LSSMs (Figure 5). The omission of any area with predictor effects different from the training data could result in very poor susceptibility characterization of that area. Regional-scale or larger landslide susceptibility models trained on spatially constrained inventories are unlikely to represent the full range of predictor effects necessary to accurately characterize landslide susceptibility over data-poor areas. Although some studies have found that LSSMs can perform well outside their training domain (e.g., Von Ruette et al., 2011), the reason for the good performance is often not explored in depth. Applying the framework used herein for analyzing predictor effects would allow a better understanding of the controls on LSSM performance in other studies. In summary, our results indicate that when modeling susceptibility over broad extents, randomly sampling landslide and non-landslide locations over the entire modeling domain captures a greater range of predictor effects and will generally improve model performance over using dense but spatially separated training data or grouping data sets by assumed environmental controls.

4.4 Improving Landslide Susceptibility Models at Regional Scales

We cannot exclude the possibility that differences in the landslide inventories or our sampling strategies contribute to some of the observed model variations between locations. The landslide inventories used were collected by different teams, with unique objectives, at different times, and using various methods (Table 2). This inconsistency is common for any inventory, regardless of scale or extent. However, restricting our study to landslide inventories that use well-defined and systematic approaches with high-resolution DEMs or aerial imagery, coupled with the systematization procedures implemented herein (Figure 1), minimizes the effects of inconsistencies in the landslide inventories (Sections 2.2 and 2.3). The lack of any specific location showing consistently divergent results supports the idea that we are accurately characterizing the relative controls on landslide susceptibility as described by the measured predictors, rather than artifacts caused by differences in the landslide mapping methodologies.

Although many challenges regarding the representativeness of data for susceptibility studies remain, our analysis offers several promising avenues for improvement. First, LSSMs applied to areas lacking a landslide inventory may poorly characterize susceptibility there because of differences in landslide characteristics not represented in the LSSM. The implications of the assumptions used to extrapolate susceptibility models across the entire modeling domain (including data-limited regions) need to be carefully explained to end-users. Second, extrapolating models developed in other regions based on the assumption that landslide triggering mechanisms and attributes are similar (e.g., due to shared ecoregions and physiographic provinces) may lead to spurious results. For instance, although Macon, Doddridge, and Magoffin share similar physiographic and/or ecological environments (Figure 2) that would suggest relatively consistent predictor effects, our analysis indicates otherwise (Figure 4). Third, the use of heuristic methods in other studies explicitly assumes specific predictor effects that may not match local observations. This is consistent with observations that national-scale susceptibility maps tend to under-represent the hazard in moderately sloping terrain (Mirus et al., 2020). Finally, our results indicate that using a uniformly distributed landslide inventory with widespread representation across the modeling domain will likely produce the most accurate susceptibility maps at regional or larger scales. By exposing the model to as many variations in predictor values as possible, the model will be more robust and produce more accurate results on diverse terrains (Figure 5) (Halevy et al., 2009). Thus, when attempting to create susceptibility models where no data currently exist, efforts focused on gathering a uniformly distributed sample of landslide and non-landslide locations across the entire study site, even if the inventory is limited, would be more useful than trying to gather spatially limited but more complete inventories.

5 Conclusion

Accurate LSSMs for large and diverse terrains are needed worldwide. Here, we use a statistical framework to evaluate the influence of several common sampling (or model training) strategies on model parameters and performance. This approach provides a more detailed understanding of the impacts of different input data on model behavior and performance than previous efforts. We emphasize that the choice of sampling strategy can have drastic impacts on the predictor effects within the model, which influence the representativeness of the model for new (i.e., unsampled) domains. For example, sampling a few spatially dense but isolated landslide inventories scattered throughout the modeling domain generally provides very poor representation of areas with limited landslide data. Additionally, limiting model development and application to regions within the same physiographic provinces and level II ecoregions does not help constrain the predictor effects or improve model performance. Finally, using a limited but uniformly distributed landslide inventory can create accurate landslide susceptibility maps over the same domain where the training data were gathered. In summary, our results illustrate the diverse conditions that can render terrain susceptible to landslides across geologic settings and some of the challenges in creating representative susceptibility maps over these settings.

Acknowledgments

We appreciate the insights from three anonymous reviewers and the constructive suggestions from Oliver Korup and Eric Thompson, which helped us improve the manuscript. This work was funded by the U.S. Geological Survey. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Data Availability Statement

The Doddridge County landslide inventory data are available at https://services1.arcgis.com/cTNi34MxOdcfum3A/arcgis/rest/services/LandslideIncidenceUser/FeatureServer and the Magoffin County inventory is available at https://doi.org/10.13023/kgs.data.2022.01 (Crawford, 2023). The other landslide inventories are compiled within the USGS Landslide inventories across the United States Version 2 via https://doi.org/10.5066/P9FZUX6N (Belair et al., 2022). Data sources for the predictor data are shown in Supporting Information S1 (Table S1). The Stan model and data required to run the model are deposited at https://doi.org/10.5066/P959G9JN.