Radar-Based Deep Learning for Debris Flow Identification Amid the Environmental Disturbances
Abstract
Microwave radar exploits differences in the Doppler frequencies of echoes from moving targets, offering remote sensing and continuous all-weather monitoring of geological disasters. However, intelligent identification of debris flow signals using such radar remains unexplored. We therefore implemented 12 deep learning models coupled with a voting strategy to develop classification models for identifying debris flow, using 24,000 samples across eight categories of targets obtained from field experiments. Each model proved proficient in classification, with the highest accuracy reaching 95.46% for the multi-object classification. Among the individual models, the vgg16 model, with its simple and deep architecture, excelled in debris flow identification, exhibiting high precision and a low false alarm rate. The voting strategy further improved on the reliability of the individual deep learning models. We propose that employing radar-based deep learning techniques combined with extensive field data represents a crucial advancement in the monitoring and early warning of debris flow.
Key Points
- Radar-based deep learning models were established for debris flow identification
- Eight labeled targets, including debris flow and falling rocks, can be effectively classified
- Energy spectra from radar signals were used for multi-object detection
Plain Language Summary
Microwave radar detects geological disasters by sending out signals and analyzing how they bounce back from moving targets. It has many advantages, such as being able to see through obstructions and working in any weather conditions, both day and night. However, using radar signals and artificial intelligence models to identify debris flow has not been thoroughly studied. In our research, through field experiments and indoor compilation, we developed a large data set of 24,000 samples in eight different categories, including debris flow and falling rocks. We then tested 12 deep learning algorithms with a voting approach to create a series of models that can recognize debris flow in a complex environment. Each model performed well, with the best one achieving an accuracy of 95.46% in classifying multiple types of targets. The vgg16 model, with its simple and deep architecture, stood out for its effectiveness in identifying debris flow. Our findings suggest that combining radar technology with deep learning models, especially with extensive real-world data, will significantly improve how we guard against debris flow, making it a major step forward in monitoring and early warning of similar natural disasters.
1 Introduction
Debris flow is a type of solid-liquid two-phase flow that consists of mud, sand, and rocks (Iverson, 2005), frequently causing substantial economic losses and casualties in mountainous areas (Dowling & Santi, 2014). It is impulsive in nature, and a solid-liquid surge often forms within a short time as the flow travels downstream (Reid et al., 2016; Simoni et al., 2020). In particular, debris flows that occur at night can leave even less time for evacuation (Jalayer et al., 2018), making disaster prevention challenging and highlighting the importance of early warning.
Since the mid-20th century, monitoring and early warning technology for debris flow has developed rapidly. Debris flow early warning can be categorized into regional and small watershed scales, as highlighted by Hürlimann et al. (2019). At a regional scale, meteorological early warning can be facilitated using rainfall thresholds. Caine (1980) introduced a rainfall threshold model for debris flow based on the relationship between rainfall intensity and duration, which was further enhanced by scholars such as Godt et al. (2006), who incorporated antecedent soil moisture into the debris flow early warning system in Seattle. Moreover, methods that consider dimensionless rainfall intensity, antecedent accumulated effective rainfall, and heavy-rain recurrence periods have demonstrated distinct advantages at various regional scales, as shown by Cannon and Ellen (1985), Destro et al. (2017), S. Liu et al. (2020), and Zhuang et al. (2015). In addition, because wildfire can make the terrain water-repellent and thereby make the hydrological response more uniform, Staley et al. (2017) presented a predictive approach that utilizes rainfall, hydrologic response, and readily available geospatial data to predict rainfall intensity–duration thresholds for debris flow generation in burned areas. At a watershed scale, a certain period elapses between initiation at the source area and propagation in the gully, which ultimately leads to the debris flow. Effectively utilizing this time window for early warning can provide crucial response time for individuals in high-risk zones to escape, thereby minimizing disaster-related losses (Chen et al., 2016). However, due to the uncertainty in obtaining rainfall information at the small watershed scale (Guo et al., 2021), it is difficult to use rainfall thresholds for reliable early warning as is done at regional scales.
Many scientists have therefore deployed monitoring equipment such as geophones, infrasound sensors, mud level gauges, and impact force sensors in small watersheds to track the movement of debris flow and establish comprehensive monitoring and early warning systems (Huang et al., 2015; Marchi et al., 2002). Nevertheless, these monitoring devices must be placed in debris flow-prone areas, which poses risks including potential burial by debris flow and difficulty in transmitting warning information due to poor communication signals in complex canyons. Remote video monitoring has therefore emerged as a crucial supplement (K. Liu et al., 2021). With the fast advancement of machine vision, researchers have significantly enhanced the accuracy of early warning by applying machine learning algorithms to image recognition for automatic identification of debris flow (Pham & Kim, 2022). However, video monitoring has inherent limitations. The equipment depends on natural visible light, and there is a trade-off between visible distance, picture definition, and communication bandwidth.
Microwave radar leverages the difference in Doppler frequencies between moving target echoes to differentiate the targets, effectively surpassing the limitations of video monitoring methods such as poor night visibility and limited detection range. While microwave radar has been extensively employed in civil applications like vehicle recognition (Zhao & Su, 2021) and gait recognition (Yang et al., 2022), its application in natural disaster prevention and mitigation is gradually expanding. In the realm of natural disasters, ground-based microwave radar is playing a crucial role in monitoring avalanches, rockfalls, and debris flow. The application of radar-based observations for avalanche research has been trialed at numerous test sites across Europe, including the Ryggfonn avalanche hazard site in Norway (Gauer et al., 2007; Schreiber et al., 2001) and the Vallée de la Sionne (Köhler et al., 2016). In recent years, the implementation of radar technology for debris flow monitoring (Koschuch et al., 2015) has been tested for high-resolution discharge and total volume estimations at the Lattenbach gully in Austria (Huebl & Kaitna, 2021) and for in-depth wave analysis at the Gadria creek in Italy (Schöffl et al., 2023). Furthermore, a 5-year microwave radar observation on avalanches, debris flows, and rockfalls has been carried out in the Swiss Alps (Gubler, 2000). Additionally, some scholars have monitored volcanic stability in Indonesia using radar technology for 3 years (Vöge & Hort, 2008; Vöge et al., 2008). It is very important to recognize that in real-world situations, frequent phenomena such as collapses, landslides, and stream water level rising in gullies can produce strong echo signals. These signals may affect the precision of debris flow detection and result in false alarms. 
Thus, although radar-based monitoring technology for debris flow is well developed, its identification accuracy is subject to interference from complex natural processes and requires further improvement.
Artificial intelligence is increasingly being harnessed to enhance the recognition accuracy of microwave radar, so far through traditional machine learning methods (Hyun et al., 2020). This approach offers a promising opportunity to improve the detection of debris flow radar signals. However, no research has yet been conducted on debris flow identification by combining deep learning models with radar technology. Therefore, in this study, we employed 12 deep learning models and a voting strategy to build multi-object classification models for debris flow detection amid environmental disturbances, using 24,000 samples of eight kinds of targets from field experiments.
2 Materials and Methods
2.1 Field Radar-Based Data Collection
2.1.1 Doppler Radar System
The system originates from a Pulse-Doppler Radar (Schöffl et al., 2023). When a moving target enters the radar beam, it causes Doppler shifts in the backscattered radio waves, or echo signals, allowing the system to determine the radial velocity of moving objects. To achieve spatial resolution, the radar emits pulsed electromagnetic waves of length at discrete intervals, known as the pulse repetition frequency (PRF). By discretely matching the echo signal at the radar receiver, the radar beam is segmented into resolution cells or range gates (Rammer et al., 2007; Schreiber et al., 2001), which correspond to a length of 30 m in this study.
The signal is decoded using a 1024-bin Fast Fourier Transform, which outputs Doppler frequencies and corresponding echo intensities as Doppler spectra at an update rate of approximately 3 Hz. The radar receiver employed in this study is the RC-DFDR-01A, featuring high-sensitivity X-band (10.0 GHz) capability. The detection velocity range (maximum unambiguous velocity) within each range gate is from −23 to 23 m/s. Once received, the echo signals are transmitted synchronously to the Radarexplore software, which displays the incoming data in real time. Radarexplore utilizes coherent accumulation and digital matched filtering techniques to process the echo signals, enhancing target tracking capabilities by detecting small moving targets. The Doppler signal is decoded into an array of signals spread across 1,024 velocity classes and range gates. Each array of signals, termed a Doppler spectrum, is continuously retrieved by the radar during operation.
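The figures quoted above (a 10.0 GHz carrier, a velocity range of −23 to 23 m/s, and 1,024 velocity classes) can be related through the standard Doppler relation f_d = 2v/λ. The following is a minimal sketch rather than the processing performed inside Radarexplore: the function names are hypothetical, and the zero-centred bin ordering (bin 512 corresponds to zero velocity) is an assumption that the text does not state.

```python
# Illustrative sketch only: relates the radar parameters quoted in the text.
# The bin layout (zero velocity at bin N/2) is an assumption, not a documented
# property of the RC-DFDR-01A or Radarexplore.

SPEED_OF_LIGHT = 299_792_458.0   # m/s
CARRIER_HZ = 10.0e9              # X-band carrier frequency from the text
V_MAX = 23.0                     # maximum unambiguous velocity, m/s
N_BINS = 1024                    # FFT length / number of velocity classes


def doppler_shift_hz(radial_velocity: float) -> float:
    """Doppler shift of a monostatic radar echo: f_d = 2 * v / wavelength."""
    wavelength = SPEED_OF_LIGHT / CARRIER_HZ
    return 2.0 * radial_velocity / wavelength


def velocity_bin_width() -> float:
    """Width of one velocity class if the 2 * V_MAX span is split evenly."""
    return 2.0 * V_MAX / N_BINS


def bin_to_velocity(bin_index: int) -> float:
    """Radial velocity of a bin, assuming bin N_BINS // 2 is zero velocity."""
    if not 0 <= bin_index < N_BINS:
        raise ValueError("bin index out of range")
    return (bin_index - N_BINS // 2) * velocity_bin_width()


if __name__ == "__main__":
    print(f"Doppler shift at 23 m/s: {doppler_shift_hz(V_MAX):.1f} Hz")
    print(f"Velocity class width: {velocity_bin_width():.4f} m/s")
    print(f"Velocity of bin 0: {bin_to_velocity(0):.2f} m/s")
```

Under these assumptions each velocity class spans roughly 0.045 m/s, which indicates how finely slow and fast targets can be separated along the Doppler axis.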
The incoming information is displayed as a waterfall graph in Radarexplore for each time step. In this graph (some examples in Figure S1 in Supporting Information S1), the vertical axis represents the range gates, stacked on top of each other to show distance information, while the horizontal axis represents the 1,024 velocity classes, indicating the Doppler frequencies associated with the moving targets. The color scale in the graph represents the echo signal intensities, with different colors corresponding to varying signal strengths. This graph provides a visual representation of how the velocity of targets varies across different range gates over time. Each horizontal line in the graph corresponds to a Doppler spectrum at a specific time step, allowing for the observation of changes in target velocity and intensity over time.
2.1.2 Data Acquisition
Based on field experiments and in-situ measurements, radar signals of water flow, falling rocks, debris flow, animals (yaks), prayer flags, vegetation, vehicles, and natural gully were collected (Table S1 and Figure S2 in Supporting Information S1). To keep the deep learning comparison as fair as possible, the parameters of the radar system were set uniformly (Table S2 in Supporting Information S1). Field samples were collected at various locations, including rural areas in Chengdu, Heixiluo Gully in Ganluo, and Jiajin Mountain in Baoxing in Yaan, China. Among these locations, the Heixiluo Gully, where a large-scale debris flow occurred in 2020 (Yan et al., 2023), provided a suitable place for artificial rockfall and debris flow experiments as well as radar data collection. We excavated a channel on the slope approximately 9 m long with a slope gradient of about 15°. We completed 30 sets of artificial debris flow experiments, simulating the movement of debris flow by creating a torrent with a large amount of loose soil in the channel. The radar and video equipment were positioned directly facing the flow zone. The artificial rockfall experiment used a steep slope of about 85°, with a slope length of 10 m and rock diameters ranging from 5 to 50 cm.
2.1.3 Data Preprocessing
First, based on the video records and the screen-capture data from Radarexplore, abnormalities were manually filtered out to determine the time range of the movement of debris flow and falling rocks. Subsequently, the screen-captured segments reflecting the movement of disasters were cropped. Next, using Python and its OpenCV library, the total number of frames in the video was computed, and the number of frames to be saved was set randomly, finally resulting in 3,000 samples for each kind of target. Because the energy spectrum displayed in Radarexplore comprehensively reflects the movements of disasters and other environmental disturbances, we further cropped the captured image samples to the target area for deep learning model training and testing (Figure S1 in Supporting Information S1).
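The frame extraction step might look like the sketch below. It is an illustration under stated assumptions, not the authors' script: the function names, the output file naming, and the reading of "set randomly" as drawing a fixed number of distinct frame indices uniformly at random are all hypothetical. The OpenCV calls (`cv2.VideoCapture`, `CAP_PROP_FRAME_COUNT`, `CAP_PROP_POS_FRAMES`, `cv2.imwrite`) are imported lazily so that the index selection can be tried without OpenCV installed.

```python
import os
import random


def pick_frame_indices(total_frames, n_samples, seed=None):
    """Draw n_samples distinct frame indices uniformly at random (assumed scheme)."""
    if n_samples > total_frames:
        raise ValueError("cannot sample more frames than the video contains")
    rng = random.Random(seed)
    return sorted(rng.sample(range(total_frames), n_samples))


def save_random_frames(video_path, out_dir, n_samples, seed=None):
    """Save randomly chosen frames of a screen-capture video as PNG images."""
    import cv2  # OpenCV; imported here so the helper above has no dependency

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # total number of frames
    saved = 0
    for idx in pick_frame_indices(total, min(n_samples, total), seed):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)       # seek to the chosen frame
        ok, frame = cap.read()
        if ok:
            # Hypothetical naming scheme for the saved samples.
            cv2.imwrite(os.path.join(out_dir, f"frame_{idx:06d}.png"), frame)
            saved += 1
    cap.release()
    return saved
```

A call such as `save_random_frames("debris_flow.mp4", "samples/debris_flow", 3000)` (both paths hypothetical) would then produce up to 3,000 image samples for one target category.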
2.2 Deep Learning for Multi-Object Classification
As shown in Figure 1, we used 12 widely used deep learning models (Albardi et al., 2021; Raschka et al., 2022), including alexnet, densenet161, googlenet, inception_v3, mnasnet1_0, mobilenet_v2, resnet18, resnext50_32x4d, shufflenet_v2_x1_0, squeezenet1_0, vgg16, and wide_resnet50_2. First, for each kind of target, we divided the 3,000 samples into 2,000 training samples and 1,000 test samples. Then, the inputs were adjusted to a specified size: the short edge of the image was resized to 256 pixels, and the long edge was adjusted proportionally. Since neural networks typically require fixed-size inputs, we further cropped the images from the center to a size of 224 × 224 pixels. Next, we normalized the tensors to speed up network convergence and improve model stability and generalization. To accelerate training, following the idea of transfer learning (Hosna et al., 2022), we trained each model for 30 epochs starting from its pre-trained weights. Lastly, we classified the test samples and evaluated the model performance. The computing platform was Windows 10 with a Xeon® W-2223 processor running at 3.60 GHz and 32 GB of memory.
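The geometry of the resizing, center cropping, and normalization steps can be sketched as follows. This is a dependency-free illustration with hypothetical function names; the rounding and the crop offsets are assumptions, and the mean and standard deviation in the usage line are placeholders because the text does not report the values used. In torchvision, whose model names match those listed above, the corresponding transforms would presumably be `Resize(256)`, `CenterCrop(224)`, and `Normalize`.

```python
def resize_short_edge(width, height, short=256):
    """Scale an image size so its short edge equals `short`, keeping aspect ratio."""
    scale = short / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    if width < size or height < size:
        raise ValueError("image is smaller than the crop size")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def normalize_pixel(value, mean, std):
    """Map an 8-bit pixel to [0, 1], then standardize with a channel mean/std."""
    return (value / 255.0 - mean) / std


if __name__ == "__main__":
    w, h = resize_short_edge(1280, 720)      # e.g. a 1280 x 720 screen capture
    print((w, h), center_crop_box(w, h))
    # mean=0.5 and std=0.5 are placeholder values, not taken from the paper.
    print(normalize_pixel(255, mean=0.5, std=0.5))
```

Cropping from the center keeps the middle of the energy spectrum image and discards the margins, so this step implicitly assumes that the target area was already roughly centered by the earlier manual cropping.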

Figure 1. Flow chart of radar-based deep learning for multi-object classification toward debris flow detection amid environmental disturbances.
2.3 Voting Strategy
A single model may be easily influenced by its inherent limitations when performing classification tasks, resulting in varying degrees of uncertainty in the classification (Ganaie et al., 2022). To minimize this uncertainty, ensemble approaches have been widely adopted. Here, we used a voting approach (Seetha et al., 2016) to combine the predictions of multiple models and select the most frequently occurring result as the final prediction. First, the aforementioned deep learning models were used to classify the 1,000 test samples in each category. Subsequently, we obtained the predictions from all the models for the eight target categories. For each sample, we counted the occurrences of each predicted category across the individual models and retrieved the most common target and its count. The majority-based voting result was then appended to the final predictions.
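The counting procedure described above corresponds closely to Python's `collections.Counter`. The following is a minimal sketch of majority voting with hypothetical function names, not the authors' code. Tie handling is an assumption: `Counter.most_common` returns tied labels in the order they were first seen, so the earliest model in the list wins a tie, whereas the text does not say how ties were resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the models' predictions and its count."""
    votes = Counter(predictions)
    label, count = votes.most_common(1)[0]
    return label, count


def vote_all(per_model_predictions):
    """Combine predictions sample by sample.

    per_model_predictions holds one list per model, each containing the
    predicted label for every test sample in the same order.
    """
    final = []
    for sample_votes in zip(*per_model_predictions):
        label, _ = majority_vote(sample_votes)
        final.append(label)
    return final


if __name__ == "__main__":
    # Three hypothetical models voting on two samples.
    model_a = ["debris flow", "water flow"]
    model_b = ["debris flow", "vehicles"]
    model_c = ["falling rocks", "vehicles"]
    print(vote_all([model_a, model_b, model_c]))
```

With 12 models and eight categories, ties are possible (for example, a 6 to 6 split), so a production system would need an explicit tie-breaking rule.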
2.4 Evaluation Criterion
In addition to accuracy, precision, recall, and F1 score, the confusion matrix provides detailed insights into the classification performance (Heydarian et al., 2022). In the confusion matrix plots of this paper, each row represents the actual samples that belong to a class, and each column represents the samples predicted to belong to that class. The diagonal elements indicate the number of samples that are correctly predicted. A denser concentration of predictions along the diagonal therefore indicates better model performance, and the off-diagonal elements show which targets the model tends to misclassify.
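The relationship between the confusion matrix and the evaluation metrics can be made concrete with a short sketch. It uses the row and column convention described above (rows are actual classes, columns are predicted classes) and the standard per-class definitions of precision, recall, and F1 score. The function names are hypothetical, and how the per-class values were averaged into the reported overall figures is not stated in the text, so no averaging is performed here.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows index the actual class, columns index the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_metrics(matrix, k):
    """Precision, recall, and F1 score for class k."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                    # missed detections (row k)
    fp = sum(row[k] for row in matrix) - tp     # false alarms (column k)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    labels = ["A", "B"]                          # toy two-class example
    m = confusion_matrix(["A", "A", "B", "B"], ["A", "B", "B", "B"], labels)
    print(m, accuracy(m), per_class_metrics(m, 0))
```

In this convention, the false negative rate of a class is read along its row and its false alarms are read down its column.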
3 Results
3.1 Model Performance
According to the evaluation metrics, the 12 models differed somewhat in performance, but all of them proved capable of completing the multi-object classification (Figure 2a and Figure S3 in Supporting Information S1). Notably, googlenet, resnet18, and wide_resnet50_2 achieved the highest accuracy over all targets, each exceeding 95% (maximum 95.46%), whereas mnasnet1_0 performed worst at only 81.95%. Other models such as alexnet, densenet161, inception_v3, mobilenet_v2, shufflenet_v2_x1_0, squeezenet1_0, and vgg16 also achieved a satisfactory accuracy exceeding 90%. In terms of precision, googlenet, resnet18, resnext50_32x4d, and wide_resnet50_2 exceeded 95%, with wide_resnet50_2 yielding the best value of 95.71%. The recall and F1 score were consistent with these metrics. In general, the deep learning models effectively distinguished debris flow, falling rocks, and the other kinds of environmental disturbances. In particular, wide_resnet50_2 showed the best overall performance among all models for the multi-object classification.

Figure 2. (a) Heat map of the multi-object classification evaluation by accuracy, precision, recall, and F1 score. (b) Confusion matrix for the multi-object classification after applying the voting classifier. The axis labels A ∼ H represent, in turn, the targets water flow, debris flow, prayer flags, natural gully, vehicles, falling rocks, vegetation, and animals (yaks).
From the confusion matrices, approximately 10%–20% of water flow samples were misclassified as other categories by the models, primarily as vehicles. Conversely, approximately 20% of vehicle samples were erroneously identified as water flow and vegetation. For instance, alexnet erroneously categorized 159 water flow samples as vehicles and misclassified 221 vehicle samples as water flow, debris flow, natural gully, falling rocks, vegetation, and animals (Figure 3a). Focusing specifically on debris flow, apart from mnasnet1_0 (Figure 3e), at most 2.2% of test samples were misclassified as other categories, while squeezenet1_0 and vgg16 misclassified only about 0.5% of debris flow samples (Figures 3j and 3k), giving them the lowest false negative rates. However, squeezenet1_0 (Figure 3j) exhibited a markedly higher false positive rate than vgg16 (Figure 3k). Although wide_resnet50_2 had a lower false alarm rate (Figure 3l), its higher missed-detection rate made it less satisfactory than vgg16. Regarding falling rocks, a significant level of misclassification occurred, with 5.4%–14% of samples misclassified, primarily as vehicles and vegetation. Remarkably, mobilenet_v2 misclassified merely 5.4% of falling rock samples as other categories, with only 30 samples of other categories erroneously labeled as falling rocks (Figure 3f). googlenet also exhibited good recognition capability, with a slightly higher false negative rate (5.7%) but a lower false alarm rate.

Figure 3. Confusion matrices for the multi-object classification on 1,000 test samples of each type. The deep learning models include (a) alexnet, (b) densenet161, (c) googlenet, (d) inception_v3, (e) mnasnet1_0, (f) mobilenet_v2, (g) resnet18, (h) resnext50_32x4d, (i) shufflenet_v2_x1_0, (j) squeezenet1_0, (k) vgg16, and (l) wide_resnet50_2, respectively. The axis labels A ∼ H have the same meaning as in Figure 2.
3.2 Voting Classification
The voting method, which combines the results of the 12 deep learning models, showed some improvement in the evaluation metrics (Figure 2). Although the improvement in each metric was limited compared to the individual models, voting could further enhance the reliability of single-model-based classification (Figure 2a). The confusion matrix indicates that the voting algorithm improved, to some extent, the recognition of certain target types compared to individual models (Figure 2b). It reduced the false negative rate for water flow, with only 86 samples misclassified as vehicles and vegetation. Additionally, it maintained the accuracy of debris flow identification while further reducing the false positive rate. Only 25 samples of other kinds were identified as debris flow, a significant improvement compared to the 54 false alarms produced by the vgg16 model. Similarly, the voting strategy significantly enhanced the recognition of falling rock samples. Only 4.5% of samples were misclassified as other types, while only 6 samples of other types were identified as falling rocks, greatly reducing the false positive rate.
4 Conclusions and Discussion
In this study, 12 deep learning models and a voting strategy were used with 24,000 samples of eight kinds of targets to build multi-object classification models that include debris flow detection. Samples were divided into training and testing sets in a 2:1 ratio. Results show that the individual models were proficient in completing the classification task, yielding a highest accuracy of 95.46% and a lowest of 81.95% for the multi-object classification. From the perspective of minimizing the impact of disasters, vgg16 performed best in identifying debris flow, and mobilenet_v2 and googlenet performed best in identifying falling rocks. The voting algorithm improved, to some extent, the recognition of some targets compared to individual models. Our research indicates that integrating Doppler radar-based energy spectra with deep learning models, particularly when supported by substantial real-world data, could markedly enhance our capacity to avoid natural disasters such as debris flow.
As deep learning has become popular in various research fields in recent years, technologies for geological disaster monitoring and early warning have gradually been shifting from statistical and theoretical methods to data-driven models (Zhang et al., 2019). For example, Thirugnanam et al. (2020) improved the reliability of landslide early warning systems using support vector machines and neural networks. Chmiel et al. (2021) studied debris flow identification based on continuous vibration signals using a random forest model, achieving an accuracy of over 90%. Recently, Leng et al. (2024) converted infrasound signals of debris flow into two-dimensional images and used them to train a deep learning model, achieving an accuracy of 88.60%. In this study, we used two-dimensional spectrograms converted from radar signals as the inputs, highlighting the use of deep learning models to mine the radar signals of debris flow. Additionally, a voting strategy was introduced to integrate the predictions of multiple models, resulting in better identification performance. These results could further enhance the capability of debris flow monitoring and early warning in complex mountainous areas.
Although the established models achieved satisfactory results, there are some limitations. First, we collected different types of data from a large number of field samples in multiple scenes, but the sampling of debris flow and rockfall was mainly conducted through on-site experiments due to the lack of natural observations. Experimental debris flows are admittedly difficult to align with real-world scenarios, primarily due to the scale effect. However, from the perspective of monitoring and early warning, artificially generated debris flow can still produce radar echo signals. We used it as an example to draw further attention to the use of radar for various natural hazards, such as debris flow, flash flood, and rockfalls. Second, explaining the mechanism of debris flow movement from the radar signals will require more detailed experiments and analysis. Lastly, due to computational limitations, this study used only 12 deep learning models as case studies. In general, the radar-based deep learning for debris flow detection presented in this work could be a critical step toward the monitoring and early warning of similar natural disasters.
Acknowledgments
This work was jointly supported by the Sichuan Science and Technology Program (2024NSFSC0072), the Key R&D Program of Xizang Autonomous Region (XZ202301ZY0039G), and the Science and Technology Research Program of Institute of Mountain Hazards and Environment, Chinese Academy of Sciences (IMHE-ZDRW-01).
Open Research
Data Availability Statement
All the data for model training and test, basic model codes, and trained models can be found at S. Liu (2024).