Application of Machine Learning to Classification of Volcanic Deformation in Routinely Generated InSAR Data
Abstract
[1] Recent improvements in the frequency, type, and availability of satellite images mean it is now feasible to routinely study volcanoes in remote and inaccessible regions, including those with no ground-based monitoring. In particular, Interferometric Synthetic Aperture Radar data can detect surface deformation, which has a strong statistical link to eruption. However, the data set produced by the recently launched Sentinel-1 satellite is too large to be manually analyzed on a global basis. In this study, we systematically process >30,000 short-term interferograms at over 900 volcanoes and apply machine learning algorithms to automatically detect volcanic ground deformation. We use a convolutional neural network to classify interferometric fringes in wrapped interferograms with no atmospheric corrections. We employ a transfer learning strategy and test a range of pretrained networks, finding that AlexNet is best suited to this task. The positive results are checked by an expert and fed back for model updating. Following training with a combination of both positive and negative examples, this method reduced the number of interferograms requiring further inspection to ∼100, of which at least 39 are considered true positives. We demonstrate that machine learning can efficiently detect large, rapid deformation signals in wrapped interferograms, but further development is required to detect slow or small deformation patterns that do not generate multiple fringes in short-duration interferograms. This study is the first to use machine learning approaches for detecting volcanic deformation in large data sets and demonstrates the potential of such techniques for developing alert systems based on satellite imagery.
Key Points
- We present a machine learning framework to detect volcanic ground deformation in wrapped interferograms using convolutional neural networks
- The classification model is initialized with an Envisat data set and then tested and retrained with a Sentinel-1 data set covering over 900 volcanoes
- This framework can reduce the number of interferograms for manual inspection from more than 30,000 to approximately 100
1 Introduction
[2] Globally, 800 million people live within 100 km of a volcano (Loughlin et al., 2015). Improvements in monitoring and forecasting have been shown to reduce fatalities due to volcanic eruptions (Auker et al., 2013; Mei et al., 2013), but a significant proportion of the ∼1,500 Holocene volcanoes have no ground-based monitoring. Interferometric Synthetic Aperture Radar (InSAR) is a satellite remote sensing technique used to measure ground displacement at the centimeter scale over large geographic areas and has been widely applied to volcanology (e.g., Biggs & Pritchard, 2017; Pinel et al., 2014). Furthermore, InSAR measurements of volcanic deformation have a significant statistical link to eruption (Biggs et al., 2014). Modern satellites provide broad coverage at high spatial resolution, generating large data sets. For example, the two-satellite constellation, Sentinel-1 A and B, offers a 6-day repeat cycle and acquires data with a 250-km swath at a 5 m by 20 m spatial resolution (single look). This amounts to >10 TB per day, or about 2 PB collected between its launch in 2014 and June 2017 (Fernández et al., 2017). The explosion in data has brought major challenges associated with manual inspection of imagery and timely dissemination of information. Moreover, many volcano observatories lack the expertise needed to exploit satellite data sets, particularly those in developing countries.
[3] Machine learning technologies have been widely implemented in the field of computer science, where computers use statistical techniques to learn specific, complex tasks from data. In the Earth Sciences, machine learning has been employed in several applications (Lary et al., 2016), such as predicting earthquake magnitudes (Adeli & Panakkat, 2009), land surface classification (C. Li et al., 2014), estimating vegetation indices (Brown et al., 2008), and landslide susceptibility mapping (Yilmaz, 2010). The techniques used previously include tree-based methods (Wei et al., 2013), artificial neural networks (Conforti et al., 2014), support vector machines (SVMs; Tien Bui et al., 2017), and Bayesian methods (Totaro et al., 2016).
[4] Here we present a novel approach to detect volcanic ground deformation automatically from InSAR images. This approach brings together satellite-based volcano geodesy and machine learning algorithms to develop new ways of automatically searching through large volumes of InSAR images to detect patterns that may be related to volcanic activity. The proposed method works on wrapped interferograms displayed as fringes, each representing a set amount of displacement equal to half the radar wavelength. However, these interferograms also contain artifacts associated with atmospheric conditions, and at volcanoes, the effect of stratified atmospheric water vapor can be particularly difficult to distinguish from ground deformation (e.g., Ebmeier et al., 2013b; Parker et al., 2015a). In this paper, we extract the spatial characteristics of the interferograms using deep convolutional neural networks (CNNs), biologically inspired architectures that comprise multiple layers of neural connections with learnable weights and biases (Krizhevsky et al., 2012). Similar approaches have been highly successful when applied to the analysis of visual imagery, image classification, object detection, and tracking (LeCun et al., 2015) and could ultimately be used in near-real time to detect volcano deformation and inform the local volcano observatories.
2 Background: Machine Learning Algorithms
[5] Machine learning is a generic term for the automatic discrimination of input patterns into learnt or defined classes, originally introduced in the 1950s (Samuel, 1959). For the case of volcanic unrest classification, the input is InSAR interferograms, and the output will be one of two classes: unrest or no unrest (or the likelihood of each). Machine learning techniques can be separated into two categories: supervised and unsupervised methods. Supervised methods learn representations of the output classes using labeled ground truth examples of those classes (Kotsiantis, 2007; i.e., in this case volcanic unrest and no volcanic unrest), whereas unsupervised methods cluster together similar groups in the data without any ground truth (e.g., Zanero & Savaresi, 2004). In this study, we focus on supervised methods, particularly deep CNNs (Krizhevsky et al., 2012; Rumelhart et al., 1986; Szegedy et al., 2016) and SVMs (Christianini & Shawe-Taylor, 2000).
[6] SVMs typically use hand-defined inputs such as intensity distributions and Gabor features extracted from the input images (Chang & Lin, 2011). SVMs classify using a maximum margin technique and are able to linearly distinguish two or more classes. However, using the kernel trick, the input domain can be projected into a (possibly infinite) higher-dimensional space to provide very effective nonlinear classification (Burges, 1998). The main advantages of SVMs are that training does not require a truly large data set (large in this context can be considered to be on the order of 10,000 or more data points) and that classification is fast even on machines without a graphics processing unit. However, in many supervised classification problems with large ground truth data sets, deep networks such as CNNs often outperform shallow machine learning algorithms such as SVMs (Goodfellow et al., 2016).
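As a concrete illustration of this type of baseline, the following is a minimal scikit-learn sketch of an RBF-kernel SVM classifier; the feature vectors and labels are placeholders, and the feature extraction itself (e.g., texture features) is assumed to be done separately rather than shown here.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder hand-crafted feature vectors (rows) and labels: 1 = deformation, 0 = background.
X_train = np.random.rand(200, 32)
y_train = np.random.randint(0, 2, 200)

# The RBF kernel implements the "kernel trick" described above; probability=True
# enables class probability estimates via Platt scaling.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, probability=True))
clf.fit(X_train, y_train)

X_test = np.random.rand(10, 32)
p_unrest = clf.predict_proba(X_test)[:, 1]   # probability of the unrest class
```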
[7] CNNs are a class of neural networks that employ locally connected layers that apply convolution between a kernel (filter matrix) and an internal signal and are most commonly used for image recognition and classification. The deep, hierarchical, and densely connected nature of CNNs enables them not only to classify but also to generate discriminating features of progressive complexity from the input to the output layers (Jia et al., 2014). For image-based classification, the first layers convolve small spatial regions with learnt blocks of weights. These weight blocks can be considered to be feature extractors and often resemble early vision basis functions found in the human visual cortex (i.e., similar to 2-D Gabor functions; Matsugu et al., 2003). The output of these layers is often integrated (or pooled) before connection to lower layers. The convolutional layers are commonly then connected to dense layers of fully connected neurons leading to a final classification (often using an output activation function such as softmax; Goodfellow et al., 2016). All neurons within the convolutional and fully connected layers are defined by weights and a bias from the connected neurons one layer above. Depending on the architecture, all layers use activation functions such as the hyperbolic tangent or rectified linear unit to introduce nonlinearity into the networks (Agostinelli et al., 2015). The weights in all layers are initialized in training with nonzero random or pseudorandom values. All weights are then modified using a batch-based iterative backpropagation method and a training data set with associated ground truth. To prevent overfitting, regularization techniques such as dropout are used to ensure the network is able to generalize effectively (Srivastava et al., 2014). The effective training of deep CNNs is both extremely computationally expensive and requires very large training data sets (Simonyan & Zisserman, 2014). It is therefore common to pretrain convolutional layers in an unsupervised fashion, followed by supervised fine-tuning (Erhan et al., 2010). Several high-performance pretrained models are available and can be adapted to serve a specific purpose, such as AlexNet (Krizhevsky et al., 2012) and ResNet (He et al., 2016). Figure 1 illustrates the architecture of an example CNN (AlexNet). This figure shows the 2-D convolution layers and the 1-D output linear layers and how they are connected, giving a hierarchical representation across all the layers. The input to AlexNet (as shown in the figure) is a 2-D image of 224 × 224 pixels.
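For readers unfamiliar with these architectures, the sketch below (in PyTorch/torchvision, which is not necessarily the framework used in this study) loads a pretrained AlexNet and passes a single 224 × 224 RGB image through it to obtain softmax class probabilities.

```python
import torch
from torchvision import models

# Load AlexNet with weights pretrained on ImageNet (the weights argument requires
# torchvision >= 0.13; older versions use pretrained=True instead).
alexnet = models.alexnet(weights="IMAGENET1K_V1")
alexnet.eval()
print(alexnet)  # five convolutional layers followed by three fully connected layers

# A single 224 x 224 RGB image yields a vector of class scores.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    scores = alexnet(x)                  # shape (1, 1000) for the ImageNet classes
probs = torch.softmax(scores, dim=1)     # softmax converts scores into probabilities
```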
3 InSAR Data Set
[8] The first Sentinel-1 satellite (S1A) was launched in 2014, and the mission is designed to provide Earth observations for the next 25 years with repeat intervals of 6–24 days globally. The data are freely available in near-real time, making them ideal for routine volcano monitoring. The global data set used in this study consists of 30,249 interferograms covering ∼900 volcanoes in 2016–2017. The interferograms were processed with the automated InSAR processing system LiCSAR (Looking inside the Continents from Space) (http://comet.nerc.ac.uk/COMET-LiCS-portal/) developed by the Centre for Observation and Modelling of Earthquakes, Volcanoes and Tectonics. Each acquisition is connected to the three preceding acquisitions, forming a trio of interferograms of increasing time span. We crop the images to a region spanning 0.5∘ in latitude and longitude for each of the ∼900 volcanoes. We include volcanoes in temperate, tropical, and arid environments with morphologies ranging from steep stratovolcanoes to large calderas and small islands (Figure 2a). The data set is weighted towards the European volcanoes, where images are acquired every 6 days and the LiCSAR system has been running the longest (2016–2017). The temporal baselines of the interferograms range from 6 to 120 days, with one third of the data set having time spans of 6 or 12 days (Figure 2b).
[9] A major challenge for both manual and automated InSAR monitoring systems is distinguishing deformation signals from atmospheric artifacts, which can also generate concentric fringes around volcanoes, particularly those with steep topography (e.g., Ebmeier et al., 2013b; Pinel et al., 2014). Several approaches have been proposed to correct these artifacts, either with external data sources such as weather models or GPS tropospheric delays or by applying statistical approaches to phase-elevation correlations or time series (e.g., Bekaert et al., 2015; Jolivet et al., 2014; Z. Li et al., 2005). The quality of atmospheric correction is highly dependent on geographical location and is hence variable (Parker et al., 2015a). Furthermore, atmospheric corrections can only be applied to unwrapped interferograms, and unwrapping is computationally expensive, slow, and liable to introduce phase errors. For our initial, proof-of-concept study, we chose to use wrapped, uncorrected interferograms and test the ability of our approach to discriminate between deformation and atmospheric signals.
[10] To provide ground-truth information for training and verification of supervised classification systems, it is necessary to manually identify a selection of interferograms where several fringes can be attributed to volcanic deformation. Even though there are >30,000 interferograms in our Sentinel-1 data set, the majority are short-duration interferograms covering volcanoes that are not deforming or are deforming slowly. Identifying a sufficient number of positive images in the Sentinel-1 data set is challenging, so we pretrain the network using an older archive of interferograms from the European Space Agency's Envisat satellite. Several possible data sets exist, including over the Main Ethiopian Rift (Biggs et al., 2011), the Kenyan Rift (Biggs et al., 2009), the Central Andes (Pritchard & Simons, 2004a), and the Southern Andes (Pritchard & Simons, 2004b). All of these contain (1) multiple volcanic systems displaying persistent deformation at variable rates and (2) areas which are not deforming but show a range of features including incoherence and atmospheric artifacts (Figure 3). We chose to use a data set over the Main Ethiopian Rift for convenience. The Envisat background mission (2003–2010) acquired three to four images per year over the Main Ethiopian Rift and has been used to identify deformation at four volcanoes previously considered dormant: Alutu, Corbetti, Bora, and Haledebi (Biggs et al., 2011). These interferograms are a good test case: the rates of deformation are several centimeters per year, which means that over the time period spanned (variable but typically several months), the interferograms show several fringes of deformation.
[11] Despite the small number of examples, it is important to train the network using some Sentinel-1 data to account for differences in processing strategy and atmospheric behavior. A small dyke intrusion at Erta Ale volcano (Ethiopia) occurred in January 2017, associated with the overflow of the lava lake (Xu et al., 2017), and interferograms spanning this event show four fringes of deformation (Figure 4a). Interferograms of Etna volcano (Italy) spanning October 2016 show fringes potentially related to an intrusive event (Figures 4b and 4c); the National Institute of Geophysics and Volcanology reported the opening of a new volcanic vent on 7 August and an explosion at Bocca Nova on 10 October. Interferograms from other time periods at Erta Ale and Etna show multiple fringes that are atmospheric in origin (Figures 4e and 4f). Cerro Azul and Fernandina volcanoes (Galapagos) have been deforming during 2017 (Bagnardi, 2017) and typically show several fringes of deformation in a single interferogram. Several other volcanoes are known to be deforming slowly during this time period, for example, Medicine Lake, United States, which has been subsiding for ∼60 years at ∼10 mm/year (Parker et al., 2015b), and Laguna del Maule, Chile (Singer et al., 2014), and Corbetti, Ethiopia (Lloyd et al., 2018), which are uplifting at rates of >6 cm/year. However, in short interferograms, these slow rates of deformation are not sufficient to produce multiple fringes of deformation, and we do not attempt to identify them in the current study. We use interferograms spanning the intrusions at Erta Ale and Etna to train the network (Figures 4a–4c) and include the Galapagos volcanoes in the test data set to assess detection capability. For our initial runs, we do not flag interferograms with atmospheric artifacts as negative results, instead testing the ability of the network to distinguish deformation patterns based on positive examples alone.
[12] Table 1 shows the list of volcanoes used as positive samples in the training process (section 4.2). The negative samples are generated from both nondeformation and background as described in section 4.1.
| Training process | Volcano name | Type | Period | # of interferograms |
|---|---|---|---|---|
| Initial (Envisat) | Alutu | Stratovolcano | 2003–2010 | 158 |
| | Bora | Pyroclastic cone | 2003–2010 | 52 |
| | Corbetti | Caldera | 2003–2010 | 44 |
| | Haledebi | Fissure vent | 2003–2010 | 46 |
| Retrained (Sentinel) | Etna | Stratovolcano | 2016 | 2 |
| | Ale Bagu | Stratovolcano | 2017 | 3 |
| | Bora Ale | Stratovolcano | 2017 | 3 |
| | Cerro Azul | Shield | 2017 | 8 |
| | Erta Ale | Shield | 2017 | 3 |
| | Hayli Gubbi | Shield | 2017 | 3 |
| | Sierra Negra | Shield | 2017 | 17 |
- Note. The number of interferograms is before applying data augmentation (which is the process of increasing the number of positive samples to be balanced with that of the negative samples in the training data set).
4 Method Development
[13] The proposed framework for using machine learning to identify volcanic deformation in interferograms is shown in Figure 5. For the training process, each image is processed as described in section 4.1 and then fed into the CNN to learn ground deformation characteristics (positive class) against those of background, atmosphere, and noise (negative class). We conducted initial tests on a range of pretrained CNNs and SVMs using small archive and test data sets from Envisat and Sentinel-1, respectively.
4.1 Data Preparation
[14] The values of wrapped interferograms vary between −π and π, and they are typically displayed with colors (red, green, and blue intensities). For the purposes of machine learning, we first convert the wrapped interferogram into a grayscale image; that is, the pixel value in the range [−π, π] is scaled to [0, 255], or to [−125, 125] if zero-center normalization is required (Figure 6b). Subsequently, each training image is divided into patches equal to the input size of the CNN (e.g., 224 × 224 pixels for AlexNet; Krizhevsky et al., 2012). The patches overlap by half their size (Figure 6c).
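A minimal sketch of this preparation step is given below; the scaling and the half-overlapping patch extraction follow the description above, while the array sizes and function names are illustrative.

```python
import numpy as np

def phase_to_grayscale(wrapped_phase):
    """Map wrapped phase in [-pi, pi] linearly onto 8-bit grayscale values in [0, 255]."""
    return np.uint8(np.round((wrapped_phase + np.pi) / (2.0 * np.pi) * 255))

def extract_patches(image, patch_size=224):
    """Cut the image into patch_size x patch_size tiles that overlap by half the patch size."""
    step = patch_size // 2
    patches = []
    for r in range(0, image.shape[0] - patch_size + 1, step):
        for c in range(0, image.shape[1] - patch_size + 1, step):
            patches.append(image[r:r + patch_size, c:c + patch_size])
    return patches

wrapped = np.random.uniform(-np.pi, np.pi, (500, 500))   # placeholder wrapped interferogram
patches = extract_patches(phase_to_grayscale(wrapped))
```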
[15] We then employ Canny (1986) edge detection, in which a Gaussian filter is first applied to remove noise and double thresholding is then applied to the intensity gradients of the image. As the wrapped-phase interferograms show strong edges where the phase jumps between −π and π, the Canny operator can straightforwardly extract fringes arising from volcano deformation (Figure 6c). As the number of background areas (negative samples) is significantly larger than the number associated with volcano deformation (positive samples), only the patches in which strong edges have been detected are used. Since areas without strong edges are unlikely to contain volcanic deformation, they are immediately labeled as background without being classified by the CNN.
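The patch-selection step can be sketched as follows, here using the Canny implementation in scikit-image; the Gaussian width and the minimum edge fraction are illustrative assumptions rather than values quoted in the text.

```python
import numpy as np
from skimage import feature

def has_strong_edges(patch, sigma=2.0, min_edge_fraction=0.01):
    """Return True if Canny finds enough edge pixels to justify passing the patch to the CNN."""
    edges = feature.canny(patch.astype(float), sigma=sigma)   # boolean edge map
    return edges.mean() > min_edge_fraction

# 'patches' would be the grayscale patches from the previous step (placeholders here);
# patches without strong edges are labeled background without running the CNN.
patches = [np.random.randint(0, 256, (224, 224)) for _ in range(9)]
candidate_patches = [p for p in patches if has_strong_edges(p)]
```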
[16] For machine learning, balancing the number of training samples between classes is very important, but we have only 300 positive examples, and there are over 100 times more negative patches containing strong edges than positive patches. Therefore, we increase the number of positive patches for training using a data augmentation approach (Krizhevsky et al., 2012). We generate more positive patches by (i) shifting the patch window in 10-pixel steps around the volcano; (ii) flipping horizontally and vertically; (iii) rotating through angles of 22.5∘, 45∘, 67.5∘, and 90∘; and (iv) distorting the shape of deformation by varying the scales along the x and y axes of an affine transformation. This data augmentation technique increases the 300 positive samples initially identified in the Envisat data set to approximately 10,000 positive patches (Figure 6d). We randomly select negative patches so that the numbers are balanced.
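The flips, rotations, and affine distortions can be implemented as in the sketch below; the rotation angles are those quoted above, whereas the scale factors, the use of scikit-image, and the omission of the 10-pixel shifting are illustrative simplifications.

```python
import numpy as np
from skimage import transform

def augment(patch):
    """Generate flipped, rotated, and anisotropically scaled versions of one positive patch."""
    out = []
    for flipped in (patch, np.fliplr(patch), np.flipud(patch)):
        for angle in (0.0, 22.5, 45.0, 67.5, 90.0):
            rotated = transform.rotate(flipped, angle, mode="reflect")
            out.append(rotated)
            # Distort the deformation pattern by scaling the x and y axes differently.
            stretch = transform.AffineTransform(scale=(1.1, 0.9))
            out.append(transform.warp(rotated, stretch.inverse, mode="reflect"))
    return out

positive_patches = [np.random.rand(224, 224)]          # placeholder positive patch
augmented = [a for p in positive_patches for a in augment(p)]
```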
4.2 Initial Tests
[17] We employ a transfer learning strategy by fine-tuning a pretrained network. This approach is faster and easier than training a network with randomly initialized weights from scratch (which could take months). The parameters and features of these networks have been learned from a very large data set of natural images and are therefore applicable to numerous specific tasks. The last two layers are replaced with a fully connected layer and a softmax layer to give a classification output related to volcanic unrest. The learning rates of the new layers are set higher than those of the transferred layers. We set the maximum number of epochs to 50 and the batch size to 100. The output of the softmax layer, the top layer of the CNN, is the probability P of the patch being a positive result. The probabilities for each patch are merged with Gaussian weights (μ = 0, σ = 1), where μ and σ are the mean and the standard deviation, respectively.
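A minimal PyTorch sketch of this fine-tuning set-up is shown below. The epoch count and batch size are those quoted above; the specific learning rates, optimizer, and data loader are illustrative assumptions, and the framework itself is not necessarily the one used in this study.

```python
import torch
import torch.nn as nn
from torchvision import models

net = models.alexnet(weights="IMAGENET1K_V1")     # weights pretrained on ImageNet
net.classifier[6] = nn.Linear(4096, 2)            # new two-class head: unrest / no unrest

# Give the newly added layer a higher learning rate than the transferred layers.
new_ids = {id(p) for p in net.classifier[6].parameters()}
base_params = [p for p in net.parameters() if id(p) not in new_ids]
optimizer = torch.optim.SGD(
    [{"params": base_params, "lr": 1e-4},                       # transferred layers: slow
     {"params": net.classifier[6].parameters(), "lr": 1e-3}],   # new layer: faster
    momentum=0.9)
loss_fn = nn.CrossEntropyLoss()   # combines softmax with the negative log-likelihood loss

# Training loop: 50 epochs with batches of 100 patches (train_loader is assumed to exist).
# for epoch in range(50):
#     for batch, labels in train_loader:
#         optimizer.zero_grad()
#         loss = loss_fn(net(batch), labels)
#         loss.backward()
#         optimizer.step()
```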
[18] Initially, we use the Envisat archive to test three popular pretrained CNN architectures: AlexNet (Krizhevsky et al., 2012), ResNet50 (He et al., 2016), and InceptionV3 (Szegedy et al., 2016). We also test an SVM classifier based on textural features following Anantrasirichai et al. (2013). The objective results were evaluated using a receiver operating characteristic (ROC) curve, which illustrates the performance of the identification method by comparing true positive rates (TPRs) and false positive rates. The TPR (or sensitivity or recall) is the ratio between the number of positive samples correctly categorized as positive and the total number of actual positive samples. The false positive rate is the ratio between the number of negative samples wrongly categorized as positive and the total number of actual negative samples. The area under the curve (AUC) is the integrated area under the ROC curve. Better performance results in higher AUC values (maximum = 1), achieved through a high TPR and a low false positive rate, such that most true ground deformation is correctly identified and only a few background patches are falsely identified as positive.
[19] Figure 7 shows the ROC curve for a twofold cross validation, where half of the data set is employed for training and the other half is used for testing; the halves are then swapped, and the results are averaged. We also calculate the accuracy and true negative rate (TNR): the accuracy is the proportion of correctly predicted results among all testing samples, while the TNR measures the proportion of negative samples that are correctly identified. AlexNet achieves 0.995, 0.925, 0.899, and 0.992 for AUC, accuracy, TPR, and TNR, respectively. It outperforms ResNet50, InceptionV3, and texture features with SVM by approximately 8%, 5%, and 11%, respectively, on the average of these four metrics.
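For reference, the metrics reported here can be computed from classifier scores and ground-truth labels as in the following scikit-learn sketch; the label and score arrays are placeholders, and the 0.5 decision threshold is an illustrative assumption.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

y_true = np.array([1, 1, 1, 0, 0, 0])                      # placeholder ground-truth labels
y_score = np.array([0.95, 0.80, 0.40, 0.60, 0.20, 0.05])   # placeholder softmax probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)          # points on the ROC curve
auc = roc_auc_score(y_true, y_score)                       # area under the ROC curve

# Accuracy, TPR (recall), and TNR at a 0.5 decision threshold.
y_pred = (y_score >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
true_positive_rate = tp / (tp + fn)
true_negative_rate = tn / (tn + fp)
```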
[20] Next, we take the initial AlexNet CNN and SVM models trained on the Envisat data described above and retrain them by including a subset of the Sentinel-1 data set. We use interferograms covering Erta Ale, Ethiopia, and Etna, Italy, which include both deformation and atmospheric signals as previously discussed (Figure 4). We evaluate these tests using twofold cross validation and compute the accuracy, TPR, and TNR as before (Table 2). For both Erta Ale and Etna, the AlexNet CNN outperforms the SVM. The results for Erta Ale show exceptional performance, with an accuracy of 0.994 for the CNN, while those for the Etna area are poorer (accuracy of 0.871), probably due to atmospheric interference.
| Region | Methods | Accuracy | TPR | TNR |
|---|---|---|---|---|
| Erta Ale | CNN | 0.994 | 1.000 | 0.988 |
| | SVM | 0.985 | 0.982 | 0.985 |
| Etna | CNN | 0.871 | 0.747 | 0.981 |
| | SVM | 0.742 | 0.654 | 0.783 |
- Note. TPR = true positive rate; TNR = true negative rate; CNN = convolutional neural network; SVM = support vector machine.
5 Application to the Global Data Set
[21] In the previous sections, we have demonstrated that deep learning with CNNs has significant potential to capture the characteristics of volcano deformation present in interferograms despite the challenges of large scale, heterogeneity, and nonstationary distribution that such problems typically present for deep learning (X. W. Chen & Lin, 2014). In this section, we apply our pretrained CNN to the global data set of ∼900 volcanoes and 30,249 interferograms described in section 3, using the framework illustrated in Figure 5. Following an initial run, we use expert analysis of the results to retrain the model and rerun it.
[22] The CNN-training process was run on a graphics processing unit at the High Performance Computing facility (BlueCrystal) at the University of Bristol. Training the initial and retrained models took approximately 38 and 26 hr, respectively. The retraining process was faster, despite using a larger training data set (the Envisat data set plus some positive results from the Sentinel-1 data set), because the weights and biases of the network are initialized with values closer to the optimum. The prediction process for each 500 × 500 pixel interferogram took ∼1.5 s (∼10 hr for 30,249 interferograms). In theory, the CNN model can be retrained whenever a new result is confirmed by an expert, a process that would likely focus on true positive and false negative results (i.e., if a real deformation event is missed). However, the training data set requires balanced numbers of positive and negative samples, and since false positives occur more frequently than false negatives, care is required to augment the deformation samples, ensuring that the augmented data points are distributed in a way that prevents overfitting.
[23] For each run, we calculate the number of total positive results (positive), confirmed true positives, confirmed false positives, and results requiring further analysis (unconfirmed; Table 3). The initial model run identified 1,368 positive results, of which 39 were considered to be true positives, including the examples at Sierra Negra and Cerro Azul in the Galapagos that were included as a test and additional interferograms showing deformation at Etna (Figures 8a–8c). These examples all have detection probabilities >0.999. Of the remaining 1,329 positive interferograms, 894 were quickly identified as false positives, mostly small islands and turbulent atmospheric artifacts, which typically have detection probabilities less than 0.85 (Figures 8d–8f). The true positive and false positive results were then fed back to the CNN to retrain the model.
| Model | # of positives | # of true positives | # of false positives | # of unconfirmed |
|---|---|---|---|---|
| Initial | 1,368 | 39 | 894 | 435 |
| Retrained | 104 | 39 | - | 65 |
- Note. This shows that the performance of the convolutional neural network model improves from 2.85% to 37.5% in terms of the positive predictive value (the fraction of true positives among all retrieved positives) when the model is retrained with the confirmed positives of the Sentinel-1 data set.
[24] The retrained model identified 104 positive results, including the 39 true positives identified initially. The other 65 examples all contained concentric fringes around the volcano, and even experts were unable to determine from a single interferogram whether the fringes were caused by volcanic deformation or atmospheric artifacts. These include Tambora, Indonesia; Alayta, Ethiopia; Adwa, Ethiopia; and Etna, Italy (Figure 9), which are all high-relief stratovolcanoes. The merged probabilities assigned to these detections are 0.965, 0.867, 0.733, and 0.953, respectively, slightly lower than those assigned to the true positives. By calculating the correlation between the phase and the elevation and looking at pair-wise logic in the time series (Ebmeier et al., 2013b), we conclude that these 65 signals were caused by atmospheric artifacts.
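The phase-elevation check referred to above can be sketched as follows; the coherence masking and the threshold value are illustrative assumptions, and in practice the correlation would be examined alongside pair-wise logic in the time series.

```python
import numpy as np

def phase_elevation_correlation(unwrapped_phase, dem, coherence, coh_threshold=0.7):
    """Pearson correlation between unwrapped phase and elevation over coherent pixels."""
    mask = coherence > coh_threshold
    return np.corrcoef(unwrapped_phase[mask], dem[mask])[0, 1]

# A correlation close to +/-1 suggests a stratified atmospheric origin for the fringes,
# whereas deformation signals are generally not a simple linear function of elevation.
```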
[25] The CNN identified over 30,000 negative results, but manually searching through all of these for false negatives is not feasible. However, we have checked all scenes associated with reported eruptions during this time period (Global Volcanism Program, 2013). The only example with a visible fringe pattern was detected at Ulawun, Papua New Guinea (20170604-20170722), which erupted between 11 June 2017 and 3 November 2017 (Figure 10). The full interferogram and a zoomed-in version showing the fringes are shown in Figures 10c and 10d, respectively. Our framework did not detect this signal because the visible fringe area is relatively small compared to those in the training positive patches, and it is surrounded by noise. After several convolution and pooling operations in the CNN, the features of the noise become dominant, and the patch is classified as a negative result.
6 Discussion
[26] The majority of volcanoes worldwide have little or no ground-based monitoring. Satellite systems, such as InSAR, have the potential to measure surface deformation at volcanoes globally, but until now, the utility of these systems has been limited by the acquisition strategy and data policy of the space agencies. The launch of Sentinel-1 is providing unprecedented data access but poses new challenges, as more data are available than can be analyzed by manual inspection. This paper demonstrates that machine learning using deep convolutional neural networks (CNNs) has the capability to identify rapid deformation signals in a large data set of interferograms. This is a proof of concept, and further work is still required before an operational global alert system for volcanic unrest based on satellite observations of surface deformation can be realized. In this section, we discuss the limitations of the current process and outline future developments that would lead to an operational system.
[27] The first component of any automated alert system is the automatic processing of satellite data. The currently available Sentinel-1 data set has a relatively small number of interferograms that show deformation, meaning a limited number of positive samples are available for training. For this initial test, we have resorted to using examples from Envisat and data augmentation approaches to increase the number of positive samples available for training. However, these may not truly reflect the characteristics of global volcanic deformation. As the system continues running, more positive samples will become available, and as the model is retrained, the system performance will improve.
[28] The European Space Agency posts raw Sentinel-1 data to their website within hours of acquisition, but limited bandwidth makes this data access route unsuitable for automated systems operating on a global scale. The LiCSAR system uses the archive held by the UK Centre for Environmental Data Analysis which typically has a latency period of a few weeks. This latency is well suited for routine surveys of ground deformation (e.g., Biggs et al., 2011; Chaussard & Amelung, 2012; Ebmeier et al., 2013a; Pritchard & Simons, 2004c), which can be used for motivating changes in long-term monitoring strategies, but would be too slow for crisis response (Ebmeier et al., 2018). Automated processing of archived data could be supplemented by direct download for a limited number of volcanoes which are considered to be high threat because of changes in behavior identified by other methods, such as seismic swarms. Once trained, the CNN runs in a matter of seconds and would not add noticeably to the time taken for data to be communicated. The retraining process is slower and could be undertaken periodically, or when particularly significant events are detected, such as a new type of deformation pattern.
[29] The current proof-of-concept study demonstrates the ability of CNNs to identify rapidly deforming systems that generate multiple fringes in wrapped interferograms. For a 12-day C-band interferogram, this corresponds to a deformation rate of 1.8 m/year. Such high rates are typically only observed for very short periods and are often associated with dyke intrusions or eruptions (Biggs & Pritchard, 2017). There are several possible adaptations that would enable a machine learning system to detect slower rates of deformation associated with sustained unrest. The first option is to generate long time-span interferograms, which will increase the number of fringes per image where deformation is sustained. For example, in a year-long interferogram, the average deformation rate required to generate two fringes is only 6 cm/year. The second option is to develop a machine learning approach capable of detecting deformation in unwrapped data. However, fringes are ideally suited to machine learning approaches because the high-frequency content is easy to identify using edge-detection methods and provides strong features for distinguishing deformation from other signals. Anomaly detection techniques may be suitable for classifying unusual events in unwrapped data (e.g., Gaddam et al., 2007), but care needs to be taken when scaling the unwrapped data to the settings of the pretrained network (e.g., 0–255), as clipping large magnitudes may lose information.
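To make the scaling caveat concrete, a minimal sketch of clipping and rescaling unwrapped values into the 0-255 range expected by a pretrained network is shown below; the clip value is a hypothetical parameter, and any displacement beyond it is saturated and therefore lost.

```python
import numpy as np

def scale_unwrapped_to_uint8(unwrapped, clip_value):
    """Clip unwrapped values to [-clip_value, clip_value] and map them linearly onto [0, 255]."""
    clipped = np.clip(unwrapped, -clip_value, clip_value)
    return np.uint8(np.round((clipped + clip_value) / (2.0 * clip_value) * 255))

# Any magnitude larger than clip_value maps to 0 or 255, so its true size is no longer recoverable.
```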
[30] For our initial tests, we have chosen to use wrapped interferograms because, although several unwrapping algorithms exist (e.g., C. W. Chen & Zebker, 2001; Goldstein et al., 1988), they are computationally expensive, particularly in areas of low or patchy coherence. Although the automatic processing of unwrapped interferograms on a global basis is challenging, it offers several advantages. In general, stacking multiple short-time-period interferograms will produce more coherent results than directly processing longer time-span interferograms (e.g., Biggs et al., 2007). However, there are exceptions, particularly where the level of coherence is seasonally variable, for example, due to snowfall, and further analysis of global patterns of coherence is required in order to determine the most appropriate strategy for automating this. Once the interferogram has been unwrapped, it can be rewrapped at any chosen interval, meaning that higher fringe rates can be artificially generated. The optimal fringe rate will depend on the ability of the CNN to distinguish the spatial patterns of fringes, as increasing the rate will also increase the number of fringes associated with turbulent atmospheric artifacts. Using unwrapped interferograms would also improve the ability to identify atmospheric signals, either by applying a direct correction or as a secondary stage. Weather models are available globally, and services such as the Generic Atmospheric Correction Online Service exist but are not yet routinely applied on a global basis (Yu et al., 2018). A more efficient approach would be to use the weather models as a secondary stage, once the CNN has identified a smaller subset of positive results.
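Rewrapping at an arbitrary interval is a one-line operation, sketched below; the function name and the 1-cm example interval are illustrative.

```python
import numpy as np

def rewrap(unwrapped, interval):
    """Rewrap an unwrapped signal so that one fringe corresponds to `interval` (same units as the input)."""
    return np.mod(unwrapped, interval) / interval * 2.0 * np.pi - np.pi

# Example: rewrap line-of-sight displacement in metres so that one fringe represents 1 cm.
# rewrapped_phase = rewrap(los_displacement_m, interval=0.01)
```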
[31] The final challenge is ensuring that information is provided to the appropriate authorities in a timely and useful manner. The proof-of-concept algorithm reduces the number of interferograms that require manual inspection from >30,000 to 104, but expert analysis is still required to distinguish deformation from some types of atmospheric artifacts and to interpret the deformation patterns in terms of source processes. Although there is a strong statistical link between satellite observations of deformation and eruptions, Biggs et al.'s (2014) global study found that only about half of deforming volcanoes erupted on a decadal timescale. Therefore, these alerts should be considered flags for further investigation using complementary data sets rather than indicators of impending eruption. The ability of volcano observatories to interpret InSAR data is highly variable between countries. The algorithms developed here provide a probability that a given interferogram contains surface deformation, but further capacity building will be required before many volcano observatories, particularly those in developing countries, are able to use this information to influence alert levels or long-term monitoring strategies. Identifying all volcanic ground deformation signals will expand our understanding of the behavior of a wide range of magmatic systems and improve eruption forecasting in the future.
7 Conclusions
[32] This paper is the first to demonstrate the capability of machine learning algorithms for detecting volcanic ground deformation in large sets of InSAR data. The proposed method was developed using a currently popular machine learning algorithm for image classification, the CNN. Our classification model was initialized with archive data from the Envisat mission using the pretrained CNN AlexNet. It was then applied to a Sentinel-1 data set consisting of over 30,000 interferograms at over 900 volcanoes. After an initial run, expert classification of the positive results was used to retrain the network, and the classification performance improved, increasing the proportion of correctly identified deformation among all positive results from 2.85% to 37.5%. This retrained network reduced the number of interferograms that required manual inspection from >30,000 to ∼100, and more training is likely to improve the performance yet further. These results indicate that machine learning algorithms combined with automated processing systems have the potential to form an alert system for volcanic unrest in remote and inaccessible regions.
Acknowledgments
[33] This work was supported by the EPSRC Global Challenges Research Fund, the NERC BGS Centre for Observation and Modelling of Earthquakes, Volcanoes and Tectonics (COMET), the NERC large grant Looking into the Continents from Space (LiCS) - grant code NE/K010913/1, the EPSRC Platform Grant Vision for the Future (EP/M000885/1), and a seed grant from the University of Bristol Cabot Institute. The InSAR data sets are available at https://volcanodeformation.blogs.ilrt.org/ and http://comet.nerc.ac.uk/COMET-LiCS-portal/, and the training data set is available at https://seis.bristol.ac.uk/~eexna/download.html. We would like to thank the Advanced Computing Research Centre, University of Bristol, for free access to the High Performance Computing machine (BlueCrystal) used for training the machine learning algorithms. We would also like to thank Dr Karsten Spaans and Emma Hatton, School of Earth and Environment, University of Leeds, for providing the Sentinel-1 data set.